
Abductive Event Reasoning on Task 12 (test)

92.6 Base Accuracy

Ensemble Sonnet + GPT + Gemini

[Chart: Base Accuracy; y-axis ticks 90.104, 90.752, 91.4, 92.048; Mar 4, 2026]
Updated 1mo ago

Evaluation Results

Method  Links
2026.03  92.6  95.2
2026.03  91.2  94.9
2026.03  90.7  94.3
2026.03  90.4  95.2
2026.03  90.2  94.8
2026.03  90.2  94.3