
Linguistic Reasoning on BigBench Hard Disambiguation QA

Accuracy: 55.1

ReElicit

[Chart: accuracy by date; May 18, 2026]
Updated 14d ago

Evaluation Results

Date     Accuracy  Method  Links
2026.05  55.1
2026.05  53.2
2026.05  52.4
2026.05  51.6
2026.05  51.4