
Agentic Reasoning on FRAMES n=50 (full)

77.31 Accuracy

GPT-5

Dec 7, 2025
Updated 4d ago

Evaluation Results

Date       Accuracy
2025.12    77.31      -0.088    0.496
2025.12    68.37      -0.144    0.655
2025.12    66.44      -0.156    0.689
2025.12    63.54      -0.174    0.735
2025.12    62.34      -0.174    0.713
2025.12    44.75      -0.157    0.663
2025.12    38.16      -0.171    0.712
2025.12    34.22      -0.169    0.617