
Financial Reasoning on S&P 500 Scenario-based MCQs Stage I

Accuracy: 87.14

DeepSeek-v3.1

[Chart: accuracy over time; Apr 18, 2026]
Updated 1mo ago

Evaluation Results

Date      Accuracy  Method  Links
2026.04   87.14
2026.04   86.19
2026.04   82.38
2026.04   80.95
2026.04   76.67
2026.04   71.9
2026.04   70.95
2026.04   64.29
2026.04   57.62
2026.04   46.19