| Dataset Name | SOTA Method | Metric | Value | Count | Updated |
|---|---|---|---|---|---|
| FinQA | | Accuracy | 77.6 | 69 | 8d ago |
| ConvFinQA | | Accuracy | 85.7 | 23 | 2mo ago |
| Finance | MASA | Accuracy | 52.45 | 11 | 3mo ago |
| S&P 500 Scenario-based MCQs Stage I | DeepSeek-v3.1 | Accuracy | 87.14 | 10 | 1mo ago |
| OPENFIN (held-out) | | DBS | 73.4 | 9 | 3mo ago |
| FinReason | QianfanHuijin-70B | Score | 86.72 | 7 | 3mo ago |
| FinChain (test) | Qwen-2.5-32B-it | DRS | 82 | 6 | 16d ago |
| Financial Reasoning Tasks and FinAgent Aggregated (Overall Avg) | Supervisor-Committee + DoC | Overall Average Accuracy | 74.9 | 5 | 1d ago |
| FinAgent | | Accuracy (Steps) | 66 | 5 | 1d ago |
| Risk Management | Supervisor-Committee + DoC | Accuracy (Correct/Complete Steps) | 90.5 | 5 | 1d ago |
| Quant Finance | Supervisor-Committee + DoC | Accuracy | 96.3 | 5 | 1d ago |
| Research | Supervisor-Committee + DoC | Accuracy | 54.2 | 5 | 1d ago |