| Dataset Name | SOTA Method | Metric | Value | Trend | Last Updated |
|---|---|---|---|---|---|
| QUEST | NSFL | A AND B | 8.5 | 8 | 4d ago |
| OpenBookQA (test) | AdamW | Accuracy | 68.2 | 6 | 11d ago |
| COSE (test) | SIFT | Accuracy | 75.2 | 6 | 11d ago |
| ESNLI (test) | POME | Accuracy (ESNLI Test) | 80.6 | 6 | 11d ago |
| Flores+ Ko to En | InternVL3_5 38B-Thinking | Score | 31.8 | 4 | 1mo ago |
| PIQA English | Qwen3-VL 32B-Thinking | Score | 88 | 4 | 1mo ago |
| HellaSwag English | EXAONE 4.0 32B | Score | 65.7 | 4 | 1mo ago |
| MMLU English | EXAONE 4.0 32B | Score | 89.9 | 4 | 1mo ago |
| Flores+ En to Ko | HyperCLOVA X 32B Think | Score | 31.8 | 4 | 1mo ago |
| HAERAE Bench Korean 1.0 | HyperCLOVA X 32B Think | Score | 87.4 | 4 | 1mo ago |
| CLIcK Korean | HyperCLOVA X 32B Think | Score | 75.2 | 4 | 1mo ago |
| KoBALT Korean | HyperCLOVA X 32B Think | Score | 50.6 | 4 | 1mo ago |
| KMMLU Korean | EXAONE 4.0 32B | Score | 75.2 | 4 | 1mo ago |