| Dataset Name | SOTA Method | Metric | Trend | Updated |
|---|---|---|---|---|
| MMLU, ARC-Challenge, and CommonsenseQA Aggregate | RAISE | Average Score: 64.77 | 24 | 4d ago |
| General Benchmarks | Llama 3.1 8B | Generation Quality Score: 66.5 | 11 | 4d ago |
| Overall Evaluation Suite | Qwen3-30B-A3B-Instruct-2507 | Average Score: 73.6 | 4 | 4d ago |