| Dataset Name | SOTA Method | Metric | Value | Trend | Last Updated |
|---|---|---|---|---|---|
| MT-Bench | Qwen3-32B | MT-Bench Score | 7.58 | 28 | 26d ago |
| Alpaca (test) | LoRA | Alpaca LC Win Rate | 71.87 | 20 | 1mo ago |
| Vicuna Eval (test) | FAA | Vicuna Eval GPT-4 Score | 8.91 | 20 | 1mo ago |
| CharacterEval | GPT-4+PCL* | Fluency | 3.612 | 13 | 1mo ago |