| Dataset Name | SOTA Method | Metric | Score | Trend | Last Updated |
|---|---|---|---|---|---|
| GenAI-Bench | VQAScore | Kendall's Tau-c | 38.4 | 16 | 12d ago |
| RichHF-18K | Gemini-2.5-Pro | Kendall's Tau | 33.9 | 11 | 1mo ago |
| MLLM-as-a-Judge | LLaVA-Critic | CO Consistency Score | 30.3 | 11 | 1mo ago |
| Q-Reasoning (test) | Proposed Human-Like Reasoning Framework (detailed) | ROUGE-1 Score | 51.4 | 6 | 3mo ago |