| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| HallusionBench | PolarMem | Score | 57.8 | 32 | 3mo ago |
| Hallusion-Bench | SceneAlign | aAcc | 65.35 | 17 | 3mo ago |
| TruthfulQA | Qwen3-1.7B | TruthfulQA Accuracy | 48.76 | 12 | 1d ago |
| HallusionBench | Qwen2.5-VL + DRScaffold | fAcc | 37.3 | 7 | 8d ago |
| POPE random | FTibVLM | Accuracy | 80.56 | 2 | 7d ago |