| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| RPEval Implicit Memory, Multi-Preference | GPT-4.1 + RP-Reasoner | IA (Macro) | 0.44 | 16 | 1mo ago |
| RPEval Implicit Memory, Single-Preference | Qwen2.5-7B | Ignorance Score | 0.02 | 16 | 1mo ago |
| POPE MSCOCO (test) | ICD | Random Accuracy | 87.51 | 15 | 1mo ago |
| AMBER Discrimination 1.0 (test) | Octopus | Accuracy | 76.7 | 10 | 1mo ago |
| AMBER | MESA | Accuracy | 84.3 | 4 | 8d ago |