| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Hindi News v1.0 (test) | Hin-DPO | ROUGE-1 | 37.13 | 30 | 1mo ago |
| ICEWS05-15 (test) | GETER | BLEU-4 | 45.98 | 17 | 1mo ago |
| GDELT (test) | GETER | BLEU-4 | 34.46 | 17 | 1mo ago |
| ICEWS14 (test) | GETER | BLEU-4 | 40.54 | 17 | 1mo ago |
| SENECA-RC synthetic | SHAP | Per-instance Explanation Time (s) | 0.1 | 15 | 1mo ago |
| Amazon Beauty (test) | RGCF-XRec | BLEU-4 | 6.2804 | 13 | 25d ago |
| LIAR-RAW (test) | Oracle | ROU-1 | 25.5 | 11 | 1mo ago |
| RAWFC (test) | Oracle | ROUGE-1 | 37.62 | 10 | 1mo ago |
| ChartCheck (test) | MEVER | ROUGE-1 | 48.9 | 9 | 1mo ago |
| AIChartClaim (test) | MEVER | ROUGE-1 | 42.9 | 9 | 1mo ago |
| ChartCheck | MEVER | ROUGE-L | 40.8 | 9 | 1mo ago |
| AIChartClaim | MEVER | ROUGE-L | 34.5 | 9 | 1mo ago |
| ChartCheck 1.0 (test) | MEVER | ROUGE-1 | 48.7 | 9 | 1mo ago |
| AIChartClaim 1.0 (test) | MEVER | ROUGE-1 | 42.7 | 9 | 1mo ago |
| Toys (test) | PeaPOD | BLEU-4 | 2.5319 | 7 | 25d ago |
| Sports (test) | VIP5 | BLEU-4 | 1.0639 | 7 | 25d ago |
| Mocheg (test) | DePlot+FlanT5 | ROUGE-1 | 32.1 | 7 | 1mo ago |
| Mocheg | MEVER | ROUGE-L | 23.4 | 7 | 1mo ago |
| Mocheg 1.0 (test) | GPT-4o | ROUGE-1 | 30.3 | 7 | 1mo ago |
| e-SNLI (out-domain) | PROMPTING-EIB | Grammar Score | 2.98 | 7 | 1mo ago |
| ECQA (out-domain) | | Grammar Score | 2.99 | 7 | 1mo ago |
| TripAdvisor (test) | Transformer | FMR | 4 | 7 | 1mo ago |
| Amazon (test) | PETER+ | FMR (Faithfulness Rate) | 77 | 7 | 1mo ago |
| Yelp (test) | | FMR | 5 | 7 | 1mo ago |
| 153 pairs of plans | ACXON (Baseline) | Nw Score | 1,788.6 | 6 | 1mo ago |