Contextual Question Answering on PISTOL (A_C)
[Chart: ROUGE-L (selectable metrics: ROUGE-L, LLM Judge Score). Highlighted result: NPO, 98, Oct 20, 2025.]
Updated 6d ago
Evaluation Results
| Method | Variant       | Date    | ROUGE-L | LLM Judge Score |
|--------|---------------|---------|---------|-----------------|
| NPO    | Context-aware | 2025.10 | 98      | 100             |
| RMU    | Context-aware | 2025.10 | 96.2    | 80              |
| UNDIAL | Context-aware | 2025.10 | 96      | 100             |
| NPO    | Vanilla       | 2025.10 | 81.2    | 75              |
| UNDIAL | Vanilla       | 2025.10 | 64.2    | 90              |
| RMU    | Vanilla       | 2025.10 | 34.8    | 0               |