Verdict Prediction on AVerImaTeC official (test)
[Chart: Q-Eval scores; VILLAIN at 89 on Feb 4, 2026. Other metrics shown: Evid-Eval, Veracity, Justification]
Updated 1mo ago
Evaluation Results
Method                              Date     Q-Eval  Evid-Eval  Veracity  Justification
VILLAIN (Backbone=Gemma-3-27B,...)  2026.02  89      53.6       54.6      55.6
AIC CTU                             2026.02  80.7    32.5       34.7      30.4
Baseline                            2026.02  55.5    17.1       11.4      13.2
ADA-AGGR                            2026.02  37      46.3       53.7      43.3