Fact Verification on HOVER (test)
Best result: DiffuTruth, 56.6 AUROC
[Chart: AUROC over time, Jul 25, 2025 to Feb 11, 2026]
Updated 4d ago
Evaluation Results

| Method | Configuration | Date (YYYY.MM) | AUROC |
|---|---|---|---|
| DiffuTruth | - | 2026.02 | 56.6 |
| Hybrid | - | 2026.02 | 56.6 |
| Direct NLI | - | 2026.02 | 52.5 |
| GEPA | Backbone=Qwen3 8B, Opt... | 2025.07 | 52.33 |
| GEPA+Merge | Backbone=Qwen3 8B, Opt... | 2025.07 | 51.67 |
| MIPROv2 | Backbone=Qwen3 8B | 2025.07 | 47.33 |
| GRPO | Backbone=Qwen3 8B, Opt... | 2025.07 | 38.67 |
| Baseline | Backbone=Qwen3 8B | 2025.07 | 35.33 |