Veracity Inference on PRONTOQA 3-hop (test)
[Chart: Hamming Similarity over time. Top result: AVI, 95.6 (May 17, 2025)]
Updated 4d ago
Evaluation Results

| Method    | Base LLM | Date    | Hamming Similarity |
|-----------|----------|---------|--------------------|
| AVI       | Qwen 8B  | 2025.05 | 95.6               |
| AVI       | Qwen 4B  | 2025.05 | 88.6               |
| Many2Many | Qwen 4B  | 2025.05 | 71                 |
| Many2Many | Qwen 8B  | 2025.05 | 66.3               |