Information Coverage and Truthfulness Evaluation on Web-based Retrieval
[Chart: S_fact Score. Top result: Mixtral-8x22B, 0.749 (Jan 7, 2025).]
Updated 4d ago
Evaluation Results
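| Method            | Details                   | Date    | S_fact Score | ICAT-M Coverage | ICAT-M1 Score | ICAT-S Coverage | ICAT-S1 Score | ICAT-A Coverage | ICAT-A1 Score |
|-------------------|---------------------------|---------|--------------|-----------------|---------------|-----------------|---------------|-----------------|---------------|
| Mixtral-8x22B     | Alignment LLM=Llama-3.... | 2025.01 | 0.749        | -               | -             | 0.503           | 0.601         | 0.429           | 0.545         |
| GPT-4             | Alignment LLM=Llama-3.... | 2025.01 | 0.748        | -               | -             | 0.551           | 0.634         | 0.541           | 0.627         |
| Openchat 3.5 (7B) | Alignment LLM=Llama-3.... | 2025.01 | 0.741        | -               | -             | 0.52            | 0.611         | 0.444           | 0.555         |
| Llama-3-70B       | Alignment LLM=Llama-3.... | 2025.01 | 0.714        | -               | -             | 0.556           | 0.625         | 0.486           | 0.578         |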