Information Coverage and Truthfulness Evaluation on Web-based Retrieval

S_fact Score: 0.749

Mixtral-8x22B

Updated 5mo ago

Evaluation Results

Method	Links	S_fact
Mixtral-8x22B 2025.01		0.749	-	-	0.503	0.601	0.429	0.545
GPT-4 2025.01		0.748	-	-	0.551	0.634	0.541	0.627
Openchat 3.5 (7B) 2025.01		0.741	-	-	0.520	0.611	0.444	0.555
Llama-3-70B 2025.01		0.714	-	-	0.556	0.625	0.486	0.578