Open Question Answering on LegalMC4 (test)
[Chart: LLM Factual Correctness. Highlighted point: GPT-5 (min. reasoning), 77.2, Jan 20, 2026]
Updated 4d ago
Evaluation Results

Method                        Settings                      Links     LLM Factual Correctness
GPT-5 (min. reasoning)        Train Data=Base Model         2026.01   77.2
GPT-5-mini (min. reasoning)   Train Data=Base Model         2026.01   70.1
LLaMA 3.1 (8B)                Train Data=Difficulty-...     2026.01   55.4
Gemma 3 (12B)                 Train Data=Difficulty-...     2026.01   54.5
LLaMA 3.1 (8B)                Train Data=Base Model         2026.01   43
Gemma 3 (12B)                 Train Data=Standard In...     2026.01   41.8
Gemma 3 (12B)                 Train Data=Base Model         2026.01   38.7
LLaMA 3.1 (8B)                Train Data=Standard In...     2026.01   35.4