LLM Evaluation on HealthBench (test)
[Chart: HealthBench Score (%) and Qworld Score (%). Highlighted: GPT-5, HealthBench Score 62.6%, Mar 6, 2026.]
Updated 23d ago
Evaluation Results
Method             Rank  ∆     Date     HealthBench Score (%)  Qworld Score (%)
-----------------  ----  ----  -------  ---------------------  ----------------
GPT-5              1     → 0   2026.03  62.6                   35.4
GPT-5-mini         3     ↓ 1   2026.03  61.7                   34.8
DeepSeek-V3.2      4     ↓ 1   2026.03  53                     31.8
Gemini 3 Flash     5     → 0   2026.03  52.5                   30.4
Grok-4.1-Fast      6     ↓ 2   2026.03  52.5                   30.1
Qwen3-30B          2     ↑ 4   2026.03  49.8                   34.9
GPT-4.1            7     → 0   2026.03  47                     23.9
Claude Sonnet 4.5  8     → 0   2026.03  43.5                   22.7
GPT-4.1-mini       9     → 0   2026.03  39.7                   19.9
GPT-4.1-nano       10    → 0   2026.03  33.9                   18.1
Llama-3.1-70B      11    → 0   2026.03  20.5                   13.1