Clinical Response Evaluation on Dermatologist clinical agreement dataset (test)
[Chart: Agreement - Strongly Agree. Top result: 67 (Gemini 2.5 Pro), Dec 9, 2025. Available metrics: Agreement - Strongly Agree, Agreement - Agree, Agreement - Neutral, Agreement - Disagree, Agreement - Strongly Disagree.]
Updated 4d ago
Evaluation Results
| Method | Date | Agreement - Strongly Agree | Agreement - Agree | Agreement - Neutral | Agreement - Disagree | Agreement - Strongly Disagree |
|---|---|---|---|---|---|---|
| Gemini 2.5 Pro | 2025.12 | 67 | 25 | 6 | 0.3 | 1.8 |
| LLM Blender - Ensemble (approach=Ensemble) | 2025.12 | 39.3 | 36.6 | 17.6 | 2.4 | 4.2 |
| GPT-4o | 2025.12 | 37.8 | 43.5 | 10.7 | 4.2 | 3.9 |
| LLaMA 4 Maverick | 2025.12 | 37.8 | 43.2 | 12.2 | 3.9 | 3 |
| SkinGPT-4 | 2025.12 | 32.1 | 16.1 | 6.8 | 7.1 | 37.8 |
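The five agreement values for each method sum to roughly 100, so they appear to be a distribution over the rating scale (an inference from the numbers, not something the page states). As a rough sketch of how the results could be compared on more than the Strongly Agree column alone, the snippet below adds Strongly Agree and Agree into a single "positive agreement" figure. That combined figure and the function names are illustrative assumptions, not metrics reported by the benchmark.

```python
# Agreement values copied from the results listed above.
# Column order: Strongly Agree, Agree, Neutral, Disagree, Strongly Disagree.
RESULTS = {
    "Gemini 2.5 Pro": (67, 25, 6, 0.3, 1.8),
    "LLM Blender - Ensemble": (39.3, 36.6, 17.6, 2.4, 4.2),
    "GPT-4o": (37.8, 43.5, 10.7, 4.2, 3.9),
    "LLaMA 4 Maverick": (37.8, 43.2, 12.2, 3.9, 3),
    "SkinGPT-4": (32.1, 16.1, 6.8, 7.1, 37.8),
}


def positive_agreement(row):
    """Hypothetical summary: Strongly Agree + Agree, rounded to one decimal."""
    return round(row[0] + row[1], 1)


def rank_by_positive_agreement(results):
    """Return (method, positive agreement) pairs, highest first."""
    scored = ((name, positive_agreement(row)) for name, row in results.items())
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_by_positive_agreement(RESULTS):
        print(f"{name}: {score}")
```

Under this assumed summary, GPT-4o and LLaMA 4 Maverick would rank above LLM Blender - Ensemble, which differs from the ordering by Strongly Agree alone.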