Clinical Response Evaluation on 400 Held-out Clinical Prompts (test)
[Chart: score over time by method. Selectable metrics: Factuality, Safety, Helpfulness, Conciseness, Empathy, Overall Score. Shown: PrivMedChat, Factuality 3.18, Mar 3, 2026.]
Evaluation Results
| Method | Setting | Date | Factuality | Safety | Helpfulness | Conciseness | Empathy | Overall Score |
|---|---|---|---|---|---|---|---|---|
| PrivMedChat | epsilon=7 | 2026.03 | 3.18 | 3.1 | 2.62 | 2.99 | 2.41 | 2.86 |
| DP-SFT | epsilon=7 | 2026.03 | 2.93 | 2.57 | 2.36 | 2.38 | 2.14 | 2.48 |
| SFT | epsilon=– | 2026.03 | 2.76 | 2.53 | 2.32 | 2.36 | 2.04 | 2.4 |
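
The page does not say how the Overall Score is computed. The reported values are, however, consistent with an unweighted mean of the five per-criterion scores, rounded to two decimals. The sketch below checks that assumption against the three rows in the results; the names `criteria_scores`, `reported_overall`, and `overall_score` are illustrative and do not come from the benchmark.

```python
from statistics import mean

# Per-criterion scores copied from the results, in the order:
# Factuality, Safety, Helpfulness, Conciseness, Empathy.
criteria_scores = {
    "PrivMedChat": [3.18, 3.10, 2.62, 2.99, 2.41],
    "DP-SFT": [2.93, 2.57, 2.36, 2.38, 2.14],
    "SFT": [2.76, 2.53, 2.32, 2.36, 2.04],
}

# Overall Score values as reported on the page.
reported_overall = {"PrivMedChat": 2.86, "DP-SFT": 2.48, "SFT": 2.40}


def overall_score(values):
    """Assumed aggregation: unweighted mean rounded to two decimals."""
    return round(mean(values), 2)


for method, values in criteria_scores.items():
    computed = overall_score(values)
    matches = abs(computed - reported_overall[method]) < 0.005
    print(f"{method}: computed={computed:.2f} "
          f"reported={reported_overall[method]:.2f} match={matches}")
```

If the benchmark weights its criteria differently, swap the mean for the appropriate aggregation. A match on three rows is suggestive but does not confirm the formula.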