Clinical text revision on MIMIC Discharge Report
[Chart: Mistral Score, GPT5-mini Score, and Average Score by method. Highlighted point: RAG, Mistral Score 62.5, Jan 31, 2026.]
Updated 1mo ago
Evaluation Results
Method                  Method Category  Date     Mistral Score  GPT5-mini Score  Average Score
RAG                     In-Con...        2026.01  62.5           26.2             44.4
Weaving in Revision     Experi...        2026.01  58.8           40               49.4
Weaving in Detection    Experi...        2026.01  57.5           68.8             63.1
Self-Critic             In-Con...        2026.01  53.8           30               41.9
Weaving in Total        Experi...        2026.01  52.5           30               41.2
Weaving in Self-Critic  Experi...        2026.01  50             33.7             41.9
Deepseek R1 Search      Direct...        2026.01  47.5           30               38.8
Deepseek Thinking       Direct...        2026.01  42.5           18.8             30.6
Claude-4.5 Sonnet       Direct...        2026.01  -35            11               11
GPT-5.1                 Direct...        2026.01  -40            32               32
Gemini-3 Pro            Direct...        2026.01  -47            20               20