Patient Status Prediction on RealICU-Scale (test)
[Chart: Accuracy by date. Top entry: ICU-Evo, 51.9 Accuracy, May 13, 2026. Available metrics: Accuracy, F1 Score.]
Updated 20d ago
Evaluation Results
| Method | Configuration | Date | Accuracy | F1 Score |
| --- | --- | --- | --- | --- |
| ICU-Evo | Backbone=Gemini-3.1-pr... | 2026.05 | 51.9 | 34.8 |
| RAG | Backbone=Gemini-3.1-pr... | 2026.05 | 44.2 | 31.2 |
| ICU-Evo | Backbone=GPT-5.4 [22],... | 2026.05 | 43.8 | 32.7 |
| Local-window | Backbone=GPT-5.4 [22],... | 2026.05 | 41.5 | 26.5 |
| RAG | Backbone=GPT-5.4 [22],... | 2026.05 | 41.1 | 26.9 |
| Local-window | Backbone=Gemini-3.1-pr... | 2026.05 | 40.5 | 26.4 |
| RAG | Backbone=Qwen3-235B [3... | 2026.05 | 36.7 | 28.2 |
| ICU-Evo | Backbone=Qwen3-235B [3... | 2026.05 | 30.4 | 17.7 |
| Full-context | Backbone=Qwen3-235B [3... | 2026.05 | 20.1 | 11.6 |
| Local-window | Backbone=Qwen3-235B [3... | 2026.05 | 17.5 | 15.9 |