LLM Filtering on Manually adjudicated gold-standard CUIs LV Systolic Dysfunction v1 (test)
[Chart: CUI Count, Recall, Precision, and F1-score over time. Highlighted point: GPT-5-mini, CUI Count 90. Date shown on axis: Feb 20, 2026.]
Evaluation Results
| Method | Details | Date | CUI Count | Recall | Precision | F1-score |
|---|---|---|---|---|---|---|
| GPT-5-mini | Framework=CUICurate, B... | 2026.02 | 90 | 53 | 91 | 67 |
| GPT-5 | Framework=CUICurate, B... | 2026.02 | 70 | 42 | 91 | 57 |
| manual | Mode=manual curation | 2026.02 | 55 | 29 | 82 | 43 |
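The F1-score figures in the results appear consistent with the usual harmonic mean of precision and recall, although the page does not say how they were computed. The sketch below is a minimal check under that assumption; the function name `f1_score` and the `rows` list are illustrative, and the numbers are copied from the results above.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (method, recall, precision, reported F1-score), copied from the results.
# Assumption: all three metrics are on a 0-100 scale.
rows = [
    ("GPT-5-mini", 53, 91, 67),
    ("GPT-5", 42, 91, 57),
    ("manual", 29, 82, 43),
]

for method, recall, precision, reported in rows:
    computed = f1_score(precision, recall)
    print(f"{method}: computed F1 = {computed:.2f}, reported = {reported}")
```

Rounded to the nearest integer, the computed values match the reported F1-scores for all three methods, which suggests the leaderboard rounds each metric independently.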