Full code system evaluation on Corti proprietary ED
[Chart: Recall. ChatGPT: 46.6 (Mar 31, 2026). Available metrics: Recall, Precision, F1 Score.]
Updated 18d ago
Evaluation Results
| Method | Approach | Date | Recall | Precision | F1 Score |
|---|---|---|---|---|---|
| ChatGPT | With tools | 2026.03 | 46.6 | 34.7 | 39.8 |
| ChatGPT | No tools | 2026.03 | 42.3 | 42.8 | 42.5 |
| Symphony | Corti | 2026.03 | 37.6 | 54.3 | 44.5 |
| Gemini | With tools | 2026.03 | 27.1 | 35.8 | 39.8 |
| CLH | Corti | 2026.03 | 26.5 | 49.9 | 34.6 |
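
The three reported metrics are related: F1 is conventionally the harmonic mean of precision and recall. As a rough sanity check on the results above, the sketch below recomputes F1 from each row's listed recall and precision and flags rows where the listed F1 differs by more than a rounding tolerance. It assumes the scores are aggregate (micro-averaged) values on a common scale. The page does not state how the scores are averaged, and the function names and the tolerance of 0.1 are illustrative choices, not part of the benchmark.

```python
def f1_score(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (both on the same scale)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# (method, approach, recall, precision, listed F1), copied from the results above.
ROWS = [
    ("ChatGPT", "With tools", 46.6, 34.7, 39.8),
    ("ChatGPT", "No tools", 42.3, 42.8, 42.5),
    ("Symphony", "Corti", 37.6, 54.3, 44.5),
    ("Gemini", "With tools", 27.1, 35.8, 39.8),
    ("CLH", "Corti", 26.5, 49.9, 34.6),
]


def find_f1_mismatches(rows, tol: float = 0.1):
    """Return (method, approach) for rows whose listed F1 is more than
    `tol` away from the F1 recomputed from listed recall and precision.
    The tolerance (an assumption) absorbs one-decimal rounding."""
    mismatches = []
    for method, approach, recall, precision, listed_f1 in rows:
        if abs(f1_score(recall, precision) - listed_f1) > tol:
            mismatches.append((method, approach))
    return mismatches


if __name__ == "__main__":
    for method, approach, recall, precision, listed_f1 in ROWS:
        recomputed = f1_score(recall, precision)
        print(f"{method} ({approach}): listed {listed_f1}, recomputed {recomputed:.1f}")
    print("Rows outside tolerance:", find_f1_mismatches(ROWS))
```

With the numbers as listed, only the Gemini row falls outside the tolerance: its recomputed F1 is about 30.8 against a listed 39.8. That could indicate a transcription error in one of the three values, or a different averaging scheme than the one assumed here.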