Full code system evaluation on Corti proprietary AMB
[Chart: Recall, Precision, and F1 Score by method. Highlighted value: Recall 77 (ChatGPT), Mar 31, 2026.]
Evaluation Results
Method     Approach     Date      Recall   Precision   F1 Score
ChatGPT    No tools     2026.03   77       47.6        58.8
ChatGPT    With tools   2026.03   70.4     50.8        59
Symphony   Corti        2026.03   64.4     66          65.2
CLH        Corti        2026.03   51.1     62          56
Gemini     With tools   2026.03   42.6     67.8        52.3
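
The page does not say how the F1 Score column is derived. The short Python sketch below assumes the usual definition, the harmonic mean of precision and recall, and checks whether that assumption reproduces the reported values from the Recall and Precision figures listed above. The names `ROWS`, `f1_score`, and `check_rows` are hypothetical helpers introduced here for illustration, not part of the benchmark itself.

```python
# Scores copied from the results above:
# (method, approach, recall, precision, reported F1), all in percent.
ROWS = [
    ("ChatGPT", "No tools", 77.0, 47.6, 58.8),
    ("ChatGPT", "With tools", 70.4, 50.8, 59.0),
    ("Symphony", "Corti", 64.4, 66.0, 65.2),
    ("CLH", "Corti", 51.1, 62.0, 56.0),
    ("Gemini", "With tools", 42.6, 67.8, 52.3),
]


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def check_rows(rows, tolerance=0.05):
    """Compare the computed F1 with the reported F1 for each row.

    Returns tuples of (method, approach, computed F1, reported F1, matches).
    The tolerance of 0.05 allows for rounding to one decimal place.
    """
    results = []
    for method, approach, recall, precision, reported in rows:
        computed = f1_score(precision, recall)
        matches = abs(computed - reported) <= tolerance
        results.append((method, approach, round(computed, 1), reported, matches))
    return results


if __name__ == "__main__":
    for method, approach, computed, reported, matches in check_rows(ROWS):
        status = "ok" if matches else "MISMATCH"
        print(f"{method:<9} {approach:<11} computed={computed:<5} reported={reported:<5} {status}")
```

Under this assumption every row agrees with its reported F1 to within rounding, which suggests the column is the plain harmonic mean of the listed precision and recall. It also shows why Symphony has the highest F1 despite having neither the highest recall nor the highest precision: the harmonic mean rewards balance between the two.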