Temporal Reasoning on When2Call
[Chart: Performance Score over time. Marked point: AutoAdapt, 100 (Mar 9, 2026).]
Updated 1mo ago
Evaluation Results
Method        Setting                     Date      Performance Score
AutoAdapt                                 2026.03   100
AutoAdapt     setting=Template Free...    2026.03   58.52
DS-Agent                                  2026.03   53
MLCopilot     setting=Template Free...    2026.03   52.3
AutoMLAgent   setting=Template Free...    2026.03   50.66
MLCopilot                                 2026.03   17
AutoMLAgent                               2026.03   11
DS-Agent      setting=Template Free...    2026.03   0