Terminal Task Execution on Terminal Bench medium-difficulty 2
[Figure: Accuracy by date. Top result: Jupiter-N, 52.7 (Apr 19, 2026).]
Evaluation Results
Method       Configuration                Date       Accuracy
Jupiter-N    Reasoning mode=on, Tem...    2026.04    52.7
Nemotron     Reasoning mode=on, Tem...    2026.04    43.6