Real-World Agent on PinchBench
Best Score: 90.5 (GPT-5.4)
[Chart: Best Score and Average Score, Mar 29, 2026]
Updated 19d ago
Evaluation Results
Method            Date      Best Score   Average Score
GPT-5.4           2026.03   90.5         81.6
KAT-Coder-V2      2026.03   88.7         81.9
Claude Opus 4.6   2026.03   87.4         82.3
MiniMax M2.7      2026.03   87.1         81.8
Gemini 3.1 Pro    2026.03   86.7         75.9
GLM-5             2026.03   86.4         80.3