Tool-Integrated Reasoning on TIR-Bench
[Chart: Score by date. Leading entry: DeepEyesV2-RL, Score 20.8, Nov 7, 2025]
Updated 1mo ago
Evaluation Results
Method | Settings | Date | Score
DeepEyesV2-RL | Learning Stage=RL, Inf... | 2025.11 | 20.8
DeepEyesV2-SFT | Learning Stage=SFT, In... | 2025.11 | 18.7
DeepEyes | Inference Mode=Zero-shot | 2025.11 | 17.3
Qwen2.5-VL 7B | Learning Stage=Base, I... | 2025.11 | 16