
Tool-Integrated Reasoning on TIR-Bench

Score: 20.8

DeepEyesV2-RL

Nov 7, 2025
Updated 1mo ago

Evaluation Results

Date      Score   Method   Links
2025.11   20.8
2025.11   18.7
2025.11   17.3
2025.11   16