Agentic Reasoning on VisualProbe Hard
[Chart: Accuracy over time. Top result: DeepEyes-7B, 75.9 accuracy, Nov 25, 2025]
Evaluation Results
Method          Setting                    Date      Accuracy
DeepEyes-7B     Activation Replay=true     2025.11   75.9
DeepEyes-7B     Activation Replay=false    2025.11   75.4
Qwen2.5-VL-7B   Activation Replay=false    2025.11   67.9