TreeBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Visual Grounded Reasoning | TreeBench | Overall Score: 54.8 | 153 |
| Perception | TreeBench | Overall Accuracy: 55.3 | 17 |
| Perception | TWI-oriented TreeBench online setting | Accuracy: 70.5 | 16 |
| Reasoning | TWI-oriented TreeBench online setting | Accuracy: 37.1 | 16 |
| Perception | TWI-oriented TreeBench (offline) | Accuracy: 71.1 | 12 |
| Reasoning | TreeBench TWI-oriented (offline) | Accuracy: 37.1 | 12 |
| Visual Perception | TreeBench | Score: 45.2 | 9 |
| Visual Grounding | TreeBench | Error Rate: 24.6 | 6 |