
Tangram

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Two-piece spatial reasoning | Tangram | Pos IoU 34 | 6
spatial reasoning | Tangram one-piece | Position IoU 44.3 | 6
Action Coreference Tracking | Tangram 5 utterance length Scone-derived (test) | Accuracy 57.92 | 4
Action Coreference Tracking | Tangram 3 utterance length Scone-derived (test) | Accuracy 68.5 | 4