CANVAS

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Reasoning over Large Structured Context | Canvas | ReasoningJudge Score: 4.96 | 4
Robot Navigation | CANVAS | Gallery Miss Rate: 33.3 | 3