
Our Bench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multimodal Spatial Reasoning | Our-Bench SpatialTree-Bench 1.0 (test) | Average Score: 57.8 | 16 |
| Spoken Question Answering | Our Bench | Accuracy: 76.34 | 8 |
| Compositional Generation | Our Bench | CLIP Score: 32.33 | 6 |
| Layout-based generation | Our Bench Layout only | F1 Score: 44 | 5 |
| Layout-based generation | Our Bench Layout + Reference | F1 Score: 35 | 4 |