SOTA In-the-wild model generalization benchmarks and papers with code | Wizwand
In-the-wild model generalization
Benchmarks
| Dataset Name | SOTA Method | Metric | Results | Last Updated |
| --- | --- | --- | --- | --- |
| Human Bench Average | Qwen3VL-2B | NSE Score: 57.9 | 14 | 1mo ago |
| Human Bench Text-based Demo | Qwen3VL-32B | NSE: 23.4 | 14 | 1mo ago |
| Human Bench Vision-based Demo | ProgressLM-RL-3B | NSE: 15.5 | 14 | 1mo ago |