Behavior video labeling on Mouse behavior video dataset 2026 (test)
[Chart: Macro F1 by method; top result: Qwen3, 54.5 (Feb 19, 2026). Selectable metrics: Macro F1, Balanced Accuracy, MCC.]
Updated 4d ago
Evaluation Results
Method         Details                   Date     Macro F1  Balanced Accuracy  MCC
Qwen3          ICL Setting=AR-ICL        2026.02  54.5      80.1               0.517
Qwen3          ICL Setting=Standard ICL  2026.02  49.2      78.2               0.451
Qwen3          ICL Setting=Temporal ICL  2026.02  49        77.7               0.424
Qwen3          ICL Setting=No ICL        2026.02  37        36.5               0.13
DINO Baseline  Backbone=DINO             2026.02  20.3      44.1               0.128