Commonsense Reasoning on HellaSwag (Accuracy, Normalized Accuracy)
[Chart: Accuracy / Normalized Accuracy. Highlighted point: AIRAhybrid-D, Accuracy 44.3, May 15, 2026]
Updated 16d ago
Evaluation Results
| Method | Settings | Date | Accuracy | Normalized Accuracy |
|---|---|---|---|---|
| AIRAhybrid-D | Shot=0-shot, Architect... | 2026.05 | 44.3 | 57.6 |
| Nemotron-2 Approx. | Shot=0-shot | 2026.05 | 44.2 | 57.2 |
| AIRAhybrid-B | Shot=0-shot, Architect... | 2026.05 | 44.2 | 57.6 |
| AIRAhybrid-D | Shot=0-shot, Architect... | 2026.05 | 44.1 | 57.8 |
| AIRAhybrid-E | Shot=0-shot, Architect... | 2026.05 | 44.1 | 57.1 |
| AIRAhybrid-A | Shot=0-shot, Architect... | 2026.05 | 44 | 56.6 |
| AIRAhybrid-B | Shot=0-shot, Architect... | 2026.05 | 44 | 57.1 |
| Composer (2Mb-M-3A) | Shot=0-shot | 2026.05 | 43.9 | 56.9 |
| AIRAhybrid-E | Shot=0-shot, Architect... | 2026.05 | 43.5 | 56.5 |
| Nemotron-H Approx. | Shot=0-shot | 2026.05 | 43.4 | 56.3 |
| AIRAhybrid-C | Shot=0-shot, Architect... | 2026.05 | 43.3 | 56.1 |
| Mamba (Mb + M) | Shot=0-shot | 2026.05 | 43.2 | 55.8 |