Question Answering on ARC-Challenge (Normalized Accuracy)
[Chart: Normalized Accuracy over time, x-axis from May 29, 2024 to Jan 30, 2026. Top result: CALDERA, 46.93]
Updated 3d ago
Evaluation Results
| Method | Details | Date | Normalized Accuracy |
|---|---|---|---|
| CALDERA | Backbone=Mistral 7B, R... | 2024.05 | 46.93 |
| CALDERA | Backbone=Mistral 7B, R... | 2024.05 | 46.59 |
| QuIP# | Backbone=Mistral 7B, R... | 2024.05 | 44.8 |
| SpanNorm | Param=A2.4B-16B, Token... | 2026.01 | 38.6 |
| PreNorm | Param=A2.4B-16B, Token... | 2026.01 | 37 |
| SpanNorm | Param=5B, Tokens=200B,... | 2026.01 | 36.3 |
| PreNorm | Param=5B, Tokens=200B,... | 2026.01 | 34.4 |
| SpanNorm | Param=1.3B, Tokens=100... | 2026.01 | 28.8 |
| PreNorm | Param=1.3B, Tokens=100... | 2026.01 | 26.8 |
| PreNorm | Param=740M, Tokens=30B... | 2026.01 | 25.3 |
| SpanNorm | Param=740M, Tokens=30B... | 2026.01 | 25.1 |
| Autoregressive | Alpha (α)=1, Data repe... | 2025.12 | 5.9 |
| Dual | Alpha (α)=63/64, Data... | 2025.12 | 5.7 |
| Autoregressive | Alpha (α)=1, Data repe... | 2025.12 | 5 |
| Dual | Alpha (α)=3/4, Data re... | 2025.12 | 3.3 |
| Dual | Alpha (α)=1/8, Data re... | 2025.12 | 1.7 |
| Autoregressive | Alpha (α)=1, Data repe... | 2025.12 | -1 |