Reasoning on ARC Easy 25-shot
[Chart: Normalized Log Accuracy; highlighted entry HATified-SFT, 92 (Mar 16, 2026)]
Updated 1mo ago
Evaluation Results
Method                      Settings                   Date     Normalized Log Accuracy  Compression
HATified-SFT                shots=25-shot              2026.03  92                       -
Llama-Instruct              shots=25-shot              2026.03  91.1                     -
HATified                    shot count=25-shot, DP...  2026.03  89.7                     5.53
T-Free                      shot count=25-shot, DP...  2026.03  89.4                     5.53
Llama-3.1-8B-TFree-HAT-SFT  shots=25, SFT=true         2026.03  88.9                     5.53
Llama                       shot count=25-shot, DP...  2026.03  88                       4.94
Tülu                        shot count=25-shot, DP...  2026.03  81.6                     4.94
Llama-3.1-Tulu-3-8B-SFT     shots=25, SFT=true         2026.03  79.9                     4.94