Multi-task evaluation: average accuracy on GSM8K, HumanEval, and ARC-c
[Chart: Accuracy by date. Best result: ReMix, 60.77 (Mar 10, 2026).]
Updated 1mo ago
Evaluation Results
Method         Details                    Date     Accuracy
ReMix          Type=Mixture, Params=0...  2026.03  60.77
rsLoRA         Type=Weight Modulation...  2026.03  57.95
MixLoRA        Type=Mixture, Params=0...  2026.03  57.43
DoRA           Type=Weight Modulation...  2026.03  56.61
LoRA           Type=Weight Modulation...  2026.03  56.36
HydraLoRA      Type=Mixture, Params=0...  2026.03  55.1
Few-Shot       Type=No Tuning, Params...  2026.03  51.66
P-Tuning       Type=Prefix Injection,...  2026.03  34.89
VB-LoRA        Type=Mixture, Params=0...  2026.03  29.09
(IA)3          Type=Weight Modulation...  2026.03  21.02
Prompt Tuning  Type=Prefix Injection,...  2026.03  18.22
Zero-Shot      Type=No Tuning, Params...  2026.03  13.41
Prefix Tuning  Type=Prefix Injection,...  2026.03  10.37
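
To make the gaps between methods easier to read, the following is a minimal Python sketch that ranks the accuracies listed above and reports each method's margin over plain LoRA. The accuracy values are copied from the results; the choice of LoRA as the reference point and the helper name `margins` are assumptions made for illustration, not part of the benchmark itself.

```python
# Accuracy values (percent) copied from the evaluation results above.
scores = {
    "ReMix": 60.77,
    "rsLoRA": 57.95,
    "MixLoRA": 57.43,
    "DoRA": 56.61,
    "LoRA": 56.36,
    "HydraLoRA": 55.1,
    "Few-Shot": 51.66,
    "P-Tuning": 34.89,
    "VB-LoRA": 29.09,
    "(IA)3": 21.02,
    "Prompt Tuning": 18.22,
    "Zero-Shot": 13.41,
    "Prefix Tuning": 10.37,
}


def margins(scores, baseline):
    """Return each method's accuracy minus the baseline's, in points.

    `baseline` is a hypothetical reference choice (LoRA here), not
    something the leaderboard itself designates.
    """
    base = scores[baseline]
    return {name: round(acc - base, 2) for name, acc in scores.items()}


# Rank methods from highest to lowest accuracy.
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
vs_lora = margins(scores, "LoRA")

for name, acc in ranked:
    print(f"{name:<14} {acc:6.2f}  {vs_lora[name]:+6.2f}")
```

Read this way, ReMix leads plain LoRA by 4.41 points and the runner-up, rsLoRA, by 2.82 points.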