Question Answering on SQuAD (ROUGE-Lsum)
[Chart: ROUGE-Lsum by date. Labeled point: SCALENET (Layer-wise), 83.06; date axis label: Feb 10, 2026.]
Updated 4d ago
Evaluation Results
Method                 Configuration              Date     ROUGE-Lsum
SCALENET (Layer-wise)  LLM Backbone=Qwen4B-In...  2026.02  83.06
SCALENET (Layer-wise)  LLM Backbone=Qwen4B-In...  2026.02  79.58
Base Model             LLM Backbone=Qwen4B-In...  2026.02  77.22
Naive TTA              LLM Backbone=Qwen4B-In...  2026.02  76.57
Naive TTA              LLM Backbone=Qwen4B-In...  2026.02  76.34
SCALENET (Layer-wise)  LLM Backbone=Llama3B-I...  2026.02  73.53
SCALENET (Layer-wise)  LLM Backbone=Llama3B-I...  2026.02  72.38
SCALENET (Step-wise)   LLM Backbone=Llama3B-I...  2026.02  71.33
SCALENET (Step-wise)   LLM Backbone=Llama3B-I...  2026.02  71.14
Base Model             LLM Backbone=Llama3B-I...  2026.02  70.8
Naive TTA              LLM Backbone=Llama3B-I...  2026.02  70.58
Naive TTA              LLM Backbone=Llama3B-I...  2026.02  65.32
SCALENET (Step-wise)   LLM Backbone=Qwen4B-In...  2026.02  54.02
SCALENET (Step-wise)   LLM Backbone=Qwen4B-In...  2026.02  53.61