Question Answering on NQ-Open (ROUGE-Lsum)
[Chart: ROUGE-Lsum by date. Marked point: Base Model, 0.2766, Feb 10, 2026]
Updated 7d ago
Evaluation Results
Method                  Details                      Date      ROUGE-Lsum
Base Model              LLM Backbone=Llama3B-I...    2026.02   0.2766
Naive TTA               LLM Backbone=Llama3B-I...    2026.02   0.2722
SCALENET (Step-wise)    LLM Backbone=Llama3B-I...    2026.02   0.2674
SCALENET (Step-wise)    LLM Backbone=Llama3B-I...    2026.02   0.2662
SCALENET (Layer-wise)   LLM Backbone=Llama3B-I...    2026.02   0.2507
SCALENET (Layer-wise)   LLM Backbone=Llama3B-I...    2026.02   0.2398
SCALENET (Layer-wise)   LLM Backbone=Qwen4B-In...    2026.02   0.1706
SCALENET (Layer-wise)   LLM Backbone=Qwen4B-In...    2026.02   0.1513
Naive TTA               LLM Backbone=Qwen4B-In...    2026.02   0.1495
Naive TTA               LLM Backbone=Qwen4B-In...    2026.02   0.1424
Base Model              LLM Backbone=Qwen4B-In...    2026.02   0.132
SCALENET (Step-wise)    LLM Backbone=Qwen4B-In...    2026.02   0.1041
SCALENET (Step-wise)    LLM Backbone=Qwen4B-In...    2026.02   0.0866
Naive TTA               LLM Backbone=Llama3B-I...    2026.02   0.0288
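
For readers unfamiliar with the metric in the last column, the following is a minimal sketch of an LCS-based ROUGE-L F-measure between a reference answer and a predicted answer. It is an illustration under stated assumptions, not the scoring code behind these results: the page does not say which implementation, tokenizer, or aggregation was used, so the lowercase whitespace tokenization and the function names `lcs_length` and `rouge_l_f1` are hypothetical. ROUGE-Lsum additionally splits texts into sentences on newlines before matching; for single-line answers the two variants are generally expected to give the same score.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)  # DP row for the previous token of `a`
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, prediction):
    """ROUGE-L F-measure on whitespace tokens (assumed tokenization)."""
    ref = reference.lower().split()
    pred = prediction.lower().split()
    if not ref or not pred:
        return 0.0
    lcs = lcs_length(ref, pred)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred)  # share of predicted tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```

A benchmark score such as those listed above would typically be an average of per-example values like `rouge_l_f1(gold_answer, model_answer)` over the evaluation set, though the exact aggregation used here is not stated on the page.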