Summarization on SummaryBench R&J
[Chart: Structural Score. Highlighted point: Direct, 100 (May 14, 2026). Selectable metrics: Structural Score, Semantic Score.]
Updated 19d ago
Evaluation Results
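| Method    | Details                   | Date    | Structural Score | Semantic Score |
|-----------|---------------------------|---------|------------------|----------------|
| Direct    | LLM Backend=GPT-5.4 mini  | 2026.05 | 100              | 43.3           |
| APWA      | LLM Backend=GPT-5.4 mini  | 2026.05 | 95.4             | 42.4           |
| MegaAgent | LLM Backend=GPT-4.1 mini  | 2026.05 | 14               | 4.3            |
| Direct    |                           | 2026.05 | 1                | 0.433          |
| APWA      | Config=5.4×mini           | 2026.05 | 1                | 0.528          |
| APWA      | Config=mini×mini          | 2026.05 | 0.954            | 0.424          |
| APWA      | Config=mini×nano          | 2026.05 | 0.95             | 0.426          |
| APWA      | Config=5.4×nano           | 2026.05 | 0.943            | 0.439          |
| MegaAgent |                           | 2026.05 | 0.14             | 0.043          |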