Rebuttal Generation on RESEARCHARCADE
[Chart: best SBERT Similarity 70.3 (LLM), Nov 27, 2025. Metrics: SBERT Similarity, Rouge-L, GPT-4o-mini Score]
Updated 4d ago
Evaluation Results
Method  Configuration              Date     SBERT Similarity  Rouge-L  GPT-4o-mini Score
LLM     Backbone=GPTOSS-120B,...   2025.11  70.3              15.2     88.4
LLM     Backbone=Qwen3-8B, Tra...  2025.11  70                15.4     20.8
LLM     Backbone=Qwen3-0.6B, T...  2025.11  63.8              13.1     2.2
LLM     Backbone=Qwen3-0.6B, T...  2025.11  60.4              12.5     1.1