
Pairwise Comparison on SummEval (anchor set)

94.5 Accuracy

GPT-4o

[Chart: Accuracy; y-axis ticks 85.66, 87.955, 90.25, 92.545; x-axis Feb 17, 2025]
Updated 4d ago

Evaluation Results

Method | Links
2025.02
94.5
93.4
2025.02
92
91.1
2025.02
87.4
2025.02
86