Multi-agent Reasoning on Ultrafeedback

73.66 Accuracy

OW-L

[Chart: Accuracy, Oct 1, 2025]
Updated 14d ago

Evaluation Results

Date      Accuracy   Method   Links
2025.10   73.66
2025.10   73.66
2025.10   73.26
2025.10   73.14
2025.10   72.44
2025.10   72.44
2025.10   72.21
2025.10   71.18
2025.10   70.23