General Chat on WildBench 2025 (test)

1,062.4 WB-Elo

SR-GRPO

Updated 4mo ago

Evaluation Results

Method	Links	WB-Elo
SR-GRPO 2025.12		1,062.4
RM 2025.12		1,043.3
Self-Reward 2025.12		1,041.2
Perplexity 2025.12		1,040.9
IPO 2025.12		1,037.7
Base 2025.12		1,036.2
SR-GRPO 2025.12		932.5
IPO 2025.12		922.4
Self-Reward 2025.12		919.2
RM 2025.12		918.4
Perplexity 2025.12		917
Base 2025.12		913.5