
General Reasoning on Open-Platypus (test)

Accuracy: 78.06

Latent-GRPO

Updated 4mo ago

Evaluation Results

Method	Links	Accuracy
Latent-GRPO 2026.01		78.06	1,632.52
LLM-as-Judge 2026.01		65.21	3,522.18
Latent-GRPO 2026.01		64.82	1,218.92
LLM-as-Judge 2026.01		56.69	2,573.41
Latent-GRPO 2026.01		40.56	1,079.27
LLM-as-Judge 2026.01		34.45	1,937.82