
Multi-Task Reasoning on MMLU-Pro

69.3 Pass@1

Composition-RL

Updated 2mo ago

Evaluation Results

Method	Date	Pass@1
Composition-RL	2026.02	69.3
Standard RLVR	2026.02	67.2
Composition-RL	2026.02	64.6
Composition-RL	2026.02	64.5
Composition-RL	2026.02	64.5
Composition-RL	2026.02	63.8
Standard RLVR	2026.02	62.6
Standard RLVR	2026.02	62.6
Composition-RL	2026.02	61.4
Standard RLVR	2026.02	58.6
Before RL (Base)	2026.05	47.21
GT-Reward	2026.05	43.17
VIGOR	2026.05	43.09
INTUITOR	2026.05	43.04
GT-Reward	2026.05	38.17
Before RL (Base)	2026.05	36.92
VIGOR	2026.05	32.65
INTUITOR	2026.05	24.48