
Reasoning on LiveBench (test)

Accuracy: 18.15

MoE w/ optimal AR

[Chart: Accuracy (axis range approx. 16.70 to 17.83); Jun 13, 2025]
Updated 15d ago

Evaluation Results

Date      Accuracy
2025.06   18.15
2025.06   16.82
2025.06   16.76