
Knowledge Reasoning on SuperGPQA (Accuracy)

Accuracy: 51

Qwen2.5-32B-Instruct + Bootcamp-SFT-RL

Aug 12, 2025
Updated 13d ago

Evaluation Results

Date       Accuracy   Method   Links
2025.08    51
2025.08    48.7
2025.08    48.5
2025.08    45.9
2025.08    39.3
2025.08    38