Physics Reasoning on QFT Synthetic Hard

Accuracy: 44.5

Claude-Opus-4.5

Updated 3mo ago

Evaluation Results

Method	Links	Accuracy	
Claude-Opus-4.5 2026.04		44.5	62.5
Gemini-2.5-flash 2026.04		30.2	57.5
OSS-20b 2026.04		22.2	53.8
Qwen3-4B-Thinking-2507 2026.04		3.2	7.5
Qwen3-4B-Instruct-2507 2026.04		2.5	6.2
DeepSeek-R1-Distill-Qwen-7B 2026.04		0	0