Complex Reasoning on Seal-0

53.4 Accuracy (Seal-0)

Claude-4.5-Sonnet

Updated 3mo ago

Evaluation Results

Method	Links	Accuracy (Seal-0)
Claude-4.5-Sonnet 2026.04		53.4
OpenAI-GPT-5-high 2026.04		51.4
LiteResearcher-4B 2026.04		41.8
AgentCPM-Explore-4B 2026.04		40.5
Mirothinker 8B 2026.04		40.4
DeepSeek-V3.2 2026.04		38.5
Kimi-Researcher 2026.04		36
Kimi-K2-0905 2026.04		25.2