Morphology Reasoning on Time-MMD Energy (test)
[Chart: Accuracy over time. Best result: GPT-5.2, 84.12 (May 28, 2026).]
Evaluation Results

| Method | Setting | Links (date) | Accuracy |
|---|---|---|---|
| GPT-5.2 | Horizon=96, Category=A... | 2026.05 | 84.12 |
| DeepSeek-R1 | Horizon=96, Category=A... | 2026.05 | 79.51 |
| KAIROSAGENT-4B (+ Turn-Level Reward RL) | Horizon=96, Category=C... | 2026.05 | 50.47 |
| KAIROSAGENT-4B (SFT-Only) | Horizon=96, Category=C... | 2026.05 | 45.21 |
| KAIROSAGENT-4B (+ Outcome-Level Reward RL) | Horizon=96, Category=C... | 2026.05 | 43.33 |
| Llama-3.1-8B-Instruct | Horizon=96, Category=C... | 2026.05 | 38.72 |
| DeepSeek-R1-Distill-Qwen-7B | Horizon=96, Category=C... | 2026.05 | 5.08 |