Multi-step Reasoning on MuSR
[Chart: Murder Mystery Score over time. Top result: 57.36, n=64 (RL), May 8, 2026. Available metrics: Murder Mystery Score, Object Placements Score, Team Allocations Score.]
Updated 22d ago
Evaluation Results
| Method | Date | Murder Mystery Score | Object Placements Score | Team Allocations Score |
| --- | --- | --- | --- | --- |
| n=64 (RL) | 2026.05 | 57.36 | 45.98 | 38.57 |
| n=32 (RL) | 2026.05 | 56.94 | 47.08 | 39.07 |
| Vanilla RL | 2026.05 | 53.15 | 46.23 | 23.46 |
| Base Instruct | 2026.05 | 50.5 | 45.25 | 25.7 |