Long-term state poisoning evaluation on OpenClaw Log Replay
Harm Score (HS): 4.34 (Kimi k2.5), May 7, 2026. Updated 23d ago.
Evaluation Results
Method            Date      Harm Score (HS)
Kimi k2.5         2026.05   4.34
Grok-1            2026.05   4.8
GPT-4o            2026.05   5.03
MiniMax-abab6.5   2026.05   5.73