Claw-Eval

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Real-World Agent | Claw-Eval | Average Score: 80.6 | 22
Personal Assistant Agent Performance | Claw-Eval general domain 0408 | Pass@3: 70.8 | 13