Customer Support

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Self-Healing Tool Routing | Customer Support Linear Pipeline S7: Triple Failure | LLM Calls: 9 | 3
Self-Healing Tool Routing | Customer Support Linear Pipeline S6: Both Notif Down | LLM Calls: 8 | 3
Self-Healing Tool Routing | Customer Support Linear Pipeline S5 Email Dies | LLM Calls: 5 | 3
Self-Healing Tool Routing | Customer Support Linear Pipeline S4: Risk ($15k) | LLM Calls: 4 | 3
Self-Healing Tool Routing | Customer Support Linear Pipeline S3: All Payment Down | LLM Calls: 0 | 3
Self-Healing Tool Routing | Customer Support Linear Pipeline S2 Stripe Down | LLM Calls: 5 | 3
Self-Healing Tool Routing | Customer Support Linear Pipeline S1: Happy Path | LLM Calls: 4 | 3