| Dataset Name | SOTA Method | Metric | Trend | | |
|---|---|---|---|---|---|
| Customer Support Linear Pipeline S7: Triple Failure | ReAct | LLM Calls: 9 | 3 | 1mo ago | |
| Customer Support Linear Pipeline S6: Both Notif Down | ReAct | LLM Calls: 8 | 3 | 1mo ago | |
| Customer Support Linear Pipeline S5 Email Dies | ReAct | LLM Calls: 5 | 3 | 1mo ago | |
| Customer Support Linear Pipeline S4: Risk ($15k) | ReAct | LLM Calls: 4 | 3 | 1mo ago | |
| Customer Support Linear Pipeline S3: All Payment Down | | LLM Calls: 0 | 3 | 1mo ago | |
| Customer Support Linear Pipeline S2 Stripe Down | ReAct | LLM Calls: 5 | 3 | 1mo ago | |
| Customer Support Linear Pipeline S1: Happy Path | ReAct | LLM Calls: 4 | 3 | 1mo ago | |