| AIOps Hard 2025 | | Accuracy 27.4 | | 17 | 28d ago |
| Fault injection Mid (unseen) | OpsLLM-32B | Accuracy (%) 79 | | 17 | 28d ago |
| SFT data Easy (seen) | OpsLLM-32B | Accuracy (%) 77.4 | | 17 | 28d ago |
| Petshop temporal traffic scenario | traversal | Recall@1 100 | | 16 | 22d ago |
| Petshop high traffic scenario | corr | Recall@1 83 | | 16 | 22d ago |
| Petshop low traffic scenario | PRIM-FT | Recall@1 88 | | 16 | 22d ago |
| CausRCA Probe (Sub) | PRIM-FT | MAP@3 45 | | 15 | 22d ago |
| CausRCA Hydraulics (Sub) | | MAP@3 98 | | 15 | 22d ago |
| CausRCA Coolant (Sub) | | MAP@3 100 | | 15 | 22d ago |
| CausRCA Probe (Full) | ε-Diag | MAP@3 6 | | 14 | 22d ago |
| CausRCA Hydraulics (Full) | PRIM-FT | MAP@3 66 | | 14 | 22d ago |
| CausRCA Coolant (Full) | PRIM-FT | MAP@3 63 | | 14 | 22d ago |
| OB (OnlineBoutique) (test) | MARLIN | PR@1 61.1 | | 9 | 2mo ago |
| RE3OB Online Boutique with code-level faults | PRISM | F1 Top-1 Accuracy 67 | | 9 | 3mo ago |
| RE2TT (Train Ticket with multimodal data) | PRISM | CPU Top-1 0.47 | | 9 | 3mo ago |
| RE3TT Train Ticket with code-level faults | PRISM | F1@1 1 | | 9 | 3mo ago |
| RCAEval Overall All nine datasets (RE1OB-RE3TT) 1.0 | PRISM | Top-1 Accuracy 68 | | 9 | 3mo ago |
| AIOps Product Review and Cloud Computing (test) | MATMCD | MAP@5 30 | | 9 | 3mo ago |
| PetShop (temporal traffic, availability) | traversal | Top-3 Recall 100 | | 8 | 22d ago |
| PetShop (temporal traffic, latency) | traversal | Top-3 Recall 100 | | 8 | 22d ago |
| PetShop high traffic availability | traversal | Top-3 Recall 100 | | 8 | 22d ago |
| PetShop high traffic, latency | circa | Top-3 Recall 100 | | 8 | 22d ago |
| PetShop low traffic, availability | traversal | Top-3 Recall 100 | | 8 | 22d ago |
| PetShop low traffic, latency | circa | Top-3 Recall 86 | | 8 | 22d ago |
| Petshop average across all scenarios | corr | Recall@1 65 | | 8 | 22d ago |