The PetShop Dataset -- Finding Causes of Performance Issues across Microservices

About

Identifying root causes for unexpected or undesirable behavior in complex systems is a prevalent challenge. This issue becomes especially crucial in modern cloud applications that employ numerous microservices. Although the machine learning and systems research communities have proposed various techniques to tackle this problem, there is currently a lack of standardized datasets for quantitative benchmarking. Consequently, research groups are compelled to create their own datasets for experimentation. This paper introduces a dataset specifically designed for evaluating root cause analyses in microservice-based applications. The dataset encompasses latency, requests, and availability metrics emitted in 5-minute intervals from a distributed application. In addition to normal operation metrics, the dataset includes 68 injected performance issues, which increase latency and reduce availability throughout the system. We showcase how this dataset can be used to evaluate the accuracy of a variety of methods spanning different causal and non-causal characterisations of the root cause analysis problem. We hope the new dataset, available at https://github.com/amazon-science/petshop-root-cause-analysis/, enables further development of techniques in this important area.
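
To make the evaluation idea concrete, here is a minimal sketch of top-k recall, the kind of accuracy measure reported for this dataset as Recall@1. It assumes the common definition: a method ranks candidate root causes for each injected issue, and an issue counts as a hit when the true cause appears among the first k candidates. The function name `recall_at_k`, the service names, and the rankings are hypothetical illustrations; they are not taken from the paper or the dataset.

```python
from typing import Sequence


def recall_at_k(
    rankings: Sequence[Sequence[str]],
    true_causes: Sequence[str],
    k: int = 1,
) -> float:
    """Fraction of issues whose true root cause is in the top-k ranked candidates."""
    if len(rankings) != len(true_causes):
        raise ValueError("rankings and true_causes must have the same length")
    if not rankings:
        raise ValueError("need at least one issue to evaluate")
    # An issue is a hit if its true cause is among the first k candidates.
    hits = sum(truth in ranked[:k] for ranked, truth in zip(rankings, true_causes))
    return hits / len(rankings)


# Hypothetical example: three injected issues; service names are made up.
rankings = [
    ["payment", "frontend", "search"],  # true cause ranked first
    ["frontend", "search", "payment"],  # true cause ranked second
    ["search", "payment", "frontend"],  # true cause ranked third
]
true_causes = ["payment", "search", "frontend"]

top1 = recall_at_k(rankings, true_causes, k=1)
top3 = recall_at_k(rankings, true_causes, k=3)
```

With these made-up rankings, only the first issue is a hit at k=1, while all three are hits at k=3. Reported scores are often the same fraction expressed as a percentage.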

Michaela Hardt, William R. Orchard, Patrick Blöbaum, Shiva Kasiviswanathan, Elke Kirschbaum • 2023

Related benchmarks

Task                 Dataset                                        Result                  Rank
Root Cause Analysis  Petshop high traffic scenario                  Recall@1: 83            16
Root Cause Analysis  Petshop temporal traffic scenario              Recall@1: 100           16
Root Cause Analysis  Petshop low traffic scenario                   Recall@1: 75            16
Root Cause Analysis  CausRCA Probe (Sub)                            MAP@3: 29               15
Root Cause Analysis  CausRCA Coolant (Sub)                          MAP@3: 41               15
Root Cause Analysis  CausRCA Hydraulics (Sub)                       MAP@3: 35               15
Root Cause Analysis  CausRCA Probe (Full)                           MAP@3: 6                14
Root Cause Analysis  CausRCA Coolant (Full)                         MAP@3: 20               14
Root Cause Analysis  CausRCA Hydraulics (Full)                      MAP@3: 17               14
Root Cause Analysis  RE3OB Online Boutique with code-level faults   F1 Top-1 Accuracy: 11   9
Showing 10 of 25 rows
