SCOPE: Tree-based Self-Correcting Online Log Parsing via Syntactic-Semantic Collaboration
About
Log parsing is a critical step for automated log analysis in complex systems. Traditional heuristic-based methods offer high efficiency but are limited in accuracy because they overlook semantic context. In contrast, recent LLM-based parsers improve accuracy via semantic understanding but incur high latency from frequent model calls. To address this, we propose SCOPE, the first self-correcting online log parsing method that integrates the strengths of both heuristic and LLM-based paradigms. SCOPE introduces a novel bi-directional tree structure that enables efficient template matching from both forward and reverse directions, resulting in a higher overall matching rate. Additionally, it adopts a two-stage syntactic-semantic collaboration framework: a lightweight NLP model first utilizes part-of-speech (POS) information for syntax-based matching, while the LLM is selectively invoked as a fallback to handle semantically complex cases when uncertainty remains. This design significantly reduces LLM API usage while maintaining high accuracy, achieving a balance between efficiency and effectiveness. Extensive evaluations on diverse benchmark datasets show that SCOPE outperforms state-of-the-art methods in both accuracy and efficiency. The implementation and datasets are publicly released to facilitate further research.
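To make the idea of matching from both directions with a selective fallback more concrete, here is a minimal sketch in Python. It is not the SCOPE implementation: `BidirectionalIndex`, `parse`, and `fake_llm` are hypothetical names, the index is a flat prefix/suffix lookup rather than the paper's tree, and the fallback is a stand-in for the POS and LLM stages. It only illustrates why a reverse lookup can recover templates whose leading token is a variable, and how cached templates can avoid repeated fallback calls.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Callable, Optional

WILDCARD = "<*>"


def matches(template: list[str], tokens: list[str]) -> bool:
    """True if tokens fit the template; WILDCARD matches any single token."""
    return len(template) == len(tokens) and all(
        t == WILDCARD or t == tok for t, tok in zip(template, tokens)
    )


class BidirectionalIndex:
    """Toy index (assumption): templates are reachable by their first
    or last `depth` tokens, grouped by token count."""

    def __init__(self, depth: int = 2):
        self.depth = depth
        self.forward = defaultdict(list)
        self.reverse = defaultdict(list)

    def add(self, template: str) -> None:
        toks = template.split()
        self.forward[(len(toks), tuple(toks[: self.depth]))].append(toks)
        self.reverse[(len(toks), tuple(toks[-self.depth:]))].append(toks)

    def lookup(self, log: str) -> Optional[tuple[str, str]]:
        toks = log.split()
        routes = (
            ("forward", self.forward, tuple(toks[: self.depth])),
            ("reverse", self.reverse, tuple(toks[-self.depth:])),
        )
        # Try the forward route first; use the reverse route only on a miss.
        for direction, table, key in routes:
            for template in table.get((len(toks), key), []):
                if matches(template, toks):
                    return " ".join(template), direction
        return None


def parse(log: str, index: BidirectionalIndex,
          fallback: Callable[[str], str]) -> tuple[str, str]:
    """Return (template, stage). The fallback runs only when both
    directions miss, and its answer is cached in the index."""
    hit = index.lookup(log)
    if hit is not None:
        return hit
    template = fallback(log)
    index.add(template)
    return template, "fallback"


# Hypothetical stand-in for the expensive semantic stage: it masks
# any token containing a digit and records how often it was called.
llm_calls: list[str] = []


def fake_llm(log: str) -> str:
    llm_calls.append(log)
    return " ".join(
        WILDCARD if any(c.isdigit() for c in tok) else tok for tok in log.split()
    )


index = BidirectionalIndex()
index.add("Connection closed by <*>")
index.add("<*> logged in")

results = [
    parse(line, index, fake_llm)
    for line in [
        "Connection closed by 10.0.0.5",  # found via the forward route
        "alice logged in",                # leading variable: reverse route
        "Disk 3 is full",                 # unseen: falls back once
        "Disk 7 is full",                 # reuses the cached template
    ]
]
for template, stage in results:
    print(f"{stage:8s} {template}")
print("fallback calls:", len(llm_calls))
```

In this toy run only one of the four lines reaches the fallback. The second line shows the case a forward-only lookup would miss, because its first token is a variable that does not appear literally in the stored template.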
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Log Parsing | Hadoop Loghub 2.0 | Global Accuracy (GA) | 98.8 | 6 |
| Log Parsing | Spark Loghub 2.0 | Global Accuracy (GA) | 100 | 6 |
| Log Parsing | HPC Loghub 2.0 | Global Accuracy (GA) | 97.2 | 6 |
| Log Parsing | Linux Loghub 2.0 | Global Accuracy (GA) | 92.5 | 6 |
| Log Parsing | OpenSSH Loghub 2.0 | Global Accuracy (GA) | 92.8 | 6 |
| Log Parsing | Mac Loghub 2.0 | Global Accuracy (GA) | 94.8 | 6 |
| Log Parsing | BGL Loghub 2.0 | Global Accuracy (GA) | 96.2 | 6 |
| Log Parsing | Thunderbird Loghub 2.0 | Global Accuracy (GA) | 90.8 | 6 |
| Log Parsing | Proxifier Loghub 2.0 | Global Accuracy (GA) | 100 | 6 |
| Log Parsing | HDFS Loghub 2.0 | Global Accuracy (GA) | 100 | 6 |