
Strikingness-Aware Evaluation for Temporal Knowledge Graph Reasoning

About

Temporal Knowledge Graph Reasoning (TKGR) aims to infer missing (especially future) events from historical data. Current TKGR evaluation weights all events uniformly, ignoring that most are trivial repetitions, and therefore overestimates models' true reasoning ability. The rare outstanding events, whose prediction demands deeper reasoning, should instead be distinguished and emphasized. To this end, we propose a strikingness-aware evaluation framework, which introduces a rule-based strikingness measuring framework (RSMF) to quantify an event's strikingness by comparing its expected occurrence with that of peer events derived from temporal rules. Strikingness is then integrated as a weighting factor into metrics such as weighted MRR and Hits@k. Experiments on four TKG benchmarks reveal that 1) all representative models perform worse as event strikingness increases; 2) path-based methods excel on low-strikingness events, while representation-based ones excel on high-strikingness events; and 3) the gains of an ensemble method we design stem from fitting trivial events rather than from improved reasoning. Our framework provides a more rigorous evaluation, refocusing the field on predicting outstanding events.
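To make the weighting idea concrete, here is a minimal sketch of how a per-event weight could enter MRR and Hits@k. It is an illustration under stated assumptions, not the paper's definition: the abstract does not give the exact formulas, so the normalization by the sum of weights, the function names `weighted_mrr` and `weighted_hits_at_k`, and the example strikingness scores are all hypothetical. How RSMF computes the scores is not shown here.

```python
from typing import Sequence


def _check(ranks: Sequence[int], weights: Sequence[float]) -> float:
    """Validate inputs and return the total weight."""
    if len(ranks) != len(weights):
        raise ValueError("ranks and weights must have the same length")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based and must be >= 1")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return total


def weighted_mrr(ranks: Sequence[int], weights: Sequence[float]) -> float:
    """Assumed form: sum_i(w_i / rank_i) / sum_i(w_i)."""
    total = _check(ranks, weights)
    return sum(w / r for r, w in zip(ranks, weights)) / total


def weighted_hits_at_k(ranks: Sequence[int], weights: Sequence[float], k: int) -> float:
    """Assumed form: weight of events ranked within top k / total weight."""
    total = _check(ranks, weights)
    return sum(w for r, w in zip(ranks, weights) if r <= k) / total


# Two trivial events ranked first, one striking event ranked tenth.
ranks = [1, 1, 10]
uniform = [1.0, 1.0, 1.0]          # conventional evaluation
strikingness = [0.1, 0.1, 0.8]     # hypothetical strikingness scores

print(weighted_mrr(ranks, uniform))                # 0.7
print(weighted_mrr(ranks, strikingness))           # about 0.28
print(weighted_hits_at_k(ranks, strikingness, 3))  # about 0.2
```

With uniform weights the sketch reduces to ordinary MRR and Hits@k. In the toy data, the model looks strong under uniform weighting (MRR 0.7) but weak once the striking event dominates the weight (about 0.28), which is the kind of gap the abstract describes.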

Rikui Huang, Shengzhe Zhang, Wei Wei • 2026

Related benchmarks

Task                                Dataset      Result           Rank
Temporal Knowledge Graph reasoning  ICEWS 18     Hits@10  0.607   78
Temporal Knowledge Graph reasoning  ICEWS 14     Hits@1   40.23   66
Temporal Knowledge Graph reasoning  ICEWS0515    MRR      58.53   14
