
Wiki

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Language Modeling | Wiki | Perplexity (PPL) 2 | 281 |
| Language Generation | Wiki | Perplexity 3.53 | 54 |
| eXtreme Multi-label Classification | Wiki 500K | P@1 81.26 | 30 |
| Temporal Knowledge Graph Reasoning | WIKI | MRR 0.4603 | 28 |
| Graph Clustering | Wiki | ARI 35.8 | 27 |
| Entity Linking | WIKI (test) | Micro F1 84.5 | 27 |
| Entity Disambiguation | WIKI (test) | Micro F1 89.2 | 24 |
| Node Classification | Wiki | Micro F1 0.5907 | 23 |
| Relation Classification | Wiki ZSL (test) | Precision (%) 71.54 | 22 |
| Node Classification | wiki (test) | Accuracy 65.13 | 22 |
| Probabilistic Forecasting | wiki | CRPS 0.214 | 21 |
| Macroscopic time series forecasting | Wiki | SMAPE 0.0362 | 20 |
| Temporal Knowledge Graph Reasoning | WIKI (meta-test) | MRR 33.5 | 19 |
| Time series forecasting | wiki (test) | CRPS 0.214 | 19 |
| Definition Modeling | Wiki | BLEU 62.07 | 18 |
| Temporal Point Process modeling | Wiki real-world (test) | Negative Log-Likelihood -1.3727 | 18 |
| Temporal Reasoning Prediction | WIKI (test) | Positive Performance 99.28 | 17 |
| Semantic Similarity | WIKI (test) | BLEU-4 55.52 | 17 |
| TKG reasoning | WIKI (test) | MRR 30.9 | 17 |
| Clustering | Wiki | Clustering Time (s) 0.12 | 16 |
| Relation Extraction | Wiki ZSL (test) | Micro-F1 53.71 | 16 |
| Link Prediction | WIKI | Hits@1 79.85 | 16 |
| Extractive Question Answering | Wiki (test) | EM 78.6 | 16 |
| Clustering | Wiki | F1 Score 51 | 16 |
| Text Segmentation | Wiki-50 | Pk 16.5 | 15 |
Showing 25 of 106 rows