
Wiki

Benchmarks

Task Name | Dataset Name | Metric | SOTA Result | Trend
Language Modeling | Wiki | Perplexity (PPL) | 2 | 298
Language Generation | Wiki | Perplexity | 3.53 | 54
Multivariate Time Series Forecasting | Wiki | MAE | 0.041 | 32
eXtreme Multi-label Classification | Wiki 500K | P@1 | 81.26 | 30
Temporal Knowledge Graph Reasoning | WIKI | MRR | 0.4603 | 28
Graph Clustering | Wiki | ARI | 35.8 | 27
Entity Linking | WIKI (test) | Micro F1 | 84.5 | 27
Probabilistic Forecasting | wiki | CRPS | 0.214 | 25
Entity Disambiguation | WIKI (test) | Micro F1 | 89.2 | 24
Clustering | Wiki | Accuracy | 59.67 | 23
Node Classification | Wiki | Micro F1 | 0.5907 | 23
Relation Classification | Wiki ZSL (test) | Precision (%) | 71.54 | 22
Node Classification | wiki (test) | Accuracy | 65.13 | 22
Macroscopic time series forecasting | Wiki | SMAPE | 0.0362 | 20
Temporal Knowledge Graph Reasoning | WIKI (meta-test) | MRR | 33.5 | 19
Time series forecasting | wiki (test) | CRPS | 0.214 | 19
Definition Modeling | Wiki | BLEU | 62.07 | 18
Temporal Point Process modeling | Wiki real-world (test) | Negative Log-Likelihood | -1.3727 | 18
Temporal Reasoning Prediction | WIKI (test) | Positive Performance | 99.28 | 17
Semantic Similarity | WIKI (test) | BLEU-4 | 55.52 | 17
TKG reasoning | WIKI (test) | MRR | 30.9 | 17
Clustering | Wiki | Clustering Time (s) | 0.12 | 16
Relation Extraction | Wiki ZSL (test) | Micro-F1 | 53.71 | 16
Link Prediction | WIKI | Hits@1 | 79.85 | 16
Extractive Question Answering | Wiki (test) | EM | 78.6 | 16
Showing 25 of 112 rows