
ArXiv

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Node Classification | Arxiv | Accuracy 83.78 | 219 |
| Summarization | arXiv (test) | ROUGE-1 64.16 | 161 |
| Language Modeling | ARXIV (test) | PPL 2.36 | 145 |
| Node Classification | arXiv-year | Accuracy 64.62 | 112 |
| Summarization | Arxiv | ROUGE-2 23.05 | 76 |
| Language Modeling | arXiv | Perplexity 2.46 | 55 |
| Node Classification | Arxiv | Clean Accuracy 66.83 | 52 |
| Membership Inference Attack | arXiv Pythia | ROC AUC 94 | 36 |
| Node Classification | Arxiv (test) | ASR 97.03 | 32 |
| Membership Inference Attack | ArXiv | AUC 85 | 32 |
| Node Classification | Arxiv Covariate shift (degree split) | OOD Accuracy 66.41 | 30 |
| Graph Backdoor Attack | Arxiv | ASR 97.03 | 28 |
| GNN training | ARXIV | Speedup 1.044 | 24 |
| Language Modeling | Arxiv (val) | Perplexity 18.22 | 24 |
| Summarization | ArXiv (test) | Completeness Score 5 | 24 |
| Long-document summarization | ArXiv (test) | ROUGE-2 Score 22.5 | 24 |
| Text Segmentation | arXiv | Pk 0.3733 | 22 |
| Node Classification | arxiv | ASR 99.96 | 21 |
| Rubric satisfaction evaluation | ArXiv | Claude-4 Sonnet Score 38.1 | 21 |
| Node unlearning | Arxiv | Average Runtime (s) 0.03 | 20 |
| Masked Language Modeling Fine-tuning | arXiv (fine-tuning) | MSE 7.92 | 20 |
| Node Classification | Arxiv Covariate shift time split | OOD Test Accuracy 66.47 | 20 |
| Abstractive Summarization | arXiv (test) | R-1 53.7 | 20 |
| Link Prediction | arXiv 14 (test) | AUC 93.66 | 20 |
| Node Classification | Arxiv (2018-2020) | Accuracy 60.78 | 18 |
Showing 25 of 143 rows