Web

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Link Prediction | Web (PB) (test) | Hits@50: 48.14 | 18 |
| Open Information Extraction | WEB | F1: 91.2 | 18 |
| GUI Navigation | Web-Single (test) | Type EM: 0.81 | 14 |
| Graph similarity ranking | WEB | Kendall's Tau: 0.963 | 14 |
| Causal Discovery | Web 2 | F1 Score: 0.68 | 11 |
| Causal Discovery | Web 1 | F1 Score: 71 | 11 |
| Density Estimation | Web (test) | Avg Log-Likelihood (nats): -27.87 | 9 |
| Prompt Injection Detection | Web Direct Prompt Injection | FPR: 0 | 7 |
| GUI Navigation | Web | Length: -0.14 | 6 |
| Semantic Segmentation | Web 8 | mIoU: 41.9 | 2 |
| Time Series Forecasting | Web Tr. (out-of-domain) | MSE: 1.393 | 2 |