
Pooled

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Anti-spoofing | Pooled | EER: 4.69 | 10 |
| Mathematical Reasoning | Pooled 5-benchmark set | Accuracy: 54.44 | 6 |
| Code Editing Copy-as-Decode Efficiency Analysis | Pooled (All) | Number of Cases: 482 | 1 |
| Competing Risks Survival Analysis | Pooled 4 datasets (10 splits x 2 events) | Metric: - | 0 |