
Combined

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Question Answering | Combined 7 Datasets | Average Score: 45 | 18 |
| All-in-One Image Restoration | Combined (Deraining, Desnowing, Dehazing) | PSNR: 34.02 | 13 |
| Bayesian neural network regression | Combined (test) | RMSE: 3.939 | 6 |
| Malicious Prompt Detection | Combined All Datasets (test) | ASR: 4.5 | 6 |
| Language Understanding and Reasoning | Combined (GSM8k, MATH500, MAWPS, SVAMP, AQuA, GLUE, CSQA, OBQA) | Average Score: 72.94 | 5 |
| Probabilistic Calibration | Combined 20K labeled samples | Brier Score: 0.0759 | 5 |
| Data-to-text generation | Combined | FE: 8.05 | 3 |
| Shadow Detection | Combined Dataset | Testing Time (hours): 0.55 | 3 |