
Mixed Dataset

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Uncertainty Quantification | Mixed Dataset (real and fake biographies) | ROC AUC 0.9001 | 32 |
| Idiomatic Translation | Mixed Dataset en-bn | LLM-eval Score 2.25 | 18 |
| Offline Reinforcement Learning | Mixed Dataset Aggregate | Normalized Reward 62.2 | 12 |
| Polyp Segmentation | Mixed Dataset | Dice 84.78 | 11 |
| Idiomatic Translation | Mixed Dataset en-te | LLM-eval Score 1.83 | 10 |
| Idiomatic Translation | Mixed Dataset en-ta | LLM-eval Score 1.87 | 10 |
| Idiomatic Translation | Mixed Dataset en-hi | LLM-eval Score 2.39 | 10 |