Mixed Dataset

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Uncertainty Quantification | Mixed Dataset (real and fake biographies) | ROC AUC: 0.9001 | 32 |
| Idiomatic Translation | Mixed Dataset en-bn | LLM-eval Score: 2.25 | 18 |
| Offline Reinforcement Learning | Mixed Dataset Aggregate | Normalized Reward: 62.2 | 12 |
| Polyp Segmentation | Mixed Dataset | Dice: 84.78 | 11 |
| Idiomatic Translation | Mixed Dataset en-te | LLM-eval Score: 1.83 | 10 |
| Idiomatic Translation | Mixed Dataset en-ta | LLM-eval Score: 1.87 | 10 |
| Idiomatic Translation | Mixed Dataset en-hi | LLM-eval Score: 2.39 | 10 |