
Mean

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Pixel-level manipulation detection | MEAN Across datasets | F1 Score 72.8 | 20 |
| Code Generation | Mean Across MBPP, CodeAlpacaPy, HumanEval, LiveCodeBench | Speedup 4.04 | 14 |
| Offline Reinforcement Learning | Mean Medium-Replay | Normalized Return 76.45 | 7 |
| Offline Reinforcement Learning | Mean Medium | Normalized Return 71.33 | 7 |
| Offline Reinforcement Learning | Mean Medium-Expert | Normalized Return 98.5 | 7 |
| Physically-based rendering | Mean All scenes | PSNR 31.8 | 4 |
| Mathematical Reasoning | Mean across benchmarks | Speedup 2.12 | 2 |