
Benchmark-side Pattern Shift

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Large Model Performance Prediction | Benchmark-side Pattern Shift Math | Average Score: 46.59 | 6 |
| Large Model Performance Prediction | Benchmark-side Pattern Shift Chinese | Average Score: 42.16 | 3 |
| Large Model Performance Prediction | Benchmark-side Pattern Shift OCR | Score Avg: 47.6 | 3 |