| Dataset Name | SOTA Method | Metric | Score | Trend | Updated |
|---|---|---|---|---|---|
| CodeXGLUE (test) | CodeT5-large | EM | 22.65 | 9 | 1mo ago |
| DTVBENCH | GPT-4o | Manim Score | 10.6 | 8 | 4d ago |
| ArtifactsBench | JANUSCODER-14B | Task Accuracy | 86 | 8 | 4d ago |
| PandasPlotBench | JANUSCODER-14B | Code Error Rate | 9.7 | 8 | 4d ago |
| mCoNaLa Average | ERNIE-Code | BLEU-4 | 5.71 | 7 | 1mo ago |
| mCoNaLa Russian | ERNIE-Code | BLEU-4 | 6.55 | 7 | 1mo ago |
| mCoNaLa Japanese | ERNIE-Code | BLEU-4 | 8.08 | 7 | 1mo ago |
| mCoNaLa Spanish | ERNIE-Code | BLEU-4 | 2.51 | 7 | 1mo ago |