PandasPlotBench

Benchmarks

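Task Name               | Dataset Name           | SOTA Result            | Trend
------------------------|------------------------|------------------------|------
Text-to-code generation | PandasPlotBench        | Code Error Rate: 9.7   | 8
Text-to-Visualization   | PandasPlotBench (test) | Code Exec Success: 79  | 3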