Code Generation on DS-1000
[Chart: Accuracy and RSR over time. Highlighted entry: EG-CFG, 69.9 Accuracy. Date shown: Jun 12, 2025.]
Updated 4d ago
Evaluation Results
| Method | Model | Date | Accuracy | RSR |
|---|---|---|---|---|
| EG-CFG | DeepSeek-V3-0324 | 2025.06 | 69.9 | 50.73 |
| CONLINE | GPT-4 | 2025.06 | 68 | - |
| Baseline LLM | GPT-4 | 2025.06 | 60.2 | - |
| Baseline LLM | GPT-4o | 2025.06 | 59.9 | - |
| SelfEvolve | GPT-3.5 Turbo | 2025.06 | 57.1 | - |
| Baseline LLM | Claude 3.5 Sonnet | 2025.06 | 54.3 | - |
| Baseline LLM | DeepSeek-Coder-V... | 2025.06 | 53.2 | - |
| Self Debugging | GPT-3.5 Turbo | 2025.06 | 53 | - |
| Baseline LLM | Qwen2-72B-Instruct | 2025.06 | 52.8 | - |
| DocPrompting | GPT-3.5 Turbo | 2025.06 | 45.5 | - |
| Baseline LLM | DeepSeek-V3-0324 | 2025.06 | 38.9 | 0 |