Code Generation on MBPP EvalPlus base

90.5 Pass@1

Claude 3.5 Sonnet

Updated 4mo ago

Evaluation Results

Method	Date	Pass@1
Claude 3.5 Sonnet	2024.07	90.5
Llama 3 405B	2024.07	88.6
GPT-4o	2024.07	87.8
Llama 3 70B	2024.07	86
GPT-4	2024.07	83.6
GPT-3.5 Turbo	2024.07	82
Mixtral 8x22B	2024.07	78.6
Llama 3 8B	2024.07	72.8
Nemotron 4 340B	2024.07	72.8
Gemma 2 9B	2024.07	71.7
Mistral 7B	2024.07	49.5