Code Generation on MBPP, CodeAlpacaPy, HumanEval, and LiveCodeBench (Composite)

4.04 Speedup

DEER

Updated 1mo ago

Evaluation Results

Method	Links	Speedup	(unlabeled)
DEER 2025.12		4.04	5.03
DEER 2025.12		2.98	4.82
DEER 2025.12		2.83	4.61
DEER 2025.12		2.77	4.61
DEER 2025.12		2.70	4.11
EAGLE3 2025.12		2.48	3.45
EAGLE3 2025.12		2.43	3.22
EAGLE3 2025.12		2.40	3.31
EAGLE3 2025.12		2.39	3.54
Hydra 2025.12		2.25	2.66
Hydra 2025.12		2.22	2.58
EAGLE3 2025.12		2.21	3.05
MEDUSA 2025.12		1.32	1.97
MEDUSA 2025.12		1.23	1.94