Coding on HumanEval+ (test)

67.7 Pass@1

Base

Updated 1mo ago

Evaluation Results

Method	Links	Pass@1
Base 2026.05		67.7
KL-SFT 2026.05		67.1
STM 2026.05		67.1
Low-SFT 2026.05		65.9
DFT 2026.05		65.9
Iter-SFT 2026.05		65.9
Anchored Learning 2026.05		64.6
Self-SFT 2026.05		64
SFT 2026.05		62.2
FReDA-4B 2026.06		59.76
FReDA-4B 2026.06		58.54
Qwen3-4B-Base 2026.06		57.93
TiDAR-8B 2026.06		55.49
TiDAR-8B 2026.06		52.44
BlockDiff-4B 2026.06		51.83
Dream-7B-Base 2026.06		50
LLaDA-MoE-7B-A1B-Base 2026.06		42.07
Qwen2.5-3B-Base 2026.06		36
LLaDA-8B-Base 2026.06		31.1