Automated paper-to-code reproduction on Paper2CodeBench 2024
Top score: 4.35, Score (ICLR 2024), paper2code + auto-plan & code optimized (Dec 2, 2025)
[Chart series: Score (ICLR 2024), Score (ICML 2024), Score (NeurIPS 2024), Average Score]
Updated 1mo ago
Evaluation Results
| Method | LLM | Date | Score (ICLR 2024) | Score (ICML 2024) | Score (NeurIPS 2024) | Average Score |
|---|---|---|---|---|---|---|
| paper2code + auto-plan & code optimized | GPT-4.1 | 2025.12 | 4.35 | 4.43 | 4.23 | 4.34 |
| paper2code + auto-code optimized | GPT-4.1 | 2025.12 | 4.28 | 4.39 | 4.18 | 4.29 |
| paper2code + self-refine in plan | GPT-4.1 | 2025.12 | 4.14 | 4.26 | 3.75 | 4.05 |
| paper2code + auto-plan optimized | GPT-4.1 | 2025.12 | 4.01 | 4.00 | 3.80 | 3.94 |
| paper2code | GPT-4.1 | 2025.12 | 3.85 | 4.09 | 3.60 | 3.84 |
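The page does not say how Average Score is derived. A minimal sketch, assuming it is the unweighted mean of the three per-venue scores, recomputes it from the figures in the table above. The names `ROWS`, `venue_mean`, and `compare` are hypothetical and used only for illustration.

```python
# Scores copied from the table above, per method:
# (ICLR 2024, ICML 2024, NeurIPS 2024, reported Average Score).
ROWS = {
    "paper2code + auto-plan & code optimized": (4.35, 4.43, 4.23, 4.34),
    "paper2code + auto-code optimized": (4.28, 4.39, 4.18, 4.29),
    "paper2code + self-refine in plan": (4.14, 4.26, 3.75, 4.05),
    "paper2code + auto-plan optimized": (4.01, 4.00, 3.80, 3.94),
    "paper2code": (3.85, 4.09, 3.60, 3.84),
}


def venue_mean(iclr, icml, neurips):
    """Unweighted mean of the three per-venue scores (assumed definition)."""
    return (iclr + icml + neurips) / 3


def compare(rows, tolerance=0.01):
    """Map each method to (recomputed mean, reported average, agrees within tolerance)."""
    result = {}
    for method, (iclr, icml, neurips, reported) in rows.items():
        recomputed = venue_mean(iclr, icml, neurips)
        agrees = abs(recomputed - reported) <= tolerance
        result[method] = (recomputed, reported, agrees)
    return result


if __name__ == "__main__":
    for method, (recomputed, reported, ok) in compare(ROWS).items():
        print(f"{method}: recomputed {recomputed:.3f}, reported {reported:.2f}, agrees={ok}")
```

Under this assumption every recomputed mean lands within 0.01 of the reported Average Score. Two rows (paper2code + auto-code optimized and paper2code) differ in the second decimal after rounding, which would be consistent with the averages being taken over unrounded per-venue scores, though the page does not confirm this.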