Website Generation Functional Correctness on WebGen-Bench
[Chart: Content Presentation Score over time. Highlighted point: INFINITEWEB, 0.915, Jan 7, 2026. Selectable metrics: Content Presentation Score, User Interaction Score, Data Management Score, Functional Execution Score, Data Display Score, Design Validation Score, Overall Score.]
Updated 1mo ago
Evaluation Results
| Method | Setting | Date | Content Presentation Score | User Interaction Score | Data Management Score | Functional Execution Score | Data Display Score | Design Validation Score | Overall Score |
|---|---|---|---|---|---|---|---|---|---|
| INFINITEWEB | average_runs=3 | 2026.01 | 0.915 | 0.838 | 0.827 | 0.809 | 0.941 | 0.828 | 0.856 |
| Codex | average_runs=3 | 2026.01 | 0.898 | 0.792 | 0.756 | 0.728 | 0.962 | 0.764 | 0.812 |
| Claude-Code | average_runs=3 | 2026.01 | 0.879 | 0.701 | 0.676 | 0.673 | 0.875 | 0.611 | 0.743 |
| Bolt.diy | average_runs=3 | 2026.01 | 0.832 | 0.599 | 0.628 | 0.584 | 0.841 | 0.417 | 0.67 |
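To compare the entries programmatically, the results above can be loaded into a small script. This is a minimal sketch, assuming the scores are transcribed by hand from the results on this page. The names `SCORES`, `rank_by_overall`, and `category_leaders` are illustrative and are not part of any WebGen-Bench tooling. The Overall Score is taken as reported rather than recomputed, since the page does not state how it is aggregated.

```python
# Scores transcribed from the leaderboard (each entry is an average of 3 runs).
# All names below are hypothetical helpers for illustration only.
CATEGORIES = [
    "Content Presentation",
    "User Interaction",
    "Data Management",
    "Functional Execution",
    "Data Display",
    "Design Validation",
]

# method -> (per-category scores in CATEGORIES order, reported overall score)
SCORES = {
    "INFINITEWEB": ([0.915, 0.838, 0.827, 0.809, 0.941, 0.828], 0.856),
    "Codex": ([0.898, 0.792, 0.756, 0.728, 0.962, 0.764], 0.812),
    "Claude-Code": ([0.879, 0.701, 0.676, 0.673, 0.875, 0.611], 0.743),
    "Bolt.diy": ([0.832, 0.599, 0.628, 0.584, 0.841, 0.417], 0.67),
}


def rank_by_overall(scores):
    """Return method names sorted by reported overall score, best first."""
    return sorted(scores, key=lambda m: scores[m][1], reverse=True)


def category_leaders(scores):
    """Return a dict mapping each category to the method with the top score."""
    leaders = {}
    for i, category in enumerate(CATEGORIES):
        leaders[category] = max(scores, key=lambda m: scores[m][0][i])
    return leaders


if __name__ == "__main__":
    for rank, method in enumerate(rank_by_overall(SCORES), start=1):
        print(f"{rank}. {method}: {SCORES[method][1]:.3f}")
    for category, method in category_leaders(SCORES).items():
        print(f"{category}: {method}")
```

Run on these values, the ranking by Overall Score follows the order of the rows above, and the only category not led by INFINITEWEB is Data Display, where Codex scores 0.962.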