CAD Generation on Hybrid Agentic Architecture CAD Evaluation Set
[Chart: Execution Success (R1). Highlighted point: Claude Sonnet 4.5, 97.5, May 19, 2026. Selectable metrics: Execution Success (R1), Meshing Success (R2), Simulation Success (R3).]
Updated 14d ago
Evaluation Results
Method             Date     Execution Success (R1)  Meshing Success (R2)  Simulation Success (R3)
Claude Sonnet 4.5  2026.05  97.5                    87.6                  84.5
Gemini 3 Pro       2026.05  97.5                    76.3                  86.8
Gemini 3 Flash     2026.05  97                      84.4                  85.1
Claude Opus 4.5    2026.05  96.4                    88.2                  81.3