Formulation design on Formulation design
[Chart: Validity and Success Rate by method. Top result: GPT-5, Validity 92.3, Apr 8, 2026.]
Updated 9d ago
Evaluation Results
Method             Backbone model  Date     Validity  Success Rate
GPT-5              GPT-5           2026.04  92.3      92.2
Claude-3.5         Claude-3.5      2026.04  88.6      82.6
Qwen3-14B w SciDC  Qwen3-1...      2026.04  75.5      68.3
Qwen3-4B w SciDC   Qwen3-4...      2026.04  71        43.4
Qwen3-4B w/o K     Qwen3-4...      2026.04  56.8      56.8
Qwen3-14B          Qwen3-14B       2026.04  50.9      50.4
Qwen3-4B           Qwen3-4B        2026.04  43.4      43
Qwen3-14B w/o K    Qwen3-1...      2026.04  18.4      18.4