Workflow evaluation for engineering design on Photonics2D
[Chart placeholder. Metrics: TC, CO. Labeled point: GPT-5-mini (TC 1), May 19, 2026.]
Updated 14d ago
Evaluation Results
| Method | Workflow Style | Date | TC | CO |
|---|---|---|---|---|
| GPT-5-mini | W-RAND | 2026.05 | 1 | 0.53 |
| GPT-5-mini | W-DISTRACT | 2026.05 | 1 | 0.52 |
| Gemini-3-Flash | W-RAND | 2026.05 | 1 | 0.6 |
| Gemini-3-Flash | W-DISTRACT | 2026.05 | 1 | 0.55 |
| Qwen3-4B | W-RAND | 2026.05 | 1 | 0.56 |
| Qwen3-4B | W-DISTRACT | 2026.05 | 1 | 0.52 |
| Qwen3.5-4B | W-RAND | 2026.05 | 1 | 0.55 |
| Qwen3.5-4B | W-DISTRACT | 2026.05 | 1 | 0.57 |
| Gemini-3-Flash | W-COND | 2026.05 | 0.53 | 0.54 |
| Qwen3.5-4B | W-COND | 2026.05 | 0.47 | 0.49 |
| GPT-5-mini | W-COND | 2026.05 | 0.4 | 0.47 |
| Qwen3-4B | W-COND | 2026.05 | 0.2 | 0.43 |