Explicit Attack on Toy
[Chart: Avg Queries (E) by date. Labeled point: o4-mini, 500, Feb 13, 2026.]
Updated 4d ago
Evaluation Results
| Method | System Setup | Date | Avg Queries (E) |
|---|---|---|---|
| o4-mini | System Setup=Orchestra... | 2026.02 | 500 |
| o4-mini | System Setup=Pure SQL... | 2026.02 | 42 |
| gpt-4.1 | System Setup=Orchestra... | 2026.02 | 39 |
| gem-2.5-flash | System Setup=Orchestra... | 2026.02 | 33 |
| gpt-4.1 | System Setup=Orchestra... | 2026.02 | 23 |
| o4-mini | System Setup=Orchestra... | 2026.02 | 16 |
| sonnet-4 | System Setup=Pure SQL... | 2026.02 | 15 |
| gpt-4.1 | System Setup=Pure SQL... | 2026.02 | 12 |
| sonnet-4 | System Setup=Orchestra... | 2026.02 | 12 |
| gpt-4.1 | System Setup=Orchestra... | 2026.02 | 10 |
| gem-2.5-flash | System Setup=Orchestra... | 2026.02 | 8 |
| gpt-4.1-mini | System Setup=Pure SQL... | 2026.02 | 6 |
| gpt-4.1-mini | System Setup=Orchestra... | 2026.02 | 6 |
| gpt-4.1-mini | System Setup=Orchestra... | 2026.02 | 6 |
| gpt-4.1-mini | System Setup=Orchestra... | 2026.02 | 4 |
| gem-2.5-flash | System Setup=Pure SQL... | 2026.02 | 4 |
| gem-2.5-flash | System Setup=Orchestra... | 2026.02 | 4 |