Explicit Attack on Medium
[Chart: Avg Expected Queries (E). Top result: 3 (gem-2.5-flash), Feb 13, 2026.]
Evaluation Results
| Method | Setup | Date | Avg Expected Queries (E) |
|---|---|---|---|
| gem-2.5-flash | System Setup=Pure SQL... | 2026.02 | 3 |
| gem-2.5-flash | System Setup=Orchestra... | 2026.02 | 3 |
| gpt-4.1-mini | System Setup=Orchestra... | 2026.02 | 4 |
| gpt-4.1-mini | System Setup=Orchestra... | 2026.02 | 4 |
| gpt-4.1-mini | System Setup=Orchestra... | 2026.02 | 4 |
| gpt-4.1-mini | System Setup=Pure SQL... | 2026.02 | 6 |
| gpt-4.1 | System Setup=Orchestra... | 2026.02 | 8 |
| sonnet-4 | System Setup=Orchestra... | 2026.02 | 9 |
| gpt-4.1 | System Setup=Pure SQL... | 2026.02 | 10 |
| sonnet-4 | System Setup=Pure SQL... | 2026.02 | 10 |
| gpt-4.1 | System Setup=Orchestra... | 2026.02 | 17 |
| gem-2.5-flash | System Setup=Orchestra... | 2026.02 | 17 |
| gpt-4.1 | System Setup=Orchestra... | 2026.02 | 18 |
| gem-2.5-flash | System Setup=Orchestra... | 2026.02 | 20 |
| o4-mini | System Setup=Orchestra... | 2026.02 | 59 |
| o4-mini | System Setup=Pure SQL... | 2026.02 | 84 |