Interactive SQL Querying on InterCode SQL
Top result: Co-Evolving Agents, 69.1 Avg Reward
[Chart: Avg Reward by date (x-axis: Nov 27, 2025; y-axis: Avg Reward)]
Updated 4d ago
Evaluation Results
Method                   Setup                      Date      Avg Reward
Co-Evolving Agents       Backbone=Qwen3-4B-Inst...  2025.11   69.1
ETO                      Backbone=Qwen3-4B-Inst...  2025.11   58.8
Co-Evolving Agents       Adaptation=Fine-tuning...  2025.11   53.8
Llama-2-7B-Chat + PPO    Adaptation=Fine-tuning...  2025.11   52.4
Llama-2-7B-Chat + ETO    Adaptation=Fine-tuning...  2025.11   49.4
GPT-4                    Adaptation=In-context      2025.11   38.5
GPT-3.5-Turbo            Adaptation=In-context      2025.11   37.8
Llama-2-7B-Chat + RFT    Adaptation=Fine-tuning...  2025.11   35.6
Llama-2-7B-Chat + SFT    Adaptation=Fine-tuning...  2025.11   30.8
SFT                      Backbone=Qwen3-4B-Inst...  2025.11   12.7