Size Querying on CoP-QA-F
[Chart: AC Score and AQ Score over time. Plotted point: Talk2DM, AC Score 1, Feb 12, 2026]
Updated 4d ago
Evaluation Results
| Method  | LLM Backbone | Date    | AC Score | AQ Score |
|---------|--------------|---------|----------|----------|
| Talk2DM | GPT-oss:20B  | 2026.02 | 1        | 0.9969   |
| Talk2DM | Gemma3:27B   | 2026.02 | 1        | 0.9372   |
| Talk2DM | Llama3.1:8B  | 2026.02 | 1        | 0.8724   |
| Talk2DM | Magistral... | 2026.02 | 0.9895   | 0.7238   |
| Talk2DM | Qwen3:30B    | 2026.02 | 0.9885   | 0.9885   |
| Talk2DM | Deepseek-... | 2026.02 | 0.9289   | 0.9188   |