Constraint Following on ComplexInstruct & CoDI (Level 1)
[Chart: ISR over time. Best result: DVR_WS, 95.2 ISR (Oct 16, 2024).]
Evaluation Results
Method      Backbone            Date      ISR
DVR_WS      Llama-3.1-8B-...    2024.10   95.2
DVR_CS      Llama-3.1-8B-...    2024.10   94.5
ReAct       Llama-3.1-8B-...    2024.10   94.2
CRITIC      Llama-3.1-8B-...    2024.10   93.8
R-Sample    Llama-3.1-8B-...    2024.10   92.1
Reflexion   Llama-3.1-8B-...    2024.10   91.6
U-SC        Llama-3.1-8B-...    2024.10   90.9
Vanilla     Llama-3.1-8B-...    2024.10   90.5
BSM         Llama-3.1-8B-...    2024.10   90.1