Logical Refinement of Natural Language Explanations on e-SNLI
[Chart: Initial Performance by method over time. Highlighted point: Faithful-Refiner, 41 (May 30, 2025). Selectable metrics: Initial Performance, Final Performance, Number of Iterations, Number of Calls.]
Updated 4d ago
Evaluation Results
| Method | LLM | Date | Initial Performance | Final Performance | Number of Iterations | Number of Calls |
|---|---|---|---|---|---|---|
| Faithful-Refiner | Deepseek-V3 | 2025.05 | 41 | 95 | 1.5 | 9.52 |
| Faithful-Refiner | GPT-4o | 2025.05 | 39 | 89 | 1.54 | 10.24 |
| Faithful-Refiner | Llama3.1-70b | 2025.05 | 36 | 78 | 2.38 | 16.28 |
| Faithful-Refiner | GPT-4o-mini | 2025.05 | 32 | 77 | 2.27 | 16.62 |
| Explanation-Refiner | GPT-4o | 2025.05 | 31 | 71 | 3.62 | 32.34 |
| Explanation-Refiner | Deepseek-V3 | 2025.05 | 25 | 69 | 2.82 | 27.74 |
| Explanation-Refiner | Llama3.1-70b | 2025.05 | 23 | 51 | 4.08 | 34.56 |
| Explanation-Refiner | GPT-4o-mini | 2025.05 | 13 | 30 | 3.65 | 32.55 |