Logical Refinement of Natural Language Explanations on WorldTree
[Chart: Initial Performance by date. Highlighted entry: Faithful-Refiner, 9 (May 30, 2025). Selectable metrics: Initial Performance, Final Performance, Iterations Count, API Calls Count.]
Evaluation Results

| Method | LLM | Date | Initial Performance | Final Performance | Iterations Count | API Calls Count |
| --- | --- | --- | --- | --- | --- | --- |
| Faithful-Refiner | GPT-4o | 2025.05 | 9 | 56 | 3.86 | 26.16 |
| Faithful-Refiner | Deepseek-V3 | 2025.05 | 7 | 73 | 3.55 | 25.3 |
| Faithful-Refiner | Llama3.1-70b | 2025.05 | 6 | 52 | 4.62 | 35.72 |
| Faithful-Refiner | GPT-4o-mini | 2025.05 | 5 | 47 | 4.75 | 36.5 |
| Explanation-Refiner | Deepseek-V3 | 2025.05 | 3 | 31 | 4.52 | 42.64 |
| Explanation-Refiner | Llama3.1-70b | 2025.05 | 2 | 15 | 5.23 | 51.61 |
| Explanation-Refiner | GPT-4o | 2025.05 | 2 | 13 | 4.18 | 39.26 |
| Explanation-Refiner | GPT-4o-mini | 2025.05 | 0 | 4 | 5 | 46.12 |