Logical Refinement of Natural Language Explanations on QASC
[Chart: Initial Score, Final Score, Number of Iterations, and Number of Calls by method. Highlighted point: Initial Score 17, Faithful-Refiner, May 30, 2025.]
Updated 1mo ago
Evaluation Results
| Method | LLM | Date | Initial Score | Final Score | Number of Iterations | Number of Calls |
|---|---|---|---|---|---|---|
| Faithful-Refiner | Deepseek-V3 | 2025.05 | 17 | 90 | 2.53 | 20.18 |
| Faithful-Refiner | GPT-4o-mini | 2025.05 | 12 | 71 | 3.35 | 27.1 |
| Faithful-Refiner | Llama3.1-70b | 2025.05 | 11 | 68 | 2.9 | 25.4 |
| Faithful-Refiner | GPT-4o | 2025.05 | 10 | 79 | 3.22 | 22.32 |
| Explanation-Refiner | Llama3.1-70b | 2025.05 | 4 | 18 | 4.07 | 37.49 |
| Explanation-Refiner | GPT-4o | 2025.05 | 4 | 26 | 4.35 | 38.45 |
| Explanation-Refiner | Deepseek-V3 | 2025.05 | 4 | 38 | 3.71 | 35.97 |
| Explanation-Refiner | GPT-4o-mini | 2025.05 | 3 | 20 | 5.12 | 44.84 |
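The results above can be compared more easily by ranking the entries by Final Score and computing the gain from Initial Score to Final Score. The following is a minimal sketch, not part of the benchmark itself: the names `ROWS`, `score_gain`, and `rank_by_final_score` are illustrative, and it assumes that Initial Score and Final Score are reported on the same scale, so their difference is meaningful.

```python
# Rows copied from the evaluation results above:
# (method, LLM, initial score, final score, iterations, calls)
ROWS = [
    ("Faithful-Refiner", "Deepseek-V3", 17, 90, 2.53, 20.18),
    ("Faithful-Refiner", "GPT-4o-mini", 12, 71, 3.35, 27.1),
    ("Faithful-Refiner", "Llama3.1-70b", 11, 68, 2.9, 25.4),
    ("Faithful-Refiner", "GPT-4o", 10, 79, 3.22, 22.32),
    ("Explanation-Refiner", "Llama3.1-70b", 4, 18, 4.07, 37.49),
    ("Explanation-Refiner", "GPT-4o", 4, 26, 4.35, 38.45),
    ("Explanation-Refiner", "Deepseek-V3", 4, 38, 3.71, 35.97),
    ("Explanation-Refiner", "GPT-4o-mini", 3, 20, 5.12, 44.84),
]


def score_gain(initial, final):
    """Gain in score from the initial to the final explanation (assumes same scale)."""
    return final - initial


def rank_by_final_score(rows):
    """Return the rows sorted by Final Score, highest first."""
    return sorted(rows, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    for method, llm, initial, final, iterations, calls in rank_by_final_score(ROWS):
        gain = score_gain(initial, final)
        print(f"{method:<19} {llm:<13} {initial:>3} -> {final:>3}  (gain {gain:+d})")
```

Sorting on Final Score alone ignores cost. The Number of Iterations and Number of Calls columns are kept in each row so the same pattern can be reused to rank by either of them.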