Explanation Quality Evaluation on RAW-FC
Chart: M Score. Highlighted entry: ChatGPT w/ evi, M Score 2.07 (Nov 25, 2025). Selectable metrics: M Score, I Score, S Score, R Score.
Updated 1mo ago
Evaluation Results
| Method | Configuration | Date | M Score | I Score | S Score | R Score |
|---|---|---|---|---|---|---|
| ChatGPT w/ evi | Backbone=ChatGPT, Evid... | 2025.11 | 2.07 | 4.44 | 4.62 | 4.69 |
| ChatGPT w/o evi | Backbone=ChatGPT, Evid... | 2025.11 | 1.97 | 4 | 4.44 | 4.68 |
| L-Defense | Backbone=LLaMA2 | 2025.11 | 1.95 | 4.44 | 4.67 | 4.62 |
| L-Defense | Backbone=ChatGPT | 2025.11 | 1.91 | 4.17 | 4.41 | 4.49 |
| SFT | Backbone=LLaMA2 | 2025.11 | 1.9 | 4.78 | 4.82 | 4.55 |
| S-EGS | Backbone=LLaMA2 | 2025.11 | 1.79 | 4.88 | 4.83 | 4.8 |
| Oracle - skyline | Backbone=ChatGPT, Evid... | 2025.11 | 1.52 | 4.46 | 4.73 | 4.72 |