
Explanation Quality Evaluation on LIAR RAW

Top entry: ChatGPT w/ evi (Meaningfulness Score: 2.29)

Updated 3mo ago

Evaluation Results

Method	Links	Meaningfulness Score	(unlabeled)	(unlabeled)	(unlabeled)
ChatGPT w/ evi 2025.11		2.29	3.71	4.04	3.99
ChatGPT w/o evi 2025.11		2.27	3.93	4.29	4.5
L-Defense 2025.11		2.2	4.39	4.64	4.63
L-Defense 2025.11		2.06	4.12	4.28	4.47
SFT 2025.11		1.9	4.48	4.6	4.65
Oracle - skyline 2025.11		1.85	4.44	4.6	4.69
S-EGS 2025.11		1.77	4.58	4.66	4.83