Faithfulness Evaluation on SQuAD
Top result: MExGen C-LIME, 62.7 AUPC (Mar 21, 2024)
Updated 4d ago
Evaluation Results
| Method | Setting | Date | AUPC |
|---|---|---|---|
| MExGen C-LIME | Model=Flan-T5-Large, S... | 2024.03 | 62.7 |
| MExGen L-SHAP | Model=Flan-T5-Large, S... | 2024.03 | 61.1 |
| MExGen LOO | Model=Flan-T5-Large, S... | 2024.03 | 60.2 |
| P-SHAP | Model=Flan-T5-Large, S... | 2024.03 | 58.8 |
| MExGen L-SHAP | Model=Llama-3-8B-Instr... | 2024.03 | 57 |
| MExGen C-LIME | Model=Llama-3-8B-Instr... | 2024.03 | 56.4 |
| MExGen LOO | Model=Llama-3-8B-Instr... | 2024.03 | 54.9 |
| P-SHAP | Model=Llama-3-8B-Instr... | 2024.03 | 38.5 |