Summarization on Medical domain (100 random samples)
[Chart: Coherence score for Llama-VOCABADAPT (3.3), May 17, 2026. Available metrics: Coherence, Relevance, Faithfulness.]
Updated 15d ago
Evaluation Results
| Method | Details | Date | Coherence | Relevance | Faithfulness |
|---|---|---|---|---|---|
| Llama-VOCABADAPT | Base Model=Llama | 2026.05 | 3.3 | 3.82 | 3.84 |
| Qwen-VOCABADAPT | Base Model=Qwen | 2026.05 | 3.24 | 3.92 | 3.94 |
| Llama-CPTOnly | Base Model=Llama, Voca... | 2026.05 | 2.7 | 3.3 | 3.48 |
| Qwen-CPTOnly | Base Model=Qwen, Vocab... | 2026.05 | 2.56 | 3.34 | 3.76 |
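The scores listed above can be compared per base model. The following is a minimal sketch, assuming the only goal is to subtract each CPTOnly score from the matching VOCABADAPT score; the dictionary layout and the helper name `gain_over_cpt_only` are illustrative and not part of the benchmark page.

```python
# Scores copied from the results above: (Coherence, Relevance, Faithfulness).
scores = {
    "Llama-VOCABADAPT": (3.30, 3.82, 3.84),
    "Qwen-VOCABADAPT": (3.24, 3.92, 3.94),
    "Llama-CPTOnly": (2.70, 3.30, 3.48),
    "Qwen-CPTOnly": (2.56, 3.34, 3.76),
}
METRICS = ("Coherence", "Relevance", "Faithfulness")


def gain_over_cpt_only(base):
    """Per-metric difference (VOCABADAPT minus CPTOnly) for one base model.

    Hypothetical helper for illustration only.
    """
    adapted = scores[f"{base}-VOCABADAPT"]
    baseline = scores[f"{base}-CPTOnly"]
    return {m: round(a - b, 2) for m, a, b in zip(METRICS, adapted, baseline)}


for base in ("Llama", "Qwen"):
    print(base, gain_over_cpt_only(base))
```

On these numbers the largest gap for both base models is in Coherence (0.6 for Llama, 0.68 for Qwen), and the smallest is in Faithfulness (0.36 and 0.18).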