LLM-as-a-Judge on BC5CDR (test)

48.35 EM

GPT-4o-Mini

Updated 3mo ago

Evaluation Results

Method	Date	Links	EM	(unlabeled)
GPT-4o-Mini	2025.06		48.35	2.33
Qwen-2.5-7B-Instruct	2025.06		45.25	2.42
Gemini-Flash	2025.06		42.55	2.09
Phi-3.5-Mini-3.8B-Instruct	2025.06		33.80	2.40
Deepseek-R1-Qwen-7B	2025.06		30.60	2.76
Deepseek-R1-LLaMA-8B	2025.06		30.50	3.37
Claude-3-Haiku	2025.06		29.50	2.26
LLaMA-3.1-8B-Instruct	2025.06		29.45	2.40