Human Preference Evaluation for Code-switched Text Generation on In-Domain Data

Preference Score: 804

Gold Standard

Updated 4mo ago

Evaluation Results

Method	Date	Preference Score	Rank
Gold Standard	2025.02	804	1
Llama3	2025.02	573.5	2
NLLB	2025.02	507	3
Llama3 Instruct	2025.02	480.5	4
GPT-4ofs	2025.02	413.5	5
Llama3.3-70Bfs	2025.02	371.5	6

Method	Date	Score	Rank
Gold Standard	2025.02	369.5	1
Llama3	2025.02	291.5	2
Llama3 Instruct	2025.02	270.5	3
NLLB	2025.02	259.5	4
GPT-4ofs	2025.02	251.5	5
Llama3.3-70Bfs	2025.02	207.5	6