Helpful response generation on Human A/B 100 randomly chosen instances (test)

Human Preference Score: 62

Muffin, GPT-3.5

5.8420.423549.58
Jan 11, 2024
Updated 1mo ago

Evaluation Results

Method, Links
2024.01: 62
2024.01: 8