
Interactive Question Answering on IQA-EVAL MMLU-derived (TextDavinci)

Helpfulness: 4.6

Human

Updated 5mo ago

Evaluation Results

Method	Links	Helpfulness
Human 2024.08		4.6	4.35	1.78	69
IQA-EVAL-GPT3.5 2024.08		4.3	4.47	1.57	63
IQA-EVAL-Claude 2024.08		4.13	4.47	2.2	67
IQA-EVAL-GPT3.5 2024.08		3.93	3.97	2	53
IQA-EVAL-GPT4 2024.08		3.67	4.77	1.57	87
Human 2024.08		3.52	3.22	2.66	48
IQA-EVAL-Claude 2024.08		3	3.23	2.07	57
IQA-EVAL-GPT4 2024.08		2.1	3.03	2.37	67