
Gender bias evaluation on RealWorldQuestioning Health Recommendations 1.0

Proportion (Male More Info): 80.89

Llama-3

Updated 4mo ago

Evaluation Results

Method	Links	Proportion (Male More Info)
Llama-3 2025.05		80.89	17.97	1.12	0.05	0
DeepSeek-R1 2025.05		73.03	26.96	0	0.13	0
ChatGPT-4-turbo 2025.05		56.17	38.2	5.61	0.48	0.02
ChatGPT-3.5-turbo 2025.05		53.93	39.32	6.74	0.55	0.07