Empathetic Dialogue on Sentient Benchmark

Top score: 82.4 (Gemini2.5-Pro-0605)

Updated 4mo ago

Evaluation Results

Method	Links	Score
Gemini2.5-Pro-0605 2025.07		82.4	55	4
GPT-4o-0326 2025.07		79.9	51	4
RLVER (PPO) 2025.07		79.2	42	9
RLVER (GRPO) 2025.07		72	34	10
RLVER (GRPO) 2025.07		68.3	26	10
GPT-4.1-0414 2025.07		68.2	35	13
Gemini2.5-Flash-Think-0520 2025.07		66.1	39	14
OpenAI-o3-0416 2025.07		62.7	32	14
RLVER (PPO) 2025.07		61.7	24	23
Qwen2.5-7B-Instruct 2025.07		13.3	2	76