Multi-turn conversation evaluation on Lost-in-Conversation (test)

Best Actions Score: 95 (Full†)

Updated 2mo ago

Evaluation Results

Method	Links	Actions Score
Full† 2026.05		95	91.7	72	86.2
Full† 2026.05		94.1	88.1	75.9	86
Full† 2026.05		88.4	90.6	97	92
SeDT 2026.05		85.9	71.8	61.8	73.2
SeDT 2026.05		78.1	75.7	72	75.3
Sharded 2026.05		74.7	62.9	51.6	63.1
SeDT 2026.05		73	74.4	58.8	68.7
Sharded 2026.05		46.7	62.9	50.4	53.3
Sharded 2026.05		40.4	65	66.4	57.3