PCogAlignBench

Benchmarks

Task Name                        Dataset Name                 SOTA Result            Trend
Multimodal Conversation          PCogAlignBench (LS2)         LLM Judge Score 0.852  20
Multimodal Conversation          PCogAlignBench LS1           LLM Judge Score 0.903  20
Personalized VLM Alignment       PCogAlignBench LS1->LS1 1.0  P Score 4.303          16
Personalized response selection  PCogAlignBench Average       P Score 4.154          14
Personalized response selection  PCogAlignBench LS2->LS2      P Score 4.151          14
Personalized response selection  PCogAlignBench LS2->LS1      P Score 4.15           14
Personalized response selection  PCogAlignBench LS1->LS2      P Score 4.156          14
Personalized response selection  PCogAlignBench LS1->LS1      P Score 4.161          14
Personalized VLM Alignment       PCogAlignBench LS1->LS2 1.0  P Score 4.321          8
Personalized VLM Alignment       PCogAlignBench 1.0           P Score 4.312          4
Personalized VLM Alignment       PCogAlignBench LS2->LS1 1.0  P Score 4.275          4
Personalized VLM Alignment       PCogAlignBench LS2->LS2 1.0  P Score 4.321          2