Context-aware Instruction Following on Avocado Emails (test)

8.31 Overlap (with context)

GPT-4-turbo

Updated 5mo ago

Evaluation Results

Method	Links	Overlap (with context)
GPT-4-turbo 2024.03		8.31	7.71	8.05
Qwen-Chat(v1.5) + Qwen-Chat(v1.5) (Logit-based CoGenesis) 2024.03		6.66	5.7	7.12