Binary comparison for commonsense plausibility on ViComTe Color 1.0 (test)

Accuracy: 93.29

GPT-4

Updated 3mo ago

Evaluation Results

Method	Links	Accuracy
GPT-4 2025.02		93.29
EVA-CLIP 2025.02		93.29
GPT-3.5 2025.02		92.25
Qwen2 2025.02		86.79
Mistral 2025.02		86.06