
Binary comparison for commonsense plausibility on ViComTe Material 1.0 (test)

Accuracy: 91.27

Mistral

[Chart: accuracy over time; y-axis ticks 88.6596, 89.3373, 90.015, 90.6927; x-axis date Feb 19, 2025]
Updated 1mo ago

Evaluation Results

Date      Accuracy
2025.02   91.27
2025.02   90.79
2025.02   90.42
2025.02   89.6
2025.02   88.76