Reasoning on FLenQA 1000 tokens

Accuracy: 78.5

LIME+1

Updated 4mo ago

Evaluation Results

Method	Date	Accuracy
LIME+1	2025.12	78.5
LIME+1	2025.12	72.5
LIME+1	2025.12	65.3
LIME+1	2025.12	65.3
LIME+1	2025.12	65.3
LIME	2025.12	47
LIME	2025.12	47
LIME	2025.12	47
Base	2025.12	34.8
Baseline	2025.12	34.8
Base (DCLM-BASELINE)	2025.12	34.8
LIME	2025.12	33
Baseline	2025.12	32.4
LIME	2025.12	30.5
Baseline	2025.12	12.5