Logical Expression Evaluation on ListOps-O Length Generalization (Lengths 500-600)

Top result: EBT-GRC, 99.4 accuracy

Updated 4mo ago

Evaluation Results

Method	Links	Accuracy
EBT-GRC 2023.11		99.4
BT-GRC OS 2023.11		99
RIR-EBT-GRC (-S4D) 2023.11		98.87
CRVNN 2023.11		98.5
RIR-EBT-GRC 2023.11		98.25
OM 2023.11		92.7
RIR-EBT-GRC (-Beam Align) 2023.11		79.05
BBT GRC 2023.11		40.4
RIR-GRC 2023.11		35.55
MEGA 2023.11		31.71
S4D 2023.11		20.85