Logical Expression Evaluation on ListOps-O Length Generalization (Lengths 200-300)

Accuracy: 99.9

EBT-GRC

Updated 5mo ago

Evaluation Results

Method	Date	Accuracy
EBT-GRC	2023.11	99.9
OM	2023.11	99.6
CRVNN	2023.11	99.5
BT-GRC OS	2023.11	99.5
RIR-EBT-GRC	2023.11	99.15
RIR-EBT-GRC (-S4D)	2023.11	99.15
RIR-EBT-GRC (-Beam Align)	2023.11	91.75
MEGA	2023.11	45.21
BBT GRC	2023.11	43.6
RIR-GRC	2023.11	41.75
S4D	2023.11	31