Logical Expression Evaluation on ListOps-O Length Generalization (Lengths 900-1000)

Top accuracy: 99.5 (EBT-GRC)

Updated 4mo ago

Evaluation Results

Method	Date	Accuracy
EBT-GRC	2023.11	99.5
RIR-EBT-GRC (-S4D)	2023.11	98.6
CRVNN	2023.11	98
BT-GRC OS	2023.11	97.2
RIR-EBT-GRC	2023.11	97.1
OM	2023.11	76.9
RIR-EBT-GRC (-Beam Align)	2023.11	68.8
RIR-GRC	2023.11	32.3
BBT GRC	2023.11	31.5
MEGA	2023.11	24.73
S4D	2023.11	14.7