
Long-context language understanding on RULER (4k Context)

VT Score: 27

LLama2-7B-chat

[Chart: VT Score over time, Sep 10, 2024 to Feb 4, 2026]
Updated 3d ago

Evaluation Results

Method | Links
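2024.09 | 27 | - | - | - | - | 74.33 | - | - | - | 95.87 | 85.6 | 63
2026.02 | 17.4 | 100 | 89.6 | 87.8 | 79.2 | 10 | 79.8 | 56.4 | 63.7 | - | - | -
2026.02 | 17.3 | 100 | 89.4 | 88.4 | 78.9 | 10 | 80 | 56.2 | 63.7 | - | - | -
2024.09 | 9.8 | - | - | - | - | 20.33 | - | - | - | 44 | 15.2 | 7
2024.09 | 7.2 | - | - | - | - | 2 | - | - | - | 66.22 | 32.4 | 37
2024.09 | 5.2 | - | - | - | - | 4.67 | - | - | - | 27.44 | 7.6 | 14.5
2024.09 | 3 | - | - | - | - | 0 | - | - | - | 8.28 | 1.5 | 35
2024.09 | 1.6 | - | - | - | - | 9.33 | - | - | - | 41.07 | 16.6 | 55.5
2024.09 | 1.2 | - | - | - | - | 13.67 | - | - | - | 2.45 | 6.8 | 35.5
2024.09 | 0 | - | - | - | - | 24.67 | - | - | - | 0.44 | 27.7 | 32.5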