
Multi-Doc Question Answering on LongBench-E

49.6 F1 Score

Teacher Model (w/ Context)

Oct 23, 2025
Updated 22d ago

Evaluation Results

Date     F1 Score
2025.10  49.6
2025.10  45.9
2025.10  43
2025.10  41.4
2025.10  40.9
2025.10  38.2
2025.10  36
2025.10  32.5
2025.10  32.1
2025.10  31.6
2025.10  29.4
2025.10  28
2025.10  22.4
2025.10  21.8
2025.10  21.7
2025.10  11.4
2025.10  7.2