Long-context Question Answering on LoCoMo

50.25 F1 (Multi Hop)

Block-Dist - Full

[Chart: F1 (Multi Hop) over time, Sep 22, 2025 to May 15, 2026]
Updated 13d ago

Evaluation Results

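Date | F1 (Multi Hop) | other metrics (column headers missing)
2026.05 | 50.25 | - | 17.82 | - | 67.72 | - | 36.57 | - | 17.23 | - | 37.92 | - | - | - | -
2026.05 | 49.76 | - | 18.65 | - | 66.15 | - | 38.67 | - | 16.82 | - | 38.01 | - | - | - | -
2026.05 | 48.55 | - | 19.79 | - | 65.22 | - | 36.82 | - | 20.73 | - | 38.22 | - | - | - | -
2026.04 | 45.1 | - | 57.49 | - | 38.04 | - | 63.09 | - | - | - | 57.06 | - | - | - | -
2026.02 | 44.73 | - | 64.53 | - | 31.04 | - | 61.78 | - | - | - | 57.32 | - | - | - | -
2026.02 | 43.46 | - | 58.62 | - | 19.76 | - | 51.12 | - | - | - | 43.24 | - | - | - | -
2026.02 | 43.06 | - | 61.9 | - | 29.79 | - | 62.95 | - | - | - | 57.03 | - | - | - | -
2026.04 | 42.32 | - | 47.18 | - | 25.88 | - | 61.02 | - | - | - | 52.52 | - | - | - | -
2026.05 | 40.44 | - | 18.28 | - | 58.3 | - | 32.43 | - | 23.32 | - | 34.55 | - | - | - | -
2026.04 | 40.43 | - | 60.31 | - | 42.67 | - | 64.81 | - | - | - | 58.03 | - | - | - | -
2026.05 | 39.98 | - | 9.6 | - | 48.83 | - | 33.38 | - | 34.75 | - | 33.31 | - | - | - | -
2026.05 | 39.93 | - | 13.06 | - | 61.1 | - | 34.93 | - | 36.61 | - | 37.13 | - | - | - | -
2026.02 | 39.88 | - | 58.03 | - | 27.96 | - | 60.09 | - | - | - | 53.1 | - | - | - | -
2026.05 | 39.08 | - | 19.87 | - | 56.28 | - | 34.03 | - | 23.99 | - | 34.65 | - | - | - | -
2026.05 | 38.95 | - | 14.26 | - | 60.45 | - | 33.16 | - | 37.44 | - | 36.85 | - | - | - | -
2026.05 | 38.86 | - | 9.93 | - | 49.34 | - | 34.09 | - | 32.34 | - | 32.91 | - | - | - | -
2026.02 | 38.84 | - | 57.96 | - | 26.61 | - | 57.36 | - | - | - | 52.18 | - | - | - | -
2026.04 | 38.84 | - | 57.96 | - | 26.61 | - | 57.36 | - | - | - | 52.18 | - | - | - | -
2026.05 | 38.76 | - | 16.26 | - | 57.38 | - | 32.71 | - | 22.49 | - | 33.52 | - | - | - | -
2026.02 | 38.72 | - | 48.93 | - | 28.64 | - | 47.65 | - | - | - | 45.09 | - | - | - | -
2026.04 | 38.72 | - | 48.93 | - | 28.64 | - | 47.65 | - | - | - | 45.1 | - | - | - | -
2026.04 | 38.19 | - | 32.24 | - | 20.27 | - | 46.33 | - | - | - | 40.28 | - | - | - | -
2026.05 | 37.95 | - | 14.73 | - | 59.89 | - | 33.82 | - | 37.3 | - | 36.74 | - | - | - | -
2026.02 | 37.06 | - | 56.27 | - | 29.11 | - | 57.22 | - | - | - | 51.58 | - | - | - | -
2026.05 | 37.05 | - | 12.6 | - | 48.57 | - | 32.67 | - | 32.02 | - | 32.58 | - | - | - | -
2026.02 | 36.52 | - | 47.9 | - | 24.77 | - | 50.07 | - | - | - | 45.56 | - | - | - | -
2025.09 | 36.2 | - | 19.2 | - | 16.6 | - | 59.3 | - | - | - | 44.1 | - | - | - | -
2026.02 | 35.89 | 29.24 | 39.78 | 27.12 | 25.74 | 19.53 | 49.08 | 43.05 | 48.24 | 48.92 | 44.39 | 38.7 | - | - | -
2026.02 | 35.74 | - | 42 | - | 19.37 | - | 49.56 | - | - | - | 43.56 | - | - | - | -
2026.02 | 35.7 | - | 50.1 | - | 25.9 | - | 48.9 | - | - | - | 40.5 | - | - | - | -
2026.02 | 35.27 | - | 41.15 | - | 20.02 | - | 48.62 | - | - | - | 42.84 | - | - | - | -
2026.04 | 35.27 | - | 41.15 | - | 20.02 | - | 48.62 | - | - | - | 42.84 | - | - | - | -
2026.02 | 35.24 | - | 41.36 | - | 24.79 | - | 47.95 | - | - | - | 42.81 | - | - | - | -
2026.02 | 35.13 | 27.56 | 52.38 | 44.15 | 17.73 | 15.92 | 39.12 | 35.43 | 25.44 | 24.19 | 36.59 | 32.25 | - | - | -
2026.02 | 34.72 | 25.13 | 45.93 | 35.51 | 22.64 | 15.58 | 43.65 | 37.42 | 30.15 | 27.44 | 38.7 | 32.07 | - | - | -
2026.04 | 34.5 | 25.1 | 40.1 | 32 | 18.4 | 16.9 | 49.8 | 44.2 | 41.5 | 39.8 | - | - | - | - | -
2026.04 | 34.5 | - | 44.7 | - | 23.86 | - | 50.39 | - | - | - | 44.64 | - | - | - | -
2026.04 | 33.59 | - | 48.17 | - | 28.59 | - | 54.84 | - | - | - | 47.92 | - | - | - | -
2026.04 | 33.55 | - | 44.14 | - | 18.23 | - | 50.58 | - | - | - | 44.1 | - | - | - | -
2026.02 | 33.37 | 24.26 | 31.49 | 16.42 | 13.92 | 11.02 | 25.46 | 24.82 | 49.17 | 35 | 32.42 | 25.02 | - | - | -
2025.09 | 33.2 | - | 22.9 | - | 12.3 | - | 49.1 | - | - | - | 38.4 | - | - | - | -
2026.02 | 32.86 | 23.76 | 39.41 | 31.23 | 17.1 | 15.84 | 48.43 | 42.97 | 36.35 | 35.53 | 40.53 | 35.36 | - | - | -
2026.04 | 32.86 | 23.76 | 39.41 | 31.23 | 17.1 | 15.84 | 48.43 | 42.97 | 36.35 | 35.53 | - | - | - | - | -
2026.05 | 32.61 | - | 10.75 | - | 50.65 | - | 25.75 | - | 15.37 | - | 27.03 | - | - | - | -
2025.09 | 32.6 | - | 24.6 | - | 15.1 | - | 46.5 | - | - | - | 37.5 | - | - | - | -
2025.09 | 32.6 | - | 28.1 | - | 15.3 | - | 52.7 | - | - | - | 41.6 | - | - | - | -
2026.04 | 32.59 | - | 27.36 | - | 13.5 | - | 55.35 | - | - | - | 42.74 | - | - | - | -
2026.02 | 32.36 | - | 55.99 | - | 29.19 | - | 46.33 | - | - | - | 44.72 | - | - | - | -
2026.04 | 32.17 | - | 30.77 | - | 23.21 | - | 52.19 | - | - | - | 42.25 | - | - | - | -
2026.02 | 32.11 | 24.48 | 46.61 | 31.84 | 23.98 | 16.84 | 44.74 | 38.17 | 51.48 | 51.96 | 43.76 | 37.27 | - | - | -
2026.02 | 32.11 | - | 53.79 | - | 26.14 | - | 47.64 | - | - | - | 44.73 | - | - | - | -
2026.04 | 31.73 | - | 28.96 | - | 15.03 | - | 42.58 | - | - | - | 36.04 | - | - | - | -
2025.09 | 31.3 | - | 17 | - | 10 | - | 41.7 | - | - | - | 32.7 | - | - | - | -
2026.02 | 30.8 | 23.13 | 29.25 | 24.56 | 14.11 | 11.03 | 42.25 | 35.52 | 26.59 | 25.02 | 33.65 | 28.45 | - | - | -
2025.09 | 30.8 | - | 13.2 | - | 14.9 | - | 39.4 | - | - | - | 30.8 | - | - | - | -
2026.02 | 30.36 | 22.83 | 17.29 | 13.18 | 12.24 | 11.87 | 60.16 | 53.35 | 34.96 | 34.25 | 41.02 | 36.23 | - | - | -
2026.04 | 30.36 | 22.83 | 17.29 | 13.18 | 12.24 | 11.87 | 60.16 | 53.35 | 34.96 | 34.25 | - | - | - | - | -
2026.05 | 29.71 | - | 15.37 | - | 48.27 | - | 25.62 | - | 17.51 | - | 27.3 | - | - | - | -
2026.05 | 29.68 | - | 7.34 | - | 51.36 | - | 28.87 | - | 22.97 | - | 28.04 | - | - | - | -
2025.09 | 29 | - | 20.9 | - | 14.1 | - | 38.5 | - | - | - | 31.6 | - | - | - | -
2026.04 | 28.85 | 21.4 | 46.2 | 37.1 | 13.5 | 12.9 | 45.8 | 38.2 | 54.5 | 52.1 | - | - | - | - | -
2025.09 | 28.3 | - | 23.7 | - | 15.9 | - | 36.4 | - | - | - | 30.8 | - | - | - | -
2026.02 | 28.24 | 22.76 | 38.39 | 33.64 | 15.43 | 13.81 | 42.09 | 36.57 | 43.79 | 43.14 | 38.62 | 34.51 | - | - | -
2026.04 | 28.11 | - | 24.73 | - | 20.42 | - | 49.79 | - | - | - | 38.77 | - | - | - | -
2026.02 | 28.1 | - | 18.1 | - | 7.1 | - | 47.6 | - | - | - | - | - | - | 19.8 | - | -
2026.02 | 28 | 18.47 | 9.09 | 5.78 | 16.47 | 14.8 | 61.56 | 54.19 | 52.61 | 51.13 | 44.12 | 38.7 | - | - | -
2026.04 | 28 | 18.47 | 9.09 | 5.78 | 16.47 | 14.8 | 61.56 | 54.19 | 52.61 | 51.13 | - | - | - | - | -
2025.09 | 27.8 | - | 9.1 | - | 15.2 | - | 30.2 | - | - | - | 24.4 | - | - | - | -
2025.09 | 27.7 | - | 14 | - | 14 | - | 32.7 | - | - | - | 29.4 | - | - | - | -
2026.05 | 27.62 | - | 4.39 | - | 42.59 | - | 20.63 | - | 19.41 | - | 22.93 | - | - | - | -
2026.04 | 27.57 | - | 30.66 | - | 19.74 | - | 42.45 | - | - | - | 35.85 | - | - | - | -
2025.09 | 27.3 | - | 13.3 | - | 10.8 | - | 37.2 | - | - | - | 28.8 | - | - | - | -
2025.09 | 27.1 | - | 10.7 | - | 11.4 | - | 29.7 | - | - | - | 24.1 | - | - | - | -
2026.02 | 27.02 | 20.09 | 45.85 | 36.67 | 12.14 | 12 | 44.65 | 37.06 | 50.03 | 49.47 | 41.97 | 36.16 | - | - | -
2026.02 | 27.02 | - | 45.85 | - | 12.14 | - | 44.65 | - | - | - | 39.65 | - | - | - | -
2026.04 | 27.02 | 20.09 | 45.85 | 36.67 | 12.14 | 12 | 44.65 | 37.06 | 50.03 | 49.47 | - | - | - | - | -
2026.04 | 27.02 | - | 45.85 | - | 12.14 | - | 44.65 | - | - | - | 39.65 | - | - | - | -
2026.02 | 26.65 | 17.72 | 25.52 | 19.44 | 9.15 | 7.44 | 41.04 | 34.34 | 43.29 | 42.73 | 35.45 | 30.16 | - | - | -
2026.04 | 26.65 | 17.72 | 25.52 | 19.44 | 9.15 | 7.44 | 41.04 | 34.34 | 43.29 | 42.73 | - | - | - | - | -
2025.09 | 26.4 | - | 15.4 | - | 13.2 | - | 29.3 | - | - | - | 24.9 | - | - | - | -
2025.09 | 26.3 | - | 25.2 | - | 14.3 | - | 29.8 | - | - | - | 27.2 | - | - | - | -
2026.02 | 25.97 | 18.16 | 25.37 | 18.76 | 13.52 | 11.69 | 34.92 | 30.6 | 21.94 | 17.56 | 28.16 | 23.08 | - | - | -
2025.09 | 25.6 | - | 23.8 | - | 13 | - | 34.8 | - | - | - | 29.5 | - | - | - | -
2026.05 | 25.51 | - | 15.51 | - | 40.48 | - | 28.64 | - | 17.42 | - | 25.51 | - | - | - | -
2026.02 | 25.2 | - | 34.1 | - | 14.1 | - | 41.6 | - | - | - | - | - | - | 28.7 | - | -
2026.02 | 25.09 | 15.73 | 32.82 | 27.14 | 14.47 | 13.35 | 20.18 | 18.39 | 46.77 | 40.81 | 28.62 | 24.22 | - | - | -
2026.02 | 25.02 | 19.75 | 18.41 | 14.77 | 12.04 | 11.16 | 40.36 | 29.05 | 69.23 | 68.75 | 39.74 | 33.47 | - | - | -
2026.04 | 25.02 | 19.75 | 18.41 | 14.77 | 12.04 | 11.16 | 40.36 | 29.05 | 69.23 | 68.75 | - | - | - | - | -
2026.04 | 25.01 | - | 23.83 | - | 15.49 | - | 45.89 | - | - | - | 35.57 | - | - | - | -
2026.02 | 24.3 | 16.9 | 34.5 | 23.1 | 13.1 | 12.2 | 38.1 | 33.3 | 31 | 30.1 | 32.76 | 27.58 | - | - | -
2026.02 | 24.3 | - | 17.2 | - | 7.9 | - | 41.5 | - | - | - | - | - | - | 24.3 | - | -
2025.09 | 24.3 | - | 19.5 | - | 12.5 | - | 27.2 | - | - | - | 24.2 | - | - | - | -
2026.02 | 24.12 | 15.41 | 25.48 | 19.04 | 13.44 | 12.64 | 34.74 | 32.41 | 27.11 | 24.32 | 28.99 | 25.06 | - | - | -
2025.09 | 23.9 | - | 18.8 | - | 13.2 | - | 25.1 | - | - | - | 22.8 | - | - | - | -
2026.02 | 23.8 | - | 33.5 | - | 14 | - | 37.3 | - | - | - | - | - | - | 33 | - | -
2025.09 | 23.8 | - | 8 | - | 13.1 | - | 24.2 | - | - | - | 20.1 | - | - | - | -
2025.09 | 23.6 | - | 7.6 | - | 13.4 | - | 23.8 | - | - | - | 19.8 | - | - | - | -
2025.09 | 23.4 | - | 23.7 | - | 15 | - | 26 | - | - | - | 24.3 | - | - | - | -
2025.09 | 23.2 | - | 14.3 | - | 12.1 | - | 27.4 | - | - | - | 22.9 | - | - | - | -
2026.02 | 23.1 | - | 29.1 | - | 13.3 | - | 37.7 | - | - | - | - | - | - | 31.8 | - | -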
Showing 100 of 196 rows