Variable Tracking on RULER 8k
Best F1 Score: 79.28 (Rephrasing)
[Chart: F1 Score over time; data point at Jun 10, 2025]
Evaluation Results
| Method | Configuration | Date | F1 Score |
|---|---|---|---|
| Rephrasing | Model (Max Context)=Ll... | 2025.06 | 79.28 |
| PCD | Model (Max Context)=Ll... | 2025.06 | 77.92 |
| DoLa-Low | Model (Max Context)=Ll... | 2025.06 | 72.93 |
| DoLa-High | Model (Max Context)=Ll... | 2025.06 | 72.87 |
| Base | Model (Max Context)=Ll... | 2025.06 | 72.78 |
| Beam-Search | Model (Max Context)=Ll... | 2025.06 | 71.25 |
| Base | Model (Max Context)=Ll... | 2025.06 | 71.21 |
| PCD | Model (Max Context)=Ll... | 2025.06 | 71.19 |
| MsPoE | Model (Max Context)=Ll... | 2025.06 | 70.03 |
| PCD | Model (Max Context)=Ll... | 2025.06 | 60.07 |
| Base | Model (Max Context)=Ll... | 2025.06 | 57.93 |
| SegR | Model (Max Context)=Ll... | 2025.06 | 0 |