World Modeling on Manhattan taxi rides
[Chart: Next-Token Test Accuracy over time (y-axis 95 to 102.5); GPT: 100 on Nov 8, 2025. Selectable metrics: Next-Token Test Accuracy, Valid Trajectories, Sequence Compression Ratio, Effective Latent Rank, Detour Robustness.]
Updated 9d ago
Evaluation Results
| Method | Details | Date | Next-Token Test Accuracy | Valid Trajectories | Sequence Compression Ratio | Effective Latent Rank | Detour Robustness |
|---|---|---|---|---|---|---|---|
| GPT | Implementation=nanoGPT... | 2025.11 | 100 | 97 | 0.65 | 160.1 | 85 |
| MTP | Multi-step prediction... | 2025.11 | 100 | 98.1 | 0.64 | 57.7 | 95 |
| JTP | Multi-step prediction... | 2025.11 | 100 | 97.1 | 0.32 | 215.8 | 87 |
| NextLat | Multi-step prediction... | 2025.11 | 100 | 98.7 | 0.71 | 52.7 | 95 |
| True world model | | 2025.11 | 100 | 100 | 1 | - | 100 |