STAP: A Shuffle-Tokenized App Predictor with Ultra Long Context for Vocabulary-Free Mobile App Prediction

About

Predicting the next mobile application a user will launch is essential for intelligent device resource management and proactive assistance. Existing models rely on fixed app vocabularies, which prevents them from generalizing across different app ecosystems. Many also depend on user-specific knowledge, which complicates deployment in cold-start scenarios. We propose STAP, a Transformer-based model that eliminates the need for a fixed vocabulary. STAP replaces true app identities with randomly reassigned virtual indices via a shuffle mechanism, and compensates for the discarded semantic information by processing behavioral sequences with an ultra-long context design. A theoretical analysis shows that, given a sufficiently long context, the predicted distribution converges to the correct one despite the anonymity of the mapping. Experiments on two datasets from different continents demonstrate that STAP achieves strong cross-dataset zero-shot prediction accuracy, a setting where all existing fixed-vocabulary methods are inherently inapplicable, while its cold-start performance within each dataset remains competitive with leading models. Furthermore, we introduce a deployment strategy that enables the model to retain a sufficiently long context during continuous inference while keeping latency within acceptable bounds.
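The shuffle mechanism described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything here is an assumption made for illustration: the function names `shuffle_tokenize` and `detokenize`, the idea of drawing a fresh mapping per sequence, and the size of the virtual index pool (`num_virtual_ids`) are hypothetical and may differ from what the authors implement.

```python
import random


def shuffle_tokenize(app_sequence, num_virtual_ids, rng):
    """Replace app identities with randomly assigned virtual indices.

    Assumption: one random injective mapping is drawn per sequence, so the
    same app always gets the same virtual index within that sequence, but
    the index carries no information about which app it really is.
    """
    # Distinct apps in first-seen order (dict preserves insertion order).
    distinct_apps = list(dict.fromkeys(app_sequence))
    if len(distinct_apps) > num_virtual_ids:
        raise ValueError("more distinct apps than virtual indices")

    # Draw distinct virtual indices at random, one per distinct app.
    virtual_ids = rng.sample(range(num_virtual_ids), len(distinct_apps))
    mapping = dict(zip(distinct_apps, virtual_ids))

    tokens = [mapping[app] for app in app_sequence]
    return tokens, mapping


def detokenize(virtual_id, mapping):
    """Map a predicted virtual index back to the real app, if one exists."""
    inverse = {vid: app for app, vid in mapping.items()}
    return inverse.get(virtual_id)


if __name__ == "__main__":
    rng = random.Random(0)
    history = ["mail", "maps", "mail", "chat", "maps"]
    tokens, mapping = shuffle_tokenize(history, num_virtual_ids=16, rng=rng)
    print(tokens)   # repetition pattern of `history` is preserved
    print(mapping)  # kept on the device to decode the model's prediction
```

Under these assumptions, a model only ever sees the repetition structure of the sequence, not app names, which is why no fixed vocabulary is needed and why a long context is the only source of information about each virtual index.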

Chengyu Fan, Hang Liu • 2026

Related benchmarks

Task	Dataset	Result
App Prediction	Tsinghua (In-dataset)	HR@1 49.99	6
App Prediction	LSApp (In-dataset)	HR@1 71.27	6
App Prediction	Tsinghua to LSApp (Cross-dataset)	HR@1 68.95	4
App Prediction	LSApp to Tsinghua (Cross-dataset)	HR@1 42.63	4
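For readers unfamiliar with the metric, Hit Rate @ 1 (HR@1) is the fraction of test cases in which the top-ranked prediction equals the app actually launched. The sketch below shows the general HR@k computation; the function name `hit_rate_at_k` and the input layout are illustrative assumptions, not the authors' evaluation code. The table values are presumably this fraction expressed as a percentage.

```python
def hit_rate_at_k(ranked_predictions, targets, k=1):
    """Fraction of cases whose true target appears in the top-k predictions.

    ranked_predictions: list of lists, each ordered from most to least likely.
    targets: list of ground-truth items, aligned with ranked_predictions.
    """
    if len(ranked_predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("at least one test case is required")

    hits = sum(
        1
        for ranked, target in zip(ranked_predictions, targets)
        if target in ranked[:k]
    )
    return hits / len(targets)


if __name__ == "__main__":
    preds = [["mail", "maps"], ["maps", "mail"], ["chat", "mail"]]
    truth = ["mail", "mail", "chat"]
    print(hit_rate_at_k(preds, truth, k=1))
    print(hit_rate_at_k(preds, truth, k=2))
```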
