
Embedding-Aware Feature Discovery: Bridging Latent Representations and Interpretable Features in Event Sequences

About

Industrial financial systems operate on temporal event sequences such as transactions, user actions, and system logs. While recent research emphasizes representation learning and large language models, production systems continue to rely heavily on handcrafted statistical features because of their interpretability, robustness under limited supervision, and ability to meet strict latency constraints. This creates a persistent disconnect between learned embeddings and feature-based pipelines. We introduce Embedding-Aware Feature Discovery (EAFD), a unified framework that bridges this gap by coupling pretrained event-sequence embeddings with a self-reflective LLM-driven feature generation agent. EAFD iteratively discovers, evaluates, and refines features directly from raw event sequences using two complementary criteria: alignment, which explains information already encoded in embeddings, and complementarity, which identifies predictive signals missing from them. Across both open-source and industrial transaction benchmarks, EAFD consistently outperforms embedding-only and feature-based baselines, achieving relative gains of up to +5.8% over state-of-the-art pretrained embeddings and setting a new state of the art across event-sequence datasets.
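The abstract does not say how the two criteria are scored. Purely as an illustration, the sketch below assumes one plausible linear reading: alignment as the R² of regressing a candidate feature on the embedding dimensions, and complementarity as the squared partial correlation between the feature and the target after removing whatever the embeddings explain linearly. All function names, the synthetic data, and the scoring formulas are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical scoring sketch; NOT the paper's method. Pure standard library.

def center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project_out(r, basis):
    """Remove from r its components along each (mutually orthogonal) basis vector."""
    for b in basis:
        coef = dot(r, b) / dot(b, b)
        r = [x - coef * y for x, y in zip(r, b)]
    return r

def orthogonal_basis(columns, tol=1e-12):
    """Gram-Schmidt over centered embedding columns; near-collinear columns are dropped."""
    basis = []
    for col in columns:
        r = project_out(center(col), basis)
        if dot(r, r) > tol:
            basis.append(r)
    return basis

def alignment(feature, basis):
    """Assumed definition: R^2 of a linear regression of the feature on the embeddings."""
    c = center(feature)
    total = dot(c, c)
    if total == 0:
        return 0.0
    res = project_out(c, basis)
    return 1.0 - dot(res, res) / total

def complementarity(feature, target, basis, tol=1e-12):
    """Assumed definition: squared partial correlation of feature and target given embeddings."""
    rf = project_out(center(feature), basis)
    rt = project_out(center(target), basis)
    denom = dot(rf, rf) * dot(rt, rt)
    return 0.0 if denom <= tol else dot(rf, rt) ** 2 / denom

# Synthetic demo: two embedding dimensions and a hidden signal z the embeddings lack.
random.seed(0)
n = 200
e0 = [random.gauss(0, 1) for _ in range(n)]
e1 = [random.gauss(0, 1) for _ in range(n)]
z = [random.gauss(0, 1) for _ in range(n)]
target = [a + c for a, c in zip(e0, z)]

basis = orthogonal_basis([e0, e1])
aligned_feature = [2 * a - b for a, b in zip(e0, e1)]  # fully explained by the embeddings
novel_feature = z                                      # carries signal the embeddings miss

for name, f in [("aligned", aligned_feature), ("novel", novel_feature)]:
    print(name, round(alignment(f, basis), 3), round(complementarity(f, target, basis), 3))
```

Under these assumptions, a feature that is a linear function of the embeddings scores high on alignment and contributes nothing new, while a feature built from the hidden signal scores low on alignment and high on complementarity. A real pipeline would more likely use a held-out downstream metric rather than in-sample linear statistics.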

Artem Sakhno, Ivan Sergeev, Alexey Shestov, Omar Zoloev, Elizaveta Kovtun, Gleb Gusev, Andrey Savchenko, Maksim Makarenko • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Age Prediction | Age | Accuracy 65.2 | 12 |
| Classification | Rosbank | AUC 0.872 | 12 |
| Classification | DataFusion | AUC 0.781 | 10 |
| Gender Prediction | Gender | AUC 89.8 | 10 |
| Age Classification | Private Dataset | Accuracy 75.6 | 6 |
| Gender Classification | Private Dataset | AUC 90.1 | 6 |
| Regression | Private Dataset | MAE 1.07e+4 | 6 |
