
DeepCoT: Deep Continual Transformers for Real-Time Inference on Data Streams

About

Transformer-based models have grown dramatically in size and parameter count to tackle increasingly complex tasks. At the same time, there is a growing demand for high-performance, low-latency inference on devices with limited resources. In particular, inference on streaming data is typically performed over a sliding temporal window, leading to highly redundant computations. While the recent Continual Transformers started addressing this issue, they can be used effectively only in shallow models, which limits their scope and generalization power. In this paper, we propose the Deep Continual Transformer (DeepCoT), a redundancy-free encoder attention mechanism that can be applied to existing deep encoder architectures with minimal changes. In our experiments on audio, video, and text streams, we show that DeepCoTs retain performance comparable to their non-continual baselines while offering linear computational cost in all Transformer layers, reducing running time by up to two orders of magnitude compared to previous efficient models.
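To make the redundancy concrete, here is a minimal NumPy sketch (an illustration of the general continual-attention idea, not the authors' implementation; all sizes and names are hypothetical). A non-continual baseline reprojects every token in the sliding window at each step, whereas a continual variant caches the window's keys and values and projects only the newest token, so the per-step cost grows linearly with the window length instead of recomputing everything.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 8, 16  # embedding dim and sliding-window length (hypothetical sizes)

# Shared projection matrices, assumed fixed after training.
Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

def full_attention(window):
    """Non-continual baseline: reproject the whole window every step."""
    Q, K, V = window @ Wq, window @ Wk, window @ Wv
    scores = Q @ K.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # attention output for every window position

class ContinualAttention:
    """Continual variant: cache K/V and project only the newest token."""
    def __init__(self):
        self.K = np.empty((0, d))
        self.V = np.empty((0, d))
        self.X = np.empty((0, d))  # raw tokens, kept only to verify equivalence

    def step(self, x):
        # Project just the incoming token -- the redundancy-free part.
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        self.K = np.vstack([self.K, k])[-n:]  # slide the window
        self.V = np.vstack([self.V, v])[-n:]
        self.X = np.vstack([self.X, x])[-n:]
        scores = q @ self.K.T / np.sqrt(d)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        return weights @ self.V  # output for the newest position only

cont = ContinualAttention()
for x in rng.standard_normal((40, d)):  # simulated input stream
    out_cont = cont.step(x)

# The cached result matches a full recompute over the current window.
out_full = full_attention(cont.X)[-1]
assert np.allclose(out_cont, out_full)
```

The equivalence check at the end is the key point: caching changes only where the work is done per step, not the attention output itself.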

Ginés Carreto Picón, Peng Yuan Zhou, Qi Zhang, Alexandros Iosifidis • 2025

Related benchmarks

Task                            Dataset          Metric          Result   Rank
Audio Classification            GTZAN            Accuracy        94.19    59
Natural Language Understanding  GLUE             Average Score   85.18    18
Online Action Detection         THUMOS14 (val)   mAP K400        63.68    5
Sound Event Detection           DCASE 2023       PSDS1           0.0678   2
Sound Event Detection           URBAN-SED        SbF1            40.62    2
