Toto: Time Series Optimized Transformer for Observability
About
This technical report describes the Time Series Optimized Transformer for Observability (Toto), a new state-of-the-art foundation model for time series forecasting developed by Datadog. In addition to advancing the state of the art on generalized time series benchmarks in domains such as electricity and weather, Toto is the first general-purpose time series forecasting foundation model to be specifically tuned for observability metrics. Toto was trained on a dataset of one trillion time series data points, the largest among all currently published time series foundation models. The training data combines publicly available time series datasets with fully anonymous numerical metric data points from the Datadog platform, which make up 75% of the total. In our experiments, Toto outperforms existing time series foundation models on observability data while also excelling at general-purpose forecasting tasks, achieving state-of-the-art zero-shot performance on multiple open benchmark datasets.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Probabilistic time series forecasting | ENTSO-e Load FEV leaderboard subset 1H | SQL | 0.48 | 16 |
| Probabilistic time series forecasting | ENTSO-e Load FEV leaderboard 30T | SQL | 0.496 | 8 |
| Probabilistic time series forecasting | Solar with Weather FEV leaderboard 15T | SQL | 0.784 | 8 |
| Probabilistic time series forecasting | Solar with Weather FEV leaderboard 1H | SQL | 0.876 | 8 |
| Probabilistic time series forecasting | AVG RANK FEV leaderboard | SQL | 5.31 | 8 |
| Probabilistic time series forecasting | ENTSO-e Load FEV leaderboard 15T | SQL | 0.591 | 8 |
| Probabilistic time series forecasting | GEOAVERAGE FEV leaderboard | SQL | 0.705 | 8 |