Saturday, March 7, 2026

5 Time Series Foundation Models You’re Missing

Image by author | Diagram with Chronos-2: from univariate to universal forecasting

# Introduction

Foundation models didn’t start with ChatGPT. Long before large language models became popular, pre-trained models were already driving advances in computer vision and natural language processing, powering tasks such as image segmentation, classification, and text understanding.

If you still rely primarily on classical statistical methods or deep learning models trained on a single dataset, you may be missing a fundamental shift in how forecasting systems are built.

# 1. Chronos-2

Key Features:

  • Encoder-only architecture inspired by T5
  • Zero-shot probabilistic forecasting with quantile outputs
  • Native support for past and known future covariates
  • Long context length up to 8192 and forecast horizon up to 1024
  • Efficient, high-throughput CPU and GPU inference

Use cases:

  • Large-scale forecasting across multiple related time series
  • Covariate-driven forecasting for demand, energy, and prices
  • Quick prototyping and production deployment without model training

Best use cases:

  • Production forecasting systems
  • Research and benchmarking
  • Complex multivariate forecasting with covariates
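
Chronos-2’s quantile outputs can be turned into a point forecast and a prediction interval without any extra modeling. The snippet below is a library-agnostic sketch using made-up numbers, not the actual chronos-forecasting API: it assumes the model returns one forecast row per quantile level.

```python
import numpy as np

# Hypothetical model output: forecasts for quantile levels 0.1, 0.5, 0.9
# over a 4-step horizon, as a (num_quantiles, horizon) array.
quantile_levels = [0.1, 0.5, 0.9]
quantile_forecasts = np.array([
    [ 95.0,  96.0,  97.0,  98.0],   # 10th percentile
    [100.0, 101.0, 102.0, 103.0],   # median
    [105.0, 106.5, 108.0, 109.5],   # 90th percentile
])

# The median quantile serves as the point forecast.
point_forecast = quantile_forecasts[quantile_levels.index(0.5)]

# The outer quantiles form an 80% prediction interval.
lower, upper = quantile_forecasts[0], quantile_forecasts[-1]
interval_width = upper - lower

print(point_forecast)   # [100. 101. 102. 103.]
print(interval_width)   # [10.  10.5 11.  11.5]
```

Because the quantiles come straight from the pre-trained model, this uncertainty estimate costs nothing beyond a single forward pass.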

# 2. TiRex

Key Features:

  • Pre-trained architecture based on xLSTM
  • Zero-shot forecasting without training on the target dataset
  • Point forecasts and quantile-based uncertainty estimates
  • Strong performance on both short- and long-term benchmarks
  • Optional CUDA acceleration for fast GPU inference

Use cases:

  • Zero-shot forecasting for new or unseen time series datasets
  • Long- and short-term forecasting in finance, energy and operations
  • Quick benchmarking and deployment without model training

# 3. TimesFM

Key Features:

  • Decoder-only foundation model with a 500M-parameter checkpoint
  • Zero-shot univariate time series forecasting
  • Context length up to 2048 time points, with support beyond training limits
  • Flexible forecast horizons with optional frequency indicators
  • Optimized for fast, large-scale point forecasting

Use cases:

  • Large-scale univariate forecasting across diverse datasets
  • Long-term forecasting for operational and infrastructure data
  • Quick experimentation and benchmarking without model training
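
When a series is longer than the model’s context window, only the most recent points are fed to the model. A trivial but easy-to-forget preprocessing step, sketched below (the 2048 limit comes from the article; the helper name is ours, not part of the TimesFM API):

```python
import numpy as np

MAX_CONTEXT = 2048  # TimesFM context limit cited above

def prepare_context(series: np.ndarray, max_context: int = MAX_CONTEXT) -> np.ndarray:
    """Keep only the most recent `max_context` points for inference."""
    return series[-max_context:]

long_series = np.arange(5000, dtype=float)  # 5000-point history
context = prepare_context(long_series)
print(context.shape)             # (2048,)
print(context[0], context[-1])   # 2952.0 4999.0
```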

# 4. IBM Granite TTM R2

Key Features:

  • Tiny pre-trained models starting at 1M parameters
  • Efficient zero-shot and few-shot multivariate forecasting
  • Focused models tailored to specific context and forecast lengths
  • Fast inference and fine-tuning on a single GPU or CPU
  • Support for exogenous variables and static categorical features

Use cases:

  • Multivariate forecasting in low-resource or edge environments
  • Zero-shot baselines with optional light fine-tuning
  • Quick deployment for operational forecasting with limited data
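
Exogenous variables are extra input channels aligned with the target series, and static categorical features are per-series constants that are typically one-hot encoded and broadcast across time. A library-agnostic sketch of assembling such a multivariate input (the array layout is an assumption for illustration, not TTM’s actual input format):

```python
import numpy as np

T = 6  # time steps
target = np.array([10., 11., 12., 13., 14., 15.])       # series to forecast
temperature = np.array([20., 21., 19., 22., 23., 21.])  # exogenous channel

# Static categorical feature (e.g. store type "B" out of A/B/C),
# one-hot encoded and repeated at every time step.
store_type = np.tile([0., 1., 0.], (T, 1))              # shape (T, 3)

# Stack everything into one (time, channels) input matrix.
X = np.column_stack([target, temperature, store_type])
print(X.shape)  # (6, 5)
```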

# 5. Toto Open Base 1

Key Features:

  • Decoder-only transformer with flexible context and prediction lengths
  • Zero-shot forecasting without fine-tuning
  • Efficient handling of high-dimensional multivariate data
  • Probabilistic forecasts using a Student-T mixture model
  • Pre-trained on over two trillion time series data points

Use cases:

  • Forecasting observability and monitoring metrics
  • Large-scale systems and infrastructure telemetry
  • Zero-shot forecasting for large-scale non-stationary time series
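
A Student-T mixture output produces forecasts as samples from a heavy-tailed distribution rather than fixed quantiles; forecast bands then come from the empirical sample quantiles. A self-contained numpy sketch of that idea — the mixture weights and parameters below are invented for illustration, not Toto’s:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-component Student-T mixture for one forecast step:
# mixture weights, degrees of freedom, locations, and scales.
weights = np.array([0.7, 0.3])
dfs     = np.array([5.0, 3.0])
locs    = np.array([100.0, 110.0])
scales  = np.array([2.0, 6.0])

# Draw samples: pick a component per draw, then sample its Student-T.
n = 10_000
comp = rng.choice(2, size=n, p=weights)
samples = locs[comp] + scales[comp] * rng.standard_t(dfs[comp])

# Empirical quantiles give the probabilistic forecast bands.
q10, q50, q90 = np.quantile(samples, [0.1, 0.5, 0.9])
print(q10 < q50 < q90)  # True
```

The heavy tails of the Student-T components make this kind of head well suited to the spiky, non-stationary observability metrics the model targets.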

# Summary

| Model | Parameters | Architecture | Forecasting type | Key strengths |
| --- | --- | --- | --- | --- |
| Chronos-2 | 120M | Encoder-only | Univariate, multivariate, probabilistic | High zero-shot accuracy, long context and horizon, high inference throughput |
| TiRex | 35M | xLSTM-based | Univariate, probabilistic | Lightweight model with strong short- and long-term performance |
| TimesFM | 500M | Decoder-only | Univariate, point forecasts | Supports long contexts and flexible horizons at scale |
| Granite TimeSeries TTM-R2 | 1M (tiny) | Focused pre-trained models | Multivariate, point forecasts | Extremely compact, fast inference, strong zero-shot and few-shot results |
| Toto Open Base 1 | 151M | Decoder-only | Multivariate, probabilistic | Optimized for high-dimensional, non-stationary observability data |

Abid Ali Awan (@1abidaliawan) is a certified data science professional who loves building machine learning models. Currently, he focuses on creating content and writing technical blogs about machine learning and data science technologies. Abid holds a Master’s degree in Technology Management and a Bachelor’s degree in Telecommunications Engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.
