
Evolution of forecasting: from statistical methods to deep learning


The field of supply chain forecasting has undergone a significant evolution over the years, transitioning from traditional statistical methods to more sophisticated approaches based on deep learning.

This evolution has been driven by the need to better anticipate complex fluctuations in demand and supply, while harnessing the potential of large amounts of available data.

In this article, we will explore the evolution of forecasting methods, focusing on classical statistical methods, neural networks and recent advances in deep learning.

Statistical methods (still viable, fast, but less powerful)

Forecasting is an old and extensive problem: traces of it can be found in newspapers dating back to the mid-1800s. In that pre-computer era, the approaches were purely statistical, and those methods have been tested and refined over time while remaining useful.

Among the most commonly used approaches are the (S)ARIMA ((Seasonal) AutoRegressive Integrated Moving Average) models, Holt’s methods and the Holt-Winters variant.

These methods use relatively simple mathematical formulas to compute a fairly reliable forecast from sales history. Forecasts of this type can even capture trends and seasonality.
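To make this concrete, Holt's linear trend method fits in a few lines. The sketch below is a minimal pure-Python version; the function name, smoothing parameters and toy sales figures are illustrative, not taken from a specific library:

```python
def holt_forecast(series, alpha=0.3, beta=0.1, horizon=3):
    """Holt's linear trend method: exponential smoothing of a level
    and a trend, then linear extrapolation for the forecast."""
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # the h-step-ahead forecast extrapolates the last level and trend
    return [level + (h + 1) * trend for h in range(horizon)]

# illustrative monthly sales history
sales = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]
print(holt_forecast(sales))
```

Two smoothing parameters (one for the level, one for the trend) are all a forecaster has to tune, which illustrates why these methods are so quick to put in place.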

The main advantage of these methods is their simplicity and speed of implementation. Furthermore, they are specialized methods. Depending on the structure of sales history (stationary, seasonal, intermittent, etc.), forecasters determine which method is appropriate in their specific context. These methods have also evolved over time, incorporating technological advancements, especially in terms of data availability and processing.

Even today, these methods are widely used and, on certain datasets, show results comparable to those of machine learning algorithms, particularly when used in ensembles.

However, statistical methods have their limitations. For the most part, they cannot incorporate a large number of external influencing factors, such as meteorological data, for example. These methods are primarily univariate, meaning that in our case, they rely solely on sales data without considering other sources of information.

Although traditional statistical methods are still widely used and offer satisfactory results for certain situations, they have gradually been replaced by more sophisticated approaches based on deep learning, which have made it possible to tackle more complex forecasting challenges in the supply chain.

However, these statistical methods remain relevant for quick and simple forecasts when the complexity of the data is relatively low.

Rise of neural networks (Long Short-Term Memory, Gated Recurrent Unit)

Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU) are based on neural network architectures.

Once neglected due to limited computational power, neural networks experienced a resurgence of interest around 2016-2017, driven in part by Convolutional Neural Networks (CNNs), an architecture initially employed to identify patterns in images for classification. CNNs have also been applied to time series, which explains their use in forecasting.

LSTMs and GRUs can take into account the sequential dependencies of both recent and past events, making it possible to capture complex patterns in time series.

Both technologies rely on supervised learning. The network is trained on a set of historical sequences to predict future values. During the learning process, neural networks are assessed based on their predictions and adjust the connection weights of their neurons to minimize errors.
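These architectures capture sequential dependencies through gating: at each time step, learned gates decide how much of the previous hidden state to keep and how much to overwrite. The sketch below is a minimal NumPy version of a single GRU step; the weight shapes and names are illustrative, and a real network would learn these weights during training rather than drawing them at random:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, W, U, b):
    """One GRU time step; W, U, b hold the update (z), reset (r)
    and candidate (n) parameters as dicts of arrays."""
    z = sigmoid(W["z"] @ x + U["z"] @ h + b["z"])        # update gate
    r = sigmoid(W["r"] @ x + U["r"] @ h + b["r"])        # reset gate
    n = np.tanh(W["n"] @ x + U["n"] @ (r * h) + b["n"])  # candidate state
    return (1 - z) * n + z * h  # blend previous state and candidate

# toy demo: a 1-feature demand series through a 4-unit GRU, random weights
rng = np.random.default_rng(0)
hid, inp = 4, 1
W = {k: 0.5 * rng.normal(size=(hid, inp)) for k in "zrn"}
U = {k: 0.5 * rng.normal(size=(hid, hid)) for k in "zrn"}
b = {k: np.zeros(hid) for k in "zrn"}

h = np.zeros(hid)
for value in [0.2, 0.5, 0.9, 0.4]:  # a short normalized demand sequence
    h = gru_step(np.array([value]), h, W, U, b)
```

The final hidden state `h` summarizes the whole sequence; a forecasting model would feed it into an output layer to produce the next value.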

Recent methods (DeepAR, N-BEATS, ...)

As deep learning gained popularity, new approaches have emerged, particularly to address more complex forecasting challenges.

Among the recent methods used to enhance prediction accuracy in supply chain forecasting is an architecture known as DeepAR, developed by Amazon in 2017. DeepAR is a model based on recurrent neural networks (RNNs) that takes a probabilistic approach, estimating distributions over future values. This makes it possible to provide confidence intervals around predictions, which is highly valuable for decision-making in an uncertain environment.
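The probabilistic idea can be illustrated with a toy example. Assuming the network emits a mean and standard deviation per future step (the numbers below are invented, and DeepAR's actual output distributions depend on its configuration), confidence intervals are read off Monte Carlo samples:

```python
import numpy as np

# Hypothetical output of a DeepAR-style network: one (mean, std) pair per
# future step, parameterizing a Gaussian predictive distribution.
means = np.array([120.0, 125.0, 131.0])
stds = np.array([8.0, 10.0, 13.0])  # uncertainty widens with the horizon

rng = np.random.default_rng(42)
# Monte Carlo: draw many sample trajectories from the predicted distributions
samples = rng.normal(means, stds, size=(10_000, len(means)))

# an 80% confidence interval per step, read off the sample quantiles
p10, p90 = np.quantile(samples, [0.1, 0.9], axis=0)
median = np.quantile(samples, 0.5, axis=0)
```

Rather than a single number, the planner receives a range per step (here, the 10th to 90th percentile), which maps naturally onto safety-stock and service-level decisions.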

An intriguing feature of DeepAR is its capability to predict demand without requiring direct data normalization. This implies that it can handle time series with vastly different magnitudes and asymmetry levels, which is common in the supply chain context. The model is adept at accurate predictions even for products with limited historical data available.

DeepAR is a global model, which means it can learn from multiple time series, thereby enhancing forecast accuracy by leveraging common similarities and structures.

The model has been evaluated on various datasets and has shown promising results, even surpassing state-of-the-art models of its time. Moreover, it requires less preprocessing than certain other models and can be employed with minimal parameter adjustments for different time series.

DeepAR is capable of learning complex structures such as seasonality, making it a powerful method for improving forecast accuracy in the supply chain context.

Cutting-edge methods (Temporal Fusion Transformers, Informer, Autoformer, ...)

Transformers, a class of neural network architectures introduced in 2017, are currently revolutionizing the field of supply chain forecasting.

Like previous methods such as LSTMs and GRUs, Transformers are designed to process sequential data. To achieve this, they leverage attention mechanisms to model relationships between different time sequences.

Transformers operate on the basis of two main blocks: the encoder and the decoder. The encoder creates a vector representation of the sales series. The vectors are then transformed into forecasts by the decoder, all under the supervision of attention mechanisms that help to identify which input sequences are most important for the forecast.
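The attention mechanism that supervises this encoder-decoder exchange can be sketched in a few lines of NumPy (scaled dot-product attention; the toy sequence below is random data standing in for a sales series):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: each query position receives a
    weighted average of the values, weighted by query/key similarity."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # pairwise similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V, weights

# toy self-attention over a sequence of 4 time steps, 2 features each
rng = np.random.default_rng(1)
X = rng.normal(size=(4, 2))
out, w = attention(X, X, X)
```

Each row of `w` shows how much one time step attends to every other step, which is also what gives Transformers a degree of interpretability: large weights point at the past periods driving a given forecast.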

The main strength of Transformers lies in their ability to effectively manage long-term dependencies in time series. Through the use of attention, they can detect complex relationships and interactions between different time periods, enabling them to capture seasonal trends, growth patterns, and behavioral changes in supply chain data.

The benefits of Transformers for supply chain forecasting are manifold. Their ability to model long-term dependencies improves forecast accuracy, particularly for time series with non-linear trends and irregular cycles.

However, they often require more computing power and storage resources than more traditional methods. Nevertheless, Transformers are revolutionizing many fields, and you have probably heard of them without even knowing it, for example with Generative Pre-trained Transformer 3 (GPT-3).


Forecasting evolves according to a number of criteria: the availability of data, the computing capacity of machines, etc. Traditional statistical approaches are still used today, but the search for greater accuracy, coupled with the integration of multiple influencing factors, is driving the development of new deep learning models.

Methods based on neural networks such as LSTM and GRU have shown significant progress, but more recent architectures based on Transformers are pushing the boundaries by offering, in some cases, better interpretability, greater parallel processing capacity, and improved capture of long-term dependencies.

Future challenges regarding Transformers and complex neural network models in general revolve around minimizing computation and memory costs. This issue is becoming increasingly common in the scientific literature, especially for recent problems involving image and text generation.