Time Series Prediction Method Based on LSTM and Attention Mechanism

Tuesday, 6 July 2021 · Testing "Gold" Room

Time series refers to a sequence formed by arranging the values of the same statistical indicator in chronological order. Its essence is the trend of one or more random variables changing over time, and the core of time series prediction methods is to mine such patterns from the data [1]. Time series prediction is widely applied across social life, for example in financial markets, meteorological research, road traffic, and product demand.

1. Introduction to Time Series Prediction Methods

Traditional time series prediction methods, such as autoregression (AR), moving average (MA), and autoregressive moving average (ARMA), are mainly statistical: they assume the time series is stationary and predict by linearly weighting historical values. They are therefore suitable only for small-scale univariate prediction, as they struggle to model the nonlinearity in time series data, leading to low prediction accuracy.
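To make the "linear weighting of historical values" concrete, the sketch below fits an AR(2) model by ordinary least squares on a synthetic series. The series, coefficients, and variable names here are illustrative assumptions, not from any particular library's AR implementation:

```python
import numpy as np

# Synthetic stationary AR(2) series: y_t = 0.6*y_{t-1} - 0.2*y_{t-2} + noise
rng = np.random.default_rng(0)
n = 500
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.6 * y[t - 1] - 0.2 * y[t - 2] + rng.normal(scale=0.1)

# Build the lagged design matrix and solve for the AR coefficients.
X = np.column_stack([y[1:-1], y[:-2]])   # columns: y_{t-1}, y_{t-2}
target = y[2:]
coef, *_ = np.linalg.lstsq(X, target, rcond=None)

# One-step-ahead prediction is a linear weighting of the last two values.
y_next = coef @ np.array([y[-1], y[-2]])
```

Note that the model is purely linear in the lagged values, which is exactly the limitation the text describes: any nonlinear pattern in the data is invisible to it.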

With the continuous development of artificial intelligence, machine learning models such as k-nearest neighbors (KNN), gradient boosted regression trees (GBRT), random forests (RF), and XGBoost have been successfully applied to time series prediction. Compared to traditional methods, machine learning techniques are stronger at nonlinear modeling of time series data, which improves prediction accuracy. However, they build models through feature engineering and do not consider the dependencies between time series observations.

In the digital age, with the continuous improvement of computing power, deep learning technologies have been widely applied. Deep learning has strong nonlinear modeling capabilities, and recurrent neural networks and their variants can effectively model the long-term dependencies in time series data. The attention mechanism can effectively capture the correlations between time series data points, which plays an important role in improving prediction accuracy [2]. Both academia and industry have applied attention-based Long Short-Term Memory (LSTM) networks to time series prediction.

2. Principles of LSTM and the Attention Mechanism

1. LSTM Principles

LSTM is a special type of recurrent neural network designed to address the issues of gradient vanishing and explosion during the training process of long sequences. Figure 1 shows the structure of LSTM, with four neural network layers inside each neuron.


Figure 1 LSTM Structure

Figure 2 shows the neuron state of LSTM, indicated by the horizontal line running across the top. LSTM uses a gating structure to remove information from or add information to the neuron state. A gate is an information-selection mechanism composed of a Sigmoid layer and a pointwise multiplication operation, which determines how much information passes through.
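The gating idea can be sketched in a few lines: the Sigmoid squashes its input to (0, 1), and pointwise multiplication scales the state accordingly. This is a minimal illustration of the mechanism, not any particular library's API:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Gate values near 1 let information through; values near 0 block it.
state = np.array([2.0, -1.0, 0.5])
gate = sigmoid(np.array([10.0, 0.0, -10.0]))  # roughly [1, 0.5, 0]
gated = gate * state  # pointwise multiplication selects the information
```

Here the first component of the state passes almost unchanged, the second is halved, and the third is almost entirely blocked.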


Figure 2 LSTM Neuron State

Figure 3 shows the forget gate, which controls how much information from the previous moment’s internal state needs to be forgotten.


Figure 3 LSTM Forget Gate

Figure 4 shows the input gate, which controls how much information from the current moment’s candidate state needs to be saved.


Figure 4 LSTM Input Gate

Figure 5 shows the output gate, which controls how much information from the current moment’s internal state needs to be output to the external state.


Figure 5 LSTM Output Gate
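Putting the forget, input, and output gates together, a single LSTM step can be sketched as below. This is a minimal NumPy illustration with randomly initialized parameters; the variable names and layout are assumptions for readability, not a reference implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b stack the parameters of the four
    internal layers in the order: forget, input, candidate, output."""
    z = W @ x + U @ h_prev + b
    f, i, g, o = np.split(z, 4)
    f = sigmoid(f)            # forget gate: what to drop from c_prev
    i = sigmoid(i)            # input gate: how much candidate to save
    g = np.tanh(g)            # candidate state at the current moment
    o = sigmoid(o)            # output gate: what to expose as h
    c = f * c_prev + i * g    # new internal (neuron) state
    h = o * np.tanh(c)        # new external (hidden) state
    return h, c

# Tiny example with random parameters and zero initial states.
rng = np.random.default_rng(1)
d_in, d_h = 3, 4
W = rng.normal(size=(4 * d_h, d_in))
U = rng.normal(size=(4 * d_h, d_h))
b = np.zeros(4 * d_h)
h, c = lstm_cell(rng.normal(size=d_in), np.zeros(d_h), np.zeros(d_h), W, U, b)
```

The line `c = f * c_prev + i * g` is the horizontal line in Figure 2: the forget and input gates decide what is removed from and added to the neuron state, and the output gate then decides how much of it becomes the hidden state.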

2. Attention Mechanism Theory

The attention mechanism in deep learning is an optimization method. Its basic idea is inspired by the human visual processing mechanism. When humans gather information with their eyes, they prioritize the most important information and focus more attention on it, thereby extracting key information from complex data.

Attention mechanisms can be divided into soft attention mechanisms and hard attention mechanisms. The weight distribution of the soft attention mechanism ranges from [0, 1], while the hard attention mechanism has weights of either 0 or 1.
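The distinction between the two weight distributions can be shown directly. In this sketch the soft weights come from a softmax, while the hard weights are a one-hot selection; taking the argmax for the hard case is a simplification for illustration, since in practice the hard choice is usually sampled:

```python
import numpy as np

scores = np.array([2.0, 1.0, 0.1])  # hypothetical relevance scores

# Soft attention: continuous weights in [0, 1] that sum to 1.
soft = np.exp(scores) / np.exp(scores).sum()

# Hard attention: each weight is exactly 0 or 1.
hard = np.zeros_like(scores)
hard[np.argmax(scores)] = 1.0
```

Soft attention keeps every input visible to the model with graded importance, which is why it is differentiable and the more common choice in time series prediction.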

In recent years, the attention mechanism has significantly improved accuracy metrics in fields such as computer vision and natural language processing. Consequently, research has emerged applying attention mechanisms to time series prediction to enhance prediction accuracy.

3. Applications in Time Series Prediction

In the field of time series prediction, many researchers combine LSTM with attention mechanisms. This combination typically takes two forms: attention over different time steps and attention over different features. Time-based attention assigns different weights to the hidden layer outputs at different time steps and obtains a context vector for the LSTM through weighted summation; feature-based attention assigns different attention weights to the different dimensions of the output vector [3].

LSTM can effectively learn the temporal correlations in the sequence, and the attention mechanism can effectively extract the dynamic change features of the data, making the analysis of correlations in time series data more accurate, thus leading to more precise prediction results.
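A minimal sketch of time-based attention as described above: score each hidden state, normalize the scores with a softmax, and take the weighted sum as the context vector. The hidden states and the scoring vector are random placeholders here; in a real model the hidden states come from the LSTM and the scoring parameters are learned:

```python
import numpy as np

rng = np.random.default_rng(2)
T, d_h = 5, 4
H = rng.normal(size=(T, d_h))   # stand-in for LSTM hidden states h_1..h_T
v = rng.normal(size=d_h)        # stand-in for a learned scoring vector

# Score each time step, normalize to attention weights, then form
# the context vector as the weighted sum of hidden states.
scores = H @ v
alpha = np.exp(scores - scores.max())   # shift for numerical stability
alpha /= alpha.sum()                    # attention weights over time steps
context = alpha @ H                     # context vector, shape (d_h,)
```

The weights `alpha` make explicit which past moments the prediction relies on, which is the "dynamic change features" the text refers to.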

4. Scenario Applications

The banking industry is a service-oriented industry, where high-quality service is the foundation for gaining customers’ lasting trust. To provide better services, banks need to accurately predict future cash levels, business volumes, transaction volumes, and workloads related to relevant business scenarios. Therefore, the time series prediction methods based on LSTM and attention mechanisms can be applied to the following banking business scenarios:

  • Cash level prediction for self-service ATMs. Banks predict the cash levels stored in each self-service ATM based on business volumes and related management requirements, and formulate cash replenishment plans.

  • Branch cash level prediction. Predict the cash usage for branch cash business, formulate cash allocation plans for branches, and request the cash center for cash preparation and allocation.

  • Remote banking center business volume prediction. Remote banking centers employ large numbers of agents across their business lines, and scheduling based on human experience alone can be unreasonable; business volume predictions can resolve this.

  • Optimizing human resource allocation at branches. By predicting the transaction volumes and workloads at various positions in branches, reasonable staff scheduling can be achieved, optimizing human resource allocation.

  • Reasonable pricing of mortgage-backed securities (MBS). Accurate predictions of prepayments can guide the reasonable pricing of MBS.

References

[1] Yang Haimin, Pan Zhihong, Bai Wei. Overview of Time Series Prediction Methods [J]. Computer Science, 2019, 46(01): 28-35.

[2] Tan Zhanning. Time Series Prediction and Classification Based on Deep Learning [D]. South China University of Technology, 2020.

[3] Li Hao. Stock Price Prediction Based on Multi-Input LSTM [D]. Shanghai Jiao Tong University, 2019.
