Bus Travel Time Prediction Based on Attention-LSTM Neural Network

XU Wanxu, SHEN Yindong

(School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, Hubei 430074)

Abstract: Traditional bus travel time prediction models often ignore information from historical timestamps, which limits prediction accuracy. To exploit the temporal nature of bus travel times, this paper proposes a prediction model based on the LSTM neural network and optimized with an Attention mechanism. First, multiple influencing factors are considered in designing a multivariable LSTM module that associates the current travel time with historical data and extracts multidimensional features. Second, because a single LSTM network cannot automatically recognize the relative importance of different information, an Attention mechanism is introduced to focus on key information and ignore irrelevant data. Finally, real bus GPS data are used to validate the method. Experimental results indicate that the model achieves higher accuracy than five common methods.

Keywords: Intelligent Transportation; Bus Travel Time Prediction; LSTM Neural Network; Attention Mechanism; Bus GPS Data; Deep Learning; Recurrent Neural Network

Classification Number: TN99-34 Document Identification Code: A

Article Number: 1004-373X(2022)03-0083-05

0 Introduction

Bus travel time information is a crucial component of intelligent transportation systems. Accurate travel time information provides essential support for optimizing bus scheduling, real-time dispatching, and bus priority control at intersections, and it has a significant impact on the dynamic allocation of bus resources and on urban traffic structure planning.

In recent years, researchers in China and abroad have studied this problem extensively and proposed four main types of prediction model:

1) Kalman Filter Model[1-2]. For example, reference [1] analyzed the discretization patterns of time under heterogeneous traffic conditions to construct a prediction model based on the Kalman filter. However, this model considers relatively few factors and is suitable for linear systems, making it less appropriate for the highly nonlinear problem of bus travel time prediction.

2) Support Vector Machine (SVM) Model[3-4]. For instance, reference [3] constructed an improved SVM prediction model using seven-dimensional features such as time period and weather, and validated the model’s accuracy with data from the Xiamen BRT-1 route. However, this type of model has high computational complexity and does not scale well to large datasets.

3) Decision Tree Model. For instance, reference [5] constructed a prediction model based on the Gradient Boosting Regression Tree (GBRT), which achieved better prediction accuracy than SVM and Autoregressive Integrated Moving Average (ARIMA) methods. This type of model is highly interpretable but is slow and prone to overfitting.

4) Neural Network Model. This type of model is widely used in travel time prediction. For example, reference [6] combined the Firefly algorithm with a BP neural network to construct a prediction model; reference [7] built a BP neural network prediction model for dynamic bus stops, achieving predictions across multiple stops; reference [8] constructed neural networks using collected historical and real-time data. Neural networks fit nonlinear problems well, which makes them well suited to travel time prediction. However, bus travel time has temporal characteristics: the current travel time is closely related to historical travel times. The weakness of the models above is that they consider only current information and do not fully use historical data, which limits their accuracy.

Compared with traditional learning methods, deep learning has more powerful data learning and abstraction capabilities. LSTM (Long Short-Term Memory) is one of the most popular deep learning techniques today and is capable of retaining historical information. It inherits the advantages of traditional neural networks while also mining historical data, which makes it well suited to time-series problems[9], and it has been widely applied in recent years. Reference [10] constructed an improved LSTM model using data from 66 road segments in the UK; reference [11] used LSTM networks for prediction and compared them with BP neural networks, showing that LSTM had superior accuracy. However, a traditional LSTM encodes the input sequence into a fixed-length vector that must retain all information, which limits the model’s memory and can cause information loss on long sequences.

The introduction of the Attention mechanism can compensate for this defect, as it assigns weights to different information, enhancing memory of important information and ignoring irrelevant data. In recent years, neural networks incorporating attention mechanisms have become a research hotspot, widely used in machine translation, image classification, and other fields, while research on bus travel time prediction using this approach is relatively scarce. Therefore, this paper proposes an Attention-based LSTM prediction model that utilizes the LSTM module to analyze multiple factors synchronously in historical data, integrating the Attention mechanism to automatically extract key information and optimize the model. Finally, comparisons with five common methods reveal that this model has higher prediction accuracy.
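The weighting idea described above can be illustrated with a minimal sketch. It assumes a softmax over one relevance score per historical time step, followed by a weighted sum of the LSTM hidden states. The scoring function used by the model is not specified at this point in the text, so the scores below, the hidden-state values, and the function name `attention_pool` are all hypothetical.

```python
import math


def attention_pool(hidden_states, scores):
    """Return softmax attention weights and the weighted sum of hidden states.

    hidden_states: one vector per historical time step.
    scores: one relevance score per time step (how these are computed
            is an assumption left outside this sketch).
    """
    # Subtract the maximum score for numerical stability before exponentiating.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: weighted sum of the hidden states, dimension by dimension.
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, context


# Illustrative values only: three time steps, two hidden units each.
hidden_states = [[0.1, 0.3], [0.4, 0.2], [0.9, 0.7]]
scores = [0.0, 0.0, 2.0]  # the last step is scored as most relevant
weights, context = attention_pool(hidden_states, scores)
```

Because the weights sum to one, a time step with a high score dominates the context vector while low-scoring steps contribute little, which is the sense in which irrelevant information is "ignored".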

1 Problem Definition

This paper aims to design a travel time prediction method based on the large number of travel time samples accumulated by bus companies. Bus travel times vary randomly across dates and time periods and are closely related to dynamic factors such as road conditions and accidents[12]. Because the interval between consecutive bus trips is short, adjacent trips often experience similar road conditions, so historical data contain information that influences future outcomes. The current travel time is therefore related to travel times at historical timestamps: bus travel times form a time series with sequential dependencies.

Based on the temporal nature of travel times, the problem can be described as follows: predicting the bus travel time $y_t$ at time $t$ based on the historical travel time sequence $[y_{t-s}, \dots, y_{t-2}, y_{t-1}]$ (where $s$ represents the length of the time step, i.e., the number of historical timestamps) and the historical features $[x_{t-s}, \dots, x_{t-2}, x_{t-1}]$.

Here, $x_i = (x_{i,1}, x_{i,2}, \dots, x_{i,n})^T$ is the vector of values of the factors affecting travel time at the $i$-th moment, and $n$ is the number of influencing factors. $F$ denotes the mapping from the input values to the predicted value, that is, $y_t = F(y_{t-s}, \dots, y_{t-1}, x_{t-s}, \dots, x_{t-1})$. The goal of this paper is to find a suitable model to fit this complex nonlinear mapping.
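The problem definition above amounts to a sliding-window formulation, which can be sketched as follows. This is an illustrative sketch rather than the paper's preprocessing: the function name `make_samples`, the choice to concatenate each historical travel time with its feature vector, and all numeric values are assumptions.

```python
def make_samples(y, x, s):
    """Build (window, target) pairs for one-step-ahead prediction.

    y: travel times, one per timestamp.
    x: feature vectors, one per timestamp (n factors each).
    s: number of historical timestamps in each window.

    For each t >= s, the window covers timestamps t-s .. t-1 and the
    target is y[t].
    """
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    samples = []
    for t in range(s, len(y)):
        # Assumption: each window row is [y_i, x_i1, ..., x_in].
        window = [[y[i]] + list(x[i]) for i in range(t - s, t)]
        samples.append((window, y[t]))
    return samples


# Made-up example: five trips, two influencing factors, window length s = 3.
y = [300.0, 320.0, 310.0, 350.0, 340.0]
x = [[0, 1], [0, 1], [1, 1], [1, 0], [1, 0]]
samples = make_samples(y, x, s=3)
```

With five timestamps and $s = 3$, this yields two samples. Each window has $s$ rows of $n + 1$ values, which is the input shape a multivariable sequence model would consume under these assumptions.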

2 Analysis of Influencing Factors on Bus Travel Time

Common LSTM models only consider the
