Bayesian Optimization of CNN-LSTM Hybrid Neural Network Prediction (Matlab Implementation)

💥1 Overview

A convolutional neural network (CNN) is modeled on the biological visual perception mechanism and can perform both supervised and unsupervised learning[33]. Parameter sharing in the convolution kernels of the hidden layers, together with sparse inter-layer connections, lets a CNN extract deep local features from high-dimensional data at relatively low computational cost and obtain effective representations through its convolutional and pooling layers[34]. The CNN consists of 2 convolutional layers and 1 flattening operation, where each convolutional layer comprises 1 convolution operation and 1 pooling operation. After the second pooling operation, a fully connected layer flattens the high-dimensional data into one-dimensional data for easier downstream processing. The CNN structure is shown in Figure 1.
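
The layer order described above (convolution, pooling, convolution, pooling, flatten) can be traced with a small sketch. It is written in plain Python rather than Matlab so that it runs without any toolbox, it works on a 1-D signal, and every kernel value and size in it is a made-up placeholder rather than a setting from the post.

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation form, as deep learning libraries use)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def max_pool1d(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


def conv_block(channels, kernel_sets):
    """One 'convolutional layer' in the text's sense: convolution, then pooling.

    kernel_sets holds one list of kernels per output channel, with one kernel
    for each input channel; responses are summed over the input channels.
    """
    outputs = []
    for kernel_set in kernel_sets:
        summed = None
        for channel, kernel in zip(channels, kernel_set):
            response = conv1d(channel, kernel)
            if summed is None:
                summed = response
            else:
                summed = [a + b for a, b in zip(summed, response)]
        outputs.append(max_pool1d(summed))
    return outputs


def flatten(channels):
    """Concatenate all channels into a single 1-D feature vector."""
    return [v for channel in channels for v in channel]


# Hypothetical input: one channel with 16 time steps.
x = [float(i % 5) for i in range(16)]
# Hypothetical kernels: block 1 maps 1 -> 2 channels, block 2 maps 2 -> 2 channels.
block1 = [[[1.0, 0.0, -1.0]], [[0.5, 0.5, 0.5]]]
block2 = [[[1.0, -1.0], [0.5, 0.5]], [[-1.0, 1.0], [1.0, 0.0]]]

h1 = conv_block([x], block1)      # 16 -> 14 after convolution -> 7 after pooling
h2 = conv_block(h1, block2)       # 7 -> 6 after convolution -> 3 after pooling
features = flatten(h2)            # 2 channels x 3 values = 6 features
print(len(h1[0]), len(h2[0]), len(features))  # 7 3 6
```

Activation functions and learned weights are left out on purpose; the sketch only shows how each stage changes the shape of the data before it reaches the recurrent part of the network.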

When the number of time steps is large, a recurrent neural network (RNN) cannot keep its historical gradient information within a reasonable range; the gradients decay or explode, which makes it difficult for the RNN to capture useful information from long sequences[35]. The long short-term memory network (LSTM), a special type of RNN, effectively addresses the vanishing-gradient problem[36]. The gated recurrent unit (GRU), proposed on the basis of LSTM, has a simpler structure and fewer parameters, and therefore trains faster[37]. The structure of the GRU is shown in Figure 2.
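
To make the "fewer parameters" claim concrete, the following Python sketch counts trainable parameters for one LSTM layer and one GRU layer. It assumes the textbook formulation with a single bias vector per gate block; some libraries keep additional bias terms, so their reported counts can differ. The layer sizes are illustrative and not taken from the post.

```python
def gated_rnn_params(input_size, hidden_size, n_blocks):
    """Parameters of one recurrent layer made of n_blocks gate/candidate blocks.

    Each block has input weights, recurrent weights, and one bias vector.
    """
    per_block = (hidden_size * input_size      # input weights
                 + hidden_size * hidden_size   # recurrent weights
                 + hidden_size)                # bias
    return n_blocks * per_block


def lstm_params(input_size, hidden_size):
    # Input, forget, and output gates plus the candidate cell state.
    return gated_rnn_params(input_size, hidden_size, 4)


def gru_params(input_size, hidden_size):
    # Update and reset gates plus the candidate hidden state.
    return gated_rnn_params(input_size, hidden_size, 3)


n_in, n_hidden = 8, 64                 # illustrative sizes only
print(lstm_params(n_in, n_hidden))     # 18688
print(gru_params(n_in, n_hidden))      # 14016
```

Under this assumption a GRU layer always has three quarters of the parameters of an LSTM layer of the same size, which is one reason it tends to train faster.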

Bayesian optimization (BO), also known as sequential model-based optimization (SMBO), is a derivative-free technique. BO uses a Gaussian process regression model to estimate the objective function[40]. First, two sets of random hyperparameters are evaluated. A probabilistic model is then used to sequentially build up prior knowledge of the optimization problem, and the objective function f(z) is scalarized[41], as shown in the equation.
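
The equation itself is not reproduced in this copy of the post. Assuming the usual statement of hyperparameter optimization as a minimization problem, it would take roughly the following form, where Z is a symbol introduced here (not taken from the original) for the constrained domain of f:

```latex
z^{*} = \arg\min_{z \in Z} f(z)
```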

In the equation, z* is the global optimum of f(z) over its constrained domain, whose variables may be real-valued, integer, or categorical. The advantages of BO include fast convergence, good performance, strong scalability, and suitability for hyperparameter optimization problems, especially when features are non-parametric. Its drawbacks for hyperparameter optimization fall into two categories: training time and the tuning of BO's own parameters. Because BO is a sequential method, it is hard to parallelize in order to reduce computation time[27]. The kernel function of BO is also difficult to tune. Recent research has addressed these issues, for example by standardizing BO parameters.
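
The sequential loop described above can be sketched in a few dozen lines. The version below is a minimal illustration in standard-library Python, not the post's Matlab code: it tunes a single scalar hyperparameter, uses a Gaussian process with a squared-exponential kernel as the surrogate, and picks the next point by expected improvement. The kernel length scale, the jitter, the candidate grid, and the `toy_loss` objective are all assumptions made for the example; in a real run the objective would train the network and return its validation error.

```python
import math
import random


def rbf(a, b, length=0.2):
    """Squared-exponential kernel; the length scale is an assumed value."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value when minimizing."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf


def bayes_opt(f, low, high, n_iter=10, seed=0, jitter=1e-6):
    rng = random.Random(seed)
    xs = [rng.uniform(low, high) for _ in range(2)]   # two random initial evaluations
    ys = [f(x) for x in xs]
    grid = [low + (high - low) * i / 100 for i in range(101)]
    for _ in range(n_iter):
        offset = sum(ys) / len(ys)                    # center targets for the zero-mean GP
        K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(xs)]
             for i, a in enumerate(xs)]
        alpha = solve(K, [y - offset for y in ys])
        best = min(ys)
        best_ei, best_x = -math.inf, None
        for cand in grid:
            if cand in xs:                            # do not re-evaluate a point
                continue
            k_star = [rbf(a, cand) for a in xs]
            mean = offset + dot(k_star, alpha)
            var = 1.0 - dot(k_star, solve(K, k_star))
            ei = expected_improvement(mean, math.sqrt(max(var, 1e-12)), best)
            if ei > best_ei:
                best_ei, best_x = ei, cand
        xs.append(best_x)                             # one new evaluation per iteration
        ys.append(f(best_x))
    i_best = min(range(len(ys)), key=ys.__getitem__)
    return xs[i_best], ys[i_best], xs, ys


def toy_loss(z):
    # Hypothetical stand-in for "train with hyperparameter z, return validation error".
    return (z - 0.3) ** 2


z_best, f_best, zs, fs = bayes_opt(toy_loss, 0.0, 1.0)
print(z_best, f_best)
```

Each iteration depends on all earlier evaluations, which is why the method is hard to parallelize, and the hard-coded length scale shows the kind of kernel setting that has to be tuned in practice.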

📚2 Running Results

🎉3 References

Some of the theory is sourced from the internet; in case of any infringement, please contact us for removal.

[1] Zou Zhi, Wu Tiezhu, Zhang Xiaoxing, et al. Short-term load forecasting based on Bayesian optimization CNN-BiGRU hybrid neural network [J]. High Voltage Engineering, 2022, 48(10): 3935-3945. DOI:10.13336/j.1003-6520.hve.20220168.

🌈4 Matlab Code Implementation
