A Simple Explanation of Neural Networks


Neural networks are inspired by the workings of biological neural networks. Artificial neural networks are typically optimized with learning methods grounded in mathematical statistics, making them a practical application of statistical methods.

Like other machine learning methods, neural networks have been used to solve a variety of problems, such as machine vision and speech recognition, which are difficult to address with traditional rule-based programming.

1. Neural Networks

In the field of machine learning, neural networks refer to mathematical or computational models established to mimic the structure and function of biological neural networks for function estimation or approximation.

For example, given some data on the area and price of houses on the market, we need to build a housing price prediction model based on this data. That is, input the area of a house and expect the model to output a predicted price. Clearly, this is a linear regression problem, as housing prices are generally positively correlated with the area of the house. The relationship of the known data can be represented in a Cartesian coordinate system:
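As a minimal sketch of this idea (with made-up area and price numbers), a straight line can be fitted to such data by least squares:

```python
import numpy as np

# Hypothetical data: house area (square meters) vs. price (in 10k units)
areas = np.array([50.0, 80.0, 100.0, 120.0, 150.0])
prices = np.array([150.0, 240.0, 310.0, 360.0, 450.0])

# Least-squares fit of price = w * area + b
w, b = np.polyfit(areas, prices, 1)

def predict(area):
    """Predicted price for a given house area."""
    return w * area + b
```

Because price rises with area, the fitted slope `w` comes out positive, matching the positive correlation described above.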

[Figure: house prices plotted against area in a Cartesian coordinate system]

Fitting a line to the data and clipping it so that predicted prices are never negative yields the ReLU (Rectified Linear Unit) shape shown in the figure.

[Figure: the linear fit clipped at zero, giving the ReLU shape]

In this simple example, the area of the house is the input, the housing price is the output, and the ReLU function acts as a neuron to produce the output.
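This single neuron can be sketched in a few lines (the weight and bias values here are arbitrary placeholders):

```python
import numpy as np

def relu(z):
    # Rectified Linear Unit: passes z through when positive, outputs 0 otherwise
    return np.maximum(0, z)

def neuron(x, w, b):
    # A single "neuron": linear step w*x + b followed by the ReLU activation
    return relu(w * x + b)
```

For example, with a positive weight the neuron maps larger areas to larger (never negative) prices.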

However, housing prices are also influenced by factors such as the number of bedrooms, the location of the house, and the wealth level of the area, necessitating the construction of a more complex neural network model.

[Figure: a neural network taking multiple input features and feeding them through hidden units]

This is the basic structure of a neural network model: the hidden layer learns its own intermediate features from the inputs and passes them on to produce the output. Given enough training data, such a model can be trained to produce reasonably accurate predictions.
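A forward pass through such a network can be sketched as follows. This is a toy illustration, not the article's model: the layer sizes, the random weights, and the four example features (area, bedrooms, a location score, a wealth index) are all assumptions.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

# Tiny network: 4 input features -> 3 hidden units -> 1 output
rng = np.random.default_rng(0)
W1 = rng.standard_normal((3, 4)) * 0.01  # hidden-layer weights
b1 = np.zeros((3, 1))                    # hidden-layer biases
W2 = rng.standard_normal((1, 3)) * 0.01  # output-layer weights
b2 = np.zeros((1, 1))                    # output-layer bias

def forward(x):
    # x: column vector of input features
    a1 = relu(W1 @ x + b1)   # hidden-layer activations
    return W2 @ a1 + b2      # predicted price (linear output)

x = np.array([[100.0], [3.0], [0.8], [0.5]])
y_hat = forward(x)
```

Training would then adjust all the weights and biases so that `y_hat` matches the observed prices.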

In simple terms, deep learning is a more complex form of neural networks.

[Figure: a logistic regression model]

In this model, a cost function is established first, and then gradient descent is used repeatedly to find the optimal values of the parameters w and b. A cat recognizer written with this algorithm still does not achieve high accuracy; to improve recognition accuracy further, a multi-layer neural network must be built and trained on the samples.
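The cost-function-plus-gradient-descent loop can be sketched for logistic regression as below. This is a standard formulation (cross-entropy cost, sigmoid output), not code from the article, and the data in the usage example is made up.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X, y, alpha=0.1, iters=1000):
    """Gradient descent on the cross-entropy cost for logistic regression.

    X: (n_features, m) data matrix; y: (1, m) labels in {0, 1}.
    Returns the learned parameters w and b."""
    n, m = X.shape
    w = np.zeros((n, 1))
    b = 0.0
    for _ in range(iters):
        a = sigmoid(w.T @ X + b)   # predictions, shape (1, m)
        dz = a - y                 # gradient of the cost w.r.t. z
        dw = (X @ dz.T) / m        # gradient w.r.t. w
        db = np.sum(dz) / m        # gradient w.r.t. b
        w -= alpha * dw            # gradient descent update
        b -= alpha * db
    return w, b
```

Each iteration nudges w and b downhill on the cost surface, which is what "continuously using gradient descent" refers to.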

2. Symbol Convention

In the neural network shown in the figure, the first layer is the input layer, the middle is the hidden layer, and the last is the output layer. The middle layer is called hidden because, during training, we can see the input samples and the output results, but the values produced by the nodes in the middle layer are not observed in the data.

Thus, the middle layer is called the hidden layer simply because you will not see it in the training set.

[Figure: a neural network with an input layer, a hidden layer, and an output layer]

In the earlier logistic regression, X represented the input; here we write it as a^[0], where the number in the bracketed superscript "[ ]" indicates the layer of the neural network, and the symbol a stands for activation, meaning the values passed from one layer of the network to the next.
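In this notation, the computation performed layer by layer can be written as follows (a standard formulation, where g^[l] denotes the activation function used in layer l):

```latex
\begin{aligned}
a^{[0]} &= x && \text{(the input layer)} \\
z^{[l]} &= W^{[l]} a^{[l-1]} + b^{[l]} && \text{(linear step in layer } l\text{)} \\
a^{[l]} &= g^{[l]}\!\left(z^{[l]}\right) && \text{(activation passed to layer } l+1\text{)}
\end{aligned}
```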


3. Representation of Neural Networks


In logistic regression, both parameters can be initialized to zero. In neural networks, the weights w are typically initialized randomly, while the bias b can be initialized to 0.

Parameters other than w and b, such as the learning rate alpha, the number of layers l in the neural network, the number of nodes n^[l] in layer l, and the choice of activation function for the hidden layers, are referred to as hyperparameters, because their values determine how the final values of the parameters w and b are learned.
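In code, hyperparameters are simply values chosen by the practitioner before training begins; the specific numbers below are illustrative assumptions, not recommendations:

```python
# Hyperparameters are set by hand (or by search) before training;
# the parameters w and b are then learned under these settings.
hyperparams = {
    "learning_rate": 0.01,   # alpha: step size for gradient descent
    "num_layers": 3,         # l: depth of the network
    "hidden_units": [4, 3],  # n^[l]: number of nodes in each hidden layer
    "activation": "relu",    # activation function for the hidden layers
    "iterations": 1000,      # how many gradient descent steps to run
}
```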

Alright, that’s it for this section; the next one will explain activation functions and related concepts~


Recommended Reading:

[Andrew Ng Deeplearning.ai Notes 1] Intuitive Explanation of Logistic Regression

[Basic Mathematical Knowledge] Understanding the Essence of Taylor Expansion

Comparison of Tsinghua, Peking, Zhejiang, and Nanjing Universities in AI Strength?

Welcome to follow our public account for learning and communication~
