This article briefly introduces the main work of the paper “Recognizing Online Handwritten Chinese Characters Using RNNs with New Computing Architectures”, accepted by Pattern Recognition in April 2019. The paper addresses the end-to-end recognition of handwritten Chinese characters.
Handwritten input is a very common human-computer interaction method. With the continuous development of deep learning, researchers have gradually applied deep neural networks to online handwritten Chinese character recognition[1][2][3], and the technology has become increasingly mature.
Generally speaking, the handwritten Chinese characters we commonly see are written on touch screens. When writing on a touch screen, the hand or arm is usually supported, so the characters written on the touch screen are mostly quite neat.
In recent years, a new handwriting method has emerged: air writing. It typically uses a sensor that can track finger position (such as Leap Motion) to record the trajectory of the user’s finger as a Chinese character is written, and displays the resulting trajectory (the character) on screen.
Characters produced by air writing are generally written in a single continuous stroke, with no pen-up or pen-down marks. In addition, their shapes tend to be more irregular than those of characters written on a touch screen.
We provide an introductory diagram of the air writing character recognition system and a comparison between air-written characters and traditional handwritten characters in Fig. 1.
This paper addresses these two types of handwritten Chinese characters and designs an end-to-end recognizer based on recurrent neural networks (RNNs) that achieves good recognition results on both types of handwritten Chinese character datasets.
In addition, we have added two new computing architectures based on traditional Recurrent Neural Networks:
1. Variance constraint;
2. Attention weight vector.
By adding these two new computing architectures, the recurrent neural network achieves a high recognition rate with fewer parameters.
Fig. 2 illustrates the basic network structure for handwritten character recognition. It consists of an N-layer unidirectional recurrent neural network, a hidden-state processing layer, and a fully connected layer.
At each time step t, the network receives one position coordinate of the handwritten character sample and computes the corresponding hidden state vector. After all position coordinates of the input sample have been received and processed, the hidden state vectors are aggregated and fed to the fully connected layer, and classification is performed by a softmax classifier.
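The pipeline above (coordinates in, one RNN step per coordinate, hidden states aggregated, softmax classification) can be sketched roughly as follows. This is a hypothetical minimal illustration, not the paper's implementation: it uses a single vanilla-RNN layer and simple mean pooling in place of the paper's hidden-state processing, and all weight names (`Wx`, `Wh`, `Wfc`, …) are made up for the example.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def rnn_classify(coords, Wx, Wh, b, Wfc, bfc):
    """Run a single-layer unidirectional RNN over the (x, y) stroke
    coordinates of one character, mean-pool the hidden states, and
    classify with a fully connected softmax layer."""
    h = np.zeros(Wh.shape[0])
    states = []
    for x_t in coords:                # one coordinate per time step
        h = np.tanh(Wx @ x_t + Wh @ h + b)
        states.append(h)
    pooled = np.mean(states, axis=0)  # stand-in for hidden-state processing
    return softmax(Wfc @ pooled + bfc)

# Toy example: 2-D coordinates, hidden size 8, 5 character classes.
rng = np.random.default_rng(0)
coords = rng.standard_normal((20, 2))         # 20 sampled pen positions
Wx = rng.standard_normal((8, 2)) * 0.1
Wh = rng.standard_normal((8, 8)) * 0.1
b = np.zeros(8)
Wfc = rng.standard_normal((5, 8)) * 0.1
bfc = np.zeros(5)
probs = rnn_classify(coords, Wx, Wh, b, Wfc, bfc)
print(probs.shape, round(probs.sum(), 6))     # (5,) 1.0
```

In the paper the aggregation step is more elaborate (see the attention weight vector below in the article), but the overall data flow is the same.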
1. Brief Description of Variance Constraint


From the above expression, the larger the absolute value of an element of the hidden state vector, the more likely the corresponding parameter is a key parameter for that sample; if an element is 0, the corresponding parameter contributes little to classifying that sample. In other words, the magnitude of each element of the hidden state vector determines how important the corresponding parameter is for classifying the sample, i.e., whether that parameter is a key parameter for it.
Therefore, during training we add a constraint on the variance of the hidden state vector to the loss function. This constraint shrinks the absolute values of the elements of the hidden state vector and concentrates them around the mean, reducing the number of large-magnitude elements and thereby the number of key parameters for the input sample.
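The combined objective can be sketched as cross-entropy plus a variance penalty on the hidden state. This is a minimal illustration, assuming a single hidden state vector and a trade-off hyperparameter `lam` (the paper tunes such a hyperparameter manually; the name and value here are hypothetical):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def loss_with_variance_constraint(h, logits, label, lam=0.1):
    """Cross-entropy loss plus a penalty on the variance of the
    hidden state vector h; lam trades off the two terms."""
    probs = softmax(logits)
    ce = -np.log(probs[label] + 1e-12)
    var_penalty = lam * np.var(h)   # pulls elements toward their mean
    return ce + var_penalty

h = np.array([0.9, -0.8, 0.05, 0.1])   # toy hidden state
logits = np.array([2.0, 0.5, -1.0])    # toy classifier outputs
print(round(loss_with_variance_constraint(h, logits, 0), 4))
```

Minimizing the penalty term drives the elements of `h` toward its mean, which is the effect described above.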
2. Brief Description of the Attention Weight Vector

For the current input sample, the hidden state vectors at different time steps of the RNN have different importance for recognizing that sample. We therefore let the RNN itself generate an attention weight vector that assigns a different weight to the hidden state at each time step: the last dimension of the RNN’s hidden state vector is taken directly as the weight of that hidden state, as shown in Fig. 4.
The stroke coordinates of a handwritten Chinese character are continuous: the current position is closely related to the previous and next positions, and likewise the hidden state vector at the current time step is closely related to those at adjacent time steps.
Therefore, when computing the weight of the hidden state at the current time step, the hidden states of adjacent time steps should also be taken into account, so we smooth the weights over time, as shown in Fig. 5.
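The weighting scheme above can be sketched as follows: take the last dimension of each hidden state as its raw weight, smooth the weights over time with a moving average, normalize, and pool the remaining dimensions as a weighted sum. This is a hedged sketch, not the paper's exact formulation; the window size `win` and the softmax normalization are assumptions made for the example.

```python
import numpy as np

def attention_pool(states, win=3):
    """Pool RNN hidden states using the last dimension of each state
    as its attention weight, smoothed over time. No extra trainable
    parameters are introduced."""
    states = np.asarray(states)                 # shape (T, H)
    raw = states[:, -1]                         # last dim -> raw weight
    kernel = np.ones(win) / win
    w = np.convolve(raw, kernel, mode="same")   # temporal smoothing
    w = np.exp(w - w.max())
    w /= w.sum()                                # normalize weights
    return (w[:, None] * states[:, :-1]).sum(axis=0)

rng = np.random.default_rng(1)
states = rng.standard_normal((10, 9))  # 10 time steps, hidden size 9
pooled = attention_pool(states)
print(pooled.shape)  # (8,)
```

Note that, consistent with the paper's claim, this pooling reuses the hidden state itself as the weight and so adds no parameters beyond the RNN's own.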
TABLE 1. Effectiveness comparison of the “variance constraint” on IAHCC-UCAS2016.
From TABLE 5 and TABLE 6, it can be seen that the proposed method achieves state-of-the-art results on two handwritten Chinese character datasets: the ICDAR-2013 competition database and IAHCC-UCAS2016. From TABLE 1 through TABLE 4, the two new computing architectures proposed in this article effectively improve the recognition performance of the system, especially for systems with fewer parameters.
This article proposes an end-to-end recognizer for online handwritten Chinese characters. It introduces two new computing architectures based on traditional RNNs: (1) variance constraint; (2) attention weight vector.
The variance constraint mechanism effectively reduces the number of key parameters used to represent a single sample, and therefore the number of samples each parameter must represent. This makes it more likely that the parameters of the RNN obtain an optimal representation of the input samples.
Using the attention weight vector to represent the importance of the hidden state vector at different time steps introduces no additional parameters, unlike existing attention mechanisms, while still achieving competitive results.
Extensive experimental results show that the two proposed computing architectures can effectively improve the performance of traditional RNNs. However, for the variance constraint mechanism, hyperparameter selection is particularly time-consuming in experiments, so an adaptive algorithm is needed to replace the current manual selection.
In addition, the two computing architectures proposed in this article should not be limited to recurrent neural networks but should be improved and applied to other network structures as well.
[1] X.-Y. Zhang, F. Yin, Y.-M. Zhang, C.-L. Liu, Y. Bengio, Drawing and recognizing Chinese characters with recurrent neural networks, TPAMI 40 (4) (2017) 849-862. Paper link: https://arxiv.org/pdf/1606.06539.pdf
[2] W. Yang, L. Jin, Z. Xie, Z. Feng, Improved deep convolutional neural network for online handwritten Chinese character recognition using domain-specific knowledge, ICDAR 15 (6) (2015) 551-555. Paper link: https://arxiv.org/abs/1505.07675
[3] H. Ren, W. Wang, K. Lu, J. Z. Q. Yuan, An end-to-end recognizer for in-air handwritten Chinese characters based on a new recurrent neural network, ICME (2017) 841-846. Paper link: https://ieeexplore.ieee.org/document/8019443
Original authors: Haiqing Ren, Weiqiang Wang, Chenglin Liu
Written by: Ren Haiqing
Formatted by: Gao Xue
Reviewed by: Yin Fei
Published by: Jin Lianwen
Disclaimer: (1) This article only represents the views of the author; personal understanding and summary may not be accurate or comprehensive, and the complete ideas and arguments of the paper should be based on the original paper. (2) The views in this article do not represent the position of this public account.