A Novel Hybrid Method for Plant Classification Based on CNN-KNN

Abstract: Plant classification is an interesting problem in the field of computer vision. Many researchers have classified plants through their leaves and flowers, and multiple studies have shown that leaves are the best and most consistent source of information for plant classification. However, classifying plants based on leaf structure is somewhat challenging. It is therefore necessary to normalize the plant leaves to the same size to achieve better performance. In this paper, we propose a novel hybrid method based on CNN and KNN (CNN-KNN). The method has been tested on two datasets, LeafSnap and Flavia, where it achieved the highest accuracy of 98.4% and 96.5%, respectively.

Keywords: CNN, KNN, LeafSnap dataset, Flavia dataset

1. Introduction

Earth is home to a wide variety of plants, which play a key role in monitoring the environment and improving the climate. Traditional plant classification requires botanical knowledge to identify plants through their leaves, and manual classification faces many issues and challenges. To overcome these difficulties, we need a better-performing automatic plant recognition system. Such a system can classify plants from simple information, without the user needing to master botanical classification. Leaves provide an effective medium for plant classification, and can even be used to identify plant diseases.

The main goal of this paper is to construct an optimal predictive model for plant classification. To this end, we propose a hybrid method to improve classification accuracy. Image classification is an important and widely used technology in the field of computer vision. Computer vision methods can be mainly divided into two categories: machine learning methods and deep learning methods. Among them, machine learning methods include K-nearest neighbors (KNN), while deep learning methods include Convolutional Neural Networks (CNN). This paper presents a hybrid method based on CNN and KNN (CNN-KNN) for plant classification. In this hybrid method, we combine CNN with KNN and test it on two datasets, LeafSnap and Flavia.

2. Related Work

Many studies have been completed regarding plant classification. Different researchers classify plants by extracting features from leaf datasets. Some research papers on plant classification have been published in high-quality journals and top conferences. Many researchers have conducted studies on plant classification problems based on artificial neural networks and machine learning methods.

2.1 Convolutional Neural Networks (CNN)

In deep learning, Convolutional Neural Networks (CNN) can automatically extract features from leaf images. CNNs rely on two operations, known as “convolution” and “pooling”. Together, these operations extract features from the leaves, which are then used to classify plants.
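To make the two operations concrete, here is a minimal sketch in plain Python, not the network used in the paper. The toy 5x5 image and the 2x2 edge kernel are illustrative assumptions; a trained CNN learns its kernels from data and stacks many such layers.

```python
def conv2d(image, kernel):
    """Valid (no padding), stride-1 2D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 5x5 grayscale patch (assumed data): a bright vertical stripe on a dark background.
image = [[0, 0, 1, 0, 0] for _ in range(5)]

# Hypothetical hand-written vertical-edge kernel, for illustration only.
kernel = [[-1, 1],
          [-1, 1]]

features = conv2d(image, kernel)   # 4x4 feature map highlighting the stripe's edges
pooled = max_pool2d(features)      # 2x2 map keeping the strongest response per window
```

Convolution responds strongly where the kernel's pattern appears, and pooling shrinks the feature map while keeping the strongest responses, which is why the pair is useful for summarizing leaf shape and texture.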

2.2 Proposed Model

In this paper, we propose the CNN-KNN hybrid method, which predicts plant categories on the LeafSnap and Flavia leaf datasets. First, the method is trained on these leaf datasets, and the system extracts features from the training data. A user then provides a leaf image for the system to test, and the features extracted from this test image are stored. Finally, the system compares the test features with the training features and uses the KNN algorithm to predict the plant category.
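The final comparison step can be sketched as follows. This is a minimal illustration that assumes the CNN has already turned each leaf image into a feature vector; the function name `knn_predict`, the Euclidean distance, the value of `k`, and the toy vectors and species labels are all assumptions, since the article does not specify them.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Majority vote among the k training vectors closest to `query` (Euclidean distance)."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-dimensional "CNN features"; real feature vectors are much longer.
train_features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
train_labels = ["maple", "maple", "oak", "oak"]  # placeholder species names

prediction = knn_predict(train_features, train_labels, [0.85, 0.15])
```

Because KNN has no training phase of its own, the stored training features act as the model: prediction is just a distance computation followed by a vote.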

2.3 Algorithm Steps

2.4 Algorithm Flow

2.5 Experimental Environment

In our study, we used the Python language together with existing libraries such as pandas, NumPy, matplotlib, and OpenCV for data analysis and plotting. pandas is used for data structuring, NumPy for basic numerical calculations, and OpenCV for computer vision tasks in plant classification.

2.6 Data Preprocessing

Before inspection, the collected leaf photos were converted to JPEG (.jpg) format using Python scripts. Scaling the images during this conversion also reduced noise in the leaf photos. The data preprocessing additionally used CNN-based leaf image reconstruction and leaf measurement techniques.
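As an illustration of the scaling step, the sketch below resizes a grayscale image to a fixed size and rescales its pixel values. It is written in plain Python for self-containment; the nearest-neighbour interpolation, the target size, and the [0, 1] value range are assumptions, and in practice a library such as OpenCV would normally do this work.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2D grayscale image (list of rows) using nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]


def scale_pixels(image, max_value=255):
    """Map integer pixel intensities in [0, max_value] to floats in [0, 1]."""
    return [[pixel / max_value for pixel in row] for row in image]


# Toy 4x4 image (assumed data) reduced to 2x2, then normalized.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [255, 255, 0, 0],
    [255, 255, 0, 0],
]
small = resize_nearest(image, 2, 2)
normalized = scale_pixels(small)
```

Bringing every leaf image to the same dimensions is what allows a single CNN input layer to accept all of them.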

3. Datasets

3.1 Flavia Leaf Dataset: This leaf dataset is used for plant image classification and contains 32 plant species. Each leaf image file is named with a 4-digit number and the extension “.jpg”. The plant leaf images in the Flavia dataset are shown below.

3.2 LeafSnap Dataset: This is another leaf dataset used for plant image classification. It contains two kinds of leaf images: laboratory images and field images. The laboratory images are taken under bright lighting against a pure white background, and the leaves in them have a certain shape. The laboratory set contains 23,147 high-resolution leaf images, while the field set contains 7,716 leaf images, which are noisy. The leaf images in the LeafSnap dataset are shown below.

3.3 Training

The proposed CNN-KNN based hybrid method is applied to two datasets: LeafSnap and Flavia. The structure, number of species, total number of leaf images, and total number of images after data augmentation for the two datasets are shown in the table below. The images in both datasets have been augmented. Of the total data, 60% is used for training and 40% for testing.
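A 60/40 split of this kind can be sketched as below. The article does not say whether the split was random, stratified by species, or fixed, so the seeded random shuffle and the helper name `split_train_test` are assumptions made for illustration.

```python
import random


def split_train_test(samples, train_fraction=0.6, seed=0):
    """Shuffle a copy of `samples` and split it into (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Ten placeholder sample ids stand in for leaf images.
train, test = split_train_test(range(10))
```

With class-imbalanced leaf datasets, a per-species (stratified) split is often preferable so that every species appears in both parts.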

4. Results Analysis

In this paper, the CNN-KNN based hybrid method predicts the categories of plants. First, the system is trained using the leaf datasets. Then a test leaf image is input, and the system provides the prediction result. On the LeafSnap and Flavia datasets, CNN + KNN achieved the highest accuracy of 98.4% and 96.5%, respectively.

Metrics were calculated to measure the performance of the CNN-KNN based hybrid method.

The proposed CNN-KNN hybrid method performs better than the other models compared: it has the lowest error rate and therefore ranks first.

Figures 5 and 6 show the error rates and accuracy of the different models. The CNN-KNN hybrid model shows good results.
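For reference, accuracy and error rate relate to each other as in the sketch below. The article does not give its exact metric definitions, so the standard ones are assumed here (fraction of correct predictions, and its complement), and the labels are placeholder data.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def error_rate(true_labels, predicted_labels):
    """Fraction of predictions that are wrong, i.e. 1 - accuracy."""
    return 1.0 - accuracy(true_labels, predicted_labels)


# Placeholder species labels: 4 of the 5 predictions are correct.
y_true = ["maple", "oak", "oak", "maple", "oak"]
y_pred = ["maple", "oak", "maple", "maple", "oak"]

acc = accuracy(y_true, y_pred)
err = error_rate(y_true, y_pred)
```

Since the two quantities sum to one, the model with the highest accuracy on a dataset is also the one with the lowest error rate.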

The accuracy and loss of the proposed model

5. Conclusion

In this paper, we propose a novel CNN-KNN based hybrid method to improve the accuracy of plant classification. We use CNN for feature selection and KNN instead of SoftMax for plant classification, and we improve performance by increasing the amount of training data. The method has been trained and tested on a dataset containing 13,600 images. On the Flavia dataset, it achieved an accuracy of 99.4% in plant classification, and on the LeafSnap dataset its accuracy reached 97.4%. The results indicate that classification accuracy is greatly influenced by the size of the leaf dataset. Compared to other machine learning methods (such as SVM-KNN), the CNN-KNN based hybrid method performs better. We also discuss the reliability of the proposed method. We collected additional leaf images from different sources to enlarge the leaf dataset and support future research. Furthermore, the method can be extended to analyze and diagnose plant leaves. In the future, we will study learning approaches that do not require labeled leaf images.

References: A Novel CNN-KNN based Hybrid Method for Plant Classification

Comments from AI Intelligence Circle on the Article:

1. The idea of this paper is relatively simple. The models used are classic techniques rather than the latest algorithms: a classic CNN is combined with the classic KNN algorithm to form a hybrid model.

2. This is a common approach to publishing papers. The key is to validate effectiveness on datasets and show certain advantages over other algorithms.

3. Proposing such algorithmic innovations requires a solid grasp of the basic principles of convolutional neural networks and KNN models. CNN has advantages in extracting features from images, while KNN, although a lazy learning model, classifies by calculating and comparing distances.
