Build Your First Image Classification Model in Just 10 Minutes!

Author: Pulkit Sharma; Translator: Wang Weili; Proofreader: Ding Nanya

This article is about 3400 words, recommended reading time is 10 minutes.

This article introduces the process of building a deep learning model for image recognition, providing a basic framework for beginners to solve image recognition problems by stating the actual competition problem, introducing the model framework, and showcasing the solution code.

Introduction

“Can a deep learning model be built in just a few minutes? Training takes hours, right? I don’t even have a good enough machine.” I have heard aspiring data scientists say this countless times, afraid to build deep learning models on their own machines.

In fact, you don’t have to work at Google or another large tech company to train deep learning models. You can build your own neural network from scratch in just a few minutes without renting Google’s servers. Fast.ai students trained a model on the ImageNet dataset in 18 minutes, and I will demonstrate a similar approach in this article.

Deep learning is a vast field, so we will narrow our focus to image classification problems. Moreover, we will use a very simple deep learning architecture to achieve a good accuracy.

You can use the Python code in this article as a foundation for building an image classification model. Once you have a good understanding of these concepts, you can continue programming, participate in competitions, and climb the leaderboard.

If you are just starting to delve deeper and are fascinated by the field of computer vision (who isn’t?!), be sure to check out the course on Computer Vision using Deep Learning, which provides a comprehensive introduction to this cool field and will lay the foundation for your future entry into this huge job market.

Course Link:

https://trainings.analyticsvidhya.com/courses/course-v1:AnalyticsVidhya+CVDL101+CVDL101_T1/about?utm_source=imageclassarticle&utm_medium=blog

Table of Contents

1. What is Image Classification and Its Application Cases

2. Setting Up the Image Data Structure

3. Breaking Down the Model Building Process

4. Setting Up the Problem Definition and Understanding the Data

5. Steps to Build the Image Classification Model

6. Starting a New Challenge

1. What is Image Classification and Its Application Cases

Observe the following image:

[Image: a luxury car]

You should be able to recognize it immediately—it’s a luxury car. Step back and analyze how you came to this conclusion—you were shown an image, and then you classified it into the category “car” (in this example). Simply put, this process is image classification.

Often, images can belong to many categories. Manually checking and classifying them is tedious, and once the task involves 10,000 or even 1,000,000 images it becomes nearly impossible. So how useful would it be if we could automate this process and quickly label images with their categories?

Self-driving cars are a great example of image classification applied in the real world. To achieve autonomous driving, we can build an image classification model to identify various objects on the road, such as vehicles, people, moving objects, etc. We will see more applications in the following sections, and many applications are right around us.

Now that we have grasped the topic, let’s dive deep into how to build an image classification model, what its prerequisites are, and how to implement it in Python.

2. Setting Up the Image Data Structure

Our dataset needs a special structure to solve the image classification problem. We will see this in several parts, but before we go further, please keep these suggestions in mind.

You should create two folders, one for the training set and the other for the test set. The training set folder should contain a csv file and an image folder:

  • The csv file stores the image names of all training images and their corresponding true labels.

  • The image folder stores all the training images.

The csv file in the test set folder differs from the one in the training set folder: it contains only the image names of the test images, without their true labels, because those labels are what the model trained on the training images will predict.

If your dataset is not in this format, you need to convert it; otherwise, the prediction results may be incorrect.
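
A quick way to confirm that a folder follows this layout is to check that every image named in the csv file actually exists in the image folder. The sketch below is only an illustration: the helper name `find_missing_images` is hypothetical, and the `id` column and `.png` extension are assumptions borrowed from the code used later in this article, so adjust them to your own dataset.

```python
import csv
from pathlib import Path


def find_missing_images(csv_path, image_dir, id_column="id", extension=".png"):
    """Return the image names listed in the csv that have no file in image_dir."""
    image_dir = Path(image_dir)
    missing = []
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            name = row[id_column] + extension
            if not (image_dir / name).is_file():
                missing.append(name)
    return missing
```

Calling `find_missing_images('train.csv', 'train')` should return an empty list when the csv file and the image folder agree.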

3. Breaking Down the Model Building Process

Before we explore the Python code, let’s first understand how image classification models are typically designed. The process can be divided into four parts. Each step requires a certain amount of time to execute:

Step One: Load and preprocess data—30% of the time

Step Two: Define model architecture—10% of the time

Step Three: Train the model—50% of the time

Step Four: Evaluate model performance—10% of the time

Next, I will explain each of the above steps in more detail. This part is very important because a model rarely comes out right on the first attempt. You need to go back after each iteration, fine-tune the steps, and run them again. A solid understanding of the foundational concepts will greatly speed up the entire process.

  • Step One: Load and Preprocess Data

Data is crucial for deep learning models. The more images the training set contains, the better your image classification model’s chances of achieving good results. In addition, the required shape of the input data depends on the framework and architecture you use.

Therefore, for this critical data preprocessing step, I recommend browsing the following article for a better understanding of image data preprocessing:

Basics of Image Processing in Python

https://www.analyticsvidhya.com/blog/2014/12/image-processing-python-basics/

But we are not quite there yet. To see how our model performs on data it has never seen before (and before predicting on the test set), we first need to set aside a portion of the training set for validation.

In short, we train the model on the training set and then validate it on the validation set. If we are satisfied with the results on the validation set, we can use it to predict the test set data.
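
The idea of holding out part of the training data can be sketched in a few lines of plain Python. This is a minimal illustration, not the exact procedure of any library: the helper `split_indices` is hypothetical, and the 20% validation fraction and the seed of 42 are simply example values.

```python
import random


def split_indices(n_samples, validation_fraction=0.2, seed=42):
    """Shuffle sample indices and split them into training and validation parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_validation = int(round(n_samples * validation_fraction))
    return indices[n_validation:], indices[:n_validation]


train_idx, val_idx = split_indices(60000)
print(len(train_idx), len(val_idx))  # 48000 12000
```

Shuffling before splitting matters: if the images happen to be sorted by category, an unshuffled split could leave some categories out of the validation set entirely.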

Time Required: About 2-3 minutes.

  • Step Two: Build the Model Framework

This is another important step in the process of building a deep learning model. During this process, several questions need to be considered:

  • How many convolutional layers are needed?

  • What is the activation function for each layer?

  • How many hidden units are there in each layer?

There are other questions as well. But these are basically the model’s hyperparameters, which play an important role in the prediction results.

How do we determine the values of these hyperparameters? Good question! One method is to select them based on existing research. Another is to keep trying different values until the best ones are found, but this can be very time-consuming.
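
To get a feel for why trying values one after another is so time-consuming, it helps to count the combinations. The sketch below enumerates a small, purely hypothetical search space; the specific values are illustrative and not recommendations.

```python
from itertools import product

# Hypothetical search space; the values are illustrative, not recommendations.
search_space = {
    "conv_layers": [1, 2, 3],
    "activation": ["relu", "tanh"],
    "hidden_units": [64, 128, 256],
}


def candidate_configs(space):
    """Yield one dict per combination of hyperparameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


configs = list(candidate_configs(search_space))
print(len(configs))  # 18
```

Each of these 18 configurations needs its own training run, so even at a few minutes per run an exhaustive search quickly adds up to more than an hour.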

Time Required: About 1 minute to define this framework.

  • Step Three: Train the Model

For model training, we need:

  • Training images and their true labels.

  • Validation set images and their true labels. (We only use the validation set labels for model evaluation, not for training)

We also need to define the number of epochs, that is, the number of passes over the training set. To start with, we train for 10 epochs (you can change this).
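
What the number of epochs means in practice is easier to see with a little arithmetic. The sketch below assumes, for illustration only, 48,000 training images and a batch size of 32 (the default in Keras `fit`); the helper `weight_updates` is hypothetical.

```python
import math


def weight_updates(n_train, epochs, batch_size=32):
    """Number of gradient updates: batches per epoch times epochs."""
    steps_per_epoch = math.ceil(n_train / batch_size)
    return steps_per_epoch * epochs


print(weight_updates(48000, epochs=10))  # 15000
```

Under these assumptions the model's weights are adjusted 15,000 times over the 10 epochs, which is where most of the training time goes.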

Time Required: About 5 minutes, since the model needs time to learn the structure in the data.

  • Step Four: Evaluate Model Performance

Finally, we load the test data (images) and complete the preprocessing steps. Then we use the trained model to predict the categories of these images.

Time Required: About 1 minute.

4. Setting Up the Problem Definition and Understanding the Data

We will attempt a very cool challenge to understand image classification. We need to build a model that can classify images of apparel (shirts, pants, shoes, socks, etc.). This is a real problem faced by many e-commerce retailers, which makes it an even more interesting computer vision problem.

This challenge is called “Identify the Apparel,” one of the practice problems on the DataHack platform. You must register and download the dataset from the link below.

“Identify the Apparel” Competition Link:

https://datahack.analyticsvidhya.com/contest/practice-problem-identify-the-apparels/

Data Hack Platform:

https://datahack.analyticsvidhya.com/

There are 70,000 images in total (28×28 pixels each): 60,000 in the training set and 10,000 in the test set. The training images are labeled with their clothing category, and there are 10 categories in all. The test set has no labels; the task in this competition is to recognize the images in the test set.

We will build the model in Google Colab because it provides a free GPU.

Google Colab:

https://colab.research.google.com/

5. Steps to Build the Image Classification Model

Now it’s time to showcase your Python skills, and we’ve finally reached the execution phase!

The main steps are as follows:

  1. Set up Google Colab

  2. Import Libraries

  3. Import and Preprocess Data (3 minutes)

  4. Set Up Validation Set

  5. Define Model Structure (1 minute)

  6. Train Model (5 minutes)

  7. Predict (1 minute)

Below are the detailed steps of the above.

  • Step 1: Set Up Google Colab

Since we will import data from a Google Drive link, we need to add a few lines of code to the Google Colab notebook. Create a new Python3 notebook and write the following code:

!pip install PyDrive

This step installs PyDrive. Next, import the required libraries:

import os
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

Next, create a drive variable to access Google Drive:

auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

You need to use the ID of the file uploaded to Google Drive to download the dataset:

download = drive.CreateFile({'id': '1BZOv422XJvxFUnGh-0xVeSvgFgqVY45q'})

Replace the ID with the ID of your own file. Next, download the file and unzip it.

download.GetContentFile('train_LbELtWX.zip')
!unzip train_LbELtWX.zip

You need to run the above code every time you start the notebook.

  • Step 2: Import Libraries Required for the Model.

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.utils import to_categorical
from keras.preprocessing import image
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tqdm import tqdm

  • Step 3: Import and Preprocess the Data

train = pd.read_csv('train.csv')

Next, we read all the training images, store them in a list, and finally convert the list to a numpy array.

# The images are grayscale, so we load them with color_mode='grayscale'; for RGB images, use color_mode='rgb' instead
train_image = []
for i in tqdm(range(train.shape[0])):
    img = image.load_img('train/'+str(train['id'][i])+'.png', target_size=(28,28), color_mode='grayscale')
    img = image.img_to_array(img)
    img = img/255
    train_image.append(img)
X = np.array(train_image)
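
Before moving on, it is worth estimating how large this array is. With 60,000 grayscale images of 28×28 pixels, `X` should have the shape (60000, 28, 28, 1). The sketch below estimates the memory it needs, assuming 4-byte float32 values (the usual Keras default, stated here as an assumption); the helper `array_size_bytes` is hypothetical.

```python
def array_size_bytes(n_images, height=28, width=28, channels=1, bytes_per_value=4):
    """Memory needed for an image array of shape (n_images, height, width, channels)."""
    return n_images * height * width * channels * bytes_per_value


size = array_size_bytes(60000)
print(size, round(size / 2**20, 1))  # 188160000 179.4
```

At roughly 180 MiB the whole training set fits comfortably in memory, which is why reading it all into one numpy array is workable for images this small.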

This is a multi-class problem (10 categories), and we need to one-hot encode the label variable.

y=train['label'].values
y = to_categorical(y)
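
To make it clear what one-hot encoding produces, here is a minimal numpy sketch of the same idea. It is an illustration under the assumption that the labels are integers from 0 to 9, not a copy of how Keras implements `to_categorical`; the helper `one_hot` is hypothetical.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1 at the label's index."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


print(one_hot([0, 2, 9], num_classes=10)[1])  # [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
```

Each label becomes a vector of length 10, which matches the 10 output units of the model's final layer.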

  • Step 4: Split the Validation Set from the Training Set

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42, test_size=0.2)

  • Step 5: Define the Model Structure

We will build a simple structure with 2 convolutional layers, one hidden layer, and one output layer.

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',input_shape=(28,28,1)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
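
It can help to trace how the data shrinks as it passes through these layers and how many parameters each layer has to learn. The sketch below does this by hand, assuming the Keras defaults for `Conv2D` (stride 1 and no padding); the helper functions are hypothetical and exist only for this illustration.

```python
def conv_output(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units


side = conv_output(conv_output(28))   # 28 -> 26 -> 24
side = side // 2                      # 2x2 max pooling: 24 -> 12
flat = side * side * 64               # Flatten: 12 * 12 * 64 = 9216

total = (
    conv_params(3, 1, 32)             # 320
    + conv_params(3, 32, 64)          # 18496
    + dense_params(flat, 128)         # 1179776
    + dense_params(128, 10)           # 1290
)
print(flat, total)  # 9216 1199882
```

Almost all of the roughly 1.2 million parameters sit in the first dense layer, so the number of hidden units is the hyperparameter that changes the model size the most. You can compare these numbers with the output of `model.summary()`.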

Next, compile the model.

model.compile(loss='categorical_crossentropy',optimizer='Adam',metrics=['accuracy'])
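
The loss named here, categorical cross-entropy, measures how far the predicted probabilities are from the one-hot label. The sketch below computes it by hand for a single image, using three classes instead of ten to keep it short; the probabilities are made up for illustration.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy for one sample: minus the sum of true * log(predicted)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


y_true = [0.0, 1.0, 0.0]          # one-hot label: class 1
confident = [0.05, 0.90, 0.05]    # mostly right
unsure = [0.40, 0.30, 0.30]       # mostly wrong

print(round(categorical_crossentropy(y_true, confident), 3))  # 0.105
print(round(categorical_crossentropy(y_true, unsure), 3))     # 1.204
```

The loss is small when the model puts most of the probability on the true class and grows as that probability drops, which is exactly what training tries to minimize.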

  • Step 6: Train the Model

In this step, we train the model on the training set and validate it on the validation set.

model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test))

  • Step 7: Predict!

We first repeat the steps we carried out on the training dataset: load the test images and preprocess them. Then we use the trained model to predict their classes.

download = drive.CreateFile({'id': '1KuyWGFEpj7Fr2DgBsW8qsWvjqEzfoJBY'})
download.GetContentFile('test_ScVgIM0.zip')
!unzip test_ScVgIM0.zip

First, import the test set:

test = pd.read_csv('test.csv')

Next, read in the data and store the test set:

test_image = []
for i in tqdm(range(test.shape[0])):
    img = image.load_img('test/'+str(test['id'][i])+'.png', target_size=(28,28), color_mode='grayscale')
    img = image.img_to_array(img)
    img = img/255
    test_image.append(img)
test = np.array(test_image)
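# making predictions: the class is the index of the highest predicted probability
# (model.predict_classes() is no longer available in recent Keras releases)
prediction = np.argmax(model.predict(test), axis=-1)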
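
For every image, the softmax output layer produces one probability per class, and the predicted class is simply the index of the largest one. The sketch below shows that step on made-up probabilities for two images and four classes.

```python
import numpy as np

# Hypothetical softmax outputs for two images and four classes.
probabilities = np.array([
    [0.10, 0.70, 0.15, 0.05],
    [0.60, 0.20, 0.10, 0.10],
])

# The predicted class is the index of the largest probability in each row.
predicted_classes = np.argmax(probabilities, axis=-1)
print(predicted_classes)  # [1 0]
```

The result is one integer label per image, which is the format the submission file expects.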

We also need to create a submission file to upload to the DataHack platform.

download = drive.CreateFile({'id': '1z4QXy7WravpSj-S4Cs9Fk8ZNaX-qh5HF'})
download.GetContentFile('sample_submission_I5njJSF.csv')
# creating submission file
sample = pd.read_csv('sample_submission_I5njJSF.csv')
sample['label'] = prediction
sample.to_csv('sample_cnn.csv', header=True, index=False)

Download the sample_cnn.csv file and upload it to the competition page to generate your ranking. This provides a foundational solution to help you get started with solving image classification problems.

You can try adjusting hyperparameters and regularization to improve model performance. You can also understand the details of tuning parameters by reading the article below.

A Comprehensive Tutorial to Learn Convolutional Neural Networks from Scratch

https://www.analyticsvidhya.com/blog/2018/12/guide-convolutional-neural-network-cnn/

6. Starting a New Challenge

Let’s try the same approach on another dataset. In this part, we will tackle the Identify the Digits practice problem.

Identify the Digits Competition Link:

https://datahack.analyticsvidhya.com/contest/practice-problem-identify-the-digits/

Before you look down, please try to solve this challenge on your own. You have gained the tools to solve the problem; you just need to use them. When you encounter difficulties, you can come back to check your process and results.

In this challenge, we need to recognize digits in the given images. There are a total of 70,000 images, with 49,000 training images labeled and the remaining 21,000 test images unlabeled.

Ready? Good! Open a new Python3 notebook and run the following code:

# Setting up Colab
!pip install PyDrive
import os
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
# Replace the id and filename in the code below with your own
download = drive.CreateFile({'id': '1ZCzHDAfwgLdQke_GNnHp_4OheRRtNPs-'})
download.GetContentFile('Train_UQcUa52.zip')
!unzip Train_UQcUa52.zip
# Importing libraries
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.utils import to_categorical
from keras.preprocessing import image
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tqdm import tqdm
train = pd.read_csv('train.csv')
# Reading the training images
train_image = []
for i in tqdm(range(train.shape[0])):
    img = image.load_img('Images/train/'+train['filename'][i], target_size=(28,28), color_mode='grayscale')
    img = image.img_to_array(img)
    img = img/255
    train_image.append(img)
X = np.array(train_image)
# Creating the target variable
y=train['label'].values
y = to_categorical(y)
# Creating validation set
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42, test_size=0.2)
# Define the model structure
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',input_shape=(28,28,1)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
# Compile the model
model.compile(loss='categorical_crossentropy',optimizer='Adam',metrics=['accuracy'])
# Training the model
model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test))
download = drive.CreateFile({'id': '1zHJR6yiI06ao-UAh_LXZQRIOzBO3sNDq'})
download.GetContentFile('Test_fCbTej3.csv')
test_file = pd.read_csv('Test_fCbTej3.csv')
test_image = []
for i in tqdm(range(test_file.shape[0])):
    img = image.load_img('Images/test/'+test_file['filename'][i], target_size=(28,28), color_mode='grayscale')
    img = image.img_to_array(img)
    img = img/255
    test_image.append(img)
test = np.array(test_image)
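# the class is the index of the highest predicted probability
# (model.predict_classes() is no longer available in recent Keras releases)
prediction = np.argmax(model.predict(test), axis=-1)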
download = drive.CreateFile({'id': '1nRz5bD7ReGrdinpdFcHVIEyjqtPGPyHx'})
download.GetContentFile('Sample_Submission_lxuyBuB.csv')
sample = pd.read_csv('Sample_Submission_lxuyBuB.csv')
sample['filename'] = test_file['filename']
sample['label'] = prediction
sample.to_csv('sample.csv', header=True, index=False)

Submit this file on the exercise page, and you will get a pretty good accuracy. This is a good start, but there is always room for improvement. Keep working hard and see if you can improve our basic model.

Conclusion

Who said deep learning models need hours or days of training? My goal was to show you that you can come up with a pretty good deep learning model in very little time. You should take on similar challenges and try coding them yourself. Nothing beats learning through practice!

Top data scientists and analysts have this kind of code ready before a hackathon even starts. They use it to make an early submission before doing any detailed analysis: first they establish a benchmark solution, and then they improve the model using different techniques.

Did you find this article useful? Please share your feedback in the comments section below.

Original Title:

Build your First Image Classification Model in just 10 Minutes!

Original Link:

https://www.analyticsvidhya.com/blog/2019/01/build-image-classification-model-10-minutes/

Editor: Huang Jiyan

Translator Profile: Wang Weili, job seeker, studying big data technology at the Hong Kong University of Science and Technology. Finds data science challenging yet interesting and is still learning. Since the literature is too much to tackle alone, follows the experts in the data community.

Source: Data Circle THU
