Image-based video search, as the name suggests, takes an image as the query and searches a video database for videos containing similar shots. A key step is video vectorization: extracting key frames from each video and running feature extraction on every frame to convert it into a structured vector. At this point, a curious reader may ask: how is this different from image-based image search? Indeed, searching across all the key frames of a video is essentially image-based image search.
Many friends who built the image-based image search system wanted to go further, so we decided to write another article on building an image-based video search system with Milvus, as a reference for those in need. If you haven’t read the previous article on image-based image search, you can read it here: Building An Image-Based Image Search System with Milvus.
| System Overview
The workflow of the entire image-based video search system can be represented by the following diagram:
When importing videos, we first use the OpenCV library to extract frames from each video fed into the system, then use the VGG image feature extraction model to convert these key frames into vectors, and finally import the extracted vectors into Milvus. The original videos are stored in Minio, and Redis stores the mapping between videos and vectors.
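The frame-extraction step can be sketched as follows. This is a minimal illustration using OpenCV; the function names and the sampling interval are our own assumptions, not taken from the repository's script:

```python
def sample_indices(total_frames, every_n):
    """Indices of the frames to keep: one frame every `every_n` frames."""
    return list(range(0, total_frames, every_n))


def extract_key_frames(video_path, every_n=30):
    """Decode a video with OpenCV and return one frame per `every_n` frames."""
    import cv2  # pip install opencv-python

    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    for i in sample_indices(total, every_n):
        # Seek to the target frame index, then decode it.
        cap.set(cv2.CAP_PROP_POS_FRAMES, i)
        ok, frame = cap.read()
        if ok:
            frames.append(frame)
    cap.release()
    return frames
```

Each returned frame is a NumPy array that can then be fed to the VGG model for feature extraction.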
When searching, we first convert the uploaded image into a feature vector using the same VGG model, then use this vector to run a similarity search in Milvus and retrieve the most similar vectors, and finally use the mapping stored in Redis to fetch the corresponding videos from Minio and return them to the front-end interface.
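The search path can be sketched like this. The collection name, bucket name, Redis key scheme, and credentials below are illustrative assumptions rather than values from the repository, and the client calls follow the pymilvus API generation that matches Milvus 0.x:

```python
def l2_normalize(vec):
    """L2-normalize a feature vector so inner-product search behaves
    like cosine similarity."""
    norm = sum(x * x for x in vec) ** 0.5
    return [x / norm for x in vec] if norm else list(vec)


def search_similar_videos(query_vec, top_k=10, download_dir="/tmp"):
    """Sketch of the search path: Milvus -> Redis -> Minio."""
    from milvus import Milvus  # pymilvus client for Milvus 0.x
    import redis
    from minio import Minio

    milvus = Milvus(host="192.168.1.38", port="19530")
    r = redis.Redis(host="192.168.1.38", port=6379)
    mc = Minio("192.168.1.38:9000",
               access_key="minioadmin", secret_key="minioadmin",  # Minio defaults
               secure=False)

    # Find the key-frame vectors closest to the query image's vector.
    status, results = milvus.search(
        collection_name="video_frames",   # assumed collection name
        query_records=[l2_normalize(query_vec)],
        top_k=top_k,
    )

    paths = []
    for hit in results[0]:
        # Redis maps a vector id back to the video it came from.
        video_name = r.get("frame:%d" % hit.id)  # assumed key scheme
        if video_name:
            name = video_name.decode()
            local = "%s/%s" % (download_dir, name)
            mc.fget_object("videos", name, local)  # assumed bucket name
            paths.append(local)
    return paths
```

The third-party clients are imported inside the function so the pure helper `l2_normalize` can be used independently.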
| Data Preparation
This article builds an end-to-end solution for image-based video search using approximately 100,000 GIFs from Tumblr as an example. Readers can use their own video files to build the system.
| System Deployment
The code for building the image-based video search system has been uploaded to GitHub, and the repository address is: https://github.com/JackLCL/search-video-demo.
Step 1: Image Building
The entire image-based video search system requires five Docker images: Milvus 0.7.1, Redis, Minio, the front-end interface, and the back-end API. Readers need to build the front-end interface and back-end API images themselves; the other three images can be pulled directly from Docker Hub.
# Get the image-based video search code
$ git clone -b 0.7.1 https://github.com/JackLCL/search-video-demo.git
# Build front-end interface docker and API docker images
$ cd search-video-demo && make all
Step 2: Environment Configuration
This article uses docker-compose to manage the five containers mentioned above. The docker-compose.yml file can be configured by referring to the table below:
The IP address 192.168.1.38 in the table above is the server address used to build the image-based video search system, and users need to modify it according to their actual situation.
Users need to manually create storage directories for Milvus, Redis, and Minio, and then map the corresponding paths in docker-compose.yml. For example, the storage directories created in this article are:
/mnt/redis/data
/mnt/minio/data
/mnt/milvus/db
Thus, the configuration for Milvus, Redis, and Minio in docker-compose.yml can be set up as shown in the following image:
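For reference, the volume mappings for these three services might look like the following fragment. The image tags and in-container paths here are illustrative assumptions; consult the docker-compose.yml shipped in the repository for the exact values:

```yaml
services:
  milvus:
    image: milvusdb/milvus:0.7.1-cpu-d052920-e04ed5   # tag is illustrative
    volumes:
      - /mnt/milvus/db:/var/lib/milvus/db
  redis:
    image: redis:5.0
    volumes:
      - /mnt/redis/data:/data
  minio:
    image: minio/minio
    command: server /data
    volumes:
      - /mnt/minio/data:/data
```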
Step 3: System Startup
To start the five docker containers needed for the image-based video search system, use the modified docker-compose.yml from Step 2:
$ docker-compose up -d
After startup is complete, you can run the docker-compose ps command to check whether the five docker containers started successfully. A normal startup looks like the following image:
At this point, the entire image-based video search system has been built, but its database does not yet contain any videos.
Step 4: Video Import
In the deploy directory of the code repository, there is a video import script named import_data.py. Readers only need to modify the video file path and the video import time interval in the script to run it for video import.
data_path: The path of the video to be imported.
time.sleep(0.5): The time interval between video imports. The server used to build the system in this article has 96 CPU cores, so an interval of 0.5 seconds is appropriate. With fewer CPU cores, the interval should be extended accordingly; otherwise, high CPU usage may result and zombie processes may be created.
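The core of such an import loop can be sketched as follows. The function and parameter names are hypothetical; the actual import_data.py in the repository differs in detail:

```python
import os
import time


def import_videos(data_path, interval=0.5, handler=print):
    """Walk `data_path` and pass each video file to `handler`
    (e.g. a function that extracts key frames and inserts vectors
    into Milvus), pausing `interval` seconds between files to keep
    CPU load manageable."""
    videos = sorted(
        f for f in os.listdir(data_path)
        if f.lower().endswith((".gif", ".mp4", ".avi"))
    )
    for name in videos:
        handler(os.path.join(data_path, name))
        time.sleep(interval)
    return len(videos)
```

Increasing `interval` on machines with fewer CPU cores throttles the import rate, which is the purpose of the time.sleep(0.5) setting described above.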
The startup command is as follows:
$ cd deploy
$ python3 import_data.py
The import process is shown in the following image:
After waiting for the video import to complete, the entire image-based video search system is fully set up!
| Interface Display
Open your browser and enter 192.168.1.38:8001 to see the interface of the image-based video search system, as shown in the following image:
By clicking the settings icon in the upper right corner, you can see the videos in the database:
By clicking the upload box on the left, you can upload an image that you want to search for, and then the interface on the right will search for videos containing similar shots:
Now, enjoy the fun of image-based video search!
| Conclusion
This article uses Milvus to build an image-based video search system, demonstrating the application of Milvus in processing unstructured data. The Milvus vector similarity search engine is compatible with various deep learning platforms, responding to searches of billions of vectors in milliseconds. You can explore more AI applications with Milvus!
If you have suggestions or comments, you can open an issue on our GitHub project or contact us in the Slack community.
Further Reading on Image-Based Image Search:
Live Replay | Easily Build Milvus Image-Based Image Search System in Three Steps
Video | Explaining Milvus Image-Based Image Search and Video Search in 10 Minutes