Common Image Processing Libraries in Machine Vision

When studying or researching image processing, the first task is to choose a suitable image processing library. This article briefly introduces the strengths and weaknesses of the following libraries: OpenCV, Intel IPP, Halcon, MATLAB, OpenGL, EmguCv, AForge.net, CxImage, FreeImage, paintlib, AGG, IPL, and visDSK. Corrections and suggestions are welcome. Thank you!

1. OpenCV

Introduction: OpenCV stands for Open Source Computer Vision Library. It is an open-source computer vision library originally developed by Intel®. OpenCV is cross-platform, distributed under an open-source license, and runs on Linux, Windows, and Mac OS. It is lightweight and efficient, and it implements many general-purpose algorithms in image processing and computer vision. OpenCV is written in C++ and its main interface is also C++, but it still retains a large number of C interfaces and also provides interfaces for languages such as Python, Ruby, and MATLAB.

OpenCV Features

Provides function interfaces for image data manipulation, image/video input and output, basic image processing, structural analysis, camera calibration, motion analysis, object recognition, and basic GUI, etc.

OpenCV Modules

cv – Core function library

cvaux – Auxiliary function library

cxcore – Data structure and linear algebra library

highgui – GUI function library

ml – Machine learning function library

Characteristics:

OpenCV is free and open-source. It offers a cross-platform, mid-to-high-level API and does not depend on other external libraries, although it can make use of some. To be completely independent of IPL (Intel’s Image Processing Library), OpenCV replicates the declarations of IplImage and some other structures.

OpenCV provides a transparent interface to Intel® Integrated Performance Primitives (IPP). It does not depend on IPP, but if IPP is installed, OpenCV dynamically links to the IPP library and uses it to gain speed.

2. Intel IPP

Introduction: IPP (Intel Integrated Performance Primitives) is a cross-platform software function library. It provides a wide range of multimedia functions: video decoders (e.g., H.263, MPEG-4), image processing (JPEG), signal processing, speech compression (e.g., G.723, GSM, AMR), and cryptography.

Intel IPP is optimized for a wide range of Intel microprocessors: Intel Pentium 4 processors, Intel Pentium M processors (a component of Intel Centrino mobile technology), Intel Itanium 2 processors, Intel Xeon processors, and Intel PCA application processors based on Intel XScale technology. Because it exposes one common API across these architectures, users do not need to worry about platform compatibility, which saves development cost and time and makes it easy to port existing applications.

Characteristics:

IPP is a paid product, and its source code is not open.

Function libraries: signal processing, image processing, multimedia, vector processing, etc.

Cross-platform and OS general API

High-performance code: applications gain optimized speed and efficiency without hand-written assembly.

3. Halcon

Introduction: HALCON is machine vision software from MVTec in Germany, widely recognized for its performance. Unlike some commercial packages on the market, it originated in academia. At its core it is an image processing library made up of more than a thousand independent functions plus a data management kernel. It includes filtering, color analysis, and basic geometric and image computation functions such as geometric transformation, correction, classification, identification, and shape search. Because most of these functions are not designed for one specific task, HALCON can be used wherever image processing is needed, and its applications span medicine, remote sensing, surveillance, and many kinds of automated industrial inspection.

In recent years machine vision has matured to the point where it can “replace the human eye”: it does not tire during repetitive work and offers high precision and stability, which has driven the growth of high-tech industries, for example by raising output in electronics manufacturing. MVTec also collaborates continuously with academia and incorporates recent research results, which keeps its technology competitive and advances machine vision as a whole.

HALCON is widely used around the world, and its open architecture lets users quickly develop image processing and machine vision applications.

A professional image processing tool must be more than an image processing function library. Solving the image processing task is only one part of a complete machine vision solution, which also includes software components such as process control or database connections, as well as hardware components such as image acquisition and lighting. It is therefore very important that the image processing system is easy to use and can be integrated seamlessly into a development project. HALCON takes these aspects into account and has the following features:

1). HALCON includes an interactive development environment, HDevelop, in which HALCON programs can be written, modified, and executed directly, and all variables can be inspected during the computation. Once the design is complete, the program can be exported as C/C++ or COM (Visual Basic) code and embedded into an application.

2). HALCON does not restrict the choice of image acquisition device; any suitable device can be selected. The manufacturer provides drivers for over 40 types of cameras. Even for an unsupported camera, image data can still be passed in through pointers, and HALCON’s open architecture allows users to write their own DLLs to connect it to the system.

3). HALCON offers a strong return on investment. Beyond Microsoft Windows NT/2000/XP, it supports operating systems including Linux, Solaris 7, IRIX 6.5, and Tru64 UNIX 5.1, so a system under development can be moved to another platform when requirements change. To gain speed, multiprocessor computers can be used without modifying existing programs.

4). HALCON imposes no special restrictions on the design of human-machine interfaces and requires no special visualization components. The interface can be built entirely with the tools of the development environment, such as Microsoft Visual Studio, and end users never see the development tools. Moreover, the machine running the software needs only minimal resources.

4. MATLAB

Introduction: MATLAB is commercial mathematical software from MathWorks in the USA. It provides a high-level technical computing language and an interactive environment for algorithm development, data visualization, data analysis, and numerical computing, and it mainly comprises two parts, MATLAB and Simulink.

MATLAB can perform matrix operations, plot functions and data, implement algorithms, create user interfaces, connect programs of other programming languages, etc. It is mainly used in engineering computing, control design, signal processing and communication, image processing, signal detection, financial modeling design and analysis, and other fields.

The basic data unit of MATLAB is the matrix, and its command expressions closely resemble the notation commonly used in mathematics and engineering, so solving a problem in MATLAB is much easier than doing the same task in a language like C or FORTRAN. MATLAB also incorporates strengths of software such as Maple, making it a powerful mathematical tool. Newer versions also add support for C, FORTRAN, C++, and Java.

Characteristics:

1). Efficient numerical and symbolic computing capabilities can free users from complex mathematical calculations and analyses;

2). Complete graphic processing functions enable visualization of computational results and programming;

3). A user-friendly interface and a language close to mathematical notation make it easy for learners to pick up and master;

4). Rich application toolboxes (e.g., signal processing toolbox, communication toolbox, etc.) provide users with a large number of convenient and practical processing tools.

MATLAB is very powerful, but licensed copies are expensive. It is well suited to academic research, simulation, and demonstration, but has certain limitations in software development and engineering applications.

5. OpenGL

OpenGL (Open Graphics Library) is a cross-language, cross-platform specification of a programming interface for professional graphics. It is used for 3D graphics (2D is also possible) and is a powerful, easy-to-call low-level graphics library.

OpenGL is the most widely accepted 2D/3D graphic API in the industry, and since its inception, it has spawned thousands of excellent applications on various computer platforms and devices. OpenGL™ is independent of the Windows operating system or other operating systems and is also network transparent. In industries including CAD, content creation, energy, entertainment, game development, manufacturing, pharmaceuticals, and virtual reality, OpenGL™ helps programmers develop high-performance, visually impactful graphics processing software on hardware devices such as PCs, workstations, and supercomputers.

Characteristics:

1). Closely integrated with C language.

OpenGL commands were originally described as C functions, making them easy to understand for anyone who has studied C. If you have used Turbo C’s graphics.h, you will find that drawing with OpenGL is even simpler.

2). Strong portability.

Microsoft’s Direct3D is also an excellent graphics API, but it is available only on Windows (and now also on the Xbox game console). OpenGL can be used not only on Windows but also on Unix/Linux and other systems; it is even used on large-scale computers and various specialized machines (e.g., medical display devices). Moreover, the basic OpenGL commands are hardware-independent and even platform-independent.

3). High-performance graphics rendering.

OpenGL is an industry standard whose technology keeps pace with the times; all major graphics card manufacturers currently provide strong OpenGL support. Even amid fierce competition, OpenGL’s performance has remained among the best.

In summary, OpenGL is a very powerful graphics software interface. For a sense of how powerful, look at professional games such as Doom 3 and Quake 4.

6. EmguCv

EmguCV is a .NET wrapper for the OpenCV image processing library; in effect, it is the .NET version of OpenCV. It lets .NET-compatible languages such as C#, VB, VC++, and IronPython call OpenCV functions. EmguCV is written entirely in C#, and it can be compiled under Mono and run on any Mono-supported platform (such as Linux, Solaris, and Mac OS X). Other advantages include image classes with generic color and depth, automatic garbage collection, XML-serializable images, XML documentation with IntelliSense support, the choice between using the image classes or directly invoking OpenCV functions, and generic operations on image pixels. In other words, EmguCV inherits OpenCV’s functions and features while extending its cross-platform reach to the .NET world.

It wraps most of OpenCV’s functionality into .NET classes, structures, or enumerations. However, the documentation is not complete, and it is still necessary to refer to OpenCV’s documentation.

7. AForge.net

AForge.NET is a C# framework designed for developers and researchers in computer vision and artificial intelligence, covering image processing, neural networks, genetic algorithms, machine learning, robotics, and more.

This framework consists of a series of libraries and examples. The features included are:

AForge.Imaging – some everyday image processing and filters

AForge.Vision – computer vision application library

AForge.Neuro – neural network computing library

AForge.Genetic – evolutionary algorithm programming library

AForge.MachineLearning – machine learning library

AForge.Robotics – robotics toolkit library

AForge.Video – a series of video processing libraries (very convenient)

Pure .NET class libraries, simple interface, very convenient to use.

8. CxImage

The CxImage library is an excellent image manipulation library that can quickly load, display, and convert a wide range of image formats. Some readers may object that there are already many excellent graphics libraries, such as OpenIL, FreeImage, and PaintLib, which are powerful and comprehensive, so there is no need for another one. However, using those libraries binds you to their various license agreements, whereas CxImage is completely free to use. Those libraries can also be troublesome in practice: most are platform-independent and written in C, some mix in basic C++ wrappers, and there is a pile of compilation options to handle. CxImage does much better in this respect. Its author has released the full source code, so compared with closed, encapsulated graphics libraries and GDI+, it lets us study the encoding and decoding techniques in depth rather than just skimming the surface.

9. FreeImage

FreeImage is a free, open-source, cross-platform (Windows, Linux, and Mac OS X) image processing library that supports over 20 image types (such as BMP, JPEG, GIF, PNG, TIFF, etc.). Its biggest advantage is its plugin-driven architecture, which is fast, flexible, and easy to use, and has been widely adopted.

The main functions of FreeImage include reading and writing multi-format bitmaps; convenient bitmap information retrieval; bit depth conversion; bitmap page access; basic geometric transformations and point processing; channel merging and blending, etc. FreeImage currently does not support vector graphics and advanced image processing; bitmap rendering needs to be done by the user.
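To make "bit depth conversion" concrete, here is a minimal pure-Python sketch of what such a conversion does to pixel data. It does not call FreeImage; the function names `rgb24_to_gray8` and `gray8_to_mono1` are hypothetical, and the Rec. 601 luma weights and the fixed threshold of 128 are assumptions chosen for illustration, not FreeImage's documented behavior.

```python
"""Sketch of bit depth conversion: 24-bit RGB -> 8-bit greyscale -> 1-bit."""

# Rec. 601 luma weights: an assumption for this sketch, not taken from FreeImage.
LUMA_WEIGHTS = (0.299, 0.587, 0.114)


def rgb24_to_gray8(pixels):
    """Convert rows of (r, g, b) tuples (each 0-255) to rows of 0-255 grey values."""
    wr, wg, wb = LUMA_WEIGHTS
    return [
        [min(255, int(round(wr * r + wg * g + wb * b))) for (r, g, b) in row]
        for row in pixels
    ]


def gray8_to_mono1(gray, threshold=128):
    """Reduce 8-bit grey values to 1-bit values (0 or 1) using a fixed threshold."""
    return [[1 if value >= threshold else 0 for value in row] for row in gray]


if __name__ == "__main__":
    image = [
        [(255, 255, 255), (0, 0, 0)],
        [(255, 0, 0), (0, 0, 255)],
    ]
    gray = rgb24_to_gray8(image)
    print(gray)
    print(gray8_to_mono1(gray))
```

A real library additionally handles palettes, row padding, and dithering, which this sketch leaves out.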

All FreeImage functions start with FreeImage_; for example, the functions for reading and writing image files are FreeImage_Load and FreeImage_Save. The FIBITMAP data structure, which stores the bitmap information and pixel data, is the core of FreeImage.

10. paintlib

paintlib is a portable C++ class library for loading, saving, and processing images. It can load images from BMP, GIF, JPEG, PCX, PGM, PICT, PNG, PSD, TGA, TIFF, and WMF files, and can save them in BMP, JPEG, PNG, and TIFF formats. It can perform image processing either by executing filters in filter classes or by directly accessing bitmaps. It provides complete C++ source code.
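The two processing styles mentioned above, filter objects versus direct bitmap access, can be sketched in a few lines. This is an illustration of the design idea only: the `Bitmap` and `InvertFilter` classes and the `invert_in_place` function are hypothetical names and are not paintlib's actual C++ API.

```python
"""Sketch of two ways to process an image: a filter object vs. direct pixel access."""


class Bitmap:
    """Minimal 8-bit greyscale bitmap that exposes its pixel rows directly."""

    def __init__(self, width, height, fill=0):
        self.width = width
        self.height = height
        self.rows = [[fill] * width for _ in range(height)]


class InvertFilter:
    """Filter-object style: one operation wrapped in a class, returning a new bitmap."""

    def apply(self, bitmap):
        result = Bitmap(bitmap.width, bitmap.height)
        for y, row in enumerate(bitmap.rows):
            result.rows[y] = [255 - value for value in row]
        return result


def invert_in_place(bitmap):
    """Direct-access style: modify the pixel rows of the bitmap itself."""
    for row in bitmap.rows:
        for x, value in enumerate(row):
            row[x] = 255 - value


if __name__ == "__main__":
    bmp = Bitmap(2, 2, fill=10)
    inverted = InvertFilter().apply(bmp)  # the source bitmap is left untouched
    print(inverted.rows)
    invert_in_place(bmp)                  # the source bitmap is modified
    print(bmp.rows)
```

Filter objects are easy to chain and reuse, while direct access avoids the extra copy when an operation is simple or performance-critical.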

11. AGG

AGG, short for Anti-Grain Geometry, is an open-source, efficient 2D graphics library.

AGG is a high-quality, open-source vector graphics library, similar to Cairo (used by GTK+) and Microsoft’s GDI+. Among the three, AGG has the highest performance (Skia and Direct2D are left out of the comparison: with hardware acceleration through OpenGL and DirectX, their drawing speeds are not comparable).

Characteristics:

AGG’s functionality is very similar to GDI+’s, but it offers a more flexible programming interface and produces very high-quality graphics. It is cross-platform and can run on many operating systems.

AGG Functions

1). Supports ALPHA, GAMMA, and other color transformations, as well as user-defined color transformations;

2). Supports arbitrary 2D graphic transformations;

3). Supports SVG and PostScript descriptions, suitable for online graphic generation;

4). Supports high-quality graphic processing, including anti-aliasing interpolation and other advanced functions;

5). Supports arbitrary methods of color transformations;

6). Supports all color formats;

7). Supports various processing of bitmaps;

8). Supports various processing of lines, similar to GDI+;

9). Supports GPC, which is a general polygon clipping method;

10). Supports various font outputs, including Chinese character processing;

Overall, AGG offers high efficiency, high quality (with anti-aliasing), powerful functionality, and cross-platform operation with support for platform GUIs. Its drawbacks are the lack of hardware acceleration, limited documentation, and a fairly steep learning curve.
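The alpha, gamma, and anti-aliasing features listed above come down to simple per-pixel arithmetic, sketched here in pure Python. This is not AGG code: the function names are hypothetical, and the model (source-over blending weighted by an edge coverage value in [0, 1], plus a power-law gamma curve) is a common textbook formulation assumed for illustration.

```python
"""Sketch of coverage-weighted alpha blending and gamma correction on 0-255 channels."""


def blend_channel(dst, src, alpha, coverage=1.0):
    """Blend one channel: alpha is the source opacity, coverage the covered pixel fraction."""
    weight = alpha * coverage
    return int(round(dst + (src - dst) * weight))


def blend_pixel(dst, src, alpha, coverage=1.0):
    """Blend an (r, g, b) source pixel over an (r, g, b) destination pixel."""
    return tuple(blend_channel(d, s, alpha, coverage) for d, s in zip(dst, src))


def apply_gamma(value, gamma):
    """Apply a power-law gamma curve to a single 0-255 channel value."""
    return int(round(255.0 * (value / 255.0) ** gamma))


if __name__ == "__main__":
    background = (100, 100, 100)
    shape_color = (200, 0, 100)
    # Interior pixel: fully covered by the shape, drawn at 50% opacity.
    print(blend_pixel(background, shape_color, alpha=0.5))
    # Edge pixel: the shape covers only a quarter of it, so it contributes less.
    print(blend_pixel(background, shape_color, alpha=1.0, coverage=0.25))
    print(apply_gamma(128, 2.2))
```

Anti-aliasing falls out of the coverage term: partially covered edge pixels receive a proportionally weaker blend, which smooths the jagged outline.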

12. IPL

Intel’s IPL image library, although not open-source, is very useful for ordinary image processing, such as Laplacian operators, Sobel operators, and other edge extraction and sharpening functions. If the target platform is a PC or industrial control machine and it is an Intel CPU, this is a very good choice.
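For readers unfamiliar with the operators named above, the following pure-Python sketch shows what a Sobel or Laplacian filter computes. It is not IPL or IPP code: the function names are hypothetical, the kernels are the standard 3x3 textbook versions, and leaving border pixels at 0 is a simplifying assumption.

```python
"""Sketch of 3x3 Sobel and Laplacian filtering on a greyscale image (list of rows)."""

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # responds to horizontal edges
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]    # second-derivative operator


def filter3x3(image, kernel):
    """Correlate the image with a 3x3 kernel; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            out[y][x] = sum(
                kernel[j][i] * image[y + j - 1][x + i - 1]
                for j in range(3)
                for i in range(3)
            )
    return out


def sobel_magnitude(image):
    """Approximate the gradient magnitude as |Gx| + |Gy|."""
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    return [
        [abs(a) + abs(b) for a, b in zip(row_x, row_y)]
        for row_x, row_y in zip(gx, gy)
    ]


if __name__ == "__main__":
    # A vertical step edge: dark on the left, bright on the right.
    step = [[0, 0, 255, 255] for _ in range(4)]
    for row in sobel_magnitude(step):
        print(row)
    for row in filter3x3(step, LAPLACIAN):
        print(row)
```

Optimized libraries compute the same sums with vectorized, cache-friendly code, which is where their speed advantage over a naive loop like this comes from.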

IPL is no longer available as a separate product; it has become part of IPP, although earlier versions can still be downloaded. The educational version of IPP costs $50, which is not very expensive, and it integrates most algorithms for 1D signal processing (including speech) and 2D signal processing (image and video processing). OpenCV’s image data model closely resembles IPL’s.

Characteristics:

The IPL image processing library is free; it was initially open-source and later became closed-source.

13. visDSK

Microsoft’s image processing library, free and open-source. It was very popular before OpenCV appeared, but OpenCV can now do everything visDSK can, and OpenCV also has Intel’s optimization support for speed. The era of visDSK is over.

