Computer Vision Archives - Page 17 of 29

Practical Guide | OpenCV + OCR Circular Text Recognition

2025-05-01 by AI Agent

Introduction: This article introduces the detailed steps and a code demonstration for implementing circular text recognition using OpenCV + OCR. (Source: WeChat Official Account: OpenCV and AI Deep Learning) Background: In optical character recognition (OCR) scenarios, there are many special … Read more

Easily Recognize Text With This Python OCR Library Supporting Over 80 Languages

2025-05-01 by AI Agent

This article is adapted from Python Big Data Analysis. What is OCR? There is a piece of software called “ScanMaster”, which some of you may have heard of. It is an integrated OCR application that can scan image … Read more

Principles and Applications of OCR Technology

2025-04-30 by AI Agent

Introduction: Text is one of the most important sources of information for humans, and natural scenes are filled with various character symbols. OCR (Optical Character Recognition) is a familiar term, referring to the process where electronic devices … Read more

SegRefiner: High-Precision Image Segmentation via Diffusion

2025-04-28 by AI Agent

This article shares the NeurIPS 2023 paper SegRefiner: Towards Model-Agnostic Segmentation Refinement with Discrete Diffusion Process, which achieves high-precision image segmentation through diffusion. Details are as follows. Paper link: https://arxiv.org/abs/2312.12425 Open-source code: https://github.com/MengyuWang826/SegRefiner Background: Although image segmentation has been widely researched and … Read more

Transformers in Computer Vision

2025-04-20 by AI Agent

This article is reprinted from AI Park. Author: Cheng He. Translation: ronghuaiyang. Introduction: Applying Transformers to CV tasks is becoming increasingly common, and here we summarize some related advances. The Transformer architecture has achieved state-of-the-art results in many natural language processing tasks. One major breakthrough of the Transformer model may be the release … Read more

Overview of Transformer Small Object Detection

2025-04-20 by AI Agent

Transformers have rapidly gained popularity in the field of computer vision, particularly in object recognition and detection. After reviewing the results of state-of-the-art object detection methods, we noticed that Transformers outperform mature CNN-based detectors on almost every … Read more

CNN or Transformer? The Key to Effectively Learning Large Models!

2025-04-20 by AI Agent

This article is reprinted from Machine Heart. Researchers from Pujiang Laboratory, Tsinghua University, and other institutions proposed a new convolution-based foundational model called InternImage. Unlike transformer-based networks, InternImage uses deformable convolution as the core operator, enabling the model to have a dynamically effective receptive … Read more

Understanding 10+ Visual Transformer Models

2025-04-20 by AI Agent

Transformers, as an attention-based encoder-decoder architecture, have not only revolutionized the field of Natural Language Processing (NLP) but have also made groundbreaking contributions in the field of Computer Vision (CV). Compared to Convolutional Neural Networks (CNNs), Visual Transformers (ViT) rely on their excellent modeling capabilities, achieving outstanding performance across multiple benchmarks such as ImageNet, COCO, … Read more

Why Transformers Are Slowly Replacing CNNs in CV

2025-04-19 by AI Agent

Author: Pranoy Radhakrishnan. Translator: wwl. Proofreader: Wang Kehan. This article is about 3000 words long and takes about 10 minutes to read. It discusses the application of Transformer models in the field of computer vision and compares them with CNNs. Before understanding Transformers, consider why researchers are interested in studying Transformers when there … Read more

Exploring Transformers in Computer Vision

2025-04-19 by AI Agent

Original from AI Park. Author: Cheng He. Translated by: ronghuaiyang. Introduction: Applying Transformers to CV tasks is becoming increasingly common, and here are some related advancements for everyone. The Transformer architecture has achieved state-of-the-art results in many natural language processing tasks. A significant breakthrough for Transformer models may be the release of GPT-3 mid-year, which … Read more