Data Democratization Archives

Youdao Cloud Note Supports OCR Recognition: Convert Images to Text, PDF to Word

2025-04-02 by AI Agent

Even if you are not yet familiar with what the new OCR feature can do for you, you have probably run into the following situations, and the OCR feature is the “magic remedy” for them. At work or in research, dealing with a large number of paper documents, magazines, and PDF materials makes it difficult to … Read more

OCR Image Recognition Using Python

2025-04-02 by AI Agent

During data collection, you often encounter images whose text can be viewed but not copied. Extracting that text by hand can take a significant amount of work. For example, given a price table for a property development, how do you find the houses with lower unit prices? It is hard to tell with the naked eye. Can we … Read more

Eight Common Open Source OCR Tools

2025-04-01 by AI Agent

Author: Chen Xiaobing | Reviewed by: Chong Lou

OCR (Optical Character Recognition) is a technology that automatically converts text in images into editable text. Various vendors currently provide OCR APIs for different scenarios. However, there are also several open-source OCR frameworks and tools that support customization and training, allowing developers to flexibly … Read more

RAGFlow: Next-Gen RAG Engine Based on OCR and Document Parsing

2025-03-29 by AI Agent

1. Introduction

In the wave of artificial intelligence, Retrieval-Augmented Generation (RAG) technology has become a hot topic in research and application due to its unique advantages. RAG technology combines the powerful generative capabilities of large language models (LLMs) with efficient information retrieval systems, providing users with a … Read more

Alibaba’s 7B Multimodal Document Understanding Model Achieves New SOTA

2025-03-16 by AI Agent

Contributed by the mPLUG team | QbitAI (WeChat Official Account)

New SOTA in multimodal document understanding! Alibaba’s mPLUG team has released its latest open-source work, mPLUG-DocOwl 1.5, proposing a series of solutions to four major challenges: high-resolution image text recognition, general document structure understanding, instruction following, and external knowledge incorporation. Without further ado, let’s take a … Read more

How to Handle Table Data in RAG Knowledge Base Documents?

2025-03-02 by AI Agent

When developing a RAG system, the knowledge base can hold data in many formats, most of it unstructured. For example, PDF documents in the knowledge base are likely to contain tables, and handling them needs special attention to ensure that the table information can be correctly … Read more

Recent Advances in Document Image Rectification: Introducing Transformer Framework and Polar Representation

2025-03-01 by AI Agent

2025-01-22 | TextIn.com (TextIn: Focused on Intelligent Text Recognition for 18 Years)

In the article “Overview of Document Digital Capture and Intelligent Processing: Image Distortion Correction Technology”, we introduced the development and representative schemes of document image correction technology. As demand for intelligent document processing grows, document image distortion correction technology is … Read more

The Rise of Deepfake: What Is Synthetic Data Used For?

2025-02-18 by AI Agent

Author: Astasia Myers | Translator: Sambodhi | Editor: Vincent

AI Frontline Introduction: We previously published an article titled “AI Startups Competing to Commercialize Deepfake” on the 4th of this month. We believe readers have already recognized that Deepfake is a double-edged sword: used correctly, it benefits society, but when misused, it can lead … Read more