Understanding Optical Character Recognition (OCR)

Introduction: Optical Character Recognition (OCR) is a branch of computer vision research that sits at the intersection of pattern recognition and artificial intelligence, and is an important component of computer science. This article briefly describes the various sub-fields of OCR, using the image above as the main thread. 1. Introduction to Optical Character Recognition Computer … Read more

A Survey on Generative Diffusion Models

0. Introduction This article reviews deep generative models, particularly diffusion models, and how they endow machines with human-like imagination. Diffusion models show great potential in generating realistic samples, overcoming the posterior distribution alignment obstacles in variational autoencoders and alleviating the instability of adversarial objectives in generative adversarial networks. Diffusion models consist of two interconnected processes: … Read more
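As a rough illustration of the forward half of the two processes the teaser mentions, below is a minimal NumPy sketch of DDPM-style forward noising; the linear schedule, timestep count, and all names are illustrative assumptions, not details taken from the surveyed paper.

```python
import numpy as np

# Minimal sketch of the DDPM forward (noising) process:
#   x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps,  eps ~ N(0, I)
T = 1000
betas = np.linspace(1e-4, 0.02, T)      # assumed linear noise schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)         # cumulative product over timesteps

def forward_diffuse(x0, t, rng=np.random.default_rng(0)):
    """Sample x_t given clean data x0 at timestep t (0-indexed)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

x0 = np.ones(4)                         # toy "clean" sample
x_noisy = forward_diffuse(x0, t=999)    # nearly pure noise at the last step
print(x_noisy.shape)
```

The reverse process, which the survey covers in depth, would train a network to predict `eps` from `x_noisy` and `t` and then denoise step by step.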

StableVideo: Generate Continuous Flicker-Free Videos Using Stable Diffusion

Source: DeepHub IMBA. This article is approximately 2,100 words; recommended reading time: 7 minutes. It summarizes the paper by Chai et al., which proposes a new method that lets diffusion models edit videos with high temporal consistency. Using Stable Diffusion to generate videos has always been a research … Read more

Understanding Diffusion Models from a Unified Perspective

Source: PaperWeekly. This article is about 8,700 words; recommended reading time: 15+ minutes. It briefly summarizes the derivation of diffusion models from a unified perspective. Source: This article is a compilation of technical blogs and overviews about diffusion models that I recently read, mainly referencing Calvin Luo's paper, aimed … Read more

Building an LLM from Scratch: A Step-by-Step Guide

The MLNLP community is a well-known machine learning and natural language processing community in China and abroad, covering NLP master's and PhD students, university faculty, and industry researchers. The community's vision is to promote communication and progress between academia and industry in natural language processing and machine learning, especially for beginners. … Read more

Prospects For Large Language Models In The Era Of AGI

Author: Wang Yurun. This article is about 5,000 words; recommended reading time: 10 minutes. The goal of AGI has always been the ultimate vision of artificial intelligence research. 1. What is AGI? AGI (Artificial General Intelligence) refers to an artificial intelligence system that can exhibit a wide and flexible range of … Read more

Latest Research: iTransformer – Are Transformers Unsuitable for Multivariate Time Series Forecasting?

Paper Title: Inverted Transformers are Effective for Time Series Forecasting. Authors: Yong Liu, Tengge Hu, Haoran Zhang, Haixu Wu, Shiyu Wang, Lintao Ma, Mingsheng Long. Compiled by: Sienna; Reviewed by: Los. Introduction: iTransformer is one of the latest research results in time series forecasting in 2024, currently demonstrating state-of-the-art (SOTA) performance on standard time series … Read more

Implementing the Transformer Model from Scratch

Source: Madio.net Mathematics China; Editor: Mathematics China Qianxia. Since thoroughly understanding the Self_Attention mechanism, the author's understanding of the Transformer model has shot up from the underground to the stratosphere, as if the meridians had been opened. Every night before sleep, the gentle phrase "Attention is all you need" echoes in my ears, … Read more

Transformers Mimic Brain Functionality and Outperform 42 Models

This article is reprinted from Quantum Bit (official account QbitAI); author: Pine from Aofeisi. Many of today's AI application models cannot avoid mentioning one model structure: the Transformer. It abandons the traditional CNN and RNN structures and consists entirely of attention mechanisms. Transformers not only … Read more

Understanding the Mathematical Principles of Transformers

Author: Fareed Khan; Translator: Zhao Jiankai; Proofreader: Zhao Ruxuan. The transformer architecture may seem intimidating, and you may have seen various explanations on YouTube or in blogs. In this post, however, I will clarify its principles with a comprehensive mathematical example, which I hope will make the transformer architecture easier to understand. Let's get started! … Read more
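As a taste of the kind of worked mathematical example the post promises, here is a minimal NumPy sketch of scaled dot-product attention, the core transformer formula Attention(Q, K, V) = softmax(QKᵀ/√d_k)V; the shapes and variable names are illustrative assumptions, not taken from the article.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)           # (n_queries, n_keys) similarity scores
    weights = softmax(scores, axis=-1)        # each row is a distribution over keys
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))   # 3 query tokens, d_k = 4
K = rng.standard_normal((3, 4))   # 3 key tokens
V = rng.standard_normal((3, 8))   # d_v = 8
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)
```

Each output row is a weighted average of the value vectors, with weights given by the query's similarity to every key; multi-head attention simply runs several of these in parallel on projected inputs.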