Recommended Computer Vision Papers for May 2024

Source: DeepHub IMBA. This article is approximately 3,100 words long, about a 6-minute read. It introduces the latest research and advances in computer vision, covering diffusion models, vision-language models, image editing and generation, video processing and generation, and image recognition. Today, we summarize the most … Read more

Understanding and Implementing Diffusion Models in PyTorch

In the previous article, we introduced OpenAI's Sora, the text-to-video model that once again stunned the AI community, and noted that Sora is essentially a diffusion model combined with a Transformer. This article continues with the development, principles, and coding practice of diffusion models. The catalyst for diffusion models began with the introduction of … Read more

Understanding and Implementing Diffusion Models with PyTorch

Source: Machine Learning Algorithms. This article is approximately 6,500 words long, about a 13-minute read. It gives a complete introduction to the background needed to understand diffusion models and implements one in full using PyTorch. Interest in diffusion models was triggered by the introduction of the Denoising Diffusion Probabilistic Model (DDPM) … Read more

Principles and Implementation of Diffusion Models in PyTorch

The MLNLP community is a well-known machine learning and natural language processing community in China and abroad, whose members include NLP master’s and doctoral students, university teachers, and corporate researchers. The community's vision is to promote communication and progress between the academic and industrial circles of natural language processing and machine learning in China and abroad, … Read more

Google Proposes New Framework Soft Diffusion: Correctly Scheduling, Learning, and Sampling from General Corruption Processes

This article is adapted from Machine Heart, edited by Dan Jiang. Diffusion models have recently become a research hotspot in AI. In their latest study, researchers from Google Research and UT-Austin take full account of the ‘corruption’ process, proposing a design framework for … Read more

A Survey on Generative Diffusion Models

0. Introduction. This article reviews deep generative models, particularly diffusion models, and how they endow machines with human-like imagination. Diffusion models show great potential for generating realistic samples, overcoming the posterior distribution alignment obstacles of variational autoencoders and alleviating the instability of the adversarial objectives used in generative adversarial networks. Diffusion models consist of two interconnected processes: … Read more

Understanding Diffusion Models: A Unified Perspective

The MLNLP community is a well-known machine learning and natural language processing community in China and abroad, whose members include NLP master’s and doctoral students, university teachers, and corporate researchers. The community's vision is to promote communication and progress among academia, industry, and enthusiasts in natural language processing and machine learning, especially beginners. Reprinted from … Read more

Stable Diffusion Sampling Speed Doubled! Diffusion Model Sampling Algorithm Requires Only 10 to 25 Steps

The TSAIL team, led by Professor Zhu Jun of Tsinghua University’s Computer Science Department, proposed DPM-Solver (NeurIPS 2022 Oral, approximately the top 1.7%) and DPM-Solver++, which push fast sampling for diffusion models to the extreme: high-quality sampling in only 10 to 25 steps, with no additional training. This year, one of the most … Read more

SemCity: 3D Diffusion Model for Outdoor Scene Generation

Paper title: SemCity: Semantic Scene Generation with Triplane Diffusion. Authors: Jumin Lee, Sebin Lee, Changho Jo, Woobin Im, Juhyeong Seon, Sung-Eui Yoon. Project page: https://sglab.kaist.ac.kr/SemCity/. Compiler: Babata. Reviewer: Los. Introduction: This paper, accepted at CVPR 2024, proposes a 3D diffusion model for generating real outdoor scenes. Utilizing the Triplane Diffusion method, it not … Read more

SegRefiner: High-Precision Image Segmentation via Diffusion

This article shares the NeurIPS 2023 paper SegRefiner: Towards Model-Agnostic Segmentation Refinement with Discrete Diffusion Process, which achieves high-precision image segmentation through diffusion. Paper link: https://arxiv.org/abs/2312.12425. Open-source code: https://github.com/MengyuWang826/SegRefiner. Background: Although image segmentation has been widely researched and … Read more