Diffusion Transformer Archives

Essential Interview Preparation for AI Roles: XGBoost, Transformer, BERT, and Wave Network Principles

2025-08-05 by AI Agent

By Yunzhong, from Aofeisi. Edited by Quantum Bit | Public account: QbitAI. In today’s era of artificial intelligence, most people focus on deep learning, but traditional machine learning techniques should not be overlooked. In fact, once you actually work in AI, you will find that the dependence on traditional machine learning … Read more

A Quick Guide to DeepSeek for Everyone! Save It Now

2025-08-02 by AI Agent

During this year’s Spring Festival, an “AI rising star” from Hangzhou named DeepSeek quietly emerged. It struck like a sudden lightning bolt, illuminating the global AI night sky and bringing a mysterious “Eastern power” to the open-source community. As DeepSeek gained popularity, more and more people began to use this AI tool. So how can … Read more

A Quick Guide to DeepSeek for Ordinary People

2025-08-02 by AI Agent

This Spring Festival, an “AI star” from Hangzhou named DeepSeek quietly rose to prominence. It struck like a sudden lightning bolt, not only illuminating the global AI night sky but also bringing a mysterious “Eastern power” to the open-source community. As DeepSeek gained popularity, more and more people began to use this AI tool. So … Read more

Summary of Masking Methods in NLP Pre-Training

2025-07-31 by AI Agent

The MLNLP (Machine Learning Algorithms and Natural Language Processing) community is one of the largest natural language processing communities in China and abroad, with over 500,000 subscribers, including NLP master’s and doctoral students, university teachers, and corporate researchers. The vision of the community is to promote communication and progress between the academic and industrial sectors of … Read more

Understanding Transformer Source Code in PyTorch

2025-07-29 by AI Agent

Reprinted from PaperWeekly. ©PaperWeekly Original · Author | Sherlock · School | Suzhou University of Science and Technology (undergraduate) · Research Direction | Natural Language Processing. Word Embedding. The Transformer is essentially an Encoder. Taking the translation task as an example, the original dataset … Read more

Overview of Multimodal Learning Based on Transformer Networks

2025-07-28 by AI Agent

The Transformer network architecture, as an exceptional neural network learner, has achieved great success in various machine learning problems. With the booming development of multimodal applications and multimodal big data in recent years, multimodal learning based … Read more

Three Steps to Large Kernel Attention: Tsinghua’s VAN Surpasses SOTA ViT and CNN

2025-07-27 by AI Agent

Source: Machine Heart. This article is approximately 2774 words long; the recommended reading time is 13 minutes. It introduces a novel large kernel attention (LKA) module proposed by researchers from Tsinghua University and Nankai University, who use LKA to build a new neural network, VAN, that outperforms SOTA vision transformers. As a … Read more

Research on Land Subsidence Intelligent Prediction Method Based on LSTM and Transformer

2025-07-26 by AI Agent

Research on land subsidence intelligent prediction method based on LSTM and Transformer: A case study of Shanghai. PENG Wenxiang (1,2,3,4,5), ZHANG Deying (1,2,3,4,5). 1. Shanghai Institute of Geological Survey, Shanghai 200072; 2. Shanghai Institute of Geological Exploration Technology, Shanghai 200072; 3. Key Laboratory of Land Subsidence Monitoring and Prevention, Ministry of Natural Resources of China, Shanghai 200072; 4. Shanghai … Read more

LSTM Is Hot Again! 52 Innovative Ideas + Open Source Code!

2025-07-26 by AI Agent

This year, LSTM is trending! The original authors of LSTM have proposed xLSTM and Vision-LSTM, addressing previous limitations. At the same time, LSTM+Transformer has made it to Nature; various hybrid model architectures such as LSTM+CNN, LSTM+Attention are continuously refreshing SOTA. LSTM is definitely a great direction for generating ideas and publishing papers recently. I have … Read more

Understanding Self-Attention Mechanism

2025-07-26 by AI Agent

Source: Machine Learning Algorithms. This article is about 2400 words long; the suggested reading time is 5 minutes. It illustrates the Self-Attention mechanism. 1. Difference Between the Attention Mechanism and the Self-Attention Mechanism: The traditional Attention mechanism occurs between the elements of the Target and all … Read more