Latest Overview of Transformer Models: Essential for NLP Learning

Reprinted from Quantum Bit (QbitAI). Xiao Xiao, reporting from Aofeisi | WeChat Official Account QbitAI. What are the differences between Longformer, a model built to process long texts efficiently, and BigBird, which is considered an “upgraded version” of the Transformer? And what do the many other Transformer variants (X-formers) look like? … Read more

Mamba Can Replace Transformer, But They Can Also Be Combined

This article is reprinted from Machine Heart, edited by Panda W. Transformers are powerful but not perfect, especially when dealing with long sequences, where State Space Models (SSMs) perform quite well. Researchers proposed last year that SSMs could replace Transformers, as seen … Read more

Building Instruction-Based Intelligent Agents: Insights from Transformer

Source | The Robot Brains Podcast. Translation | Xu Jiayu, Jia Chuan, Yang Ting. In 2017, Google released the paper “Attention Is All You Need,” which proposed the Transformer architecture. It has since become one of the most influential technological innovations in the field of neural networks over the past decade and has been widely applied in … Read more

Overview of Transformer Pre-trained Models in NLP

The revolution the Transformer has brought to natural language processing (NLP) is beyond words. Recently, researchers from the Indian Institute of Technology and the biomedical AI startup Nference.ai conducted a comprehensive survey of Transformer-based pre-trained models in NLP and compiled the results into a review paper. This article roughly translates and introduces … Read more

ShapeFormer: Shapelet Transformer for Multivariate Time Series Classification

Source: Time Series Research. This article is approximately 3400 words long and is recommended as a 5-minute read. It introduces the Transformer for multivariate time series classification. Multivariate time series classification (MTSC) has attracted extensive research attention due to its diverse real-world applications. Recently, Transformers have achieved state-of-the-art performance on MTSC. However, existing … Read more

Revisiting Transformer: Inversion More Effective, New SOTA for Real-World Prediction

The Transformer has shown strong capabilities in time series forecasting, capable of describing pairwise dependencies and extracting multi-level representations from sequences. However, researchers have also questioned the effectiveness of Transformer-based predictors. These predictors often embed multiple variables at the same timestamp into indistinguishable channels and attend to these time tokens to capture temporal dependencies. Considering … Read more

Introduction to Transformer Models

Source: Madio.net (Mathematics China) | Editor: Only Tulips’ Garden. The essence of the Transformer is an Encoder-Decoder structure, as shown in the figure. Before the advent of the Transformer, most sequence-to-sequence (Encoder-Decoder) models were based on CNNs and RNNs. In this article, we have already introduced the Attention and Self-attention mechanisms, and the Transformer is based on the … Read more
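The self-attention mechanism this blurb refers to can be made concrete with a minimal NumPy sketch. This is a single-head, scaled dot-product attention only (an illustrative assumption, not a full multi-head Transformer layer), with randomly chosen toy shapes:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention (sketch only)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # pairwise token similarities
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V                   # weighted mix of value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))              # 3 tokens, model dimension 4
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (3, 4): one attended representation per token
```

Each output row is a convex combination of the value vectors, with mixing weights determined by query-key similarity; stacking such layers (plus feed-forward blocks and residual connections) gives the Encoder-Decoder structure described above.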

ViTGAN: A New Approach to Image Generation Using Transformers

Transformers have brought tremendous advancements to various natural language tasks and have recently begun to penetrate the field of computer vision, starting to show potential in tasks previously dominated by CNNs. A recent study from the University of California, San Diego, and Google Research proposed using visual Transformers to train GANs. To effectively apply this … Read more

Illustrated Guide to Transformers

Step 1 — Define the Dataset. For demonstration purposes, the dataset here contains only three English sentences; such a tiny dataset makes it possible to walk through the numerical calculations by hand. In real applications, far larger datasets are used to train neural network models; ChatGPT, for example, was trained on some 570 GB of data. Our entire dataset contains … Read more

Time Series + Transformer: Understanding iTransformer

This article is about 3500 words long and is recommended as a 10-minute read. It will help you understand iTransformer and make better use of the attention mechanism for multivariate correlation. 1 Introduction. Transformers perform excellently in natural language processing and computer vision, but they do not perform as well as linear models in … Read more
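The "inversion" behind iTransformer can be sketched as follows: instead of one token per timestamp, each variate's entire history is embedded as a single token, so attention runs across variates and captures multivariate correlation. This is an illustrative NumPy sketch with made-up shapes, not the paper's actual implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def inverted_attention(series, W_embed, Wq, Wk, Wv):
    """iTransformer-style sketch: one token per variate, attention
    over variates rather than over time steps."""
    # series: (num_variates, seq_len) -> tokens: (num_variates, d_model)
    tokens = series @ W_embed
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])  # variate-by-variate affinities
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(0)
series = rng.normal(size=(5, 96))        # 5 variates, 96 time steps each
W_embed = rng.normal(size=(96, 16))      # per-variate linear embedding
Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))
out = inverted_attention(series, W_embed, Wq, Wk, Wv)
print(out.shape)  # (5, 16): one mixed representation per variate
```

Contrast this with the conventional setup criticized above, where all variates at one timestamp are fused into a single time token before attention is applied.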