Overview of Transformer Pre-trained Models in NLP

The Transformer has revolutionized natural language processing (NLP). Recently, researchers from the Indian Institute of Technology and the biomedical AI startup Nference.ai conducted a comprehensive survey of Transformer-based pre-trained models in NLP and compiled the results into a review paper. This article roughly translates and introduces … Read more

ShapeFormer: Shapelet Transformer for Multivariate Time Series Classification

Source: Time Series Research. This article is approximately 3,400 words long and is recommended as a 5-minute read. It introduces Transformers for multivariate time series classification. Multivariate time series classification (MTSC) has attracted extensive research attention due to its diverse real-world applications. Recently, Transformer-based methods have achieved state-of-the-art performance on MTSC. However, existing … Read more

Introduction to Transformer Models

Source: Madio.net, Mathematics China. Editor: Only Tulips’ Garden. The essence of the Transformer is an Encoder-Decoder structure, as shown in the figure. Before the Transformer appeared, most sequence-to-sequence (Encoder-Decoder) models were based on CNNs and RNNs. A previous article introduced the Attention and Self-Attention mechanisms, and the Transformer is built on the … Read more
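As an aside on the self-attention mechanism the teaser mentions, the following is a minimal NumPy sketch of scaled dot-product self-attention. The matrices, names, and dimensions here are illustrative assumptions only, not taken from the linked article:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X of shape (seq_len, d_model)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # pairwise token affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # each token mixes information from all tokens

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                          # 4 tokens, d_model = 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

In an encoder layer this operation is applied per attention head and followed by a feed-forward network; the sketch keeps only the attention core.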

Revisiting Transformer: Inversion More Effective, New SOTA for Real-World Prediction

The Transformer has shown strong capabilities in time series forecasting: it can describe pairwise dependencies and extract multi-level representations from sequences. However, researchers have also questioned the effectiveness of Transformer-based forecasters. These models typically embed multiple variables at the same timestamp into indistinguishable channels and attend over these temporal tokens to capture temporal dependencies. Considering … Read more

ViTGAN: A New Approach to Image Generation Using Transformers

Transformers have brought tremendous advances to a wide range of natural language tasks and have recently begun to penetrate computer vision, showing potential in tasks previously dominated by CNNs. A recent study from the University of California, San Diego and Google Research proposes training GANs with vision Transformers. To effectively apply this … Read more

Illustrated Guide to Transformers

Step 1: Define the Dataset. For demonstration purposes, the dataset here contains only three English sentences; such a tiny dataset makes the numerical calculations easy to follow by hand. In real applications, much larger datasets are used to train neural network models; ChatGPT, for example, was trained on some 570 GB of data. Our entire dataset contains … Read more

Time Series + Transformer: Understanding iTransformer

This article is about 3,500 words long and is recommended as a 10-minute read. It will help you understand iTransformer and make better use of the attention mechanism for multivariate correlation. 1 Introduction Transformers perform excellently in natural language processing and computer vision, but they do not perform as well as linear models in … Read more
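The "inverted" idea behind iTransformer can be sketched in a few lines of NumPy: instead of one token per timestamp, each variate's entire series becomes one token, so attention runs across variates rather than across time. The shapes and the projection `W_embed` below are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

# Shapes: T timesteps, N variates, D model dimension (all illustrative).
T, N, D = 96, 7, 16
rng = np.random.default_rng(0)
series = rng.normal(size=(T, N))       # one multivariate series

# Vanilla Transformer embedding: one token per timestamp -> T tokens of size N,
# so variables at the same timestamp share a single, indistinguishable token.
time_tokens = series                   # (T, N)

# Inverted (iTransformer-style) embedding: one token per variate ->
# each variate's whole series is projected to a D-dim token.
W_embed = rng.normal(size=(T, D))      # hypothetical learned projection
variate_tokens = series.T @ W_embed    # (N, D): attention now runs across variates

print(time_tokens.shape, variate_tokens.shape)  # (96, 7) (7, 16)
```

With variate tokens, the attention map directly expresses correlations between variables, which is the multivariate-correlation use of attention the article discusses.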

A Review of Transformers at the Forefront of GNN

This article is about 4,500 words long and is recommended as a 10-minute read. It introduces Graphormer, a graph representation learning method built on the standard Transformer architecture. 1 Introduction The Transformer architecture has shown excellent performance in fields such as natural language processing and computer vision, but it performs … Read more

Understanding Transformers: A Simplified Guide

Source: Python Data Science. This article is approximately 7,200 words long and is recommended as a 14-minute read. In this article, we explore the Transformer model and understand how it works. 1. Introduction Google's BERT model achieved SOTA results on 11 NLP tasks, igniting the entire NLP community. One … Read more

Recent Advances in Graph Transformer Research

Source: Algorithm Advancement. This article is approximately 4,500 words long and is recommended as a 9-minute read. It introduces the Graph Transformer, a novel and powerful neural network model that can effectively encode and process graph-structured data. Graph neural networks (GNNs) and Transformers are recent advances in machine learning, providing a new type … Read more
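To show how attention can encode graph structure, here is a minimal NumPy sketch that restricts scaled dot-product attention to a graph's edges via an adjacency mask. The toy graph, features, and masking scheme are illustrative assumptions, not the exact formulation of any particular Graph Transformer:

```python
import numpy as np

# Toy graph: 4 nodes on a path (edges 0-1, 1-2, 2-3), with self-loops.
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)

rng = np.random.default_rng(0)
H = rng.normal(size=(4, 8))                     # node feature vectors

scores = H @ H.T / np.sqrt(H.shape[-1])         # pairwise node affinities
scores = np.where(A > 0, scores, -np.inf)       # mask: attend only along edges
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax over neighbors
H_new = weights @ H                             # message passing via attention
print(H_new.shape)  # (4, 8)
```

Each node's new representation is a weighted mix of its neighbors' features, which is one simple way attention and GNN-style message passing can be combined.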