Building Instruction-Based Intelligent Agents: Insights from Transformer

Source | The Robot Brains Podcast. Translation | Xu Jiayu, Jia Chuan, Yang Ting. In 2017, Google released the paper “Attention Is All You Need,” which proposed the Transformer architecture. It has become one of the most influential innovations in neural networks over the past decade and has been widely applied in …

Mamba Can Replace Transformer, But They Can Also Be Combined

This article is reprinted from Machine Heart, edited by Panda W. Transformers are powerful but not perfect, especially when dealing with long sequences, where State Space Models (SSMs) perform quite well. Last year, researchers proposed that SSMs could replace Transformers, as seen …

Comprehensive Tutorial: Visualizing Transformer

1. Introduction. This article is the second part of the visual AI algorithm tutorial series, and today’s subject is the Transformer. The Transformer can do many interesting and meaningful things. For example, I previously wrote about “What is …

Who Will Replace Transformer?

The MLNLP community is a machine learning and natural language processing community well known both in China and abroad, whose members include graduate students, faculty, and researchers in NLP. Its vision is to promote exchange and progress between academia and industry in natural language processing and machine learning, especially for beginners. Reprinted from | AI Technology Review …

Understanding Transformers and Federated Learning

The Transformer, an attention-based encoder-decoder architecture, has not only revolutionized Natural Language Processing (NLP) but has also made groundbreaking contributions to Computer Vision (CV). Compared to Convolutional Neural Networks (CNNs), Vision Transformers (ViT) draw on strong modeling capabilities to achieve outstanding performance on multiple benchmarks such as ImageNet, COCO, …

Understanding Transformer: 8 Questions and Answers

Originally from AI有道. Seven years ago, the paper “Attention Is All You Need” introduced the Transformer architecture, revolutionizing the entire field of deep learning. Today, all major models are based on the Transformer architecture, yet its internal workings remain a mystery. Last year, one of the authors of the Transformer paper, Llion …

Finally, Someone Visualized the Transformer!

Is there anyone who still doesn’t understand how the Transformer works in 2024? Come and try this interactive tool. In 2017, Google introduced the Transformer in the paper “Attention Is All You Need,” which became a major breakthrough in deep learning. The paper has been cited nearly 130,000 times, and all models in the subsequent GPT …

What Is ChatGPT?

At the end of 2022, ChatGPT, the conversational large language model released by the artificial intelligence lab OpenAI, became an overnight sensation. Its powerful text processing and human-computer interaction capabilities quickly made it one of the most talked-about products of the new generation of artificial intelligence. A report by UBS Group shows that ChatGPT has surpassed 100 …

ChatGPT Development History, Principles, Technical Architecture, and Future

Source: Chen Wei Talks on Chips. This article introduces the characteristics, functions, technical architecture, limitations, industrial applications, investment opportunities, and future of ChatGPT. Author: Dr. Chen Wei, who previously served as chief scientist of a Huawei-affiliated natural language processing (NLP) company. He is an expert in integrated storage-computing and GPU architecture and in AI, and holds a senior professional title. Expert in the …

How ChatGPT Can Be Used in Education

Recently, ChatGPT has become popular worldwide. As generative artificial intelligence software, ChatGPT can generate text on any topic and accomplish various tasks, including answering questions and writing articles, papers, and poetry. It has been praised as having “significant historical significance, comparable to the birth of …