How to Handle Variable Length Sequences Padding in PyTorch RNN

Produced by Machine Learning Algorithms and Natural Language Processing. Original column author on WeChat: @ Yi Zhen. School: PhD student at Harbin Institute of Technology SCIR. 1. Why RNNs Need to Handle Variable-Length Inputs: Assuming we have an example … Read more

SUPRA: Transforming Transformers into Efficient RNNs Without Extra Training

This article is approximately 2,600 words long, with a recommended reading time of 9 minutes. The SUPRA method significantly improves model stability and performance by replacing softmax normalization with GroupNorm. Transformers have established themselves as the primary model architecture, particularly due to their outstanding performance across various tasks. However, the memory-intensive nature of Transformers … Read more

Essential Guide to Recurrent Neural Networks for Beginners

Author: Victor Zhou. Translator: Wang Yutong. Proofreader: Wu Jindi. This article is about 3,800 words long, with a recommended reading time of 15 minutes. It introduces the basics of Recurrent Neural Networks (Vanilla RNNs), how they work, and how to implement them in Python. Recurrent Neural Networks (RNNs) are a type of neural network specifically designed … Read more

Future Directions of Large Models by Academician Zhang Bo

Recently, Zhang Bo, Academician of the Chinese Academy of Sciences and Honorary Dean of the Institute for Artificial Intelligence at Tsinghua University, stated in his speech at the 12th Internet Security Conference (ISC.AI 2024) that current artificial intelligence lacks a theory and has only developed models and algorithms targeted at specific fields. Both software and hardware are … Read more

A Review of Major AI Models in China and Abroad

As a heavy user of productivity tools, I consider myself to be on the cutting edge of this wave of AI. I have basically used various AI models and tools, and here I will briefly share my views on them. The following evaluations are very subjective, and I don’t care about any benchmark rankings. I … Read more

What Is the Emergence of Large AI Models?

Large AI models not only amaze people in conversation; many projects built on them have already been deployed. The key to all this is the seemingly magical “emergence” of large AI models, and this article explains this phenomenon. 1. Artificial Intelligence Is Booming Again: The popularity of large models triggered by ChatGPT remains unabated, with China … Read more

Eight Observations on Large Model Technology Development

Following the emergence of ChatGPT, the introduction of the Sora model has once again ignited enthusiasm for AI across various industries. In the face of rapidly evolving terminology, the plethora of personal and enterprise applications, and the continuous restructuring of business models, large models can exhibit astonishing capabilities and quickly impact society, which has deeper … Read more

Four Directions for the Future of Large Models

Recently, Zhang Bo, an academician of the Chinese Academy of Sciences and honorary director of the Institute of Artificial Intelligence at Tsinghua University, stated in a speech at the ISC.AI 2024 12th Internet Security Conference that current artificial intelligence lacks a theory; it has only developed models and algorithms targeted at specific fields. Both the … Read more

Introduction to Mainstream AI Large Models in China

With the rapid development of artificial intelligence technology, a number of excellent large AI models have emerged in China, demonstrating outstanding performance in various fields and scenarios. Here are some of the mainstream large AI models in China and their characteristics: 1. Alibaba Cloud: Tongyi Qianwen. Characteristics: Has hundreds of billions of parameters, supports … Read more

From Black Iron Era to Silver Era: The Evolution of Large Model Infra

The MLNLP community is a machine learning and natural language processing community well known both in China and abroad, whose members include NLP graduate students, university teachers, and researchers from enterprises. The vision of the community is to promote communication and progress between the academic and industrial circles of natural language processing and machine learning, especially for beginners. Reprinted from … Read more