PyTorch Tricks Compilation

Author: z.defying@Zhihu | Source: https://zhuanlan.zhihu.com/p/76459295 | Editor: Jishi Platform. For academic sharing only; please contact us for removal in case of infringement. Table of Contents: 1. Specify GPU ID; 2. View model output details for each layer; 3. Gradient clipping; 4. Expand image dimensions; 5. One-hot encoding; 6. Prevent out-of-memory errors during model validation; 7. Learning rate decay; 8. Freeze …

Getting Started with PyTorch: A Dynamic Neural Network Library

Hello everyone, I am Azheng. Today I am excited to introduce a “martial arts master”: PyTorch, a Python library for dynamic neural networks that works like a superhero. With it, you can navigate the world of neural networks with ease. 1. Getting to Know PyTorch: Imagine PyTorch as a magical “workshop” that specializes in creating amazing tools …

Practical Implementation of PyTorch FlexAttention: Causal Attention and Variable-Length Sequence Processing Based on BlockMask

Source: DeepHub IMBA. This article is approximately 2,000 words long, about a 5-minute read. It shows how to use the FlexAttention and BlockMask features available in PyTorch 2.5 and later to implement causal attention and handle padded inputs. Given the current lack of complete code examples and technical …
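As a rough illustration of the idea in this teaser, the sketch below mimics in plain Python the kind of mask predicate that FlexAttention works with: a function of `(b, h, q_idx, kv_idx)` that says whether a query position may attend to a key position. It does not call PyTorch, and it is not the article's own code, which this excerpt does not show. The names `causal_padding_mask`, `dense_mask`, and `lengths` are hypothetical, and leaving padded query rows unmasked is an assumption made for simplicity.

```python
# Plain-Python illustration (no torch): a mask predicate in the
# (b, h, q_idx, kv_idx) style used by FlexAttention mask functions.
# `lengths`, `causal_padding_mask`, and `dense_mask` are hypothetical names.

def causal_padding_mask(lengths):
    """Return a predicate that is True where attention is allowed."""
    def mask_mod(b, h, q_idx, kv_idx):
        causal = q_idx >= kv_idx            # never attend to future tokens
        not_padded = kv_idx < lengths[b]    # ignore padded key positions
        return causal and not_padded
    return mask_mod

def dense_mask(mask_mod, b, h, seq_len):
    """Materialise the predicate as a seq_len x seq_len table of booleans."""
    return [[mask_mod(b, h, q, k) for k in range(seq_len)]
            for q in range(seq_len)]

lengths = [3, 2]                  # true lengths of two sequences padded to 4
mask_mod = causal_padding_mask(lengths)
mask_seq1 = dense_mask(mask_mod, b=1, h=0, seq_len=4)

# Print the mask for the second sequence: "x" = allowed, "." = blocked.
for row in mask_seq1:
    print("".join("x" if allowed else "." for allowed in row))
```

In real use, a block-sparse mask would be built from such a predicate rather than a dense table; the dense table here only makes the pattern visible.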

Comprehensive Collection of Common PyTorch Code Snippets

Author: Jack Stark@Zhihu (Authorized) | Source: https://zhuanlan.zhihu.com/p/104019160 | Editor: Jishi Platform. Jishi Guide: This article is a collection of common PyTorch code snippets, covering five aspects: basic configuration, tensor processing, model definition and operation, data processing, and model training and testing. It also provides several noteworthy tips, making the content very comprehensive. …

Convolutional Neural Networks: Understanding the Digit Zero

Cover image: Airbnb headquarters, illustrated in March 2020. Recently, while exploring artificial intelligence, I noticed that much of the available material shows how to get results by following programming steps, yet many people treat the process as a black box. It is often said that we do not know why this process …

Illustration of 3 Common Deep Learning Network Structures: FC, CNN, RNN

Introduction: Deep learning can be applied in many fields, and the architecture of a deep neural network varies with the application scenario. The most common deep learning models are the fully connected (FC) network, the convolutional neural network (CNN), and the recurrent neural network (RNN). Each of these has its own characteristics and plays an important role in different …

Explaining CNNs from the Frequency Domain Perspective

Link: https://www.zhihu.com/question/59532432/answer/1510340606 | Editor: Deep Learning and Computer Vision. Disclaimer: For academic sharing only; please contact us for removal in case of infringement. Time domain convolution = frequency domain multiplication. Most of the computation in a convolutional neural network occurs in the convolution layers. How should we think about convolutional neural networks from the perspective of the frequency domain? How to explain ResNet …
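The identity "time domain convolution = frequency domain multiplication" can be checked numerically. The sketch below is a minimal standard-library example, not taken from the linked answer: it compares a direct circular convolution with the inverse DFT of the product of two DFTs. The helper names `dft`, `idft`, and `circular_convolve` and the sample signals are assumptions chosen for illustration. The identity holds in this exact form for circular convolution; linear convolution needs zero-padding first, and the "convolution" in deep learning frameworks is typically cross-correlation.

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive O(n^2) inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def circular_convolve(x, h):
    """Direct circular convolution of two equal-length sequences."""
    n = len(x)
    return [sum(x[m] * h[(t - m) % n] for m in range(n)) for t in range(n)]

x = [1.0, 2.0, 3.0, 4.0]      # sample signal (illustrative values)
h = [0.5, 0.25, 0.0, 0.0]     # sample kernel, zero-padded to the same length

direct = circular_convolve(x, h)
# Multiply the spectra element-wise, then transform back to the time domain.
via_freq = [v.real for v in idft([a * b for a, b in zip(dft(x), dft(h))])]
max_error = max(abs(d - f) for d, f in zip(direct, via_freq))

print(direct)
print(max_error < 1e-9)
```

Both routes give the same sequence up to floating-point rounding, which is the property the frequency-domain view of CNNs builds on.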

Understanding Convolutional Neural Networks (CNN)

Convolutional Neural Networks (CNNs) are a type of feedforward neural network in which each artificial neuron responds to a portion of the surrounding units within its coverage area, and they perform exceptionally well on large-scale image processing. CNNs have five characteristics: 1. Local perception; 2. Parameter sharing; 3. Sampling; 4. Multiple convolutional kernels; …

Explaining the Basic Structure of Convolutional Neural Networks (CNN)

I am a master’s student at a double first-class university, currently preparing for the 2024 autumn recruitment season. While looking for internships in large-model algorithm positions, I came across many interesting interviews, so I decided to record these interview questions and share them with friends who, like me, are striving for a satisfactory …

Stanford Deep Learning Course Part 7: RNN, GRU, and LSTM

This article is a translated version of the notes from Stanford University’s CS224d course, authorized by Professor Richard Socher of Stanford University. Unauthorized reproduction is prohibited; for specific reproduction requirements, please see the end of the article. Translation: Hu Yang & Xu Ke | Proofreading: Han Xiaoyang & Long Xincheng. Editor’s Note: This article is the …