Summary of BERT Related Papers, Articles, and Code Resources

BERT has been very popular recently, so let's gather some related resources, including papers, code, and explanatory articles. 1. Official Google resources: 1) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Everything started with this paper, released by Google in October 2018, which instantly excited the entire AI community, social media included: https://arxiv.org/abs/1810.04805 2) GitHub: … Read more

Hands-On Series with Hugging Face Transformers – 03 Analysis of Transformers Model

In Chapter 2, we saw what is needed to fine-tune and evaluate a Transformer. Now let’s take a look at how they work under the hood. In this chapter, we will explore the main components of the Transformer model and how to implement them using PyTorch. We will also provide guidance on how to do … Read more

Understanding Huggingface BERT Source Code: Application Models and Training Optimization

Reprinted from PaperWeekly (©PaperWeekly original). Author: Li Luoqiu, master's student at Zhejiang University. Research direction: natural language processing, knowledge graphs. Continuing from the previous article, I record my understanding of the source code of the open-source HuggingFace Transformers project. This article is based on the … Read more

Local Deployment and Fine-Tuning Tutorial for Qwen 2.5 Model

As a non-professional beginner, I was first drawn to explore large models out of curiosity. As I read more papers and reports, I kept wanting to practice with large models but didn't know where to start. I believe many students have had the same experience. This article will guide … Read more

Understanding Word2vec Principles and Practice

Source: Submission. Author: Aksy. Editor: Senior Sister. Video link: https://ai.deepshare.net/detail/p_5ee62f90022ee_zFpnlHXA/6 Paper title: Efficient Estimation of Word Representations in Vector Space. Author: Tomas Mikolov (first author). Affiliation: Google. Venue and year: ICLR 2013. 1. Research Background. 1.1 Prior Knowledge. Mathematics: calculus from advanced mathematics, matrix operations from linear algebra, conditional probability from probability theory. Machine … Read more

Summary of PyTorch Loss Functions

Source: Pythonic Biologist. This article is about 1,900 words long, with a suggested reading time of 8 minutes. TensorFlow and PyTorch are quite similar; this article introduces loss functions using PyTorch as an example. 19 Types of Loss Functions. 1. L1 Loss (L1Loss): computes the absolute value of the difference between the output and the target. torch.nn.L1Loss(reduction='mean') Parameters: … Read more
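
To make the L1 loss in the excerpt above concrete, here is a minimal pure-Python sketch of the quantity that torch.nn.L1Loss(reduction='mean') computes. The helper name l1_loss and the plain-list inputs are assumptions made for illustration; this is not PyTorch's implementation, and it does not handle tensors, broadcasting, or gradients.

```python
def l1_loss(output, target, reduction="mean"):
    """Illustrative L1 loss over two equal-length lists of numbers.

    Hypothetical helper, not part of PyTorch; it only mirrors the
    arithmetic behind the 'mean', 'sum', and 'none' reductions.
    """
    if len(output) != len(target):
        raise ValueError("output and target must have the same length")

    # Element-wise absolute difference |output_i - target_i|
    diffs = [abs(o - t) for o, t in zip(output, target)]

    if reduction == "mean":
        return sum(diffs) / len(diffs)  # average over all elements
    if reduction == "sum":
        return sum(diffs)               # total absolute error
    if reduction == "none":
        return diffs                    # unreduced, one value per element
    raise ValueError(f"unknown reduction: {reduction!r}")


output = [1.0, 2.0, 4.0]
target = [1.0, 0.0, 0.0]
print(l1_loss(output, target))                    # 2.0
print(l1_loss(output, target, reduction="sum"))   # 6.0
print(l1_loss(output, target, reduction="none"))  # [0.0, 2.0, 4.0]
```

With reduction='mean' the per-element absolute differences are averaged, so the result stays on the same scale regardless of how many elements are compared.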

Common Pitfalls in PyTorch

Author: Bi Ji Ji. https://zhuanlan.zhihu.com/p/59271905 This article is published with authorization; further reproduction without permission is not allowed. 1. The Differences Between nn.Module.cuda() and Tensor.cuda(). Both cuda() functions can move data from CPU memory to GPU memory, for models and … Read more

Common Pitfalls in PyTorch

Author: Yu Zhenbo. https://zhuanlan.zhihu.com/p/77952356 This article is published with the author's authorization and may not be reproduced without permission. I recently started using PyTorch and have run into quite a few pitfalls. I record them here, as I feel they are common issues … Read more

Essential Tool for PyTorch: Accelerate Mixed Precision Training with Apex

Author: Nicolas. Affiliation: Researcher at Zhuiyi Technology AI Lab. Research direction: information extraction, machine reading comprehension. Do you want to double your training speed? Do you want to double your usable GPU memory in an instant? If I told you it takes only three lines of code, would you believe it? In this article, the author … Read more