Understanding Huffman Tree Generation in Word2Vec

Deep learning has achieved great success in natural language processing (NLP) tasks, and the distributed representation of words is a crucial technology behind it. To understand distributed representations in depth, one must dig into word2vec. Today, let’s explore how the Huffman tree is generated in the word2vec code. It is a very important data structure in word2vec, used … Read more
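The article teased above walks through word2vec's Huffman tree, which hierarchical softmax uses to give frequent words short binary codes (shorter root-to-leaf paths mean fewer inner-node updates during training). As a rough illustration of the idea only — this is an independent Python sketch over made-up toy counts, not the original C implementation, and `build_huffman_codes` is a name invented here:

```python
import heapq
import itertools

def build_huffman_codes(freqs):
    """Assign each word a binary Huffman code from a {word: count} dict.

    Repeatedly merge the two lowest-frequency nodes; words merged early
    (rare words) accumulate more prefix bits, so frequent words end up
    with the shortest codes, as in word2vec's hierarchical softmax.
    """
    counter = itertools.count()  # tie-breaker so heapq never compares dicts
    heap = [(count, next(counter), {word: ""}) for word, count in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        c1, _, left = heapq.heappop(heap)
        c2, _, right = heapq.heappop(heap)
        # Prepend one bit to every code in each merged subtree.
        merged = {w: "0" + code for w, code in left.items()}
        merged.update({w: "1" + code for w, code in right.items()})
        heapq.heappush(heap, (c1 + c2, next(counter), merged))
    return heap[0][2]

# Toy frequencies (made up): the most frequent word gets the shortest code.
codes = build_huffman_codes({"the": 50, "on": 20, "cat": 10, "sat": 8})
```

With these counts, `the` ends up with a 1-bit code while `sat` and `cat` get 3-bit codes; in word2vec each code bit corresponds to one binary decision at an inner node of the tree.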

Understanding Word2Vec: A Deep Dive into Neural Networks

Since Tomas Mikolov from Google proposed Word2Vec in “Efficient Estimation of Word Representations in Vector Space”, it has become a fundamental component of deep learning in natural language processing. The basic idea of Word2Vec is to represent each word in natural language as a dense vector of fixed dimension, so that words with related meanings get nearby vectors. As for what … Read more
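The core training signal in Word2Vec's skip-gram variant is pairs of (center word, context word) drawn from a sliding window over the corpus. As a minimal sketch of that data-preparation step only — `skipgram_pairs` is an illustrative helper written here, not a function from the original code:

```python
def skipgram_pairs(tokens, window=2):
    """Generate (center, context) pairs as used by skip-gram training.

    For each position, every other token within `window` positions
    becomes a context word; the model then learns vectors by trying
    to predict the context from the center word.
    """
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

pairs = skipgram_pairs(["the", "cat", "sat", "on", "the", "mat"], window=1)
```

With `window=1`, the center word "cat" yields the pairs `("cat", "the")` and `("cat", "sat")`; widening the window trades syntactic precision for more topical context.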

Understanding Word2vec: The Essence of Word Vectors

A summary of Word2vec reference materials. Let me briefly describe my deep dive into Word2vec: I first read Mikolov’s two original papers on Word2vec, but found myself still confused afterward, mainly because the papers omit much of the theoretical background and derivation details. I then revisited Bengio’s 2003 JMLR paper and … Read more

The Secrets of Word2Vec: Part 3 of the Word Embedding Series

Excerpted from Sebastian Ruder’s blog. Author: Sebastian Ruder. Translated by: Machine Heart. Contributor: Terrence L. This article is Part 3 of the Word Embedding Series, introducing the popular word embedding model Global Vectors (GloVe). To read Part 2, see Technical | Word Embedding Series Part 2: Comparing Several Methods of Approximate Softmax in Language … Read more

Understanding Character Relationships in ‘Story of Yanxi Palace’ Using Word2Vec

Source | Wujie Community Mixlab. Editor | An Ke. [PanChuang AI Introduction]: Recently, everyone has been flooded with the popular Qing Dynasty drama “Story of Yanxi Palace”~ The male lead, Emperor Qianlong, is often called a “big pig’s hoof” because he falls in love with every woman he meets. As simple … Read more

Word2Vec: A Powerful Python Library for Word Vectors

Python Natural Language Processing Tool: Word2Vec from Beginner to Practical. Hello everyone, I am Niu Ge! Today, I will take you deep into Word2Vec, a very important tool in the field of natural language processing. With it, we can enable computers to truly “understand” the relationships between words, achieving smarter text processing. What is … Read more

Detailed Explanation of Word2vec Source Code

I’ve been studying word2vec for a long time, but I’ve found many conflicting explanations of it. Moreover, the original paper leaves out many details, so I plan to read the source code directly. On one hand, it can deepen my understanding; on the other hand, I can make appropriate improvements in the … Read more

From Word2Vec to BERT: The Evolution of NLP Pre-trained Models

Natural Language Processing. Author: Zhang Junlin. Source: Deep Learning Frontier Notes, Zhihu Column. Original link: https://zhuanlan.zhihu.com/p/49271699 The theme of this article is the pre-training process in natural language processing (NLP). It roughly explains how pre-training techniques in NLP gradually developed into the BERT model, naturally illustrating how the ideas behind BERT were formed, … Read more

Illustrated Word2Vec: Understanding Word Embeddings

Word embeddings represent a word with a numerical vector, which is different from the IDs used in Tokenization. Word embedding vectors carry more semantic information. This article will illustrate Word2Vec: a method for word embeddings. This series also includes illustrations of Tokenization, Transformer, GPT2, and BERT. If you want to learn about Tokenization, please see … Read more
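The teaser's distinction between token IDs and embedding vectors is the key point: an ID is an arbitrary integer with no notion of similarity, while embedding vectors place related words close together. A minimal sketch, with hand-picked toy vectors (real Word2Vec vectors are learned from a corpus and typically have 100–300 dimensions):

```python
import math

# Tokenization assigns arbitrary integer IDs -- "king" and "queen" being
# 0 and 1 says nothing about their relatedness.
token_ids = {"king": 0, "queen": 1, "apple": 2}

# An embedding maps each word to a dense vector; these 4-d values are
# made up for illustration, chosen so that king/queen are similar.
embeddings = {
    "king":  [0.90, 0.80, 0.10, 0.00],
    "queen": [0.85, 0.82, 0.12, 0.05],
    "apple": [0.00, 0.10, 0.90, 0.80],
}

def cosine(u, v):
    """Cosine similarity: 1.0 for identical directions, ~0 for unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

sim_king_queen = cosine(embeddings["king"], embeddings["queen"])
sim_king_apple = cosine(embeddings["king"], embeddings["apple"])
```

Here `sim_king_queen` comes out far higher than `sim_king_apple`, which is exactly the "more semantic information" the article attributes to embedding vectors over raw IDs.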

Don’t Understand Word2Vec? Don’t Call Yourself an NLP Expert!

Author: Li Xuedong. Editor: Li Xuedong. Introduction: Nowadays, deep learning is all the rage. It has made significant progress in the field of image processing, and with Google’s release of Word2Vec, it has also sparked a frenzy in Natural Language Processing (NLP). As I am currently working on … Read more