Resources for Learning and Understanding Word2Vec

Source: AI Study Society

I was interviewed recently, and since I still don’t fully understand how word embeddings work, I have been collecting related materials to grasp the concept better. My understanding is still limited, so I won’t overreach by writing my own article (even if I did, it would just be a remix of existing articles, not my own understanding). Instead, I have compiled the useful materials I found here, as a record.

An English blog has done a similar compilation (http://textprocessing.org/getting-started-with-word2vec).

This article mainly collects Chinese materials, along with the essential English resources.

Understanding Word2Vec primarily involves grasping some concepts and practical techniques:

The concepts include distributed representations of words, word embeddings, and neural network language models. Most online explanations of word2vec cover these topics, so find a reliable source and read it through; the main contribution of word2vec is not the algorithm itself but the simplifications and speedups it brings to those earlier methods. As the original paper notes, it can train on a billion-word corpus on a single machine in under a day. The key techniques are CBOW, Skip-Gram, Hierarchical Softmax, and Negative Sampling; a short sketch of how the first two generate training pairs follows below.
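To make those terms concrete, here is a minimal sketch (plain Python; the sentence and window size are hypothetical choices for illustration) of how CBOW and Skip-Gram derive training pairs from the same text:

```python
# Illustrative only: how CBOW and Skip-Gram slice a sentence into
# training examples. The window size of 2 is an arbitrary choice.
sentence = "the quick brown fox jumps over the lazy dog".split()
window = 2

for i, center in enumerate(sentence):
    # Context = up to `window` words on each side of the center word.
    context = sentence[max(0, i - window):i] + sentence[i + 1:i + 1 + window]
    # CBOW predicts the center word from its context.
    print(f"CBOW:      {context} -> {center}")
    # Skip-Gram predicts each context word from the center word.
    for c in context:
        print(f"Skip-Gram: {center} -> {c}")
```

Hierarchical Softmax and Negative Sampling are then two alternative ways of making the output-layer prediction cheap, instead of computing a full softmax over the entire vocabulary.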

1. The Authors’ Papers:

https://arxiv.org/pdf/1301.3781.pdf

https://arxiv.org/pdf/1310.4546.pdf

Source code download:

https://code.google.com/archive/p/word2vec/

2. [NLP] Understand Word2vec Word Vectors in Seconds

It is well suited to forming a first impression of the important concepts.

https://zhuanlan.zhihu.com/p/26306795

The tone is quite bold, but the piece is indeed well written. The materials it collects overlap with those listed below, so I also recommend picking your own readings from it.

I believe that for a blog post, unlike a paper, the most important thing is not being “error-free” but “speaking in plain language.” Reading only academic papers often leaves key intuitions buried in formulas, and for some questions the formulas are not what matters. The “let the data speak” ethos of deep learning reflects this especially well.

3. Xin Rong’s Work

Explanation video: https://www.youtube.com/watch?v=D-ekE-Wlcds

Article: word2vec Parameter Learning Explained (https://arxiv.org/abs/1411.2738)

PPT: https://docs.google.com/presentation/d/1yQWN1CDWLzxGeIAvnGgDsIJr5xmy4dB0VmHFKkLiibo/edit#slide=id.ge79682746_0_438

Demo: https://ronxin.github.io/wevi/

It is recommended to watch the video first and then work through the paper.

4. Youdao’s Deep Learning Word2Vec Notes

This article may be more suitable for developers; I found it quite challenging to read…

5. Lai Siwei’s Blog and Doctoral Dissertation

http://licstar.net/archives/category/%E8%87%AA%E7%84%B6%E8%AF%AD%E8%A8%80%E5%A4%84%E7%90%86

6. Source Code Analysis: A Blog Post by a PhD Student at the Institute of Computing Technology

http://www.cnblogs.com/neopenx/p/4571996.html

7. The Origins and Development of Word Vectors

http://ruder.io/word-embeddings-1/

Ruder’s series of blog posts is very clear.

Once you have a sufficient understanding, you can read the original source code or try training with TensorFlow or Gensim.

For hands-on practice, refer to Lai Siwei’s doctoral dissertation; a minimal Gensim sketch also follows below.
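As a starting point for the hands-on side, here is a minimal training sketch using Gensim (assuming Gensim 4.x; the toy corpus and hyperparameters below are placeholders for illustration, not recommendations):

```python
from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences. Replace with a real corpus.
sentences = [
    ["the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"],
    ["the", "dog", "sleeps", "all", "day"],
]

# sg=1 selects Skip-Gram (sg=0 would be CBOW); hs=0 together with
# negative=5 selects Negative Sampling instead of Hierarchical Softmax.
model = Word2Vec(
    sentences,
    vector_size=100,  # dimensionality of the word vectors
    window=5,         # context window size
    min_count=1,      # keep every word in this tiny corpus
    sg=1,
    hs=0,
    negative=5,
)

# Query the trained vectors.
vec = model.wv["fox"]                        # the embedding for "fox"
print(model.wv.most_similar("dog", topn=3))  # nearest neighbors of "dog"
```

On a real corpus you would raise min_count and stream sentences from disk via a restartable iterable rather than an in-memory list.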
