Summary of Word2vec References
First, a brief account of my deep dive into Word2vec. As usual, I started by reading Mikolov’s two original Word2vec papers, but I was still confused after finishing them, mainly because both papers omit much of the theoretical background and the details of the derivations. I then dug out Bengio’s 2003 JMLR paper and Ronan Collobert’s 2011 JMLR paper. After reading them, I had some understanding of topic models and of using CNNs for NLP tasks, but I still couldn’t fully grasp Word2vec. At that point, I began reading many Chinese and English blogs, and one blog by