
Distributed Word and Phrase Representations
About this listen
This 2013 paper by Mikolov et al., "Distributed Representations of Words and Phrases and their Compositionality", extends the continuous Skip-gram model, a method for learning high-quality distributed vector representations of words. The authors present two extensions, subsampling of frequent words and negative sampling, that improve both vector quality and training speed. A further contribution is a method for identifying idiomatic phrases and representing them as single tokens, which lets the model capture meanings that are not simple compositions of the individual words. The paper demonstrates that the resulting word and phrase vectors exhibit linear structure, enabling analogical reasoning through simple vector arithmetic: for example, vec("Madrid") - vec("Spain") + vec("France") lands closer to vec("Paris") than to any other word vector. Overall, the research shows that these optimizations to the Skip-gram architecture yield more efficient and accurate linguistic representations, especially on large datasets.
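As a concrete illustration of the pipeline described above, here is a minimal sketch using the gensim library rather than the authors' original C implementation. The toy corpus, variable names, and hyperparameter values are illustrative assumptions; gensim's `sample`, `negative`, and `sg` parameters correspond to the paper's subsampling, negative-sampling, and Skip-gram components, and its `Phrases` module plays the role of the paper's phrase-detection pass.

```python
# Sketch of the paper's pipeline with gensim (assumed: gensim 4.x).
# Corpus and hyperparameters are toy values for illustration only.
from gensim.models import Word2Vec
from gensim.models.phrases import Phrases

# A tokenized corpus; a real run needs a much larger one.
sentences = [
    ["new", "york", "is", "a", "large", "city"],
    ["she", "moved", "to", "new", "york", "last", "year"],
]

# Step 1: merge frequent collocations such as "new york" into single
# tokens ("new_york"), mirroring the paper's phrase-detection pass.
bigrams = Phrases(sentences, min_count=1, threshold=1.0)
phrased = [bigrams[s] for s in sentences]

# Step 2: train a Skip-gram model with the paper's two extensions:
# negative sampling and subsampling of frequent words.
model = Word2Vec(
    phrased,
    vector_size=100,  # dimensionality of the word vectors
    window=5,         # context window size
    sg=1,             # 1 = Skip-gram architecture
    negative=5,       # negative samples per positive pair
    sample=1e-3,      # subsampling threshold for frequent words
    min_count=1,
)

# Step 3: analogical reasoning by vector arithmetic, e.g. the paper's
# vec("king") - vec("man") + vec("woman") ~ vec("queen") example.
# Only meaningful after training on a large corpus:
# model.wv.most_similar(positive=["king", "woman"], negative=["man"])
```

In the paper itself, subsampling discards each occurrence of a word w with probability 1 - sqrt(t / f(w)), where f(w) is the word's corpus frequency and t is a threshold around 1e-5, and a candidate bigram is kept as a phrase when (count(w_a w_b) - delta) / (count(w_a) * count(w_b)) exceeds a threshold; gensim's `sample` and `threshold` parameters fill the corresponding roles in this sketch.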
Source: https://arxiv.org/pdf/1310.4546