Word2Vec is a method for learning word embeddings with a shallow neural network trained either to predict context words from a target word (skip-gram) or to predict a target word from its context (CBOW). It produces static vectors: a word receives the same embedding regardless of the sentence it appears in. The article references embeddings as inputs to self-attention layers, and Word2Vec represents an earlier stage in the development of such embeddings. Although it is not based on attention, it helps establish how word meaning can be encoded numerically, laying the groundwork for the more complex, contextual embeddings used in Transformers. It is a stepping stone toward understanding input representations, and its static nature is the key limitation that contrasts with the dynamic, context-dependent embeddings produced by attention models.
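
To make the static-vector point concrete, the sketch below trains a tiny Word2Vec model with the gensim library (an assumption; the article does not prescribe a toolkit) and retrieves the single fixed vector stored for a word. The toy corpus, hyperparameters, and word choices are illustrative only.

```python
# Minimal sketch: training static Word2Vec embeddings with gensim (assumed installed, 4.x API).
from gensim.models import Word2Vec

# Illustrative toy corpus; real training uses millions of sentences.
corpus = [
    ["the", "bank", "approved", "the", "loan"],
    ["she", "sat", "on", "the", "river", "bank"],
    ["the", "loan", "was", "paid", "back"],
]

# sg=1 selects the skip-gram objective (predict context words from the target);
# sg=0 would use CBOW (predict the target from its context).
model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1, epochs=100)

# The embedding is static: "bank" maps to one vector whether the sentence
# referred to a financial institution or a riverside.
vec = model.wv["bank"]
print(vec.shape)                      # (50,)
print(model.wv.most_similar("loan"))  # nearest neighbors in the learned space
```

In a Transformer, by contrast, the representation of "bank" coming out of the self-attention layers differs between the two sentences, which is exactly the contextual behavior that static Word2Vec vectors cannot provide.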