# GloVe: Global Vectors for Word Representation

```bibtex
@inproceedings{Pennington2014GloVeGV,
  title     = {GloVe: Global Vectors for Word Representation},
  author    = {Jeffrey Pennington and Richard Socher and Christopher D. Manning},
  booktitle = {EMNLP},
  year      = {2014}
}
```

Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities using vector arithmetic, but the origin of these regularities has remained opaque. [...] Our model efficiently leverages statistical information by training only on the nonzero elements in a word-word co-occurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus. The model produces a vector space with…
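The training trick the abstract describes (touching only the nonzero co-occurrence counts, never the full |V| × |V| matrix) can be sketched in a few lines. The window size and toy corpus below are illustrative assumptions, not details from the paper:

```python
from collections import defaultdict

def cooccurrence_counts(tokens, window=2):
    """Symmetric word-word co-occurrence counts from a token stream."""
    counts = defaultdict(float)
    for i, w in enumerate(tokens):
        for j in range(max(0, i - window), i):
            counts[(w, tokens[j])] += 1.0
            counts[(tokens[j], w)] += 1.0
    return counts

corpus = "the cat sat on the mat the cat ate".split()
X = cooccurrence_counts(corpus)

# A GloVe-style objective would iterate only over these nonzero entries,
# not over the dense |V| x |V| matrix or over every context window:
nonzero = list(X.items())
```

Because co-occurrence matrices over large vocabularies are overwhelmingly sparse, iterating only the nonzero entries is what makes training on global statistics tractable.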

#### 19,875 Citations

Rehabilitation of Count-Based Models for Word Vector Representations

- Computer Science
- CICLing
- 2015

A systematic study of the use of the Hellinger distance to extract semantic representations from the word co-occurrence statistics of large text corpora shows that this distance gives good performance on word similarity and analogy tasks, with a proper type and size of context, and a dimensionality reduction based on a stochastic low-rank approximation.
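The Hellinger distance this work builds on compares two discrete context distributions; a minimal sketch (the toy distributions are illustrative):

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions.

    Ranges from 0 (identical) to 1 (disjoint support)."""
    s = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(s) / math.sqrt(2)

# Toy context distributions for two words (rows of a normalized
# co-occurrence matrix):
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
d = hellinger(p, q)
```

Unlike Euclidean distance, the Hellinger distance is designed for probability distributions, which is why it suits rows of a normalized co-occurrence matrix.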

Modeling Semantic Relatedness using Global Relation Vectors

- Computer Science
- ArXiv
- 2017

A novel method which directly learns relation vectors from co-occurrence statistics is introduced, and it is shown how relation vectors can be naturally embedded into the resulting vector space.

Measuring Enrichment Of Word Embeddings With Subword And Dictionary Information

- Computer Science
- 2019

Results show that fine-tuning the vectors with semantic information dramatically improves performance in word similarity; conversely, enriching word vectors with subword information increases performance in word analogy tasks, with the hybrid approach finding a solid middle ground.

Modeling Context Words as Regions: An Ordinal Regression Approach to Word Embedding

- Computer Science
- CoNLL
- 2017

The underlying ranking interpretation of word contexts is sufficient to match, and sometimes outperform, the performance of popular methods such as Skip-gram; by using a quadratic kernel, the model can effectively learn word regions, which outperform existing unsupervised models on the task of hypernym detection.

Analyzing Structures in the Semantic Vector Space: A Framework for Decomposing Word Embeddings

- Computer Science
- ArXiv
- 2019

A framework for decomposing word embeddings into smaller meaningful units, called sub-vectors, is presented, which opens up a wide range of possibilities for analyzing phenomena in vector space semantics, as well as solving concrete NLP problems.

Word2Box: Learning Word Representation Using Box Embeddings

- Computer Science
- ArXiv
- 2021

This model takes a region-based approach to word representation, representing words as n-dimensional rectangles; these boxes support additional geometric operations such as intersection and containment, which allow the model to capture co-occurrence patterns that vectors struggle with.

PAWE: Polysemy Aware Word Embeddings

- Computer Science
- ICISDM '18
- 2018

This work develops a new word embedding model that can accurately represent such words by automatically learning multiple representations for each word, whilst remaining computationally efficient.

Fast PMI-Based Word Embedding with Efficient Use of Unobserved Patterns

- Computer Science
- AAAI
- 2019

A new word embedding algorithm is proposed that works on a smoothed Positive Pointwise Mutual Information (PPMI) matrix, obtained from the word-word co-occurrence counts, together with a kernel similarity measure for the latent space that can effectively calculate similarities in high dimensions.
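As a rough illustration of the smoothed PPMI matrix such methods start from, here is a hedged sketch; the toy counts and the α = 0.75 context-distribution smoothing are assumptions for illustration, and the paper's kernel similarity measure is not reproduced:

```python
import math
from collections import Counter

def smoothed_ppmi(counts, alpha=0.75):
    """Positive PMI over (word, context) counts, with context-distribution
    smoothing: p(c) is proportional to count(c) ** alpha."""
    total = sum(counts.values())
    w_count, c_count = Counter(), Counter()
    for (w, c), n in counts.items():
        w_count[w] += n
        c_count[c] += n
    z = sum(v ** alpha for v in c_count.values())
    ppmi = {}
    for (w, c), n in counts.items():
        p_wc = n / total
        p_w = w_count[w] / total
        p_c = c_count[c] ** alpha / z
        ppmi[(w, c)] = max(0.0, math.log(p_wc / (p_w * p_c)))
    return ppmi

# Toy co-occurrence counts (illustrative, not real corpus data):
counts = {("cat", "sat"): 4, ("cat", "mat"): 1,
          ("dog", "sat"): 2, ("dog", "ran"): 3}
M = smoothed_ppmi(counts)
```

Clipping negative PMI values to zero keeps the matrix sparse and non-negative, which is what downstream factorizations exploit.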

Distributed Representation of Words in Vector Space for Kannada Language

- Computer Science
- 2018 3rd International Conference on Computational Systems and Information Technology for Sustainable Solutions (CSITSS)
- 2018

A distributed representation for Kannada words is proposed using an optimal neural network model and combining various known techniques to improve the vector space representation.

Learning Word Vectors with Linear Constraints: A Matrix Factorization Approach

- Computer Science
- IJCAI
- 2018

Two new embedding models based on the singular value decomposition of lexical co-occurrences of words are proposed, which allow linear constraints to be injected when performing the decomposition, so that the desired semantic and syntactic information is preserved in the word vectors.

#### References

Showing 1–10 of 40 references.

Linguistic Regularities in Continuous Space Word Representations

- Computer Science
- NAACL
- 2013

The vector-space word representations that are implicitly learned by the input-layer weights are found to be surprisingly good at capturing syntactic and semantic regularities in language, and each relationship is characterized by a relation-specific vector offset.
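The relation-specific vector offset can be sketched with a toy example; the 3-d vectors and vocabulary below are made up for illustration, not taken from any trained model:

```python
import math

# Toy embeddings; real offsets come from trained models such as GloVe.
vecs = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.2, 0.2, 0.2],
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def analogy(a, b, c, vecs):
    """Answer 'a is to b as c is to ?' via the offset vec(b) - vec(a) + vec(c),
    returning the nearest word (by cosine) excluding the query words."""
    target = [vb - va + vc for va, vb, vc in zip(vecs[a], vecs[b], vecs[c])]
    candidates = [w for w in vecs if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vecs[w], target))
```

With real embeddings, `analogy("man", "king", "woman", vecs)` is the classic *king − man + woman ≈ queen* test; excluding the three query words from the candidate set is the standard evaluation convention.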

Linguistic Regularities in Sparse and Explicit Word Representations

- Computer Science
- CoNLL
- 2014

It is demonstrated that analogy recovery is not restricted to neural word embeddings, and that a similar amount of relational similarities can be recovered from traditional distributional word representations.

Better Word Representations with Recursive Neural Networks for Morphology

- Computer Science
- CoNLL
- 2013

This paper combines recursive neural networks, where each morpheme is a basic unit, with neural language models to consider contextual information in learning morphologically aware word representations, and proposes a novel model capable of building representations for morphologically complex words from their morphemes.

Efficient Estimation of Word Representations in Vector Space

- Computer Science
- ICLR
- 2013

Two novel model architectures for computing continuous vector representations of words from very large data sets are proposed, and it is shown that these vectors provide state-of-the-art performance on the authors' test set for measuring syntactic and semantic word similarities.

Improving Word Representations via Global Context and Multiple Word Prototypes

- Computer Science
- ACL
- 2012

A new neural network architecture is presented which learns word embeddings that better capture the semantics of words by incorporating both local and global document context, and accounts for homonymy and polysemy by learning multiple embeddings per word.

Distributed Representations of Words and Phrases and their Compositionality

- Computer Science, Mathematics
- NIPS
- 2013

This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.

Learning word embeddings efficiently with noise-contrastive estimation

- Computer Science
- NIPS
- 2013

This work proposes a simple and scalable new approach to learning word embeddings based on training log-bilinear models with noise-contrastive estimation, and achieves results comparable to the best ones reported, using four times less data and more than an order of magnitude less computing time.

Word Representations: A Simple and General Method for Semi-Supervised Learning

- Computer Science
- ACL
- 2010

This work evaluates Brown clusters, Collobert and Weston (2008) embeddings, and HLBL (Mnih & Hinton, 2009) embeddings of words on both NER and chunking, and finds that each of the three word representations improves the accuracy of these baselines.

A Neural Probabilistic Language Model

- Computer Science
- J. Mach. Learn. Res.
- 2000

This work proposes to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences.

Word Embeddings through Hellinger PCA

- Computer Science
- EACL
- 2014

This work proposes to drastically simplify the computation of word embeddings through a Hellinger PCA of the word co-occurrence matrix, and shows that it can provide an easy way to adapt embeddings to specific tasks.