On the Information Geometry of Word Embeddings
- Luigi Malagò (Romanian Institute of Science and Technology - RIST, Cluj-Napoca)
Word embeddings are a set of techniques commonly used in natural language processing to map the words of a dictionary to a real vector space. Such mapping is commonly learned through the contexts of the words in a text corpora, by the estimation of a set of conditional probability distributions - of a context word given the central word - for each word of the dictionary. These conditional probability distributions form a Riemannian statistical manifold, where word analogies can be computed through the comparison between vectors in the tangent bundle of the manifold. In this presentation we introduce a geometric framework for the study of word embeddings in the general setting of Information Geometry, and we show how the choice of the geometry allows to define different expressions for word similarities and word analogies. The presentation is based on a joint work with Riccardo Volpi.