To avoid the “meaning conflation deficiency” of word embeddings, a number of models have aimed to embed individual word
senses. These methods once performed well on tasks such as word sense induction (WSI), but have since been overtaken by task-specific techniques that exploit contextualized embeddings.
However, sense embeddings and contextualization need not be mutually exclusive. We introduce PolyLM, a method which formulates the task of learning sense embeddings as a language modeling problem, allowing contextualization techniques to be applied. PolyLM is based on two underlying assumptions about word senses: firstly, that the probability of a word occurring in a given context is equal to the sum of the probabilities of its individual senses occurring; and secondly, that for a given occurrence of a word, one of its senses tends to be much more plausible in the context than the others. We evaluate PolyLM on WSI,
showing that it performs considerably better than previous sense embedding techniques, achieving state-of-the-art performance on the SemEval-2010 and 2013 datasets.
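The two assumptions above can be illustrated with a small numerical sketch. The sense labels and probability values below are purely hypothetical, chosen only to show how a word's in-context probability decomposes as a sum over its senses, and how a single sense can dominate for a given occurrence:

```python
import numpy as np

# Hypothetical sense probabilities for one occurrence of the word "bank"
# in a financial context (labels and values are illustrative, not from the paper).
sense_probs = np.array([0.92, 0.05, 0.03])  # "financial institution", "riverside", "row/tilt"

# Assumption 1: the probability of the word occurring in this context
# equals the sum of the probabilities of its individual senses occurring.
word_prob = sense_probs.sum()

# Assumption 2: for a given occurrence, one sense tends to be far more
# plausible than the others; its share of the total should be close to 1.
dominant_share = sense_probs.max() / word_prob

print(word_prob, dominant_share)
```

Here the dominant sense accounts for over 90% of the word's probability mass, which is the disambiguation behavior the second assumption encourages.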
Alan Ansell is a PhD student in NLP at the University of Cambridge’s Language Technology Lab. He was previously a Master’s student at the University of Waikato under the supervision of Bernhard Pfahringer and Felipe Bravo-Marquez.