Gensim #

Installation #

Testing whether the fast version is installed:

#!python
>>> from gensim.models import word2vec
>>> assert word2vec.FAST_VERSION > -1

Models #

Phrases #

This model detects multi-word phrases that can be grouped into a single token, such as new_york_times. It can be used as a preprocessing step for word2vec or doc2vec models.

#!python
>>> import gensim
>>> from gensim.models import Word2Vec
>>> bigram_transformer = gensim.models.Phrases(sentences)
>>> model = Word2Vec(bigram_transformer[sentences], size=100, ...)
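To give a feel for how phrase detection of this kind can work, here is a minimal standard-library sketch. It is not gensim's implementation: the function name merge_bigrams, the toy corpus, and the exact scoring formula (a count-based score in the style commonly used for word2vec phrase detection) are assumptions made for illustration. It merges pairs only, so a three-word phrase such as new_york_times would need a second pass over the merged output.

```python
from collections import Counter


def merge_bigrams(sentences, min_count=1, threshold=1.0):
    """Join adjacent word pairs whose co-occurrence score exceeds threshold.

    Hypothetical helper for illustration; not part of gensim.
    """
    unigrams = Counter(w for s in sentences for w in s)
    bigrams = Counter(pair for s in sentences for pair in zip(s, s[1:]))
    vocab_size = len(unigrams)

    def score(a, b):
        # Assumed score: discounted pair count, normalised by the word counts.
        return (bigrams[(a, b)] - min_count) * vocab_size / (unigrams[a] * unigrams[b])

    merged = []
    for s in sentences:
        out, i = [], 0
        while i < len(s):
            # Greedy left-to-right merge of a high-scoring pair.
            if i + 1 < len(s) and score(s[i], s[i + 1]) > threshold:
                out.append(s[i] + "_" + s[i + 1])
                i += 2
            else:
                out.append(s[i])
                i += 1
        merged.append(out)
    return merged


# Toy corpus: "new york" co-occurs often enough to be merged.
sentences = [
    ["the", "new", "york", "times"],
    ["new", "york", "is", "big"],
    ["a", "new", "york", "subway"],
    ["the", "times", "is", "new"],
]
result = merge_bigrams(sentences, min_count=1, threshold=1.0)
print(result[0])
```

In this toy corpus only the pair (new, york) scores above the threshold, so the last sentence, where "new" appears without "york", is left unchanged.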

word2vec #

Let V be the size of the vocabulary and N the dimension of the hidden layer (the word-vector dimension).

  • model.syn0: \( V \times N \) matrix. model.syn0[wordindex] returns the vector of the word whose vocabulary index is wordindex.
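The row-per-word layout can be sketched with a toy stand-in for the matrix. This is not gensim code: the names vocab, word_to_index, and word_vector are hypothetical, the matrix is a plain list of lists rather than a NumPy array, and the random initial values are arbitrary.

```python
import random

vocab = ["king", "queen", "apple"]   # toy vocabulary, V = 3
N = 4                                # vector dimension
V = len(vocab)

# Hypothetical lookup table: word -> row index in the matrix.
word_to_index = {w: i for i, w in enumerate(vocab)}

# Toy V x N matrix standing in for model.syn0 (values are arbitrary).
rng = random.Random(0)
syn0 = [[rng.uniform(-0.5, 0.5) / N for _ in range(N)] for _ in range(V)]


def word_vector(word):
    """Return the row of the matrix that belongs to `word`."""
    return syn0[word_to_index[word]]


vec = word_vector("queen")
print(len(syn0), len(vec))
```

Each word maps to exactly one row, so looking up a word vector is a single index operation on the matrix.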

doc2vec #
