# Skip-gram

The objective of the skip-gram model is to find word representations that are useful for predicting each word's 'contexts', that is, the words surrounding it.

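Given a sequence of training words $$w_1, w_2, \ldots, w_T$$, the model maximizes the average log probability

$$\frac{1}{T} \sum_{t=1}^{T} \sum_{-c \le j \le c, j \neq 0} \log p(w_{t+j} | w_t)$$

where $$c$$ is the size of the training context (the number of words considered on each side of $$w_t$$).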
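
To make the double sum concrete, here is a minimal Python sketch that enumerates the (center, context) pairs and evaluates the average log probability for a given window size `c`. The names `skipgram_pairs` and `average_log_prob`, and the `prob` callback standing in for $$p(w_{t+j} | w_t)$$, are hypothetical and chosen for illustration only. The sketch also assumes that offsets falling outside the sequence are simply skipped, which the formula itself does not specify.

```python
import math
from typing import Callable, Iterator, Sequence, Tuple


def skipgram_pairs(words: Sequence[str], c: int) -> Iterator[Tuple[str, str]]:
    """Yield (center, context) pairs for each position t and offset j,
    where -c <= j <= c and j != 0."""
    T = len(words)
    for t in range(T):
        for j in range(-c, c + 1):
            # Assumption: skip the center word itself and any offset that
            # points outside the sequence.
            if j == 0 or not (0 <= t + j < T):
                continue
            yield words[t], words[t + j]


def average_log_prob(
    words: Sequence[str],
    c: int,
    prob: Callable[[str, str], float],
) -> float:
    """Sum log prob(context, center) over all pairs and divide by T.

    `prob(context, center)` is a placeholder for p(w_{t+j} | w_t); how that
    probability is modeled is outside the scope of this sketch.
    """
    total = sum(
        math.log(prob(context, center))
        for center, context in skipgram_pairs(words, c)
    )
    return total / len(words)


# Usage: a toy sentence with a uniform placeholder distribution over 4 words.
sentence = ["the", "quick", "brown", "fox"]
pairs = list(skipgram_pairs(sentence, c=1))
objective = average_log_prob(sentence, c=1, prob=lambda context, center: 0.25)
```

With `c=1` the four-word sentence produces six pairs, since the first and last words each have only one neighbor. Training would adjust the parameters behind `prob` to push `objective` upward; the uniform placeholder here just exercises the summation.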