Word2Vec Misleadings

Distributional semantics is all the rage, and it’s very cool, and, omg, it’s a bit of a fraud!

Just to be clear: there is nothing wrong with the algorithm itself! It is conceptually very interesting and works very well in a lot of cases. Done right, it can give a decent representation of word similarity or meaning. But the “King – Man + Woman = Queen” example far overstates what the algorithm is actually capable of.

Here are some reasons why I think we should stop using that classic example to introduce Word2Vec:

1) It turns out that for the example to work at all, you have to ‘cheat’ a little. The actual result is King – Man + Woman = King: the resulting vector is more similar to King than to Queen. The widely known example only works because the implementation excludes the input words from the candidate results! King, the true nearest neighbour of King – Man + Woman, gets thrown out, and Queen, which merely comes second, is what the routine then reports (the sketch below shows both queries). Quite disappointing, isn’t it?
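
Here is a minimal sketch of both queries, assuming gensim ≥ 4 and one of the small pretrained GloVe models from its downloader catalogue (the model name is an arbitrary example; any decent set of word vectors shows the same mechanics):

```python
# A minimal sketch, assuming gensim >= 4 and an arbitrary small GloVe model.
import numpy as np
import gensim.downloader as api

model = api.load("glove-wiki-gigaword-50")

# Built-in analogy query: gensim quietly EXCLUDES the input words
# ('king', 'man', 'woman') from the candidates, so 'queen' can win.
print(model.most_similar(positive=["king", "woman"], negative=["man"], topn=3))

# The same arithmetic by hand, with no exclusion. (gensim also
# unit-normalises the inputs first; the exclusion is the part that
# matters here.)
target = model["king"] - model["man"] + model["woman"]
target /= np.linalg.norm(target)

# Rank the whole vocabulary by cosine similarity to the target vector.
sims = model.vectors @ target / np.linalg.norm(model.vectors, axis=1)
for idx in np.argsort(-sims)[:3]:
    print(model.index_to_key[idx], round(float(sims[idx]), 3))
# With vectors behaving as described above, 'king' itself ranks first
# and 'queen' only second.
```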

Why yes, yes it is disappointing. (I still wonder how close ‘Queen’ is to ‘King’ with no subtraction at all. Or under random subtractions from ‘King’. The snippet below pokes at both.)
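
A rough way to check, reusing the `model` object from the sketch above (note that `similar_by_vector` excludes nothing, so ‘king’ itself will typically dominate the neighbour list):

```python
# Follow-up questions, reusing `model` from the previous sketch.
import random

# How close is 'queen' to 'king' with no arithmetic at all?
print(model.similarity("king", "queen"))

# And where does 'king' land after subtracting a random word?
random.seed(0)  # only to make the example repeatable
noise = random.choice(model.index_to_key[:5000])  # an arbitrary frequent word
print(noise, model.similar_by_vector(model["king"] - model[noise], topn=3))
```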

Gah! This drives me nuts. Please don’t mislead in this way!
