The Null Device

How Google handles synonyms

A Google engineer writes about how Google's search engine attempts to understand synonyms:
We use many techniques to extract synonyms, that we've blogged about before. Our systems analyze petabytes of web documents and historical search data to build an intricate understanding of what words can mean in different contexts. In the above example "photos" was an obvious synonym for "pictures," but it's not always a good synonym. For example, it's important for us to recognize that in a search like [history of motion pictures], "motion pictures" means something special (movies), and "motion photos" doesn't make any sense. Another example is the term "GM." Most people know the most prominent meaning: "General Motors." For the search [gm cars], you can see that Google bolds the phrase "General Motors" in the search results. This is an indication that for that search we thought "General Motors" meant the same thing as "GM." Are there any other meanings? Many people can think of the second meaning, "genetically modified," which is bolded when GM is used in queries about crops and food, like in the search results for [gm wheat]. It turns out that there are more than 20 other possible meanings of the term "GM" that our synonyms system knows something about. GM can mean George Mason in [gm university], gamemaster in [gm screen star wars], Gangadhar Meher in [gm college], general manager in [nba gm] and even gunners mate in [navy gm].

There are 1 comments on "How Google handles synonyms":

Posted by: arthur Tue Jan 26 14:01:21 2010

The amazing thing is that on Google Taiwan, searching for "wow" will also search for "World of Warcraft" in Chinese characters.