What is Yahoo Phrase Based Indexing?
Recently, Yahoo published a patent application that explains how the company analyzes web pages for related keyword phrases. Yahoo identifies several likely phrases from the content of a web page and matches those phrases against a concept dictionary. The patent application suggests that it is beneficial to use these related keywords close together on a web page so that search engines can find the relation between them more easily.
Currently, search engines rank results using algorithms whose criteria include the number of times the queried term appears on a web page, how close together the terms are on the page, and where on the page the terms appear. The problem with this method is that it doesn’t factor in the context of the search terms in relation to other words on the page.
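To make those three criteria concrete, here is a minimal sketch in Python. It is not taken from the patent application or from any search engine; the function name `term_signals` and the simple word tokenizer are assumptions made for illustration only.

```python
import re


def tokenize(text):
    # Lowercase the text and split it into simple word tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def term_signals(page_text, query):
    """Toy version of the three term-based criteria (hypothetical helper)."""
    tokens = tokenize(page_text)
    terms = sorted(set(tokenize(query)))
    # Positions at which each query term occurs on the page.
    positions = {t: [i for i, tok in enumerate(tokens) if tok == t] for t in terms}
    # 1. How many times the queried terms appear.
    frequency = sum(len(p) for p in positions.values())
    # 2. How close together two different query terms are (smallest gap in words).
    gaps = [abs(i - j)
            for a in terms for b in terms if a < b
            for i in positions[a] for j in positions[b]]
    min_gap = min(gaps, default=None)
    # 3. Where on the page the terms first appear.
    first_position = min((p[0] for p in positions.values() if p), default=None)
    return {"frequency": frequency, "first_position": first_position, "min_gap": min_gap}


page = "Jaguar cars are fast. The jaguar is also a big cat."
print(term_signals(page, "jaguar cars"))
```

Notice that nothing in these signals tells the engine whether a page about "jaguar" is about the car or the animal. That missing context is the gap described above.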
Yahoo’s patent application attempts to determine the context of the search terms by treating them as a concept or phrase and looking at how that concept is associated with other related phrases on the page. This helps determine the most appropriate pages to return for a search query.
One of the ways Yahoo is trying to do this is by establishing meaningful phrases or concepts. A page’s text and tags let a search engine know what that page is about. Yahoo sends the text and tags to a program that uses an aboutness extractor to break down the text, match it against keyword phrases in a concept dictionary and see whether those phrases are listed as concepts. A context dictionary keeps information that identifies related concepts, which are one or more keywords associated with a given concept. This understanding of concepts can help search engines better understand what a searcher is looking for.
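The idea of a concept dictionary and a context dictionary can be sketched roughly as follows. This is only an illustration under stated assumptions, not Yahoo's implementation: the dictionary contents, the names `extract_concepts` and `related_support`, and the exact-phrase matching are all hypothetical.

```python
import re

# Hypothetical concept dictionary: phrases that count as concepts.
CONCEPT_DICTIONARY = {"digital camera", "memory card", "lens"}

# Hypothetical context dictionary: keywords related to each concept.
CONTEXT_DICTIONARY = {
    "digital camera": {"megapixel", "zoom", "memory card"},
    "memory card": {"storage", "gigabyte"},
}


def tokenize(text):
    # Lowercase the text and split it into simple word tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_concepts(text, max_words=3):
    """Return every phrase of up to max_words words that is a listed concept."""
    tokens = tokenize(text)
    found = set()
    for n in range(1, max_words + 1):
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in CONCEPT_DICTIONARY:
                found.add(phrase)
    return found


def related_support(text):
    """For each concept on the page, list which related keywords also appear."""
    joined = " " + " ".join(tokenize(text)) + " "
    support = {}
    for concept in extract_concepts(text):
        related = CONTEXT_DICTIONARY.get(concept, set())
        # Pad with spaces so only whole words or phrases match.
        support[concept] = {r for r in related if f" {r} " in joined}
    return support


page = "This digital camera has a 10x zoom and ships with a memory card."
print(related_support(page))
```

In this toy version, a page where "digital camera" appears alongside related keywords such as "zoom" has more supporting context than a page that mentions the phrase in isolation.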
The patent application covers quite a bit more on how Yahoo plans to discover better results in searches. If you are interested in learning more you can see the patent here.