Microsoft
The titles of two Microsoft patent applications are very similar, but the processes they describe aren't. The first looks at the anchor text in links, and the titles of the pages those links point to, to see if the anchor text is accurate. The second looks at links on pages, using the Document Object Model, and tries to determine whether they are valid links while simulating the experience of a user viewing the page in a browser. This may help a search engine understand dynamic HTML menus, and find links that may otherwise be unavailable to a search engine crawler.
Methods and apparatus for the evaluation of aspects of a web page
Inventors: Michael A. Starbird
Assigned to Microsoft
US Patent Application 20060150076
Published July 6, 2006
Filed on December 30, 2004
Abstract
Methods and apparatus are provided for evaluating the extent to which link text, representing a hypertext link on a web page, corresponds to a web page referenced by the link. In one embodiment, the link text may be compared to the title of a web page referenced by the link, such as by parsing the link text and page title into individual tokens and comparing the tokens. The extent to which the link text and the page title correspond may be expressed as a percentage of tokens which match. A graphical user interface (GUI) may be provided which presents a visual indication when a minimum percentage of tokens do not match.
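To make the token comparison in the abstract concrete, here is a minimal sketch in Python. It is an illustration only, not the method claimed in the patent filing. The function names (`tokenize`, `match_percentage`, `should_flag`), the lowercase alphanumeric tokenization, the choice to measure the share of link text tokens found in the title, and the 50 percent threshold are all assumptions made for the example. The abstract does not specify any of them.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match_percentage(link_text, page_title):
    """Percentage of link-text tokens that also appear in the page title.

    Using the link text as the denominator is an assumption; the abstract
    only says the result may be expressed as a percentage of matching tokens.
    """
    link_tokens = tokenize(link_text)
    if not link_tokens:
        return 0.0
    title_tokens = set(tokenize(page_title))
    matched = sum(1 for token in link_tokens if token in title_tokens)
    return 100.0 * matched / len(link_tokens)


def should_flag(link_text, page_title, minimum_percent=50.0):
    """True when the match falls below a hypothetical minimum threshold."""
    return match_percentage(link_text, page_title) < minimum_percent


if __name__ == "__main__":
    examples = [
        ("Contact Us", "Contact Us - Example Company"),
        ("Click here", "Annual Report 2005"),
    ]
    for link_text, title in examples:
        pct = match_percentage(link_text, title)
        print(link_text, "->", title, ":", pct, should_flag(link_text, title))
```

Under these assumptions, generic anchor text such as "Click here" shares no tokens with the target page's title and would be flagged, while anchor text that repeats words from the title would pass.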

