How Do Search Engines Work?

Search engines are the key to finding specific information on the vast
expanse of the World Wide Web. Without sophisticated search engines, it
would be virtually impossible to locate anything on the Web without knowing
a specific URL.
But do you know how search engines work? And do you know what makes some
search engines more effective than others?

When people use the term search engine in relation to the Web, they are
usually referring to the search forms that search through databases of HTML
documents initially gathered by a robot.

There are basically three types of search engines: those that are powered by
robots (called crawlers, ants, or spiders), those that are powered by human
submissions, and those that are a hybrid of the two.

Crawler-based search engines use automated software agents (called crawlers)
that visit a Web site, read the information on the site, read the site's
meta tags, and follow the links that the site connects to, indexing all
linked Web sites as well. The crawler returns all that information to a
central depository, where the data is indexed. The crawler will periodically
return to the sites to check for any information that has changed. The
frequency with which this happens is determined by the administrators of the
search engine.
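The crawl loop described above can be sketched in a few lines. This is only a minimal illustration, not how any particular engine works: the `SITE` dictionary is a made-up, in-memory stand-in for the Web (nothing is fetched over the network), and the names `crawl` and `depository` are hypothetical.

```python
from collections import deque

# Hypothetical stand-in for the Web: URL -> (page text, outgoing links).
SITE = {
    "a.example/": ("welcome to the home page", ["a.example/news", "b.example/"]),
    "a.example/news": ("news about search engines", ["a.example/"]),
    "b.example/": ("a page about spiders", []),
}

def crawl(start_url, pages):
    """Visit start_url, follow links breadth-first, and return {url: text}."""
    depository = {}              # the "central depository" the crawler fills
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in depository or url not in pages:
            continue             # already visited, or the page does not exist
        text, links = pages[url]
        depository[url] = text   # read the information on the page
        queue.extend(links)      # follow the links the page connects to
    return depository

depository = crawl("a.example/", SITE)
```

Running `crawl` again at a later time corresponds to the periodic return visit: any page whose text changed in `SITE` would be picked up on the next pass.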
Human-powered search engines rely on humans to submit information that is
subsequently indexed and catalogued. Only information that is submitted is
put into the index.

In every case, when you query a search engine to locate information, you are
actually searching through the index that the search engine has created; you
are not searching the Web itself. These indices are giant databases of
information that is collected, stored, and subsequently searched. This
explains why a search on a commercial search engine, such as Yahoo! or
Google, will sometimes return results that are, in fact, dead links. Since
the search results are based on the index, if the index hasn't been updated
since a Web page became invalid, the search engine treats the page as still
an active link even though it no longer is. It will remain that way until
the index is updated.

So why will the same search on different search engines produce different
results? Part of the answer is that not all indices are exactly the same;
each depends on what the spiders found or what the humans submitted. But
more importantly, not every search engine uses the same algorithm to search
through the indices. The algorithm is what a search engine uses to determine
the relevance of the information in the index to what the user is searching
for.
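To make the idea of searching an index rather than the Web concrete, here is a small sketch of an inverted index, a common data structure for this kind of lookup (the text does not say which structure any given engine uses). The sample pages and the names `build_index` and `search` are invented for illustration. A query is answered from the stored index alone, so a page removed after indexing still shows up until the index is rebuilt.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word of the query, sorted."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)

# Hypothetical crawled pages.
pages = {
    "a.example/": "search engines index the web",
    "b.example/": "spiders crawl the web",
}
index = build_index(pages)

# The page is later removed from the live site, but the index is not refreshed.
del pages["b.example/"]
stale_results = search(index, "crawl web")   # still lists the dead link

index = build_index(pages)                   # re-indexing drops it
fresh_results = search(index, "crawl web")
```

Two engines that crawled different pages would build different `index` contents, which is one reason the same query can return different results.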
One of the elements that a search engine algorithm scans for is the
frequency and location of keywords on a Web page. Pages on which a keyword
appears more often are typically considered more relevant. But search engine
technology is becoming more sophisticated in its attempt to discourage what
is known as keyword stuffing, or spam indexing.
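A toy scoring function can illustrate how frequency and location might be combined. The title weight, the cap on repeated keywords, and the name `keyword_score` are all assumptions chosen for this example; real engines do not publish their formulas.

```python
def keyword_score(title, body, keyword, title_weight=3.0, max_count=3):
    """Score a page for one keyword using frequency and location.

    A match in the title counts more than a match in the body (location),
    and body matches are capped so that repeating a word many times stops
    helping (a crude guard against keyword stuffing).
    """
    keyword = keyword.lower()
    title_hits = title.lower().split().count(keyword)
    body_hits = body.lower().split().count(keyword)
    return title_weight * title_hits + min(body_hits, max_count)

# Hypothetical pages: one uses the keyword naturally, one repeats it 50 times.
honest = keyword_score("Cheap flights guide", "how to find cheap flights online", "cheap")
stuffed = keyword_score("Travel", "cheap " * 50, "cheap")
```

With these assumed weights the stuffed page ends up with a lower score than the honest one, because the cap limits how much repetition can contribute.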
Another common element that algorithms analyze is the way that pages link
to other pages on the Web. By analyzing how pages link to each other, an
engine can determine both what a page is about (if the keywords of the
linked pages are similar to the keywords on the original page) and whether
that page is considered "important" and deserving of a boost in ranking.
Just as the technology is becoming increasingly sophisticated at ignoring
keyword stuffing, it is also becoming more savvy to webmasters who build
artificial links into their sites in order to build an artificial ranking.
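One well-known way to turn link structure into an importance score is an iterative, PageRank-style calculation. The sketch below is a simplified version of that idea, offered as an illustration and not as the method any particular engine uses. The link graph, the damping factor of 0.85, the iteration count, and the name `link_scores` are assumptions.

```python
def link_scores(links, damping=0.85, iterations=50):
    """Simplified PageRank-style scores for a {page: [linked pages]} graph.

    Assumes every linked page also appears as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    scores = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_scores = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            # A page with no outgoing links spreads its score over all pages.
            targets = outgoing if outgoing else pages
            share = damping * scores[page] / len(targets)
            for target in targets:
                new_scores[target] += share
        scores = new_scores
    return scores

# Hypothetical graph: "a" and "b" both link to "c"; "c" links back to "a".
graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
scores = link_scores(graph)
most_important = max(scores, key=scores.get)
```

In this graph the page with the most incoming links receives the highest score, and a page that nothing links to receives the lowest. Detecting artificial links is a separate problem that this sketch does not attempt.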