Google Rolls Out New Search Infrastructure

By Aviran Mordo @ 10:23 pm

Google Inc. has begun a steady rollout of “Bigdaddy,” a new infrastructure for Google Web search that will eventually be in all of the company’s data centers.

The Mountain View, Calif., search engine recently converted a third data center to Bigdaddy, and hopes to switch over a new one every 10 days or so, Google senior engineer Matt Cutts wrote in his blog.

The biggest problems Google plans to fix with Bigdaddy are hijacking redirects of URLs and what Google calls “canonicalization.”

The latter refers to a search engine determining the preferred domain name of a site. Web sites regularly have multiple domain names, but only one is the preferred, or canonical, name. For example, both www.techweb.com and techweb.com will get you to the CMP Media tech site, but the former is the actual domain name.

Because it’s difficult for search engines to figure out that multiple names are for the same site, results often include multiple listings, when only one would do.
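To make the idea concrete, here is a minimal Python sketch of canonicalization as described above. It is not how Google does it; the PREFERRED_HOSTS table and both function names are assumptions made up for this illustration, and a real search engine has to infer the preferred name rather than look it up.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table mapping alias hostnames to the preferred (canonical) host.
PREFERRED_HOSTS = {"techweb.com": "www.techweb.com"}


def canonicalize(url):
    """Rewrite a URL so that every known alias points at the preferred host."""
    parts = urlsplit(url)
    # .hostname is already lowercased; note that it also drops any port number.
    host = parts.hostname or ""
    host = PREFERRED_HOSTS.get(host, host)
    # Treat an empty path and "/" as the same page.
    path = parts.path or "/"
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


def dedupe(urls):
    """Collapse listings that canonicalize to the same URL, keeping first-seen order."""
    seen = set()
    unique = []
    for url in urls:
        canonical = canonicalize(url)
        if canonical not in seen:
            seen.add(canonical)
            unique.append(canonical)
    return unique
```

With this table, dedupe(["http://techweb.com", "http://www.techweb.com/"]) returns a single listing for the www name instead of two.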

Hijacking refers to someone redirecting a request for another site to his own Web site. URLs for sites constantly change for a variety of reasons, including people getting new domain names, a site reorganization or a new content delivery system. Because many visitors will continue to use the old URL, Webmasters will set up a server-side redirect so the old URL points to the new one. Hijackers are sometimes able to intervene and steer the traffic to their own sites.
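For readers who have not set one up, a server-side redirect of the kind mentioned above can be sketched in a few lines of Python using the standard library. The MOVED table, the example.com URLs and the helper names are assumptions for illustration only; production sites usually configure this in the Web server itself rather than in application code.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical table of retired paths and the URLs that replaced them.
MOVED = {"/old-page.html": "http://www.example.com/new-page.html"}


def redirect_target(path):
    """Return the new URL for a retired path, or None if the path has not moved."""
    return MOVED.get(path)


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path is the raw request target, including any query string.
        target = redirect_target(self.path)
        if target is None:
            self.send_error(404, "Not Found")
            return
        # 301 tells browsers and crawlers that the move is permanent.
        self.send_response(301)
        self.send_header("Location", target)
        self.end_headers()


def make_server(port=8000):
    """Build a local server that answers requests with RedirectHandler."""
    return HTTPServer(("127.0.0.1", port), RedirectHandler)
```

Calling make_server().serve_forever() starts it locally; a request for /old-page.html then receives a 301 response whose Location header names the new URL.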

Bigdaddy is also expected to help reduce Webspam: pages composed of advertisements and links to other sites that contain mostly ads. These pages, which often appear in search results, pretend to provide assistance or facts about a particular subject.

Source: InformationWeek
