4/15/2008

Google dips toes into ‘deep Web’ search

Filed under: — Aviran Mordo

Google’s ever-active search bots, which scour the Web constantly for new pages, have begun a new, more active phase of their indexing jobs.

In a blog post Friday, Jayant Madhavan and Alon Halevy of Google’s crawling and indexing team said the company has begun an experiment in which its indexing software enters text in Web site forms to see what previously undiscovered pages may appear.

“In the past few months, we have been exploring some HTML forms to try to discover new Web pages and URLs that we otherwise couldn’t find and index for users who search on Google,” they wrote. “This experiment is part of Google’s broader effort to increase its coverage of the Web. In fact, HTML forms have long been thought to be the gateway to large volumes of data beyond the normal scope of search engines.”
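The general idea of probing a form can be sketched in a few lines. This is only an illustration, not Google's crawler: it assumes the form submits via GET and has plain text inputs, and the names `FormFinder` and `candidate_urls`, the example URL, and the probe words are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class FormFinder(HTMLParser):
    """Collect each form's action, method, and text-input names."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form":
            self._current = {
                "action": a.get("action") or "",
                "method": (a.get("method") or "get").lower(),
                "fields": [],
            }
        elif tag == "input" and self._current is not None:
            # An input with no type attribute defaults to a text field.
            if (a.get("type") or "text").lower() == "text" and a.get("name"):
                self._current["fields"].append(a["name"])

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def candidate_urls(page_url, html, words):
    """Build the URLs a GET form would produce for each probe word."""
    finder = FormFinder()
    finder.feed(html)
    urls = []
    for form in finder.forms:
        # Assumption of this sketch: skip POST forms and forms without text fields.
        if form["method"] != "get" or not form["fields"]:
            continue
        base = urljoin(page_url, form["action"])
        for word in words:
            query = urlencode({name: word for name in form["fields"]})
            urls.append(f"{base}?{query}")
    return urls
```

Each URL returned this way could then be fetched like any ordinary link, which is what would make the pages behind the form reachable to an indexer.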

Yahoo moving to new Web-crawler software

Filed under: — Aviran Mordo

Yahoo has begun indexing the World Wide Web with its third-generation software, Slurp 3.0, the company said Monday.

“With everything now in place, the rollout has officially begun,” Sharad Verma and Yoram Arnon said in a posting to Yahoo’s search blog on Monday.

Unlike top search rival Google, which on Friday revealed its indexing software now is trying to uncover previously hidden pages by filling in Web pages’ forms, Yahoo didn’t detail what’s new with its indexing software.

The company did advise those who watch for indexing software (sometimes called bots, crawlers, or spiders) visiting their sites to update their methodology from the Slurp 2.0 days.

Oklahoma Leaks 10,000 Social Security Numbers

Filed under: — Aviran Mordo

Apparently the folks at the Department of Corrections of Oklahoma just forgot to use common sense when they created the state’s Sexual and Violent Offender Registry.

By putting SQL queries in the URLs, they not only leaked the personal data of tens of thousands of people, but enabled literally anyone with basic SQL knowledge to put his neighbor/boss/enemies on the sexual offender list. Fortunately, after the author of the blog The Daily WTF notified the department about the issue, the site went down for ‘routine maintenance’ on April 13, 2008.
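For contrast, the usual way to avoid this class of problem is to keep the SQL on the server and pass only values from the request, bound as parameters. The following is a minimal sketch using Python's built-in sqlite3 module; the `offenders` table, its columns, and the function names are hypothetical and have nothing to do with the actual registry's schema.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE offenders (id INTEGER PRIMARY KEY, last_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO offenders (last_name) VALUES (?)",
        [("Smith",), ("Jones",)],
    )
    return conn


def find_by_last_name(conn, last_name):
    """Look up rows by last name using a bound parameter."""
    # The query text is fixed on the server; the ? placeholder treats the
    # visitor's input strictly as data, so it is never parsed as SQL.
    cur = conn.execute(
        "SELECT id, last_name FROM offenders WHERE last_name = ?",
        (last_name,),
    )
    return cur.fetchall()
```

With this shape, a value such as `x' OR '1'='1` is simply compared against the `last_name` column and matches nothing, rather than rewriting the query.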

Google enlists video ID tools to combat child porn

Filed under: — Aviran Mordo

Google Inc is enlisting the same image-recognition technology the company uses to trace copyright violations on its YouTube video site to fight online child pornography, the company said on Monday.

Google said it is working with the National Center for Missing & Exploited Children (NCMEC) of Alexandria, Virginia to help automate and streamline how child protection workers troll through millions of pornographic images to identify victims of abuse.

The project is applying so-called video fingerprinting technology, which Google has been urging media copyright holders to adopt as a means for policing widespread piracy of professionally created video programming on the Web.

A small team of Google engineers has worked for more than a year with federal agencies and NCMEC’s analysts in its Child Victim Identification Program to create software to automate the review of some 13 million pornographic images and videos that analysts at the center previously had to review manually.
