Google's war against scraper sites continues
- 28 August, 2011 04:03
Google appears to be getting ready to launch another offensive against website scrapers.
Scraper sites are usually operated by spammers, who fill them almost entirely with content copied from other websites. By doing so, they hope to exploit the popularity of material from the original content makers, steering search engine traffic to their own sites to make money through advertising.
"Scrapers getting you down? Tell us about blog scrapers you see... We need datapoints for testing," Google's web spam leader Matt Cutts said in a recent tweet.
Cutts' war cry suggests that Google feels more effort is needed to combat scrapers.
Along with his tweet, Cutts included a link to a form that allows web surfers to report scraper pages to Google. Some of that information may be used to test and improve Google's algorithm, the company said.
The form asks for the text of the search query that produced the scraping problem (such as a scraper site outranking an original content site), as well as the URLs of the original content site and the scraper site. There's also a form field for any additional comments.
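For readers who want to collect the same details before filing a report, the four fields described above could be modeled roughly as follows. This is only a sketch: the `ScraperReport` class, its field names and the `problems` helper are hypothetical illustrations, not Google's form or any Google API.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urlparse


@dataclass
class ScraperReport:
    """Hypothetical record mirroring the four fields the form asks for."""
    query: str          # search query that surfaced the scraping problem
    original_url: str   # page where the content first appeared
    scraper_url: str    # page that copied the content
    comments: str = ""  # optional free-text notes


def problems(report: ScraperReport) -> List[str]:
    """Return a list of issues found; an empty list means the report looks usable."""
    found = []
    if not report.query.strip():
        found.append("query is empty")

    hosts = []
    for label, url in (("original", report.original_url),
                       ("scraper", report.scraper_url)):
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            found.append(label + " URL is not a valid http(s) URL")
        hosts.append(parts.netloc.lower())

    # A scraper normally lives on a different host than the site it copies.
    if hosts[0] and hosts[0] == hosts[1]:
        found.append("original and scraper URLs are on the same host")
    return found
```

Checking a report before submitting it catches the most common slips, such as pasting a URL without its scheme or listing the same site twice.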
Some scrapers are so successful in what they do that their sites achieve higher search engine rankings than the sites of the content makers from whom they pinch their material. Google attempted to correct that situation in January, when it made changes to its top-secret search algorithm aimed, among other things, at addressing the scraping problem.
Scraping, along with search results poisoning, has long been a problem with search engine results, although Google has steadfastly defended the quality of its results, saying they are better than they have ever been in terms of relevance, freshness and comprehensiveness.
Earlier this year, Google announced changes to its algorithm, including filter changes. The filter changes, referred to as "Panda," didn't quell the problem. Quite the contrary, they may have made it worse. "We've experienced a significant drop in our traffic (almost 35%) as a result of this change (with an equivalent drop in revenue)," wrote one webmaster after the change took effect. "We believe that our only crime is that we host user-generated content."
Google took another crack at the scraping problem in June, when it rolled out version 2.2 of the Panda filters. Reviews of that move appear to be mixed.
With this latest effort by Google to gather information on scraper sites, maybe the next version of Panda will finally put the issue to bed.
Follow freelance technology writer John P. Mello Jr. and Today@PCWorld on Twitter.