Tuesday, May 24, 2011

Google Shuts Down Ambitious Newspaper Scanning Project

Jill Hurst-Wahl reported that Google is shutting down one of its digitization efforts.  In a statement to Search Engine Land, a Google spokesperson said:
Users can continue to search digitized newspapers at http://news.google.com/archivesearch, but we don’t plan to introduce any further features or functionality to the Google News Archives and we are no longer accepting new microfilm or digital files for processing.
Google's efforts were in partnership with several North American newspapers, ProQuest and Heritage Microfilm, according to a 2008 news report.

In reporting on Google's decision, the Boston Phoenix wrote:
News Archive was generally a good deal for newspapers -- especially smaller ones like ours, who couldn't afford the tens or hundreds of thousands of dollars it would have cost to digitally scan and index our archives -- and a decent bet for Google. It threaded a loophole for newspapers, who, in putting pre-internet archives online, generally would have had to sort out tricky rights issues with freelancers -- but were thought to have escaped those obligations due to the method with which Google posted the archives. (Instead of posting the articles as pure text, Google posted searchable image files of the actual newspaper pages.) Google reportedly used its Maps technology to decipher the scrawl of ancient newsprint and microfilm; but newspapers are infamously more difficult to index than books, thanks to layout complexities such as columns and jumps, which require humans or intense algorithmic juju to decode. Here's two wild guesses: the process may have turned out to be harder than Google anticipated. Or it may have turned out that the resulting pages drew far fewer eyeballs than anyone expected.
The lesson is that jumping on the Google bandwagon can be a good thing, as long as the wagon keeps moving. It is a lesson that those involved in Microsoft's book digitization program also learned the hard way.
