Resurfaced story raises questions on Google search

[12 September 2008]

By Benjamin Pimentel

MarketWatch (MCT)

SAN FRANCISCO - It has long been the popular term for lightning-speed Web searches. But could “Google” eventually come to mean stumbling accidentally and disastrously onto misleading reports and information?

The fracas over a 6-year-old story about United Airlines’ bankruptcy filing that got passed around as though it were fresh news, triggering a massive sell-off in the airline parent’s stock, has raised questions about the reliability of Google’s vaunted search technology.

In addition, The Wall Street Journal reported in its online edition Thursday that the Securities and Exchange Commission has started looking into the stock drop for UAL Corp. Regulators have been stepping up efforts to fight false rumors that affected financial stocks in particular earlier this year, the report said.

Some analysts say the scenario with the United story shows the limitations of automated search technology - based on tools called “bots” and “scrapers” - and highlights the need for a more precise system for identifying information, especially news reports.

“While the technology is there, it’s just an algorithm, and the algorithm did not go to journalism school,” said Ed Keating, vice president of the Software & Information Industry Association.

Crawford Del Prete of the International Data Corp. agreed, describing the comedy of errors as a reminder that “while ‘bots’ and ‘scrapers’ can efficiently capture information, we still have a long way to go to filter out the timely and relevant from the not timely and relevant.”

“Just because the information came from a newspaper site does not make it current news,” he commented. “This is a black eye for Google.”

But Google defended itself, maintaining that it is up to media organizations and publishers to make sure that the information on their sites is complete and accurate.

“Our mission is to organize the information and make it accessible and usable,” said company spokesman Gabriel Stricker. “Their mission is to create that information. The onus is on the publishers to make sure that the information is accurate.”

The problem began when the bankruptcy story, originally published by the Chicago Tribune, was made available on the Florida Sun Sentinel Web site through a Google news search. It was redistributed by a Florida investment-research firm via Bloomberg’s terminals service. Bloomberg’s news service saw the report and sent a headline to its subscribers, citing the Sun Sentinel.

The story caused the airline company’s stock to plummet more than 70 percent, and trading in its shares on the Nasdaq stock market was halted at the request of Chicago-based UAL. The shares eventually recovered after word went out that the report was old news. United exited bankruptcy in February 2006.

Tribune Co. has blamed Google for the error, citing “the inability of Google’s automated search agent ‘Googlebot’ to differentiate between breaking news and frequently viewed stories on the Web sites of its newspapers.”

The media company said a single visit to the bankruptcy story in the wee hours Sunday morning, when traffic to the newspaper’s site was low, pushed the item onto the list of popular articles. Google’s search technology then picked up the story as breaking news, Tribune said.

Tribune also said it had identified problems with Google’s search technology months ago and had asked the search giant to stop using Googlebot to “crawl” its newspaper sites. The company added it believes Google nevertheless “continues to misclassify stories.”

Google spokesman Stricker denied that Tribune had asked Google to stop crawling its newspaper sites. He also affirmed the company’s position that, just like other readers, Google’s search technology was misled by the updated time stamp on the 6-year-old story.

He suggested that publishers may need to explore standards for the way articles are posted.

“If by standards and quality control, it means that newspapers and other publishers would adhere to basic protocols like prominently publishing accurate datelines on stories, we would certainly support that,” he added. “But just to be clear, that’s really for the industry to decide.”

Both Tribune and Google probably should share the blame for what happened, speculated analyst Roger Kay of Endpoint Technologies Associates. “There should be a definitively auditable outcome here, not just a random ‘he said-she said’ finger-pointing in all directions,” he said. “It may very well be the way Tribune archives things and the way Google’s crawling algorithm works.”

For example, Kay cited the “little quirk” in most newspaper sites in which the day’s most popular stories are ranked and listed.

Rob Enderle of the Enderle Group called the Google-Tribune controversy “kind of scary to watch.”

“It goes beyond credibility and suggests some liability may exist here as well,” he said. “Looking at how much United Airlines moved, that liability could be legendary.”

While UAL immediately put out a statement denying the false report, it is unclear if the airline is planning to take further action.

“I think it gives us a huge caution,” Enderle commented. “I was talking to one of the vendor PR folks, and they noticed that executive changes that had occurred years earlier had been popping up as current news for them, so this is likely happening more than we realize.”

Still, analyst Michael Dortch of Aberdeen Group did not see the bankruptcy-story issue having any serious impact on Google. “Until and unless similar episodes happen repeatedly over time, there is no real threat to Google’s credibility here,” he said.

“Everyone involved knows, or should know, that despite the rapid growth of online information-gathering and presentation, maturation equivalent to that of the offline world is still very much a work in progress,” Dortch added, “for Google, other vendors and the business users increasingly dependent upon them.”

Published at: http://www.popmatters.com/pm/article/resurfaced-story-raises-questions-on-google-search/