Random thoughts

Lovely piece by Steven Levy on whether the iPod’s shuffle is really random.

My first iPod loved Steely Dan. So do I. But not as much as my iPod did. By 2003, among the 3,000 or so songs in my iTunes library, I had about 50 Steely Dan tracks. Yet every time I shuffled my music collection “randomly” to mix the tunes, it seemed that the Dan was weirdly over-represented. Only two or three songs after Rikki Don’t Lose That Number, I’d hear Kid Charlemagne. Then, 20 minutes later, there would be Pretzel Logic. Where was the logic in this? I didn’t keep track of every song that played every time I shuffled my tunes, but after a while I would keep a sharp ear out for what I came to call the LTBSD (Length of Time Before Steely Dan) Factor. The LTBSD Factor was always perplexingly short…

This is one of those maddening articles in which someone writes about a topic that one had thought of covering, but didn’t. I use the shuffle facility on my iPod a lot, and often wondered if it was giving truly random results. But I didn’t take the logical next step and do some digging. Levy did, which is what makes him such a good journalist.
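For what it’s worth, the arithmetic behind the LTBSD Factor can be checked in a few lines of Python. This is only a sketch: it assumes that shuffle is a uniform random permutation of the library (Levy’s excerpt doesn’t say how the iPod actually does it), it uses his round figures of 3,000 songs and 50 Steely Dan tracks, and the function names are my own.

```python
import random

LIBRARY_SIZE = 3000   # Levy's "3,000 or so songs"
DAN_TRACKS = 50       # his "about 50 Steely Dan tracks"


def expected_ltbsd(n_songs, n_dan):
    """Expected position of the first Dan track in a uniform shuffle."""
    return (n_songs + 1) / (n_dan + 1)


def simulate_ltbsd(n_songs, n_dan, trials=1000, seed=0):
    """Average position of the first Dan track over many simulated shuffles."""
    rng = random.Random(seed)
    # True marks a Steely Dan track, False marks anything else.
    library = [True] * n_dan + [False] * (n_songs - n_dan)
    total = 0
    for _ in range(trials):
        rng.shuffle(library)
        total += library.index(True) + 1  # 1-based position in the playlist
    return total / trials


def chance_of_another_dan_within(window, n_songs, n_dan):
    """Chance that the `window` songs right after a Dan track include another.

    Assumes the first Dan track is not within `window` songs of the end.
    """
    p_none = 1.0
    for i in range(window):
        # Of the songs not yet played, how many are not Steely Dan?
        p_none *= (n_songs - n_dan - i) / (n_songs - 1 - i)
    return 1.0 - p_none


if __name__ == "__main__":
    print(expected_ltbsd(LIBRARY_SIZE, DAN_TRACKS))
    print(simulate_ltbsd(LIBRARY_SIZE, DAN_TRACKS))
    print(chance_of_another_dan_within(3, LIBRARY_SIZE, DAN_TRACKS))
```

Under those assumptions the first Steely Dan track should turn up, on average, at around song 59 (3001/51), and the simulation should land close to that figure. Comparing a real iPod against this baseline would need the one thing Levy admits he didn’t keep: a log of what actually played.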

It turns out that this is an excerpt from his forthcoming book about the iPod phenomenon (Ebury Press, November 2, according to the Guardian). If it’s anything like as good as his book on the history of the Apple Mac, it’ll be worth queueing for.

Algorithmic ‘integrity’

From Rough Type: Nicholas Carr’s Blog

Last week, CNET’s Elinor Mills reported on how a web search for “Martin Luther King” returns, as its first result on Google and as its second result on Windows Live Search, a web site (martinlutherking.org) operated by a white supremacist organization named Stormfront. The site, titled “Martin Luther King Jr.: A True Historical Examination,” refers to King as “The Beast” and says he was “just a sexual degenerate, an America-hating Communist, and a criminal betrayer of even the interests of his own people.” The site also features an essay on “Jews & Civil Rights” by former Ku Klux Klan official David Duke.

What’s remarkable, though, is not that a search algorithm might be gamed by extremists but that the owners of the algorithm might themselves defend the offensive result – and reject any attempt to override it as an assault on the “integrity” of their system….

Carr goes on to quote Google’s response to the CNET story:

At Google, a Web site’s ranking is determined by computer algorithms using thousands of factors to calculate a page’s relevance to any given query, a company representative said. The company can’t tweak the results because of that automation and the need to maintain the integrity of the results, she said. “In this particular example, the page is relevant to the query and many people have linked to it, giving it more PageRank than some of the other pages. These two factors contribute to its ranking.”

Microsoft’s response was even more robust:

The results on Microsoft’s search engine are “not an endorsement, in any way, of the viewpoints held by the owners of that content,” said Justin Osmer, senior product manager for Windows Live Search. “The ranking of our results is done in an automated manner through our algorithm which can sometimes lead to unexpected results,” he said. “We always work to maintain the integrity of our results to ensure that they are not editorialized.”

To which Carr tartly responds:

By “editorialized” he seems to mean “subjected to the exercise of human judgment.” And human judgment, it seems, is an unfit substitute for the mindless, automated calculations of an algorithm. We are not worthy to question the machine we have made. It is so pure that even its corruption is a sign of its integrity.
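For anyone wondering what “giving it more PageRank” means in practice, here is a toy sketch of the link-counting idea as it is usually described: a page’s score is built from the scores of the pages that link to it. The four-page web below is invented, the damping value of 0.85 is the one commonly quoted rather than anything confirmed by the article, and this is in no sense Google’s production ranking, which by the company’s own account uses thousands of factors.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Toy PageRank by repeated redistribution of scores along links.

    `links` maps each page to the list of pages it links to; every
    linked-to page must also appear as a key.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its current score among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical web: three sites all link to "popular".
toy_web = {
    "site_a": ["popular"],
    "site_b": ["popular"],
    "site_c": ["popular", "site_a"],
    "popular": ["site_a"],
}

if __name__ == "__main__":
    ranks = pagerank(toy_web)
    for page in sorted(ranks, key=ranks.get, reverse=True):
        print(page, round(ranks[page], 3))
```

The page with the most incoming links comes out on top, and nothing in the calculation looks at what the page says. That is the property Carr is objecting to.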

Googleplex working overtime?

From Good Morning Silicon Valley

What kind of a sweatshop are they running over there at Google? Just this week, the elves in the trenches have made Google Gadgets available for addition to any site, launched its Literacy Project in conjunction with LitCam and UNESCO, were apparently caught teaming up with Apple on a possible hookup of Google Maps with iPhoto, added another new batch of imagery to Google Earth, and now have released Code Search, a tool for programmers to dig through publicly available source code. (From the Unintended Consequences Department, the search engine also lets you find things like serial number generating algorithms.) That’s just this week, and it still has a day to go. Gee, I hope they’re paying those folks enough.

But, later, there’s this:

The L.A. Times reports company execs have launched an initiative called “Features, Not Products,” telling engineers to stop launching so many services and focus on making the existing ones work together better. Co-founder Sergey Brin said it occurred to him this summer as he scanned the 50 or so products available across the company’s Web sites that users were probably getting overwhelmed. “It’s worse than that,” he said. “It’s that I was getting lost in the sheer volume of the products that we were releasing.” Simplicity was among the things that made Google so popular, and its success led it to snap up hundreds of smart, ambitious software engineers. “The result occurred precisely because we told these incredible engineering teams to run as fast as possible to solve new problems,” said Chief Executive Eric Schmidt. “But then that created this other problem.” Analyst Rob Enderle puts it another way: “They created a bunch of crap that they have no idea what to do with. What a huge waste of resources.” Schmidt says the plan is to make Google products easier to use by packaging services, citing plans to combine the company’s spreadsheet, calendar and word-processing programs into one suite of Web-based applications. “That is a big change in the way we run the company,” Schmidt said, describing Google’s previous attitude as, “Just get this stuff built and get it out — don’t worry about the integration.”

Computerised hostility detection

Fascinating article in today’s New York Times:

WASHINGTON, Oct. 3 — A consortium of major universities, using Homeland Security Department money, is developing software that would let the government monitor negative opinions of the United States or its leaders in newspapers and other publications overseas.

Such a “sentiment analysis” is intended to identify potential threats to the nation, security officials said.

Researchers at institutions including Cornell, the University of Pittsburgh and the University of Utah intend to test the system on hundreds of articles published in 2001 and 2002 on topics like President Bush’s use of the term “axis of evil,” the handling of detainees at Guantánamo Bay, the debate over global warming and the coup attempt against President Hugo Chávez of Venezuela.

A $2.4 million grant will finance the research over three years.

American officials have long relied on newspapers and other news sources to track events and opinions here and abroad, a goal that has included the routine translation of articles from many foreign publications and news services.

The new software would allow much more rapid and comprehensive monitoring of the global news media, as the Homeland Security Department and, perhaps, intelligence agencies look “to identify common patterns from numerous sources of information which might be indicative of potential threats to the nation,” a statement by the department said.

It could take several years for such a monitoring system to be in place, said Joe Kielman, coordinator of the research effort. The monitoring would not extend to United States news, Mr. Kielman said.

“We want to understand the rhetoric that is being published and how intense it is, such as the difference between dislike and excoriate,” he said.

Even the basic research has raised concern among journalism advocates and privacy groups, as well as representatives of the foreign news media.

“It is just creepy and Orwellian,” said Lucy Dalglish, a lawyer and former editor who is executive director of the Reporters Committee for Freedom of the Press.

Andrei Sitov, Washington bureau chief of the Itar-Tass news agency of Russia, said he hoped that the objective did not go beyond simply identifying threats to efforts to stifle criticism about an American president or administration.

“This is what makes your country great, the open society where people can criticize their own government,” Mr. Sitov said…

BTW: “sentiment analysis” is the standard term for this kind of work in the computer science community.
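To make Mr. Kielman’s “difference between dislike and excoriate” concrete, here is a toy sketch of the crudest form of sentiment analysis: a hand-made word list with an intensity weight for each word. The words, the weights and the function name are all invented for illustration. The article doesn’t describe the consortium’s actual methods, and real systems are far more sophisticated than this.

```python
import re

# Hypothetical intensity weights: negative is hostile, positive is friendly.
INTENSITY = {
    "praise": 2.0,
    "like": 1.0,
    "dislike": -1.0,
    "criticize": -1.5,
    "condemn": -2.5,
    "excoriate": -3.0,
}


def sentiment_score(text):
    """Sum the weights of the listed words that appear in the text.

    Only exact word matches count: there is no stemming, so "dislikes"
    is ignored, and no handling of negation, so "do not like" scores
    as positive.
    """
    words = re.findall(r"[a-z]+", text.lower())
    return sum(INTENSITY.get(word, 0.0) for word in words)


if __name__ == "__main__":
    print(sentiment_score("Editors dislike the policy"))
    print(sentiment_score("Editors excoriate the policy"))
```

The second sentence scores three times as hostile as the first, purely because of the weights I typed in. How those weights get chosen, and by whom, is the whole question.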

The pleasures of Amazon

When I got back from work today, the postman had delivered two delights. One is this collection of David Remnick’s journalism. He’s such a graceful writer — the kind that leaves you staring at a sentence and wondering how anything can be so succinctly elegant.

The other delight is Bob Dylan’s new album, Modern Times. It’s astonishingly original and fresh, with some tracks (Thunder on the Mountain, The Levee’s Gonna Break) which remind one of why rock music changed the world.