The 15% refuseniks

From the latest Pew survey:

For the first 13 years of the century, Americans embraced the Internet at a whirlwind pace. The percentage of Americans who use the Internet grew to 84 percent in 2013 from 52 percent at the turn of the century, according to data from the Pew Research Center.

But since 2013, the percentage of American adults who go online has remained virtually unchanged, according to a new Pew study released on Tuesday. The 15 percent of Americans who still do not use the Internet is essentially the same proportion as in 2013.

So why the slowdown?

Pew found that Americans who remain offline do so for a number of reasons: the cost of buying a computer and paying a broadband or cellphone bill, doubts about the relevance of Internet content to their lives, or even the physical difficulty of using devices.

Noam Chomsky: the purpose of education

My favourite wise old bird. Lovely moment late in the interview where he tells a story about a famous MIT physicist being asked by a freshman, “What are we going to cover this semester?”

“It doesn’t matter what we cover”, replies the academic. “What matters is what you discover.”

We love your work… now show us your workings

This morning’s Observer column.

The growth in computing power, networking and sensor technology now means that even routine scientific research requires practitioners to make sense of a torrent of data. Take, for example, what goes on in particle physics. Experiments in Cern’s Large Hadron Collider regularly produce 23 petabytes per second of data. Just to get that in context, a petabyte is a million gigabytes, which is the equivalent of 13.3 years of HDTV content. In molecular biology, a single DNA-sequencing machine can spew out 9,000 gigabytes of data annually, which a librarian friend of mine equates to 20 Libraries of Congress in a year.

In an increasing number of fields, research involves analysing these torrents of data, looking for patterns or unique events that may be significant. This kind of analysis lies way beyond the capacity of humans, so it has to be done by software, much of which has to be written by the researchers themselves. But when scientists in these fields come to publish their results, both the data and the programs on which they are based are generally hidden from view, which means that a fundamental principle of scientific research – that findings should be independently replicable – is being breached. If you can’t access the data and check the analytical software for bugs, how can you be sure that a particular result is valid?
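That replicability point is easy to make concrete. Here is a deliberately tiny sketch, nothing to do with any real pipeline, with a made-up file name, column name and threshold, of the kind of analysis script a researcher might run over a stream of instrument readings. Even a result this trivial hinges on choices buried in the code (here, a five-sigma cutoff), and a reader can only check those choices if the script and the data are published alongside the findings.

```python
# Hypothetical, minimal "analysis" of a published data file: count readings
# that stand out from the noise. The file name, the "signal" column and the
# 5-sigma cutoff are illustrative stand-ins, not anyone's actual method.
import csv
import statistics

def count_candidate_events(path, sigma=5.0):
    """Count readings more than `sigma` standard deviations above the mean."""
    with open(path, newline="") as f:
        readings = [float(row["signal"]) for row in csv.DictReader(f)]
    mean = statistics.mean(readings)
    spread = statistics.stdev(readings)
    cutoff = mean + sigma * spread
    return sum(1 for r in readings if r > cutoff), cutoff

if __name__ == "__main__":
    events, cutoff = count_candidate_events("run_2012_03.csv")
    print(f"{events} candidate events above {cutoff:.3f}")
```

A reviewer who can re-run that script against the published data can spot a wrong cutoff or a mis-parsed column in minutes; one who can't is simply taking the headline number on trust.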

Real hero of the Facebook story isn’t Zuckerberg

My take on the Facebook story — from yesterday’s Observer.

The number to watch is not the putative $100bn valuation but the 845 million users that Facebook now claims to have. The observation that if Facebook were a country then it would be the third most populous on the planet has become a cliche, but underpinning it is an intriguing question: how did an idea cooked up in a Harvard dorm become so powerful?

Thanks to a compelling movie, The Social Network, we think we know the story. A ferociously gifted Harvard sophomore named Zuckerberg has difficulties with women and vents his frustration by creating an offensive web application that invites users to compare pairs of female students and indicate which is “hottest”. He puts this up on the Harvard network where it gets him into trouble with the authorities. Then he lifts an idea from a pair of nice-but-dim Wasp contemporaries who need a programmer and, in a frenzied burst of inspired hacking, implements the idea in computer code, thereby creating an online version of the printed “facebooks” common to elite US universities. This he then launches on an unsuspecting world. The Wasps sue him but lose (though get a settlement). Zuckerberg goes on to become Master of the Universe. Cue music, fade to black.

It’s all true, sort of, but the dramatic imperatives of the narrative obscure the really significant bit of the story. So let’s rewind…

.XXX marks the spot where the dot con boom begins

This morning’s Observer column about ICANN’s decision to allow new generic Top-Level Domains (gTLDs).

Given the pressure on namespace within the existing gTLDs, it was inevitable that Icann would come under pressure to expand their number. The first foray came from those who wanted to establish “.xxx” as cyberspace’s red-light district. After numerous spluttering rows, this was finally approved a few months ago. You may be reassured to know that applicants for a .xxx domain must first complete an application process that will be overseen by the International Foundation for Online Responsibility (Iffor), a body set up to “promote the development of responsible business practices and conduct within the online adult-entertainment community”. So that’s all right, then.

But the .xxx stuff was really just the overture to the main business, which is accommodating the needs of corporations. Icann has now announced that it will allow them to apply for new gTLDs, such as “.fashion” or “.drinks”, for example. Or perhaps even “.coke” (though cocaine dealers might have something to say about that)…

In a thoughtful comment on the piece, Anthony Van Couvering argues that I’ve allowed my conspiracy theorising to get the better of my judgement.

The big-corporation conspiracy theory is one that I typically warm to, but as someone who’s been involved with ICANN for years, I can tell you that far from big corporations pushing this, they have been dead set against it, and over the last few years have put a lot of dollars and muscle trying to defeat the expansion of the top-level domain space. The reason is twofold: first, they feel that they will have to do defensive registrations in hundreds of new namespaces; second, they already own most of the beachfront property, and if you’re in that position, the last thing you want are more beaches.

He goes on to say:

So although it does function as a barrier to entry, the $185,000 application fee is probably not what’s going to stop someone who’s seriously interested in applying — it’s just one of many expenses.

So who will apply? The brands will apply, mostly for .brand, either to promote it or as a “just in case” defensive measure. Cities will apply, and both New York and Paris among others have made known their intent. Ethnolinguistic communities will apply: .scot (Scotland), .cymru (Wales), .bzh (Brittany), and .eus (Basque) are just some of the announced candidates. Finally, there will be entrepreneurs who think that they have some special sauce that will make their top-level domain one that will be attractive to consumers.

[…]

The new gTLD program has a complicated history, and many interests have forcefully put forward their agenda. What we finally got was a compromise that everyone can grudgingly live with, but which in the end opens up the top-level domain space that for years had been blocked by corporate interests. It is a Very Good Thing.

I hope he’s right.

The start-up fallacy

This morning’s Observer column, on an essay by the former Intel boss Andy Grove.

In an essay entitled “How to Make an American Job Before It’s Too Late”, Grove pointed out that whereas Apple has 25,000 employees in the US, Foxconn has 250,000 in southern China alone. “The company,” he wrote, “has grown at an astounding rate, first in Taiwan and later in China. Its revenue last year was $62bn, larger than Apple Inc, Microsoft Corp, Dell Inc or Intel. Foxconn employs more than 800,000 people, more than the combined worldwide head count of Apple, Dell, Microsoft, Hewlett-Packard Co, Intel and Sony Corp.”

Grove cited these figures to attack what he regards as a pernicious mindset that now afflicts government policymakers in most western countries – “Our own misplaced faith in the power of start-ups to create US jobs. Americans love the idea of the guys in the garage inventing something that changes the world. New York Times columnist Thomas L Friedman recently encapsulated this view in a piece called ‘Start-Ups, Not Bailouts’. His argument: let tired old companies that do commodity manufacturing die if they have to. If Washington really wants to create jobs, he wrote, it should back start-ups.”

Grove thinks this is baloney and he's right. Start-ups are wonderful but – at least in technology – they generally don't create jobs on the scale that western economies need. What really matters is what comes after that eureka moment in the garage, as the new idea goes from prototype to mass production…

Oxford takes on Wikipedia

From Ars Technica’s report on Oxford Bibliographies Online (OBO):

The OBO tool is essentially a straightforward, hyperlinked collection of professionally-produced, peer-reviewed bibliographies in different subject areas—sort of a giant, interactive syllabus put together by OUP and teams of scholars in different disciplines. Users can drill down to a specific bibliographic entry, which contains some descriptive text and a list of references that link to either Google Books or to a subscribing library’s own catalog entries, by either browsing or searching. Each entry is written by a scholar working in the relevant field and vetted by a peer review process. The idea is to alleviate the twin problems of Google-induced data overload, on the one hand, and Wikipedia-driven GIGO (garbage in, garbage out), on the other.

“We did about 18 months of pretty intensive research with scholars and students and librarians to explore how their research practices were changing with the proliferation of online sources,” Damon Zucca, OUP’s Executive Editor, Reference, told Ars. “The one thing we heard over and over again is that people were drowning in scholarly information, and drowning in information in general. So it takes twice as much time for people to begin their research.”

OBO grew out of that research, with the goal of helping scholars and students deal with information overload, possibly by skipping Google entirely. The resulting bibliography is fairly simple and lean, which is exactly the point. The messy and often politicized work of sorting and sifting the information has already been done for users, so that they can drill down directly to a list of the main publications in their target area.

“You can’t come up with a search filter that solves the problem of information overload,” Zucca told Ars. OUP is betting that the solution to the problem lies in content, which is its area of expertise, and not in technology, which is Google’s and Microsoft’s.

To trust OBO’s content, you have to trust its selection and vetting process. To that end, OUP is making the list of contributing scholars and editors freely available. Each subject area has an Editor in Chief who’s a top scholar in the field, and an editorial board of around 15 to 20 scholars. The EIC and editorial board either write the bibliographic entries themselves, or they select other scholars to do the work.

The launch version of OBO covers only four subject areas: Classics, Islamic Studies, Social Work and Criminology. But OUP has plans to add 10-12 new subject areas (known as modules) within the next year. Each subject area contains between 50 and 100 individual entries, and that number should grow at the rate of about 50 to 75 entries per year.

And the cost of all this peer-reviewed quality? Why, $29.95 a month or $295.00 a year.

Hard times for venture capitalists

Last Sunday’s Observer column.

Spare a thought for the poor venture capitalists of the world. Well, perhaps the word “poor” is not entirely appropriate, but there’s no doubt that they seem to be having a torrid time at the moment. Over the past decade they have poured hundreds of billions of dollars into start-up technology companies – and have emerged with an average figure for five-year returns that has oscillated around, er, zero.

This will come as a surprise to those who subscribe to the cartoon image of the venture capitalist as a hatchet-faced investor who invests in someone's dream in order to wind up effectively owning it – and then flogging it to anonymous shareholders by floating the company on the stock market…

Amazon close to achieving Bezos’s dream?

I’ve always believed that the business Jeff Bezos wanted to emulate was Wal-Mart. He started with books simply because they were objects that people would buy without having to handle them first. But in recent years I’ve bought an increasing number of non-book items from the UK store. Today, the NYT is claiming that Amazon is closer to realising the Bezos dream than many of us had appreciated.

Fifteen years after Jeffrey P. Bezos founded the company as an online bookstore, Amazon is set to cross a significant threshold. Sometime later this year, if current trends continue, worldwide sales of media products — the books, movies and music that Amazon started with — will be surpassed for the first time by sales of other merchandise on the site. (That transition already occurred this year in its North American business.)

In other words, in an increasingly digital age, Amazon is quickly becoming the world’s general store. Alongside the books and CDs and DVDs are diapers, Legos and power drills, not to mention replacement car clutches and more arcane items like the Jackalope Buck taxidermy mount ($69.97).

“Amazon has gone from ‘that bookstore’ in people’s mind to a general online retailer, and that is a great place to be,” said Scot Wingo, chief executive of ChannelAdvisor, an eBay-backed company that helps stores like Wal-Mart and J.C. Penney sell online. Mr. Wingo envisions e-commerce growing to 15 percent of overall retail in the next decade from around 7 percent. “If Amazon grows their market share throughout that period, and honestly I don’t see anything stopping it, that is pretty scary,” he said…

And that’s ignoring the whole new S3 cloud-computing business that Amazon launched a while back and which now seems to be powering every major Web 2.0 service. In a way, Amazon is a more astonishing company than Google, because it has to deal directly with the public all the time. And it’s very good at what it does.
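For anyone who hasn’t met it, S3 is the storage part of Amazon Web Services: an object store you talk to over HTTP, which is why so many web start-ups use it instead of running their own file servers. Here is a minimal sketch using the boto3 Python library (the bucket and key names are invented, and you would need AWS credentials configured):

```python
# Illustrative only: store an object in S3 and read it back.
# The bucket name and key are hypothetical; boto3 reads AWS credentials
# from the environment or ~/.aws/credentials.
import boto3

s3 = boto3.client("s3")
bucket = "example-web20-assets"  # made-up bucket name

# Upload a file (say, a user's photo) under a key in the bucket.
with open("photo-123.jpg", "rb") as f:
    s3.put_object(Bucket=bucket, Key="uploads/photo-123.jpg", Body=f)

# Later, from any machine with the right credentials, fetch it back.
obj = s3.get_object(Bucket=bucket, Key="uploads/photo-123.jpg")
data = obj["Body"].read()
```

That is essentially the whole interface: no servers to run, you pay for what you store and transfer, and it scales without the customer doing anything, which is exactly what made it attractive to small Web 2.0 outfits.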

Will Google be a benign foster-parent? Don’t bet on it

When you think about the way the academic world allowed itself to be hooked by the scientific periodical racketeers, it makes sense to be wary of any commercial outfit that looks like acquiring a monopoly of a valuable resource. The obvious candidate du jour is Google, which is busily scanning all those orphan works (i.e. works whose copyright owners cannot be found) in libraries in order to make them available to a grateful (academic) world. Some people are (rightly) suspicious and are going to challenge the legal settlement which Google negotiated with publishers in the US. At the JISC ‘Libraries of the Future’ event in Oxford last Thursday, Robert Darnton of Harvard said some perceptive things about the potential threats ahead. So it was interesting to see this piece in this morning’s NYT.

These critics say the settlement, which is subject to court approval, will give Google virtually exclusive rights to publish the books online and to profit from them. Some academics and public interest groups plan to file legal briefs objecting to this and other parts of the settlement in coming weeks, before a review by a federal judge in June.

While most orphan books are obscure, in aggregate they are a valuable, broad swath of 20th-century literature and scholarship.

Determining which books are orphans is difficult, but specialists say orphan works could make up the bulk of the collections of some major libraries.

Critics say that without the orphan books, no competitor will ever be able to compile the comprehensive online library Google aims to create, giving the company more control than ever over the realm of digital information. And without competition, they say, Google will be able to charge universities and others high prices for access to its database.

The settlement “takes the vast bulk of books that are in research libraries and makes them into a single database that is the property of Google,” said Robert Darnton, head of the Harvard University library system. “Google will be a monopoly.”

Yep. I’ve always thought that Google will be Microsoft’s successor as the great anti-trust test for the Obama Administration. I hope the DoJ is tooling up for it.