Computational science

From this morning’s Observer column.

One of the diseases studied was lung cancer. The research revealed 23,000 mutations that were exclusive to the diseased cells. Almost all were caused by the 60 or so chemicals in cigarette smoke that stick to DNA and deform it. “We can say that one mutation is fixed in the genome for every 15 cigarettes smoked,” said Peter Campbell, the scientist who led the lung cancer part of the study. “That is frightening because many people smoke a packet of 20 a day.”
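The arithmetic behind that quote is easy to check. A quick sketch (my own back-of-envelope figures, assuming a 365-day year):

```python
# One fixed mutation per 15 cigarettes, per the quote,
# for a pack-a-day (20 cigarettes) smoker.
CIGARETTES_PER_MUTATION = 15
cigarettes_per_day = 20

mutations_per_day = cigarettes_per_day / CIGARETTES_PER_MUTATION
mutations_per_year = mutations_per_day * 365

print(round(mutations_per_day, 2))   # ≈ 1.33 mutations a day
print(round(mutations_per_year))     # ≈ 487 mutations a year
```

Nearly five hundred fixed mutations a year, on those numbers, which puts "frightening" in perspective.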

Although these stories are reports about medical research, they are really about computing – in the sense that neither would have been possible without the application of serious computer power to masses of data. In that way they reflect a new – but so far unacknowledged – reality: that in many important fields leading-edge scientific research cannot be done without access to vast computational and data-handling facilities, with sophisticated software for analysing huge data-sets.

In many significant areas, advanced research is no longer done by individuals looking through microscopes or telescopes, but by computers enabling investigators to collate, visualise and analyse the torrents of data produced by arrays of instruments…

UK snow

Ben Marsh has come up with a really neat use of Twitter. If it’s snowing in your area, tweet the first part of your postcode followed by a score out of 10 for density of snow. So the tweet “#uksnow CB3 0/10” indicates that it’s not currently snowing in my part of Cambridge.
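The tweet format is simple enough that harvesting the data is a few lines of work. A minimal sketch of the parsing step (the hashtag and format come from the post; the regular expression and function name are my own):

```python
import re

# Matches e.g. "#uksnow CB3 0/10": an outward postcode
# followed by a snow-density score out of 10.
UKSNOW = re.compile(
    r"#uksnow\s+([A-Z]{1,2}\d{1,2}[A-Z]?)\s+(\d{1,2})/10",
    re.IGNORECASE,
)

def parse_uksnow(tweet):
    """Return (outward postcode, score 0-10), or None if no match."""
    m = UKSNOW.search(tweet)
    if not m:
        return None
    return m.group(1).upper(), int(m.group(2))

print(parse_uksnow("#uksnow CB3 0/10"))  # ('CB3', 0)
```

Geocode the postcode area, plot the score, and you have the map.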

Regularly-updated UK map here.

Lovely idea. It’ll probably be used by the railway companies to justify pre-emptive cancelling of trains, though.

P4P: rethinking file sharing

The thorniest problem in making decisions about internet policy is how to balance the public interest against the vested interests of companies and other incumbents of the status quo. The task is made more difficult by the fact that often there is nobody to speak for the public interest, whereas vested interests are organised, vocal and very rich. The result is usually evidence-free policymaking in which legislators give to vested interests everything they ask for, and then some.

The copyright wars provide a case-study in this skewed universe. When P2P file-sharing appeared, the record and movie industries campaigned to have the entire technology banned. (Larry Lessig used to tell a wonderful story about how he arrived at his office in Stanford Law one day and found two of the university’s network police there. They were going to disconnect him from the network because he had P2P software running on his machine. The fact that Larry used P2P as a way of distributing his own written works had apparently never occurred to them. And Stanford is a pretty smart place.)

So the idea that P2P technology might have licit as well as illicit uses was ignored by nearly everyone. And yet P2P was — and remains — a really important strategic technology, for all kinds of reasons (see, for example, Clay Shirky’s great essay about PCs being the ‘dark matter’ of the Internet). In fact, one could argue — and I have — that it’s such an important strategic technology that the narrow business interests of the content industries ought never to be allowed to stifle it. Evidence-based policy-making would therefore attempt to strike a balance between the social benefits of P2P on the one hand, and those aspects of it that happen to be inconvenient (or profit-threatening) for a particular set of industries at a particular time in history.

All of which makes this report in Technology Review particularly interesting.

‘Peer-to-peer’ (P2P) is synonymous with piracy and bandwidth hogging on the Internet. But now, Internet service providers and content companies are taking advantage of technology designed to speed the delivery of content through P2P networks. Meanwhile, standards bodies are working to codify the technology into the Internet’s basic protocols.

Rather than sending files to users from a central server, P2P file-sharing networks distribute pieces of a file among thousands of computers and help users find and download this data directly from one another. This is a highly efficient way to distribute data, resistant to the bottlenecks that can plague centralized distribution systems, but it uses large amounts of bandwidth. Even as P2P traffic slowly declines as a percentage of overall Internet traffic, it is still growing in volume. In June, Cisco estimated that P2P file-sharing networks transferred 3.3 exabytes (or 3.3 billion billion bytes) of data per month.

While a PhD student at Yale University in 2006, Haiyong Xie came up with the idea of ‘provider portal for peer-to-peer’, or P4P, as a way to ease the strain placed on networking companies by P2P. This system reduces file-trading traffic by having ISPs share specially encoded information about their networks with peer-to-peer ‘trackers’ – servers that are used to locate files for downloading. Trackers can then make file sharing more efficient by preferentially connecting computers that are closer and reducing the amount of data shared between different ISPs.
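The core of the idea is just a preference ordering. A toy sketch of a locality-aware tracker as described above (the data structures and names are illustrative, not P4P’s actual wire format):

```python
# A tracker holding ISP-supplied locality hints can rank candidate
# peers so that same-ISP peers are offered first, keeping traffic
# inside one provider's network where possible.

def rank_peers(requester_isp, peers):
    """Order (address, isp) pairs: same-ISP peers before cross-ISP ones.

    sorted() is stable, so ties keep their original order.
    """
    return sorted(peers, key=lambda peer: peer[1] != requester_isp)

peers = [("10.0.0.2", "ISP-B"), ("10.0.0.3", "ISP-A"), ("10.0.0.4", "ISP-B")]
print(rank_peers("ISP-A", peers))
# [('10.0.0.3', 'ISP-A'), ('10.0.0.2', 'ISP-B'), ('10.0.0.4', 'ISP-B')]
```

Real P4P encodes richer topology than an ISP label – link costs, network ‘distances’ and so on – but the effect is the same: fewer bytes crossing provider boundaries.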

During its meetings last week in Japan, the Internet Engineering Task Force, which develops Internet standards, continued work on building P4P into standard Internet protocols…


Kindling a revolution? Maybe not

Astute comment by Rory Cellan-Jones.

Suddenly I realised why a book worked on the Kindle but a paper did not. For me, reading a book is an analogue experience – I start at page one and continue until I’ve finished. A newspaper, on the other hand, is more random, more interactive. I scan the sections and leap from one article to another, much as I do on the web. That’s what is already available to me – for free – on newspaper websites, so why would I pay for a less satisfactory digital newspaper? Newspapers have woken up rather late to the fact that they’ve been giving away content online which could be monetised through e-readers.

There are other reasons why the Kindle may not be quite the game-changer some are claiming. Is a device costing upwards of £200 really going to persuade many people to abandon paper for a screen – especially when you can get a netbook these days for around the same price? And there will be questions about Amazon's walled garden, which allows some other e-books to be read on the Kindle but doesn’t allow titles from its online store to be read on other devices. Other contenders – perhaps including an Apple tablet – may learn some lessons from Amazon and take digital reading to the next level.

The Kindle looks to me like an attractive but expensive niche product, giving a few techie bibliophiles the chance to take more books on holiday without incurring excess baggage charges. But will it force thousands of bookshops to close and transform the economics of struggling newspapers? Don't bet on it.

Touching the void

Last May, the Economist carried an interesting report (now hidden, alas, behind a paywall) about technology developed by a Cambridge company.

Touch screens, once the preserve of science museums and ticket machines, have become commonly available on mobile phones thanks largely to the popularity of Apple’s iPhone. Now a novel hand-held device has been developed that can turn an inert tabletop into an interactive touch-screen. It could even end up being projected from a mobile phone.

The device developed by Light Blue Optics, a company spun out from Cambridge University in England, embodies a tiny projector and sensors that allow it not only to cast an image onto a flat surface but also to detect when the image is being touched. This makes it possible to press buttons, move and manipulate virtual objects such as photos and navigate between different screens, all just by touching the projected image.

Today, there’s a piece in Technology Review which shows how Light Blue Optics is finding ingenious applications for the technology, like this:

The new projection device, developed by Light Blue Optics, based in Cambridge, UK, uses a technique called holographic projection that allows it to be far smaller than current in-car HUD systems. “We can make an HUD so small you can put it into a rearview mirror or wing mirror,” says Edward Buckley, Light Blue Optics’s head of business development.

Details of Light Blue Optics’s prototype were presented today at the Society for Information Display’s Vehicles and Photons 2009 symposium, in Dearborn, MI. The prototype projects an image through a two-way wing mirror so that it appears to be about 2.5 meters away, superimposed over the reflected road scene. The picture appears to originate from a point in space in front of the mirror, only from a narrow perspective.

Existing HUDs require relatively large liquid-crystal arrays and optics to generate an image, says Buckley. “In a BMW 5 Series, the size is about five litres,” he says. “We can make it about one-tenth of the size. This means you can start to put these virtual image displays where you couldn’t previously.”

Sir Timothy regrets

According to Engadget,

Tim Berners-Lee, the man credited with creating the World Wide Web, recently said that his only real regret about the whole shebang is forcing people to type out the (essentially unnecessary) double slash after the “http:” in URLs. Speaking at a symposium on the future of technology, he noted (in reference to the dreaded marks) the paper, trees and human labor that could have been spared without them.

To which Engadget responds:

Hey Tim: don’t sweat it! You’ve done us enough good turns that we’re willing to overlook it.

Amen.


Coping with the data tsunami

Interesting article in today’s NYT about the challenges posed by the coming avalanche of experimental data.

The next generation of computer scientists has to think in terms of what could be described as Internet scale. Facebook, for example, uses more than 1 petabyte of storage space to manage its users’ 40 billion photos. (A petabyte is about 1,000 times as large as a terabyte, and could store about 500 billion pages of text.)

It was not long ago that the notion of one company having anything close to 40 billion photos would have seemed tough to fathom. Google, meanwhile, churns through 20 times that amount of information every single day just running data analysis jobs. In short order, DNA sequencing systems too will generate many petabytes of information a year.
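Those figures are worth a sanity check. Dividing one petabyte across 40 billion photos gives a strikingly small per-photo budget (my own arithmetic, using decimal units):

```python
# 1 petabyte of storage spread across 40 billion photos.
PETABYTE = 10**15          # bytes, decimal convention
photos = 40 * 10**9

bytes_per_photo = PETABYTE / photos
print(bytes_per_photo)     # 25000.0 — roughly 25 KB per photo
```

About 25 KB each, which tells you these are heavily compressed, thumbnail-scale copies rather than originals.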

The article makes the rather good point that today’s university students, for the most part, will be imprinted on the rather feeble personal computer technology that they use today, and so are not attuned to the kit that will be required to do even routine science in a few years. It cites some of the usual scare stories — e.g. from astronomy:

The largest public database of such images available today comes from the Sloan Digital Sky Survey, which has about 80 terabytes of data, according to Mr. Connolly. A new system called the Large Synoptic Survey Telescope is set to take more detailed images of larger chunks of the sky and produce about 30 terabytes of data each night. Mr. Connolly’s graduate students have been set to work trying to figure out ways of coping with this much information.
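To see why the graduate students have their work cut out, run the quoted numbers forward. A quick sketch (my arithmetic, assuming nightly operation all year):

```python
# Sloan's entire archive versus LSST's projected nightly output,
# using the figures quoted above.
sloan_total_tb = 80        # Sloan Digital Sky Survey, total
lsst_tb_per_night = 30     # Large Synoptic Survey Telescope, per night

nights_to_match_sloan = sloan_total_tb / lsst_tb_per_night
tb_per_year = lsst_tb_per_night * 365

print(round(nights_to_match_sloan, 1))  # ≈ 2.7 nights to equal Sloan
print(tb_per_year)                      # 10950 TB, i.e. ~11 PB a year
```

In other words, under three nights of LSST observing would exceed the largest public sky-survey database in existence today.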

NYT discovers Tablet PC

Yep. Here’s the evidence.

SAN FRANCISCO — The high-tech industry has been working itself into paroxysms of excitement lately over an idea that is not exactly new: tablet computers.

Quietly, several high-tech companies are lining up to deliver versions of these keyboard-free, touch-screen portable machines in the next few months. Industry watchers have their eye on Apple in particular to sell such a device by early next year.

Tablets have been around in various forms for two decades, thus far delivering little other than memorable failure. Nonetheless, the new batch of devices has gripped the imagination of tech executives, bloggers and gadget hounds, who are projecting their wildest dreams onto these literal blank slates.

In these visions, tablets will save the newspaper and book publishing industries, present another way to watch television and movies, play video games, and offer a visually rich way to enjoy the Web and the expanding world of mobile applications.

“Desktops, laptops — we already know how those work,” said Brian Lam, editorial director of the popular gadget site Gizmodo, which reports and hypothesizes almost daily about these devices. Tablets, he said, “are one of the last few mysteries left.”