Thursday 19 March, 2020

If you would find it more useful to get this blog as a daily email, why not subscribe here? (It’s free, and there’s a one-click unsubscribe.) One email, in your inbox at 07:00 every morning.


England’s green and pleasant land

One part of the country that’s not currently under lockdown: the river Cam at Grantchester, photographed on Tuesday morning.


Apocalypse now

The Wall Street Journal reports from the Papa Giovanni XXIII Hospital, a large, modern medical facility in Bergamo, a prosperous Italian city that has been overwhelmed by the coronavirus disease:

Bergamo shows what happens when things go wrong. In normal times, the ambulance service at the Papa Giovanni hospital runs like a Swiss clock. Calls to 112, Europe’s equivalent of 911, are answered within 15 to 20 seconds. Ambulances from the hospital’s fleet of more than 200 are dispatched within 60 to 90 seconds. Two helicopters stand by at all times. Patients usually reach an operating room within 30 minutes, said Angelo Giupponi, who runs the emergency response operation: “We are fast, in peacetime.”

Now, people wait an hour on the phone to report heart attacks, Dr. Giupponi said, because all the lines are busy. Each day, his team fields 2,500 calls and brings 1,500 people to the hospital. “That’s not counting those the first responders visit but tell to stay home and call again if their condition worsens,” he said.

Ambulance staff weren’t trained for such a contagious virus. Many have become infected and their ambulances contaminated. A dispatcher died of the disease Saturday. Diego Bianco was in his mid-40s and had no prior illnesses.

“He never met patients. He only answered the phone. That shows you the contamination is everywhere,” a colleague said. Mr. Bianco’s co-workers sat Sunday at the operations center with masks on their faces and fear in their eyes…

This is why social distancing has to be made to work.


MEOW

Our local supermarket announced that the first hour after opening this morning would be reserved for people who would have to ‘self-isolate’ from next weekend. I fall into that category because of my age, as do people with particular medical conditions. Think of it as voluntary house arrest! The supermarket was fairly busy with customers of retirement age, and the atmosphere was cheery and civilised, with a vague feeling of wartime solidarity. On discovering that all the milk had gone and that further stocks were not expected until midday, I reflected that this is, in a sense, the moral equivalent of war.

And then I remembered that during the 1979 energy crisis in the US, the then president, Jimmy Carter, had used that phrase — I think in the context of making the US independent of oil imports from the Middle East. For Carter, the phrase was a way of signalling how important his campaign was. But of course his Republican opponents resisted it — and found a way of effectively ridiculing it by making an acronym from the initial letters of each word: MEOW. And it worked.


AI is an ideology, not a technology

Nice essay in Wired by Jaron Lanier and Glen Weyl, arguing that, at its core, “artificial intelligence” is a perilous belief that fails to recognize the agency of humans. “The usual narrative goes like this”, they write.

Without the constraints on data collection that liberal democracies impose and with the capacity to centrally direct greater resource allocation, the Chinese will outstrip the West. AI is hungry for more and more data, but the West insists on privacy. This is a luxury we cannot afford, it is said, as whichever world power achieves superhuman intelligence via AI first is likely to become dominant.

If you accept this narrative, the logic of the Chinese advantage is powerful. What if it’s wrong? Perhaps the West’s vulnerability stems not from our ideas about privacy, but from the idea of AI itself.

The central point of the essay is that “AI” is best understood as a political and social ideology rather than as a basket of algorithms. And at its core is the belief

that a suite of technologies, designed by a small technical elite, can and should become autonomous from and eventually replace, rather than complement, not just individual humans but much of humanity. Given that any such replacement is a mirage, this ideology has strong resonances with other historical ideologies, such as technocracy and central-planning-based forms of socialism, which viewed as desirable or inevitable the replacement of most human judgement/agency with systems created by a small technical elite. It is thus not all that surprising that the Chinese Communist Party would find AI to be a welcome technological formulation of its own ideology.

Thoughtful piece. Worth reading in full.


Will the virus enable us to rediscover what the Internet is for?

The wonderful thing about the Net — so we naive techno-utopians used to think — was that it would liberate human creativity because it lowered the barriers to publication and self-expression. The most erudite articulation of this was probably Yochai Benkler’s wonderful The Wealth of Networks, a celebration of the potential of ‘peer production’ and user-generated content. We saw the technology as an enabling, democratising force — a ‘sit-up’ rather than a ‘lie-back’ medium. And we saw in its apparently inexorable rise the end of the era of the couch potato.

What we never expected was that a combination of capitalism and human nature would instead turn the network into a million-channel TV, with billions of people passively consuming content created by media corporations: the ultimate lie-back medium. And if you look at the data traffic on the Net these days, you can see the effects of that. According to Sandvine, a network-equipment company, video accounted for 60.6 percent of total downstream volume worldwide in 2019, up 2.9 percentage points from 2018. Web traffic was the next biggest category, with a 13.1 percent share (down 3.8 points year on year), followed by gaming at 8.0 percent, social media at 6.1 percent and file sharing at 4.2 percent. The same report found that Google and its various apps (including YouTube and Android) accounted for 12 percent of overall internet traffic, and that Facebook’s apps took 17 percent of downstream traffic in the Asia-Pacific region, compared with 3 percent worldwide.

One interesting question raised by the COVID-19 crisis is whether people who find themselves isolated in their homes will discover affordances of the network of which they were hitherto unaware. Kevin Roose of the NYT explores this in “The Coronavirus Crisis Is Showing Us How to Live Online”. We’ve always hoped that our digital tools would create connections, not conflict, he says. We now have a chance to make it happen. After a week in self-isolation, he finds himself agreeably surprised:

Last weekend, in between trips to the grocery store, I checked up on some friends using Twitter D.M.s, traded home-cooking recipes on Instagram, and used WhatsApp to join a blockwide support group with my neighbors. I even put on my Oculus virtual reality headset, and spent a few hours playing poker in a V.R. casino with friendly strangers.

I expected my first week of social distancing to feel, well, distant. But I’ve been more connected than ever. My inboxes are full of invitations to digital events — Zoom art classes, Skype book clubs, Periscope jam sessions. Strangers and subject-matter experts are sharing relevant and timely information about the virus on social media, and organizing ways to help struggling people and small businesses. On my feeds, trolls are few and far between, and misinformation is quickly being fact-checked.

Well, well. Reporters should get out more — onto the free and open Internet rather than the walled gardens of social media.


AI for good is possible

This morning’s Observer column:

…As a consequence, a powerful technology with great potential for good is at the moment deployed mainly for privatised gain. In the process, it has been characterised by unregulated premature deployment, algorithmic bias, reinforcing inequality, undermining democratic processes and boosting covert surveillance to toxic levels. That it doesn’t have to be like this was vividly demonstrated last week with a report in the leading biological journal Cell of an extraordinary project, which harnessed machine learning in the public (as compared to the private) interest. The researchers used the technology to tackle the problem of bacterial resistance to conventional antibiotics – a problem that is rising dramatically worldwide, with predictions that, without a solution, resistant infections could kill 10 million people a year by 2050.

Read on

The real test of an AI machine? When it can admit to not knowing something

This morning’s Observer column on the EU’s plans for regulating AI and data:

Once you get beyond the mandatory euro-boosting rhetoric about how the EU’s “technological and industrial strengths”, “high-quality digital infrastructure” and “regulatory framework based on its fundamental values” will enable Europe to become “a global leader in innovation in the data economy and its applications”, the white paper seems quite sensible. But as for all documents dealing with how actually to deal with AI, it falls back on the conventional bromides about human agency and oversight, privacy and governance, diversity, non-discrimination and fairness, societal wellbeing, accountability and that old favourite “transparency”. The only discernible omissions are motherhood and apple pie.

But this is par for the course with AI at the moment: the discourse is invariably three parts generalities, two parts virtue-signalling leavened with a smattering of pious hopes. It’s got to the point where one longs for some plain speaking and common sense.

And, as luck would have it, along it comes in the shape of Sir David Spiegelhalter, an eminent Cambridge statistician and former president of the Royal Statistical Society. He has spent his life trying to teach people how to understand statistical reasoning, and last month published a really helpful article in the Harvard Data Science Review on the question “Should we trust algorithms?”

Read on

The White House’s ten principles for AI

Must be a spoof, surely? Something apparently serious emerging from the Trump administration. Ten principles for government agencies to adhere to when proposing new AI regulations for the private sector. The move is the latest development of the American AI Initiative, launched via executive order by President Trump early last year to create a national strategy for AI. It is also part of an ongoing effort to maintain US leadership in the field.

Here are the ten principles, for what they’re worth:

Public trust in AI. The government must promote reliable, robust, and trustworthy AI applications.

Public participation. The public should have a chance to provide feedback in all stages of the rule-making process.

Scientific integrity and information quality. Policy decisions should be based on science.

Risk assessment and management. Agencies should decide which risks are and aren’t acceptable.

Benefits and costs. Agencies should weigh the societal impacts of all proposed regulations.

Flexibility. Any approach should be able to adapt to rapid changes and updates to AI applications.

Fairness and nondiscrimination. Agencies should make sure AI systems don’t discriminate illegally.

Disclosure and transparency. The public will trust AI only if it knows when and how it is being used.

Safety and security. Agencies should keep all data used by AI systems safe and secure.

Interagency coordination. Agencies should talk to one another to be consistent and predictable in AI-related policies.

Biased machines may be easier to fix than biased humans

This morning’s Observer column:

One of the things that really annoys AI researchers is how supposedly “intelligent” machines are judged by much higher standards than are humans. Take self-driving cars, they say. So far they’ve driven millions of miles with very few accidents, a tiny number of them fatal. Yet whenever an autonomous vehicle kills someone there’s a huge hoo-ha, while every year in the US nearly 40,000 people die in crashes involving conventional vehicles.

Likewise, the AI evangelists complain, everybody and his dog (this columnist included) is up in arms about algorithmic bias: the way in which automated decision-making systems embody the racial, gender and other prejudices implicit in the data sets on which they were trained. And yet society is apparently content to endure the astonishing irrationality and capriciousness of much human decision-making.

If you are a prisoner applying for parole in some jurisdictions, for example, you had better hope that the (human) judge has just eaten when your case comes up…

Read on

Bias in machine learning

Nice example from the machine-learning researcher Daphne Koller:

Another notion of bias, one that is highly relevant to my work, are cases in which an algorithm is latching onto something that is meaningless and could potentially give you very poor results. For example, imagine that you’re trying to predict fractures from X-ray images in data from multiple hospitals. If you’re not careful, the algorithm will learn to recognize which hospital generated the image. Some X-ray machines have different characteristics in the image they produce than other machines, and some hospitals have a much larger percentage of fractures than others. And so, you could actually learn to predict fractures pretty well on the data set that you were given simply by recognizing which hospital did the scan, without actually ever looking at the bone. The algorithm is doing something that appears to be good but is actually doing it for the wrong reasons. The causes are the same in the sense that these are all about how the algorithm latches onto things that it shouldn’t latch onto in making its prediction.

To recognize and address these situations, you have to make sure that you test the algorithm in a regime that is similar to how it will be used in the real world. So, if your machine-learning algorithm is one that is trained on the data from a given set of hospitals, and you will only use it in those same set of hospitals, then latching onto which hospital did the scan could well be a reasonable approach. It’s effectively letting the algorithm incorporate prior knowledge about the patient population in different hospitals. The problem really arises if you’re going to use that algorithm in the context of another hospital that wasn’t in your data set to begin with. Then, you’re asking the algorithm to use these biases that it learned on the hospitals that it trained on, on a hospital where the biases might be completely wrong.
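
To make the point concrete, here is a minimal sketch — my own illustration, with synthetic numbers and scikit-learn, not anything from the interview — of a model that “predicts fractures” largely by recognising which hospital produced the scan, and of how evaluating it on a held-out hospital exposes the shortcut:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 3000
hospital = rng.integers(0, 3, size=n)                              # which of three hospitals took the scan
y = (rng.random(n) < np.array([0.1, 0.3, 0.6])[hospital]).astype(int)  # fracture prevalence differs by hospital

# Features: a weak "bone" signal plus image artefacts that effectively encode the hospital
bone_signal = y + rng.normal(0, 2.0, size=n)                       # noisy, only weakly informative
artefacts = np.eye(3)[hospital] + rng.normal(0, 0.1, size=(n, 3))  # hospital-specific machine artefacts
X = np.column_stack([bone_signal, artefacts])

# 1. Random split: the hospital shortcut is available at train AND test time, so accuracy looks fine
clf = LogisticRegression(max_iter=1000).fit(X[:2000], y[:2000])
print("random split:      ", accuracy_score(y[2000:], clf.predict(X[2000:])))

# 2. Held-out hospital: train on hospitals 0 and 1, test on hospital 2 -- the shortcut breaks
train, test = hospital < 2, hospital == 2
clf = LogisticRegression(max_iter=1000).fit(X[train], y[train])
print("held-out hospital: ", accuracy_score(y[test], clf.predict(X[test])))
```

The random split looks respectable; the held-out-hospital split collapses — which is exactly the “test it in the regime it will actually be used in” advice from the quote.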

Can the planet afford machine learning as well as Bitcoin?

This morning’s Observer column:

There is, alas, no such thing as a free lunch. This simple and obvious truth is invariably forgotten whenever irrational exuberance teams up with digital technology in the latest quest to “change the world”. A case in point was the bitcoin frenzy, where one could apparently become insanely rich by “mining” for the elusive coins. All you needed was to get a computer to solve a complicated mathematical puzzle and – lo! – you could earn one bitcoin, which at the height of the frenzy was worth $19,783.06. All you had to do was buy a mining kit (or three) from Amazon, plug it in and become part of the crypto future.

The only problem was that mining became progressively more difficult the closer we got to the maximum number of bitcoins set by the scheme and so more and more computing power was required. Which meant that increasing amounts of electrical power were needed to drive the kit. Exactly how much is difficult to calculate, but one estimate published in July by the Judge Business School at the University of Cambridge suggested that the global bitcoin network was then consuming more than seven gigawatts of electricity. Over a year, that’s equal to around 64 terawatt-hours (TWh), which is 8 TWh more than Switzerland uses annually. So each of those magical virtual coins turns out to have a heavy environmental footprint.

At the moment, much of the tech world is caught up in a new bout of irrational exuberance. This time, it’s about machine learning, another one of those magical technologies that “change the world”…

Read on
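
A quick back-of-the-envelope check of those figures, for the curious. The 7.3GW value below is my own assumption — the Cambridge estimate says only “more than seven gigawatts”:

```python
# Rough sanity check of the bitcoin energy figures quoted above.
power_gw = 7.3                                 # assumed continuous draw, consistent with "more than seven gigawatts"
hours_per_year = 24 * 365                      # 8,760 hours
energy_twh = power_gw * hours_per_year / 1000  # 1 TWh = 1,000 GWh
print(f"~{energy_twh:.0f} TWh per year")       # ~64 TWh, i.e. roughly 8 TWh more than Switzerland uses annually
```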

Kranzberg’s Law

As a critic of many of the ways that digital technology is currently being exploited by both corporations and governments, while also being a fervent believer in the positive affordances of the technology, I often find myself stuck in unproductive discussions in which I’m accused of being an incurable “pessimist”. I’m not: better descriptions of me are that I’m a recovering Utopian or a “worried optimist”.

Part of the problem is that the public discourse about this stuff tends to be Manichean: it lurches between evangelical enthusiasm and dystopian gloom. And eventually the discussion winds up with a consensus that “it all depends on how the technology is used” — which often leads to Melvin Kranzberg’s Six Laws of Technology — and particularly his First Law, which says that “Technology is neither good nor bad; nor is it neutral.” By which he meant that,

“technology’s interaction with the social ecology is such that technical developments frequently have environmental, social, and human consequences that go far beyond the immediate purposes of the technical devices and practices themselves, and the same technology can have quite different results when introduced into different contexts or under different circumstances.”

Many of the current discussions revolve around various manifestations of AI, which means machine learning plus Big Data. At the moment, image recognition is the topic du jour. The enthusiastic refrain usually involves citing dramatic instances of the technology’s potential for social good. A paradigmatic example is the collaboration between Google’s DeepMind subsidiary and Moorfields Eye Hospital to use machine learning to greatly improve the speed of analysis of anonymized retinal scans and automatically flag those that warrant specialist investigation. This is a good example of how to use the technology to improve the quality and speed of an important healthcare service. For tech evangelists it is an irrefutable argument for the beneficence of the technology.

On the other hand, critics will often point to facial recognition as a powerful example of the perniciousness of machine-learning technology. One researcher has even likened it to plutonium. Criticisms tend to focus on its well-known weaknesses (false positives and racial or gender bias, for example), its hasty and ill-considered use by police forces and the proprietors of shopping malls, the lack of effective legal regulation, and its use by authoritarian or totalitarian regimes, particularly China.

Yet it is likely that even facial recognition has socially beneficial applications. One dramatic illustration is a project by an Indian child-labour activist, Bhuwan Ribhu, who works for the Indian NGO Bachpan Bachao Andolan. He had launched a pilot programme 15 months earlier to match a police database containing photos of all of India’s missing children with another comprising shots of all the minors living in the country’s childcare institutions.

The results were remarkable. “We were able to match 10,561 missing children with those living in institutions,” he told CNN. “They are currently in the process of being reunited with their families.” Most of them were victims of trafficking, forced to work in the fields, in garment factories or in brothels, according to Ribhu.

This was made possible by facial recognition technology provided by New Delhi’s police. “There are over 300,000 missing children in India and over 100,000 living in institutions,” he explained. “We couldn’t possibly have matched them all manually.”
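
For a sense of why matching at that scale is only feasible with automation, here is a hedged sketch of the standard approach: reduce every photo to a numerical “embedding” and compare the two databases by nearest neighbour. The embed_face function is a hypothetical stand-in for a real face-embedding model — nothing below comes from the Delhi police system.

```python
import numpy as np

def embed_face(image) -> np.ndarray:
    """Hypothetical stand-in: return a unit-length 128-d embedding of the face in `image`."""
    raise NotImplementedError

def match_databases(missing_children_photos, institution_photos, max_distance=0.4):
    """Return (i, j) index pairs whose embeddings are within `max_distance` (cosine distance)."""
    A = np.stack([embed_face(p) for p in missing_children_photos])  # shape (n, 128)
    B = np.stack([embed_face(p) for p in institution_photos])       # shape (m, 128)
    similarity = A @ B.T              # cosine similarity, since the embeddings are unit-length
    matches = []
    for i in range(similarity.shape[0]):
        j = int(similarity[i].argmax())            # closest candidate in the institutions database
        if 1 - similarity[i, j] <= max_distance:   # accept only sufficiently close pairs
            matches.append((i, j))
    return matches
```

Every candidate pair still needs human verification, of course; the threshold only controls how many candidates the humans have to look at.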

This is clearly a good thing. But does it provide an overwhelming argument for India’s plan to construct one of the world’s largest facial-recognition systems with a unitary database accessible to police forces in 29 states and seven union territories?

I don’t think so. If one takes Kranzberg’s First Law seriously, then each proposed use of a powerful technology like this has to face serious scrutiny. The more important question to ask is the old Latin one: cui bono? Who benefits? Who benefits the most? And who loses? What possible unintended consequences could the deployment have? (Recognising that some will, by definition, be unforeseeable.) What are the business models of the corporations proposing to deploy it? And so on.

At the moment, however, all we mostly have is unasked questions, glib assurances and rash deployments.

What if AI could write like Hemingway?

This morning’s Observer column:

Last February, OpenAI, an artificial intelligence research group based in San Francisco, announced that it had been training an AI language model called GPT-2, and that the model now “generates coherent paragraphs of text, achieves state-of-the-art performance on many language-modelling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering, and summarisation – all without task-specific training”.

If true, this would be a big deal…

Read on
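
If you want to poke at the model yourself, here is a minimal sketch of sampling text from the publicly released (small) version of GPT-2. The tooling — the open-source Hugging Face transformers library — is my choice for illustration, not anything mentioned in the column:

```python
# Sample a continuation from the small, publicly released GPT-2 checkpoint.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The old man looked out at the sea and"
inputs = tokenizer.encode(prompt, return_tensors="pt")
outputs = model.generate(
    inputs,
    max_length=80,                        # prompt plus continuation, in tokens
    do_sample=True,                       # sample rather than greedy-decode
    top_k=50,
    temperature=0.9,
    pad_token_id=tokenizer.eos_token_id,  # silences a padding warning
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Whether the result reads anything like Hemingway is, of course, another matter.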

Excavating AI

Fabulous essay by Kate Crawford and Trevor Paglen, uncovering the politics and biases embedded in the huge image databases that have been used for training machine learning software. Here’s how it begins:

You open up a database of pictures used to train artificial intelligence systems. At first, things seem straightforward. You’re met with thousands of images: apples and oranges, birds, dogs, horses, mountains, clouds, houses, and street signs. But as you probe further into the dataset, people begin to appear: cheerleaders, scuba divers, welders, Boy Scouts, fire walkers, and flower girls. Things get strange: A photograph of a woman smiling in a bikini is labeled a “slattern, slut, slovenly woman, trollop.” A young man drinking beer is categorized as an “alcoholic, alky, dipsomaniac, boozer, lush, soaker, souse.” A child wearing sunglasses is classified as a “failure, loser, non-starter, unsuccessful person.” You’re looking at the “person” category in a dataset called ImageNet, one of the most widely used training sets for machine learning.

Something is wrong with this picture.

Where did these images come from? Why were the people in the photos labeled this way? What sorts of politics are at work when pictures are paired with labels, and what are the implications when they are used to train technical systems?

In short, how did we get here?

The authors begin with a deceptively simple question: What work do images do in AI systems? What are computers meant to recognize in an image and what is misrecognised or even completely invisible? They examine the methods used for introducing images into computer systems and look at “how taxonomies order the foundational concepts that will become intelligible to a computer system”. Then they turn to the question of labeling: “how do humans tell computers which words will relate to a given image? And what is at stake in the way AI systems use these labels to classify humans, including by race, gender, emotions, ability, sexuality, and personality?” And finally, they turn to examine the purposes that computer vision is meant to serve in our society and interrogate the judgments, choices, and consequences of providing computers with these capacities.
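
One detail that helps in following their argument: ImageNet’s categories are taken from WordNet’s taxonomy of noun “synsets”, so the loaded labels in the “person” branch come with that taxonomy. Here is a small sketch — my illustration, using NLTK’s WordNet interface, not the authors’ code — for browsing that branch yourself:

```python
# Browse part of WordNet's "person" subtree, the taxonomy from which ImageNet's
# person categories are drawn.
import nltk
nltk.download("wordnet", quiet=True)
from nltk.corpus import wordnet as wn

def walk(synset, depth=0, max_depth=1):
    """Print a synset's lemma names, then recurse into its hyponyms (narrower terms)."""
    print("  " * depth + ", ".join(synset.lemma_names()))
    if depth < max_depth:
        for child in synset.hyponyms():
            walk(child, depth + 1, max_depth)

walk(wn.synset("person.n.01"))   # prints "person" and its immediate sub-categories
```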

This is a really insightful and sobering essay, based on extensive research.

Some time ago Crawford and Paglen created an experimental website — ImageNet Roulette — which enabled anyone to upload a photograph and see how ImageNet would classify the person in it. The site is now offline, but the Guardian journalist Julia Carrie Wong wrote an interesting article about it recently, in the course of which she investigated how it would classify her from her Guardian byline photo. Her piece describes what she found.

Interesting, n’est-ce pas? Remember, this is the technology underpinning facial recognition.

Do read the whole thing.