Bias in machine learning

Nice example from Daphne Koller:

Another notion of bias, one that is highly relevant to my work, are cases in which an algorithm is latching onto something that is meaningless and could potentially give you very poor results. For example, imagine that you’re trying to predict fractures from X-ray images in data from multiple hospitals. If you’re not careful, the algorithm will learn to recognize which hospital generated the image. Some X-ray machines have different characteristics in the image they produce than other machines, and some hospitals have a much larger percentage of fractures than others. And so, you could actually learn to predict fractures pretty well on the data set that you were given simply by recognizing which hospital did the scan, without actually ever looking at the bone. The algorithm is doing something that appears to be good but is actually doing it for the wrong reasons. The causes are the same in the sense that these are all about how the algorithm latches onto things that it shouldn’t latch onto in making its prediction.

To recognize and address these situations, you have to make sure that you test the algorithm in a regime that is similar to how it will be used in the real world. So, if your machine-learning algorithm is one that is trained on the data from a given set of hospitals, and you will only use it in those same set of hospitals, then latching onto which hospital did the scan could well be a reasonable approach. It’s effectively letting the algorithm incorporate prior knowledge about the patient population in different hospitals. The problem really arises if you’re going to use that algorithm in the context of another hospital that wasn’t in your data set to begin with. Then, you’re asking the algorithm to use these biases that it learned on the hospitals that it trained on, on a hospital where the biases might be completely wrong.
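The point about testing "in a regime that is similar to how it will be used" can be made concrete with a toy sketch. Everything in it is an assumption of mine for illustration: the hospitals A, B and C, their fracture rates, and a deliberately lazy "model" that never looks at the bone and only memorises the majority label for each hospital. It is not Koller's method, just a minimal way of showing why a held-out hospital gives a more honest score than a random split.

```python
import random
from collections import Counter, defaultdict

def make_scans(hospital, n, fracture_rate, rng):
    # Synthetic records (hypothetical): (hospital that did the scan, fracture label).
    return [(hospital, rng.random() < fracture_rate) for _ in range(n)]

def fit_shortcut(train):
    # A "model" that ignores the image: majority label per hospital,
    # falling back to the overall majority for a hospital it has never seen.
    votes = defaultdict(Counter)
    for hospital, label in train:
        votes[hospital][label] += 1
    overall = Counter(label for _, label in train).most_common(1)[0][0]
    per_hospital = {h: c.most_common(1)[0][0] for h, c in votes.items()}
    return lambda hospital: per_hospital.get(hospital, overall)

def accuracy(model, scans):
    return sum(model(h) == y for h, y in scans) / len(scans)

rng = random.Random(0)
# Assumed fracture rates: A sees mostly fractures, B mostly none, C is mixed.
seen = make_scans("A", 2000, 0.9, rng) + make_scans("B", 2000, 0.1, rng)
unseen = make_scans("C", 2000, 0.5, rng)

# Random split: train and test both come from hospitals A and B.
rng.shuffle(seen)
train, test_same = seen[:3000], seen[3000:]
model = fit_shortcut(train)

in_acc = accuracy(model, test_same)   # evaluated on the same hospitals
out_acc = accuracy(model, unseen)     # evaluated on a hospital not in training
print(f"same hospitals: {in_acc:.2f}  new hospital: {out_acc:.2f}")
```

On the random split the shortcut looks impressive; on the hospital it has never seen it does little better than a coin toss. Which of the two numbers matters depends, as the quote says, on where the system will actually be used.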

Can the planet afford machine learning as well as Bitcoin?

This morning’s Observer column:

There is, alas, no such thing as a free lunch. This simple and obvious truth is invariably forgotten whenever irrational exuberance teams up with digital technology in the latest quest to “change the world”. A case in point was the bitcoin frenzy, where one could apparently become insanely rich by “mining” for the elusive coins. All you needed was to get a computer to solve a complicated mathematical puzzle and – lo! – you could earn one bitcoin, which at the height of the frenzy was worth $19,783.06. All you had to do was buy a mining kit (or three) from Amazon, plug it in and become part of the crypto future.

The only problem was that mining became progressively more difficult the closer we got to the maximum number of bitcoins set by the scheme and so more and more computing power was required. Which meant that increasing amounts of electrical power were needed to drive the kit. Exactly how much is difficult to calculate, but one estimate published in July by the Judge Business School at the University of Cambridge suggested that the global bitcoin network was then consuming more than seven gigawatts of electricity. Over a year, that’s equal to around 64 terawatt-hours (TWh), which is 8 TWh more than Switzerland uses annually. So each of those magical virtual coins turns out to have a heavy environmental footprint.
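(For anyone who wants to check the conversion from power to annual energy, here is a rough sketch. It assumes a constant draw over a 365-day year; the function names are mine. Exactly seven gigawatts works out at a little over 61 TWh a year, so the 64 TWh figure corresponds to a draw of about 7.3 gigawatts, which fits "more than seven".)

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years

def gw_to_twh_per_year(gigawatts):
    # GW x hours = GWh; 1 TWh = 1,000 GWh. Assumes constant draw all year.
    return gigawatts * HOURS_PER_YEAR / 1000

def twh_per_year_to_gw(terawatt_hours):
    # Inverse conversion: average power implied by an annual energy total.
    return terawatt_hours * 1000 / HOURS_PER_YEAR

print(f"7 GW for a year  = {gw_to_twh_per_year(7.0):.1f} TWh")
print(f"64 TWh in a year = {twh_per_year_to_gw(64.0):.2f} GW average")
```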

At the moment, much of the tech world is caught up in a new bout of irrational exuberance. This time, it’s about machine learning, another one of those magical technologies that “change the world”…

Read on

Kranzberg’s Law

As a critic of many of the ways that digital technology is currently being exploited by both corporations and governments, while also being a fervent believer in the positive affordances of the technology, I often find myself stuck in unproductive discussions in which I’m accused of being an incurable “pessimist”. I’m not: better descriptions of me are that I’m a recovering Utopian or a “worried optimist”.

Part of the problem is that the public discourse about this stuff tends to be Manichean: it lurches between evangelical enthusiasm and dystopian gloom. And eventually the discussion winds up with a consensus that “it all depends on how the technology is used” — which often leads to Melvin Kranzberg’s Six Laws of Technology — and particularly his First Law, which says that “Technology is neither good nor bad; nor is it neutral.” By which he meant that,

“technology’s interaction with the social ecology is such that technical developments frequently have environmental, social, and human consequences that go far beyond the immediate purposes of the technical devices and practices themselves, and the same technology can have quite different results when introduced into different contexts or under different circumstances.”

Many of the current discussions revolve around various manifestations of AI, which means machine learning plus Big Data. At the moment image recognition is the topic du jour. The enthusiastic refrain usually involves citing dramatic instances of the technology’s potential for social good. A paradigmatic example is the collaboration between Google’s DeepMind subsidiary and Moorfields Eye Hospital to use machine learning to greatly improve the speed of analysis of anonymized retinal scans and automatically flag ones which warrant specialist investigation. This is a good example of how to use the technology to improve the quality and speed of an important healthcare service. For tech evangelists it is an irrefutable argument for the beneficence of the technology.

On the other hand, critics will often point to facial recognition as a powerful example of the perniciousness of machine-learning technology. One researcher has even likened it to plutonium. Criticisms tend to focus on its well-known weaknesses (false positives, racial or gender bias, for example), its hasty and ill-considered use by police forces and proprietors of shopping malls, the lack of effective legal regulation, and on its use by authoritarian or totalitarian regimes, particularly China.

Yet it is likely that even facial recognition has socially beneficial applications. One dramatic illustration is a project by an Indian child labour activist, Bhuwan Ribhu, who works for the Indian NGO Bachpan Bachao Andolan. He launched a pilot program 15 months ago to match a police database containing photos of all of India’s missing children with another one comprising shots of all the minors living in the country’s child care institutions.

The results were remarkable. “We were able to match 10,561 missing children with those living in institutions,” he told CNN. “They are currently in the process of being reunited with their families.” Most of them were victims of trafficking, forced to work in the fields, in garment factories or in brothels, according to Ribhu.

This was made possible by facial recognition technology provided by New Delhi’s police. “There are over 300,000 missing children in India and over 100,000 living in institutions,” he explained. “We couldn’t possibly have matched them all manually.”

This is clearly a good thing. But does it provide an overwhelming argument for India’s plan to construct one of the world’s largest facial-recognition systems with a unitary database accessible to police forces in 29 states and seven union territories?

I don’t think so. If one takes Kranzberg’s First Law seriously, then each proposed use of a powerful technology like this has to face serious scrutiny. The more important question to ask is the old Latin one: Cui bono? Who benefits? And who benefits the most? And who loses? What possible unintended consequences could the deployment have? (Recognising that some will, by definition, be unforeseeable.) What are the business models of the corporations proposing to deploy it? And so on.

At the moment, however, all we mostly have is unasked questions, glib assurances and rash deployments.

What if AI could write like Hemingway?

This morning’s Observer column:

Last February, OpenAI, an artificial intelligence research group based in San Francisco, announced that it has been training an AI language model called GPT-2, and that it now “generates coherent paragraphs of text, achieves state-of-the-art performance on many language-modelling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering, and summarisation – all without task-specific training”.

If true, this would be a big deal…

Read on

Excavating AI

Fabulous essay by Kate Crawford and Trevor Paglen, uncovering the politics and biases embedded in the huge image databases that have been used for training machine learning software. Here’s how it begins:

You open up a database of pictures used to train artificial intelligence systems. At first, things seem straightforward. You’re met with thousands of images: apples and oranges, birds, dogs, horses, mountains, clouds, houses, and street signs. But as you probe further into the dataset, people begin to appear: cheerleaders, scuba divers, welders, Boy Scouts, fire walkers, and flower girls. Things get strange: A photograph of a woman smiling in a bikini is labeled a “slattern, slut, slovenly woman, trollop.” A young man drinking beer is categorized as an “alcoholic, alky, dipsomaniac, boozer, lush, soaker, souse.” A child wearing sunglasses is classified as a “failure, loser, non-starter, unsuccessful person.” You’re looking at the “person” category in a dataset called ImageNet, one of the most widely used training sets for machine learning.

Something is wrong with this picture.

Where did these images come from? Why were the people in the photos labeled this way? What sorts of politics are at work when pictures are paired with labels, and what are the implications when they are used to train technical systems?

In short, how did we get here?

The authors begin with a deceptively simple question: What work do images do in AI systems? What are computers meant to recognize in an image and what is misrecognised or even completely invisible? They examine the methods used for introducing images into computer systems and look at “how taxonomies order the foundational concepts that will become intelligible to a computer system”. Then they turn to the question of labeling: “how do humans tell computers which words will relate to a given image? And what is at stake in the way AI systems use these labels to classify humans, including by race, gender, emotions, ability, sexuality, and personality?” And finally, they turn to examine the purposes that computer vision is meant to serve in our society and interrogate the judgments, choices, and consequences of providing computers with these capacities.

This is a really insightful and sobering essay, based on extensive research.

Some time ago Crawford and Paglen created an experimental website — ImageNet Roulette — which enabled anyone to upload their photograph and then pulled up from the ImageNet database how the person would be classified based on their photograph. The site is now offline, but the Guardian journalist Julia Carrie Wong wrote an interesting article about it recently in the course of which she investigated how it would classify/describe her from her Guardian byline photo. Here’s what she found.

Interesting, n’est-ce pas? Remember, this is the technology underpinning facial recognition.

Do read the whole thing.

Quote of the Day

“For all the progress made, it seems like almost all important questions in AI remain unanswered. Many have not even been properly asked yet.”

Francois Chollet


Quote of the Day

“It’s absurd to believe that you can become world leader in ethical AI before becoming world leader in AI first”

Ulrike Franke, policy fellow at the European Council on Foreign Relations.

Google’s big move into ethics-theatre backfires.

This morning’s Observer column:

Given that the tech giants, which have been ethics-free zones from their foundations, owe their spectacular growth partly to the fact that they have, to date, been entirely untroubled either by legal regulation or scruples about exploiting taxation loopholes, this Damascene conversion is surely something to be welcomed, is it not? Ethics, after all, is concerned with the moral principles that affect how individuals make decisions and how they lead their lives.

That charitable thought is unlikely to survive even a cursory inspection of what is actually going on here. In an admirable dissection of the fourth of Google’s “principles” (“Be accountable to people”), for example, Prof David Watts reveals that, like almost all of these principles, it has the epistemological status of pocket lint or those exhortations to be kind to others one finds on evangelical websites. Does it mean accountable to “people” in general? Or just to Google’s people? Or to someone else’s people (like an independent regulator)? Answer comes there none from the code.

Warming to his task, Prof Watts continues: “If Google’s AI algorithms mistakenly conclude I am a terrorist and then pass this information on to national security agencies who use the information to arrest me, hold me incommunicado and interrogate me, will Google be accountable for its negligence or for contributing to my false imprisonment? How will it be accountable? If I am unhappy with Google’s version of accountability, to whom do I appeal for justice?”

Quite so. But then Google goes and doubles down on absurdity with its prestigious “advisory council” that “will consider some of Google’s most complex challenges that arise under our AI Principles, such as facial recognition and fairness in machine learning, providing diverse perspectives to inform our work”…

Read on

After I’d written the column, Google announced that it was dissolving its ethics advisory council. So we had to add this:

Postscript: Since this column was written, Google has announced that it is disbanding its ethics advisory council – the likely explanation is that the body collapsed under the weight of its own manifest absurdity.

That still leaves the cynical absurdity of Google’s AI ‘principles’ to be addressed, though.