[MUSIC – THEME, “SWAY”] (SINGING) When you walk in the room, do you have sway?
I’m Kara Swisher, and you’re listening to “Sway.” My guest today is the Nobel Prize-winning psychologist Daniel Kahneman. He’s done more than just about anyone else alive to change the way we think about the way we think. He started with unconscious bias, publishing a series of groundbreaking studies in the 1970s with his collaborator Amos Tversky. In his latest book, which he co-authored with Olivier Sibony and Cass Sunstein, Kahneman diagnoses another flaw in human judgment — something called noise. In fact, that’s the title of the new book. [MUSIC PLAYING]
Daniel Kahneman, welcome.
Glad to be here.
Before we start, I want you to sort of review your last book. “Thinking, Fast and Slow” was an enormous hit when it came out in 2011. Why do you think it did as well as it did and caught on so much?
Well, I believe that people recognize themselves in it. We all feel that there are those two obviously different ways in which ideas come to mind. When you say 2 plus 2, something comes to your mind. And 17 times 24, well, no number comes to your mind. Some ideas just happen to you — like 2 plus 2. Some ideas, you’ve got to generate, you’ve got to produce. And that’s mental work. So there is a distinction drawn there between System 1 and System 2. The System 1 thoughts are those thoughts that happen to you. And System 2 is the more effortful thoughts that take work to produce. And System 2 also is involved in self-control.
So you say they’re slow, they’re deliberate, they’re analytical, consciously effortful. And the first one is you just know. We think of it as intuitive and unconscious.
It’s intuitive, but perception is System 1, and driving is done by System 1; it’s skilled behavior. In System 1, when we talk to each other, and we have a vague idea, and words pop out of our mouths, that is automatic, and that is System 1.
So one of the things you posited is that System 1 tends to take over much of deliberation.
Well, the idea was that System 1 is continuously active, and it makes suggestions. It has an interpretation, it has a model of the world, quite a rich model of the world. That’s what it’s for. And System 2 monitors System 1, so that not every thought that comes to your mind you will say. Fortunately, we can inhibit ourselves. I describe System 2 as an editor of System 1. So the System 1 comes up with copy, and the editor can accept it or reject it or anything. But a lot of the time, System 2 simply accepts it, so thoughts come to our mind, and we accept those thoughts.
Your new book is about noise. Now I want you to quickly define noise for people to understand what you’re talking about.
Noise in general is unwanted variability. That is, when there is a judgment or a measurement or a decision, and there is variability, and the variability can be across occasions. When the same person judges the same object many times and reaches different conclusions, that’s one kind of noise. And the other kind of noise is what we call system noise. So we have the judicial system, and it passes sentences on defendants and criminals. And you want it to function so that the same crime should be punished the same way by different judges and not be affected. And it’s not, it’s affected by the judge’s tastes, by the judge’s ideological position, by the weather —
By what they had for lunch.
By what they had for lunch.
By their commute. So that’s noise. How big a problem is noise compared to bias? Because bias enters individual decisions — you’re saying — and noise is across a system that doesn’t give the same results, and it can be removed from the system. Noise can be quieted, right?
When you look at the system, and the system is trying to make accurate judgments, the system can fail in two different ways. It can fail by being biased, that is, if judgments are systematically too high or too low. But it can also fail by being variable. So suppose that your scale is not biased. On average, it gives the right number. But when you step on it, step off, and step on again, you don’t get the same number. This is certainly true of my bathroom scale, probably because it’s not expensive enough, but that is a noisy bathroom scale. It’s variable. And on average, it could be unbiased, but it’s clear that the variability itself is inaccuracy. So in global error, bias and noise have the same weight, and they are completely independent of each other, so that you can improve the overall accuracy of judgment either by decreasing bias or by decreasing noise. And in both cases, you are going to decrease overall error.
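Kahneman’s point that bias and noise “have the same weight” in global error is the standard mean-squared-error decomposition: overall error (MSE) equals bias squared plus noise squared. A minimal Python sketch with invented bathroom-scale readings:

```python
import statistics

# Hypothetical readings from a noisy bathroom scale; true weight is 70.0 kg.
true_weight = 70.0
readings = [71.4, 68.9, 70.8, 69.2, 70.7]

bias = statistics.mean(readings) - true_weight   # systematic error
noise = statistics.pstdev(readings)              # variability (std. deviation)
mse = statistics.mean((r - true_weight) ** 2 for r in readings)

# MSE decomposes exactly into bias squared plus noise squared,
# so shrinking either term reduces overall error.
assert abs(mse - (bias ** 2 + noise ** 2)) < 1e-9
```

Because the two terms are independent, a scale (or a judge) can be unbiased on average and still inaccurate on every single occasion.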
Absolutely. Noise is easier to fix, though, than bias, correct?
Absolutely. And noise is easier to fix than bias because you can measure noise without knowing the true answer.
So if I’m a CEO, how do I figure out how much of a problem noise is in my company versus other issues? What is a noise audit?
A noise audit is really an experiment such as the one that I describe where you present the same problem to many professionals. They perform the same role. They all speak on behalf of the organization. And if they don’t agree in what they say on several of these cases, that’s what a noise audit is. So you could perform a noise audit on underwriters or on radiologists. You would present them the same X-ray or the same MRI and ask them to determine a diagnosis. And if they don’t agree on the diagnosis, that’s noise.
There’s an example in the book of a noise audit you did at an insurance company. Why don’t you talk about this: how common is noise in decision-making? And explain it via the insurance company noise audit you were brought in to do.
So what we did was the executives in the company constructed cases that were very realistic of the cases that —
What they might decide to insure different things —
What they would do. And then they presented those cases to a substantial number of underwriters. Now, the question that I asked some executives is the following: suppose you take two underwriters at random, just a pair of underwriters, how much of a difference do you expect to find relative to the average of their judgments? And people have a number. It’s a number that comes to most people’s minds, which is quite striking — about 10%. 10% is a number that we think, if you have two people make judgments, you don’t expect them to agree perfectly, because a matter of judgment allows disagreement almost by definition. But you expect a small difference. Now, in that study of underwriters, the difference between two randomly chosen underwriters was 55%. So it’s more than five times as much as the executives had expected.
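The figure Kahneman quotes is an average of pairwise relative differences. A minimal sketch of how such a noise index could be computed, using invented premium quotes (the real study’s data and exact formula may differ in detail):

```python
from itertools import combinations

# Hypothetical premiums (in dollars) quoted by five underwriters for one case.
quotes = [9500, 16700, 12000, 8800, 13300]

# For each pair of underwriters, take the difference between their quotes
# as a fraction of the pair's average quote; the noise index is the mean
# of that ratio over all pairs.
ratios = [abs(a - b) / ((a + b) / 2) for a, b in combinations(quotes, 2)]
noise_index = sum(ratios) / len(ratios)

print(f"average relative difference: {noise_index:.0%}")
```

For these invented quotes the index comes out around 32%, already far above the roughly 10% that executives in the study expected.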
Right. So two different people were making two different decisions.
Wildly different. So this is the striking amount of noise.
So this was a surprise to these executives, that there was so much of a difference. They thought they had a pretty substantively rational system in place to make underwriting decisions, but it wasn’t at all. It was completely all over the map, correct?
That was actually the more important result, I think, in some ways. Certainly, the most surprising to me was that this finding was news. It was a big problem, obviously. And they immediately recognized that there’s a big problem. They had been completely unaware that they had that problem.
Why should we be worried about noisy decisions? If you err on the high side today and the low side tomorrow, don’t things average out? That’s what people always say — things average out.
Well, things don’t average out. That is, when one underwriter sets too high a premium, and the other underwriter sets too low a premium, they don’t cancel out. That’s two mistakes. So that is what noise is.
Is mistake after mistake —
It’s mistake after mistake in different directions. That’s why it’s variability and not bias, is that some mistakes are one way, the other mistakes are the other way. But they are mistakes, all of them.
So when you’re a consumer — I can see why noise would be a problem for the insurance companies because it’s not as standardized as they think it is. But as a consumer, isn’t noise how I get a good deal, or they make a mistake?
Oh, I think as a consumer of insurance, you would hate the idea that the number that you are going to get is determined by a lottery. And you would certainly hate that if you are a defendant before a judge, to know that they will add or subtract a few years from your sentence. People wouldn’t like that.
Are there qualities that make one place noisier than others, or can noise enter the picture anywhere? I’d say a political administration — it feels like the Biden administration, for example, is less noisy than the Trump administration. It may not be.
It may not be. An administration that is very consistent and predictable doesn’t suffer from a problem of noise. It could suffer from a major problem if the decisions are wrong. But the problem of noise is when different agencies within the government sort of trip over each other or reach decisions that are wildly inconsistent, or when the defense and foreign affairs organizations are inconsistent with each other. And this happens quite a lot.
I can see how you would measure noise in a situation where there is one correct answer — like weather forecasting or cancer diagnosis or radiology or insurance things. Are there contexts where there is no right answer necessarily?
Let me give you an example: the judicial system. Is there a right sentence, or isn’t there a right sentence? Finding the right sentence, or defining it, is very difficult. But what I think you can say, independently of that, is that variability in the sentences for the same crime violates our sense of fairness and justice. So that noise is really bad in itself, regardless of whether there is a right answer. Variability is not always bad. When you have different film critics, you don’t want them all to be the same. There are many situations in which you want people to think differently. So it’s only in those situations where you want people to think alike that noise is a problem.
So in a judicial system, where one defendant might get a different sentence than another for the same crime, is that a very big problem?
Whatever we call it — in one noise audit that we describe in detail in the book, 208 federal judges were presented with the same 16 simplified problems, and they had to pass a sentence. The average prison sentence was seven years. But when you take two judges at random, looking at the same case, and ask by how much they differ, the average difference was more than three and a half years. That is enormous.
But if noise is a product of different judges’ ideologies about punishment and deterrence — in other words, if you had all liberal or all conservative judges, you’d have less noise, correct?
Certainly. But interestingly enough, I don’t think this is the largest source of noise. So you really have three sources of noise. One is differences in the average level. It’s like your scale which overstates or understates. The other one, we call it occasion noise — the same judge on different occasions, depending on whether the football team won or lost, or it’s early or late in the day. That’s occasion noise. The biggest source of noise — and this was a very substantial surprise to us — is differences in taste. So some judges are severe with younger people, others with older people. Some are shocked by fraud. Some are very influenced by who the victim is. There are many differences. It’s not only lenient versus severe, but each judge has a personality. For example, you may have a judge who is sympathetic to young defendants or to first-time defendants and really very harsh on repeat defendants, and another who doesn’t have that feeling, who actually thinks that young defendants should be punished severely to deter them from future crime or whatever. You can have a judge who happens to have a child, and some defendants remind that judge of her child. That will cause that judge to make different judgments than others.
Sure. That’s called humanity, I believe. I believe that’s what that’s called. The problem is humans are noisy, is what you’re saying —
Humans are noisy. Humans have different — what we call — judgment personalities. And that is where a lot of noise comes from.
Whether it’s a judicial system, or a company, or even in social relations.
That’s right. We look at the same world, and we look at it with confidence. I feel that I’m right in most of my judgments, and I’m truthful to you. I respect my colleagues, and I like them. And they are looking at the same world. I expect them to see the same world that I see. But in fact, they don’t. That’s the surprise.
I have an expression where I say, people, they don’t believe what they see, they see what they believe.
And different people believe different things.
That’s right, on top of that. So the problem is humans, I guess. So one of the things you talk about is algorithms making these decisions — which is very controversial, obviously. Do you think algorithms are better at producing good decisions than humans?
It’s very clear that most algorithms are noise-free, in the sense that if you presented the algorithm with the same problem on two occasions, you’re very likely to get the same answer. And by the way, where algorithms are better than people, it is very often because of noise, because they are noise-free. So that by itself improves the accuracy of algorithms relative to people. In many domains — this is already happening — you can compare the performance of the algorithm to the performance of people: provide the same information to the algorithm and to the person, and have them make predictions or judgments. And when you can make that comparison, algorithms tend to win.
Tend to win and be the correct decision.
Absolutely. So for example, there’s a massive study of bail judges, of thousands of judges, half a million decisions, a very large study comparing artificial intelligence to human judges. If you let artificial intelligence use the information, you would get objectively better decisions. That is, you would get fewer people spending time in jail, and you would get fewer people released and committing crimes. You could actually improve at both ends.
That’s the hope.
That’s not the hope, that’s a reality.
When it comes to this. But when you go back to sentencing, if you programmed, say, a punitive theory of justice to produce prison sentences that were less noisy, isn’t it just as unfair?
Well, people also make mistakes. So you would try — and it’s very clear that our demands from algorithms are a lot higher than our demands from people, so it wouldn’t be enough for a self-driving car to be as safe as the average driver, it’s going to have to be safer than the average driver by a huge factor before we let those things on the streets. So we’re really biased against algorithms, which I think is fine. I mean, I share that bias.
Talk about that a little bit, because I’ve spent a lot of time talking with the guys who are building AI to run our daily lives, and I’m not confident they’re going to program them correctly. You know what I mean? Because ultimately, one of them told me — which is a very common term — “Crap in, crap out.” You know what I mean? So if you don’t trust algorithms, but you think they make better decisions, can you explain that gulf?
Well, the key in constructing algorithms is not building biases into the algorithm. For example, if you have an algorithm that tries to maximize public safety, but it measures risk using arrests as a proxy for crimes, then if the police have been arresting more Black people, there will be bias in the algorithm. This is something that very careful construction of algorithms can control. It is really unacceptable. Nobody would speak in favor of that. But we shouldn’t infer that it’s a built-in characteristic of algorithms that they have to be biased.
Right, because we make them biased, in other words. There’s been a lot of research on the ways that seemingly neutral algorithms, ones supposedly programmed to be neutral, produced systemic racism. At the same time, though, algorithmic decision-making is happening more and more. As we start to rely on it, how do we solve that problem?
Well, in some cases, some of these issues really don’t have an obvious solution. If you leave it with humans, humans make mistakes. It’s not as if only algorithms make mistakes. Somehow, the mistakes that humans make are more tolerable to us and more acceptable.
Why is that? Why do we tolerate humans doing it, even if the mistakes are more damaging?
There is really a fundamental distinction that we draw between what is natural and what is man-made or artificial. And you can see that with the vaccine. If you have a vaccine that occasionally kills people, it’s not enough for the vaccine to save a little more — a few more people than it kills, it would have to save hundreds of times more people than it kills. Six cases were enough to stop the use of the Johnson & Johnson vaccine. So we really find errors by man-made systems — in this case, a vaccine, or the self-driving car — we find those mistakes much harder to tolerate than human mistakes or than acts of God. [MUSIC PLAYING]
We’ll be back in a minute. If you like this interview and want to hear others, follow us on your favorite podcast app. You’ll be able to catch up on “Sway” episodes you may have missed — like my conversation with Jane Goodall — and you’ll get new ones delivered directly to you. More with Daniel Kahneman after the break. [MUSIC PLAYING]
What do we need to make people believe in algorithms more? Because it does feel sinister to a lot of people, especially, say, in the context of China, where they’re making algorithmic decisions that may be accurate but could be sinister. I had someone pose the idea that someone in China could correctly predict that a person was going to be seditious toward the Chairman of the party, for example. Let’s get rid of them now, before what’s going to happen happens. It reminds you of “Minority Report” — I think we talked about this — that we know what’s going to happen, pretty much, and the algorithms are pretty much right. It has a sinister feel to it, that you prejudge someone.
This is something that we do a lot. We do it with mentally ill people. We think they are dangerous. We don’t wait until they kill someone. We say they are dangerous, and we send them to treatment involuntarily. So it is just very different when we do it. It feels just, or it feels sensible. But the idea in “Minority Report” or in that example, that feels shocking. And in part, it’s an irrational response. But in part, that’s a moral attitude that we have. I share it. There is a dilemma here. And by the way, attitudes are changing, so algorithms are becoming gradually more accepted in more and more domains. We get used to them, and we accept them. And we begin to tolerate their mistakes. The point of our book — I should stress — is we’re not proposing that algorithms should take over just because they’re not noisy. We assume that for the foreseeable future, people will go on making the important decisions, and we’re perfectly happy with that. And we see our task as improving decisions, not as substituting algorithms for people.
Although ultimately, if algorithms were perfect, you’d get humans out. One of the things I often say about the problem with self-driving cars is that people are still driving. Once the machines are in charge, when one car gets in an accident, a million cars will learn. When one human makes a noisy judgment, it doesn’t affect anybody else whatsoever.
No, I mean, this is true in general of artificial intelligence: their capacity for learning is vastly higher than humans’, which is why when an artificial intelligence comes close to human performance, you can bet with high confidence that within a few years, the AI will be better.
So let’s talk about what — since humans are still in charge for just a short time until the machines take over, what are some ways that a company or individual can reduce noise? I’m going to give you a quick-fire. These are some tips from your book. So decision hygiene, what’s that?
An example of decision hygiene is when you have different people looking at a problem, you want them to remain independent of each other until the very end. The more independent they are, the more information you get.
So you want to have independent judgments and then aggregate them together?
Aggregate them at the end. When you’re looking at a problem, you want to delay intuition. You want to delay a global view of what the issue is until a lot of information has been processed because otherwise, we tend to jump to conclusions.
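The statistical basis for aggregating independent judgments is that averaging n independent, unbiased estimates shrinks the noise by roughly a factor of the square root of n. A quick simulation, with all the numbers invented:

```python
import random
import statistics

random.seed(7)
true_value = 100.0

def judgment():
    # One unbiased but noisy judgment: the true value plus random error (sd = 10).
    return random.gauss(true_value, 10.0)

def averaged_judgment(n):
    # Aggregate n independent judgments by averaging them at the end.
    return statistics.mean(judgment() for _ in range(n))

# Noise (standard deviation) of single vs. averaged judgments, over many trials.
single = statistics.pstdev(judgment() for _ in range(10_000))
pooled = statistics.pstdev(averaged_judgment(9) for _ in range(10_000))

# Averaging 9 independent judgments should cut noise roughly threefold.
print(f"single-judgment noise: {single:.1f}, averaged over 9: {pooled:.1f}")
```

The reduction only holds if the judgments are truly independent, which is why Kahneman insists that people not talk to each other before judging.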
I never notice that in any meeting I’ve been in, but go ahead. [CHUCKLING] It’s every meeting.
There is a lot of research on interviews showing that the interviewer actually forms an impression of the interviewee within the first three minutes and spends the rest of the interview confirming that impression. That’s a waste of time.
Ah, interesting. Although I do think you’re more brilliant than I am, I do. I’m sorry it’s going to continue throughout this interview. So sequencing information, talk about that.
It’s the same principle as keeping people independent. So when you have a problem — and it could be hiring someone, or it could be investing in a firm — there are different attributes of that person or that investment that you want to consider. And the recommendation is to consider those attributes one at a time, and to reach a conclusion about each attribute independently of the others, and not to try to form a global impression of the case — whether it’s an investment or an individual — until a lot of the information is in, and then you can have an intuition. That’s what we call delaying intuition. And then you can make a global judgment. But first, collect information.
All right, judgment guidelines.
Well, judgment guidelines are just a way of constraining noise, a very direct way of constraining noise. And that was applied to sentencing. There were sentencing guidelines that restricted, to some extent, the freedom of judges. They said, for a given offense, this is the likely range of punishments. And I must say, this was very distressing. Judges hated it. They really didn’t like it. But it did reduce noise. There were many indications that the variance of judgments for the same crimes diminished considerably. Then, as it turns out, the Supreme Court ruled against them, but on a technicality. So noise is increasing, and judges are happier. So there is a cost to reducing noise. It’s clear that in some situations — the judicial system in particular — there is a sense that individuals who make those judgments want to be left alone, and they want to make their judgments.
I don’t know if you know this, they know best. Just so you know, that’s what I’m often told by tech people — they know best, Kara. Stop complaining. Do we need to devalue experience and gut reaction? Do we need to devalue it completely in any case?
Not completely, but we know that the confidence that people have in how valid their judgments are is really no guarantee of accuracy. Subjectively, people are not very good at telling when they are right and when they are wrong. So —
I always say frequently wrong, but never in doubt.
Yeah. And it’s the case, by the way, that we tend to be influenced by people who are self-confident, much more influenced than we should be. So controlling that would be a good thing.
Controlling self-confident people?
Controlling the role of different people in making decisions, so that the decisions are not influenced too much by overconfident people speaking first and determining the agenda.
Oh, I’m screwed then. Uh oh, they’re going to get me. So speaking of which, taking the outside view, this is a thing I think is critically important. I’ve often told tech people they should have someone in the room who disagrees with them, to challenge what they do. And they seldom take me up on that. Talk about what taking an outside view is from your perspective.
The outside view is this: when you have a problem, for example, you’re trying to forecast how much a given building will cost, there is one way from within, the inside view. You are computing the cost. But another way would be to look at comparable projects in the past and to ask, by how much did we underestimate costs? You will see that in general, in projects, people underestimate costs and underestimate the time it will take to complete a project. This is true for kitchen renovations. Kitchen renovations cost twice as much as people budget when they start. Taking the outside view, you would know when you enter the project: this is my budget, but I’m going to spend more.
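The outside-view correction Kahneman describes amounts to scaling an inside-view estimate by the historical overrun ratio of comparable projects. A toy sketch with invented figures:

```python
# Inside view: your itemized budget for the kitchen renovation.
inside_estimate = 30_000

# Outside view: for comparable past projects, actual cost / budgeted cost.
past_overruns = [1.8, 2.3, 1.9, 2.1]
overrun_ratio = sum(past_overruns) / len(past_overruns)

# Adjust the inside-view budget by what projects like this actually cost.
adjusted_estimate = inside_estimate * overrun_ratio
print(f"budget: ${inside_estimate:,}, expect to spend about ${adjusted_estimate:,.0f}")
```

This is the simplest form of what forecasting researchers call reference-class forecasting: the class of comparable past projects, not your itemized plan, anchors the prediction.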
Right. So give me an example of noise reduction you’ve seen that’s made a dramatic difference.
Well, I think a fairly dramatic difference is in hiring — what sort of interviews you conduct, and how many interviews you conduct, and how you integrate the information. And it turns out that you want interviews that are structured. A structured interview is one in which you collect information in a sequenced and orderly way and evaluate different attributes one at a time. You want independent interviewers. You don’t want them talking to each other before they have made their judgments. And then you want a process in which they discuss their differences and reach a final conclusion. The best way of doing hiring has very general implications. We use the phrase, “In any decision, options are like candidates.” And I think we have a pretty good idea of how best to pick among candidates. And if we apply those ideas to picking options, we would do better than we do now.
You also write that people don’t want to be treated as if they are mere things or cogs in some kind of machine. Tell me about a situation where noise reduction doesn’t work or shouldn’t be used.
Well, we’re thinking of an organization. It has conducted a noise audit, so it knows it has a noise problem. And it is taking a group of employees, and it wants those employees to act more uniformly. So it’s going to suggest some procedures. And if it’s heavy-handed in the suggestion of procedures, it’s going to be perceived by the employees as a bureaucratic intervention in stuff that they’re doing, and they will resist it, and they will sabotage it. So what you would want is, you would want the employees themselves — to a very large extent — to be involved in any procedure changes that are designed to reduce noise.
So the key part is involving the people making the decisions, without offending them, right? Because people do get offended if you say their decision-making is bad, and now we’re going to fix it.
Oh, you want to involve people from the noise audit on. When it turns out that people are different from each other, you want to make that interesting. You want to make it a source of wonder, and then try to figure it out, and then try jointly to reduce it.
What do you think are the big questions in psychology that researchers still need to ask? What would you do if you had 50 years to look at something, Daniel Kahneman?
If I were starting my career now, I would be choosing between artificial intelligence and neuroscience because those are now particularly exciting ways of looking at human nature.
And what about neuroscience is exciting to you?
This is probably where the future is: understanding human nature at a more molecular level, at a more granular level. This is beginning to happen from all directions — the study of individual neurons, the study of patterns, the study of epigenetics. So much is happening. The study of the human brain is really in its infancy. A century from now, people will know things that we can’t even imagine and will answer questions that we cannot even pose.
Will we know how we think? Will we actually have a map of how we think, do you think?
We’ll know vastly more than we do now. Psychology is not a waste of time. I think we have learned quite a bit. But we are poised to learn so much more, that it’s going to be very exciting for the people in that field.
Daniel Kahneman, thank you so much. It’s so riveting.
My pleasure. [MUSIC PLAYING]
“Sway” is a production of New York Times Opinion. It’s produced by Nayeema Raza, Blakeney Schick, Matt Frassica, Heba Elorbany, Matt Kwong, and Daphne Chen, edited by Nayeema Raza and Paula Szuchman, with original music by Isaac Jones, mixing by Erick Gomez, and fact-checking by Kate Sinclair. Special thanks to Shannon Busta, Kristin Lin, and Liriel Higa. If you’re in a podcast app already, you know how to get your podcasts, so follow this one. If you’re listening on the Times website and want to get each new episode of “Sway” delivered to you as fast as a System 1 thought, download any podcast app and search for “Sway” and follow the show. We release every Monday and Thursday. Thanks for listening. [MUSIC PLAYING]