What if whether you went to prison, or walked free, hinged on a mathematical theorem? In this podcast, Tom Rocks Maths intern Alex Homer explores what’s happened when lawyers have attempted to introduce probability into the courtroom, and what this means for justice in an increasingly data-driven world. His guests are Prof William Thompson, Professor Emeritus of Criminology, Law and Society; Psychology and Social Behavior; and Law at the University of California, Irvine, and Prof Norman Fenton, Director of the Risk Information Management Research Group at Queen Mary University of London.
The podcast carries a content note for discussion of rape and murder. Though it discusses legal matters and all efforts have been made to ensure accuracy, no part of it constitutes legal advice.
(The following has been lightly edited for clarity.)
Alex Homer (narrator): What if whether you went to prison, or walked free, hinged on a mathematical theorem? This is a real question that has faced the courts of England and Wales, as well as others around the world, in recent years. And it’s a surprisingly controversial one.
To set the scene, let’s take a fictional example. Imagine that each person has an “aura”, which we can measure scientifically. Most pairs of people have different auras, but let’s say that one in a million people has the same aura as you. So if forensic investigation of the crime scene discovers there an “aura profile” matching yours, is there a one-in-a-million chance that you’re innocent?
The answer, contrary to many people’s intuition, is: no. Let’s hear from an expert.
Prof William Thompson: My name is William Thompson; I’m a professor—or now a retired professor, professor emeritus—at University of California, Irvine. I’m also a lawyer, and I became familiar with Bayes’ rule and Bayes’ Theorem as a graduate student, and I’ve encountered it and used it in my work for decades now! <chuckles>
Alex: In our aura example, we committed something called “the prosecutor’s fallacy”, a term first used in a paper Prof Thompson wrote with Edward L Schumann in 1987. I asked Prof Thompson how he became interested in researching this topic.
Prof Thompson: Well, it was a legal conference: I was invited to speak to a criminal lawyers’ group in my local area. There were prosecutors and defence lawyers, and we got into a discussion of the use of serological testing evidence in court, which was often at the time—this was in the 1980s—it was often presented in conjunction with statistical estimates of the frequency of various protein and enzyme markers in blood.
So blood would be found at the crime scene that may have come from the perpetrator, and the police would pick up a suspect, and they would test the blood and determine that he had certain enzymes and protein markers that were shared with the blood at the crime scene. And an estimate was made of the percentage of the population that would have that set of markers. And it might be, say, 1%. And then the question is: what’s the probative value of this, or what should one make of it in the courtroom?
And the prosecutors in the room were saying: well isn’t it obvious? Only 1% of the population would have this set of markers, and the defendant has it, so there’s only 1% chance he would have it if he’s innocent, so there must be a, y’know, a fortiori, 99% chance that he’s guilty. Perfectly obvious! <Alex laughs>
And then the defence lawyers said: now, wait a minute, I beg to differ here! Because 1% of the population—there are 10 million people in this general area, and 1% of 10 million is whatever thousands. So the chance of his being guilty is just one in thousands. It’s practically worthless evidence!
Right, and then I, being the academic, stepped up and said: no, no, no. Prosecutors, defence lawyers: you’re both wrong. Let me explain to you something called Bayes’ Theorem. And I explained as well as I could Bayes’ Theorem, and both sides looked at me and said: What?! That couldn’t be right! That couldn’t be right! <both laugh> Dismissed it out of hand, like: you pointy-headed academics, I don’t even understand what you’re talking about!
So, anyway, this convinced me that this phenomenon, that this was a powerful—what psychologists would call a “fallacy”. It’s an error in judgement.
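A rough way to see why both camps at the conference were wrong is to run their numbers through Bayes’ Theorem. The sketch below uses the 1% marker frequency and the population of 10 million from Prof Thompson’s story. The function name, the assumption that a guilty person is certain to match, and the choice of priors are illustrative and not from the podcast.

```python
def probability_guilty(prior, p_match_if_guilty=1.0, p_match_if_innocent=0.01):
    """Bayes' Theorem: P(guilty | match) from a prior P(guilty).

    ASSUMPTION: a guilty suspect always matches (p_match_if_guilty = 1.0).
    The 1% figure is the marker frequency from the example in the text.
    """
    guilty_and_match = prior * p_match_if_guilty
    innocent_and_match = (1.0 - prior) * p_match_if_innocent
    return guilty_and_match / (guilty_and_match + innocent_and_match)


POPULATION = 10_000_000  # "10 million people in this general area"

# The prosecutors' 99% only appears if guilt was a 50:50 bet beforehand.
print(f"prior 1 in 2:          {probability_guilty(0.5):.4f}")

# The defence's "one in thousands" only appears if the suspect was no more
# likely to be guilty than anyone else in the area beforehand.
print(f"prior 1 in 10 million: {probability_guilty(1 / POPULATION):.6f}")

# Any other prior gives an answer somewhere in between.
print(f"prior 1 in 10:         {probability_guilty(0.1):.4f}")
```

On these assumptions, each side’s figure is what Bayes’ Theorem gives for one particular starting point, which is why neither can be read off the 1% alone.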
Alex: So the Prosecutor’s Fallacy is what happens when we assume that the probability of innocence, now that we’ve seen the evidence, is the same as the probability of seeing the evidence on the assumption of innocence. In general, mathematically speaking, this just isn’t true. As Prof Thompson said, Bayes’ Theorem is the way to avoid the Prosecutor’s Fallacy. Bayes’ Theorem lets us update our assessment of probabilities on the basis of new evidence. We can think of it in a number of ways, and for the easiest one to use here, we need to talk about odds.
Odds are a little bit like the ones bookmakers use. We say the odds of an event are the probability that it happens, divided by the probability that it doesn’t. So if an event has a probability of 4-in-5, we say that its odds are $(4/5) \div (1/5)$, which is just 4. This is similar to how bookies would describe the probability of the same event as being 4:1 on. Odds of exactly 1 are what bookies call “evens”: that is, the event has a 1-in-2 chance of occurring, or is equally likely to happen or not to. Bigger than 1 and an event is more likely to happen; less than 1 and an event is more likely not to. It’s easy to convert probabilities into odds, and vice versa.
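For readers who like to see the conversion written out, here is a minimal sketch in Python. The function names are made up for illustration, and exact fractions are used so that the 4-in-5 example comes out as exactly 4.

```python
from fractions import Fraction


def probability_to_odds(p):
    """Odds = probability the event happens / probability it doesn't."""
    return p / (1 - p)


def odds_to_probability(odds):
    """Inverse conversion: probability = odds / (1 + odds)."""
    return odds / (1 + odds)


print(probability_to_odds(Fraction(4, 5)))  # 4, i.e. "4:1 on"
print(probability_to_odds(Fraction(1, 2)))  # 1, i.e. "evens"
print(odds_to_probability(Fraction(4)))     # 4/5
```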
Odds are a convenient way to express probabilities for use in Bayes’ Theorem, as it turns out. Let’s look at an example event: that a defendant is guilty, say. Before we see any evidence, we have to give this event some starting odds, known as the prior odds. This is somewhat subjective. You might start with the assumption that guilt and innocence are equally likely, or you might try and reflect the fact that people are considered innocent until proven guilty by setting very low odds of guilt.
Either way, let’s say we got some new evidence: that the defendant matched an aura profile, to use our previous example. What we want is the new odds of guilt in light of that evidence, known as the posterior odds. Bayes’ Theorem tells us that, to get these, we multiply the prior odds by the likelihood ratio. We get the likelihood ratio by taking the probability of seeing the evidence if the defendant is guilty, and dividing it by the probability of seeing it if the defendant is innocent. This latter is the probability I told you at the start, so we’ve sidestepped the prosecutor’s fallacy: we really can say that the probability of seeing a matching aura profile if the defendant is innocent is one-in-a-million. Assuming aura matching is a perfect science, we’re certain to get a match if the defendant is guilty, giving a conditional probability of 1. So our likelihood ratio is $1 \div (1/1{,}000{,}000)$, or one million. And we get the posterior odds of guilt by multiplying our prior odds by a million. We can then convert our posterior odds back to a probability.
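The aura calculation can be sketched in a few lines of Python. The one-in-a-million figure and the assumption of a certain match under guilt come from the example above. The two sets of prior odds are purely illustrative assumptions: evens, and 1 to 9,999,999 (as if the defendant were one of ten million equally likely people).

```python
from fractions import Fraction


def posterior_odds(prior_odds, p_evidence_if_guilty, p_evidence_if_innocent):
    """Odds form of Bayes' Theorem: posterior = prior x likelihood ratio."""
    likelihood_ratio = p_evidence_if_guilty / p_evidence_if_innocent
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    return odds / (1 + odds)


# Figures from the aura example in the text.
P_MATCH_IF_GUILTY = Fraction(1)
P_MATCH_IF_INNOCENT = Fraction(1, 1_000_000)

# ASSUMPTION: both priors below are hypothetical starting points.
for label, prior in [("evens", Fraction(1)),
                     ("1 to 9,999,999", Fraction(1, 9_999_999))]:
    post = posterior_odds(prior, P_MATCH_IF_GUILTY, P_MATCH_IF_INNOCENT)
    p_guilty = odds_to_probability(post)
    print(f"prior odds {label}: probability of guilt about {float(p_guilty):.4f}")
```

With the very low prior, the probability of guilt after the match comes out at roughly one in eleven, nowhere near the 999,999-in-a-million that the fallacy suggests.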
Don’t worry if this was all a little confusing: the precise mathematical details aren’t important for what you’re about to hear. But remember the term “likelihood ratio”, because it will be important later.
I spoke to another expert in combining probability with the law.
Prof Norman Fenton: I’m Norman Fenton, I’m a professor of risk and information management at Queen Mary, in the Electrical Engineering and Computer Science Department. I’m a mathematician by training, and I’ve done a lot of work in the last fifteen years on legal reasoning and applying probabilistic methods, risk assessment methods and, in particular, Bayes’ Theorem to legal arguments, and I’ve been involved in many criminal and civil cases either as an expert witness or a consultant.
Alex: Prof Fenton leads an international consortium of researchers focusing on Bayes and the law. I asked him what led him to a focus on legal aspects of probability.
Prof Fenton: I’ve had a long-term research interest in probabilistic risk assessment, which for many years was in things like systems engineering and looking at risk of transport systems, for example, assuring how do we understand the risk of, say, a new fly-by-wire aircraft being safe to fly—stuff like that. And the thing about all of that work—and also generally things like reliability of critical systems—and the thing about that work is that it’s not just enough to look at the statistics of things like safety, because you tend not to have a lot of relevant data on things like accidents. So you have this problem where you have to combine statistics with judgement, expert knowledge.
Bayes is the perfect way of doing that, and it turns out, of course, Bayes very much fits in to what the law is about, because what the law is about is, you have some prior belief in some hypothesis, like whether a person is guilty of a particular crime, and then what happens is that you get evidence, and you have to update your belief once you see the evidence. Now Bayes’ Theorem is the rational way for doing that probabilistic updating.
Alex: So if Bayes’ Theorem is a good way of weighing up evidence, the courts will approve, right? Well, let’s look at our first case, R v Adams. This was a case in 1996, where the defendant, a Mr Adams, was charged with rape. The counsel for the defence brought in Prof (now Sir) Peter Donnelly, who was appointed as the head of the University of Oxford’s Statistics Department in the same year, to provide evidence. Prof Thompson tells us more.
Prof Thompson: I mean, this was a case where there was powerful DNA evidence incriminating the person: the blood at the crime scene had genetic characteristics that were shared with the defendant and they were very rare, like one-in-a-million in the population: I don’t know the exact figure, but say it was one-in-a-million. And that sounds really powerful, and if you commit the prosecutor’s fallacy you think there’s a million-to-one chance the defendant’s guilty [sic] and so why even consider the other evidence.
But what Donnelly came in to point out is that, in this case, all of the other evidence against the suspect was very weak or even exculpatory: the victim was not able to identify him as the perpetrator, and in fact the description of the perpetrator did not match this defendant; he had a strong alibi; there were other indicators that he was an unlikely person to have done the crime. And so you’d have to think that the prior odds of his guilt were extremely low. And what Donnelly wanted to explain to the jury is that, if you put the prior odds together with the… if the prior odds are low enough, and you combine them with a likelihood ratio—even a very strong one, like a million—it could still be a significant chance this person wasn’t guilty.
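To get a feel for the point Prof Thompson describes, the sketch below combines his illustrative likelihood ratio of a million with a range of prior odds. The priors are hypothetical values chosen for illustration and are not figures from the Adams trial.

```python
from fractions import Fraction

LIKELIHOOD_RATIO = 1_000_000  # Prof Thompson's "say it was one-in-a-million"


def probability_of_guilt(prior_odds, likelihood_ratio=LIKELIHOOD_RATIO):
    """Multiply prior odds by the likelihood ratio, then convert to a probability."""
    posterior = prior_odds * likelihood_ratio
    return posterior / (1 + posterior)


# ASSUMPTION: hypothetical prior odds of guilt, from evens down to 1 in ten million.
for against in (1, 1_000, 1_000_000, 10_000_000):
    prior = Fraction(1, against)
    p = probability_of_guilt(prior)
    print(f"prior odds 1 to {against:>10,}: probability of guilt {float(p):.4f}")
```

When the prior odds are as low as one to a million, the same strong evidence only brings the probability of guilt up to a half.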
Alex: The judge advised the jury that they were free to use the theorem, or not to.
Voiceover: “As a theorem, it is agreed by both sides that it is a way of looking at non-statistical matters in statistical terms. Whether it is practical for you, a jury, to operate, is, as I say, something that you will decide for yourselves.”
Alex: We will never know whether or not the jury chose to use the mathematical method, since it is illegal for anyone to ask them what they discussed in the jury room. What we do know is that the jury found the defendant guilty.
Adams appealed. The Court of Appeal ordered a retrial of the case, on the basis that the judge’s directions to the jury hadn’t been sufficiently clear. In particular, it found that he hadn’t given them enough direction on what to do if they chose not to use Bayes’ Theorem. However, the Court did express its misgivings on the use of the theorem in court.
Voiceover: “We have very grave doubt as to whether that evidence was properly admissible, because it trespasses on an area peculiarly and exclusively within the province of the jury, namely the way in which they evaluate the relationship between one piece of evidence and another.”
Alex: At the second trial, the defence lawyers again called Donnelly. This time, the jury was provided with questionnaires to fill in should they choose to use the Bayesian method, allowing them to insert their estimates of the probabilities and compute the relevant likelihood ratios. These were written jointly by statisticians representing the prosecution and defence sides of the case. Again, we don’t know if they used them, but again we know that the result of their deliberations was a guilty verdict. And, again, Adams appealed.
This time the appeal was explicitly based upon the contention by the defence that the judge led the jury towards what he described as a “common-sense” approach as opposed to a Bayesian one. This meant the Court of Appeal ruled directly on the admissibility of the sort of Bayesian evidence given by Donnelly at the initial trial. It’s fair to say that they were not impressed. Among other things, the appeal court ruling said of the jury that:
Voiceover: “We do not consider that they will be assisted in their task by reference to a very complex approach which they are unlikely to understand fully and even more unlikely to apply accurately, which we judge to be likely to confuse them and distract them from their consideration of the real questions on which they should seek to reach a unanimous conclusion.”
Alex: Overall the court dismissed the appeal, ruling that the experiment of coaching the jury in calculating posterior odds was not to be repeated.
Voiceover: “We are very clearly of opinion that in cases such as this, lacking special features absent here, expert evidence should not be admitted to induce juries to attach mathematical values to probabilities arising from non-scientific evidence adduced at the trial.”
Alex: Prof Thompson explains where he thought the court was coming from.
Prof Thompson: The basic conception is that the jury is the trier of fact: it’s the jurors who are brought in to weigh and determine the weight to give to the evidence. That’s why you have a jury, rather than having experts decide the case. And we do allow experts to come in to explain certain factual matters that may be beyond the jury’s ken, but typically experts are used to explain things like “how does a DNA test work and what do the results mean”, not to tell jurors how to think for themselves. And, to the court, Donnelly’s testimony about Bayesian reasoning came too close to telling the jury how to think about the evidence.
The courts use very militant language about this: they talk about the expert invading the province of the jury or usurping the jury’s role—it sounds very mediaeval. So you imagine Prof Donnelly, mild-mannered Prof Donnelly, coming in as the usurper. Anybody who’s watched Game of Thrones knows what happens to usurpers: they either gain the throne or their heads are cut off—and [in] Donnelly’s case, unfortunately, the courts decided that his head had to be cut off, and so this would-be usurper was cast from the legal realm.
Alex: I asked Prof Fenton for his thoughts on the ruling.
Prof Fenton: I think the idea of using Bayes in the case kind of made sense. The problem in that case was that the expert witness basically tried to do it all from first principles. He was actually trying to combine three different bits of evidence, and do the necessary probability updating. He was actually making some fairly simplified assumptions about a lot of it being independent, but even so it’s still not trivial to do the Bayesian calculations—you can’t expect lay people just to… it just seemed wrong to expect to go through them from scratch. Because Bayes’ Theorem is a little bit counter-intuitive the first time you see it, it’s not that simple a concept to expect people just to be able to do the calculations from scratch—so I just think it didn’t work.
I think it confused the judge certainly; I don’t think the jury would have been particularly happy having to go through… y’know, my understanding was that the defence paid for twelve calculators for the jury and one for the judge, just to be able to do the calculations from scratch and take them through those. I think it was a very poor approach.
Alex: It’s worth saying that Prof Donnelly has himself noted that presenting the Bayes calculations before the jury was not his idea, and that he’s unconvinced about its utility as a technique.
Let’s stop and think for a moment about why anyone might want juries to use Bayes’ rule. Of course, we can assume that the defence introduced it in R v Adams because they considered it would help their client, since advancing such arguments is the purpose of the defence counsel in any trial. As we’ve seen, the courts there ultimately ruled in favour of a common-sense approach. This, as courts have noted, had been the only tool juries had for centuries, prior to the attempt to introduce Bayesian logic in 1996. Outside of any particular case: why might people want it to be any other way?
Well, Prof Thompson has carried out extensive research, often using simulated juries, on how people update their assessment of probabilities on the basis of new evidence. I asked him what the research picture is telling us.
Prof Thompson: Well, I think that we know that people’s process of reasoning, the mechanisms they use for dealing with probability and frequency data (normally), are not Bayesian; they’re not formally statistical. People use various rules-of-thumb, mental shortcuts, heuristics, which often work fairly well but sometimes can lead them awry.
So there was a period of time, and psychologists first started… the first studies that psychologists did to compare human judgement to Bayesian norms were back in the 1960s. I think the pioneer was a guy named Ward Edwards, and then later this was picked up by Daniel Kahneman and Amos Tversky in a whole series of studies that led to Kahneman getting the Nobel Prize. So a lot of that work—the first Nobel Prize actually awarded to a psychologist, although it was in Economics, I think—but, y’know, the first psychologist to get a Nobel Prize, the key work involved comparing human judgements to Bayesian norms.
And so a lot of the early work was looking at updating problems, where people would be asked to form an opinion about a hypothesis, and then they’d be given additional data regarding the hypothesis and asked to update: “How much do you revise your opinion in light of the data?” And what the studies were showing is that people were tending not to revise their views as much as a Bayesian analysis suggested that they should.
And that caused concern: it led to a huge debate and furore in the legal community. There’s an enormous legal literature that was prompted by the suggestion that, because people are conservative relative to Bayesian norms in updating their views in light of evidence, that, in order to help them use the evidence better, they should be tutored on Bayesian norms—and basically experts should present statistics to them, and then they should be given decision aids, let’s say, on how to use these things. So there was a proposal by two guys named Finkelstein and Fairley [who] wrote a prominent article in the 1970s (I think) suggesting this.
And there was a famous response from Laurence Tribe, who’s a well-known constitutional scholar—still around today, probably one of the most famous constitutional scholars in the United States, so a very high-profile legal academic—wrote this scathing attack on the idea in 1971, called “Trial by Mathematics”, in which he pointed out all of the things that could go wrong if you introduce mathematical proofs into a criminal trial. And, y’know, making some fairly good points: like, if you’re going to have the expert come in and present Bayes’ Theorem, what’s the expert supposed to use as the prior? If we’re supposed to be presuming people to be innocent, then where does the expert get off coming in and saying, well, let’s assume a priori it’s 50:50 and then update from there? So there were all kinds of disputes in law about use of Bayesian norms and probability in the legal system. Later that went off: there’s a philosophical debate about whether legal proof should even be conceptualised as probabilistic as opposed to some other kind of logic.
Let me get back to where we started! When we started, you asked me: do I think that people are conservative relative to Bayesian norms, and is that a problem? And my answer is that I think that sometimes they are conservative but not always; I think that some of the studies showing strong conservatism were based on bad methodology: it’s not so easy to do studies that compare human judgement to Bayesian norms because there are all sorts of issues about how you elicit judgement: it’s difficult to measure subjective probabilities, which is what we’re doing with the Bayesian analysis.
And so I’m not convinced that people are always conservative: I think they sometimes are, but I think that the particular heuristics and rules-of-thumb they use for reasoning can sometimes lead them to be non-conservative and give too much weight to evidence. I’ve done some studies that show that in connection with forensic science evidence. And so I think it’s a more complicated story. I think lawyers need to be aware that people don’t think like statisticians, and I think lawyers are still debating whether that’s—to what extent is that a problem?
Alex: So, people’s assessment of probability isn’t always consistent with what Bayes’ Theorem says, even though Bayes’ Theorem tells us mathematically how probabilities must behave. But, of course, criminal courts aren’t used to dealing with probabilities at all. They work on [sic] a framework in which defendants are considered innocent unless it is proven beyond reasonable doubt that they are guilty, and that doesn’t translate easily to a statement about the probability of guilt.
We now fast-forward to 2011, when the R v T case came before the Court of Appeal. This was a murder case, in which a shoeprint at the crime scene was, in the original trial, compared to a shoe belonging to the defendant, who is known only as “T”. The forensic scientist, a Mr Ryder, told the court that there was “a moderate degree of scientific evidence” that the shoe and the shoeprint matched. No statistics were introduced into the courtroom. T was convicted.
After the trial, it was discovered that Ryder had calculated a likelihood ratio, based on approximations of the relevant probabilities, based on footwear seen by the Forensic Science Service. He had then converted the likelihood ratio into words using a standard scale, and this had been the evidence presented to the court. T appealed.
The Court of Appeal considered that, as a principle of open justice, the expert should have provided the court with the logic behind the evidence he gave: it was important that the court could rule on its admissibility, and that both sets of lawyers could have the necessary information to challenge the expert opinion. However, noting the use of likelihood ratios in Bayes’ Theorem, the court went further, and identified Ryder’s approach with the approach that had been banned by the ruling in R v Adams. In that case, applying Bayes’ Theorem to the non-statistical evidence had been prohibited, and the court ruled that the shoe database was insufficiently precise for it to be considered statistical.
Voiceover: “An approach based on mathematical calculations is only as good as the reliability of the data used”.
Alex: Indeed, they were quite explicit on the point. They remarked that, when a judgement had to be made in any future case involving footwear,
Voiceover: “no likelihood ratios or other mathematical formula should be used in reaching that judgement”.
Alex: Overall, the conviction was quashed, and it’s important to say at this point that nobody’s arguing with the Court of Appeal on its decision to acquit T. Indeed, many agree with the point made in the ruling that Ryder should have presented his calculations to the court. But this second part, on the admissibility of evidence based on likelihood ratios, had precedential value for future cases, and, despite not ruling on forms of forensic evidence other than footwear, had the potential to be considered highly persuasive in such cases too. The only type of evidence that the court explicitly excluded was DNA evidence.
This caused significant alarm among statisticians, forensic scientists, and others working in the field. Thirty-one of them signed a position statement, which was published as a guest editorial in the journal Science and Justice, which began:
Voiceover: “The judgment of the Court of Appeal in R v T raises several issues relating to the evaluation of scientific evidence that, we believe, require a response. We, the undersigned, oppose any response to the judgment that would result in a movement away from the use of logical methods for evidence evaluation.”
Alex: I asked Prof Fenton for his thoughts on the case.
Prof Fenton: The thing about this case is that there was a lot in the judges’ recommendations that did make sense, because, again, the forensic expert didn’t make clear a lot of the assumptions, did it very poorly—so there were problems in the way the, let’s say, the Bayesian reasoning was presented, there were a lot of problems.
Where I have a problem with the ruling is this idea that it’s only okay to do it when there’s a firm statistical base. So in that case, what they were saying was that the statistics relevant to footwear matching weren’t sufficiently rigorous to do the Bayesian reasoning. Now I have a slight problem… So the judge was right in that the way it was presented was terrible, but the idea that somehow there’s a switch between there being a firm statistical base and there not, and that that’s where you allow Bayes or not: that’s where I have a real problem.
Because the judge used the example of DNA evidence as one where there is a firm statistical base—well, it’s all actually… <chuckles> It’s actually a very grey area as to how firm the statistical base for DNA evidence is. I mean, it’s not a simple switch between yes or no, there’s a firm statistical base. It’s a continuum, basically, where I would agree that there’s a sounder base for forensic statistics than there are for statistics of footwear evidence, but nevertheless, it’s a continuum. The thing is that Bayes gives you the opportunity to take account of the uncertainty about the underlying statistics, and it allows you to incorporate expert judgement, as long as you’re explicit about what that is and the uncertainty about it.
So I have a problem there, but also it had an interesting negative effect on forensic science, I believe, that ruling, and the example, which I experienced myself, was that the expert witnesses in different areas of forensics… So, for example, let’s take something which is not DNA and not footwear, take something like fibre evidence—that had been an area where people had used Bayes and likelihood ratios on matching fibres. And in that case what was happening—I was involved in a very, very high-profile case, acting actually, not in court, but as an expert consultant for one of the defendants in a very high profile murder case—where there was fibre evidence, and the expert witnesses for the prosecution had used Bayes and likelihood ratios in their evidence to come up with what I thought was a fairly respectable argument, and it would have been difficult for me to criticise it, because obviously the defence were looking for ways to attack their argument.
But what actually happened was that, in the light of R v T, they effectively had to withdraw that, and they had to present their reports without reference to Bayes, the statistics, or the likelihood ratios. And so instead of giving what were fairly precise, probabilistic statements, they were simply giving the bland statements like “we cannot exclude” or “it is highly probable that”, and those arguments were much easier to attack and refute than the original. So the effect of this was that what was actually pretty reasonable evidence got kind of watered down.
Alex: Here, Prof Thompson talks about how he thinks R v T fits in, or perhaps doesn’t, with the precedent established in the Adams case.
Prof Thompson: I think the court perceived a danger in the testimony in R v T which was not the danger that was identified in the Adams case. Y’know, in the Adams case, perhaps they had an argument that perhaps the expert went too far in Adams when telling the jury how to combine the evidence, maybe. I mean <sighs> I’m not bothered by it! Because I’m someone who… I know as well as anybody how people need that kind of advice—and they certainly don’t have to follow it!
But that was not what Ryder was doing in [the] R v T case. He was actually not using Bayes’ rule or Bayes’ Theorem at all; he was simply computing conditional probabilities, so the expert was simply computing what you might call a likelihood ratio, in order to help evaluate the probative value, or the weight, that should be given to the similarity between the shoeprint found at the crime scene and this shoe, this trainer, found in the defendant’s closet.
I think the distinguished justices were not sufficiently educated about the nature of Bayes’ rule and Bayesian reasoning. Had they listened to a podcast like this, I think they would have done better in their legal analysis.
Alex: So we’ve now seen two different cases in which two different forms of Bayesian evidence—or at least, what the court considered to be Bayesian evidence—have been rejected. So what can be done? Well, another relevant case, which took place between the two hearings of R v Adams at the Court of Appeal, was R v Doheny and Adams, this being a different Adams from the one whose case we’ve already discussed. Without going into the details of the case itself, we note that the ruling allowed descriptions of “random match probabilities”—probabilities that two people have the same “aura profile”, in our initial example—to be expressed as frequencies.
For instance, in the case of a probability of one in a million, the expert could note that around 60 to 70 people in the UK would match that profile. This expresses the false positive probability, which is one part of the likelihood ratio. But it doesn’t take any account of possible false negatives, in which there is a non-zero chance that, even on the assumption of guilt, we wouldn’t see a match. In a paper addressing the R v T ruling, Prof Thompson suggested the use of “random match equivalents”, where the likelihood ratio is converted into what would be the random match probability on the assumption of no false negatives. I asked him about that suggestion.
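The two ideas in this passage can be sketched numerically. The UK population of 65 million is a round assumed figure chosen to sit inside the “60 to 70” range above, and the 0.9 chance of a match under guilt is an invented number used only to show what a false negative rate does. The function names are illustrative, and the second function is one reading of the “random match equivalent” as described here, not a quotation of Prof Thompson’s paper.

```python
from fractions import Fraction


def expected_matches(population, random_match_probability):
    """Frequency presentation: how many people would be expected to match."""
    return population * random_match_probability


def random_match_equivalent(p_evidence_if_guilty, p_evidence_if_innocent):
    """The random match probability that would give the same likelihood
    ratio if there were no false negatives, i.e. 1 / likelihood ratio."""
    likelihood_ratio = p_evidence_if_guilty / p_evidence_if_innocent
    return 1 / likelihood_ratio


UK_POPULATION = 65_000_000          # ASSUMPTION: round illustrative figure
RMP = Fraction(1, 1_000_000)        # one in a million, as in the text

print(expected_matches(UK_POPULATION, RMP))          # 65

# With no false negatives the equivalent is just the random match probability.
print(random_match_equivalent(Fraction(1), RMP))     # 1/1000000

# ASSUMPTION: a hypothetical 10% false negative rate weakens the evidence.
print(random_match_equivalent(Fraction(9, 10), RMP)) # 1/900000
```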
Prof Thompson: Yes, and that was motivated by concern over whether lay people on juries will understand likelihood ratios. I think presenting a likelihood ratio is a perfectly appropriate, and sometimes even a necessary, way to explain the strength of forensic science evidence. So if an expert’s trying to explain: what does it mean, that we’ve compared two voice samples and found similar phonetic characteristics, or we’ve compared two tool marks and find similar features, and so on? A likelihood ratio is a natural way to explain the strength of the evidence by talking about the probability of seeing what we observe under two alternative theories of the origin.
But the problem with presenting a likelihood ratio to a lay jury is that the lay jury has not been adequately educated by podcasts like this about what a likelihood ratio is and how it fits in with the logic of inference. And, there’s a tendency, we know—in my work as a psychologist we do some studies with lay people, and we presented them with likelihood ratios, and we’ve given them simulated expert testimony, and we’ve tried various ways to explain likelihood ratios to them. And it turns out it’s not so easy to get them to actually understand likelihood ratios.
I mean, there are various ways to approach it, but people often think that they understand the testimony, but they misinterpret what the expert’s saying—the expert is talking to them about the probability of observing certain data under given hypotheses, and they think that the expert is telling them about the probability these hypotheses are true. It’s just natural for them to think about it, because what they really wanna know is: what’s the probability that these hypotheses are true? And there’s an expert in to testify about them, and they naturally assume the expert is telling them about the statistic or the probability that they wish to determine, rather than giving them a different probability that, if they were Bayesians, they would know what to do with. <Alex laughs> But, as we’ve said, they’re not Bayesians, so they don’t know what to do with it, and so they can do the wrong thing.
So what do you do about this? And so, one suggestion I made—and I’m not sure it’s the right suggestion, but it’s something to think about, and maybe something for us to study—is, rather than give them a likelihood ratio—a ratio of two conditional probabilities—convert it into something more like a frequency. I mean, there is some evidence in psychology that people are a little better at counting things, and at dealing with the frequency of things. And just telling them: look, we have what is, in effect, either incriminating evidence or a coincidence, but if it’s a coincidence, it’s a fairly rare coincidence, and here’s how rare it is. Maybe people are better at dealing with that.
It turns out, even if you present it that way, some people still fall into the Prosecutor’s Fallacy thinking. They think: oh well, there’s only one chance in a million he would share these characteristics. Ah! So there’s only one chance in a million that he’d have [it] if he was innocent, ergo! <both chuckle> So I don’t think presenting it as frequencies guarantees people will reason about it in a non-fallacious manner, but I think it may reduce the risk.
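(A rough numerical sketch of the reasoning discussed above, not something presented in the podcast. It assumes a one-in-a-million match frequency, as in the aura example, that the true source matches with certainty, and that before the match every person in some pool of possible sources is equally likely to be the source. The pool sizes and the function names are illustrative assumptions. The sketch computes the probability that the defendant is the source twice, once by the odds form of Bayes' Theorem and once by counting expected coincidental matches, and the two routes agree.)

```python
def posterior_from_likelihood_ratio(prior_probability, likelihood_ratio):
    """Odds form of Bayes' Theorem: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior_probability / (1 - prior_probability)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


def posterior_from_frequencies(match_frequency, other_possible_sources):
    """Frequency framing: one true source plus the expected coincidental matches."""
    expected_coincidental_matches = match_frequency * other_possible_sources
    return 1 / (1 + expected_coincidental_matches)


# Assumed figure: one person in a million shares the profile by coincidence.
MATCH_FREQUENCY = 1e-6

# Hypothetical numbers of other people who could have left the trace.
for other_people in (1_000, 1_000_000, 10_000_000):
    # Equal prior for everyone in the pool (defendant plus the others).
    prior = 1 / (other_people + 1)
    via_odds = posterior_from_likelihood_ratio(prior, 1 / MATCH_FREQUENCY)
    via_counts = posterior_from_frequencies(MATCH_FREQUENCY, other_people)
    print(
        f"{other_people:>10,} other possible sources: "
        f"P(defendant is source | match) = {via_counts:.3f} "
        f"(odds form: {via_odds:.3f})"
    )
```

(Under these assumptions the same one-in-a-million match leaves the defendant almost certainly the source when only a thousand other people could have left the trace, but closer to one chance in eleven when ten million could have. The match frequency on its own is not the probability of innocence.)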
Alex: Prof Fenton, meanwhile, has suggested a mathematical tool that could be introduced as evidence, known as a Bayesian network. Here, he explains what they are.
Prof Fenton: A Bayesian network is simply a graphical model which allows you to link multiple different hypotheses which, actually, usually exist in a case, because it’s not just the ultimate hypothesis that a person is guilty of the crime, but it’s, y’know, was there a motive, was there opportunity, was the person at the scene… These types of things. You’ve got multiple linked hypotheses, and of course you’ve got multiple and potentially linked pieces of evidence, and this is something which is generally misunderstood. So, effectively, you’ve got a graph which links these things, and you’ve also got [that] the strength of the links is defined by the probabilities, and the way that you do probabilistic updating on this is through Bayes. So the Bayesian network is just, effectively, Bayes’ Theorem applied to complex webs of dependencies between hypotheses and evidence.
So, of course, you only want to do that when it is a complex case with multiple bits of hypothesis and evidence, but you can do it just to test out a hypothesis about whether or not, for example, what’s the impact of finding a DNA match: what does that tell you about the probability that this comes from a defendant. You can just do it for parts of it. And so it can be as simple, as complex, as you want. But, in any case, there are tools, basically, for doing the probabilistic inference in these Bayesian nets: you don’t have to use calculators, you don’t have to do the maths from scratch. It will automatically do the necessary updating of your belief in the hypothesis given the evidence.
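(A toy sketch of the kind of model described above, not one of Prof Fenton's networks and not the dedicated tools he mentions. It assumes a hypothetical four-node chain: whether the defendant is guilty influences whether they were at the scene, which in turn influences both a DNA match and a witness identification. Every probability is made up for illustration, and the updating is done by brute-force enumeration of all combinations rather than by a Bayesian network library.)

```python
from itertools import product

# All numbers below are hypothetical, chosen only to illustrate the mechanics.
P_GUILTY = 0.01                                # prior P(guilty)
P_AT_SCENE = {True: 0.99, False: 0.05}         # P(at scene | guilty?)
P_DNA_MATCH = {True: 0.95, False: 0.000001}    # P(DNA match | at scene?)
P_WITNESS_ID = {True: 0.70, False: 0.10}       # P(witness ID | at scene?)


def bernoulli(p_true, value):
    """Probability that a true/false variable takes `value` when P(True) = p_true."""
    return p_true if value else 1 - p_true


def joint(guilty, at_scene, dna, witness):
    """Joint probability of one full assignment, following the links of the graph."""
    return (
        bernoulli(P_GUILTY, guilty)
        * bernoulli(P_AT_SCENE[guilty], at_scene)
        * bernoulli(P_DNA_MATCH[at_scene], dna)
        * bernoulli(P_WITNESS_ID[at_scene], witness)
    )


def posterior_guilty(evidence):
    """P(guilty | evidence), where evidence maps 'dna' and/or 'witness' to True/False."""
    guilty_mass = total_mass = 0.0
    for guilty, at_scene, dna, witness in product([True, False], repeat=4):
        observed = {"dna": dna, "witness": witness}
        if any(observed[name] != value for name, value in evidence.items()):
            continue  # skip assignments that contradict the evidence
        mass = joint(guilty, at_scene, dna, witness)
        total_mass += mass
        if guilty:
            guilty_mass += mass
    return guilty_mass / total_mass


for label, evidence in [
    ("no evidence", {}),
    ("witness ID only", {"witness": True}),
    ("DNA match only", {"dna": True}),
    ("DNA match and witness ID", {"dna": True, "witness": True}),
]:
    print(f"{label:>26}: P(guilty) = {posterior_guilty(evidence):.4f}")
```

(In this made-up model both pieces of evidence bear on guilt only through presence at the scene. Once the DNA match has made presence nearly certain, the witness identification adds almost nothing, even though on its own it would raise the probability of guilt several times over. That is one way linked evidence can have less overall impact than each piece suggests separately.)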
Alex: I asked Prof Fenton how he thought Bayesian network evidence might be introduced in court.
Prof Fenton: I’ve never presented a Bayesian network in court, and I probably wouldn’t recommend it at the moment. What typically happens in these cases is that, behind the scenes, behind an expert report, I will do a Bayesian network which helps the lawyers in the case understand what the implications for different pieces of evidence are. It helps them understand why dependencies between certain bits of evidence mean that, let’s say, the overall impact of the evidence isn’t as great as one side or the other believes. So it basically helps them present the necessary argument in court in non-mathematical terms.
Alex: But, wait a minute! All of these ideas still involve likelihood ratios, whether or not presented to the court, which is precisely what the court ruled against in R v T. So, while we may have potential solutions to the theoretical problem, the legal problem remains, right? Prof Thompson is optimistic.
Prof Thompson: One of the advantages of a common-law system, though, is that precedent can be overturned when it’s recognised as improperly grounded. And as I said in the article I wrote, I do not think R v T is going to be the last word on this issue, because I believe in human progress and the perfectibility of human judgement, and I believe cooler, smarter people will look at this in the future and think: “Oh my god, how did they come up with that? Clearly this is wrong, right?” One of the beauties of the common law, or of a constitutional legal system, is that precedent can sometimes be overturned.
Alex: Meanwhile, Prof Fenton told me that the impact of the ruling had perhaps been less than we might expect.
Prof Fenton: Just something I can tell you as well that <chuckles> people don’t seem to be aware of: despite the R v T ruling, I have been involved in court since then where Bayes and likelihood ratios are used, and not for DNA evidence. So the idea that it’s, kind of like, forbidden—as long as both sides are happy with it, it does get used. I’m not suggesting it gets used extensively, because I’m probably aware of the cases where it does get used, but it does get used.
Alex: Still, he thinks the future for Bayesian methods lies outside the courtroom itself.
Prof Fenton: Look, it is a problem; it has had a negative effect. Because it means—where I said that this stuff is still being used, it’s because, in those cases, the prosecution and the defence were both happy for it, and the judge didn’t object to it, so it’s not a problem—but the thing is, one side can object to it on the basis of R v T, and they can say, “Well, we’re only going to allow you to use this stuff if it was DNA evidence, so we’re not going to allow you to use it for any of these other areas of forensic science.” So there is definitely a problem there.
The thing is, I’ve always felt that the best role for Bayes—in particular, for Bayesian nets, where we’re able to do analysis, bring all of the evidence together—I’ve always felt that the most important potential for that is pre-trial. I think that the most important scope for that is in things that, for example, the Crown Prosecution Service should be doing, in deciding whether or not a case… whether there’s sufficient evidence to bring a case, whether it’s likely to succeed.
So it should be very much used before things get to court, so Crown Prosecution Service, lawyers: when they’re starting on a case, looking at the relevant evidence, right? It will help them know which pieces of evidence are going to be important and which pieces of evidence are not; where there are multiple pieces of dependent evidence, to focus on the most critical ones. And also how to go about presenting the narrative for the case. So I see it very much pre-court, and guiding the way they conduct the case, rather than presenting explicit probabilities and networks and Bayesian models in court.
Alex: I’m a probabilist, and I believe that probability is a useful way of looking at all sorts of aspects of the world; essentially, whenever there’s uncertainty, I think probability can have a role. At least to a certain extent, and at least in England and Wales, the courts have taken a different view. But one thing we can perhaps all agree on is that the Bayesian approach, and what the judge in the Adams case described as the “common-sense” one, won’t always reach the same conclusions. After all, as Daniel Kahneman and Amos Tversky wrote in 1972:
Voiceover: “In his evaluation of evidence, man is apparently not a conservative Bayesian: he is not Bayesian at all.”
Alex: Many thanks to my guests, Prof William Thompson and Prof Norman Fenton. Thanks also to my friends Asa Cremin, Phil Welch, Andrew Bewsey, Sean Loveridge, Maria Christodoulou and Amy Chard for providing the voices of the various texts quoted throughout this podcast, and to Dr Tom Crawford for supervising this project. My name’s Alex Homer; thanks for listening.