NaClhv: A common mistake in Bayesian reasoning

You and your friend are investigating a murder, and you have the following conversation:

You:
Alice is obviously the culprit. The knife has her fingerprints on it.

Your friend:
Why are you ruling out Bob? It could have been Carol or Dan, too. Or anyone else, for that matter.

You:
Um... because of the knife with the fingerprints I just mentioned?

Your friend:
And you actually think that's solid evidence that Alice did it?

You:
Of course. It's by far the best explanation for the knife, which is obviously what killed the victim.

Your friend:
Exactly! The knife is the one thing that we can all agree on. It's the best explanation for the murder, as you just admitted. And since "the murder was committed with a knife" is the best explanation, it is in fact superior to your "Alice did it" explanation.

You:
What?! That doesn't explain anything. We're trying to figure out who committed the murder.

Your friend:
Yes, and you and I both agree that the victim was killed by a knife. That eliminates any need for the victim to have been killed by Alice.

You:
But what about the fingerprints?

Your friend:
That, too, is something that we both agree on: The knife had fingerprints on it. We now have a very good description of the murder weapon: it is a knife with fingerprints on it. Given this preponderance of evidence that the murder was committed with a knife, which we can describe even down to the fingerprints, I don't see why you insist on bringing Alice into the picture at all.

You:
But these are Alice's fingerprints! That obviously points to Alice as the murderer!

Your friend:
See, that way of thinking introduces a number of very bad problems. Why would Alice grip the knife at that point, in that particular way? Why not hold it a few millimeters higher or lower on the grip? Why not use a reverse grip instead of the one that she supposedly used? Why not hold the knife in her left hand instead of her right? Why didn't she wear a glove, or wipe up the knife after the murder, or hire a hit-man? Given all these alternate possibilities, "Alice did it" is actually a terrible explanation for the state of the knife.

In fact, I can think of many explanations for the knife just as good as your "Alice did it" theory. The knife may have always existed in this state. Or, it could be that a combination of oils, moisture, and heat from outside the knife left an impression that you're interpreting as "Alice's fingerprints". Or these so-called "fingerprints" are an artifact from the knife's manufacturing process. How do you eliminate all these other possibilities?

Seriously, given all these alternative explanations for the knife, of which there are an infinite number, there's absolutely no reason to think that "Alice did it". That's a terrible explanation.

Okay, so your friend is clearly being ridiculous here. But what exactly is the nature of his error? If we are to learn from your friend's mistake, we ought to try to understand WHY it's a mistake. We can then identify other analogous situations, avoid the logical pitfall, and reason correctly instead.

Your friend's fundamental mistake is neglecting to compare a hypothesis with its RIVALS. In my series on Bayesian reasoning, I said that you need to specify the complete set of competing hypotheses in order to use Bayes' theorem, and that one of the advantages of the odds form of the theorem is that you merely need two competing hypotheses instead of having to know the complete set. But in both of these cases, the hypotheses need to compete. They need to be rivals. They must be mutually exclusive.
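For reference, the two forms of Bayes' theorem mentioned above can be written out as follows. The symbols are notation chosen for this sketch and are not taken from the series: E is the evidence, H_1 and H_2 are two rival hypotheses, and the H_j together form a complete set of mutually exclusive hypotheses.

```latex
% Odds form: only two rival hypotheses are needed.
\frac{P(H_1 \mid E)}{P(H_2 \mid E)}
  = \frac{P(H_1)}{P(H_2)} \times \frac{P(E \mid H_1)}{P(E \mid H_2)}

% Full form: the sum runs over a complete set of mutually exclusive hypotheses.
P(H_i \mid E) = \frac{P(E \mid H_i)\, P(H_i)}{\sum_j P(E \mid H_j)\, P(H_j)}
```

In the odds form, the evidence enters only through the ratio P(E | H_1) / P(E | H_2), which is why the comparison is meaningful only between hypotheses that actually compete.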

In the above conversation with your friend, "Alice did it" (alone) is mutually exclusive with "Bob did it" (alone), for they cannot both be true. They are rival hypotheses, and it's appropriate to ask which one of the two better explains the evidence. That is the heart of Bayes' theorem. However, "Alice did it" is NOT mutually exclusive with "the knife did it", because obviously Alice could have used the knife to kill the victim. They are not rival hypotheses. It is therefore NOT appropriate to say "the victim was killed by a knife. That eliminates any need for the victim to have been killed by Alice."

The same goes for the "alternative explanations" that your friend offers for the state of the knife, such as the idea that the knife always existed in that state, or that a combination of oils, moisture, and heat from outside the knife left the impression of the fingerprints. What do these have in common? None of them are mutually exclusive with the idea that Alice is the culprit. They are not rivals to the "Alice did it" hypothesis.

Likewise for all of the different ways that Alice could have wielded the knife: what is the probability that Alice's fingerprints ended up on the knife in that very specific way, given that she could have held the knife higher or lower on the grip, or wiped down the knife afterwards? Admittedly, it's very small. But that's not the end of the story: this probability now has to be compared with the probability from a RIVAL to the "Alice did it" hypothesis. So, what is the probability that Bob left that fingerprint? Absolutely minuscule, even compared to Alice's probability mentioned earlier: for not only would Bob have to hold that knife exactly in the same way that Alice held it, he furthermore has to somehow leave Alice's fingerprints while doing so. The ratio of these probabilities is what makes the knife serve as evidence pointing to Alice as the culprit.
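The ratio argument above can be sketched in a few lines of Python. This is only an illustration: the function name `posterior_odds` and every number below are hypothetical, chosen to show how two tiny probabilities can still produce a large ratio. They are not estimates for any real case.

```python
def posterior_odds(prior_odds, p_evidence_given_h1, p_evidence_given_h2):
    """Odds form of Bayes' theorem for two rival hypotheses H1 and H2.

    prior_odds is P(H1) / P(H2) before seeing the evidence.
    Returns P(H1 | E) / P(H2 | E).
    """
    if p_evidence_given_h2 <= 0:
        raise ValueError("P(E | H2) must be positive to form a ratio")
    likelihood_ratio = p_evidence_given_h1 / p_evidence_given_h2
    return prior_odds * likelihood_ratio


# Hypothetical values: both probabilities are very small, because the exact
# placement of the prints is one of many possibilities under either hypothesis.
P_PRINTS_GIVEN_ALICE = 1e-6   # Alice happens to grip the knife in exactly this way
P_PRINTS_GIVEN_BOB = 1e-12    # Bob grips it this way AND somehow leaves Alice's prints

# Assume even prior odds between Alice and Bob (also an assumption).
odds = posterior_odds(1.0, P_PRINTS_GIVEN_ALICE, P_PRINTS_GIVEN_BOB)
print(f"Posterior odds, Alice : Bob = {odds:.3g} : 1")
```

Neither likelihood needs to be large. Only their ratio matters, and with these made-up numbers it favors Alice over Bob by roughly a million to one.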

The lesson here is that you are not done with your analysis until you've connected your ideas back to a set of RIVAL hypotheses. Ignoring this condition is an outright mathematical error in applying Bayes' theorem. It's akin to thinking that the sum of the sides in a triangle must add up to 180 inches. Your friend, in the conversation above, always stopped his analysis at the knife, instead of continuing it back to a set of competing hypotheses. He should have extended his analysis of the knife back to an "Alice did it", "Bob did it", "Carol did it", or a "nobody did it" hypothesis. That would have been the correct way to make his case. Then he would have seen that the knife DOES point to Alice being the culprit, DESPITE the fact that there are more likely "explanations" for the knife, because these "explanations" are NOT RIVALS to the "Alice did it" hypothesis. But among the rivals, "Alice did it" IS the most likely and therefore the best explanation for the state of the knife.
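Carrying the analysis back to a full set of rivals can be sketched as below. This is a minimal illustration, not a model of any real investigation: the function name `posterior`, the equal priors, and the likelihood values are all assumptions, and the four hypotheses are treated as mutually exclusive and exhaustive purely for the sake of the example.

```python
def posterior(priors, likelihoods):
    """Bayes' theorem over a set of mutually exclusive, exhaustive hypotheses.

    priors maps each hypothesis to P(H); likelihoods maps it to P(E | H).
    Returns a dict mapping each hypothesis to P(H | E).
    """
    if set(priors) != set(likelihoods):
        raise ValueError("priors and likelihoods must cover the same hypotheses")
    if abs(sum(priors.values()) - 1.0) > 1e-9:
        raise ValueError("priors of an exhaustive set of rivals must sum to 1")
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("the evidence is impossible under every hypothesis")
    return {h: value / total for h, value in joint.items()}


# Hypothetical inputs: equal priors, and P(knife with Alice's prints | H) for each rival.
priors = {"Alice": 0.25, "Bob": 0.25, "Carol": 0.25, "nobody": 0.25}
likelihoods = {"Alice": 1e-6, "Bob": 1e-12, "Carol": 1e-12, "nobody": 1e-12}

result = posterior(priors, likelihoods)
best = max(result, key=result.get)
print(best, result[best])
```

Statements such as "the victim was killed with a knife" do not appear as keys here, because they are not rivals to "Alice did it". They belong on the evidence side of the calculation.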

But your friend never did any of this. This was his mistake, which led to his incorrect conclusions about the case.

Furthermore, your friend tried to sneak in the evidence - the knife - as a part of the "not Alice" hypothesis, when it should have remained as evidence to be considered by the set of competing hypotheses. In essence, the "not Alice" hypothesis became a parasite attached to a completely unrelated (but strongly supported) hypothesis - the "victim was killed with a knife" hypothesis. This is a common cheat when one wants to shore up a weak hypothesis, which cannot explain the evidence. Your "not Alice" hypothesis can't explain the knife with the fingerprints? Just attach the knife as part of your "not Alice" hypothesis, and say that your hypothesis explains everything. You think humans haven't been to the moon, but you can't explain the photos and the videos from the Apollo missions? Just attach them to your hypothesis, by saying that they're part of the government's moon landing conspiracy. You think there's no God, but you can't explain how that would result in the universe as it actually exists? Just sneak in science as part of your hypothesis, and parasitically leech off its prestige and pretend that it belongs to your hypothesis.

I would not be writing all this, except that I see this error made repeatedly, even by people who say they understand Bayesian reasoning. For instance, I've seen people say:

'Platonism explains the orderliness of the universe as well as theism, therefore the orderliness of the universe is not any evidence for theism.' The correct way to make this argument would require you to move past Platonism, to a rival to theism such as polytheism or atheism. So, for example, 'Polytheism explains the orderliness of the universe as well as theism, therefore the orderliness of the universe is not any evidence for theism over Polytheism' would be a sound argument, if polytheism did in fact explain why the universe should be orderly. As it stands, Platonism is not a rival to theism, and that invalidates the argument.

'If the universe were a simulation designed to study life, that would explain the existence of life far better than divine fine-tuning. Therefore the fine-tuning argument is not any evidence for a creator god.' Again, the idea of the universe as a simulation is not a rival to divine fine-tuning. Obviously a god could have created the universe by fine-tuning a set of agents to run a simulation that is our universe. These are not mutually exclusive ideas, and that invalidates this argument. In order to make the argument correctly, you must evaluate the existence of life with respect to the RIVALS of divine fine-tuning, such as random atheistic chance, or a god that's uninterested in life.

'A universe designed to produce black holes is just as good an explanation for why it's suited for life as a universe designed for life. Therefore, the existence of life is no evidence for a fine-tuning God.' This is exactly like the previous case: a universe designed to produce black holes is not mutually exclusive with God creating life. God could have made the universe suitable for life by creating it to produce many black holes. In order for you to make the initial argument correctly, you must either explain why a RIVAL to the God hypothesis would be more likely to make a black hole filled universe, or be more likely to create life directly with or without black holes.

'We can construct a system of morality without God by starting from the Golden Rule, which everyone agrees on. Therefore, morality is not any evidence for God.' By this point, you should know the key question to ask: is the Golden Rule mutually exclusive with God? Of course not. The analysis is therefore incomplete. To finish this line of thought, you must argue that some rival to the God hypothesis is a better explanation for the Golden Rule. For instance, you can try explaining how an atheist is under a stronger obligation to follow the Golden Rule than a believer. That is how you would bring the argument back to a set of competing hypotheses.

Remember that a hypothesis must be judged against its RIVALS. The competing hypotheses must be mutually exclusive. According to the rules of Bayesian reasoning, you are not done making your argument until you've brought it down to the evaluation of the hypothesis against its rivals.

You may next want to read:
Basic Bayesian reasoning: a better way to think (Part 1)
Science as evidence for Christianity against atheism (introduction)
Another post, from the table of contents
