Basic Bayesian reasoning: a better way to think (Part 1)

What is Bayesian inference? I've already mentioned it in several of my previous posts, and I'm sure to bring it up again in the future. I obviously think it's important. Why?

Bayesian inference is the mathematical extension of propositional logic using probability theory. It is superior to deductive propositional logic, which is what many people think of when they hear the word "logic". In fact it includes the rules of propositional logic as special cases of its more powerful and general rules. It is the logical framework that underlies the scientific method, and it encompasses a great deal of what it means to be a rational, logical, scientific individual. As with "normal", propositional logic, you don't necessarily have to be formally trained to use it in your daily life, but knowing its basics will greatly clarify your thoughts and sharpen your rational thinking skills. The intent of this post is to provide an introduction to this important topic.

Let's study an easy problem in propositional logic as a prerequisite review and a starting point for Bayesian logic. You should have learned in middle or high school that if A implies B, and B implies C, then A implies C. With some symbols, it becomes "if (A → B) and (B → C), then (A → C)". In an example with words, it might look like "If Socrates was a human, and all humans are mortal, then Socrates was mortal". This is well and good. This is a fine way of thinking, and learning to think this way is worthwhile.

However, when we examine the world around us, this rule is severely restricted in its applicability. Consider the following: "If Socrates was a human, and all humans have ten fingers, then Socrates had ten fingers". Is this sound? Can we conclude that Socrates necessarily had ten fingers? Well, no. The second premise - "all humans have ten fingers" - is not strictly true. Certainly most humans do, but not all. So we cannot conclude that Socrates had ten fingers. For that matter, we're not completely 100% sure that Socrates was human either.

"What's wrong with that?" You ask. "Hasn't logic brought us to a correct conclusion, that Socrates might not have had ten fingers?" True. But that's a very weak conclusion. Someone who was basing an argument on the possibility that Socrates didn't have ten fingers would need some additional evidence. I mean, until now I had implicitly assumed that Socrates had ten fingers, and I don't think I was being particularly irrational. Isn't there some way to conclude that "Socrates probably had ten fingers"? Maybe with a rule like "If A is likely to lead to B, and B is likely to lead to C, then A is likely to lead to C"? Doesn't that seem like a pretty logical conclusion?

Of course, "If A is likely to lead to B, and B is likely to lead to C, then A is likely to lead to C" is not a valid argument in propositional logic, and you can certainly find examples where A is true while C is not. For instance, "A blind person is likely to be a poor driver. Poor drivers are likely to get traffic tickets. Therefore, a blind person is likely to get traffic tickets" seems to be an incorrect chain of reasoning. But how could we be sure that it's not just an instance of bad luck, that this is just one of the cases where that probabilistic statement, "likely", just didn't pan out?

So we can't come to any firm conclusions about the "likely" rule in logic, although it seems to make sense sometimes. At any rate we can't use rigid propositional logic with such statements. But this is an enormous restriction, because there is nothing we know in the physical world with absolute certainty. Every instrument of measurement - including your own eyes and hands - is subject to error and uncertainty. Even if you double check and verify, that only reduces the uncertainty to infinitesimal levels, without ever completely eliminating it. How can we reason in such cases - that is, in any real-world scenario, where we are perpetually plagued by uncertainty?

In Bayesian reasoning, these uncertainties are built into its foundations. The truth of a statement is not represented by just "true" and "false", but by a continuous numerical probability value between 0 and 1. So, for instance, the statement "It will rain tomorrow" might get a probability value of 0.1, representing a 10% chance of rain. A statement like "I will still be alive tomorrow" might get a value like 0.999999, as I will almost certainly not die today. "1" and "0" would respectively correspond to absolute certainty in the truth or falsehood of a statement, but as I said they cannot be used in statements about the physical world. Instead we use numbers like 0.5 to represent the certainty that the coin will land heads, or 0.65 to represent the certainty you feel that you're going to marry that girl.

But isn't any given statement ultimately either true or false? Perhaps, but we are not God. We're ignorant of many things. But we still need to reason, even in our uncertainties. Giving a numerical, probabilistic truth value to a statement allows Bayesian reasoning to mirror the human mind much more closely than propositional logic. In essence, you can treat the numerical value you give to a statement as your personal degree of subjective certainty that the statement is true, given the information that you have.

But isn't this all very probabilistic, subjective, and uncertain? In one sense, yes. And that is a strength of Bayesian reasoning, because that's an actual limitation of the human mind. In representing the truth in this way we are only accurately representing how the truth actually exists in our minds. If we actually cannot be certain, then it's appropriate that our logical system actually represents that uncertainty.

But in another sense, this probabilistic thinking is completely rigorous and unyielding. By assigning probability values to statements, you can use all of the mathematical tools of probability theory to process them, and the resulting conclusions are mathematically certain. Bayesian reasoning makes very definitive statements about what these probability values must be, and how they must change in light of new evidence. I said earlier that Bayesian reasoning is an extension of logic using math, and that is exactly as rigorous and compelling as it sounds.

To finish this post, let me give you an extended example to illustrate both the rigor and flexibility of Bayesian reasoning, and its superiority over propositional logic. We will address the question about Socrates and his fingers. There will be some math ahead, but nothing you can't understand at a high school level.

First, let me introduce some notation:

P(X) is the probability that you assign to statement X being true. So, if statement X is "I will roll a 1 on this die", you might assign P(X)=1/6. But if you happened to know that the die was loaded, you might assign P(X)=1/2 instead.

P(X|Y) is the probability that X is true, given that Y is already known to be true. So, if X is "It will rain tomorrow" and Y is "It will be cloudy tomorrow", then P(X) might only be 0.1, whereas P(X|Y) would be larger, perhaps something like 0.3. That is, there is only a 10% chance of rain tomorrow, but if we know that it will be cloudy, the chance of rain increases to 30%. Notice that the probabilities change depending on what additional relevant information is known.

This P(X|Y) notation is a little bit awkward, as it's written backwards from the more intuitive "if Y, then X" way of thinking. But unfortunately it's the standard notation. By definition, P(X|Y) = P(XY)/P(Y), where P(XY) is the probability that both X and Y are true.

~X is the negation of X. It is the statement that "X is false". By the rules of probability, P(~X)+P(X)=1, because X must be either true or false. Likewise, P(~X|Y)+P(X|Y)=1, and P(~XY)+P(XY)=P(Y).
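
These definitions can be checked numerically. The following is a minimal Python sketch, assuming a made-up joint distribution for the rain and cloud example. The four joint probabilities are illustrative values, chosen only so that P(X) = 0.1 and P(X|Y) = 0.3 as in the text; they are not measured data.

```python
# Hypothetical joint probabilities for X = "rain tomorrow", Y = "cloudy tomorrow".
# These four numbers are assumptions chosen to match P(X)=0.1 and P(X|Y)=0.3.
joint = {
    (True, True): 0.09,    # P(XY)
    (True, False): 0.01,   # P(X~Y)
    (False, True): 0.21,   # P(~XY)
    (False, False): 0.69,  # P(~X~Y)
}

def prob(event):
    """Sum the joint probabilities of all outcomes where event(x, y) holds."""
    return sum(p for (x, y), p in joint.items() if event(x, y))

p_x = prob(lambda x, y: x)
p_y = prob(lambda x, y: y)
p_xy = prob(lambda x, y: x and y)
p_notx_y = prob(lambda x, y: (not x) and y)

# Definition of conditional probability: P(X|Y) = P(XY)/P(Y)
p_x_given_y = p_xy / p_y
p_notx_given_y = p_notx_y / p_y

print(round(p_x, 6), round(p_x_given_y, 6))
print(round(p_x_given_y + p_notx_given_y, 6))  # P(X|Y) + P(~X|Y)
```

Any other set of four non-negative numbers that sum to 1 would satisfy the same identities; only the specific values of P(X) and P(X|Y) would change.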

You can translate a statement in propositional logic into a statement in Bayesian, probabilistic representation, simply by setting certain probabilities to 1 or 0. For instance, "X implies Y", which would be written as "X → Y" in propositional logic, would be written as "P(Y|X)=1" in terms of probabilities.

Now back to Socrates. Let the relevant statements be represented as follows:

A: "This person is Socrates"
B: "This person is a Human"
C: "This person has ten fingers"

Given these definitions, the premises of the argument translate as follows:

"Socrates was Human": P(B|A)
"Humans have ten fingers": P(C|B)

Now, let's show that we can duplicate the results of propositional logic simply by setting the probabilities to 1. If P(B|A) = P(C|B) = 1, then by the definitions given earlier, P(BA)=P(A), P(CB)=P(B), therefore P(~BA)=P(~CB)=0, therefore P(C~BA)=P(~CBA)=P(~C~BA)=0. But P(C|A) = [ P(CBA)+P(C~BA) ] / [ P(CBA)+P(C~BA)+P(~CBA)+P(~C~BA) ], which reduces to P(CBA)/P(CBA) =1 after eliminating all the zero terms. That is to say, if P(B|A) = P(C|B) = 1, then P(C|A) = 1. Or, translating back into words, "If Socrates is human, and humans have ten fingers, then Socrates has ten fingers".
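
For readers who prefer a numerical check to the algebra, here is a small Python sketch. It assumes an arbitrary, made-up joint distribution over A, B, and C in which every outcome that would violate P(B|A) = 1 or P(C|B) = 1 has zero probability, and then confirms that P(C|A) comes out as 1. The specific weights and the helper names `prob` and `cond` are illustrative only.

```python
from itertools import product

# Hypothetical weights over (A, B, C). Outcomes with A and not B, or with
# B and not C, get zero weight, which encodes P(B|A) = 1 and P(C|B) = 1.
weights = {}
for a, b, c in product([True, False], repeat=3):
    violates = (a and not b) or (b and not c)
    weights[(a, b, c)] = 0.0 if violates else 1.0 + a + 2 * b + 3 * c

# Normalize the weights so that they form a probability distribution.
total = sum(weights.values())
joint = {k: w / total for k, w in weights.items()}

def prob(event):
    """Probability of the set of outcomes where event(a, b, c) holds."""
    return sum(p for (a, b, c), p in joint.items() if event(a, b, c))

def cond(event, given):
    """Conditional probability P(event | given) = P(event and given) / P(given)."""
    return prob(lambda a, b, c: event(a, b, c) and given(a, b, c)) / prob(given)

p_b_given_a = cond(lambda a, b, c: b, lambda a, b, c: a)
p_c_given_b = cond(lambda a, b, c: c, lambda a, b, c: b)
p_c_given_a = cond(lambda a, b, c: c, lambda a, b, c: a)
print(p_b_given_a, p_c_given_b, p_c_given_a)
```

Changing the non-zero weights changes P(A), P(B), and P(C), but not the conclusion: as long as the two premises hold with probability 1, so does P(C|A).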

Don't worry too much if you got lost in the notation in the above paragraph. The important point is that Bayesian reasoning reduces to propositional logic in the special cases where the probability values are set to 1 or 0. Bayesian reasoning thereby completely encompasses and supersedes propositional logic, like general relativity supersedes Newtonian gravity.

What if the probabilities are not 100%? This is the real-life problem of dealing with uncertainties. What is the actual value of P(B|A), the probability that Socrates was human? Might he not have been an alien, or an angel? As ridiculous as these possibilities seem, they ruin our complete certainty and make propositional logic flounder. What about P(C|B) - the probability that a human has ten fingers? It's certainly not 100%. And what can we conclude about P(C|A) - the probability that Socrates had ten fingers?

To tackle this question, we need to consider the following formula for P(C|A), which can be derived from a straightforward application of the rules and definitions mentioned earlier. (Split P(CA) into P(CBA) + P(C~BA), rewrite each term using the definition of conditional probability, and divide by P(A).) The fact that this formula exists - that we can actually derive it and use it to perform exact calculations - is one of the compelling fruits of the Bayesian way of thinking. Here it is:

P(C|A) = P(C|BA)P(B|A) + P(C|~BA)P(~B|A)

Let's say that Socrates has a P(B|A)=0.999 999 chance of being human, and that given all this, he has a P(C|BA)=0.998 chance of having ten fingers. This means that P(~B|A) = 0.000 001 is the chance that Socrates was not human. The last factor we need to know, P(C|~BA), is the probability that a non-human Socrates had ten fingers. This is nearly impossible to estimate, as we'd have to consider all the different things Socrates could have been - alien, angel, a demon in disguise, etc. But it will turn out not to matter much for our final result. Let's just assign P(C|~BA)=0.1. Plugging in the numbers and calculating, we get that P(C|A) = 0.997999102. That is to say, Socrates almost certainly had ten fingers.
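
The arithmetic is easy to reproduce. The sketch below plugs the numbers from this example into the formula, then sweeps the guess for P(C|~BA) across its whole range to show how little it matters. The function name and the use of Python's `fractions` module for exact arithmetic are choices made for this illustration only.

```python
from fractions import Fraction

def p_c_given_a(p_b_a, p_c_ba, p_c_notba):
    """P(C|A) = P(C|BA)P(B|A) + P(C|~BA)P(~B|A), with P(~B|A) = 1 - P(B|A)."""
    return p_c_ba * p_b_a + p_c_notba * (1 - p_b_a)

p_b_a = Fraction(999999, 1000000)   # P(B|A): Socrates was human
p_c_ba = Fraction(998, 1000)        # P(C|BA): a human Socrates had ten fingers
p_c_notba = Fraction(1, 10)         # P(C|~BA): a rough guess, as in the text

result = p_c_given_a(p_b_a, p_c_ba, p_c_notba)
print(float(result))

# Sweep the hard-to-estimate factor from 0 to 1 to see how far the answer moves.
low = p_c_given_a(p_b_a, p_c_ba, Fraction(0))
high = p_c_given_a(p_b_a, p_c_ba, Fraction(1))
print(float(low), float(high), float(high - low))
```

Whatever value is chosen for P(C|~BA), the result stays between 0.997999002 and 0.998000002, because that factor is multiplied by P(~B|A) = 0.000001.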

If you want extra practice, you can also try using the formula on the "blind man getting traffic tickets" scenario above, and see why a blind man is not likely to get traffic tickets. Or you can wait until next week's post to get the answer.

But again, don't worry too much about the details of numerical calculation. The important point is that Bayesian reasoning provides an exact formula for calculating the probability of a conclusion, even when the premises are themselves only probabilities - which is always the case in the physical universe. Furthermore, the conclusions drawn this way are compelling, because they are mathematical results. If you accept the probabilities of the premises, then you must accept the conclusion. This is the same compelling force which is at work in propositional reasoning. The premises lead to inescapable conclusions.

I hope that this example demonstrates to you the usefulness of the Bayesian way of reasoning. It can actually be applied to situations with uncertain premises, which is really nearly all situations. It is completely rigorous in that a correct Bayesian argument forces you to accept its conclusions if you accept its premises. Yet it's also flexible in assigning probabilities to reflect your current, subjective, personal degree of belief in the truthfulness of a statement. It duplicates propositional logic as its special cases, and in its full form it's more general and more powerful than propositional logic. There are other advantages I have not yet touched on, such as its ability to naturally explain inductive reasoning and Occam's razor, and how it serves as the framework for the scientific method. On the whole, it encompasses a great deal of what it means to be a logical, rational, and scientific thinker.

In my next post, I will discuss a particularly important formula in this probabilistic way of thinking, one that is nearly synonymous with Bayesian reasoning - Bayes' theorem.



The dialogue between two aliens who found a book on Earth

Alice and Bob are two aliens. On their interstellar journey, they passed by Earth and picked up a book there. Some time afterwards, they meet to discuss their new acquisition.

Alice:
So, have you had a chance to look at that Terran artifact that we got?

Bob:
I have. It has many fascinating properties. My colleagues and I have studied it quite thoroughly, and although there are obviously still more discoveries to be made, we can make some definite statements about this object.

Alice:
Great! I was looking at it too, and I wanted to talk to you about what I found. The alien artifact is clearly a book, and it's got an... interesting... message.

Bob:
Well, it's made of many thin sheets of cellulose fiber, upon which appear characters that consist of a carbon black mixture. I suppose you can call this configuration of materials a "book" if you'd like.

Alice:
Um... yeah, sure. It's made of paper, and the letters are in ink. Of course. But really I'm interested in its meaning.

Bob:
"Meaning"? I don't know what you mean.

Alice:
You mean that you haven't found the message of the book? I thought you said you looked at the book, and studied the writing on it.

Bob:
No, I mean that your question is nonsense. What is "meaning"? We certainly haven't found anything like that in this "book". The object is as I have described it: thin sheets of cellulose fiber, upon which there appear characters consisting of a carbon black mixture. That is all our empirical investigations have found. "Meaning" is not a component of the object, as far as we know. And, although this is a minor point, I must correct your usage of the words "letter" and "writing". The body of professional typographers to which I belong has decided that the technically correct typographic designation for these markings is "characters".

Alice:
Well, okay, whatever, but you haven't actually read those "characters"?

Bob:
Again you're not making any sense. What do you mean by "read"? The characters are characters. Although we have now studied them in great detail and can say a great deal about them, the best way to describe them remains "characters consisting of a carbon black mixture".

Alice:
How could you have studied them and not know how to read them, or their meaning? Look, you see here at the beginning of the book, where the letters "I" and "n"...

Bob:
...Characters.

Alice:
Whatever. Where the characters "I" and "n" appear together, making the word "In"? That combination of characters has a meaning, of being contained by something, or near the center of something, or surrounded by something. And the next word is "the", which is...

Bob:
Wait a minute. "Word"? "Meaning"? What do those words mean? This sounds like more nonsense. All you've shown me are just characters.

Alice:
Yes, but the characters form words, which form phrases, which form sentences, which form paragraphs, then...

Bob:
Wait, wait, slow down. That is a lot of entities you've brought up just now that I'm not sure can be empirically verified. So you say that characters form "words"?

Alice:
Yes! Like the words right here, "In", "the", and so forth.

Bob:
You've merely pointed to a set of five characters. Of course you can have sets of characters. You can group the characters however you'd like. But they're still just characters. Where is the "word"? We typographers have not found anything like that in our study of this object.

Alice:
A group of characters IS a word. And each word has a meaning that can be determined in conjunction with its place in the sentence, which...

Bob:
Okay, let me see if what you're saying makes any sense. So, if I show you a character, you can tell me what "word" it belongs to? And what "meaning" it has?

Alice:
Yes.

Bob:
What about this character over here, this "I" character?

Alice:
That? That is the word "I", which means the self, the first person, the one who is also the speaker or the writer of that sentence.

Bob:
But clearly that is the character "I". Where is the "word"?

Alice:
The character IS the word.

Bob:
How could this be a "word" when it is clearly a character? I thought you said that "words" were groups of characters.

Alice:
This happens to be a one-character word.

Bob:
Well, isn't that convenient for you. I see no empirical evidence for any of this. And you say that this "word" is a character but also a "word", that it has "meaning"?

Alice:
Yes, of course. "I" is a particularly important word, with a very important meaning. It can refer to the author of the book, or it can be used in a rhetorical device to address a hypothetical person, or used by a character in a fictional story. You have to look at the context to figure it out. In an abstract sense, a lot of the literature in this book is about the relationship between the "I" and the...

Bob:
Enough. This is all nonsense. "I" is a character, but it is also supposed to be a "word", which has a "meaning", which can also be one of many different "persons"? I, as an empirical typographer, cannot accept such untypographical statements.

Alice:
But can't you clearly see that the characters form words?

Bob:
"Words"? I have no need for such a hypothesis. Everything you've mentioned, everything you've brought up, are only characters. You have no evidence that there is anything else.

Alice:
Look, if you'd just learn to read, you'll see that this is a book of profound truth and meaning. You have to recognize the meaning of each word and learn to use them in the context of a sentence, and build up your ability to interpret the writing all the way up to its full literary context, taking the author's intentions into account. It all makes sense once...

Bob:
You simply see many characters in this object, and in your wishful thinking you have concluded that there must be some "meaning" to them all. You therefore construct this convoluted system of "words", "sentences", "paragraphs", and on top of that, "rhetoric" and "literature", which is all supposed to express some "meaning" that reflects on some "truth" expressed by some "author"! And yet you can provide no empirical evidence for any of it. The reality is that we have investigated the characters in this object and they are now very well understood. There are no "words" to be found in them. Any such ideas are the products of a gullible mind, the yearnings of untypographic individuals given to delusion. Such thinking is the opium of the masses.

Alice:
You have to begin by learning to understand the meaning in the words. That's how you learn to read. Once you start, you'll see that it all makes sense.

Bob:
So you have to buy into this "meaning" business to see that there is "meaning"? That's circular reasoning. Your argument is invalid.

Alice:
Look, let me read you some passages from the book. You'll see that there is in fact meaning to be found in the object. Watch. I'll read this passage, and you'll see that the characters here become words and sentences and have meaning. Listen: "... these are written so that you may..."

Bob:
Your cultic "reading" rituals are not evidence. All you were doing was scanning your eyes over the characters and making corresponding sounds with your mouth. In fact, upon studying your "reading" rituals, it becomes totally obvious that you're failing to understand the typography of the characters, which is the only underlying entity that actually exists. There is no "meaning".

Alice:
What do you mean?

Bob:
I mean that your so-called "reading" is merely you responding to the typography of the characters. For instance, through our intensive empirical study of this object conducted at the millimeter scales, we have found that each character takes on two forms: an upper case form and a lower case form. For instance, the "t" character is lower case, and its upper case form is "T". Furthermore, we have found that after a "." character, the next character is always in the upper case form. There are other such laws of typography we've discovered - for example, a "q" character is always followed by a "u" character. All of this is verifiable through empirical observations. All you're doing when you're "reading" is employing these laws to look at the characters and making the corresponding sound with your mouth. At the bottom, it's only typography, and because you fail to understand typography you imagine that you're "reading".

Alice:
Look, of course things like clean writing, punctuation, and spelling are important, but you're missing the point here. To read the book means to get its meaning out of it.

Bob:
Then why is it that when I see you "read", I only see you employing typography when I break it down to what's really going on? Or let me put it this way: could you still "read" if the laws of typography were different? For example, if "T" characters looked like ";" characters, and the characters all ran together without any space between them?

Alice:
Of course not. I'm not being anti-typography. I obviously employ it in reading the book. I'm saying that there's meaning behind it all.

Bob:
Yes, you are being anti-typography. You're clinging to your "meaning" instead of recognizing that all of your "meaning" comes from the characters arranged according to typographical laws.

Alice:
So you're completely rejecting the idea of any meaning in the book?

Bob:
I am only holding to beliefs which have been empirically verified. Of course, there is still a possibility that your "meaning" exists, although there is no evidence for it. But if it does exist, even that meaning will be found by typographically examining the characters. For instance, one of the issues at the frontier of typographical research is the similarity between the "1" character and the "I" character. Some typographers suspect that there is even a difference between the "I" character and the "l" character, which would open up the possibilities for discovering new typographical laws. We cannot be certain yet, of course. This will require exciting new studies at the sub-millimeter scales. It may be that your "meaning" will be discovered at these sub-millimeter scales or in these new typographical laws, although I see no reason to expect that to happen.

Alice:
But that's not what's meant by "meaning" at all. The meaning of any object is found outside the object itself. Meaning is what's meant by the author of the book, what's intended for the readers of the book to understand.

Bob:
There you go again with your circular definitions. "Not what's meant by meaning"? What does that even mean? The fact that you're not excited by the prospect of progress at the typographical frontier gives me reason to believe that your "meaning" is antithetical to typography, only brought about by your ignorance of typographical matters. The truth is that there is no outside "meaning". The object is exactly what you'd expect it to look like if it was blindly created only from the laws of typography.

Alice:
Okay, so what are all these characters in the book for then? Why do they exist? What's their reason for being?

Bob:
There is no ultimate, outside "meaning", but there is perhaps meaning to be found in the beauty of the laws of typography, in exploring their depth and appreciating that they are all meaningless. We've found this meaningless "book", and it is up to each of us to choose to give it meaning. I think that's actually far more beautiful and profound than trying to discover some "meaning" that's thrust upon us. It may be depressing to think that there is no ultimate purpose or "meaning" in this object, but we can't let that depression beat us. We find our meaning in fighting against that depression and finding our own meaning - in standing against the meaninglessness of it all. I make my own meaning.



How to determine the specific purpose of the universe

Last week, I cited the fine tuning argument to conclude that the universe does have a purpose, for it is nearly impossible for its features to be the result of purposeless randomness. Just as a rational but ignorant alien who comes across a human book will conclude that it has a purpose, we too are compelled by the same reasoning to conclude that the universe has a purpose.

The fine-tuning argument will get its own series of articles in the future. But for now, here are the basics: the universe has certain fundamental parameters which must fall within exceedingly narrow values for life to have evolved in it. These values are so narrowly determined, and the probability of a random process generating these values so low, that it would be simply called "impossible" in any ordinary situation. That is to say, a purposeless process would almost certainly not have created our universe.

This allows us to firmly conclude that the universe did not come about randomly, that it really does have a purpose. But what is that purpose? The fine-tuning argument only mentions life, so how do we go from that to the biblical claim of the universe being made by and for Christ? How do we know that the universe was not made for cats, or bacteria? In light of the many different life forms in existence, is it not merely human hubris to say that humanity, and especially one particular human, is the reason for the existence of the universe?

As before, we will approach this question using Bayesian inference, and start by tackling an easier, analogous question: that of an alien considering a book. How could our alien conclude that the purpose of this book was to convey information? After all, couldn't the book also serve as a paperweight, or kindling for a fire? How does the alien go from "this object has a purpose" to "that purpose is to convey information"?

Once again, Bayesian inference gives our alien the answer: look for features that could be anticipated, predicted, or explained by each of these purposes. These features then serve as evidence for the purpose which best predicts them. So, the thin paper pages of the book serve as evidence for both the "kindling" and the "information" hypotheses: both can explain why the book has thin paper pages. However, only the "information" hypothesis can explain why the pages contain symbolic markings, and this then decides the question in favor of the "information" hypothesis.
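
The alien's reasoning can be sketched with Bayes' theorem: each hypothesis starts with a prior probability, and each observed feature multiplies it by how strongly that hypothesis predicts the feature. Every number below is invented purely for illustration, since no figures are given here. The sketch also assumes that the two features are independent given each hypothesis, and that the three listed purposes are the only candidates.

```python
# Hypothetical priors over the purpose of the book (assumed equal).
priors = {"paperweight": 1 / 3, "kindling": 1 / 3, "information": 1 / 3}

# Hypothetical likelihoods P(feature | purpose); every number is an assumption.
likelihoods = {
    "thin paper pages": {"paperweight": 0.05, "kindling": 0.8, "information": 0.8},
    "symbolic markings": {"paperweight": 0.01, "kindling": 0.01, "information": 0.95},
}

def update(beliefs, feature):
    """One step of Bayes' theorem: posterior is proportional to prior * likelihood."""
    unnormalized = {h: p * likelihoods[feature][h] for h, p in beliefs.items()}
    total = sum(unnormalized.values())
    return {h: v / total for h, v in unnormalized.items()}

after_pages = update(priors, "thin paper pages")
after_markings = update(after_pages, "symbolic markings")

for h in priors:
    print(h, round(after_pages[h], 3), round(after_markings[h], 3))
```

With these made-up inputs, the thin pages leave "kindling" and "information" tied, and the markings then move nearly all of the probability to "information". Different numbers would give different posteriors, but the pattern holds whenever only one hypothesis predicts the markings well.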

Note that the alien's conclusion would be greatly strengthened by a knowledge of the language in the book. If he did not know the language, he may only tentatively infer that the markings in the book had meaning. But knowing the language brings with it a much greater certainty that the content of these markings is highly unlikely to have come about by chance. If you are ignorant of English, the word "meaning" looks like a random sequence of letters, and you may decide that it was just randomly put together. But as someone who understands English, you know that this sequence of letters is not likely to be the result of chance. When applied to the text in our book, this low probability then serves as strong evidence that the book's purpose really is to convey information.
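
To put a rough number on this, consider a deliberately crude model, which is an assumption made only for this sketch: each letter is drawn independently and uniformly from the 26 lowercase English letters. Under that model, the chance that seven random letters spell one particular word is 1 in 26 to the 7th power.

```python
from fractions import Fraction

ALPHABET_SIZE = 26  # assumption: uniform draws from the lowercase English alphabet

def chance_of_exact_string(text):
    """Probability that len(text) independent uniform letters spell text exactly."""
    return Fraction(1, ALPHABET_SIZE) ** len(text)

p = chance_of_exact_string("meaning")
print(p)         # exact fraction
print(float(p))  # about 1.2e-10
```

Under the same model the probabilities multiply across letters, so a whole page of recognizable words is far less likely still to have come from random draws.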

Once again, it all comes down to probability. The true purpose of the book is that which best explains the least probable feature of the book. Since the least probable feature of the book is its text, its true purpose is that which explains that text: the book exists to transmit information. Reaching this conclusion is greatly aided by the knowledge of the language. This is extendable to all objects: the true purpose of a given object is that which best explains the least probable features of that object, whose recognition is greatly assisted by some prior knowledge.

Now that we've considered this hypothetical book, let's apply the same reasoning to the purpose of the universe: the universe is designed for life, as per the fine-tuning argument, but this does not distinguish between humans, cats, or bacteria being that purpose. If we consider only fine-tuning, we cannot tell whether cats exist to make us laugh or we exist to serve cats. Or perhaps we both exist to serve bacteria. However, this equivalence is broken upon considering other features of the universe, such as human civilization. Only the primacy of humans can explain why humans have achieved civilization while cats and bacteria have not: the other hypotheses cannot explain why this highly unlikely feature should exist for humanity.

Note that this conclusion is likely to be reached by someone with some prior knowledge of human civilization, who understands that civilization is not something that could have come about by chance. A random pile of matter - even a random pile of matter put together by humans - is unlikely to result in civilization. So only by being ignorant of human civilization - only by failing to recognize its low probability starting from randomness - can one claim that the purpose of the universe is to generate cats or bacteria. Those who recognize civilization therefore rightly conclude that its low probability is firm evidence for placing humans at the apex of the purpose of the universe.

It again comes down to probabilities. Humans are the most complex life-forms, and we're the only ones to have achieved an advanced civilization. Both complexity and civilization are low-probability events: therefore among the life-forms we are the least likely to have randomly evolved. And among the humans, Jesus was the least likely person to have ever lived: one does not just randomly fulfill messianic prophecies, then randomly say the things that Jesus said about himself, then randomly lead a morally perfect life, then randomly rise from the dead. But if Jesus really is the incarnate God for whom the universe was created, then everything is explained.

That is how you go from merely stating that the universe has a purpose, to specifying that purpose. That is how you narrow down from the purpose existing, to it being life, to humanity, and finally to Jesus. The purpose of the universe is that which explains the least probable features of the universe: and as the least likely member of the least likely species in our improbable universe, Jesus Christ was that purpose. And to all who acknowledge him, he gives the ability to become the children of God.

