New banner and icon for the blog

I've decided to try my hand at graphic design, to draw the banner and the icon for this blog. The following are some earlier drafts. Hopefully, they illuminate the meaning behind the blog's name.


Here are the results, the final designs for the banner and the icon. They're what appears on the blog now.


You may next want to read:
15 puzzle: a tile sliding game
Time spent on video games: worthwhile or wasteful?
Another post, from the table of contents

Merry Christmas! And happy one year anniversary for this blog!

Jesus asked his disciples, "What about you? Who do you say I am?"

The following is a selection of the things I've said about Jesus in various posts, in the year since I started this blog. This is who I say he is. Merry Christmas!


What does it mean that Christ created the world? It means that he was incarnated into the world. Otherwise, what has God (who is a spirit) to do with the world (which is physical)? To physically create the world, God - the One Father of All - breathed into the world his Secret Fire, the Imperishable Flame, the One that belongs only with God. He did so to "let these things be" - so that his plans and intentions would become physical reality through his Word.

The form of that Flame is none other than Christ come into the world. Merry Christmas to you all - for on that day the universe was (ontologically, not temporally) created.

Jesus is like a baby elephant. A large elephant can be groped at by blind men and never be comprehended because of its large size. But if that elephant had a baby - something begotten to be of the same elephant nature yet small enough to be felt by the blind - then they could get a good idea of the large elephant.

There are no miracles at the highest level: If we could somehow understand God completely through one last final miracle, we'd find that there are no exceptions or surprises or inconsistencies in God, for God is perfectly logical and consistent in himself. All the lower level miracles would be contained and explained in the last miracle to give a complete picture of God.
[...]
The last, deepest, greatest miracle is the Incarnation. It is a miracle at the level of the nature of God himself. It reveals to us everything about God. "Anyone who has seen me has seen the Father". It is the miracle that explains all other miracles.

What is love? As it is written, "Greater love has no one than this: to lay down one's life for one's friends". So at the cross, Jesus displays the unconditional love that continued to love his sinful enemies even while we crucified him. He took on all our sins and their consequences, sacrificing himself and saving us.

Jesus Christ is God himself incarnated as a man. In God's act of true love for us, Christ came - God came as a man - to fulfill the plan for our salvation. For what power does anyone else have to stop the course of sin? To save us? To reach us, he humbled himself down to our level, and took on the human form that he first granted us. Like us, he was conceived, born, and raised, and became a man familiar with our sorrow, who experienced our pain. Despite being fully human, he remained morally perfect, so that he could serve as the perfect example for us. Moreover, this was necessary for the next key part of the plan: his crucifixion and resurrection. 
I do not understand Jesus' death on the cross. There are theories of how it worked, but I doubt we have anything close to the full picture. This is only expected: the cross is nothing less than the intersection of all of existence - things in heaven and on earth, visible and invisible, life and death, good and evil, sin and righteousness, God and his creation, story and Author - they all collide here. I think that a complete understanding of Christ's death and resurrection would require nothing short of the entirety of the mind of God. My telling of the story is utterly insufficient for it - nevertheless I will proceed.

The universe is actually designed by and for one person, and one person only: Jesus Christ, who is God himself. It's his game. He designed it to play it himself. Every parameter, every feature of the universe is designed solely for Christ's sake. Because Jesus made the universe to play it himself, it's an excellent, perfectly crafted game: it's made with incredible elegance, efficiency, and simplicity in its fundamental rules which are completely free of bugs or exceptions, yet the final result is rich and complex and intricate, and allows for a great deal of player expression.
[...]
How, then, shall we play? [...] Play like Jesus played: to express God's love back towards God, and to your fellow players. This is what the universe was designed for. After all, it was designed by and for Jesus.

The Incarnation:
This, at last, is the event for which all of creation was made and had been waiting. God sent his son, in whom dwells the fullness of deity, to be incarnated as a human being. Thus began the final act in the grand story of the universe - the one that the whole universe had been building up to for 13.8 billion years.
The death and resurrection of Jesus Christ:
And here is the climax of the story: the singularity at the heart of existence, the purpose for which Christ came into the universe. Everything had been for this event. In the beginning, when God set the laws of the universe, he dictated that a hammer pounding a nail would be sufficient to drive it through Christ's body. When God created the Earth, he placed the iron atoms that would make up the spear that pierced Christ's side. When God created life, he designed it so that sufficient structural disruption would cause it, and therefore Christ, to die. When he created humans, he gave us sufficient brains for processing life, love, death, and resurrection. And when the serpent deceived Adam and Eve, God declared that the seed of the woman would crush the serpent. Everything - all the other events in the universe's history - had been building towards this moment. Through his death and resurrection, Christ makes us fully God's children. He completely reverses Adam's, Eve's, and all of our sins. He sets the course for all of creation - the whole universe - to be redeemed.


You may next want to read:
The Gospel according to Disney's "Tangled"
The Gospel: the central message of Christianity
For Christmas: the Incarnation
Another post, from the table of contents

Basic Bayesian reasoning: a better way to think (Part 4)

Have you read the last several posts? In those posts we began the tale of Alice and Bob, a pair of murder suspects who recently started dating one another. Through their sordid tale, we'll examine Bayesian reasoning, the scientific method, and the so-called fallacy of "affirming the consequent".

Alice and Bob are going through a rough patch in their relationship. One day, Alice accuses Bob of infidelity, and they have this conversation:
Alice:
You spent the night at Carol's house last weekend! You're cheating on me with her! 
Bob:
What?! How do you figure that? I'm innocent! 
Alice:
If you're cheating on me with her, it makes perfect sense that you'd spend the night at her house! 
Bob:
Ha! You're "affirming the consequent". You've started from "if [cheating], then [night at Carol's house]", then concluded that "if [night at Carol's house], then [cheating]". This is a logical fallacy, and your argument is invalid. Cheating on you is not the only possible explanation for me spending the night at Carol's. There are other, perfectly innocent explanations - like the fact that Carol threw a party that ran late, and a bunch of us just crashed at her place for the night rather than risk driving home tired and drunk.
Now, let's pause the conversation here for the moment and assess the situation. So far, Bob's logic follows the example in the Wikipedia page on "affirming the consequent". And he certainly seems right - "affirming the consequent" is a fallacy in propositional logic, and Alice can't necessarily conclude that Bob cheated on her just because he spent a night at Carol's house. So, is Alice committing a logical fallacy? And is Bob therefore innocent? Let's continue and see:
Alice:
That party happened last weekend, when Carol knew I would be out of town! That makes perfect sense if Carol plotted to have you come without me! 
Bob:
That's ridiculous. You're still just "affirming the consequent". You've started from "if [Carol plotting], then [party on that weekend]", then concluded that "if [party on that weekend], then [Carol plotting]". I told you before, this is a logical fallacy. There are many other explanations for why the party might have happened on that weekend. 
Alice:
Whenever I ask Carol whether she's seeing anyone, she avoids the question! 
Bob:
That's more flawed reasoning. Do I have to explain it to you again? There are other possible explanations why Carol avoids that topic with you. That doesn't mean I'm cheating on you with her.
Alice:
But she was always very forthcoming about her dating life before. What would make her so reluctant to talk about it now? 
Bob:
I don't know. I suggest you ask her. Many innocent explanations are possible. You're still "affirming the consequent". You've started with a hypothesis - that I'm cheating on you with Carol - and then produced observations that fit with that hypothesis, then used those observations to justify the original hypothesis. That's like saying "The Bible is true because God wrote it, and it says that God exists. Therefore God exists". That kind of silly, circular reasoning is what happens when you "affirm the consequent", and you're using this logical fallacy over and over again to try to say that I've cheated on you.
Well, Bob certainly seems to be right. Alice can't, and shouldn't, conclude that Bob is cheating on her just because Carol is not talking about her dating life, or because the party happened on a certain weekend. After all, "affirming the consequent" is a logical fallacy, isn't it?
Alice:
Someone at the party saw you go into Carol's bedroom with her. 
Bob:
You're still "affirming the consequent", and it invalidates your conclusion. Your logic is flawed. There are perfectly innocent reasons to go into someone's bedroom. 
Alice:
With two bottles of wine. And you closed the door afterwards. 
Bob:
Still just "affirming the consequent". What, we're not allowed to drink wine at a party? We're not allowed to close the door when the music is loud out in the living room?
Alice:
This was at 10 pm, far before your normal bedtime, far before you're normally tired. 
Bob:
It was a crazy party. It wore me out fast. Why do you continue to "affirm the consequent"? Don't you see that you're still just starting with the idea that I'm cheating on you, then using that idea to interpret the events to justify itself? That's circular reasoning. You're saying that just because this is how things would play out IF I were cheating on you, I MUST therefore be cheating on you. It's a logical fallacy, like I said many times, and you're just using it repeatedly.
Alice:
And you didn't come out from the bedroom until the next day. 
Bob:
Like I said, we got tired and decided to just sleep off the party rather than risk driving home drunk and exhausted.
Okay... hmm... I mean, "affirming the consequent" is still a logical fallacy, right? It's got a Wikipedia page and everything. How could Alice be right when she's committing this fallacy over and over? I mean... if you heard that your significant other went into a bedroom with someone else, along with two bottles of wine, and stayed behind closed doors until the next day, you wouldn't jump to any conclusions, right? Because you're a logical thinker and you don't want to commit a fallacy?
Alice:
My source from the party also tells me that it looked like you and Carol were making out before you went into her room.
Bob:
You know that eyewitnesses are unreliable. The living room was dark and your "source" was probably drunk as well. Or maybe your "source" is lying to break us up for his or her own ends. There are lots of possibilities; you can't conclude that I cheated on you from this, and you're only still "affirming the consequent" by bringing this up to say that I did.
Alice:
I also have this shopping receipt, dated the day before the party, for things that Carol bought. She purchased scented candles, those wine bottles we mentioned, and "sexy" lingerie. 
Bob:
What Carol does with her money and what lingerie she wears is none of my business. You're still using flawed logic, by starting from the idea that I'm cheating on you, to explain what Carol bought, then using that explanation to justify your initial assumption. 
Alice:
She also bought condoms. 
Bob:
I didn't know, I don't care, and it's not relevant. There are many reasons that Carol would buy condoms that have nothing to do with me cheating on you. 
Alice:
I found condoms of the same brand in Carol's trash dumpster after the party. They were used. 
Bob:
Are you crazy?! That's disgusting! Completely apart from your gross dumpster-diving, that doesn't prove anything. Those could have come from anywhere, thrown away by anyone. You're still trying to make everything fit into your preconceived notion that I've cheated on you. That's "affirming the consequent"! It's a logical fallacy! You're just repeating this fallacy over and over again!
Alice:
I had them tested at the lab. The DNA on the outside is a decently good match with samples from Carol's hair. 
Bob:
DNA matching is imperfect. There are thousands of people in this city who would also be a "good match" with that DNA sample. Even if it were an "excellent" match, there are still lots of people who would fit that criterion. Any one of them could have used the condom. And even if it WAS Carol, you can't conclude that I've cheated on you with her from just that. That would be "affirming the consequent"!
Alice:
And the DNA on the inside is an excellent match with you. 
Bob:
LOGICAL FALLACY! Over and over again! Your reasoning is invalid! You're trying to go from "if A then B", to "if B then A"! That's circular logic! It's "affirming the consequent"! I did not cheat on you with Carol!
If you still believe Bob, then I have a bridge to sell you. The weight of evidence is overwhelming at this point. Bob almost certainly did cheat on Alice with Carol.

But what about "affirming the consequent"? Isn't Bob right that it's a logical fallacy? Isn't Alice's argument based entirely on using it over and over? What does Bayesian reasoning say about all this?

Now, Bayesian reasoning mirrors human common sense. It will never lead to a result that "normal" reasoning says is impossible. As I mentioned earlier, you don't actually need formal training to use it in your daily life, because its rules are just the rules of good thinking, refined to mathematical precision. However, because of its precision and its power beyond propositional logic, Bayesian reasoning can sometimes lead to surprising results for someone who's only versed in propositional logic. "Affirming the consequent" is one such result.

In Bayesian logic, "affirming the consequent" is allowed in a mathematically precise way. You CAN relate "if A then B" to "if B then A". In Bayesian terms, where we assign probability values - P(A), P(B), P(A|B), et cetera - to all statements, "if A then B" can be expressed as P(B|A), and "if B then A" becomes P(A|B). And these two probabilities are directly related to one another, as is plainly written out in Bayes' theorem:

P(A|B) = P(B|A) * P(A) / P(B)

Essentially, the two factors grow together. As P(B|A) gets bigger, so does P(A|B). As B becomes better explained by A, A becomes more likely given B. The more strongly the consequences of a hypothesis are affirmed, the more likely the hypothesis is to be true. As more events around Carol's party are explained by Bob cheating on Alice, it becomes more certain, based on these events, that Bob cheated. So each event - each instance of "affirming the consequent" - actually strengthens the hypothesis that Bob cheated on Alice with Carol. Far from dooming Alice's hypothesis because of its status as a "logical fallacy", each such argument actually serves as evidence for her accusation.

That's right: "affirming the consequent" does not invalidate its conclusion, instead it actually serves as evidence FOR that conclusion.

It is the very fact that Alice used "affirming the consequent" OVER AND OVER that made her case so strong. It's crucial to note that if she had made only one such argument, even if that argument was the one from DNA on the condom, her case would have been weak and she would have been wrong to come to her conclusion. But with each instance of "affirming the consequent" - each time Alice successfully showed that the events around Carol's party fit with Bob cheating on her - her case grew stronger. Therefore, "affirming the consequent" is a "logical fallacy" only insofar as it's not being used enough.
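To make that arithmetic concrete, here is a minimal sketch in Python of how repeated updates compound under the odds form of Bayes' theorem (from the previous post in this series). Every likelihood ratio below is a number I've invented purely for illustration; the point is the multiplication, not the particular values.

# Odds form of Bayes' theorem: posterior odds = prior odds * likelihood ratio.
# Each ratio below is a made-up estimate of how much more likely the observation
# is if Bob is cheating than if he is innocent.
prior_odds = 0.1  # Alice starts out thinking cheating is 10 times less likely than innocence

likelihood_ratios = [
    4,   # Bob spent the night at Carol's house
    2,   # the party fell on the weekend Alice was out of town
    3,   # Carol dodges questions about her dating life
    5,   # Bob went into the bedroom with wine and closed the door
    3,   # he didn't come out until the next day
    4,   # a witness saw Bob and Carol making out
    3,   # the receipt for candles, wine, lingerie, and condoms
    50,  # the DNA evidence from the used condom
]

odds = prior_odds
for ratio in likelihood_ratios:
    odds *= ratio

print(odds)               # ~21600: cheating vs. innocent
print(odds / (1 + odds))  # ~0.99995, if these are the only two hypotheses considered

Any single factor leaves the odds unimpressive; multiplied together, they become overwhelming.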

So if you see someone say that Bill Gates must own Fort Knox because he's rich, you can legitimately say that this is flawed reasoning, and call him out on "affirming the consequent". In this case, you'd be using that term as a proper logical fallacy and saying that this person's conclusion is invalid. But if this person repeated similar arguments over and over - if he showed that Bill Gates was part of a secret cabal that controlled the U.S. government, and that Gates had regularly been inside Fort Knox, and that there were mysterious changes to his net wealth that matched perfectly with mysterious changes in the amount of gold in Fort Knox, and that a highly ranked government official anonymously said "Bill Gates owns Fort Knox", then we might be getting somewhere. Each of these things could, by itself, be dismissed by citing "affirming the consequent", but together, each instance of "affirming the consequent" counts as evidence, and they add up to a strong case.

So "affirming the consequent" can both serve as evidence and be a mistake. But, in Bayesian terms, how can you tell when it's a mistake? What is the genuine blunder in logic when that happens? As a mistake, "affirming the consequent" is the act of coming to the conclusion without enough evidence. It's coming to the conclusion without affirming enough consequents. Or more properly, it's concluding that P(hypothesis|evidence) is high, when P(evidence|hypothesis) is not yet large enough to compensate for P(hypothesis)/P(evidence). The solution to this issue, in part, is not to stop "affirming the consequent", but to do it more - to look for more evidence.

The reason that propositional logic doesn't, and can't, follow this reasoning is that it cannot distinguish between probability values of 1% or 99%. In propositional logic, a statement can only be true, false, or undecided. But "affirming the consequent" works in Bayesian reasoning by moving the probability value: it perhaps starts at 1% (very unlikely to be true), but then slides to 20% (unlikely to be true), then to 70% (likely to be true), and to 99% (very likely to be true) as you affirm more consequents, over and over. Propositional logic sees this and says "all I recognize in all these cases are undecided statements", and since 99% is not 100%, it will not let you say that the conclusion is true. This is why "affirming the consequent" is always a logical fallacy in propositional logic. But this says more about the limits of propositional logic than about true rationality.

How do you know when you've affirmed enough consequents? How many times do you have to "affirm the consequent" to be sure of your conclusion? Due to the difficulties associated with using Bayes' theorem in a real-world context, it may be hard or impossible to get actual numbers. But you have to at least walk through the equations to even vaguely answer the question.

In particular, when you work through the equation, it turns out that the most effective kind of evidence is a consequent that your hypothesis affirms but a rival hypothesis does not. "Affirming the consequent" is better than not affirming. Circular reasoning is better than contradictory reasoning. This is the essence of the odds form of Bayes' theorem, which shows the importance of comparing the hypotheses against one another. It has many important applications:

One such application is the scientific method. Bayesian reasoning is the logical framework that underlies the scientific method. Science, in part, relies on "affirming the consequent". Experimental verification of theoretical predictions serves as evidence for that theory. On the flip side, theories are falsified based on experiments as well. Both sides of that statement are together expressed in the odds form of Bayes' theorem. Between two competing theories or hypotheses, "affirming the consequent" is better than not affirming, and circular reasoning is better than contradictory reasoning.

Bayesian reasoning is also at the heart of presuppositional apologetics, which starts with the idea that God of the Bible - who is the basis for all rational thought - exists. It then "affirms the consequent" by verifying that the world does indeed bear the image of its Creator. Rival non-Christian worldviews cannot make the same affirmation, and therefore must borrow from the Christian worldview even in attacking it, thereby contradicting themselves. Of course, its critics have said that this approach is invalid because it "affirms the consequent", but I hope you now know better.

This reasoning is also the logical foundation for my blog here. I start with this fundamental postulate: God as revealed in Jesus Christ. I then "prove" that God exists by demonstrating that this postulate generates the universe - that is, by affirming the consequent.

This Bayesian reasoning is also the logical framework for my series of posts on how science itself - its axioms and long-term traits and properties - serves as strong evidence for Christianity. Because a hypothesis should be measured against its rivals, I said that science is evidence for Christianity and against atheism. Of course, its critics accused me of "affirming the consequent" over and over again. By now, you should recognize this as the mark of a strong argument, one with a great deal of evidence behind it. After all, "affirming the consequent" is a hallmark of science itself.

In all these areas, beware those who only cry "fallacy!", who will not state or test their hypothesis against yours, who only want to tear down arguments instead of building them. They pretend that their ignorance is strength, because they think that knowing nothing means they never have to affirm any consequents. They do not realize that this is actually the mark of profound weakness, and that such a know-nothing hypothesis can only survive by parasitically attaching itself to more established theories. But you should actively seek to find, build, critique, and refine your hypothesis. Rejecting a hypothesis is never an end in itself, but a step towards a better hypothesis. Remember that the devil comes to steal and to kill and to destroy. But it is God who creates.

We can now conclude by answering the questions I raised at the end of my last post. Yes, Bayesian reasoning allows for "affirming the consequent", and this actually serves as evidence FOR your conclusion. There is still a sense in which "affirming the consequent" is a fallacy, which happens when you give a hypothesis too much credit based on a single instance of "affirming the consequent". But this only means that you haven't affirmed enough consequents. To escape this fallacy, you need to affirm more consequents with your hypothesis, while comparing it with its rival hypotheses. "Affirming the consequent" is a fallacy in propositional logic, but that's more indicative of propositional logic's inflexible limits than of actual rationality. In fact, "affirming the consequent" forms half of Bayes' theorem in odds form, which is the logical basis for the scientific method, presuppositional apologetics, and this very blog and the theories I put forth in it.


You may next want to read:
What is "evidence"? What counts as evidence for a certain position?
Science as evidence for Christianity (Summary and Conclusion)
"Proving" God's existence
Another post, from the table of contents

Basic Bayesian reasoning: a better way to think (Part 3)

In my last post, I introduced Bayes' theorem:

P(hypothesis|observation) = P(observation|hypothesis)/P(observation) * P(hypothesis)

Now, this is a powerful equation that tells us how to use observed evidence to update our beliefs about a hypothesis. But as I mentioned, it has two difficulties with its use: first, the probability prior to the observation - P(hypothesis) - is famously difficult to compute in a clear, objective manner, and it changes based on the background information that each person has. For these reasons it's often said to be a personal, subjective probability, reflecting a particular person's degree of belief based on his or her unique set of background information.

And second, things get even worse for P(observation): this is the probability of making the observation, averaged over the complete set of competing hypotheses. Because this is an average over the complete set, we have to know all P(hypothesis) values for every competing hypothesis. But as we said just in the previous paragraph, computing even one of these values is difficult. If that wasn't hard enough, in real-life situations we may not even be able to enumerate the complete set of competing hypotheses. And then, even if we somehow got through all these difficulties, we still have to calculate P(observation|hypothesis) values for each of these hypotheses, which itself is no trivial task, then calculate their average across all the hypotheses. This step often requires more computation than the rest of Bayes' theorem put together, even for well-defined problems with fixed values for all other probabilities.

For these reasons I often like to use Bayes' theorem in odds form: simply write down the equations for two different hypotheses and divide one by the other, and you get:

P(hypothesis A|observation)/P(hypothesis B|observation) =
P(hypothesis A)/P(hypothesis B) * P(observation|hypothesis A)/P(observation|hypothesis B)

This can be summarized as "posterior odds = prior odds * likelihood ratio (of the observation being made from each hypothesis)", where:

P(hypothesis A|observation)/P(hypothesis B|observation) = posterior odds,
P(hypothesis A)/P(hypothesis B) = prior odds,
P(observation|hypothesis A)/P(observation|hypothesis B) = likelihood ratio.

Let's go through an example: say you're investigating a murder. You think that Alice is twice as likely to be guilty as Bob - this is your prior odds. You then observe fingerprints on the murder weapon that are 3000 times more likely to have come from Alice than from Bob - this is the likelihood ratio. You multiply these ratios to calculate your new opinion, the posterior odds: Alice is now 6000 times more likely to be guilty than Bob. Posterior odds is prior odds times likelihood ratio.

This is still Bayes' theorem, just in a different algebraic form. The intuition captured by this equation is the same: an observation counts as evidence towards the hypothesis that better predicts, anticipates, explains, or agrees with that observation. But notice that in this form, P(observation) - which was difficult or impossible to calculate - has been cancelled out. Also, P(hypothesis) - another troublesome number - only appears in a ratio of two competing hypotheses, which I think is a more reasonable way to think of it: it's easier to say how much more likely one hypothesis is than another, instead of assigning absolute probabilities to both of them. In short, this form makes the math easier, and allows you to think of just two hypotheses at a time, rather than having to account for the complete set of competing hypotheses all at once. You don't have to worry about Carol and her fingerprints for the time being in the above murder investigation example.
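Here's the murder-investigation update as a minimal Python sketch, using exactly the numbers from the example above:

# Odds form of Bayes' theorem: posterior odds = prior odds * likelihood ratio
def update_odds(prior_odds, likelihood_ratio):
    return prior_odds * likelihood_ratio

prior_odds = 2           # Alice is thought twice as likely to be guilty as Bob
likelihood_ratio = 3000  # the fingerprints are 3000 times more likely to be Alice's
print(update_odds(prior_odds, likelihood_ratio))  # 6000: Alice vs. Bob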

Let's go through a couple more examples:

Say that your friend claims that he has a trick coin: he says it lands "heads" all the time, rather than the 50% of the time that you'd normally expect. You're somewhat skeptical, and based on his general trustworthiness and the previous similar claims he's made, you only think that there's a 1:4 odds that this is a 100% "heads" coin, versus it being a normal coin. This is your P(always heads)/P(normal), the prior odds.

When you express your skepticism, your friend says, "well then, let me just show you!" and flips the coin. It lands "heads". "See!" says your friend. "I told you it'll always land heads!" Now, obviously a single flip doesn't prove anything. But it certainly is evidence - not very strong evidence, but some evidence. Since the coin will land "heads" 100% of the time if your friend is right, but only 50% of the time if it's a normal coin, their ratio - the likelihood ratio - is 100%:50%, or 2:1.

Now, according to the odds form of Bayes' theorem, posterior odds is prior odds times likelihood ratio. 1:4 * 2:1  = 1:2, so you should now believe that there's a 1:2 odds that this is a trick coin like your friend claimed, versus it being a normal coin. You're still skeptical of the claim, but you're now less skeptical.

Noting your remaining skepticism, your friend then flips the coin again. "Ha, another heads!" he says as he calls out the result. Now, to calculate your new opinion, simply repeat the calculation above, with the previous answer - the old posterior odds of 1:2 - serving as the new prior odds. The likelihood ratio remains 2:1. Posterior odds is prior odds times likelihood ratio, so our new posterior odds is 1:2*2:1 = 1:1. You should now be completely uncertain as to whether this coin in fact is a trick coin. You say to your friend, "well, you may have something there".

"Okay, fine then." says your friend. "Let's flip this thing ten more times." And behold, it comes up "heads" all ten times. Your posterior odds get multiplied by 2:1 for each of the ten flips, and it's now 1:1 * (2:1)^10 = 1024:1. You should now believe that the chance of this being an "always heads" coin is 1024 times greater than it being a normal coin. If you're willing to consider "normal" and "always heads" as the complete set of competing hypotheses, this would give you over 99.9% certainty that your friend is right that this coin will always land heads.

"Wow, amazing." you tell your friend, as you're now pretty much convinced. "I've never actually seen one of these before", you say, as you idly grab the coin and flip it again, fully expecting it to land "heads" once more. But this time, it lands "tails".

What now? The likelihood ratio for the coin to land "tails" - P(tails|always heads)/P(tails|normal) - is 0%:50%, or 0:1. Our new posterior odds is 1024:1 * 0:1 = 0:1. There is now absolutely no chance that this coin is one that will land heads 100% of the time. But at the same time, it also seems unlikely that it's just a normal coin, given that it landed "heads" 12 times in a row just before this. A new possibility suggests itself: that this coin has something like a 90% chance of landing heads.

This illustrates one of the major advantages of the odds form of Bayes' theorem. Before this, you hadn't even considered that the chance for this coin to land "heads" was anything other than 50% or 100%. All of the other hypotheses - such as the coin landing "heads" 90% or 80% or 20% of the time - you had ignored. And yet, even without considering the complete set of competing hypotheses, you were still able to carry out valid calculations and make statistical inferences, reaching sound conclusions.

You both stare at the coin that landed "tails". You ask your friend, "What just happened?" He replies, "well, the magician I bought it from said that it would always land heads. And it seemed to be working fine up 'til now. Maybe he just meant that it'll land heads most of the time?" Being naturally suspicious, you respond, "Looks like he lied to you then. He probably just sold you a normal coin".  But your friend comes back with, "C'mon, you know that's not fair. Human language doesn't work like that. It's imprecise by its very nature. When someone says 'always' in casual conversation, they don't necessarily mean '100.000000...% of the time' with an infinite number of significant figures. Even 'normal' coins don't land heads exactly 50.000000...% of the time". Struck by your friend's rare moment of lucid articulation, you become temporarily speechless. "Besides", your friend continues, "the magician might have said that the coin 'nearly always lands heads'. I don't remember exactly".

With this new insight, you realize that you had set your priors to the wrong hypotheses at the beginning of the problem. Instead of the hypotheses that the coin lands "heads" exactly 100% of the time or exactly 50% of the time, you should have set them to 'close to 100% of the time' and 'close to 50% of the time'. Giving the odds of P(close to 100%)/P(close to 50%) = 1:4 as before, and interpreting "close to" as a flat distribution within 2% of the given value, we get that the likelihood ratio for the coin landing "heads" is P(heads|close to 100%)/P(heads|close to 50%) = 99%:50% = 1.98:1, and for the coin landing "tails" it is P(tails|close to 100%)/P(tails|close to 50%) = 1%:50% = 0.02:1. Then the posterior odds after 12 heads and 1 tail is given by prior odds times likelihood ratio, and it is roughly:

1:4 * (1.98:1)^12 * 0.02:1 = 18.15:1

(This is an approximation, made by assuming that the probability distribution can be thought of as being entirely focused at the center of their interval. The actual value, 16.97:1, can be obtained by a straightforward integration over the probability distributions, but that calculation lies beyond the scope of this introductory post.)
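For the curious, here is a rough numerical sketch in Python of that integration; it simply averages the likelihood of 12 heads and 1 tail over each flat "close to" window and takes the ratio:

def average_likelihood(p_low, p_high, heads, tails, steps=100000):
    # Average p^heads * (1-p)^tails over a flat distribution on [p_low, p_high]
    width = p_high - p_low
    total = 0.0
    for i in range(steps):
        p = p_low + (i + 0.5) * width / steps
        total += p ** heads * (1 - p) ** tails
    return total / steps

# "close to 100%" is flat within 2% of 100% (the window can only extend downward),
# and "close to 50%" is flat within 2% of 50%
lik_trick  = average_likelihood(0.98, 1.00, heads=12, tails=1)
lik_normal = average_likelihood(0.48, 0.52, heads=12, tails=1)

prior_odds = 1 / 4
print(prior_odds * lik_trick / lik_normal)  # ~16.97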

So you don't have to abandon the "close to 100%" hypothesis along with the "exactly 100%" hypothesis. The odds are still 18:1 in favor of the coin landing "heads" more than 98% of the time, against it being a "normal" coin - enough for you to be reasonably confident in believing as your friend does.

This illustrates again the advantages of using the odds form. Firstly, we again didn't have to consider other probability values for the coin landing "heads", such as 75%. We were still able to come to a reasonable conclusion without having to specify the complete set of competing hypotheses, and their probability distribution. Secondly, we were able to completely switch the class of hypotheses under consideration, without losing consistency. If we had stuck to the original form of Bayes' theorem, then we would have had to specify our prior probabilities for P(heads exactly 100% of the time) and P(heads exactly 50% of the time). To maintain our 1:4 ratio, we would assign them as 20% and 80%, taking up all 100% of our probability, because we were not thinking about other possibilities. But then, upon realizing our mistake, we would have no choice but to contradict our previous priors, and assign P(heads close to 100% of the time) and P(heads close to 50% of the time) some values, while going back and admitting that the chances of the coin giving exactly 50% or 100% "heads" are nearly zero. This is a problem created entirely by being unaware of the complete set of competing hypotheses.

But with the odds form, we don't have to have complete awareness. All the conclusions that we came to are still perfectly consistent with the data: there is zero chance for the coin to land "heads" exactly 100% of the time, yet it is much more likely that the "heads" probability is close to 100% than it being a normal coin. Our two sets of priors do not contradict each other either: it's quite reasonable for our prior odds to be 1:4 in both cases, because we have not specified how much of the total probability they take up. In general, I feel that it's easier to say how likely two hypotheses are relative to one another, rather than specifying the absolute probability value for a hypothesis.

I hope this convinces you of the virtues of the odds form of Bayes' theorem. This is how I use Bayes' theorem in everyday situations to sharpen my thinking: I didn't know if this one movie was going to be any good (prior odds), but upon its recommendation from a friend (likelihood ratio), I revise my opinion and am now more likely to see it (posterior odds). I didn't know whether Argentina or Germany was more likely to win the World Cup (prior odds), but upon watching Germany slaughter Brazil (likelihood ratio), I now consider Germany more likely than Argentina to win the World Cup (posterior odds). So on and so forth. Posterior odds is prior odds times likelihood ratio.

Let's consider a couple of last examples:

I don't know if Bill Gates owns Fort Knox (prior odds). But I know that he's rich, and he's more likely to be rich if he owns Fort Knox than if he does not (likelihood ratio). Therefore, given that Bill Gates is rich, he's more likely to own Fort Knox (posterior odds).

Does that reasoning sound suspicious? It should. I took it straight from the Wikipedia page on "affirming the consequent", which is a logical fallacy. But the structure of the above argument is correct according to Bayes' theorem. It follows the same structure as all of my other examples. So, has Bayesian reasoning led to a logical fallacy? Oh no! What shall we do?

Hold that thought, while we consider our last example:

I don't know whether Einstein's theory of general relativity or Newton's theory of gravity is correct (prior odds). But upon considering the experimental evidence of the bending of starlight observed during the 1919 solar eclipse (likelihood ratio), I now consider general relativity much more likely to be correct than Newtonian gravity (posterior odds).

You should recognize that as the event that actually "proved" general relativity to the public, and the epitome of the scientific method at work: hypotheses are judged according to their agreement with experimental observations. But this is nothing more than just straightforward Bayesian reasoning, following the same structure as all of my other examples. So, it turns out that Bayesian reasoning underlies the scientific method, by providing the logical framework for it.

What are we to make of these two last examples? Does Bayesian reasoning allow for affirming the consequent? But isn't that a logical fallacy? But doesn't Bayesian reasoning also underlie the scientific method? Does that mean that science follows a logically flawed system? What are we to make of this?

I will address these issues in my next post.


You may next want to read:
Basic Bayesian reasoning: a better way to think (Part 4) (Next post of this series)
Isn't the universe too big to have humans as its purpose?
What is "evidence"? What counts as evidence for a certain position?
Another post, from the table of contents

Basic Bayesian reasoning: a better way to think (Part 2)

Image: portrait of Thomas Bayes, public domain
In my previous post, I explained that instead of thinking of logical statements as only being "true" or "false", we should assign probability values for their chance of being true. This is the fundamental tenet of Bayesian reasoning. This allows us to employ the entire mathematical field of probability theory in our thinking and expands the rules of logic far beyond their limited forms in propositional logic.

Essentially, we can now use any valid equation in probability theory as a rule of logic. We saw a useful example in the last post: P(C|A) = P(C|BA)P(B|A) + P(C|~BA)P(~B|A). This captures the intuitive idea that if A is likely to lead to B, and B is likely to lead to C, then A is likely to lead to C. But it also does more - it tells us precisely how to calculate the probability for our conclusion, while simultaneously sharpening, guiding, and correcting our thinking. In particular, it tells us that in the second step, it's not enough that B is likely to lead to C, instead requiring that BA is likely to lead to C. (Incidentally, this is why a blind person is not likely to get traffic tickets, even though a blind person is likely to be a bad driver, and bad drivers are likely to get tickets.)
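As a quick numerical sketch of that parenthetical point, here it is in Python, with probabilities I've made up purely for illustration:

# A: this person is blind; B: this person is a bad driver; C: this person gets traffic tickets.
# All of these numbers are invented just to illustrate the equation.
p_B_given_A     = 0.99  # a blind person would almost certainly be a bad driver
p_C_given_BA    = 0.01  # but a blind bad driver almost never drives, so almost never gets tickets
p_C_given_notBA = 0.01  # a blind person who somehow drives well also rarely gets tickets

# P(C|A) = P(C|BA)P(B|A) + P(C|~BA)P(~B|A)
p_C_given_A = p_C_given_BA * p_B_given_A + p_C_given_notBA * (1 - p_B_given_A)
print(p_C_given_A)  # ~0.01: the blind person is unlikely to get tickets,
                    # even though bad drivers in general often do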

In this post, I will introduce another such equation in probability theory:

P(A|B) = P(B|A)/P(B) * P(A)

This is Bayes' Theorem. Named after Reverend Thomas Bayes, this unassuming little equation, which can be derived immediately from the definition that P(A|B) = P(AB)/P(B), is so important that its use is nearly synonymous with Bayesian logic, and its interpretation is the logical basis for the scientific method. At its heart, this equation tells you how to update your beliefs based on the evidence. To see how that works, set A = "hypothesis", and B = "observation" in the formula. The equation then becomes:

P(hypothesis|observation) = P(observation|hypothesis)/P(observation) * P(hypothesis)

Each factor can then be translated into words as:

P(hypothesis): probability that the hypothesis is true, before considering the observation.
P(hypothesis|observation): probability that the hypothesis is true, after considering the observation.
P(observation|hypothesis): probability for the observation, as predicted by the hypothesis.
P(observation): probability for the observation, averaged over the predictions from every hypothesis.

This equation tells us how we should update our opinion on a hypothesis after we make a relevant observation. That is, it tells us how to go from P(hypothesis) to P(hypothesis|observation). It says that a hypothesis becomes more likely to be true if it's able to predict an observation better than the "average" hypothesis: the bigger the ratio of P(observation|hypothesis)/P(observation), the more likely the hypothesis becomes. Conversely it becomes less likely to be true if it could not beat the "average" hypothesis in its predictions. In short, it says that an observation counts as evidence for the hypothesis that better predicted it. We already intuitively knew this to be true - but Bayes' theorem states it in a mathematically rigorous fashion, and allows us to put firm numbers to some of these factors.

Let's consider an example: Alice and Bob go out on a date. Bob liked Alice and wants to ask her to a second date, but he's not sure how she'll respond. So he hypothesizes two possible outcomes: Alice will say "yes" to a second date, or she will say "no". Based on all the information he has - how Alice acted before and during the date, how they communicated afterwards, etc. - he thinks that there's a 50-50 chance between Alice saying "yes" or "no". That is to say:

P(Alice will say "yes") = P(Alice will say "no") = 0.5

For the sake of simplicity, we will not consider other possibilities, such as Alice saying some form of "maybe". These two "yes" and "no" will serve as our complete set of possible hypotheses.

While Bob is agonizing over this second date, he runs into Carol, who is a mutual friend to both Alice and Bob. She tells Bob, "Alice absolutely loved it last night! She can't wait to go out with you again!". Carol's affirmation serves as evidence that Alice will say "yes" to a second date. We already knew this intuitively: Carol's affirmation is obviously good news for Bob. But Bayes' theorem allows us to calculate the probability explicitly from some starting probabilities. To see this, we need to evaluate two probability values: P(Carol's affirmation|Alice will say "yes"), and P(Carol's affirmation|Alice will say "no").

What value should we assign to P(Carol's affirmation|Alice will say "yes")? That is, if Alice would say "yes" to a second date, what is the probability that Carol would have given Bob her affirmation? Not particularly high - after all, Carol could have simply forgotten to mention Alice's reaction, or Alice and Carol might not have had a chance to discuss the first date, or Alice could have had a terrible time, but she might still give Bob a second chance. All these are ways that the "yes" hypothesis might not lead to Carol's affirmation. Taking these things into account, let's say that P(Carol's affirmation|Alice will say "yes") = 0.2.

What about P(Carol's affirmation|Alice will say "no")? This is the probability that Carol would still communicate her affirmation to Bob, even though Alice would say "no" to a second date. Now, it could be that Alice hated her first date with Bob, but Carol deliberately lied to him. Or maybe Carol simply wanted to encourage Bob even though she didn't really know how Alice felt. Or Alice did really enjoy her time with Bob, but she'll be suddenly struck by amnesia before Bob asks her out again. But assuming that Alice and Carol are honest people, and that nothing particularly strange happens, it's very unlikely that Carol gives Bob her affirmation if Alice is going to say "no". So let's say that P(Carol's affirmation|Alice will say "no") = 0.02.

Now, what about P(Carol's affirmation)? This is the last factor we need to apply Bayes' theorem. This is the probability that Carol gives Bob her affirmation, averaged over both the "yes" and "no" hypotheses. Since there's a 50-50 chance that Alice will say "yes" or "no", this is simply the average of the two probabilities mentioned above: 0.5*0.2 + 0.5*0.02 = 0.11. This step can get complicated, but because of the 50-50 chance for our two hypotheses, it is mercifully short in this simple example. So P(Carol's affirmation)=0.11.

This now gives Bob enough information to compute P(Alice will say "yes"|Carol's affirmation). That is, given that Carol told Bob that Alice wants to go out again, what is the probability that Alice will answer "yes" to a second date? According to Bayes' theorem:

P(Alice will say "yes"|Carol's affirmation) =
P(Carol's affirmation|Alice will say "yes")/P(Carol's affirmation) * P(Alice will say "yes") =
0.2/0.11 * 0.5 = 0.909090... = 10/11
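As a sanity check, here is the same calculation as a short Python sketch, using the made-up probabilities from above:

p_yes = 0.5                # P(Alice will say "yes"), before the observation
p_no  = 0.5                # P(Alice will say "no"), before the observation
p_affirm_given_yes = 0.2   # P(Carol's affirmation | Alice will say "yes")
p_affirm_given_no  = 0.02  # P(Carol's affirmation | Alice will say "no")

# P(Carol's affirmation), averaged over the complete set of hypotheses
p_affirm = p_affirm_given_yes * p_yes + p_affirm_given_no * p_no  # 0.11

# Bayes' theorem
p_yes_given_affirm = p_affirm_given_yes / p_affirm * p_yes
print(p_yes_given_affirm)  # 0.9090..., that is, 10/11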

Carol's affirmation, upon considering it as evidence, has pulled the probability from 50% to 91%. That is, if Bob thought before that there was only a 50% chance that Alice will agree to a second date, he should now think that there is a 91% chance. That is what evidence does: it pulls the probability for a hypothesis in one direction or another. A strong piece of evidence might pull it all the way from 0.1% to 99.9%, whereas a weak piece of evidence might only pull it from 50% to 60%. An opposing piece of evidence will pull the probability in the other direction, as in a tug-of-war. This is why we commonly speak of "weighing" the evidence. This exemplifies how Bayesian reasoning corresponds to the common sense we use in everyday life, except that it's mathematically precise.

So there is a 10 out of 11 probability, or about a 91% chance, that Alice will say "yes" to Bob's request for a second date. Things are looking good for Bob! Of course, there is the remaining 1/11 probability that Alice will say "no". Bob will have to live with that chance of rejection. That's the nature of Bayesian reasoning - you can't ever be 100% certain, but you can be certain enough to act. Bob should definitely ask Alice out again.

Note that Bayes' theorem, as with all Bayesian reasoning, compels you to accept its conclusions: you cannot simply say "I don't buy this argument" or "I don't find this convincing". If you accept its premises, you must accept its conclusion: otherwise you're violating the rules of mathematical logic.

But where do the premises come from? How did I assign, for example, that P(Alice will say "yes")=0.5, or P(Carol's affirmation|Alice will say "no")=0.02? Well, for the sake of this problem, I just made up some reasonable values. In real life, computing these values would be far more difficult than the example problem itself. For instance, to calculate P(Alice will say "yes"), Bob would have to consider all the relevant background information he has. This would include how Alice interacted with him during the first date, his knowledge of human mating behaviors (it's a good sign if she laughs at your jokes, it's bad if she calls the cops on you, etc), and any other relevant information. Based on this total information, he would calculate how often a woman like Alice would agree to a second date, and that would be his P(Alice will say "yes"). That's why this probability can be thought of as a personal, subjective degree of belief: because nobody else has the exact set of background information that Bob has.

What about calculating P(Carol's affirmation|Alice will say "no")? This is the probability that Carol will convey Alice's approval to Bob, even though Alice will say "no" to the second date. This number might be obtained through some sociological studies, by asking questions like "How often do women tell their friends that they enjoyed a date even if they didn't?" The nature of the relationship between Alice, Bob, and Carol also needs to be taken into account, along with their personalities. Are Alice and Carol very close friends? Is Carol generally reliable, or is she prone to hyperbole? The value of P(Carol's affirmation|Alice will say "no") is all this information condensed into a single number.

You may be disappointed that these probabilities are not simple to calculate. This is often the case in real-life scenarios. It turns out that humans and human relationships cannot be reduced down to a simple calculation, even with Bayes' theorem. Real life is complicated: this should not surprise anyone. Often, the relevant starting probabilities can only be guessed at from intuition. Being able to do that well is a large part of what it means to be a reasonable, logical person in the real world.

So those are the strengths and weaknesses of Bayes' theorem. On the one hand, it provides a firm, computationally exact way of updating your beliefs based on the evidence. On the other hand, the probabilities needed to perform the calculations can be difficult or impossible to assign. In particular, the assignment of prior probabilities - the degree of belief in the hypothesis before considering the observation - is a famously contentious issue within Bayesian reasoning, and there is no established way of assigning these numbers that everyone agrees is correct. This was the value of P(Alice will say "yes") in our example above, and I have described it as a personal, subjective probability based on the unique set of background information that a person has. This gives us a ballpark number that we can immediately use, but its imprecise and subjective nature, combined with the human capacity for self-deception, is a cause for concern.

Is Bayesian reasoning still useful in light of these weaknesses? Definitely. It is still far more applicable than propositional logic, and it still tells us, in a mathematically precise way, how to logically incorporate evidence into your beliefs. And often, when there is enough evidence, the specific values of these questionable probabilities turn out to be irrelevant. This is why we often look for overwhelming evidence, beyond any reasonable doubt, before we decide to take action based on a hypothesis. So while Bayes' theorem cannot tell us everything (nothing in this world can), it is a very useful tool for sharpening our thinking and processing evidence to update our beliefs.

In my next post, I will re-cast Bayes' theorem into a different mathematical equation - the odds form - which eases some of the difficulties of Bayesian reasoning. I will use this new form to discuss more examples in Bayesian reasoning.


You may next want to read:
Basic Bayesian reasoning: a better way to think (Part 3) (Next post of this series)
Why are there so few Christians among scientists? (part 1)
How to make a fractal: version 2.0
Another post, from the table of contents

Basic Bayesian reasoning: a better way to think (Part 1)

Image: by me. Feel free to use, just link back to this post.
What is Bayesian inference? I've already mentioned it in several of my previous posts, and I'm sure to bring it up again in the future. I obviously think it's important. Why?

Bayesian inference is the mathematical extension of propositional logic using probability theory. It is superior to deductive propositional logic, which is what many people think of when they hear the word "logic". In fact, it includes the rules of propositional logic as special cases of its more powerful and general rules. It is the logical framework that underlies the scientific method, and it encompasses a great deal of what it means to be a rational, logical, scientific individual. As with "normal" propositional logic, you don't necessarily have to be formally trained to use it in your daily life, but knowing its basics will greatly clarify your thoughts and sharpen your rational thinking skills. The intent of this post is to provide an introduction to this important topic.

Let's study an easy problem in propositional logic as a prerequisite review and a starting point for Bayesian logic. You should have learned in middle or high school that if A implies B, and B implies C, then A implies C. With some symbols, it becomes "if (A → B) and (B → C), then (A → C)". In an example with words, it might look like "If Socrates was a human, and all humans are mortal, then Socrates was mortal". This is well and good. It's a fine way of thinking, and learning to think this way is worthwhile.

However, when we examine the world around us, this rule is severely restricted in its applicability. Consider the following: "If Socrates was a human, and all humans have ten fingers, then Socrates had ten fingers". Is this sound? Can we conclude that Socrates necessarily had ten fingers? Well, no. The second premise - "all humans have ten fingers" - is not strictly true. Certainly most humans do, but not all. So we cannot conclude that Socrates had ten fingers. For that matter, we're not completely 100% sure that Socrates was human either.

"What's wrong with that?" You ask. "Hasn't logic brought us to a correct conclusion, that Socrates might not have had ten fingers?" True. But that's a very weak conclusion. Someone who was basing an argument on the possibility that Socrates didn't have ten fingers would need some additional evidence. I mean, until now I had implicitly assumed that Socrates had ten fingers, and I don't think I was being particularly irrational. Isn't there some way to conclude that "Socrates probably had ten fingers"? Maybe with a rule like "If A is likely to lead to B, and B is likely to lead to C, then A is likely to lead to C"? Doesn't that seem like a pretty logical conclusion?

Of course, "If A is likely to lead to B, and B is likely to lead to C, then A is likely to lead to C" is not a valid argument in propositional logic, and you can certainly find examples where A is true while C is not. For instance, "A blind person is likely to be a poor driver. Poor drivers are likely to get traffic tickets. Therefore, a blind person is likely to get traffic tickets" seems to be a incorrect chain of reasoning. But how could we be sure that it's not just an instance of bad luck, that this is just one of the cases where that probabilistic statement, "likely", just didn't pan out?

So we can't come to any firm conclusions about the "likely" rule in logic, although it seems to make sense sometimes. At any rate we can't use rigid propositional logic with such statements. But this is an enormous restriction, because there is nothing we know in the physical world with absolute certainty. Every instrument of measurement - including your own eyes and hands - is subject to errors and uncertainty. Even if you double check and verify, that only reduces the uncertainty to infinitesimal levels, without ever completely eliminating it. How can we reason in such cases - that is, in any real-world scenario where we are perpetually plagued by uncertainty?

In Bayesian reasoning, these uncertainties are built into its foundations. The truth of a statement is not represented by just "true" and "false", but by a continuous numerical probability value between 0 and 1. So, for instance, the statement "It will rain tomorrow" might get a probability value of 0.1, representing a 10% chance of rain. A statement like "I will still be alive tomorrow" might get a value like 0.999999, as I will almost certainly not die today. "1" and "0" would respectively correspond to absolute certainty in the truth or falsehood of a statement, but as I said they cannot be used in statements about the physical world. Instead we use numbers like 0.5 to represent the certainty that the coin will land heads, or 0.65 to represent the certainty you feel that you're going to marry that girl.

But isn't any given statement ultimately either true or false? Perhaps, but we are not God. We're ignorant of many things. But we still need to reason, even in our uncertainties. Giving a numerical, probabilistic truth value to a statement allows Bayesian reasoning to mirror the human mind much more closely than propositional logic. In essence, you can treat the numerical value you give to a statement as your personal degree of subjective certainty that the statement is true, given the information that you have.

But isn't this all very probabilistic, subjective, and uncertain? In one sense, yes. And that is a strength of Bayesian reasoning, because it reflects an actual limitation of the human mind. In representing truth this way, we are accurately representing how the truth actually exists in our minds. If we cannot be certain, then it's appropriate that our logical system represents that uncertainty.

But in another sense, this probabilistic thinking is completely rigorous and unyielding. By assigning probability values to statements, you can use all the mathematical tools of probability theory to process them, and the conclusions are mathematically certain. Bayesian reasoning makes very definitive statements about what these probability values must be, and how they must change in light of new evidence. I said earlier that Bayesian reasoning is an extension of logic using math, and that is exactly as rigorous and compelling as it sounds.

To finish this post, let me give you an extended example to illustrate both the rigor and flexibility of Bayesian reasoning, and its superiority over propositional logic. We will address the question about Socrates and his fingers. There will be some math ahead, but nothing you can't understand at a high school level.

First, let me introduce some notation:

P(X) is the probability that you assign to statement X being true. So, if statement X is "I will roll a 1 on this die", you might assign P(X)=1/6. But if you happened to know that the die was loaded, you might assign P(X)=1/2 instead.

P(X|Y) is the probability that X is true, given that Y is already known to be true. So, if X is "It will rain tomorrow" and Y is "It will be cloudy tomorrow", then P(X) might only be 0.1, whereas P(X|Y) would be larger, perhaps something like 0.3. That is, there is only a 10% chance of rain tomorrow, but if we know that it will be cloudy, the chance of rain increases to 30%. Notice that the probabilities change depending on what additional relevant information is known.

This P(X|Y) notation is a little bit awkward, as it's written backwards from the more intuitive "if Y, then X" way of thinking. But unfortunately it's the standard notation. By definition, P(X|Y) = P(XY)/P(Y), where P(XY) is the probability that both X and Y are true.

~X is the negation of X. It is the statement that "X is false". By the rules of probability, P(~X)+P(X)=1, because X must be either true or false. Likewise, P(~X|Y)+P(X|Y)=1, and P(~XY)+P(XY)=P(Y).

You can translate a statement in propositional logic into a statement in Bayesian, probabilistic representation, simply by setting certain probabilities to 1 or 0. For instance, "X implies Y", which would be written as "X → Y" in propositional logic, would be written as "P(Y|X)=1" in terms of probabilities.
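To make these definitions concrete, here is a minimal sketch in Python - my own illustration, not part of the original argument. The joint probabilities are made-up numbers, chosen only to be consistent with the rain example above: a 10% overall chance of rain, rising to 30% given clouds.

# Hypothetical joint distribution over two statements:
# X = "It will rain tomorrow", Y = "It will be cloudy tomorrow".
# Numbers are invented so that P(X) = 0.1 and P(X|Y) = 0.3.
p_xy    = 0.09   # P(XY): rain and clouds
p_x_ny  = 0.01   # P(X~Y): rain, no clouds
p_nx_y  = 0.21   # P(~XY): no rain, clouds
p_nx_ny = 0.69   # P(~X~Y): no rain, no clouds

p_y = p_xy + p_nx_y          # P(Y) = 0.3
p_x = p_xy + p_x_ny          # P(X) = 0.1
p_x_given_y = p_xy / p_y     # P(X|Y) = P(XY)/P(Y) = 0.3

# The rules from above hold automatically:
assert abs(p_x_given_y + (p_nx_y / p_y) - 1) < 1e-12   # P(X|Y) + P(~X|Y) = 1
assert abs(p_xy + p_nx_y - p_y) < 1e-12                # P(XY) + P(~XY) = P(Y)

print(p_x, p_x_given_y)      # approximately 0.1 and 0.3 - knowing Y raises the probability of X

In the same spirit, writing "X → Y" as P(Y|X)=1 simply demands that all the probability assigned to X also lies within Y.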

Now back to Socrates. Let the relevant statements be represented as follows:

A: "This person is Socrates"
B: "This person is a Human"
C: "This person has ten fingers"

Given these statements, we can translate our premises as follows:

"Socrates was Human": P(B|A)
"Humans have ten fingers": P(C|B)

Now, let's show that we can duplicate the results of propositional logic simply by setting the probabilities to 1. If P(B|A) = P(C|B) = 1, then by the definitions given earlier, P(BA)=P(A), P(CB)=P(B), therefore P(~BA)=P(~CB)=0, therefore P(C~BA)=P(~CBA)=P(~C~BA)=0. But P(C|A) = [ P(CBA)+P(C~BA) ] / [ P(CBA)+P(C~BA)+P(~CBA)+P(~C~BA) ], which reduces to P(CBA)/P(CBA) =1 after eliminating all the zero terms. That is to say, if P(B|A) = P(C|B) = 1, then P(C|A) = 1. Or, translating back into words, "If Socrates is human, and humans have ten fingers, then Socrates has ten fingers".
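If the algebra above is hard to follow, here is a minimal numerical check in Python - again my own sketch, not something from the post itself. The joint probabilities are arbitrary made-up numbers; the only constraints imposed are the two premises with probability 1.

# Made-up joint probabilities over (A, B, C). The only constraints are
# P(B|A) = 1 (no probability on A-without-B) and P(C|B) = 1 (none on B-without-C).
joint = {
    (True,  True,  True):  0.20,   # A, B, C
    (False, True,  True):  0.30,   # B and C, but not A
    (False, False, True):  0.05,
    (False, False, False): 0.45,
    # every outcome with A-and-not-B or B-and-not-C has probability 0
}

def prob(pred):
    # total probability of the outcomes satisfying the predicate
    return sum(p for outcome, p in joint.items() if pred(*outcome))

p_c_given_a = prob(lambda a, b, c: a and c) / prob(lambda a, b, c: a)
print(p_c_given_a)   # 1.0 - the syllogism recovered as a special case

Try moving a little probability onto the forbidden outcomes, so that P(B|A) or P(C|B) dips slightly below 1, and P(C|A) will dip slightly below 1 as well - which is exactly where the rest of this post is headed.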

Don't worry too much if you got lost in the notation in the above paragraph. The important point is that Bayesian reasoning reduces to propositional logic in the special cases where the probability values are set to 1 or 0. Bayesian reasoning thereby completely encompasses and supersedes propositional logic, like general relativity supersedes Newtonian gravity.

What if the probabilities are not 100%? This is the real-life problem of dealing with uncertainties. What is the actual value of P(B|A), the probability that Socrates was human? Might he not have been an alien, or an angel? As ridiculous as these possibilities seem, they ruin our complete certainty and make propositional logic flounder. What about P(C|B) - the probability that a human has ten fingers? It's certainly not 100%. And what can we conclude about P(C|A) - the probability that Socrates had ten fingers?

To tackle this question, we need to consider the following formula for P(C|A), which can be derived from a straightforward application of the rules and definitions mentioned earlier. The fact that this formula exists - that we can actually derive it and use it to perform exact calculations - is one of the compelling fruits of the Bayesian way of thinking. Here it is:

P(C|A) = P(C|BA)P(B|A) + P(C|~BA)P(~B|A)
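For the curious, here is one way to derive it. Any outcome where both C and A hold either also has B hold, or it doesn't, so P(CA) = P(CBA) + P(C~BA). Dividing both sides by P(A) and applying the definition of conditional probability twice (assuming none of the denominators are zero) gives P(C|A) = P(CBA)/P(A) + P(C~BA)/P(A) = [P(CBA)/P(BA)][P(BA)/P(A)] + [P(C~BA)/P(~BA)][P(~BA)/P(A)] = P(C|BA)P(B|A) + P(C|~BA)P(~B|A).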

Let's say that Socrates has a P(B|A)=0.999 999 chance of being human, and that, given his humanity, he has a P(C|BA)=0.998 chance of having ten fingers. This means that P(~B|A) = 0.000 001 is the chance that Socrates was not human. The last factor we need to know, P(C|~BA), is the probability that a non-human Socrates had ten fingers. This is nearly impossible to estimate, as we'd have to consider all the different things Socrates could have been - alien, angel, a demon in disguise, etc. But it will turn out not to matter much for our final result. Let's just assign P(C|~BA)=0.1. Plugging in the numbers and calculating, we get that P(C|A) = 0.997999102. That is to say, Socrates almost certainly had ten fingers.
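If you'd like to check the arithmetic yourself, the whole calculation is a few lines of Python (my own sketch, using the made-up numbers above):

p_b_given_a   = 0.999999   # P(B|A): Socrates was human
p_c_given_ba  = 0.998      # P(C|BA): a human Socrates had ten fingers
p_c_given_nba = 0.1        # P(C|~BA): a non-human Socrates had ten fingers (a rough guess)

p_nb_given_a = 1 - p_b_given_a   # P(~B|A) = 0.000 001

p_c_given_a = p_c_given_ba * p_b_given_a + p_c_given_nba * p_nb_given_a
print(p_c_given_a)   # approximately 0.997999102

Notice that doubling or halving the guess for P(C|~BA) barely moves the answer, because it gets multiplied by the tiny P(~B|A) - that's why the guess doesn't matter much.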

If you want extra practice, you can also try using the formula on the "blind man getting traffic tickets" scenario above, and see why a blind man is not likely to get traffic tickets. Or you can wait until next week's post to get the answer.

But again, don't worry too much about the details of the numerical calculation. The important point is that Bayesian reasoning provides an exact formula for calculating the probability of a conclusion, even when the premises are themselves only probabilistic - which is always the case in the physical universe. Furthermore, the conclusions drawn this way are compelling, because they are mathematical results. If you accept the probabilities of the premises, then you must accept the conclusion. This is the same compelling force that is at work in propositional reasoning: the premises lead to inescapable conclusions.

I hope that this example demonstrates the usefulness of the Bayesian way of reasoning. It can actually be applied to situations with uncertain premises - which is to say, nearly all situations. It is completely rigorous, in that a correct Bayesian argument forces you to accept its conclusions if you accept its premises. Yet it's also flexible in assigning probabilities to reflect your current, subjective, personal degree of belief in the truthfulness of a statement. It duplicates propositional logic as a special case, and in its full form it's more general and more powerful. There are other advantages I have not yet touched on, such as its ability to naturally explain inductive reasoning and Occam's razor, and how it serves as the framework for the scientific method. On the whole, it encompasses a great deal of what it means to be a logical, rational, and scientific thinker.

In my next post, I will discuss a particularly important formula in this probabilistic way of thinking, one that is nearly synonymous with Bayesian reasoning - Bayes' theorem.


You may next want to read:
Basic Bayesian reasoning: a better way to think (Part 2) (Next post of this series)
What is "evidence"? What counts as evidence for a certain position?
Miracles: their definition, properties, and purpose
Another post, from the table of contents

The dialogue between two aliens who found a book on Earth

Image: Toy Story aliens from Amazon.com
Alice and Bob are two aliens. On their interstellar journey, they passed by Earth and picked up a book there. Some time afterwards, they meet to discuss their new acquisition.

Alice:
So, have you had a chance to look at that Terran artifact that we got?

Bob:
I have. It has many fascinating properties. My colleagues and I have studied it quite thoroughly, and although there are obviously still more discoveries to be made, we can make some definite statements about this object.

Alice:
Great! I was looking at it too, and I wanted to talk to you about what I found. The alien artifact is clearly a book, and it's got an... interesting... message.

Bob:
Well, it's made of many thin sheets of cellulose fiber, upon which appear characters consisting of a carbon black mixture. I suppose you can call this configuration of materials a "book" if you'd like.

Alice:
Um... yeah, sure. It's made of paper, and the letters are in ink. Of course. But really I'm interested in its meaning.

Bob:
"Meaning"? I don't know what you mean.

Alice:
You mean that you haven't found the message of the book? I thought you said you looked at the book, and studied the writing on it.

Bob:
No, I mean that your question is nonsense. What is "meaning"? We certainly haven't found anything like that in this "book". The object is as I have described it: thin sheets of cellulose fiber, upon which there appear characters consisting of a carbon black mixture. That is all our empirical investigations have found. "Meaning" is not a component of the object, as far as we know. And, although this is a minor point, I must correct your usage of the words "letter" and "writing". The body of professional typographers to which I belong has decided that the technically correct typographic designation for these markings is "characters".

Alice:
Well, okay, whatever, but you haven't actually read those "characters"?

Bob:
Again you're not making any sense. What do you mean by "read"? The characters are characters. Although we have now studied them in great detail and can say a great deal about them, the best way to describe them remains "characters consisting of a carbon black mixture".

Alice:
How could you have studied them and not know how to read them, or their meaning? Look, you see here at the beginning of the book, where the letters "I" and "n"...

Bob:
...Characters.

Alice:
Whatever. Where the characters "I" and "n" appear together, making the word "In"? That combination of characters has a meaning, of being contained by something, or near the center of something, or surrounded by something. And the next word is "the", which is...

Bob:
Wait a minute. "Word"? "meaning"? what do those words mean? This sounds like more nonsense. All you've shown me are just characters.

Alice:
Yes, but the characters form words, which form phrases, which form sentences, which form paragraphs, then...

Bob:
Wait, wait, slow down. That is a lot of entities you've brought up just now that I'm not sure can be empirically verified. So you say that characters form "words"?

Alice:
Yes! Like the words right here, "In", "the", and so forth.

Bob:
You've merely pointed to a set of five characters. Of course you can have sets of characters. You can group the characters however you'd like. But they're still just characters. Where is the "word"? We typographers have not found anything like that in our study of this object.

Alice:
A group of characters IS a word. And each word has a meaning that can be determined in conjunction with its place in the sentence, which...

Bob:
Okay, let me see if what you're saying makes any sense. So, if I show you a character, you can tell me what "word" it belongs to? And what "meaning" it has?

Alice:
Yes.

Bob:
What about this character over here, this "I" character?

Alice:
That? That is the word "I", which means the self, the first person, the one who is also the speaker or the writer of that sentence.

Bob:
But clearly that is the character "I". Where is the "word"?

Alice:
The character IS the word.

Bob:
How could this be a "word" when it is clearly a character? I thought you said that "words" were groups of characters.

Alice:
This happens to be a one-character word.

Bob:
Well, isn't that convenient for you. I see no empirical evidence for any of this. And you say that this "word" is a character but also a "word", that it has "meaning"?

Alice:
Yes, of course. "I" is a particularly important word, with a very important meaning. It can refer to the author of the book, or it can be used in a rhetorical device to address a hypothetical person, or used by a character in a fictional story. You have to look at the context to figure it out. In an abstract sense, a lot of the literature in this book is about the relationship between the "I" and the...

Bob:
Enough. This is all nonsense. "I" is a character, but it is also supposed to be a "word", which has a "meaning", which can also be one of many different "persons"? I, as an empirical typographer, cannot accept such untypographical statements.

Alice:
But can't you clearly see that the characters form words?

Bob:
"Words"? I have no need for such a hypothesis. Everything you've mentioned, everything you've brought up, are only characters. You have no evidence that there is anything else.

Alice:
Look, if you'd just learn to read, you'll see that this is a book of profound truth and meaning. You have to recognize the meaning of each word and learn to use them in the context of a sentence, and build up your ability to interpret the writing all the way up to its full literary context, taking the author's intentions into account. It all makes sense once...

Bob:
You simply see many characters in this object, and in your wishful thinking you have concluded that there must be some "meaning" to them all. You therefore construct this convoluted system of "words", "sentences", "paragraphs", and on top of that, "rhetoric" and "literature", which is all supposed to express some "meaning" that reflects on some "truth" expressed by some "author"! And yet you can provide no empirical evidence for any of it. The reality is that we have investigated the characters in this object and they are now very well understood. There are no "words" to be found in them. Any such ideas are the products of a gullible mind, the yearnings of untypographical individuals given to delusion. Such thinking is the opium of the masses.

Alice:
You have to begin by learning to understand the meaning in the words. That's how you learn to read. Once you start, you'll see that it all makes sense.

Bob:
So you have to buy into this "meaning" business to see that there is "meaning"? That's circular reasoning. Your argument is invalid.

Alice:
Look, let me read you some passages from the book. You'll see that there is in fact meaning to be found in the object. Watch. I'll read this passage, and you'll see that the characters here become words and sentences and have meaning. Listen: "... these are written so that you may..."

Bob:
Your cultic "reading" rituals are not evidence. All you were doing was to scan your eyes over the characters and making corresponding sounds with your mouth. In fact, upon studying your "reading" rituals, it becomes totally obvious that you're failing to understand the typography of the characters, which is the only underlying entity that actually exists. There is no "meaning".

Alice:
What do you mean?

Bob:
I mean that your so-called "reading" is merely you responding to the typography of the characters. For instance, through our intensive empirical study of this object conducted at the millimeter scales, we have found that each character takes on two forms: an upper case form and a lower case form. For instance, the "t" character is lower case, and its upper case form is "T". Furthermore, we have found that after a "." character, the next character is always in the upper case form. There are other such laws of typography we've discovered - for example, a "q" character is always followed by a "u" character. All of this is verifiable through empirical observations. All you're doing when you're "reading" is employing these laws to look at the characters and making the corresponding sound with your mouth. At the bottom, it's only typography, and because you fail to understand typography you imagine that you're "reading".

Alice:
Look, of course things like clean writing, punctuation, and spelling are important, but you're missing the point here. To read the book means to get its meaning out of it.

Bob:
Then why is it that when I see you "read", I only see you employing typography when I break it down to what's really going on? Or let me put it this way: could you still "read" if the laws of typography were different? For example, if "T" characters looked like ";" characters, and the characters all ran together without any space between them?

Alice:
Of course not. I'm not being anti-typography. I obviously employ it in reading the book. I'm saying that there's meaning behind it all.

Bob:
Yes, you are being anti-typography. You're clinging to your "meaning" instead of recognizing that all of your "meaning" comes from the characters arranged according to typographical laws.

Alice:
So you're completely rejecting the idea of any meaning in the book?

Bob:
I am only holding to beliefs which have been empirically verified. Of course, there is still a possibility that your "meaning" exists, although there is no evidence for it. But if it does exist, even that meaning will be found by typographically examining the characters. For instance, one of the issues at the frontier of typographical research is the similarity between the "1" character and the "I" character. Some typographers suspect that there is even a difference between the "I" character and the "l" character, which would open up the possibilities for discovering new typographical laws. We cannot be certain yet, of course. This will require exciting new studies at the sub-millimeter scales. It may be that your "meaning" will be discovered at these sub-millimeter scales or in these new typographical laws, although I see no reason to expect that to happen.

Alice:
But that's not what's meant by "meaning" at all. The meaning of any object is found outside the object itself. Meaning is what's meant by the author of the book, what's intended for the readers of the book to understand.

Bob:
There you go again with your circular definitions. "Not what's meant by meaning"? What does that even mean? The fact that you're not excited by the prospect of progress at the typographical frontier gives me reason to believe that your "meaning" is antithetical to typography, only brought about by your ignorance of typographical matters. The truth is that there is no outside "meaning". The object is exactly what you'd expect it to look like if it was blindly created only from the laws of typography.

Alice:
Okay, so what are all these characters in the book for then? Why do they exist? What's their reason for being?

Bob:
There is no ultimate, outside "meaning", but there is perhaps meaning to be found in the beauty of the laws of typography, in exploring their depths and appreciating that they are all meaningless. We've found this meaningless "book", and it is up to each of us to choose to give it meaning. I think that's actually far more beautiful and profound than trying to discover some "meaning" that's thrust upon us. It may be depressing to think that there is no ultimate purpose or "meaning" in this object, but we can't let that depression beat us. We find our meaning in fighting against that depression and finding our own meaning - in standing against the meaninglessness of it all. I make my own meaning.


You may next want to read:
Isn't the universe too big to have humans as its purpose?
How to determine the specific purpose of the universe
The trends in science as evidence for Christianity against atheism (part 1)
Another post, from the table of contents

How to determine the specific purpose of the universe

Last week, I cited the fine-tuning argument to conclude that the universe does have a purpose, for it is nearly impossible for its features to be the result of purposeless randomness. Just as a rational but ignorant alien who comes across a human book will conclude that it has a purpose, we too are compelled by the same reasoning to conclude that the universe has a purpose.

The fine-tuning argument will get its own series of articles in the future. But for now, here are the basics: the universe has certain fundamental parameters which must fall within exceedingly narrow ranges for life to have evolved in it. These values are so narrowly determined, and the probability of a random process generating them so low, that it would simply be called "impossible" in any ordinary situation. That is to say, a purposeless process would almost certainly not have created our universe.

This allows us to firmly conclude that the universe did not come about randomly, that it really does have a purpose. But what is that purpose? The fine-tuning argument only mentions life, so how do we go from that to the biblical claim of the universe being made by and for Christ? How do we know that the universe was not made for cats, or bacteria? In light of the many different life forms in existence, is it not merely human hubris to say that humanity, and especially one particular human, is the reason for the existence of the universe?

As before, we will approach this question using Bayesian inference, and start by tackling an easier, analogous question: that of an alien considering a book. How could our alien conclude that the purpose of this book was to convey information? After all, couldn't the book also serve as a paperweight, or kindling for a fire? How does the alien go from "this object has a purpose" to "that purpose is to convey information"?

Once again, Bayesian inference gives our alien the answer: look for features that could be anticipated, predicted, or explained by each of these purposes. These features then serve as evidence for the purpose which best predicts them. So, the thin paper pages of the book serve as evidence for both the "kindling" and the "information" hypothesis: both can explain why the book has thin paper pages. However, only the "information" hypothesis can explain why the pages contain symbolic markings, and this decides the question in favor of the "information" hypothesis.
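To sketch what this comparison might look like in numbers (my own illustration - the likelihood values are invented, not anything the alien could actually measure), one could give each hypothesis an equal prior and multiply in how well it predicts each observed feature:

# Two hypothetical purposes for the book, given equal prior credence.
priors = {"kindling": 0.5, "information": 0.5}

# Invented values for P(feature | purpose): how well each purpose predicts each feature.
likelihoods = {
    "thin paper pages":  {"kindling": 0.9,   "information": 0.9},   # both predict this well
    "symbolic markings": {"kindling": 0.001, "information": 0.9},   # only one predicts this
}

posterior = dict(priors)
for feature, p in likelihoods.items():
    for hypothesis in posterior:
        posterior[hypothesis] *= p[hypothesis]

total = sum(posterior.values())
posterior = {h: v / total for h, v in posterior.items()}

print(posterior)   # "information" ends up with nearly all of the probability (about 0.999)

The feature both hypotheses explain equally well (thin pages) cancels out; the feature only one hypothesis explains (the markings) does all the deciding - which is exactly the point of the paragraph above.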

Note that the alien's conclusion would be greatly strengthened by a knowledge of the language in the book. If he did not know the language, he might only tentatively infer that the markings in the book had meaning. But knowing the language brings with it a much greater certainty that the content of these markings is highly unlikely to have come about by chance. If you are ignorant of English, the word "meaning" looks like a random sequence of letters, and you may decide that it was just randomly put together. But as someone who understands English, you know that this sequence of letters is not likely to be the result of chance. When applied to the text in our book, this low probability then serves as strong evidence that the book's purpose really is to convey information.

Once again, it all comes down to probability. The true purpose of the book is that which best explains the least probable feature of the book. Since the least probable feature of the book is its text, its true purpose is that which explains that text: the book exists to transmit information. Reaching this conclusion is greatly aided by knowledge of the language. This extends to all objects: the true purpose of a given object is whatever best explains its least probable features - and recognizing those features is greatly assisted by some prior knowledge.

Now that we've considered this hypothetical book, let's apply the same reasoning to the purpose of the universe: the universe is designed for life, as per the fine-tuning argument, but this does not distinguish between humans, cats, or bacteria being that purpose. If we consider only fine-tuning, we cannot tell whether cats exist to make us laugh or we exist to serve cats. Or perhaps we both exist to serve bacteria. However, this equivalence is broken upon considering other features of the universe, such as human civilization. Only the primacy of humans can explain why humans have achieved civilization while cats and bacteria have not: the other hypotheses cannot explain why this highly unlikely feature should exist for humanity.

Note that this conclusion is likely to be reached by someone with some prior knowledge of human civilization, who understands that civilization is not something that could have come about by chance. A random pile of matter - even a random pile of matter put together by humans - is unlikely to result in civilization. So only by being ignorant of human civilization - only by failing to recognize its low probability starting from randomness - can one claim that the purpose of the universe is to generate cats or bacteria. Those who recognize civilization therefore rightly conclude that its low probability is firm evidence for placing humans at the apex of the purpose of the universe.

It again comes down to probabilities. Humans are the most complex life-forms, and we're the only ones to have achieved an advanced civilization. Both complexity and civilization are low-probability events: therefore among the life-forms we are the least likely to have randomly evolved. And among the humans, Jesus was the least likely person to have ever lived: one does not just randomly fulfill messianic prophecies, then randomly say the things that Jesus said about himself, then randomly lead a morally perfect life, then randomly rise from the dead. But if Jesus really is the incarnate God for whom the universe was created, then everything is explained.

That is how you go from merely stating that the universe has a purpose, to specifying that purpose. That is how you narrow down from the purpose existing, to it being life, to humanity, and finally to Jesus. The purpose of the universe is that which explains the least probable features of the universe: and as the least likely member of the least likely species in our improbable universe, Jesus Christ was that purpose. And to all who acknowledge him, he gives the ability to become children of God.


You may next want to read:
The dialogue between two aliens who found a book on Earth
The biblical timeline of the universe
Isn't the universe too big to have humans as its purpose?
Another post, from the table of contents