Plausible Reasoning

The seeds of the modern era could arguably be traced to the Enlightenment and the invention of rationality. I say invention because although we may be universal computers and we are certainly capable of applying the rules of logic, it is not what we naturally do. What we actually use, to borrow the term from E.T. Jaynes’s iconic book Probability Theory: The Logic of Science, is plausible reasoning. Jaynes is famous for being a major proponent of Bayesian inference during most of the second half of the last century. However, to call Jaynes’s book a book about Bayesian statistics is to wholly miss Jaynes’s point, which is that probability theory is not about measures on sample spaces but a generalization of logical inference. In Jaynes’s view, probabilities measure a degree of plausibility.

I think a perfect example of how unnatural the rules of formal logic are is the simple implication

A \rightarrow B

which means: if A is true then B is true. By the rules of formal logic, if A is false then B can be true or false (i.e. a false premise can prove anything). Conversely, if B is true, then A can be true or false. Beyond the implication itself, the only valid conclusion you can deduce from A \rightarrow B is that if B is false then A is false. Implication is equivalent to the logical statement (\neg A) \vee B, where \neg means negation and \vee means logical OR.
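To make the truth table concrete, here is a minimal Python sketch that enumerates every truth assignment consistent with A \rightarrow B. The helper name implies and the variable names are my own choices for illustration, not anything from Jaynes.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Material implication A -> B, written as (not A) or B."""
    return (not a) or b

# Keep only the truth assignments (A, B) that are consistent with A -> B.
consistent = [(a, b) for a, b in product([True, False], repeat=2) if implies(a, b)]

for a, b in consistent:
    print(f"A={a!s:5}  B={b!s:5}")

# What can still be said about one proposition once the other is known?
a_given_b_true = {a for a, b in consistent if b}       # A is undetermined
a_given_b_false = {a for a, b in consistent if not b}  # A must be false
b_given_a_false = {b for a, b in consistent if not a}  # B is undetermined
```

Only the assignment with A true and B false is excluded, which is why knowing that B is false is the one case that pins down A.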

However, people don’t always (or perhaps only seldom) reason this way. Jaynes points out that the way we naturally reason also includes what he calls weak syllogisms: 1) If A is false then B is less plausible and 2) If B is true then A is more plausible. In fact, we most likely rely on weak syllogisms most of the time, and that interferes with formal logic. Jaynes showed that weak syllogisms as well as formal logic arise naturally from Bayesian inference.

We can see this more clearly in a concrete example. Let’s say that A is the statement “it is raining” and B is the statement “it is cloudy”. So A \rightarrow B is equivalent to “if it is raining then it is cloudy”. The only logical inference that can be drawn is that if it is not cloudy then it is not raining. In particular, “it is not raining” does not imply that it is not cloudy, but it does imply that being cloudy is less plausible. Conversely, “it is cloudy” doesn’t mean it is raining, but it does imply that rain is more plausible. This is why if it is cloudy we may bring an umbrella and if it is not raining we may reach for some sunblock. If we operated purely using formal logic, we would never draw such conclusions.

Jaynes showed that we can quantify weak syllogisms using Bayesian inference if we let probability measure the degree of plausibility. Let the probability that A is true be given by P(A) and the probability that A is not true be given by P(\neg A). The probability that A is true if B is true is represented by P(A|B). There are similar relations for B by reversing A and B. The relationship between joint and conditional probability

P(A,B)=P(A|B)P(B)=P(B|A)P(A)

leads immediately to Bayes’ theorem

P(A|B)=P(B|A)P(A)/P(B),     (1)

where P(B)=P(B|A)P(A)+P(B|\neg A)P(\neg A). In Bayesian lingo, P(A|B) is called the posterior probability, P(A) is called the prior probability and P(B|A) is called the likelihood function. We can similarly rearrange variables to obtain

P(B|A)=P(A|B)P(B)/P(A),   (2)

P(A|\neg B)=P(\neg B|A)P(A)/P(\neg B),   (3)

P(B|\neg A)=P(\neg A|B)P(B)/P(\neg A).   (4)
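As a sanity check on equations (1) through (4), here is a small Python sketch that computes every quantity from a joint probability table. The numbers in the table are made up purely for illustration, and the function names are my own.

```python
import math

# Hypothetical joint probabilities P(A, B), keyed by (A, B) truth values.
# The numbers are invented for illustration and sum to 1.
joint = {
    (True, True): 0.2,
    (True, False): 0.1,
    (False, True): 0.3,
    (False, False): 0.4,
}

def p_a(a):
    """Marginal probability that A takes the truth value a."""
    return sum(p for (x, _), p in joint.items() if x == a)

def p_b(b):
    """Marginal probability that B takes the truth value b."""
    return sum(p for (_, y), p in joint.items() if y == b)

def p_a_given_b(a, b):
    """Conditional probability P(A=a | B=b) = P(A=a, B=b) / P(B=b)."""
    return joint[(a, b)] / p_b(b)

def p_b_given_a(b, a):
    """Conditional probability P(B=b | A=a) = P(A=a, B=b) / P(A=a)."""
    return joint[(a, b)] / p_a(a)

# Each entry compares the left and right sides of one numbered equation.
checks = {
    1: (p_a_given_b(True, True), p_b_given_a(True, True) * p_a(True) / p_b(True)),
    2: (p_b_given_a(True, True), p_a_given_b(True, True) * p_b(True) / p_a(True)),
    3: (p_a_given_b(True, False), p_b_given_a(False, True) * p_a(True) / p_b(False)),
    4: (p_b_given_a(True, False), p_a_given_b(False, True) * p_b(True) / p_a(False)),
}
all_hold = all(math.isclose(lhs, rhs) for lhs, rhs in checks.values())

# Law of total probability used for P(B) in the denominator of equation (1).
total_b = p_b_given_a(True, True) * p_a(True) + p_b_given_a(True, False) * p_a(False)

print(all_hold, math.isclose(total_b, p_b(True)))
```

The four equations are the same identity with the roles of A, B and their negations swapped, so any valid joint table should pass all four checks.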

 

We can now use these formulas to parse logic and plausible reasoning. The statement that A implies B means that P(B|A)=1. We can see this using equation (2): if A implies B, then the joint proposition “A and B” is equivalent to A, so P(A|B)P(B)=P(A,B)=P(A). Now suppose that B is false. Then we can use equation (3) to find out about A. Here the probability that B is false and A is true is zero, so P(\neg B|A)P(A)=0. Hence, we conclude that P(A|\neg B)=0 (assuming P(\neg B)>0), or A is false.

Now what about the weak syllogisms? Well, let’s consider what happens to our probability for A if B is true. We can use equation (1). We know that P(B|A)=1 from before. Since P(B) \le 1, it follows that P(A|B)=P(A)/P(B) \ge P(A). Hence knowing that B is true increases the probability that A is true. Finally, what happens if A is false? Here we can use equation (4) to find out about B. From P(A|B) \ge P(A) we can obtain P(\neg A|B) \le P(\neg A) (since P(\neg x) = 1 - P(x)), which leads to P(B|\neg A) \le P(B), i.e. B is less plausible if A is false.
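The rain and cloud example can be checked numerically. The sketch below uses a made-up joint distribution in which “raining and not cloudy” has probability zero, which is how the implication A \rightarrow B is encoded; all numbers and names are assumptions for illustration only.

```python
# Hypothetical joint probabilities P(A, B) with A = "it is raining" and
# B = "it is cloudy". Setting P(A, not B) = 0 encodes the implication A -> B.
joint = {
    (True, True): 0.2,
    (True, False): 0.0,
    (False, True): 0.3,
    (False, False): 0.5,
}

def marginal_a(a):
    """Marginal probability that A takes the truth value a."""
    return sum(p for (x, _), p in joint.items() if x == a)

def marginal_b(b):
    """Marginal probability that B takes the truth value b."""
    return sum(p for (_, y), p in joint.items() if y == b)

def a_given_b(a, b):
    """Conditional probability P(A=a | B=b)."""
    return joint[(a, b)] / marginal_b(b)

def b_given_a(b, a):
    """Conditional probability P(B=b | A=a)."""
    return joint[(a, b)] / marginal_a(a)

p_rain = marginal_a(True)          # prior P(A)
p_cloudy = marginal_b(True)        # prior P(B)

strong_1 = b_given_a(True, True)   # P(B|A): raining, so surely cloudy
strong_2 = a_given_b(True, False)  # P(A|not B): not cloudy, so surely no rain
weak_1 = a_given_b(True, True)     # P(A|B): cloudy makes rain more plausible
weak_2 = b_given_a(True, False)    # P(B|not A): no rain makes clouds less plausible

print(f"P(B|A)={strong_1}, P(A|not B)={strong_2}")
print(f"P(A|B)={weak_1:.3f} vs P(A)={p_rain:.3f}")
print(f"P(B|not A)={weak_2:.3f} vs P(B)={p_cloudy:.3f}")
```

With these numbers, learning that it is cloudy raises the probability of rain from 0.2 to 0.4, and learning that it is not raining lowers the probability of clouds from 0.5 to about 0.375, while the two strong syllogisms come out as exactly 1 and 0.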

I think this strongly implies that the brain is doing Bayesian inference. The problem is that depending on your priors you can deduce different things. This explains why two perfectly intelligent people can easily come to different conclusions. This also implies that reasoning logically is something that must be learned and practiced. I think it is important to know, when you draw a conclusion, whether you are using deductive logic or depending on some prior. Even if it is hard to distinguish between the two for yourself, at least you should recognize that it could be an issue.

7 thoughts on “Plausible Reasoning”

  1. There is no need to distinguish deductive logic from Bayesian inference because deep down they are the same thing! When doing Bayesian inference, you *are* inevitably doing formal logic, although typically on a different set of propositions than in the naive approach you used to demonstrate its purported inadequacy.

    The theory of probability easily reduces to propositional logic when you allow the propositions to represent relationships between (finite) sets. Even though Jaynes goes to great lengths to avoid it in his intro (even more so than R. Cox), he seems to fall back on this idea when he comes to explaining maximum entropy. In one paper he goes through Laplacian “multiplicities”, a set-theoretic concept. Moreover, the translations of “problem statements” into “probability assignments” by his “robot” can always be presented using combinatorial arguments, although Jaynes was careful to point out that there’s often no need to enumerate all the possibilities explicitly. Anyway, whether we talk about “priors” or “which sets to consider” is just a matter of linguistics.

    I like to think that Jaynes was a “frequentist” himself, in the sense that he anchored his reasoning not in counting the outcomes of some physical random experiments, but in counting possibilities and set sizes corresponding to the various propositions in a model world.

    By the way, your chosen example is not so much propositional logic vs. Bayesian inference, but propositional logic vs. causal reasoning (where the time dimension plays a key role). Most learners have trouble grasping logical implication because they naturally attempt to explain it in causal terms.

  2. Hi,

    Thanks for your comments. I think applying causality is a big part of the problem, and there may also be confusion between normative and descriptive statements. I also think the idea that a false premise can imply anything is not “natural” to most people, but we can debate this.

  3. Another explanation of the confusion about implication is that the proposition “A => B” is easily misinterpreted as a statement about the possibility of logical derivation (“from A we can infer B” – note the hidden time dimension in here – it also reflects the causal meaning of the word ‘imply’; curse mathematicians for overloading such words).

    The correct interpretation is that A => B expresses one’s assessment of the actual truth values of A and B. I find that it becomes clear when viewed as the negation of “A and (not B)”. In words: “I know for sure it’s not the case that both A is true and B is false (but I don’t know anything beyond that)”. Given the potential for confusion, it wouldn’t do harm to get rid of the => operator entirely.

    As for the “false premise implies anything” saying, it seems to be another source of confusion, since it is naturally misread as “from a false premise anything can be derived”. However, as stated above, the => operator does not represent logical derivation.

    I’m curious about your distinction between normative and descriptive statements. As far as I can see, propositional logic and probability theory only deal with descriptive statements…?

  4. I just meant that perhaps people get “is” and “should be” confused. So if a statement involves something that they find morally disagreeable, then that may cloud their reasoning.

  5. I thought by “normative” you’d meant something more because your blog post uses modal words abundantly. In particular, you presented implication in “if … then” terms and wrote about what “can” be true rather than what “is” known to be true.

    I have a slight suspicion that propositional logic may appear “normative” when you treat the symbols as propositional variables rather than as concrete propositions, but I’m not sure how these two approaches are related or even if the concept of “propositional variables” (each of which stands for an unknown truth value of some well-defined proposition) makes much sense.
