Saturday, April 25, 2015

Here we go again... 'extraordinary claims require extraordinary evidence'

I've already posted about the dangers of this quote, but I recently saw it on a Portuguese skeptics' web site and decided to give it another go.

Before we get to it, I should clarify that my posts on skeptics/atheists are based on my experience with US atheists and skeptics and their non-American orbiters. The Portuguese group and a recently found YouTube channel seem to be much more interesting.

Ok, so what about the "extraordinary claims require extraordinary evidence" quote, which on the Portuguese web site is attributed to Carl Sagan (though I think Rev. Thomas Bayes got there first) and has been misattributed to Christopher Hitchens in a number of places?


Your rationale, at least in Bayesian terms, is correct...


The statement reflects a quasi-Bayesian view of the world, which I like: say an extraordinary proposition $P$ is to be supported by evidence $E$; what does Sagan's statement mean?

Let $\Pr(P)$ be the prior probability of $P$, that is, the degree to which the person receiving the evidence believes in $P$ prior to the evidence being presented. After the evidence is presented, the proposition can be evaluated by its posterior probability. To compute the posterior probability, $\Pr(P|E)$, we need to know a couple of other things: the conditional probability of the evidence given the proposition, $\Pr(E|P)$, and the unconditional probability of the evidence, $\Pr(E)$.

Whoa! Too much math, I can hear a lot of people who like science as long as they don't have to learn any saying...

Ok, say $P$ is "Bob is a powerlifter" and $E$ is "Bob bench-presses 400Lbs." If about one out of one hundred people in our social circle are powerlifters, then $\Pr(P) = 0.01$. If Rose tells me that Bob bench-pressed 400Lbs, I need to know a couple of things to determine whether Bob really is a powerlifter:

1. Given that $X$ is a powerlifter, how likely is $X$ to bench 400Lbs? (Most powerlifters can bench 400Lbs, but our social group might have a few weakling powerlifters, usually bodybuilders who are trying to pretend they're athletes.) This gives me the $\Pr(E|P)$.

2. What percentage of our social group benches 400Lbs? Many football and basketball players, who are not powerlifters, bench 400Lbs; there might be several of these in our social group. This gives me the $\Pr(E)$. Note that this includes the powerlifters too; it's a proportion of everybody.

Note that the ratio of these two quantities, $\Pr(E|P)/\Pr(E)$, is a measure of informativeness of the evidence:

Say $\Pr(E|P) = 0.75$, three-quarters of all powerlifters can bench 400Lbs, and $\Pr(E) = 0.75$, meaning that so can the same proportion of everyone. In this case, the evidence is quite useless. It's not informative at all, as can be seen by computing the posterior probability using Bayes's rule:
\[
\Pr(P|E) = \frac{\Pr(E|P) \times \Pr(P)}{\Pr(E)} = \frac{0.75 \times 0.01}{0.75} = 0.01
\]
The "evidence" doesn't change the probability of the proposition; in other words, no information comes from knowing it. (For the purposes of determining $P$, that is; we do learn that Bob can bench 400Lbs, which might be useful when we need a friend to help us move.)

Now, say $\Pr(E|P) = 0.75$, so three-quarters of all powerlifters can bench 400Lbs, same as above, and $\Pr(E) = 0.05$, meaning that only a few people in the social group can do it. This suggests that the evidence is somewhat dispositive, as can be computed by:
\[
\Pr(P|E) = \frac{\Pr(E|P) \times \Pr(P)}{\Pr(E)} = \frac{0.75 \times 0.01}{0.05} = 0.15
\]
Note that the probability increased fifteen-fold. This is strong evidence, but, given the low prior probability, Bob's powerlifter-ness is still very much in question. The 400Lbs bench wasn't "extraordinary"-enough evidence.*
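For those who'd rather check the arithmetic than trust it, here is a minimal Python sketch of the two cases above. The function name `posterior` and its argument names are my own labels for illustration; the numbers are the ones used in the text.

```python
def posterior(prior, likelihood, evidence):
    """Bayes's rule: Pr(P|E) = Pr(E|P) * Pr(P) / Pr(E)."""
    if not 0 < evidence <= 1:
        raise ValueError("Pr(E) must be in (0, 1]")
    return likelihood * prior / evidence


prior = 0.01       # Pr(P): one in a hundred people are powerlifters
likelihood = 0.75  # Pr(E|P): three-quarters of powerlifters bench 400Lbs

# Pr(E) = 0.75 is the uninformative case, Pr(E) = 0.05 the informative one.
for evidence in (0.75, 0.05):
    informativeness = likelihood / evidence
    post = posterior(prior, likelihood, evidence)
    print(f"Pr(E)={evidence:.2f}  Pr(E|P)/Pr(E)={informativeness:.1f}  Pr(P|E)={post:.2f}")
```

The posterior is just the prior multiplied by the informativeness ratio, which is why a ratio of 1 leaves the prior untouched.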

In general, the lower the $\Pr(P)$, i.e. "more extraordinary claims", the higher $\Pr(E|P)/\Pr(E)$ must be for accepting $P$, i.e. "require more extraordinary evidence." So far, so good for Carl Sagan.


... but your understanding of human psychology is lacking


The problem is the lack of indices in those probabilities above. In particular, the lack of an index to separate different probabilities assigned by different people. (Oh, and people also have assorted cognitive biases that make this worse, but we don't need them to make the case so let's stay within the bounds of strict Bayesian rationality.)

What the Sagan quote gets wrong is that what person A thinks is an extraordinary claim and what person B considers an extraordinary claim can be opposed. Typically, when people invoke that quote, what they mean is:

"Claims that contradict that which I and my circle of friends believe in are such proofs of stupidity by those who believe them, that I'll quote Carl Sagan and stop trying to engage in intellectual discussion." 

For what it's worth, I think this is mostly applicable to the US/US-orbiter crowd. The Portuguese web site and the skinny YouTube Brit seem to be actually trying to engage people.

Let's consider Bob's case again, but now the probabilities have a subscript, $A$ or $C$, for Arnold or Cooper.

Arnold only knows bodybuilders and powerlifters (and has taken too much Deca-Durabolin and Dianabol for his brain to be able to tell the difference) so he thinks that almost everyone is a powerlifter, $\Pr_A(P)=0.99$. All powerlifters in Arnold's world can bench well above 400Lbs, so his $\Pr_A(E|P)=1$, and he assumes everyone else is a weakling who can't bench an unloaded bar, so with minimal computation we get $\Pr_A(E)=0.99$.

Cooper, on the other hand, when he's not busy writing flawed papers about the benefits of running [ahem: if you ignore self-selection], believes that most people are not powerlifters, $\Pr_C(P) = 0.01$, but knows quite a few people who can bench 400Lbs (because they played football or do some real exercise in a gym on the sly) and knows only one [pretend] powerlifter who can't bench 400Lbs (probably a bodybuilder), so he thinks that $\Pr_C(E|P)=0.5$ and $\Pr_C(E)=0.45$. (For kicks, compute Cooper's probability that a non-powerlifter can bench 400Lbs.)

Arnold and Cooper are having a discussion about Bob. Arnold, with a thick Austrian accent despite having lived in California for almost fifty years, says:

"Auf kourse Bob ees a powerlivtehr. Ohlmahst eferryvon ees."

Cooper: "No, he's not, and they aren't."

Rose arrives and says Bob benches 400Lbs.

Cooper: "That changes almost nothing."
(He's right: $\Pr_C(P|E) = 0.011$)

Arnold: "Zee? Unwiderlegbar [incontrovertible – ed.] evidenz! Ah'll pahmp yew Ahp!" [Strikes frontal biceps pose.]
(He's right: $\Pr_A(P|E) = 1.00$.)

Cooper, who finds the proposition "Bob is a powerlifter" extraordinary, does require extraordinary evidence to change his mind, and considers the evidence that Rose brought in to be insufficient. Arnold thinks it's dispositive evidence. Any discussion between them that doesn't start by acknowledging that their probabilities are different will be a pointless waste of time.
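The two posteriors can be checked with a short Python sketch. The function and dictionary names are my own, not anything standard; the probabilities are Arnold's and Cooper's from above. The second function rearranges the law of total probability, $\Pr(E) = \Pr(E|P)\Pr(P) + \Pr(E|\neg P)(1-\Pr(P))$, which is one way to approach the "for kicks" question about Cooper.

```python
def posterior(prior, likelihood, evidence):
    """Bayes's rule: Pr(P|E) = Pr(E|P) * Pr(P) / Pr(E)."""
    return likelihood * prior / evidence


def likelihood_given_not_p(prior, likelihood, evidence):
    """Solve Pr(E) = Pr(E|P) Pr(P) + Pr(E|not P) (1 - Pr(P)) for Pr(E|not P)."""
    return (evidence - likelihood * prior) / (1.0 - prior)


# Each person's own (subscripted) probabilities, as given in the text.
beliefs = {
    "Arnold": {"prior": 0.99, "likelihood": 1.0, "evidence": 0.99},
    "Cooper": {"prior": 0.01, "likelihood": 0.5, "evidence": 0.45},
}

for name, b in beliefs.items():
    print(
        f"{name}: Pr(P|E) = {posterior(**b):.3f}, "
        f"Pr(E|not P) = {likelihood_given_not_p(**b):.3f}"
    )
```

Same evidence, same rule, different inputs: the disagreement lives entirely in the subscripts.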

A pointless waste of time, that is, if the purpose is to convince. I believe that that was Carl Sagan's purpose, unlike many of the current-day best-selling popular-in-America skeptics whose purpose seems to be a convex combination of politicking (and I mean in the sense of promoting a particular political party, not just policies) and monetizing their echo chamber.

Monetizing the echo chamber: like preaching to the choir, but with monetary reward.

(Again, as far as I can tell, not what the Portuguese skeptics or the needs-a-sandwich YouTuber are doing.)


Bonus: the effect of informativeness of evidence




Informativeness in this table is $\Pr(E|P)/\Pr(E)$. So, for example, for someone who has $\Pr(P) = 0.000025$, i.e. is very skeptical about the proposition, to become completely uncertain about $P$, $\Pr(P|E) = 0.5$, you need the evidence to be twenty thousand times more likely if $P$ is true than overall, $\Pr(E|P) = 20000 \times \Pr(E)$.

Note that when $\Pr(E|P)/\Pr(E) < 1$, the evidence is against the proposition, in that the probability of observing the evidence given the proposition is lower than the incidence of the evidence in the general population of events.

For kicks, why are some cells greyed-out? Is this a cop-out to avoid showing probabilities above 1, or is there a real reason why informativeness is bounded above by some limit? Hint: there's a real reason.

-- -- -- --

* Of course, the most dispositive test would be to show Bob a squat rack; if Bob said "what is that?" (normal person) or "it's for doing standing curls, isn't it?" (bodybuilder), that would be proof of non-powerlifter-ness. A powerlifter would say "I'd rather use a cage or a monolift, but sure I'll SHUT UP AND SQUAT!"

Saturday, April 11, 2015

The sky is blue, therefore no vodka on transatlantic flights

Consider a truthful proposition, say "the sky is blue." In this hypothetical, imagine that for historical reasons a majority of the people are indoctrinated to believe that the sky is red; a minority of the people know it's blue.

Now imagine that there's a subgroup of those who believe that the sky is blue who organize and attend conferences, write articles and blog posts, and publish books, all dedicated to making fun of people who don't know that the sky is blue.

Most of the minority who know that the sky is blue find these conferences, articles, and books (like "The Red Delusion" and "Red is not Great") both trivial and mean-spirited: trivial because they don't actually elaborate on the blueness of the sky as a phenomenon; and mean-spirited because when you scrape the thin veneer of interest in the truth, what's left is a group of people mocking those to whom they feel superior.

Imagine that you meet, possibly on a discussion forum, some of these "the sky is blue" activists who are very vocal about the blueness of the sky, but don't know that the color blue maps into a specific range of wavelengths, how the eye senses color, or that the color of the sky results from the scattering of sunlight in the atmosphere. Instead, when topics like these surface, the activists quickly move the discussion to the topic of some other person who believes the sky is red and should be mocked or punished for that. Or ban you from the forum.

Now, imagine that at one of these conferences, or in the articles by some of the least competent writers, you find clearly wrong statements, such as "the oceans are yellow and made of butter" or "golf turf is grass made of little Burberry umbrellas." Or prescriptive non-sequiturs like "because the sky is blue, Absolut vodka should be forbidden on transatlantic flights."

Possibly you'd learn to avoid these people, their conferences and forums, and their books, articles and blog posts.

Possibly. Probably. Maybe definitely.

On a totally unrelated subject, a few friends are puzzled that I don't belong to, or support, any atheist or skeptic organizations, given my lifelong interest in science.

Yeah... mysteries of the universe.

Friday, April 3, 2015

Does "50% below average" convey innumeracy?

Apparently some people believe that saying "fifty percent are below average" shows ignorance of statistics.

There's some ignorance going on, but it tends to belong to those who act as if the phrase is a mathematical tautology. Consider what happens to a group of non-millionaire friends that gets in a room with Bill Gates: all but one person in that room will have below-room-average wealth.
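A quick Python sketch of the room, for the skeptical. The wealth figures are made up for illustration (I don't know anyone's actual net worth); only the shape of the example matters.

```python
from statistics import mean, median

# Hypothetical net worths of five non-millionaire friends, in dollars.
friends = [40_000, 65_000, 80_000, 120_000, 250_000]
# Hypothetical stand-in for one very rich person walking into the room.
very_rich_person = 100_000_000_000

room = friends + [very_rich_person]
room_mean = mean(room)
room_median = median(room)
below_mean = sum(wealth < room_mean for wealth in room)

print(f"mean   = {room_mean:,.0f}")
print(f"median = {room_median:,.0f}")
print(f"{below_mean} of {len(room)} people are below the room average")
```

The median barely notices the newcomer; the mean is dragged far above everyone else in the room.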

Use that example whenever smug people who "like math" as long as understanding it is optional make fun of the "fifty percent are below average" phrase.

There are many real-life cases where the mean (or "average"; added later: see below, note IV) is different from the median (the point in the support of the distribution that has half the probability mass on either side). Understanding this is quite important for many things in life.

Consider independent random events in time. Think, for example, of random customers walking into a store, computer processes generating demand for CPU time, packets in a switching network requesting dispatch or queueing, time of death for certain terminal diseases, or radioactive decay.

If you have random independent events that can happen with some fixed probability per unit time, then the time between those events follows an exponential distribution with a probability density function
\[
f_{T}(t) = \lambda \, \exp(-\lambda \, t)
\]
where the mean time between occurrences of the event is $1/\lambda$. The median of this distribution is $\log(2)/\lambda$, which implies that there's always more probability on the left side of the mean than on the right. To be precise, $63\%$ of all intervals between successive events have a length below $1/\lambda$, the mean interval length.

"Sixty-three percent are below the mean." And true!
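Here is a small simulation sketch in Python, for readers who prefer seeing it happen. The function name, sample size, seed, and the particular values of $\lambda$ are arbitrary choices of mine; `random.expovariate` draws exponentially distributed intervals with rate $\lambda$.

```python
import math
import random


def fraction_below_mean(lam, n=200_000, seed=0):
    """Simulate n exponential intervals with rate lam; return the share shorter than 1/lam."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    mean_interval = 1.0 / lam
    hits = sum(rng.expovariate(lam) < mean_interval for _ in range(n))
    return hits / n


exact = 1.0 - math.exp(-1.0)  # Pr(T <= 1/lambda), independent of lambda

for lam in (0.1, 1.0, 25.0):
    simulated = fraction_below_mean(lam)
    median_interval = math.log(2) / lam
    print(
        f"lambda={lam:<5g} mean={1.0 / lam:.3f} median={median_interval:.3f} "
        f"simulated share below mean={simulated:.3f} (exact {exact:.3f})"
    )
```

Whatever rate you pick, the simulated share should land close to the exact value, and the median always sits to the left of the mean.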

This asymmetry, from skewness of the distribution, also applies to more complex inter-temporal laws with dependent events, like Weibull random variables, and to power laws, which describe many natural, social, and artificial phenomena. Not always $63\%$, obviously.
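To see the "not always $63\%$" part, here is a sketch for the Weibull case, assuming the standard two-parameter form with shape $k$ and scale $s$: the mean is $s\,\Gamma(1+1/k)$ and the CDF is $1-\exp(-(t/s)^k)$, so the share below the mean is $1-\exp(-\Gamma(1+1/k)^k)$ and the scale drops out. The function name and the shapes tried are my own choices.

```python
import math


def weibull_fraction_below_mean(shape):
    """Weibull CDF evaluated at the distribution's own mean; the scale cancels out."""
    if shape <= 0:
        raise ValueError("shape must be positive")
    mean_over_scale = math.gamma(1.0 + 1.0 / shape)
    return 1.0 - math.exp(-(mean_over_scale ** shape))


# shape = 1 is the exponential distribution, so it should reproduce 1 - exp(-1).
for shape in (0.5, 1.0, 2.0, 5.0):
    share = weibull_fraction_below_mean(shape)
    print(f"shape={shape:<4g} share below mean = {share:.3f}")
```

Heavier right tails (small shape) push more of the probability below the mean; as the shape grows the distribution gets less skewed and the share moves toward one half.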

So, the next time someone mocks the "fifty percent below average" phrase as proof of innumeracy, educate them about the difference between the mean and the median.

-- -- -- --

Note I: Neil nothing-like-Carl-Sagan Tyson apparently uses the phrase to mock other people. This is no surprise, since his schtick is basically the same as Penn & Teller's: mockery of the out-group and praise of the in-group, with no education at all or, occasionally, anti-education.

Note II: $\log(2)$ is the logarithm of $2$ in the natural base $e$. Even though I'm an engineer, I follow the mathematicians' convention and use $\log_{10}$ or $\log_{2}$ to make explicit when I'm not using the natural base.

Note III: Yes, it's always $63\%$, no matter the $\lambda$:
\[
\Pr(T \le 1/\lambda) = \int_{0}^{1/\lambda} \, \lambda \, \exp( - \lambda \, t) \, dt = \Bigg[ - \exp( - \lambda \, t) \Bigg]_{0}^{1/\lambda} = 1 - \exp(-1) \approx 0.63.
\]
This has to do with the exponential distribution and its peculiarities. As you can see, unlike many "science" popularizers, I show my work.

Note IV: A family member points out that "average" can be used for many other measures of central tendency (a point I had made in this earlier post), but: (a) pretty much all instances of the use of that phrase that I've seen refer to the mean; and (b) the people who mock the usage I explain are generally not cognizant of the other measures of central tendency, they just want to play the identity game.