Saturday, May 2, 2015

There are 10 types of people in the world...

...those who know base 3, those who think this joke only works with binary, and the rest. The second group is the worst. – My new Twitter header.

Yes, another post about people who "like STEM" as a way to assert their in-group identity – as long as they don't actually have to learn any STEM.

The standard joke is "there are 10 types of people in the world, those who understand binary and those who don't." It's been productized in a number of ways, including on a ThinkGeek t-shirt. My version is a little more elaborate, since it uses base-3 encoding: "10" in base 3 is three, matching the three groups in the header. The "second group" part has to do with poseurs who make the binary joke without understanding it.

The motivator for the new header was the media circus following the Tesla Powerwall announcement – namely, all the ignorant bleating about "Tesla killing nuclear power." I made a short tweet-storm about it.

But my issue is not with Tesla, or with pundits who lack the basics of electrical and chemical engineering, economics, and the realities of bringing a product to a mass market. These are problems of ignorance, and ignorance can be addressed. That's what education and information are for. Between MOOCs and library books, there are plenty of educational opportunities around.

The problem is the attitude that knowledge, even information (basic tech specs), is not necessary for expressing an opinion. Loudly. As long as it's the right opinion. The opinion that the right people must have. The opinion that must not be questioned.

And that's a serious problem: we live in a society ever more dependent on technology – based on science and engineered into products. And the foundational attitude of science and engineering, respect for physical reality, has been replaced by compliance with an identity-based narrative.

So I now engage in knowledge-based attitude guerrilla warfare: fighting identity-based ignorance by undermining the credibility of that identity. Like so:

– "Yes, yes, science is very important. By the way, why do we need a different flu vaccine every year?" [Silence.] "So, your entire knowledge of evolution is limited to saying creationism is wrong?"

– "Ah, a quote by Neil 'Hayden gift shop' Tyson. How fast are we moving relative to the center of the Earth? Our latitude is around 38 degrees, Earth's radius is around 6400km."

(These are not "gotcha" science questions; they require no more than middle-school education.)
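For the curious, the arithmetic for the Tyson probe, assuming a 24-hour day and counting only the Earth's rotation, not its orbit around the Sun:
v = \frac{2 \pi \times 6400 \times \cos(38^\circ)}{24} \approx 1300 \text{ km/h} \approx 370 \text{ m/s}.
Middle school, as promised.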

There are plenty of sources of motivation out there, like this video by astronaut Samantha Cristoforetti. All that's needed is to fight the attitude that thinking, knowledge, and information are superfluous.

For a successful technological society, reality must take precedence over narrative, for Nature cannot be fooled. (Adapted from the last sentence here.)

Saturday, April 25, 2015

Here we go again... 'extraordinary claims require extraordinary evidence'

I've already posted about the dangers of this quote, but I recently saw it on a Portuguese skeptics web site and decided to give it another go.

Before we get to it, I should clarify that my posts on skeptics/atheists are based on my experience with US atheists and skeptics and their non-American orbiters. The Portuguese group and a recently found YouTube channel seem to be much more interesting.

Ok, so what about the "extraordinary claims require extraordinary evidence" quote, which the Portuguese web site attributes to Carl Sagan (though I think Rev. Thomas Bayes got there first) and which has been misattributed to Christopher Hitchens in a number of places?

Your rationale, at least in Bayesian terms, is correct...

The statement reflects a quasi-Bayesian view of the world, which I like: say an extraordinary proposition $P$ is to be supported by evidence $E$; what does Sagan's statement mean?

Let $\Pr(P)$ be the prior probability of $P$, that is, the degree to which the person receiving the evidence believes in $P$ before the evidence is presented. After the evidence is presented, the proposition can be evaluated by its posterior probability. To compute the posterior probability, $\Pr(P|E)$, we need to know a couple of other things: the conditional probability of the evidence given the proposition, $\Pr(E|P)$, and the unconditional probability of the evidence, $\Pr(E)$.

Whoa! Too much math, I can hear a lot of people – the ones who like science as long as they don't have to learn any – saying...

Ok, say $P$ is "Bob is a powerlifter" and $E$ is "Bob bench-presses 400Lbs." If about one out of one hundred people in our social circle are powerlifters, then $\Pr(P) = 0.01$. If Rose tells me that Bob bench-pressed 400Lbs, I need to know a couple of things to determine whether Bob really is a powerlifter:

1. Given that $X$ is a powerlifter, how likely is $X$ to bench 400Lbs? (Most powerlifters can bench 400Lbs, but our social group might have a few weakling powerlifters, usually bodybuilders who are trying to pretend they're athletes.) This gives me the $\Pr(E|P)$.

2. What percentage of our social group benches 400Lbs? Many football and basketball players, who are not powerlifters, bench 400Lbs; there might be several of these in our social group. This gives me the $\Pr(E)$. Note that this includes the powerlifters too; it's a proportion of everybody.

Note that the ratio of these two quantities, $\Pr(E|P)/\Pr(E)$, is a measure of the informativeness of the evidence:

Say $\Pr(E|P) = 0.75$, meaning three-quarters of all powerlifters can bench 400Lbs, and $\Pr(E) = 0.75$, meaning that so can a similar proportion of everyone. In this case, the evidence is quite useless. It's not informative at all, as can be seen by computing the posterior probability using Bayes's rule:
\Pr(P|E) = \frac{\Pr(E|P) \times \Pr(P)}{\Pr(E)} = \frac{0.75 \times 0.01}{0.75} = 0.01
The "evidence" doesn't change the probability of the proposition, in other words, no information comes from knowing it. (For the purposes of determining the $P$; we learn that Bob can bench 400Lbs, which might be useful when we need a friend to help us move.)

Now, say $\Pr(E|P) = 0.75$, so three-quarters of all powerlifters can bench 400Lbs, same as above, but $\Pr(E) = 0.05$, meaning that only a few people in the social group can do it. This suggests that the evidence is somewhat dispositive, as can be computed by:
\Pr(P|E) = \frac{\Pr(E|P) \times \Pr(P)}{\Pr(E)} = \frac{0.75 \times 0.01}{0.05} = 0.15
Note that the probability increased fifteen-fold. This is strong evidence but, given the low prior probability, Bob's powerlifter-ness is still very much in question. The 400Lbs bench wasn't "extraordinary" enough evidence.*
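Since I show my work, here's a minimal sketch in Python for checking the arithmetic (the function name and layout are mine, nothing canonical):

def posterior(prior, p_e_given_p, p_e):
    # Bayes's rule: Pr(P|E) = Pr(E|P) * Pr(P) / Pr(E)
    return p_e_given_p * prior / p_e

print(posterior(prior=0.01, p_e_given_p=0.75, p_e=0.75))  # useless evidence: 0.01
print(posterior(prior=0.01, p_e_given_p=0.75, p_e=0.05))  # informative evidence: 0.15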

In general, the lower the $\Pr(P)$ (the "more extraordinary" the claim), the higher $\Pr(E|P)/\Pr(E)$ must be before we accept $P$ (the claim "requires more extraordinary evidence"). So far, so good for Carl Sagan.

... but your understanding of human psychology is lacking

The problem is the lack of indices in those probabilities above – in particular, an index to separate the probabilities assigned by different people. (Oh, and people also have assorted cognitive biases that make this worse, but we don't need them to make the case, so let's stay within the bounds of strict Bayesian rationality.)

What the Sagan quote gets wrong is that what person A thinks is an extraordinary claim and what person B considers an extraordinary claim can be opposed. Typically, when people invoke that quote, what they mean is:

"Claims that contradict that which I and my circle of friends believe in are such proofs of stupidity by those who believe them, that I'll quote Carl Sagan and stop trying to engage in intellectual discussion." 

For what it's worth, I think this is mostly applicable to the US/US-orbiter crowd. The Portuguese web site and the skinny YouTube Brit seem to be actually trying to engage people.

Let's consider Bob's case again, but now the probabilities have a subscript, $A$ or $C$, for Arnold or Cooper.

Arnold only knows bodybuilders and powerlifters (and has taken too much Deca-Durabolin and Dianabol for his brain to be able to tell the difference), so he thinks that almost everyone is a powerlifter, $\Pr_A(P)=0.99$. All powerlifters in Arnold's world can bench well above 400Lbs, so his $\Pr_A(E|P)=1$, and he assumes everyone else is a weakling who can't bench an unloaded bar, so with minimal computation we get $\Pr_A(E)=0.99$.

Cooper, on the other hand, when he's not busy writing flawed papers about the benefits of running [ahem: if you ignore self-selection], believes that most people are not powerlifters, $\Pr_C(P) = 0.01$, but knows quite a few people who can bench 400Lbs (because they played football or do some real exercise in a gym on the sly) and knows only one [pretend] powerlifter who can't bench 400Lbs (probably a bodybuilder), so he thinks that $\Pr_C(E|P)=0.5$ and $\Pr_C(E)=0.45$. (For kicks, compute Cooper's probability that a non-powerlifter can bench 400Lbs.)

Arnold and Cooper are having a discussion about Bob. Arnold, with a thick Austrian accent despite having lived in California for almost fifty years, says:

"Auf kourse Bob ees a powerlivtehr. Ohlmahst eferryvon ees."

Cooper: "No, he's not, and they aren't."

Rose arrives and says Bob benches 400Lbs.

Cooper: "That changes almost nothing."
(He's right: $\Pr_C(P|E) = 0.011$)

Arnold: "Zee? Unwiderlegbar [incontrovertible – ed.] evidenz! Ah'll pahmp yew Ahp!" [Strikes frontal biceps pose.]
(He's right: $\Pr_A(P|E) = 1.00$.)

Cooper, who finds the proposition "Bob is a powerlifter" extraordinary, does require extraordinary evidence to change his mind, and considers the evidence that Rose brought in to be insufficient. Arnold thinks it's dispositive evidence. Any discussion between them that doesn't start by acknowledging that their probabilities are different will be a pointless waste of time.
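Reusing the posterior sketch from above to check both numbers:

print(posterior(prior=0.99, p_e_given_p=1.00, p_e=0.99))  # Arnold: 1.0
print(posterior(prior=0.01, p_e_given_p=0.50, p_e=0.45))  # Cooper: ~0.011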

A pointless waste of time, that is, if the purpose is to convince. I believe that was Carl Sagan's purpose, unlike many of the current-day best-selling popular-in-America skeptics, whose purpose seems to be a convex combination of politicking (and I mean in the sense of promoting a particular political party, not just policies) and monetizing their echo chamber.

Monetizing the echo chamber: like preaching to the choir, but with monetary reward.

(Again, as far as I can tell, not what the Portuguese skeptics or the needs-a-sandwich YouTuber are doing.)

Bonus: the effect of informativeness of evidence

[A table appeared here: posterior probabilities $\Pr(P|E)$ for combinations of the prior $\Pr(P)$ and the informativeness of the evidence, with some cells greyed out.] Informativeness in this table is $\Pr(E|P)/\Pr(E)$. So, for example, for someone who has $\Pr(P) = 0.000025$, i.e. is very skeptical about the proposition, to become completely uncertain about $P$, $\Pr(P|E) = 0.5$, you need the evidence to be twenty thousand times more likely if $P$ is true than overall: $\Pr(E|P) = 20000 \times \Pr(E)$.

Note that when $\Pr(E|P)/\Pr(E) < 1$, the evidence is against the proposition, in that the probability of observing the evidence given the proposition is lower than the incidence of the evidence in the general population of events.

For kicks, why are some cells greyed-out? Is this a cop-out to avoid showing probabilities above 1, or is there a real reason why informativeness is bound below some limit? Hint: there's a real reason.
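If you want to rebuild the table while you think about that question, here's a sketch (the grid of priors and informativeness values is my choice; only the $0.000025$ example above is from the original):

priors = [0.000025, 0.001, 0.01, 0.1, 0.5]
ratios = [1, 10, 100, 2000, 20000]  # informativeness Pr(E|P)/Pr(E)
for p in priors:
    cells = []
    for r in ratios:
        post = r * p  # Pr(P|E) = informativeness * prior
        # cells where this exceeds 1 are the greyed-out ones (think about why)
        cells.append(f"{post:.4g}" if post <= 1 else "---")
    print(f"Pr(P)={p:<9g}", *(f"{c:>8}" for c in cells))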

-- -- -- --

* Of course, the most dispositive test would be to show Bob a squat rack; if Bob said "what is that?" (normal person) or "it's for doing standing curls, isn't it?" (bodybuilder), that would be proof of non-powerlifter-ness. A powerlifter would say "I'd rather use a cage or a monolift, but sure I'll SHUT UP AND SQUAT!"

Saturday, April 11, 2015

The sky is blue, therefore no vodka on transatlantic flights

Consider a truthful proposition, say "the sky is blue." In this hypothetical, imagine that for historical reasons a majority of the people are indoctrinated to believe that the sky is red; a minority of the people know it's blue.

Now imagine that there's a subgroup of those who believe that the sky is blue who organize and attend conferences, write articles and blog posts, and publish books, all dedicated to making fun of people who don't know that the sky is blue.

Most of the minority who know that the sky is blue find these conferences, articles, and books (like "The Red Delusion" and "Red is not Great") both trivial and mean-spirited: trivial because they don't actually elaborate on the blueness of the sky as a phenomenon; and mean-spirited because when you scrape the thin veneer of interest in the truth, what's left is a group of people mocking those to whom they feel superior.

Imagine that you meet, possibly on a discussion forum, some of these "the sky is blue" activists who are very vocal about the blueness of the sky, but unaware of how the color blue maps to a specific range of wavelengths, how the eye senses color, or how the color of the sky results from the scattering of sunlight in the atmosphere. Instead, when topics like these surface, the activists quickly move the discussion to some other person who believes the sky is red and should be mocked or punished for that. Or they ban you from the forum.

Now, imagine that at one of these conferences, or in the articles by some of the least competent writers, you find clearly wrong statements, such as "the oceans are yellow and made of butter" or "golf turf is grass made of little Burberry umbrellas." Or prescriptive non-sequiturs like "because the sky is blue, Absolut vodka should be forbidden on transatlantic flights."

Possibly you'd learn to avoid these people, their conferences and forums, and their books, articles and blog posts.

Possibly. Probably. Maybe definitely.

On a totally unrelated subject, a few friends are puzzled that I don't belong to, or support, any atheist or skeptic organizations, given my lifelong interest in science.

Yeah... mysteries of the universe.

Friday, April 3, 2015

Does "50% below average" convey innumeracy?

Apparently some people believe that saying "fifty percent are below average" shows ignorance of statistics.

There's some ignorance going on, but it tends to belong to those who act as if the phrase is a mathematical tautology. Consider what happens to a group of non-millionaire friends that gets in a room with Bill Gates: all but one person in that room will have below-room-average wealth.
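In numbers (with made-up wealth figures, obviously):

wealths = [50_000] * 9 + [80_000_000_000]   # nine friends plus Bill Gates
mean = sum(wealths) / len(wealths)          # 8_000_045_000
print(sum(w < mean for w in wealths))       # 9 -- all but one below average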

Use that example whenever smug people who "like math" as long as understanding it is optional make fun of the "fifty percent are below average" phrase.

There are many real-life cases where the mean (or "average"; added later: see below, note IV) is different from the median (the point in the support of the distribution that has half the probability mass on either side). Understanding this is quite important for many things in life.

Consider independent random events in time. Think, for example, of random customers walking into a store, computer processes generating demand for CPU time, packets in a switching network requesting dispatch or queueing, time of death for certain terminal diseases, or radioactive decay.

If you have random independent events that can happen with some fixed probability per unit time, then the time between those events follows an exponential distribution with a probability density function
f_{T}(t) = \lambda \, \exp(-\lambda \, t)
where the mean time between occurrences of the event is $1/\lambda$. The median of this distribution is $\log(2)/\lambda$; since $\log(2) \approx 0.69 < 1$, the median lies below the mean, and there's always more probability on the left side of the mean than on the right. To be precise, $63\%$ of all intervals between successive events have a length below $1/\lambda$, the mean interval length.

"Sixty-three percent are below the mean." And true!

This asymmetry, which comes from the skewness of the distribution, also applies to more complex inter-temporal laws with dependent events, like Weibull random variables, and to power laws, which describe many natural, social, and artificial phenomena. Not always $63\%$, obviously.

So, the next time someone mocks the "fifty percent below average" as proof of innumeracy, educate them about the difference between the mean and the median.

-- -- -- --

Note I: Neil nothing-like-Carl-Sagan Tyson apparently uses the phrase to mock other people. This is no surprise, since his schtick is basically the same as Penn & Teller's: mockery of the out-group and praise of the in-group, with no education at all or, occasionally, anti-education.

Note II: $\log(2)$ is the logarithm of $2$ in the natural base $e$. Even though I'm an engineer, I follow the mathematicians' convention and use $\log_{10}$ or $\log_{2}$ to make explicit when I'm not using the natural base.

Note III: Yes, it's always $63\%$, no matter the $\lambda$:
\Pr(T \le 1/\lambda) = \int_{0}^{1/\lambda} \lambda \, \exp( - \lambda \, t) \, dt = \Bigg[ - \exp( - \lambda \, t) \Bigg]_{0}^{1/\lambda} = 1 - \exp(-1) \approx 0.63.
The $\lambda$ drops out because the substitution $u = \lambda \, t$ makes the integral scale-free; that's one of the exponential distribution's peculiarities. As you can see, unlike many "science" popularizers, I show my work.

Note IV: A family member points out that "average" can be used for many other measures of central tendency (a point I had made in this earlier post), but: (a) pretty much all instances of the use of that phrase that I've seen refer to the mean; and (b) the people who mock the usage I explain are generally not cognizant of the other measures of central tendency, they just want to play the identity game.

Wednesday, April 1, 2015

How to annoy people and feel justified in doing so

Partly in jest, I recently tweeted something like "I'm happy to hear your thoughts on analytics, but first tell me what $\frac{d}{dx} \, \sin( \exp(- x^2/2))$ is."
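In case you want to check your answer, it's the chain rule applied twice:
\frac{d}{dx} \, \sin( \exp(- x^2/2)) = \cos( \exp(- x^2/2)) \times \exp(- x^2/2) \times (-x).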

That tweet, according to some of my social network, is a form of ad hominem: the fallacy of evaluating a thought by considering its source. The reason why this is a fallacy is that a "good" person can say wrong things, while a "bad" person can say correct things.

But this is more of a screening tool for rationally allocating a scarce resource (time and attention) than a fallacy.

(If you want to skip the math, go to the red text "END MATH" below. At this abstract level, the math basically tracks common sense. It becomes useful when I plug actual functional forms and probability distributions into it. Then it's a useful experiment design tool.)

Let's say that I have $T$ free time and I'm trying to decide whether to spend a part $t$ of it arguing analytics with a person $j$. My purpose in arguing is to derive some learning $L(t,k(j))$, as a function of the time spent $t$ and the person's knowledge $k(j)$, increasing in both and supermodular in their interaction. There's an opportunity cost to the time, since I will only have $T-t$ left for other pursuits, say with an expected utility $U(T-t)$ in the same scale as $L$.

If I know $j$ well, then I know $k(j)$ well, and can make the decision of how much time to argue by choosing $t: 0\le t \le T$ such that
\left. \frac{dU}{d\tau}\right|_{\tau= T-t}= \frac{dL}{dt}(t,k(j)).
(Take the derivative of total happiness, set it equal to zero, mind the $-t$ in the marginal $U$, move $U$ to the left-hand side for giggles.) With decreasing marginal utility on the left (a common assumption), the implicit function theorem yields an optimal $t$ increasing in $k(j)$.

We call this the smell test: the basic result makes sense. In other words, more time is spent arguing with more knowledgeable people, since the objective is to learn.
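To make that concrete, a toy example (these functional forms are my choice for illustration, nothing canonical): let $L(t,k) = k \log(1+t)$ and $U(\tau) = \log(1+\tau)$. The first-order condition becomes
\frac{1}{1+T-t} = \frac{k}{1+t} \quad \Rightarrow \quad t^* = \frac{k(1+T)-1}{1+k},
which is indeed increasing in $k$ (clipped to $[0,T]$ at the corners).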

But if I don't know $j$, I need to figure out how much I'm likely to learn, and for that I need some idea of what $k(j)$ is. Picking a knowledge probe $\omega$ in the field of interest, I can gather information about $k(j)$ by combining the distribution of knowledge in the population, $f_{K(\cdot)} \, (k(\cdot))$, with the conditional probability of $\omega$ given that knowledge, $f_{\Omega|K} \, (\omega|k)$, to get a probability distribution for $k(j)$ conditioned on the observed $\omega$:
f_{K(\cdot)|\Omega} \, (k(j)|\omega) = \frac{f_{\Omega|K} \, (\omega|k(j)) \times f_{K(\cdot)} \, (k(j))}{\int_{K} f_{\Omega|K} \, (\omega|k) \times f_{K(\cdot)} \, (k) \, dk}
which can be replaced in the decision rule, using the results of the probe, $\omega(j)$, thusly:
\left. \frac{dU}{d\tau}\right|_{\tau= T-t} = \int_{K}  \frac{dL}{dt}(t,k) \times f_{K(\cdot)|\Omega} \, (k|\omega(j)) \, dk.
For an informative probe, $f_{K(\cdot)|\Omega} \, (k|\omega(j))$ is increasing in $\omega$, therefore the expectation on the right is increasing in $\omega$ and, with decreasing marginal utility on the left, again the IFT shows that optimal $t$ is increasing in $\omega(j)$.

For my simple example, probes that maximize the variance of $f_{K|\Omega} \, (k|\omega)$ over the space of $K$ are reasonable choices; threshold probes, for example. For more complex cases we'd have to choose the probes taking into account the cost of different types of error and strategic behavior by $j$.

(The technical area of designing optimal probes is the economics of information, and its predecessor is the field of – no, I'm not kidding – multi-armed bandit problems.) END MATH.

In other words, rationally, a time-constrained agent will want to spend time arguing only with people who can demonstrate some basic knowledge of the field. 

So here are some ideas to screen people who like science as long as they don't have to learn any, and who want to talk science at you based on something they saw on tell-lies-vision:

"Sure I'd like to hear your opinion on space exploration, just tell me (a) what is specific impulse, and (b) what is the mass ratio for a stochiometric LOX/LH2 engine? No, I don't care about Neil DeGrasse Tyson's vest."

"Sure I'll listen to your advice on nutrition, just sketch out and explain a Krebs cycle on this napkin. Too hard? Ok, just list the amino-acids that humans cannot synthesize and need to get from food. By the way, we portuguese have been eating kale soup for centuries, so keep that raw cabbage away from me."

"Sure I'll listen to your opinions on fitness, but first show me your squat and deadlift (to assess form and technique, not weight). No I don't care what InShape says."

"Sure I'll listen to your opinion about solar power; just tell me how the operation of a cadmium-telluride photovoltaic cell is different from that of a polycrystalline silicon cell. Too hard? Ok, how much power do you think you need for a small-sized city? Order of magnitude is fine. And get the units right, I said power, not energy."

"Sure I'll listen to your opinion about nuclear power; just tell me how a liquid-fluorine thorium reactor works. Too difficult? Ok, if you start with 478g of Plutonium-239 and leave it alone for 72,300 years (three half-lives), how much Pu-239 do you expect to have then?"

"Sure I'll listen to your opinion on global warming, just tell me how to relate the black-body radiation of an object to its temperature. If you want to argue the value of computer models start by explaining how factor analysis works."

"Sure I'll listen to your opinion on economics. Just redo all the math above in an equilibrium model with strategic agents manipulating the informativeness of $\omega$."

Haughty superciliousness. Still better than self-contented ignorance.

-- -- -- --

P.S. Definitely going for popular appeal in this blog.

Tuesday, March 24, 2015

The danger of weak arguments

Weak arguments are not neutral, they are damaging for technical or scientific propositions.

There's overwhelming evidence for the proposition "Earth is much older than 6000 years." (It's about 4.54 billion years old, give or take fifty million.) Let's say that Bob, who likes science, as long as he doesn't have to learn any, is arguing with Alex, an open-minded young-Earth creationist:

Alex: Earth was created precisely on Saturday, October 22, 4004 B.C., at 6:00 PM, Greenwich Mean Time, no daylight savings.

Bob: That's ridiculous, we know from Science(TM) that the Earth is much older than that.

Alex: What science? I'm willing to listen, but not without details.

Bob: Well, scientists know exactly and it was in Popular Science the other day, too.

Alex: What did the Popular Science article say?

Bob: I forget, but it had two pretty diagrams, lots of numbers, and a photo of Neil DeGrasse Tyson in his office. He has a wood model of Saturn that he made when he was a kid.

Alex: So you don't really know how the age of the Earth is calculated by these scientists, you're just repeating the conclusion of an argument that you didn't follow. Maybe you didn't follow because it's a flawed argument.

Bob: I don't remember, it's very technical, but the scientists know and that's all I need. Why don't you believe in Science(TM)?

Alex: It appears to me that your argument is simply intimidation: basically "if you don't agree with me, I'll tag you with a fashionable insult." Perhaps that's also the argument of the scientists. They certainly sound smug on television, as if they're too good to explain themselves to us proles.

Alex, despite his nonsensical belief about the age of the Earth, is actually right about the form of the argument; by presenting a weak argument for a truthful proposition, Bob weakens the case for that proposition. Note that this is purely a psychological or public-relations issue. Logically, a bad argument for a proposition shouldn't change the truth of that proposition. Too bad people's brains aren't logical inference machines.

(There's a Bayesian argument for downgrading a belief in a proposition when the case presented for that proposition is weak, but a rational person trying to learn in a Bayesian manner the truth of a proposition will do a systematic search over the space of arguments, not just process arguments collected by convenience sampling.)

This is one of the major problems with people who like science but don't learn any: because of the way normal people process arguments and evidence, having many Bobs around helps the case of the Alexes.

A weak argument for a true proposition weakens the public's acceptance of that proposition. People who like science without learning any are fountains of weak arguments.

Let's convince people who "like science" that they should really learn some.

Friday, March 20, 2015

Adventures in science-ing among the general public

I've been running an informal experiment in social situations, based on an example by physicist Eric Mazur:

A light car moving fast collides with a slow heavy truck. Which of the following options is true?

a) The force that the car exerts on the truck is smaller than the force that the truck exerts on the car.

b) The force that the car exerts on the truck is equal to the force that the truck exerts on the car.

c) The force that the car exerts on the truck is larger than the force that the truck exerts on the car.

d) To know which force is larger (that of the car on the truck or that of the truck on the car) we need to know more details, for example the speed and weight (mass, really) of each vehicle.

The majority in my convenience sample pick the last option, d. Included in this sample are people with science and engineering degrees. Most of the people I asked this question can quote Newton's third law of motion: when prompted with "every action has..." they complete it with "an equal and opposite reaction."

So far, my convenience sample replicates Mazur's results.

But unlike his measurement (made with those classroom clickers that universities use so they can avoid hiring more faculty and running smaller, more personalized class sessions), mine sometimes comes with arguments, explanations, and resistance.

And here's the interesting part: the farther the person's training or occupation is from science and technology, the stronger their objections and attempts to argue for d, even as they quote Newton. I don't think this is the Dunning-Kruger effect. It's more like a disconnect between concept, principle, meaning, and application.
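For the record, the resolution of the apparent paradox: the forces are equal (option b, per the third law), but the accelerations are not,
a_{\text{car}} = \frac{F}{m_{\text{car}}} \gg \frac{F}{m_{\text{truck}}} = a_{\text{truck}},
which is why the car's occupants fare so much worse, and why intuition lobbies for option d.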

It's not like linking concepts to principles and meaning and then applying those concepts is important, right? Especially in science and engineering...