Monday, November 26, 2012

How misleading "expected value" can be


The expression "expected value" can be highly misleading.

I was just writing some research results and used the expression "expected value" in relation to a discrete random walk of the form

$x[n+1] = \left\{ \begin{array}{ll}
   x[n] + 1 & \qquad \text{with prob. } 1/2 \\
  & \\
   x[n] -1 & \qquad \text{with prob. } 1/2
   \end{array}\right. $ .

This random walk is a martingale, so

$E\big[x[n+1]\big|x[n]\big] = x[n]$.

But from the above formula it's clear that it's never the case that $x[n+1] = x[n]$. Therefore, saying that $x[n+1]$'s expected value is $x[n]$ is misleading — in the sense that a large number of people may expect the event $x[n+1] = x[n]$ to occur rather frequently.
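To see the gap between "expected value" and "what you should expect to see," here is a minimal simulation sketch of the walk above. The function name `simulate_steps`, the seed, and the number of steps are my own choices for illustration, not part of the original research.

```python
import random

def simulate_steps(num_steps, seed=0):
    """Draw the increments x[n+1] - x[n] of the symmetric random walk."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [rng.choice((-1, 1)) for _ in range(num_steps)]

steps = simulate_steps(100_000)
mean_step = sum(steps) / len(steps)           # sample estimate of E[x[n+1] - x[n]]
zero_steps = sum(1 for s in steps if s == 0)  # number of times x[n+1] == x[n]

print(f"average step: {mean_step:+.4f}")
print(f"steps with x[n+1] == x[n]: {zero_steps}")
```

The average step should come out close to zero, consistent with the martingale property, while the count of steps where the walk stays put is exactly zero by construction.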

Mathematical language may share words with daily usage, but the meaning can be very different.

----

Added Nov 27: In the random walk above, for any odd $k$, $x[n+k] \neq x[n]$. On the other hand, here's an example of a martingale where $x[n+1] = x[n]$ happens with probability $p$, just for illustration:


$x[n+1] = \left\{ \begin{array}{ll}
   x[n] + 1 & \qquad \text{with prob. } (1-p)/2 \\
  & \\
   x[n]  & \qquad \text{with prob. } p \\
  & \\
   x[n] -1 & \qquad \text{with prob. } (1-p)/2
   \end{array}\right. $ .
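
A quick check that this walk is still a martingale, using only the three probabilities given above:

$E\big[x[n+1]\big|x[n]\big] = (x[n]+1)\,\frac{1-p}{2} + x[n]\,p + (x[n]-1)\,\frac{1-p}{2} = x[n]\,\big((1-p) + p\big) = x[n]$.

The $+1$ and $-1$ contributions cancel because the two moves have equal probability, so the conditional expectation is $x[n]$ for any $p$ between 0 and 1.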

(Someone asked if it was possible to have such a martingale, which makes me fear for the future of the world. Also, I'm clearly going for popular appeal in this blog...)

Friday, November 16, 2012

Don't ignore the distribution of estimates!


If forecasts ignore the distribution of estimates, they will be biased.

For example, when computing the probability of purchase using a logit model, we take the estimates for the coefficients in the utility function and use them as the true coefficients, thus:

$P(i) = \Pr(v_i > 0| \beta) = \frac{\exp(v(x_i; \beta))}{1 + \exp(v(x_i; \beta))}$.

But the estimates are themselves random variables, and have a distribution of their own, so, more correctly, the probability $P(i)$ should be written as

$P(i) = \int \Pr(v_i > 0| \hat\beta) \, dF_{B}(\hat\beta)$.

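Note that $\beta = E[\hat\beta]$. For a scalar $\hat\beta$, let $H_{\beta}(\hat\beta) = \mathbf{1}\{\hat\beta \geq \beta\}$ be the distribution function of a point mass at $\beta$, so that $\Pr(v_i > 0| \beta) = \int \Pr(v_i > 0| \hat\beta) \, dH_{\beta}(\hat\beta)$. Integrating the difference between the two integrals by parts (the boundary terms vanish because $F_{B} - H_{\beta}$ goes to zero at both ends) we get

$P(i) = \Pr(v_i > 0| \beta) -  \int   \frac{\partial  \, \Pr(v_i > 0| \hat\beta)}{\partial \, \hat\beta} \, \big[F_{B}(\hat\beta) - H_{\beta}(\hat\beta)\big] \, d\hat\beta$.

The term

$-  \int   \frac{\partial  \, \Pr(v_i > 0| \hat\beta)}{\partial \, \hat\beta} \, \big[F_{B}(\hat\beta) - H_{\beta}(\hat\beta)\big] \, d\hat\beta$

is a bias introduced by ignoring the distribution of $\hat\beta$.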
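
A rough numerical sketch of that bias follows. Everything specific in it is an assumption made for illustration: a single coefficient, a linear utility $v(x_i; \beta) = \beta x_i$, a normal sampling distribution for $\hat\beta$, and the made-up values $\beta = 2$, standard error 1, and $x_i = 1$. It compares the plug-in probability with the probability averaged over draws of $\hat\beta$.

```python
import math
import random

def logistic(v):
    """Logit purchase probability exp(v) / (1 + exp(v))."""
    return 1.0 / (1.0 + math.exp(-v))

def plug_in_vs_averaged(beta, std_err, x, num_draws=200_000, seed=0):
    """Return (plug-in probability, probability averaged over beta-hat).

    Assumes v = beta * x and beta-hat ~ Normal(beta, std_err); both are
    illustrative assumptions, not something estimated from data.
    """
    rng = random.Random(seed)
    plug_in = logistic(beta * x)
    total = 0.0
    for _ in range(num_draws):
        beta_hat = rng.gauss(beta, std_err)  # one draw of the estimate
        total += logistic(beta_hat * x)
    return plug_in, total / num_draws

plug_in, averaged = plug_in_vs_averaged(beta=2.0, std_err=1.0, x=1.0)
print(f"plug-in: {plug_in:.3f}  averaged: {averaged:.3f}  "
      f"bias of plug-in: {plug_in - averaged:+.3f}")
```

With these made-up numbers the averaged probability comes out lower than the plug-in one, because the logistic curve is mostly concave around $v = 2$. The sign and size of the gap depend on where $\beta x_i$ sits on the curve and on how wide the distribution of $\hat\beta$ is.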

(This simple exercise in probability and calculus was the result of having to make this point over and over on the interwebs, despite the fact that it should be basic knowledge to the people involved. Some of whom, ironically, call themselves "data scientists.")

Added Nov 17: A simple illustration at my online scrapbook.

Saturday, November 10, 2012

Skeptic activism promotes bad thinking


I've heard it many times before, and recently seen it in a poster for the Center For Inquiry, I think, but this quote has always struck me as being counterproductive:
"Extraordinary claims require extraordinary evidence."
In my Bayesian view of the world, this is obviously true; it is also an example of the major problem with skeptic activism:

Cluelessness.

Because what is extraordinary depends on each person's beliefs. Here are a few statements that many people consider extraordinary:
  1. The universe (including space and time) was created in the Big Bang.
  2. In the double-slit experiment, each photon acts as if it goes through both slits.
  3. A moving object becomes shorter along the direction of movement.
  4. A sizable percentage of a person's mass is non-human bacteria.
  5. Genetically, bonobos are closer to humans than to other primates.
  6. Canada is south of Michigan.
(These are true, though some are haphazardly phrased.)

Here's the main problem with skeptic activists: they assume that both the prior beliefs and the informativeness of the evidence are the same for all people, i.e. that the audience shares the skeptics' views of what is extraordinary and the skeptics' trust in the evidence presented.
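
The point about priors and informativeness can be made concrete with Bayes' rule in odds form. This is only a sketch: the function name `posterior` and all the numbers (priors of 0.5 and 0.001, likelihood ratios of 20 and 2) are invented for illustration, with the likelihood ratio standing in for how much a listener trusts the evidence.

```python
def posterior(prior, likelihood_ratio):
    """Posterior probability of a claim after one piece of evidence.

    likelihood_ratio = P(evidence | claim true) / P(evidence | claim false);
    a listener who distrusts the source assigns a ratio closer to 1.
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# Same evidence, three hypothetical listeners.
skeptic = posterior(prior=0.5, likelihood_ratio=20.0)            # open prior, trusts the evidence
trusting_doubter = posterior(prior=0.001, likelihood_ratio=20.0)  # finds the claim extraordinary
wary_doubter = posterior(prior=0.001, likelihood_ratio=2.0)       # also distrusts the source

print(f"skeptic: {skeptic:.3f}")
print(f"doubter who trusts the evidence: {trusting_doubter:.3f}")
print(f"doubter who distrusts the evidence: {wary_doubter:.3f}")
```

Evidence that all but settles the matter for the first listener barely moves the other two, which is the sense in which "extraordinary" depends on who is listening.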

Before anything else, let me make absolutely clear that I'm not defending superstition; but the approach taken by many skeptic activists seems to be less about diffusing knowledge (convincing people) than establishing identity ("I'm not like them").

This post is, then, an attempt to diagnose problems with skeptic activism so that these problems can be addressed and stop being obstacles to education of the general public.


"WHY SHOULD I BELIEVE YOU?"

To a large percentage of the population, what skeptic activists call extraordinary claims are that population's prior beliefs. Therefore, defeating superstition requires extraordinary evidence. Or, at least, convincing evidence; usually, scientific evidence.

And that's the problem.

How do I know gravity exists? I can drop a pen on my table. How do I know it's a warping of space and time created by mass, where the effects of change in mass distributions propagate through space-time at the speed of light?

Erm...

Ok, so once we're past the basic phenomena of physical reality, evidence becomes complex and depends on prior knowledge and trust in the experimenters and scientists.

In an early episode of The Big Bang Theory, the scientists bounce a laser off a reflector on the Moon, to which Zach, Penny's dumb boyfriend, says: "That’s your big experiment? All that for a line on the screen?" And he's right.

For Zach to interpret the line on the screen as proof of a reflector on the Moon, he needs to believe that: the laser was strong enough to reach the Moon despite attenuation and diffusion in the Earth's atmosphere; people have been to the Moon and left a reflector there; the laser hit that reflector at a ninety-degree angle so the light is reflected along the return path to the roof (corrected on Dec 12: not necessary since the reflectors are corner reflector arrays; live and learn, I always say); and the detector is sensitive enough to detect a reflection from the Moon through the Earth's atmosphere. The alternative theory is that the laptop waited 2.5 seconds after the button was pressed, invoked a graphic routine, and drew the line.

The alternative theory is simpler and closer to daily experience. To accept Leonard's statement that "We hit the Moon," Zach has to believe many things he cannot test himself. In other words, he must show unquestioned trust in what he's told by an authority, also known as faith.

We're told that one should trust science because it has checks and balances, replication, serves as the foundation for engineering, and adapts to change by incorporating new information and, if needed, completely changing.

Oh, and I have some shares in the Golden Gate Bridge I'll sell to you cheaply.

Checks and balances are all well and good, but some scientists don't let others examine their data or their equipment. This is not necessarily because they don't trust the data; it's more often the case that they want to write many papers with the same data, and sharing it would let others write those papers. But it surely looks suspicious.

(For the moment let's not dwell on various cases of outright fraud, clientelism, publishing cliques, data "cleaning" and selective reporting, academic witch-hunts, and other characteristics of human enterprises that have infected the practice of high-stakes science.)

Replication? Perhaps in the past, but now replicating other people's experiments is considered a mark of mediocrity by grant committees, tenure committees, and Sheldon (regarding Leonard's work). Every scientist agrees that replication is needed, that it's very important, and that someone else should do it. The only way we get any replication is when some researchers hate other researchers so much that they want to prove them wrong. Hey, blood feuds are good for something, after all.

Science (as a description of reality) is the foundation for engineering. But since most people have as little understanding of technology as they do of science, this argument doesn't help. Technology might as well be run by incantations or, as we call them in the business, user interface designs.

As for science being amenable to change, it's a little better than what most people in the humanities and social sciences have to put up with; but still it took Leonard Susskind many years to convince physicists that information loss in black holes was a problem. Doctors thought so much of Ignaz Semmelweis's exhortations to wash their hands that they committed him to a lunatic asylum. Scientists are people, which is a major obstacle to science.

So, there's a lot of faith involved in the process of trusting science. The difference, of course, is that at the bottom of that faith in science there's usually experimentally verifiable truth, whereas at the bottom of faith in superstition there's always unverifiable dogma.

The problem is that, while real science popularizers (like Carl Sagan, plus a few hundred) focus on the science and its wonder, some skeptic activists pollute the intellectual field with sophistry and politics. This is bad enough when it comes from non-scientists, but unfortunately some of the more visible offenders actually are scientists — they just aren't acting like scientists anymore.

If the future becomes a matter of faith in whoever has the best sophistry and is best at grabbing power, the superstition-peddlers will win: they've been at it millennia longer.


WITH FRIENDS LIKE THESE...

I really enjoy the Mythbusters and Penn and Teller's Bullsh*t, and I like Christopher Hitchens's writing. So it pains me to point out that these high-visibility skeptics aren't that helpful. And when their persuasion approach is imitated by the activist skeptics, it's counter-productive.

The Mythbusters are, at best, a caricature of hypothesis testing. That's fine for an entertainment show, and serves as a motivator to get people interested in science, but the problem is that people who believe in various superstitions can point out the flaws in the Mythbusters tests, and use these to indict science in general.

Take, for example, their test of a movie myth, climbing the outside of a building using suction clamps. The Mythbusters "proved" it was impossible to do so by showing that Jamie couldn't do it with their homemade suction rig. Obviously, a trained climber with better equipment might have. (In fact, some climbers do climb the outside of buildings without any suction clamps.) This is good entertainment, but it reinforces the idea that science plays fast and loose with the facts in pursuit of an agenda.

A spillover of the Mythbusters "Big Booms as Science" approach is a ridiculous demonstration — to college students — of the ideal gas law: a teacher fills a plastic bottle with liquid nitrogen, drops it into a bucket of hot water, covers with ping-pong balls, and BOOM! Science!

Correction (Nov 22): The teacher fills the plastic bottle with dry ice (frozen CO2), not liquid nitrogen. (End of correction.)

No. Boom, but no science.

There was nothing scientific about it; it reminds me of the I effing love science page, where little science ever treads. This demonstration might be okay for children (though when liquid nitrogen is involved children prefer instant ice-cream making); for college students? What are they supposed to learn? And this, to our benighted media, is "the best physics class ever?"

Aargh! Try these instead.

I already explained, in another post, the problem of Penn and Teller's style of "argument by making fun of people who hold the wrong beliefs." tl;dr: it convinces people that it's who says things that matters, not what they say.

These ad hominem attacks are a common tactic of many skeptic activists: attacking the individuals who hold opposing views. (I think that some of the activists might do so because they don't actually know enough science to make a substantive argument.)

Since audiences can understand that something might be true regardless of who is saying it, the repeated recourse to ad hominem attacks by skeptic activists not only turns away reasonable people, it also gives fodder to the skeptics' opponents.

Christopher Hitchens wrote very well; but his writing was political, not scientific. This is a broader problem that deserves a section of its own. Possibly a future post of its own as well.


IF YOU BRING POLITICS INTO SCIENCE, THEN SCIENCE BECOMES POLITICS

I was watching a talk by Richard Dawkins and Larry Krauss about the origins of the universe and life, when, a few minutes in, Krauss decided to make fun of an American political party. This is not an uncommon occurrence.

For example, there's a big schism among skeptic activists regarding a number of social issues, none of which is a matter of science. I'm excluding social science, on the basis that if you have to put a qualifier, "social," then you don't belong in the club.

If you're trying to get the general public to understand, say, that the universe is expanding, it's really, really counterproductive to undermine your credibility by including non-sequiturs about American politics in the dialog.

For starters, a non-sequitur shows a logical failure on your part. If you show me that your reasoning is flawed on something I can tell, why should I trust your judgment regarding things I cannot tell, like the validity of a mathematical model or the relevance of some experimental data?

Also, even if one were disposed to accept scientific evidence as contributing to a political decision, there are other considerations. For example other parts of the party platform: "I don't care if Orange believes in Odin Allfather, her tax plan makes more sense than atheist Purple's." For Dawkins and Krauss to assume that everyone must agree with their priorities is another weakness in their persuasive strategy.

Finally, if the argument becomes about who gets power or who is a "good person," that style of argumentation corrupts the whole point of scientific and skeptic education. It also leads to internecine warfare among the various factions in skeptic activism, for example.


CODA: BAD SKEPTIC ACTIVISM CROWDS OUT GOOD SCIENCE

A friend who keeps in touch with skeptic activists sent me a link to a fight among these skeptic activists, which included: group A trying to get people from group B fired from their jobs, name-calling, physical threats, exclusion at conferences, etc.

I went and saw a few minutes of a video and was hooked: the drama, the fights, the name-calling, Wow! Science is so dramatic...

Hey, wait a second. Why did I just spend three hours watching these videos about the lives of people I couldn't care less about? — That's what they are; there's not a single new argument or scientific fact or even a mention of science.

It's Keeping Up With The Kardashians with a skeptic tag.

Three hours is enough to read about half of Why We Get Sick. To watch three classes in Robert Sapolsky's or Leonard Susskind's courses. To write about 500 words in a research paper. To edit a chapter in my work-in-progress book. To tighten the core code on my estimators. To go for a 13.5 km walk. To understand four pages of Rudin's Real and Complex Analysis, maybe even five.

All this potential; all crowded out by a very human attraction to gossip. That temporary insanity was diagnosed and addressed; but what about the broader population, how many of these three-hour intervals have they wasted? How much attention squandered? How much anti-learning?

When God is not Great and The God Delusion were best-sellers, what books with real science and real education in logic, biases, and numeracy did they crowd out?

Thursday, November 1, 2012

Facts up for debate? SRSLY?


I think I figured out the cause of the decline of Western Civilization!

I just watched part of a debate about a point of fact. Not politics, not preferences, not opinions. A point of fact, meaning something that is either true or false in reality, was debated and put to a vote.

A moderator set the fact up as a proposition; two teams tried to out-debate each other; whenever they disagreed on what some study or piece of evidence (referred to, but not presented) actually showed, the moderator declared the point stalled and moved on; and at the end the audience voted on whether the proposition was true.

This was done under the aegis of reason and, presumably, science.

Debate and consensus are accepted as good things in themselves, while supporting evidence is apparently considered unimportant, to the point that neither team thought to bring any.

Oh, good grief!

(Just in case you don't see the problem, physical reality is not decided by vote and all the great orators in the world cannot make gravity go away.)