Monday, November 26, 2012

How misleading "expected value" can be

The expression "expected value" can be highly misleading.

I was just writing some research results and used the expression "expected value" in relation to a discrete random walk of the form

$x[n+1] = \left\{ \begin{array}{ll}
   x[n] + 1 & \qquad \text{with prob. } 1/2 \\
  & \\
   x[n] -1 & \qquad \text{with prob. } 1/2
   \end{array}\right. $ .

This random walk is a martingale, so

$E\big[x[n+1]\big|x[n]\big] = x[n]$.

But from the above formula it's clear that it's never the case that $x[n+1] = x[n]$. Therefore, saying that $x[n+1]$'s expected value is $x[n]$ is misleading — in the sense that a large number of people may expect the event $x[n+1] = x[n]$ to occur rather frequently.
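A quick simulation (my sketch in Python, not from the original post) makes this concrete: the empirical mean of $x[n+1]$ matches $x[n]$, yet the event $x[n+1] = x[n]$ never occurs.

```python
import random

random.seed(0)

# One-step transitions of the walk from x[n] = 0: +1 or -1, each with prob. 1/2.
steps = [1 if random.random() < 0.5 else -1 for _ in range(100_000)]

mean_next = sum(steps) / len(steps)   # empirical E[ x[n+1] | x[n] = 0 ]
stayed_put = steps.count(0)           # number of times x[n+1] == x[n]

print(f"empirical mean of x[n+1]: {mean_next:+.4f}")   # close to x[n] = 0
print(f"occurrences of x[n+1] == x[n]: {stayed_put}")  # exactly 0
```

The "expected value" is hit on average, and never in any single realization.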

Mathematical language may share words with daily usage, but the meaning can be very different.


Added Nov 27: In the random walk above, for any odd $k$, $x[n+k] \neq x[n]$. On the other hand, here's an example of a martingale where $x[n+1] = x[n]$ happens with probability $p$, just for illustration:

$x[n+1] = \left\{ \begin{array}{ll}
   x[n] + 1 & \qquad \text{with prob. } (1-p)/2 \\
  & \\
   x[n]  & \qquad \text{with prob. } p \\
  & \\
   x[n] -1 & \qquad \text{with prob. } (1-p)/2
   \end{array}\right. $ .

(Someone asked if it was possible to have such a martingale, which makes me fear for the future of the world. Also, I'm clearly going for popular appeal in this blog...)

Friday, November 16, 2012

Don't ignore the distribution of estimates!

If forecasts ignore the distribution of estimates, they will be biased.

For example, when computing the probability of purchase using a logit model, we take the estimates for the coefficients in the utility function and use them as the true coefficients, thus:

$P(i) = \Pr(v_i > 0| \beta) = \frac{\exp(v(x_i; \beta))}{1 + \exp(v(x_i; \beta))}$.

But the estimates are themselves random variables, and have a distribution of their own, so, more correctly, the probability $P(i)$ should be written as

$P(i) = \int \Pr(v_i > 0| \hat\beta) \, dF_{B}(\hat\beta)$.

Note that, assuming the estimator is unbiased, $\beta = E[\hat\beta]$; so we can integrate the formula above by parts and get

$P(i) = \Pr(v_i > 0| \beta) -  \int   \frac{\partial  \, \Pr(v_i > 0| \hat\beta)}{\partial \, \hat\beta} \, F_{B}(\hat\beta) \, d\hat\beta$.

The term

$-  \int   \frac{\partial  \, \Pr(v_i > 0| \hat\beta)}{\partial \, \hat\beta} \, F_{B}(\hat\beta) \, d\hat\beta$

is a bias introduced by ignoring the distribution of $\hat\beta$.
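To see the bias concretely, here's a minimal simulation (my own illustration, with arbitrary numbers): a scalar coefficient, $\hat\beta \sim N(\beta, \sigma^2)$, and a logistic purchase probability. The plug-in forecast differs from the forecast averaged over the estimate's distribution.

```python
import math
import random

random.seed(42)

def purchase_prob(beta, x=1.0):
    """Logistic purchase probability with linear utility v = beta * x."""
    v = beta * x
    return math.exp(v) / (1.0 + math.exp(v))

beta_true = 2.0   # assumed true coefficient (made-up number)
se = 1.0          # assumed standard error of the estimate (made-up number)

# Plug-in forecast: treat the point estimate's mean as the truth.
plug_in = purchase_prob(beta_true)

# More correct forecast: average over the distribution of beta-hat
# (Monte Carlo version of the integral in the post).
draws = [random.gauss(beta_true, se) for _ in range(200_000)]
averaged = sum(purchase_prob(b) for b in draws) / len(draws)

print(f"plug-in  P(i): {plug_in:.4f}")
print(f"averaged P(i): {averaged:.4f}")
print(f"bias:          {plug_in - averaged:+.4f}")
```

With these numbers the plug-in probability overshoots, because the logistic is concave where most of the mass of $\hat\beta$ falls; other parameters can flip the sign, but the bias is rarely zero.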

(This simple exercise in probability and calculus was the result of having to make this point over and over on the interwebs, despite the fact that it should be basic knowledge to the people involved. Some of whom, ironically, call themselves "data scientists.")

Added Nov 17: A simple illustration at my online scrapbook.

Saturday, November 10, 2012

Skeptic activism promotes bad thinking

I've heard it many times before, and recently seen it in a poster for the Center For Inquiry, I think, but this quote has always struck me as being counterproductive:
"Extraordinary claims require extraordinary evidence."
In my Bayesian view of the world, this is obviously true; it is also an example of the major problem with skeptic activism: what counts as extraordinary depends on each person's beliefs. Here are a few statements that many people consider extraordinary:
  1. The universe (including space and time) was created in the Big Bang.
  2. In the double-slit experiment, each photon acts as if it goes through both slits.
  3. A moving object becomes shorter along the direction of movement.
  4. A sizable percentage of a person's mass is non-human bacteria.
  5. Genetically, bonobos are closer to humans than to other primates.
  6. Canada is south of Michigan.
(These are true, though some are haphazardly phrased.)

Here's the main problem with skeptic activists: they assume that both the prior beliefs and the informativeness of the evidence are the same for all people, i.e. that the audience shares the skeptics' views of what is extraordinary and the skeptics' trust in the evidence presented.

Before anything else, let me make absolutely clear that I'm not defending superstition; but the approach taken by many skeptic activists seems to be less about diffusing knowledge (convincing people) than about establishing identity ("I'm not like them").

This post is, then, an attempt to diagnose problems with skeptic activism so that these problems can be addressed and stop being obstacles to education of the general public.


To a large percentage of the population, what skeptic activists call extraordinary claims are that population's prior beliefs. Therefore, defeating superstition requires extraordinary evidence. Or, at least, convincing evidence; usually, scientific evidence.

And that's the problem.

How do I know gravity exists? I can drop a pen on my table. How do I know it's a warping of space and time created by mass, where the effects of change in mass distributions propagate through space-time at the speed of light?


Ok, so once we're past the basic phenomena of physical reality, evidence becomes complex and depends on prior knowledge and trust in the experimenters and scientists.

In an early episode of The Big Bang Theory, the scientists bounce a laser off a reflector on the Moon, to which Zach, Penny's dumb boyfriend, says: "That’s your big experiment? All that for a line on the screen?" And he's right.

For Zach to interpret the line on the screen as proof of a reflector on the Moon, he needs to believe that: the laser was strong enough to reach the Moon despite attenuation and diffusion in the Earth's atmosphere; people have been to the Moon and left a reflector there; the laser hit that reflector at a ninety-degree angle so the light is reflected along the return path to the roof (corrected on Dec 12: not necessary since the reflectors are corner reflector arrays; live and learn, I always say); and the detector is sensitive enough to detect a reflection from the Moon through the Earth's atmosphere. The alternative theory is that the laptop waited 2.5 seconds after the button was pressed, invoked a graphic routine, and drew the line.

The alternative theory is simpler and closer to daily experience. To accept Leonard's statement that "We hit the Moon," Zach has to believe many things he cannot test himself. In other words, he must show unquestioned trust in what he's told by an authority, also known as faith.

We're told that one should trust science because it has checks and balances and replication, serves as the foundation for engineering, and adapts to change by incorporating new information and, if needed, completely changing.

Oh, and I have some shares in the Golden Gate Bridge I'll sell to you cheaply.

Checks and balances are all well and good, but some scientists don't let others examine their data or their equipment. This is not necessarily because they don't trust the data; it's more often the case that they want to write many papers with the same data, and sharing it would let others write those papers. But it surely looks suspicious.

(For the moment let's not dwell on various cases of outright fraud, clientelism, publishing cliques, data "cleaning" and selective reporting, academic witch-hunts, and other characteristics of human enterprises that have infected the practice of high-stakes science.)

Replication? Perhaps in the past, but now replicating other people's experiments is considered a mark of mediocrity by grant committees, tenure committees, and Sheldon (regarding Leonard's work). Every scientist agrees that replication is needed, that it's very important, and that someone else should do it. The only way we get any replication is when some researchers hate other researchers so much that they want to prove them wrong. Hey, blood feuds are good for something, after all.

Science (as a description of reality) is the foundation for engineering. But since most people have as little understanding of technology as they do of science, this argument doesn't help. Technology might as well be run by incantations or, as we call them in the business, user interface designs.

As for science being amenable to change, it's a little better than what most people in the humanities and social sciences have to put up with; but still it took Leonard Susskind many years to convince physicists that information loss in black holes was a problem. Doctors thought so much of Ignaz Semmelweis's exhortations to wash their hands that they committed him to a lunatic asylum. Scientists are people, which is a major obstacle to science.

So, there's a lot of faith involved in the process of trusting science. The difference, of course, is that at the bottom of that faith in science there's usually experimentally verifiable truth, whereas at the bottom of faith in superstition there's always unverifiable dogma.

The problem is that, while real science popularizers (like Carl Sagan, plus a few hundred) focus on the science and its wonder, some skeptic activists pollute the intellectual field with sophistry and politics. This is bad enough when it comes from non-scientists, but unfortunately some of the more visible offenders actually are scientists — they just aren't acting like scientists anymore.

If the future becomes a matter of faith in whoever has the best sophistry and is best at grabbing power, the superstition-peddlers will win: they've been at it millennia longer.


I really enjoy the Mythbusters and Penn and Teller's Bullsh*t, and I like Christopher Hitchens's writing. So it pains me to point out that these high-visibility skeptics aren't that helpful. And when their persuasion approach is imitated by the activist skeptics, it's counter-productive.

The Mythbusters are, at best, a caricature of hypothesis testing. That's fine for an entertainment show, and serves as a motivator to get people interested in science, but the problem is that people who believe in various superstitions can point out the flaws in the Mythbusters tests, and use these to indict science in general.

Take, for example, their test of a movie myth, climbing the outside of a building using suction clamps. The Mythbusters "proved" it was impossible to do so by showing that Jamie couldn't do it with their homemade suction rig. Obviously, a trained climber with better equipment might have. (In fact, some climbers do climb the outside of buildings without any suction clamps.) This is good entertainment, but it reinforces the idea that science plays fast and loose with the facts in pursuit of an agenda.

A spillover of the Mythbusters "Big Booms as Science" approach is a ridiculous demonstration — to college students — of the ideal gas law: a teacher fills a plastic bottle with liquid nitrogen, drops it into a bucket of hot water, covers it with ping-pong balls, and BOOM! Science!

Correction (Nov 22): The teacher fills the plastic bottle with dry ice (frozen CO2), not liquid nitrogen.(End of correction.)

No. Boom, but no science.

There was nothing scientific about it; it reminds me of the I effing love science page, where little science ever treads. This demonstration might be okay for children (though when liquid nitrogen is involved children prefer instant ice-cream making); for college students? What are they supposed to learn? And this, to our benighted media, is "the best physics class ever?"

Aargh! Try these instead.

I already explained, in another post, the problem with Penn and Teller's style of "argument by making fun of people who hold the wrong beliefs." tl;dr: it convinces people that it's who says things that matters, not what they say.

These ad hominem attacks are a common tactic of many skeptic activists: attacking the individuals who hold opposing views. (I think that some of the activists might do so because they don't actually know enough science to make a substantive argument.)

Since audiences can understand that something might be true regardless of who is saying it, the repeated recourse to ad-hominem attacks by skeptic activists not only turns away reasonable people, it also gives fodder to the skeptics' opponents.

Christopher Hitchens wrote very well; but his writing was political, not scientific. This is a broader problem that deserves a section of its own. Possibly a future post of its own as well.


I was watching a talk by Richard Dawkins and Larry Krauss about the origins of the universe and life, when, a few minutes in, Krauss decided to make fun of an American political party. This is not an uncommon occurrence.

For example, there's a big schism in the skeptic activists regarding a number of social issues, none of which is a matter of science. I'm excluding social science, on the basis that if you have to put a qualifier, "social," then you don't belong in the club.

If you're trying to get the general public to understand, say, that the universe is expanding, it's really really counterproductive to undermine your credibility by including non-sequiturs about American politics in the dialog.

For starters, a non-sequitur shows a logical failure on your part. If you show me that your reasoning is flawed on something I can tell, why should I trust your judgment regarding things I cannot tell, like the validity of a mathematical model or the relevance of some experimental data?

Also, even if one were disposed to accept scientific evidence as contributing to a political decision, there are other considerations. For example, other parts of the party platform: "I don't care if Orange believes in Odin Allfather, her tax plan makes more sense than atheist Purple's." For Dawkins and Krauss to assume that everyone must agree with their priorities is another weakness in their persuasive strategy.

Finally, if the argument becomes about who gets power or who is a "good person," that style of argumentation corrupts the whole point of scientific and skeptic education. And leads to internecine warfare among the various factions in skeptic activism, for example.


A friend who keeps in touch with skeptic activists sent me a link to a fight among these skeptic activists, which included: group A trying to get people from group B fired from their jobs, name-calling, physical threats, exclusion at conferences, etc.

I went and saw a few minutes of a video and was hooked: the drama, the fights, the name-calling, Wow! Science is so dramatic...

Hey, wait a second. Why did I just spend three hours watching these videos about the lives of people I couldn't care less about? That's all they are; there's not a single new argument or scientific fact, or even a mention of science.

It's Keeping Up With The Kardashians with a skeptic tag.

Three hours is enough to read about half of Why We Get Sick. To watch three lectures in Robert Sapolsky's or Leonard Susskind's courses. To write about 500 words in a research paper. To edit a chapter in my work-in-progress book. To tighten the core code on my estimators. To go for a 13.5 km walk. To understand four pages of Rudin's Real and Complex Analysis, maybe even five.

All this potential; all crowded out by a very human attraction to gossip. That temporary insanity was diagnosed and addressed; but what about the broader population, how many of these three-hour intervals have they wasted? How much attention squandered? How much anti-learning?

When God is not Great and The God Delusion were best-sellers, what books with real science and real education in logic, biases, and numeracy did they crowd out?

Thursday, November 1, 2012

Facts up for debate? SRSLY?

I think I figured out the cause of the decline of Western Civilization!

I just watched part of a debate about a point of fact. Not politics, not preferences, not opinions. A debate about a point of fact, meaning something that is either true or false in reality was debated and put to a vote.

A moderator set the fact up as a proposition; two teams tried to out-debate each other; whenever they disagreed on what some study or piece of evidence referred to (but not presented), the moderator declared the point stalled and moved on; and at the end the audience voted on whether the proposition was true.

This was done under the aegis of reason and, presumably, science.

Debate and consensus are accepted as good things in themselves, while supporting evidence is apparently considered unimportant, to the point that neither team thought to bring any.

Oh, good grief!

(Just in case you don't see the problem, physical reality is not decided by vote and all the great orators in the world cannot make gravity go away.)

Wednesday, October 31, 2012

Why I'm somewhat apprehensive about Apple's reshuffle

Though I'm not as pessimistic about the Apple executive shuffle as the markets and Joy Of Tech, I'm apprehensive regarding the future of Apple's products.

Jony Ive is a great industrial designer, but Human-Computer Interaction is not Industrial Design. And some of the design decisions in recent hardware (meaning Ive's decisions) seem to ignore realities in the field. Take the latest iMac.

The new iMac doesn't have an optical drive; some pundits (and, I think, Phil Schiller at the Apple event) say that's a normal evolution. After all, computers no longer have floppy drives, and Apple was the first to drop them. And look how pretty the tapered edges of the iMac are.

Floppy disks existed as part of a computer-only ecosystem. CDs, DVDs, and BluRay Discs are part of a much larger ecosystem, which includes dedicated players and big screen TVs, production and distribution chains for content, and a back catalog and personal inventory for which downloads are not a complete alternative. (Some movies and music are not available as downloads and people already have large collections of DVDs and BluRay Discs.)

Using floppy disks as an example of change, implying that it is repeated with optical drives, shows a complete disregard of the larger ecosystem and willful ignorance of the difference between the earlier situation and the current situation.

For a laptop, the absence of an optical drive may be an acceptable trade-off for lower weight; for a desktop, particularly one that is a "home" desktop with an HD screen, the lack of a BluRay/DVD/CD drive is a questionable decision.

But look how pretty the tapered edges are, here on the uncluttered Apple Store retail shelves — oops, those computers will be in cluttered real-world environments, where the necessary external drive (what, no BluRay drive yet, Apple?) will add even more clutter.

But, on the empty tables and antiseptic environments of "minimalist" designers' imagined world, that tapered edge is really important.

In the rest of the world, there are scores of people who like watching really old movies (available on DVD, not as downloads or streaming — except illegally), new movies in 1080p discs with lots of special features (i.e. BluRay discs that they can buy cheaply in big box stores), or their own movies (which they already own, and could rip — in violation of the DMCA — for future perusal, as long as they want piles of external hard drives); or maybe they want to rip some music that isn't available in download format, say CDs they bought in Europe that aren't available in the US yet.

So, using a decision that is not isomorphic at all (dropping the floppy disk) as a justification, Apple ignores a big chunk of the value proposition (consumption of media that is not available via digital download) on behalf of elegance. And, perhaps some extra iTunes sales — probably too small to make a difference on the margin.

What will this type of philosophy do to software? As Donald Norman wrote in this piece, there's nothing particularly good about fetishizing simplicity. Even now, many power users of Apple products spend a lot of time developing work-arounds for Apple's unnecessarily rigid limitations.

Steve Jobs's second stint at Apple had the advantage of his having failed twice before (his first stint at Apple and NeXT), which tempered him and made him aware of the power of ecosystems (not just network effects). This is a powerful learning experience for an executive. Jony Ive hasn't failed in this manner.


Saturday, October 27, 2012

Great Science/Technology documentary - just no STEM in it please!

Since these days every communication is supposed to be a story, see if you can find the commonality among these four vignettes:

1. A while ago I saw a documentary about a Steinway piano, Note by Note: The Making of Steinway L1037, which didn't really say anything about the piano: what wood it's made of, how the key mechanism works, how Steinways differ from Bösendorfers or Young Changs, the history of pianos, or really anything about the construction of the piano other than a sequence of glossed-over steps. But it did tell us about the lives of the people who work in the factory, showed a kid hammering away at the keys, and a number of other "human interest stories." The lack of any information about the piano or the making thereof doesn't seem to have stopped this documentary from earning accolades, quite the contrary.

2. Magicians Penn and Teller have a Showtime program called Bullsh*t, where they purport to debunk bad thinking. It's entertaining, but what they do is foster more bad thinking: they typically present the "wrong side" by having highly mockable people say stupid things on camera (helped by editing); then they mock those people. This is precisely the wrong way to go: though it's entertaining to watch, this creates the impression that something is wrong or right because of who is saying it, not of what is being said — a clear ad hominem fallacy.

3. Jerome Groopman's book How Doctors Think has a nice collection of decision biases in medical practice. It's a good book; I have both the audiobook and the hardcover versions. But about one-fourth of the text is wasted in descriptions of the environment where the author is interviewing someone, what the person looks like, their life stories, their hobbies, and other unrelated material. These faits divers are distracting, and not in the sense that they are entertaining.

4. A long time ago I played my first (and I think last) game of Trivial Pursuit. I recall getting the following science question: "who was the president of the [Portuguese version of the AMA] after [name long forgotten]?" I said this was a preposterous science question (preposterous enough that I remember it over twenty years later) but other players didn't agree with me. To these mostly educated, but not clear-thinking people, science was about bureaucrats in local professional associations of some scientific nature.

These are four examples — out of hundreds — of the popular misperception of what science and technology are about: most people seem to think that science and technology, STEM to use the broader acronym, are about the people involved. Hence a call for more scientists and engineers, instead of a call for better science and engineering.

But there's hope, though it comes from far away (the UK) and long ago (the 1980s): Richard Hammond and James May and Cosmos.

Cosmos was a great series. Carl Sagan didn't talk about Einstein's love affairs, the view from Galileo's palace, or what Copernicus looked like in breeches. Sagan made the Universe the hero of the series, scientific thinking the sidekick, and occasionally threw in some history for context. Brian Greene's documentaries on PBS are the closest intellectual heir to Cosmos, I think; unfortunately few people even know these exist.

Browsing YouTube, I found a number of episodes of Richard Hammond's Engineering Connections and James May's Things You Need to Know. These programs focus on the information or knowledge and leave all the "human interest" aside. (*) A little like Mythbusters shows, except that the MBs have a lot of problems with the science.

Far away and long ago these may be, but they might catch on due to the interaction between the power of the interwebs and a rare species, homo sapiens nerdus.

Reporting on Felix Baumgartner's skydive from the edge of space a few days ago, MSNBC had an unfortunate chyron with "Fearless Felix traveled faster than the speed of light." A lot of people noticed the problem (MSNBC meant the speed of sound, obviously), but I took it a little further and made some minimal computations to determine that if FF had reached ninety percent of the speed of light, his collision with New Mexico would have wiped out a big chunk of that state in a 1500 megaton explosion. This is called "nerding out."

(Nerds might save our civilization. Be kind to your local nerd.)

Note that there's minimal knowledge of Physics involved in my nerding out. All one has to do is to compute the Lorentz mass $m = m_0 / (1 - v^2/c^2)^{1/2}$ and then use the kinetic energy formula $E= 1/2\,  m v^2$.  (In fact, you could ignore the Lorentz correction altogether and get a close enough answer.) I wonder what percentage of the people who laughed at MSNBC could have done these calculations; I also wonder how many understand why the speed of light is an absolute limit while the speed of sound is not. (**)
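For the curious, here's a sketch of that back-of-the-envelope computation (the ~80 kg figure for jumper plus suit is my assumption; the post only reports the megaton result):

```python
import math

c = 299_792_458.0   # speed of light, m/s
m0 = 80.0           # rest mass of jumper + suit, kg (my assumption)
v = 0.9 * c         # the hypothetical ninety-percent-of-c descent

# "Lorentz mass," then the Newtonian kinetic-energy formula, as in the post.
m = m0 / math.sqrt(1.0 - (v / c) ** 2)
E = 0.5 * m * v ** 2        # joules

megatons = E / 4.184e15     # 1 megaton of TNT = 4.184e15 J
print(f"impact energy: about {megatons:.0f} megatons of TNT")
```

Which lands in the neighborhood of the post's 1500 megaton figure; dropping the Lorentz correction still gives the same order of magnitude.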

STEM is not about haughty jokes or the personalities of those involved. It's about truth (science), achievement (engineering and technology), and thought (math). It's a world of wonder beyond the lives of those involved. That some popularizers think that to make this wonder interesting we must intersperse it with gossip and distractions is an insult to their audiences. Or maybe the popularizers are cowards, unwilling to educate or inspire their audience.

Can we please get people who are really interested in science, engineering, technology, and math to take over their popularization? Otherwise, we'll end up with science documentaries that spend more time on Einstein's hair and political opinions than on the implications of $E = mc^2$.

Like the ones we already have.

– – – –

* This expression is getting on my nerves. It suggests that humans are not interested in knowledge or thought other than gossip and the picayune. Some people defend the inclusion of irrelevant personal details as part of giving the characters involved more depth. My point is that the characters involved are irrelevant for the STEM part; focusing on them only emphasizes the natural human tendency for ad hominem reasoning. (Obviously if the field of study is leadership or creativity or similar, the depth of the characters involved is part of the science.)

** In other words, how does most people's knowledge of science differ from what we'd call religious belief in science? To clarify: scientific knowledge (which must be consistent with experiment and observation) differs in a fundamental way from religious belief (based on revelation by authority). But, in my experience, many people "believe in science," not because they understand it and can work out the implications, but because it's been revealed to them by authority (the education system) and it's socially unacceptable in their peer group to not "believe in science." In that sense, their belief in science is not scientific, but rather religious.  Yes, there's a post about this in the wings.

Monday, October 22, 2012

Can we stop talking about "manufacturing jobs"?

A lot of people worry about "manufacturing jobs," but the metric is seriously flawed.

Politicians and some financial analysts decry the decline of manufacturing jobs. There has been some decline, but the way these jobs are measured is inherently flawed, as it fails to take into account the change in managerial attitudes towards vertical integration.

It's easy to see why with an example:

Ginormous Corp. makes widgets. In the 60s to mid-80s, as it went from being Bob's Homemade Widgets to Ginormous Corp., it added new facilities which had janitorial, accounting, cafeteria, legal, and other support services. All personnel in these support services counted as "manufacturing jobs."

In the mid-80s, Ginormous Corp. figured out (with a little help from Pain & Co and McQuincy & Co consultancies) that these support services were (a) not strategic and (b) internal monopolies. Part (a) meant that they could be outsourced and part (b) strongly suggested they should be outsourced. Let's say that Ginormous Corp. spun out these support services into wholly-owned subsidiaries, with no significant change in overall personnel.

So, all the personnel in janitorial, accounting, cafeteria, legal, and even some of the technical business support went from being in "manufacturing jobs" to being in "service jobs" without any change to what is actually produced or to any actual job.

A metric that can change dramatically while the underlying system and processes don't change much is not a good foundation for decision-making. "Manufacturing jobs" is one such metric, as it depends on organizational decisions at least as much as on actual structural changes.

Metrics: useful only when well-understood.

Note: There are many reasons why focusing on manufacturing jobs over service jobs is a bad idea: Old Paul Krugman explains the most relevant, differential productivity increases, here.

Friday, October 19, 2012

Math in business courses: derivating + grokking

I used to start my Product Management class with a couple of business math problems like the following: let's say we use a given market research technique to measure the value of a product; call the product $i$ and the value $v(i)$. We know -- by choice of the technique -- that the probability that the customer will buy $i$ is given by

$\Pr(i) = \frac{\exp(v(i))}{1 + \exp(v(i))}$.

My question: is this an increasing or a decreasing function of $v(i)$?

Typically this exercise divided students into three groups:

First, students who were afraid of math, were looking for easy credits, or were otherwise unprepared for the work in the class. These math problems made sure students knew what they were getting into.

Second, students who could do the math, either by plug-and-chug (take derivative, check the sign) or by noticing that the formula may be written as

$\Pr(i) = \frac{1}{1 + \exp(-v(i))}$

and working the increasing/decreasing chain rule.

Third, students who had a quasi-intuitive understanding ("grok" in Heinlein's word) that probability of purchase must be an increasing function of value, otherwise these words are being misused.
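Both the plug-and-chug route and the grokked answer are easy to check numerically; a quick sketch (mine, not part of the course material):

```python
import math

def purchase_prob(v):
    """P(i) = exp(v) / (1 + exp(v)), rewritten as 1 / (1 + exp(-v))."""
    return 1.0 / (1.0 + math.exp(-v))

values = [-3.0, -1.0, 0.0, 1.0, 3.0]

# Plug-and-chug: the derivative P'(v) = P(v) * (1 - P(v)) is always positive.
derivs = [purchase_prob(v) * (1 - purchase_prob(v)) for v in values]
assert all(d > 0 for d in derivs)

# Grokking: higher value, higher probability of purchase.
probs = [purchase_prob(v) for v in values]
assert probs == sorted(probs)
print("P(i) is strictly increasing in v(i)")
```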

Ideally we should be training business students to mix the skills of the last two groups: a fluency in basic mathematical thinking and grokking business implications.

- - - - - - -

Administrative note: Since I keep writing 4000+ word drafts for "important" posts that never see the light of blog (may see the light of Kindle single), I've decided to start posting these bite-sized thoughts.

Saturday, October 6, 2012

Thinking - What a novel idea

Or: it may look like brawn won the day, but it was really brains.

Yesterday I took some time off in the afternoon to watch the Blue Angels practice and the America's Cup multihull quarterfinals. Parking in the Marina/Crissy Field area was a mess and I ended up in one of the back roads in the Presidio. As I drove up, I saw a spot -- the last spot -- but, alas, there was a car in front of me. It drove into the spot, partly, then backed up and left.

I drove up to the spot and saw a block of cement with twisted metal bits in it, about three feet from the back end. I got out, grabbed the block, assessed its weight at about 100Kg, farmer-walked it to the berm, and got a parking spot.

Ok, so moving 100 kg or so doesn't make me the Hulk. What is my point, exactly?

There were at least two men in the car that gave up the space. They could have moved that block with ease. Instead they went in search of parking further into the Presidio; probably futile, if traffic was any indication. Why didn't they do what I did? Why didn't anyone before me (the parking areas well above the one I ended up in were already full as well)?

They didn't think of it.

Actually thinking is a precondition to problem-solving. Many problems I see are not the result of bad thinking but rather of the lack of thinking.

Thursday, August 23, 2012

Enough with the non-sequiturs and unrelated asides in technical talks

If you have a main point on a technical matter, don't muddle it with social, political, aesthetic, or other unrelated asides.

Let's say I agree with these two propositions:
  • P1: Gravity exists, and it creates a force that varies in direct proportion to the product of the masses of the objects involved and in inverse proportion to the square of the distance between them.
  • P2: Eating raw DiGiorno pizza is disgusting.
These are different in type, not just content: P1 is a fair approximation of the way the world works, while P2 is a matter of my personal taste. Note that disagreeing with P2 means we have different tastes, while disagreeing with P1 means you don't know what you're talking about (or you're being overly pedantic and ignoring the "fair approximation" bit above).

My agreeing with P1 and P2, however, doesn't mean that (in increasing order of irritation):
  • E1: It's fine to intersperse comments about how disgusting raw DiGiorno pizza is at random points in a talk on gravity.
  • E2: Anyone who agrees with P1 must agree with P2.
  • E3: Anyone who agrees with P1 must agree that P1 implies P2.
  • E4: Anyone who agrees with P1 must agree that P1 implies the government must do something (that is, force must be used) to stop people from eating raw DiGiorno pizza.
I've noticed, in many talks purporting to be about science or technology, the occurrence of events of all of these types -- often by people whose training or occupation should preclude such fallacies.

Monday, June 18, 2012

How I learned to make better presentations by paying attention to the performing arts

Cinema, especially documentaries, and the performing arts in general have a few useful lessons for presentations.

Of course, the main lesson regarding presentations is that preparation is key — just like with the performing arts. But this is a post about details rather than a rehashing of my big presentations post.

1. Have a script. A real script, like a movie script.

In the past I used the Donald Norman approach: have an outline, annotated with some felicitous turns of phrase (those work better if you figure them out in advance) and important facts and figures. But now I find that having a script is a great tool, even if I tend to go off-script early and often in a presentation:

a) It forces me to plan everything in detail before the first rehearsal. Then I can determine what works and what doesn't and adapt the script. (Just like in a movie production.)

b) It makes obvious when there's over-use of certain expressions, unintended alliteration, tongue twisters, and pronunciation traps. Not to mention embarrassing unnoticed double entendres.

c) It creates a visual representation of the spoken word, which lets me see how long some chains of reasoning are.

d) It serves as a security blanket, a mental crutch, especially when I'm lecturing in a language different from the one I speak all day at the lecture location (say, speak Portuguese in Lisbon, listen to Puccini arias in Italian, read Le Monde in French, and then have to capstone discussion classes with English lecturettes).

2. Explicitly write treatments on script

By writing the treatments (slide, board, video, interactive, prop, handout, demonstration, discussion, etc) explicitly into the script I can identify potential problems:

a) If there's a block of more than 750 words (i.e. about ten minutes) that has no treatments, it had better be an interesting story. If I think that the story could use some attention-grabbing treatment, I have time to figure out what to do as I write the script.

b) If there's a diagram on a projected material that requires several builds, and is marked as such on the script, I identify that as an opportunity for a step-by-step construction on the board. That change comes at a cost of production values (upon seeing my drawings, arts teachers suggested I follow a career in text- or numbers-based fields); the benefit is the physicality of the writing and the motion of the speaker. (Note: draw on board, turn, then speak. Even with a microphone, speaking into the board is bad practice, as the speaker cannot gauge the audience's reaction.)

c) By having a cue to the upcoming treatment, I can compensate for lag in equipment response. For example, in one of my scripts I want a video to start as I finish my sentence "money can't buy taste or style." Given my equipment lag, I need to click the remote when I say "money" so that the words in the video follow mine without a noticeable pause. (The pause would distract the audience and direct some of their attention to the technology instead of the content. Obviously I don't say "let's see a video about that" or some such.)

d) Explicit treatments also make it easy to check whether I have all the materials ready and to make sure that I don't forget anything in the mise en scène before I start. (This is a particularly useful reminder to set up demonstrations and double-check props before the event.)

3. There's only one "take" on stage – so practice, practice, practice

The first practice talk is basically a check for design issues; many things that sound or appear adequate in one's mind's ear or eye fail when said out loud, projected on a screen, or written on a board. After the first practice there's usually a lot of rework of presentation materials to do. It goes without saying that skipping this practice presentation means that the problems it would have uncovered will surface during the actual presentation -- when it matters and in front of an audience.

A few iterations of this practice-analyze-rework loop (reminiscent of the draft-edit-redraft cycle for the written word) should converge to a "gold master" talk. At this point practice will be for delivery issues rather than design issues: intonation, pronunciation, movement, posture, etc.

Full dress rehearsals, preferably in the presentation venue, are great tools to minimize surprises at the presentation time. I always try to get access to the venue ahead of time, preferably with the A/V people that will be there for the presentation.

If you feel ridiculous giving a full dress rehearsal talk to an empty room while the A/V people watch from their booth, just think how much more ridiculous it is to fail in front of an audience for lack of preparation.

(It goes without saying, but I said it before and will say it again, that practice is the last step in the preparation process before the presentation event; some presenters believe that practice can replace the rest of the preparation process, which is a grave error.)

4. Record and analyze presentations, even the practice ones

Given how cheap recording equipment is, there's no reason not to record presentations (except perhaps contractual restrictions).

The main reason for recording is quality control and continuous improvement; a second reason is to capture any impromptu moments of brilliance or interesting issues raised during the Q&A.

Depending on various arrangements and the presenter's approach to sharing, these recordings can also be part of the public portfolio of the presenter.

5. The Ken Burns effect - it's not a spurious animation

I have railed against the profusion of unnecessary animations in presentations, so it's ironic that I'm advocating adding animation to static images. But there's a logic to it.

There are times when I have a few minutes' worth of words that refer to, or are supported by, a photo. That photo is the projected material for those minutes, but I've started using very slow pans and zooms (the Ken Burns effect, after the PBS documentarian) to create a less boring experience.

My pragmatic guidelines for using the Ken Burns effect are:

a) Use sparingly: once, maybe twice in a presentation, and not in a row.

b) Very slow motion; the idea is to create a just-noticeable-difference so that there's something to keep the attention on the picture, but not enough to distract from what I'm saying.

c) The picture has to be high-resolution so that there's no pixelation.

d) In case of uncertainty, no effect. (Less is more.)

e) Since the photo supports the words I'm saying, and Keynote doesn't allow slide transitions in the middle of animations, the length of the effect has to be just short of the time it takes to say the words.

And a big difference from performing arts and documentaries: Every talk is new, even the canned ones.

Unless you're Evelyn Waugh, you don't want to give the same talk every time. Knowledge evolves, circumstances change, new examples appear in the media, and you learn new stuff from the question and answer period after a talk, or in the socializing period.

Having a script (and a master presentation a la Tom Peters) lets a speaker track the changes that a talk goes through over its lifecycle. It's an interesting exercise in itself, but also can give hints for how to adapt other "canned" talks one may have in one's portfolio.

Preparation, Practice, and Performance. Gee, it's like one of those management things where a complicated field is summarized by a few words that start with the same letter. But it's accurate.

—  —  —  —

Related posts:

Posts on presentations in my personal blog.

Posts on teaching in my personal blog.

Posts on presentations in this blog.

My 3500-word post on preparing presentations.

Saturday, May 19, 2012

Is Pete Fader right that Big Data doesn't imply big money?

He's right, in that Big Data doesn't necessarily lead to big money, but I think he exaggerates for pedagogical effect. Why he feels the need to do so is instructive, especially for Big Data acolytes.

Some days ago there was agitation in the Big Data sociosphere when an interview by Wharton marketing professor Peter Fader questioned the value of Big Data. In The Tech, Fader says
[The hype around Big Data] reminds me a lot of what was going on 15 years ago with CRM (customer relationship management). Back then, the idea was "Wow, we can start collecting all these different transactions and data, and then, boy, think of all the predictions we will be able to make." But ask anyone today what comes to mind when you say "CRM," and you'll hear "frustration," "disaster," "expensive," and "out of control." It turned out to be a great big IT wild-goose chase. And I'm afraid we're heading down the same road with Big Data. [Emphasis added.]
I think Pete's big point is correct: that Big Data by itself (to be understood as including the computer science and the data analysis tools, not just the data -- hence the capitalization of "Big Data") is not sufficient for Big Money. I also think that he's underestimating, for pedagogical effect, the role that Big Data, combined with appropriate business knowledge, can have in changing the way we do marketing and the sources of value for customers (which are, respectively, the job of the marketer and the foundation of business).

This is something I've blogged about before.

So, why make a point that seems fairly obvious (domain knowledge is important, not just data processing skills), and especially why make it so pointedly in a field that is full of strong personalities?

First, since a lot of people working in Big Data don't know technical marketing, they keep reinventing and rediscovering old techniques. Not only is this a duplication of work, it also ignores all knowledge of these techniques' limitations, which has been developed by marketers.

As an example of marketing knowledge that keeps being reinvented, Pete talks about the discovery of Recency-Frequency-Money in direct marketing,
The "R" part is the most interesting, because it wasn't obvious that recency, or the time of the last transaction, should even belong in the triumvirate of key measures, much less be first on the list.*    [...]
Some of those old models are really phenomenal, even today. Ask anyone in direct marketing about RFM, and they'll say, "Tell me something I don't know." But ask anyone in e-commerce, and they probably won't know what you're talking about. Or they will use a lot of Big Data and end up rediscovering the RFM wheel—and that wheel might not run quite as smoothly as the original one.

Second, some of the more famous applications of machine learning, for example the Netflix prize and computers beating humans at chess, in fact corroborate the importance of field-specific knowledge. (In other words, that which many Big Data advocates seem to believe is not important, at least as far as marketing is concerned.)

Deep Blue, the specialized chess-playing computer that defeated Kasparov, had large chess-specific pattern-matching and evaluation modules; and as for the Netflix prize, I think Isomorphismes's comment says it all:
The winning BellKor/Pragmatic Chaos teams implemented ensemble methods with something like 112 techniques smushed together. You know how many of those the Netflix team implemented? Exactly two: RBM’s and SVD.    [...] 
Domain knowledge trumps statistical sophistication. This has always been the case in the recommendation engines I’ve done for clients. We spend most of our time trying to understand the space of your customers’ preferences — the cells, the topology, the metric, common-sense bounds, and so on.

Third, many people who don't know any technical marketing tools continuously disparage marketing (and its professionals), and some do so from positions of authority and leadership. That disparagement, repeated and amplified by me-too retweets and Quora upvotes, is what makes reasonable people feel the need for pointedly making their points.

Here are two paraphrased tweets by people in the Big Data sociosphere; I paraphrased them so that the authors cannot be identified with a simple search, because my objective is not to attack them but rather illustrate a more widespread attitude:
It's time marketing stopped being based on ZIP codes. (Tweeted by a principal in an analytics firm.)
Someone should write a paper on how what matters to marketing is behavior not demographics. (Tweeted by someone who writes good posts on other topics.)
To anyone who knows basic marketing, these tweets are like a kid telling a professional pianist that "we need to start playing piano with all fingers, not just the index fingers" or "it's possible to play things other than 'chopsticks' on the piano." (Both demographics and ZIP codes were superseded by better targeting approaches decades ago.)

These tweets reflect a sadly common attitude of Big Data people trained in computer science or statistics: that the field of marketing cannot possibly be serious, since it's not computer science or statistics. This attitude in turn extends to each of those fields: many computer scientists dismiss statistics as irrelevant given enough data, and many statisticians dismiss computer scientists as just programmers.

That's a pernicious attitude: that what has been known by others isn't worthy of consideration, because we have a shiny new tool. That attitude needs deflating, and that's what Pete's piece does.

-- -- -- --

* An explanation of the importance of recency is that it's a proxy for "this client is still in a relationship with our firm." There's a paper by Schmittlein, Morrison, and Colombo, "Counting your customers," Management Science, v33n1 (1987), that develops a model of market activity using a two-state model: purchases are Poisson with unknown $\lambda$ in one of the states (active), and there's an unobserved probability of switching to the other state (inactive), which is absorbing and has no purchases. Under some reasonable assumptions, they show that recency increases the probability that the consumer is in the active state. BTW, I'm pretty sure it was Pete Fader who told me about this paper, about ten years ago.
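That two-state story can be illustrated with a quick Monte Carlo sketch in Python. All the numbers here (rates, observation window, recency cutoffs) are made up for illustration, not taken from the paper:

```python
import random

random.seed(0)

# While "active", a customer purchases at Poisson rate lam; the active
# lifetime is exponential with rate mu, after which the customer is
# absorbed into the "inactive" state and never purchases again.
# The claim: low recency (recent last purchase) predicts being active.
lam, mu, T = 1.0, 0.2, 10.0   # hypothetical rates and observation window

recent_active = recent_total = 0
stale_active = stale_total = 0

for _ in range(20000):
    death = random.expovariate(mu)          # time of switch to inactive
    t, last_purchase = 0.0, None
    while True:
        t += random.expovariate(lam)        # next purchase time
        if t > min(death, T):
            break
        last_purchase = t
    if last_purchase is None:
        continue                            # no purchases observed
    recency = T - last_purchase
    active = death > T                      # still active at time T?
    if recency < 2.0:
        recent_total += 1
        recent_active += active
    elif recency > 5.0:
        stale_total += 1
        stale_active += active

# Customers who bought recently are far more likely to still be active.
assert recent_active / recent_total > stale_active / stale_total
```

The intuition the simulation surfaces: a long gap since the last purchase is much more likely to come from a dead relationship than from a long dry spell of an active one.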

Friday, May 11, 2012

A tale of two colloquia

It was the best of talks, it was the worst of talks.

(Yes, I understand that Dickens's opener has been used to the cliché limit; but the two examples I have in mind really bracket the space of possible talks. At least those talks with voluntary attendance.)

The best of talks: Michael Tilson Thomas at TED.

Even if you don't like art music, this talk is well worth watching for the presentation skills demonstrated by MTT:

MTT opens with a personal story of an interesting coincidence (his father's name was Ted); this is not my preferred type of opener, but he builds a personal narrative out of that opener and then merges it with his main topic very well.

MTT sits at a baby grand piano, which he occasionally plays to illustrate points about music evolution. This interactive production of the presentation material, similar to writing and running code or analyzing data in a technical presentation, has three main presentation advantages that make up for its risks:

1. Visual and physical variety, or more generally, presentation process variety. Every few seconds the image changes, the activity changes, the type of presentation changes: speaking, playing piano, describing a photo, narrating a video, watching a video without narration, listening to recorded music. Compare that with 18 minutes of speaking to slides bearing bullet points.

2. Clear demonstration of expertise, which projecting a video or playing recorded music cannot do. In a live demonstration or performance there's always a risk that something will go wrong, which is why many presenters avoid this kind of demonstration. But the willingness to take that risk is a strong signal to the audience of the presenter's competence and expertise.

3. Adaptability (not really used by MTT, since his was not a talk with audience interaction). This is particularly important in teaching technical material, I think: allowing the students to ask questions and see the answers come from the techniques that we're teaching them is a lot better than just showing them slides. (Of course real learning happens when the students do the work themselves, but this kind of demonstration helps begin the process and motivates them to act.)

The supporting materials were superbly chosen and executed. Credit here is due to a large supporting cast for MTT: this presentation uses materials and skills from the education and multi-media presence of the San Francisco Symphony, an organization whose main business is performing. But here are five important lessons that these materials illustrate:

1. No bullet points, and few words (mostly as subtitles for foreign language). The projected materials (including a large camera shot of MTT when no other materials are using the screen) are there to support what MTT is saying, not to remind MTT of what he wants to say.

2. The production values of the materials are professional (you can assess their quality on the 720p video) and that signals that this presentation is important to MTT, not something put together in the flight down, between checking email and imbibing airline liquor.

3. MTT's presentation never mentions the support, only the content: he doesn't say "this slide shows a photo of my father," he tells the story of discussing music with his father as the photo appears on screen. The photo is a support for the narrative instead of the narrative taking a detour to acknowledge the technology and the specifics of the material that is supporting it.

4. The interaction between materials, speech, and piano playing was choreographed in advance, with the video producer knowing which shots to use at each time. This comes from the extensive documentary and educational work of the San Francisco Symphony under MTT, but to some extent can be replicated by presenters of more technical material if they take the time to think of their presentation as a series of "cuts" in a video production.

5. It's not on the video, but it's obvious from the fluidity of the speaking, piano playing, and video materials that this talk was carefully planned and thoroughly rehearsed. That's not surprising: after all, a dress rehearsal is nothing new to a performing artist, and MTT clearly saw this talk as a performance. Most presenters would benefit from seeing their talks as performances (once they get the content part well taken care of, obviously).

The speech was well structured, with a strong opener and closer, repetition of the key points with different phrasing at the bridge points, and with the right mix of entertainment and education that is expected of a TED talk.

MTT had a teleprompter at his feet and notes on top of the piano, which in the video appear to include a couple of lines of music score, possibly as a reminder of the harmonic evolution he demonstrates at timecode 5:28 to 6:02. Many presenters are afraid that using speaker notes makes them look unprepared or "just reading their speech." This is an erroneous attitude for five reasons:

1. Expertise can be demonstrated in different ways, like MTT playing the piano. And as a general rule, the audience will have some idea of the expertise of the presenter, established ahead of time by other means.

2. Open discussion or question and answer periods allow the speaker to wow the audience with his or her ability to extemporize. (As a general rule, I suggest speakers prepare notes on some of the more likely questions that may need some thinking ahead, but not read them verbatim.)

3. Reading a speech is a difficult skill; most people can't do it correctly. Even when I write a speech for myself, I find that I also make notations on it and end up using it more as a psychological crutch than an actual speech to read. It's fairly obvious that MTT is not reading the speech verbatim.

4. Even if MTT is partially reading a prepared speech, it's most likely one that he had a big input in writing. Other than celebrities, politicians, and CEOs, most presenters will have written their speeches, and most audiences will expect that they did.

5. Ironically, many people who look down on unobtrusive speaker notes or teleprompters put their speaker notes on the screen as bullet points, confusing the materials that are there to help the speaker (notes) with the materials that are there to help the audience process the presentation (visual support).

The material MTT covers meshes with music history, so he uses stories and storytelling as the main text form. Stories are one of the six tools for memorability the Heath brothers recommend in the book Made To Stick, and they work very well here. MTT also uses what Edward Tufte calls the P-G-P approach to exposition: presenting a Particular case first, then making a General point, then capstoning that point with another Particular example.

Dancing and singing aren't common techniques in presentations, but MTT uses them to great effect at timecode 2:24. In other presentations some acting or character impressions can be used for the same purpose: break the solemnity of the occasion, signal that you take the subject seriously but you don't take yourself too seriously, or to bridge topics.

(On a video that's no longer available online, John Cleese of Monty Python keeps interrupting his own presentation on creativity techniques with "How many X does it take to change a light bulb" jokes, as a way to give the audience breaks. And those jokes are part of a running arc that he established at the beginning of "there's no real training for creativity so I might as well spend my time telling jokes.")

Personally I don't recommend singing, dancing, or telling jokes in a talk unless you are a professional singer, dancer, or comedian, and even so only sparingly. Note that MTT did it for a very specific and memorable point: that a "piece of 18th Century Austrian aristocratic entertainment" turned into the "victory crow of [a] New York kid," and that's the atemporal power of music.

And as a closer, MTT rehashes the opening theme "what and how" and adds a cornerstone "why," ending on a good note and high energy. It's always important to have a strong closer, almost as important as a good opener.

Two minor observations:

1. MTT should have had a sip of water right before the talk and sloshed it around his mouth and lips, to avoid that smacking sound when he speaks. That sound is created by dryish areas in the mouth letting go at inappropriate times; sloshing the water solves it, drinking doesn't.

2. I assume that MTT's fleece was chosen to match his clothes and accessories, but he could have one custom-made in that color with the logo of the San Francisco Symphony. Maybe this is my crass commercialism rearing its ugly head, but why not flaunt the brand?

The worst of talks: a presenter who will remain anonymous at an undisclosed conference.

For clarity of exposition I'll call the presenter EF, for "Epic Fail," and use the pronoun "he" without loss of generality over gender.

EF started his presentation with a classic: computer trouble.

EF's talk was the last in a four-talk session; the other three presenters had installed their presentations in the podium computer during the break before the session, but EF did not. An alternative to using the podium computer would be to connect his laptop and test the setup during the pre-session break. A third possibility would be to connect his computer while the previous presenter was taking questions from the audience; personally I find this disruptive and avoid it, but it's better than what happened.

And what happened was that after four minutes of failed attempts to connect his computer to the podium (out of a total time per speaker of twenty minutes, including the Q&A period), EF asked the audience for a flash drive so he could transfer his presentation to the podium computer.

The presentation started after six minutes of unnecessary computer-related entropy.

The room where this happened was an executive education classroom, with U-shaped seating, two projection screens side-by-side at the front and large flat screen TVs on the side walls so that the people on the straight part of the U could look at them instead of the front screens. These TVs also serve as a way for the presenter to see what's on screen while looking towards the audience.

Which is why everyone was puzzled when EF walked to one side of the front screens, turned his back to the audience, and started talking in a monotone, while -- apparently -- clicking the remote at random. Really: he moved his slides up and down, apparently at random and at high speed, maybe one second on screen per slide, and without any connection to what he was saying.

But that's fine, because what he was saying was also disconnected within itself. In fact, I don't think he had any idea -- let alone a clear idea -- of what he wanted the audience to take away from the talk.

As far as I could gather -- from reading the abstract about four times, until I made some sense of it by writing a modal logic model of the essential words therein and crossing out the 90% of words that were filler -- there's a well-established phenomenon that is observable in a series of measures $X(p)$ as we vary the parameter $p$. The presentation was about changing the parameter space from $P_1$ to $P_2$, with $P_1 \subset P_2$. All tests in the literature concern themselves with the effects measured in $P_1$, and this paper tests the effects in $P_2$. None of this was clear in the abstract or the presentation.

One of the slides that was on-screen several times, for about four seconds at a time, showed a table with the results from the literature, that is, $X(p), p\in P_1$. Every time EF wanted to say something about these results, he moved several slides up and down, looking for the bullet point he wanted -- a point about the table that he had therefore removed from the screen. But that's not the worst of it.

After spending ten minutes explaining to an audience of experts in the subject matter a well-known point in the field of their expertise, EF glossed over details of his measurement technique, experimental procedure, and data processing, and presented his table of $X(p), p\in P_2$.

Without the $X(p), p\in P_1$ values for comparison.

Let me repeat that: he presented his results, which are to be compared and contrasted to the established results, on a separate table. Now, the phenomenon is well-established, but this is a table of numbers with three or four significant digits, so the details aren't that easy to recall. They are even harder to recall when EF keeps changing slides to look for bullet points about this table, again removing the table from the screen. Let me also point out that these are about 12 rows of 2 numbers per row, 4 with the comparison, well within the capacity of a one-slide table.

Every so often EF would stop abruptly in the middle of a sentence and silently move his slides up and down looking for something, then start a whole new sentence, without stopping the up-and-down movement of the slides.

But the clincher, the payoff after this painful exercise?

EF had no conclusions. His team was still analyzing the data, but so far it appeared that there was no change at all from the well-established phenomenon.

Now, in many fields, showing that a well-established phenomenon applies beyond the boundaries of the previous experiments is a valuable contribution. But in this case the expansion from $P_1$ to $P_2$ was trivial at best.

At this point, about four minutes over time, EF invited the audience to ask questions. There were no takers, so EF asked one of the audience members (presumably an acquaintance) what he thought of some minor detail that EF had not actually talked about. The audience member said something noncommittal, and EF pressed the point, trying to get a discussion going. The rest of the audience had packed and was ready to leave, but EF paid them as much attention during this failed attempt at a dialog as he had during his failed attempt at a presentation.

I was told later by another attendee that this presentation was not atypical for EF.

(Suggestions for improvement? I wrote a post about preparing presentations before.)

Coda: An unfair comparison, perhaps?

MTT is a performing artist, a showman by profession. The presentation he delivered was designed by a support team of graphic artists, cinematographers, and writers: it fits within the education efforts of the San Francisco Symphony. MTT's audience is mostly there for entertainment and positively predisposed towards the celebrity presenter. His material is naturally multi-media, interactive, and pleasant, requiring very little effort on the audience's part to process. And, let's not forget, the presentation event itself was a team effort -- MTT is not operating the video screen or the teleprompter at his feet.

EF is a researcher and a professor. His presentation was designed by him, an untrained presenter (obvious from the talk), and delivered to an academic audience: hard to impress, critical, and possibly even hostile. His material is technical, dry, and requires significant effort (even in the best circumstances) to process and follow. He didn't have a teleprompter (though he could have speaker notes had he chosen to) nor a presentation support team.

So, yes, it seems that I'm being unfair in my comparison.

Except that there were, in that very same conference, three keynote speakers with equally dry, non-multimedia, hard to process material, who did a great job. They varied a lot in style and delivery but all made their points clear and memorable, kept their presentations moving along, and didn't use their projected materials as a crutch.

Above all, they had something interesting and important to say, they knew precisely what it was, and they made sure the audience understood it.

Sunday, April 22, 2012

Frequentists, Bayesians, and HBO's "Girls"

Yielding to pressure, I watched the first episode of HBO's "Girls" on YouTube — well, the first ten minutes or so. The experience wasn't a total waste: I got an example of the difference between frequentists and Bayesians from it.

The protagonist, whose name I can't remember (henceforth "she"), has an unpaid internship that she took on the expectation of a job. She doesn't get the job and it's implied that there never was a job.*

Given that there's only that one data point, frequentists would have to decline to draw any conclusion regarding the existence of the potential job. (The point estimate would be irrelevant without a variance for that estimate.)

Bayesians have a different view of things.

There are two possible states of the world: boss told the truth ($T$) or boss lied ($L$). There are two possible events: she gets a job ($J$) or she doesn't ($N$).

Without any information about the boss, we'll assume that the probability of truth or lie before any event was observed (that is, the a priori probability) is

$\Pr(T) = p_0 = 1/2$,
$\Pr(L) = 1-p_0 = 1/2$.

The $1/2$ is the maximum entropy assumption for $p_0$, meaning we are maximally uncertain about the truthfulness of the boss.
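That "maximum entropy" claim is easy to verify numerically: the entropy of a Bernoulli($p$) variable, $H(p) = -p\log p - (1-p)\log(1-p)$, peaks at $p = 1/2$. A quick sketch in Python (illustrative only, not from the original post):

```python
import math

def bernoulli_entropy(p):
    """Entropy (in nats) of a Bernoulli(p) random variable."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

# Scan a grid of probabilities and find the entropy-maximizing p.
grid = [i / 1000 for i in range(1001)]
best_p = max(grid, key=bernoulli_entropy)

print(best_p)                      # 0.5
print(bernoulli_entropy(best_p))   # log(2), about 0.693
```

So among all priors on {truth, lie}, $p_0 = 1/2$ is the one that commits us to the least.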

If the boss lied, then we can never observe event $J$,

$\Pr(J|L) = 0$,
$\Pr(N|L) = 1$.

If the boss told the truth, and there was in fact a potential job, she might still not get the job, as she might be a bad match. Given no other information, we again make the maximum-entropy assumption, now for the conditional probabilities:

$\Pr(J|T) = 1/2$,
$\Pr(N|T) = 1/2$.

We can now determine the probability that the boss was telling the truth:

$\Pr(T|J) = 1$,
$\Pr(T|N)  = \frac{\Pr(N|T)  \Pr(T)}{\Pr(N|T) \Pr(T)+\Pr(N|L) \Pr(L)}= \frac{1/4}{1/4 + 1/2} = 1/3.$

Since she didn't get the job, there's a $2/3$ chance that there was never any job.
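The whole calculation fits in a few lines of code; here's a sketch (the variable names are mine, not from any library):

```python
# Prior over states of the world: boss told the truth (T) or lied (L).
prior_T, prior_L = 0.5, 0.5

# Likelihood of observing "no job" (N) under each state.
p_N_given_T = 0.5   # a real job existed, but she was a bad match
p_N_given_L = 1.0   # no job ever existed, so N is certain

# Bayes' rule: Pr(T | N).
posterior_T = (p_N_given_T * prior_T) / (
    p_N_given_T * prior_T + p_N_given_L * prior_L
)

print(posterior_T)   # 1/3, hence a 2/3 chance the boss lied
```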

Note that there's really no magic on the Bayesian side; we bring a lot of baggage to the problem with the a priori and conditional probabilities. But in doing so we make our assumptions and ignorance explicit, which allows us to make inferences.

It's not magic, it's Bayes.

-- -- -- --
* I was going to write "SPOILER ALERT," but then I realized there's no way to spoil the show more than it already is...

Counterintuitive solution for being a late chronotype

Hi. I'm Joe and I'm a late chronotype.

A late chronotype is someone whose energy level, after waking up, increases more slowly than the average person's; also known as "not a morning person." Typically this slow start is balanced by high levels of energy in the evening, when other people are crashing. (Panel I below depicts this for illustration.)

Mismatching Chronotypes

Many late chronotypes believe that the solution to their problem is to sleep late. That is exactly the wrong approach. The problem of being a late chronotype is that our level of energy doesn't match everyone else's. Starting the day later only worsens the mismatch (as illustrated in panel II).

The solution, which may sound counter-intuitive, is to get up much earlier than everyone else, therefore reaching peak energy at the same time as everyone else (as shown in panel III).

I have used a number of approaches to manage being a late chronotype (caffeine, no breakfast, exercise, ice-cold morning shower), but none was ever as effective as being on Boston time while living in California.

Monday, April 2, 2012

Bundling for a reason

There's much to dislike about the current monetization of television shows, but bundling isn't necessarily a bad idea for the channels.

On a recent episode of The Ihnatko Almanac podcast, Andy Ihnatko, talking about HBO pricing and release schedule for Game Of Thrones (which he had blogged about before), said that a rule of commerce is "when customers have money to give you for your product, you take it" (paraphrased). I don't like to defend HBO, but that rule is incomplete: it should read "take it as long as it doesn't change your ability to get more money from other customers."

An example (simplified for clarity, but it captures the reason HBO bundles content):

In this example HBO has three shows: Game of Thrones, Sopranos, Sex and the City; and there are only three customers in the world, Andy, Ben, and Charles. Each of the customers values each of the shows differently. What they're willing to pay for one season of each show is:

$ \begin{array}{lccc}
 & \mathrm{GoT} & \mathrm{Sopranos} & \mathrm{SatC} \\
\mathrm{Andy} & 100 & 40 & 10 \\
\mathrm{Ben} & 40 & 10 & 100 \\
\mathrm{Charles} & 10 & 100 & 40
\end{array} $

HBO can sell each of them a subscription for $\$150$/yr. Or it can price each show at $\$100$ and get a total of $\$100$ from each customer (any other price is even worse). This is the standard rationale for all bundling: take advantage of uncorrelated preferences.

By keeping the shows exclusively on their channel for a year, they get to realize those $\$150$ from the "high value" customers. After that, HBO sells the individual shows to make money off of people who don't value the HBO channel enough to subscribe (people other than Andy, Ben, or Charles above). This is standard time-based price segmentation.
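The arithmetic behind this example can be verified directly (assuming, as the example does, uniform take-it-or-leave-it pricing):

```python
# Willingness to pay (per season) from the post's example.
wtp = {
    "Andy":    {"GoT": 100, "Sopranos": 40,  "SatC": 10},
    "Ben":     {"GoT": 40,  "Sopranos": 10,  "SatC": 100},
    "Charles": {"GoT": 10,  "Sopranos": 100, "SatC": 40},
}
shows = ["GoT", "Sopranos", "SatC"]

# Bundle: every customer values the full set at exactly 150,
# so HBO can charge 150 and sell to all three.
bundle_revenue = sum(150 for v in wtp.values() if sum(v.values()) >= 150)

# Unbundled: at a uniform per-show price, a customer buys a show
# iff their willingness to pay meets the price.
def unbundled_revenue(price):
    return sum(price for v in wtp.values() for s in shows if v[s] >= price)

# Only the observed willingness-to-pay values can be optimal prices.
best_price = max({v[s] for v in wtp.values() for s in shows},
                 key=unbundled_revenue)

print(bundle_revenue)                              # 450
print(best_price, unbundled_revenue(best_price))   # 100 300
```

Bundling nets $\$450$ against at most $\$300$ from per-show pricing -- the uncorrelated preferences do all the work.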

This is not to say that HBO and other content providers won't have to adapt; but their release schedule is not just because they're old-fashioned.

Friday, March 9, 2012

Curse of the early adopter: no new iPad for me

I probably won't buy the new iPad (a/k/a iPad 3) because I tend to be an early adopter of Apple's products.

Yes, you read that right. I tend to adopt Apple products early, buying the first generation of a product line. This makes me question the wisdom of upgrading when new generations are released:

  • Does the new generation add enough incremental happiness to justify the expense? 
  • Shouldn't I wait for the next new generation and get an even larger increase in happiness?

The decision parameters are illustrated in the following figure:

Deciding whether to buy the new iPad

The comparative statics of that picture are: as the difference in happiness from the old iPad to the new iPad, DH1, increases, I'm more likely to buy the new iPad; as the difference in happiness from the next new iPad to the current new iPad, DH2, increases, I'm less likely to buy the new iPad; and the longer the period between the new iPad and the next new iPad, T, the more likely I am to buy the new iPad.

Of course, I don't really know these quantities, except T: the time between iPad refreshes seems to be one year, give or take, and I tend to skip every other generation in all products anyway, so T equals two years.
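One hypothetical way to formalize those comparative statics (my toy model; the post only supplies DH1, DH2, and T): treat buying now as worth DH1 per year until the next refresh, treat DH2 as the regret of not waiting, and buy when DH1 × T − DH2 clears the upgrade's cost in happiness units. Every number below is invented for illustration.

```python
def buy_new_ipad(dh1, dh2, t_years, cost):
    """Toy decision rule matching the post's comparative statics:
    more likely to buy as dh1 or t_years grow, less likely as dh2 grows."""
    net_value = dh1 * t_years - dh2
    return net_value > cost

# Purely illustrative numbers -- none of these appear in the post.
print(buy_new_ipad(dh1=15, dh2=8, t_years=2, cost=15))   # True
print(buy_new_ipad(dh1=10, dh2=8, t_years=2, cost=15))   # False: DH1 too small
```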

To evaluate DH1 and have some forecast of DH2 I need to consider how I use my iPad now and how that might change with the new iPad. I also need to consider what incremental changes DH2 could include. (If I determine that DH1 is not high enough, I don't need to worry about DH2, so I'll start with that.)

What I do with my iPad now is consume content; the new iPad seems to be targeted at creating it, but I don't know whether that would work for me. Here's the content I create and its relation to the iPad:

Technical papers in LaTeX. There are a few apps that allow me to edit formulas on the iPad (and one that uses Dropbox and a paired app on a laptop to allow the user to create actual documents), but in general LaTeX is not a good fit for the iPad.

R code. There's no R for iPad and apparently there won't be in the near future. I could edit the code in a word processor on the iPad and run it on a laptop, but that's of minor value to me. (I code in other languages and use Mathematica and Stata too, but the overwhelming volume of programming I do is in R.)

Presentations. I own iWork for the iPad and have tried to use it for presentation design, but I find I  require higher-powered tools: even on a laptop I make most of my slides with Illustrator and Photoshop. The image above was made with Keynote (on a laptop), but that's not up to my presentation standards. I do like how easy Magic Move makes creating simple animations. I can and do use Pages on the iPad to outline presentations.

Teaching materials. I can certainly create some teaching text, but again the drafting and page layout tools I use are not available for the iPad. I find the process of making spreadsheets on the iPad cumbersome, but I use the iPad version of Numbers to create and fill forms to keep track of students (important for participant-centered learning).

There are other types of content that I could create with the new iPad: videos, photos, and music. I think these may have their value, but not for me. I take photos with a DSLR and tweak them with Photoshop, make videos (mostly of my presentations, classes, and exec-ed) with a Kodak Zi8, and make music when I play piano.

What about consumption? Perhaps that's where I can find a reason to upgrade. I mostly do three (and a half) things with the iPad:

Reading: Kindle books, iBooks, Instapaper, and PDFs. By far the most time I spend using the iPad is spent reading. A retina display might make a difference, but I tend to make the type very large anyway. Perhaps new iBooks will make it worthwhile to upgrade, but that suggests I should wait for the next new generation and the books that will be available then.

Browsing the web. I do this usually while watching/listening to TV, mostly to deplete my RSS monster, check Twitter and Facebook using Flipboard, and check out forums. I feed a lot of content into my Tumblr blogs (personal and teaching) and Instapaper this way. The extra speed would be nice, but the binding constraint so far appears to be the low speed of my home internet connection.

Email. I check, and usually process to zero, my morning emails even before getting out of bed. I prefer to read my email on the iPad and compose it on a laptop. (It's true that a bluetooth keyboard would probably make composing the email much easier. But that would negate the compactness and self-containment of the iPad.)

Games. I seldom play games (computer or otherwise; ironic that I'm a game theorist), but, when I do, the only ones I play are the ones on my iPad: Solitaire, Mahjong, and Crosswords. Usually while listening to audio podcasts or television.

I also occasionally use the iPad to finish watching a Netflix movie in bed, to read Marvel comic books, or to watch video podcasts on a stationary exercise machine (like a recumbent bicycle). These are minor uses and don't influence the decision.*

Which, unsurprisingly, is to not buy the new iPad.

Of course there's always the possibility that the decision will come down to an emotional death match between gadget lust and the no-nonessential purchases rule, all logic above be damned.

* I listen to music, audiobooks, and audio podcasts on my first-generation iPod Nano or my first-generation iPod Touch. My first-generation iPod – yes, the one with the mechanical clickwheel – was recently decommissioned, after ten years of service, due to hard drive failure (its earlier battery death was circumvented by using it as a home MP3 player, feeding my stereo and powered by its AC adapter).

Saturday, March 3, 2012

Screen interactions to avoid wasting time

Sometimes the winning move is to make the game go away.

Like other scientists who appeal to a popular audience, Richard Feynman corresponded with a number of cranks; some of this correspondence is available in the book Perfectly reasonable deviations from the beaten track: The letters of Richard P. Feynman.

On pages 129-134 of the hardcover, Feynman addresses a Mr. Y, who believes the "Physics establishment" to be wrong about relativity and he, Mr. Y, to be right. Mr. Y clearly knows very little physics.

In his second reply to Mr. Y, Feynman includes a problem, slightly modified from an undergraduate class, under the guise of clarifying the source of their disagreement. Once Mr. Y's next letter fails to answer the problem, Feynman excuses himself from the conversation by explaining that, without knowing what Mr. Y's theory predicts in that problem, there's no way to determine the source of their (Feynman and Mr. Y's) differences.

I was reminded of this story when, some days ago, I employed a similar screening device to avoid getting drawn into a lunchtime argument with an ignoramus. (Details and domain changed.)

Ignoramus: It's clear that we need to do Action 1 because the average of Variable X is increasing.

Me (thinking): Did Naan & Curry stop including Palak Paneer in the lunch buffet or have they just run out?

Ignoramus: Don't you agree that people who don't want to do Action 1 are anti-scientific?

Me: I was thinking... I'm not sure, but let me just get the details right: the average of Variable X is increasing, you say. How is that average computed, precisely? I mean, there are parts of the Domain of Variable X that are volumes and parts that are areas. So, how does one compute an average over two domains with different dimensions?

(If you assert that Science is on your side, you'd better know what you're talking about. Otherwise, you're just parroting whomever convinced you last and it's a waste of my time to talk to you.)

Ignoramus: I don't follow.

Me: Well, if you have an average of $X_1$ per $m^2$  over Domain 1 and an average of $X_2$ per $m^{3}$ over Domain 2, how do you combine that?

Ignoramus: I don't know. Why is that important? Everyone agrees that the average of Variable X is increasing, except the anti-scientific. Are you anti-scientific?

Me: I have no opinion over a quantity that I cannot define. How are the averages of X/volume and X/area combined? They have different dimensions, so you can't add them.

Ignoramus: But everyone knows that the average of Variable X is increasing.

Me: So, let me get this straight: you cannot define the quantity "average of Variable X" in a precise way, but you're sure it's increasing?

Ignoramus: I'm sure the experts know how to do that.

Me: But how can it be possible to average two quantities, one that is defined in X per square meters and one that is defined in X per cubic meters? That's a mathematical impossibility.

Ignoramus: But the experts agree.

Me: I just can't see how you can believe that a quantity is so important, have such strong opinions about its trend, the implications of that trend, and the people who disagree with those implications — and at the same time have no idea how the quantity is computed.

Ignoramus (sulking): All the experts agree.

Me: I cannot express an opinion over a quantity I don't understand. Perhaps someone who knows this matter better than you do will be able to explain it to me and then I'll be able to form an opinion.

This exchange captures the basic problem of the Ignoramus: a little knowledge is a dangerous thing. It also illustrates the power of screening questions to stop people from wasting my time.
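The mathematical point -- quantities with different dimensions can't be added, let alone averaged -- is exactly the kind of thing a units-aware program catches automatically. A minimal sketch (my own toy class, not a real units library):

```python
class Quantity:
    """A number tagged with an exponent of the length dimension."""
    def __init__(self, value, length_exp):
        self.value = value
        self.length_exp = length_exp   # -2 for X per m^2, -3 for X per m^3

    def __add__(self, other):
        # Addition is only defined for dimensionally identical quantities.
        if self.length_exp != other.length_exp:
            raise TypeError("cannot add quantities with different dimensions")
        return Quantity(self.value + other.value, self.length_exp)

x_per_area   = Quantity(3.0, -2)   # X per square meter
x_per_volume = Quantity(5.0, -3)   # X per cubic meter

same_dims = x_per_area + Quantity(1.0, -2)   # fine: same dimensions
try:
    x_per_area + x_per_volume                # the Ignoramus's "average"
except TypeError as e:
    print(e)   # cannot add quantities with different dimensions
```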

Saturday, February 18, 2012

Analysis of the Tweets vs. Likes at the Monkey Cage

I find the question of what posts are more likely to be tweeted than liked a little strange; ideally one would want more of both.

The story so far:  a Monkey Cage post proposed some hypotheses for what characteristics of a post made it more likely to be tweeted than liked. Causal Loop did the analysis (linked at the Monkey Cage) using a composite index. Laudable as the analysis was (and a sign of how different Political Science is from the 1990s), I think I can improve upon it.

First, there are 51 (of 860 total) posts with zero likes and zero tweets. This is important information: these are posts that no one thought worthy of social media attention. Unlike Causal Loop, I want to keep these data in my dataset.

Second, instead of a ratio of likes to tweets (or more precisely, an index based on a modified ratio), I'll estimate separate models for likes and tweets, with comparable specifications. To see the problem with ratios consider the following three posts:

Post A: 4 tweets, 2 likes
Post B: 8 tweets, 2 likes
Post C: 400 tweets, 200 likes

A ratio metric treats posts A and C as identical, while separating them from post B. But intuitively we expect a post like C, which generates a lot of social media activity in aggregate, to be different from posts A and B, which don't. (This scale insensitivity is a general characteristic of ratio measures.) This is one of the reasons I prefer disaggregate models. Another reason is that adding Google "+1"s would be trivial to a disaggregate model -- just run the same specifications for another dependent variable -- and complex to a ratio-based index.
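The scale insensitivity is easy to see directly:

```python
# (tweets, likes) for the three hypothetical posts above.
posts = {"A": (4, 2), "B": (8, 2), "C": (400, 200)}

for name, (tweets, likes) in posts.items():
    print(name, tweets / likes, tweets + likes)
# A 2.0 6      -- same ratio as C...
# B 4.0 10
# C 2.0 600    -- ...but a hundred times the activity
```

The ratio collapses A and C into one point while separating B; total activity does the opposite. Two disaggregate models keep both pieces of information.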

To test various hypotheses one can use appropriate tests on the coefficients of the independent variables in the models or simulations to test inferences when the specifications are different (and a Hausman-like test isn't conveniently available). That's what I would do for more serious testing. With identical specifications one can compare the z-values, of course, but that's a little too reductive.

Since the likes and tweets are count variables, all that is necessary is to model the processes generating each as the aggregation of discrete events. For this post I assumed a Poisson process; its limitations are discussed below.
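For readers who haven't fit one: a Poisson regression models $\log E[y] = \beta_0 + \beta_1 x$ and maximizes the Poisson log-likelihood. Here's a self-contained pure-Python sketch on made-up counts (the real Monkey Cage data isn't reproduced here; in practice Stata's poisson or R's glm(..., family = poisson) does this work):

```python
import math

def poisson_newton(x, y, iters=50, tol=1e-10):
    """Fit log E[y] = b0 + b1*x by Newton's method on the
    Poisson log-likelihood. Minimal sketch, one covariate."""
    b0 = b1 = 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Negative Hessian (observed information) entries.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Invented counts: posts with a graphic (x=1) average 4.5 likes,
# posts without (x=0) average 1.5. None of this is from the dataset.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1, 2, 1, 2, 4, 5, 4, 5]
b0, b1 = poisson_newton(x, y)
print(round(math.exp(b0), 3), round(math.exp(b0 + b1), 3))   # 1.5 4.5
```

With a binary covariate the model is saturated, so the fitted means recover the group means exactly -- a useful sanity check on the code.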

I loaded Causal Loop's data into Stata (yes, I could have done it in R, but since the data is in Stata format and I still own Stata, I minimized effort) and ran a series of nested Poisson models: first with only the basic descriptor variables (length, graphics, video, grade level), then adding the indicator variables for the authors, then adding the indicator variables for the topics. The results for the models with all variables included (click for bigger):

Determinants of likes and tweets for posts in The Monkey Cage blog

A few important observations regarding this choice of models:

1. First and foremost, I'm violating the Prime Directive of model-building: I'm unfamiliar with the data. I read the Monkey Cage regularly, so I have an idea of what the posts are, but I didn't explore the data to make sure I understood what each variable meant or what the possible instantiations were. In other words, I acted as a blind data-miner. Never do this! Before building models always make sure you understand what the data mean. My excuse is that I'm not going to take the recommendations seriously and this is a way to pass the morning on Saturday. But even so, if you're one of my students, do what I say, not what I just did.

2. The choice of Poisson process as basis for the count model, convenient as it is, is probably wrong. There's almost surely state dependence in liking and tweeting: if a post is tweeted, then a larger audience (Twitter followers of the person tweeting rather than Monkey Cage readers) gets exposed to it, increasing the probability of other tweets (and also of likes -- generated from the diffusion on Twitter which brings people to the Monkey Cage who then like posts to Facebook). By using Poisson, I'm implicitly assuming a zero-order process and independence between tweets and likes -- which is almost surely not true.

3. I think including the zeros is very important. But my choice of a non-switching model implies that the difference between zero and any other number of likes and tweets is only a difference of degree. It is possible, indeed likely, that it is a difference of kind or process. To capture this, I'd have to build a switching model, where the determinants of zero likes or tweets were allowed to be separate from the determinants of the number of tweets and likes conditional on their being nonzero.

With all these provisos, here are some possible tongue-in-cheek conclusions from the above models:
  • Joshua Tucker doesn’t influence tweetability, but his authorship decreases likability; ditto for Andrew Gelman and John Sides. Sorry, guys.
  • James Fearon writes tweetable but not likable content.
  • Potpourri is the least tweetable tag and also not likable; International relations is the most tweetable but not likable; Frivolity, on the other hand, is highly likable. That says something about Facebook, no?
  • Newsletters are tweetable but not likable… again Nerds on Twitter, Airheads on Facebook.
As for Joshua Tucker's hypotheses, I find some support for them from examining the models, but I wouldn't want to commit to supporting or rejecting them before running some more elaborate tests.

Given my violation of the Prime Directive of model building (make sure you understand the data before you start building models), I wouldn't start docking the -- I'm sure -- lavish pay and benefits afforded by the Monkey Cage to its bloggers based on the numbers above.