Wednesday, March 30, 2016

Three cardinal sins of presenting

Observations from yet another terrible talk.

(To protect the guilty, the presenter will be called "Epic," short for "Epic Fail II," and without loss of generality will be referred to with masculine pronouns.)

Epic committed three cardinal sins of presentations (there are more than three and some of the others were present in the terrible talk), in increasing order of badness:

The sin of humming: 

"Hum... like... basically..." were Epic's most common words. Or sounds, more precisely, because that's what they are. Sounds that Epic made as his brain composed the sentence that was to come.

This is the main problem of using slides-as-presenter-notes, though it also happens to presenters who have separate "talk skeleton" notes and don't rehearse a few times: bullet points aren't feasible out-loud sentences, so, to unprepared presenters, they act as stumbling blocks rather than helpful hints.

Some people are very articulate; some can be articulate from notes; most of the others need to do at least one run-through of the notes, preferably to camera so they can review it. The camera is essential, as without feedback there's little improvement.

Humming is a sign the presenter didn't care enough for the audience to rehearse his presentation.

The sin of non-preparedness:

Like most presenters, Epic seems to have created his presentation in a small fraction of the presentation time. That's usually a recipe for disaster. While some people can make good presentations impromptu or quasi-impromptu, most presenters should prepare carefully.

Epic's presentation had no clear objectives, no clear structure, and above all, no clear arguments. For comparison, there was another presenter at the conference who, in order to explain a programming philosophy, created a motivating example based on refactoring a cookbook.

The procedure for preparing isn't complicated: decide what the presentation objectives are; decide how they sequence into each other; devise ways to explain these objectives; assemble the presentation; rehearse.

Epic skipped all these stages except the assembling of the presentation as a sequence of presenter-notes-on-slides, and even that without actually thinking much about what each point meant. Epic didn't think about the phrasing of the points (see previous sin), let alone consider how to best explain them to the audience.

Good presentations begin in the preparation; bad presentations in the lack of it.

The sin of self-absorption:

The audience was promised, and therefore expected, a technical talk about a technical tool. Epic delivered a presentation about Epic: Epic's education (really, a CV slide and multiple name-drops of Epic's school, Epic's degree, Epic's degree advisor); Epic's actions ("I did this," "I found that," not "data show" or "tool does this"); Epic's performance on Epic's job (via repeated references to a sort of limited-field contest/competition, to which the audience groan was the only appropriate answer).

Two other presenters in the same session described highly technical tools, barely ever using the first person, talking about the tools, offering interesting if technically challenging knowledge. That's because, unlike Epic, they understood that the audience wasn't there to learn about the presenters' lives, but rather about the tools.

Epic, like many terrible presenters, bought into the idea that every presentation has to be a story (more or less right, even for a technical audience) about the presenter (absolutely wrong, unless you're presenting an autobiography).

Audiences don't like bait-and-switch: deliver what was promised, not what you like.

Many talks are bad, and that's a choice made by the presenter.

Saturday, March 12, 2016

Read before writing

A quick refresher this morning before tackling a writing task in the afternoon.

A quick read of my notes on these two books always helps focus my attention for any writing task.

I make a point of re-reading Zinsser's book in its entirety at least once a year. It takes but a couple of hours and is the best 'writing skills preventative maintenance' I can think of. It's also worth re-reading my notes prior to any major writing task, which is why I'm doing it today. I think of it as 'pre-flighting my writing skills'.

Before any major writing task, I go over Strunk & White's rules so that they're fresh in my mind as I write. That helps cut down on editing time later.

-- -- -- --

For the terminally lazy: Amazon links to On Writing Well and The Elements Of Style. (I would make them affiliate links, but I too am lazy.)

Saturday, March 5, 2016

Powerlifters vs Gym Rats - A tale of two means

In my last post I wrote:

For example, some time ago I had a discussion with a friend about strength training. The gist of it was that powerlifters are typically much stronger than the average athlete, but they are also much fewer; because of that, in a typical gym the strongest athlete might not be a powerlifter, but as we get into regional competitions and national competitions, the winner is going to be a powerlifter.

And the explanation, which the friend didn't understand, was "because on the upper tail the difference between means is going to dominate the difference in sizes of the population."

So here's an illustration of what I meant, with pictures and numbers and bad jokes.

First let's make the setup explicit. That's the great power of math and numerical examples, making things explicit. "Powerlifters are typically much stronger than the average athlete" will be operationalized with four assumptions:
A1: There's some composite metric of strength, call it $S$, that we care about, and we'll normalize it so that gym rats have a mean strength $\mu(S_{\mathrm{GR}})$ of zero and a variance of $1$.
A2: The distribution of strength within the population of gym rats is Normally distributed. 
A3: The distribution of strength in the sub-population of powerlifters is also Normally distributed. 
A4: For illustration purposes only, we will assume that powerlifters have a mean $\mu(S_{\mathrm{PL}})$ of 2 and the same variance as the rest of the gym rats.
We operationalize "they are also much fewer" with
A5: For illustration, the number of powerlifters is $1\%$ of gym rats.
(Powerlifters are gym rats, so the distribution for $S_{\mathrm{GR}}$ includes these $1\%$, balanced by CrossFit people, who bring down the mean strength and IQ in the gym while raising the insurance premiums. Watch Elgintensity to understand.)

The following figure shows the distributions:

When we look at the people in a gym with above-average strength, that is people with $S_{\mathrm{GR}}>0$, we find that one-half of all gym rats have that, and $98\%$ of all powerlifters have that: $\Pr(S_{\mathrm{GR}}>0) = 0.5$ and $\Pr(S_{\mathrm{PL}}>0) = 0.98$. This is illustrated in the next figure:

Powerlifters are over-represented in the above-average-strength group, approximately twice as much as in the general population, but they are only about $2\%$ of the total, as their over-representation is multiplied by $1\%$.

As we become more selective, the over-representation goes up. For athletes that are at least one standard deviation above the mean, we have:

with $\Pr(S_{\mathrm{GR}}>1) = 0.16$ and $\Pr(S_{\mathrm{PL}}>1) = 0.84$. Powerlifters are over-represented 5-fold, so about $5\%$ of the total athletes in this category.

When we become more and more selective, for example when we compute the number of gym rats that have at least as much strength as the average powerlifter, $\Pr(S_{\mathrm{GR}}>2)$, we get

with $\Pr(S_{\mathrm{GR}}>2) = 0.023$ and $\Pr(S_{\mathrm{PL}}>2) = 0.5$, a 22-fold over-representation, meaning that of every six athletes in this category, one is a powerlifter. (Yes, one out of six, not one out of five. See if you can figure out why; if not, look at the solution for $S>6$ below and you'll understand. Or not, but that's a different problem.)

And as we look at subsets of stronger and stronger athletes, the over-representation of powerlifters becomes higher and higher: $\Pr(S_{\mathrm{GR}}>3) = 0.00135$ and $\Pr(S_{\mathrm{PL}}>3) = 0.159$, a $118$-fold ratio. There will be a few more powerlifters in this group than other gym rats; another way to say that is that powerlifters will be a little bit more than one-half of all gym rats that are at least one standard deviation stronger than the average powerlifter.

The ratios grow exponentially with increasing values for strength (the rare correct use of "exponentially" as they are ratios of Normal distribution tail probabilities; see below).

For $S>4$ the ratio is $718$, for $S>5$ the ratio is $4700$, and for $S>6$ the ratio is $32\,100$; in other words, there will be one non-powerlifter per group of $322$ gym rats with strength greater than 6 standard deviations above the mean of all gym rats.
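For readers who want to reproduce these numbers, here is a minimal Python sketch. It assumes the setup in A1 to A5 (unit-variance Normals with means 0 and 2) and follows the arithmetic used for $S>6$, where the tail ratio is multiplied by $1\%$ to get powerlifters per other gym rat. The function names are mine and purely illustrative.

```python
import math

def upper_tail(x, mean=0.0):
    """Pr(S > x) for a Normal with the given mean and unit variance."""
    # erfc keeps precision far out in the tail, where 1 - cdf would not.
    return 0.5 * math.erfc((x - mean) / math.sqrt(2))

def tail_ratio(x, mean_pl=2.0, mean_gr=0.0):
    """Ratio Pr(S_PL > x) / Pr(S_GR > x): the over-representation factor."""
    return upper_tail(x, mean_pl) / upper_tail(x, mean_gr)

def powerlifter_share(x, pl_per_gr=0.01):
    """Fraction of athletes above x who are powerlifters.

    Assumption: one powerlifter per hundred other gym rats, and the
    mean-zero distribution describes the other gym rats.
    """
    pl_per_other = pl_per_gr * tail_ratio(x)
    return pl_per_other / (1.0 + pl_per_other)

if __name__ == "__main__":
    for threshold in range(7):
        print(
            f"S > {threshold}: "
            f"Pr(GR) = {upper_tail(threshold):.3g}, "
            f"Pr(PL) = {upper_tail(threshold, 2.0):.3g}, "
            f"ratio = {tail_ratio(threshold):.0f}, "
            f"powerlifter share = {powerlifter_share(threshold):.3f}"
        )
```

The share column is where the size of the population and the difference in means fight it out: it starts near $2\%$ and ends above $99\%$.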

This is what the effect of the differences in the tails of Normals always implies: eventually the small size of the better population (powerlifters) will be irrelevant as the higher mean will dominate.

See? That wasn't complicated at all.

-- -- -- --

For the mathematically inclined (strangely themselves over-represented in the set of powerlifters...)

Note that the ratio of probability density functions for the two Normal distributions in the post, for realizations of strength $S = x$ is
$$\frac{f_{S}(x|\mu_{S}=2)}{f_{S}(x|\mu_{S}=0)}= \frac{e^{-(x-2)^2/2}}{e^{-x^2/2}}= e^{2x-2}$$
which grows unbounded with $x$; no matter how small the fraction of powerlifters, say $\epsilon$, there's always a minimal $\bar S$ beyond which that ratio becomes greater than $1/\epsilon$. Which means that at some point above $\bar S$ the ratio of the remaining tail itself becomes greater than $1/\epsilon$. (It's very easy to calculate $\bar S$ and I have done so; I'll leave it as an exercise for the dedicated reader...)
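For readers who would rather check their answer than do the exercise, here is one way to work it out, taking $\bar S$ to be the point where the density ratio $e^{2x-2}$ first reaches $1/\epsilon$ (that reading of $\bar S$ is my assumption):

$$e^{2x-2} > \frac{1}{\epsilon} \iff 2x - 2 > \ln\frac{1}{\epsilon} \iff x > 1 + \frac{1}{2}\ln\frac{1}{\epsilon} = \bar S$$

With the $\epsilon = 0.01$ used in the post, $\bar S = 1 + \frac{1}{2}\ln 100 \approx 3.30$.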

Oh, that's the rare occurrence of the correct use of "exponentially," which is usually incorrectly treated as a synonym for "convex."

Wednesday, March 2, 2016

Acalculia, innumeracy, or numerophobia?

I think there's an epidemic of number-induced brain paralysis going around.

There are quite a few examples of quant questions in interviews creating the mental equivalent of a frozen operating system (including this post by Sprezzaturian), but I think that there's something beyond that, something that applies in social situations and that affects people who should know better.

Here's a simple example. What is the orbital speed of the International Space Station, roughly? No, don't google it, calculate it. Orbital period is about 90 minutes, altitude (distance to ground) about 400km, Earth radius is about 6370km.

Seriously, this question stumps people with university degrees, including some in the life sciences who necessarily have taken college-level science courses.

And what college-level math do you need to answer it? The formula for the circumference of a circle of radius $r$. Yes, $2\times\pi\times r$. The orbital velocity in km/h is the total number of kilometers per orbit ($2\times\pi\times (6370+400)$) divided by the time to orbit in hours ($1\frac{1}{2}$), that is around $28\,000$ km/h, which is close to the actual value, $27\,600$ km/h. (The orbit is an ellipse and takes more than 90 minutes.)
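The same back-of-the-envelope calculation as a minimal Python sketch. It assumes a circular orbit and uses the round numbers given above; the constant and function names are illustrative only.

```python
import math

# Round numbers from the text; a circular orbit is assumed.
EARTH_RADIUS_KM = 6370
ALTITUDE_KM = 400
PERIOD_HOURS = 1.5

def orbital_speed_kmh(radius_km, altitude_km, period_hours):
    """Circumference of the orbit divided by the time to go around once."""
    orbit_radius_km = radius_km + altitude_km
    return 2 * math.pi * orbit_radius_km / period_hours

speed = orbital_speed_kmh(EARTH_RADIUS_KM, ALTITUDE_KM, PERIOD_HOURS)
print(f"Approximate orbital speed: {speed:,.0f} km/h")
```

The result comes out a little under $28\,400$ km/h, which rounds to the "around $28\,000$ km/h" quoted above.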

Can it possibly be ignorance, innumeracy? Is it plausible that college-educated professionals don't know the circumference formula?  Nope, they can recite the formula when prompted.

Or is it acalculia? That they have a mental inability to do calculation? Nope, they can compute exactly how much I owe on the lunch bill for the extra crème brûlée and the expensive entrée.

No, I think it's a mild case of numerophobia, a mental paralysis created by the appearance of an unexpected numerical challenge in normal life. This is a problem, as most of the world can be perceived more deeply if one thinks like a quant all the time; many strange "paradoxes" become obvious when seen through the lens of numerical (or parametrical) thinking.

For example, some time ago I had a discussion with a friend about strength training. The gist of it was that powerlifters are typically much stronger than the average athlete, but they are also much fewer; because of that, in a typical gym the strongest athlete might not be a powerlifter, but as we get into regional competitions and national competitions, the winner is going to be a powerlifter.

"That's because on the upper tail the difference between means is going to dominate the difference in sizes of the population." That quoted sentence is what I said. I might as well have said "boo-blee-gaa-gee in-a-gadda-vida hidee-hidee-hidee-oh" for all the comprehension. The friend is an engineer. A numbers person. But apparently, numbers are work-domain only.

The awesome power of quant thinking is being blocked by this strange social numerophobia. We must fight it. Liberate your inner quant; learn to love numbers in all areas of life.

Everything is numbers.