Showing posts with label Quants. Show all posts

Friday, October 25, 2019

How many tangerines fit in this room?

How a person answers simple questions can tell a lot about what type of thinker they are.


It's not that you need to know a lot of math to answer this question (it's basic geometry and arithmetic), but rather that people who think quantitatively as part of their day-to-day life can be identified by their attitude towards this question.

There's a big difference between someone who thinks like a quant and someone who can do math on demand, so to speak. Thinking like a quant means that you generally look at the world through the prism of math; that when you're solving a work problem, you're not just applying knowledge from your education, but also something you practice every day. And that practice makes a difference.

 It's like the difference between an athlete (even if amateur) and someone who goes to gym class.

To illustrate, consider your typical "lone inventor can upset entire industry" story, in particular this one that was in the last Fun With Numbers.
I didn't read the article, but from the photo [which is deceptive; in the article the 1500-mile battery is bigger, though still small enough to make the result non-credible] we can see that the '1500-mile battery' volume is about 2 liters, so a little bit of arithmetic ensued:
  1. 1500 miles w/ better-than-current vehicles [a google search shows that they're all over 250 Wh/mi], say 200 Wh/mi: 300 kWh (1.08 GJ)
  2. Volume of battery, from article photo [estimated by eye], let's say 2 l, so energy density = 540 MJ/l
  3. Current Li-Ion battery energy density [google search] ~2.5 MJ/l to  5 MJ/l (experimental) 
Home inventor creates something 100 to 200 times more energy-dense than current technology (and about 15 times more energy-dense than gasoline)?! Not credible.
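The whole back-of-envelope check fits in a few lines of Python; every input below is one of the rough estimates from the post (consumption, eyeballed battery volume, a generous Li-ion figure), not a measured value:

```python
# Back-of-envelope check of the '1500-mile battery' claim.
# All inputs are rough estimates from the post, not measurements.
miles = 1500
wh_per_mile = 200                 # optimistic; current EVs are all over 250 Wh/mi
battery_liters = 2                # eyeballed from the article photo

energy_mj = miles * wh_per_mile * 3600 / 1e6   # Wh -> J -> MJ
density = energy_mj / battery_liters           # MJ per liter
li_ion_best = 5                   # MJ/l, generous current Li-ion figure

print(density, density / li_ion_best)          # 540.0 MJ/l, 108x current tech
```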
Are we to believe that the journalists can't do the simple search and arithmetic needed to raise the concerns we can see? Or that they expect none of their audience to? (This second question assuming that the journalists know that the battery can't work, but are willing to write these clickbait headlines because they assume their credibility is not going to be questioned by innumerate audiences.)

Back to the tangerines, and a tale of three people.

Person one gets confused by the question, takes a while to think in qualitative terms (sometimes verbalizing those), then eventually realizes it's a geometry question and with more or less celerity solves it. Person one can do math "on demand," but doesn't think like a quant.

Person two grasps the geometric nature of the problem immediately, estimates the size of the room and of an average tangerine, reaches for a calculator, and gives an estimate. Person two "groks" the problem and is a quant thinker.

Person three sketches out the same calculation as person two, but then adds a twist: instead of a calculator, person three reaches for a spreadsheet, to create a model where the parameters can be varied to allow for sensitivity analysis. Person three is an advanced version of a quant thinker, a model-based thinker.
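Person three's spreadsheet might look like this in code form. The room size and tangerine diameter are made-up illustrative inputs; 0.64 is the usual figure for random close packing of equal spheres:

```python
from math import pi

def tangerines(room_m3=4 * 5 * 2.5, fruit_diam_cm=7.5, packing=0.64):
    """Estimate how many tangerines fit in a room.

    Random close packing of equal spheres fills roughly 64% of the
    volume; the room and fruit sizes are illustrative guesses.
    """
    fruit_vol_m3 = (4 / 3) * pi * (fruit_diam_cm / 200) ** 3  # cm diameter -> m radius
    return packing * room_m3 / fruit_vol_m3

# Sensitivity analysis: vary the tangerine diameter, hold the rest fixed.
for diam in (6.5, 7.5, 8.5):
    print(diam, round(tangerines(fruit_diam_cm=diam)))
```

Because every input is a named parameter, the sensitivity analysis is one loop; that is exactly the advantage of the model over a one-off calculator answer.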




Wednesday, June 12, 2019

A statistical analysis of reviews of L.A. Finest: audience vs. critics



"If numbers are available, let's use the numbers. If all we have are opinions, let's go with mine." -- variously attributed to a number of bosses.

There's a new police procedural this season, L.A. Finest, and Rotten Tomatoes has done it again: critics and audience appear to be at loggerheads. Like with The Orville, Star Trek Discovery, and the last season of Doctor Who.

But "appear to be" is a dequantified statement. And Rotten Tomatoes has numbers; so, what can these numbers tell us?

Before they can tell us anything, we need to write our question: first in words, then as a math problem. Then we can solve the math problem and that solution gets translated into a "words" answer, but now a quantified "words" answer.

The question, suggested by the numbers above, is:
Do the critics and the audience use similar or opposite criteria to rate this show?
One way to answer this question, which would have been feasible in the past when Rotten Tomatoes had user reviews, would be to do text analytics on the reviews themselves. But now the user reviews are gone so that's no longer possible.

Another way, a simpler and cleaner way, is to use the data above.

To simplify we'll assume that all ratings are either positive or negative, 0 or 1; there are some unobservable random factors that make some people like a show more or less, so these ratings are random variables. For a given person $i$, the probability that that person likes L.A. Finest is captured in some parameter $\theta_i$ (we don't observe that, of course), which is the probability of that person giving a positive rating.

So, our question above is whether the $\theta_i$ of the critics and the $\theta_i$ of the audience are the same or "opposed." And what is "opposed"? If $i$ and $j$ use opposite criteria, the probability that $i$ gives a 1 is the probability that $j$ gives a 0, so $\theta_i = 1-\theta_j$.

We don't have the individual parameters $\theta_i$ but we can simplify again by assuming that all variation within each group (critics or audience) is random, so we really only need two $\theta$.

We are comparing two situations, call them: hypothesis zero, $H_0$, meaning the critics and the audience use the same criteria, that is they have the same $\theta$, call it $\theta_0$; and hypothesis one, $H_1$, meaning the critics use criteria opposite to those of the audience, so if the critics $\theta$ is $\theta_1$, the audience $\theta$ is $(1-\theta_1)$.

Yes, I know, we don't have $\theta_0$ or $\theta_1$. We'll get there.

Our "words" question now becomes the following math problem: how much more likely is it that the data we observe is created by $H_1$ versus created by $H_0$, or in a formula: what is the likelihood ratio

$LR = \frac{\Pr(\mathrm{Data}| H_1)}{\Pr(\mathrm{Data}| H_0)} $?

Observation: This is different from the usual statistics test: the usual test is whether the two distributions are different; we are testing for a specific type of difference, opposition. So there are in fact three states of the world: same, opposite, and different but not opposite; we want to compare the likelihood of the first two. If same is much more likely than opposite, then we conclude 'same.' If opposite is much more likely than same, we conclude 'opposite.' If same and opposite have similar likelihoods (for some notion of 'similar' we'd have to investigate), then we conclude 'different but not opposite.'

Our data is four numbers: number of critics $N_C = 10$, number of positive reviews by critics $k_C = 1$, number of audience members $N_A = 40$, number of positive reviews by audience members $k_A = 30$.

But what about the $\theta_0$ and $\theta_1$?

This is where the lofty field of mathematics gives way to the down and dirty world of estimation. We estimate $\theta$ by maximum likelihood, and the maximum likelihood estimator for the probability of a positive outcome of a binary random variable (called a Bernoulli variable) is the sample mean.

Yep, all those words to say "use the share of 1s as the $\theta$."

Not so fast. True, for $H_0$, we use the share of ones

$\theta_0 = (k_C + k_A)/(N_C + N_A) = 31/50 = 0.62$;

but for $H_1$, we need to address the audience's $1-\theta_1$ by reverse coding the zeros and ones, in other words,

$\theta_1 = (k_C + (N_A - k_A))/(N_C + N_A) = 11/50 = 0.22$.

Yes, those two fractions are "estimation." Maximum likelihood estimation, at that.

Now that we are done with the dirty statistics, we come back to the shiny world of math, by using our estimates to solve the math problem. That requires a small bit of combinatorics and probability theory, all in a single sentence:

If each individual data point is an independent and identically distributed Bernoulli variable, the sum of these data points follows the binomial distribution.

Therefore the desired probabilities, which are joint probabilities of two binomial distributions, one for the critics, one for the audience, are

$\Pr(\mathrm{Data}| H_0) = c(N_C,k_C) (\theta_0)^{k_C} (1- \theta_0)^{N_C- k_C} \times c(N_A,k_A) (\theta_0)^{k_A} (1- \theta_0)^{N_A- k_A}$

and

$\Pr(\mathrm{Data}| H_1) = c(N_C,k_C) (\theta_1)^{k_C} (1- \theta_1)^{N_C- k_C} \times c(N_A,k_A) (1 -\theta_1)^{k_A} (\theta_1)^{N_A- k_A}$.

Replacing the symbols with the estimates and the data we get

$\Pr(\mathrm{Data}| H_0) = 3.222\times 10^{-5}$;
$\Pr(\mathrm{Data}| H_1) = 3.066\times 10^{-2}$.

We can now compute the likelihood ratio,

$LR = \frac{\Pr(\mathrm{Data}| H_1)}{\Pr(\mathrm{Data}| H_0)} \approx 951$,

and translate that into words to make the statement
It's about 951 times more likely that the critics are using criteria opposite to those of the audience than the same criteria.
Isn't that a lot more satisfying than saying they "appear to be at loggerheads"?
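The whole computation is a few lines away; this sketch uses only the Python standard library and the $N$ and $k$ values above, and reproduces the two probabilities and their ratio:

```python
from math import comb

N_C, k_C = 10, 1    # critics: reviews, positives
N_A, k_A = 40, 30   # audience: reviews, positives

def binom_pmf(n, k, p):
    # probability of exactly k successes in n Bernoulli(p) trials
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

theta0 = (k_C + k_A) / (N_C + N_A)          # H0: same criteria, 0.62
theta1 = (k_C + N_A - k_A) / (N_C + N_A)    # H1: opposite criteria, 0.22

pr_H0 = binom_pmf(N_C, k_C, theta0) * binom_pmf(N_A, k_A, theta0)
pr_H1 = binom_pmf(N_C, k_C, theta1) * binom_pmf(N_A, k_A, 1 - theta1)

print(pr_H0, pr_H1, pr_H1 / pr_H0)
```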

Tuesday, August 30, 2016

Some thoughts on quant interviews

Being a curmudgeonly quant, I started reacting to people who "love" science and math with simple Post-It questions like this:


(This is not a gotcha question; all you need is to apply the Pythagorean theorem twice. I even picked numbers that work out well. Yes, $9 \sqrt{2}$ is a number that works out well.)

Which reminds me of quant interviews and their shortcomings.

I already wrote about what I think is the most important problem in quantitative thinking for the general public, in Innumeracy, Acalculia, or Numerophobia, which was inspired by this Sprezzaturian's post (Sprezzaturian was writing about quant interviews).


In search of quants

That was for the general public. This post is specifically about interviewing to determine quality of quantitative thinking. Which is more than just mathematical and statistical knowledge.

One way to test mathematical knowledge is to ask the same type of questions one gets in an exam, such as:

$\qquad$ Compute $\frac{\partial }{\partial x} \frac{\partial }{\partial y} \frac{2 \sin(x) - 3 \sin(y)}{\sin(x)\sin(y)}$.

Having interacted with self-appointed "analytics experts" who had trouble with basic calculus (sometimes even basic algebra), this kind of test sounds very appealing at first. But its focus is on the wrong side of the skill set.
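(That particular question also has a trap: the fraction separates into $2/\sin(y) - 3/\sin(x)$, so the mixed partial is identically zero, and someone who groks the expression answers without any quotient-rule grinding. A quick finite-difference check, pure standard library:)

```python
from math import sin

def f(x, y):
    # separates into 2/sin(y) - 3/sin(x), so d2f/dxdy should be 0
    return (2 * sin(x) - 3 * sin(y)) / (sin(x) * sin(y))

def mixed_partial(g, x, y, h=1e-4):
    # central finite-difference estimate of the mixed second derivative
    return (g(x + h, y + h) - g(x + h, y - h)
            - g(x - h, y + h) + g(x - h, y - h)) / (4 * h * h)

print(mixed_partial(f, 0.7, 1.1))  # ~0, up to floating-point noise
```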

Physicist Eric Mazur has the best example of the disconnect between being able to answer a technical question and understanding the material:

TL;DR: students can't apply Newton's third law of motion (for every action there's an equal and opposite reaction) to a simple problem (car collision), though they can all recite that selfsame third law. I wrote a post about this before.

Testing what matters

Knowledge tests should at the very least be complemented with (if not superseded by) "facility with quantitative thinking"-type questions. For example, let's say Bob is interviewing for a job and is given the following graph (and formula):

Nina, the interviewer, asks Bob to explain what the formula means and to interpret the parameters.

Bob Who Recites Knowledge will say something like "it's a sine with argument $2 \pi \rho x$ multiplied by an exponential of $- \kappa x$; if you give me the data points I can use Excel Solver to fit a model to get estimates of $\rho$ and $\kappa$."

Bob Who Understands will start by calling the graph what it is: a dampened oscillation over $x$. Treating $x$ as time for exposition purposes, that makes $\rho$ a frequency in Hertz and $\kappa$ the dampening factor.

Next, Bob Who Understands says that there appear to be 5 1/4 cycles between 0 and 1, so $\hat \rho = 5.25$. Estimating $\kappa$ is a little harder, but since the first 3/4 cycle maps to an amplitude of $-0.75$, all we need is to solve two equations, first translating 3/4 cycle to the $x$ scale,

$\qquad$ $ 10.5 \,  \pi x = 1.5 \,  \pi$ or  $x= 0.14$

and then computing a dampening of $0.75$ at that point, since $\sin(3/2 \, \pi) = - 1$,

$\qquad$  $\exp(-\hat\kappa \times 0.14) = 0.75$, or $\hat \kappa = - \log(0.75)/0.14 \approx 2.05$

Bob Who Understands then says, "of course, these are only approximations; given the data points I can quickly fit a model in #rstats that gets better estimates, plus quality measures of those estimates."
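Bob's eyeball estimates are easy to replicate in a couple of lines; here the trough location $x = 1.5/10.5$ is kept as an exact fraction instead of being rounded to $0.14$:

```python
from math import log

rho_hat = 5.25                     # 5 1/4 cycles between 0 and 1
x_trough = 1.5 / (2 * rho_hat)     # solve 2*pi*rho*x = 1.5*pi for x
kappa_hat = -log(0.75) / x_trough  # from exp(-kappa * x_trough) = 0.75
print(x_trough, kappa_hat)
```

With the actual data points, a nonlinear least-squares fit would refine both estimates and attach standard errors, which is exactly what Bob Who Understands offers to do next.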

(Nerd note: If instead of $e^{-\kappa x}$ the dampening had been $2^{-\kappa x}$, then $1/\kappa$ would be the half-life of the process; but the numbers aren't as clean with base $e$.)

This facility with approximate reasoning (and use of #rstats :-) signals something important about Bob Who Understands: he understands what the numbers mean in terms of their effects on the function; he groks the function.

Nina hires Bob Who Understands. Bonuses galore follow.

Bob Who Recites Knowledge joins a government agency, funding research based on "objective, quantitative" metrics, where he excels at memorizing the 264,482 pages of regulation defining rules for awarding grants.