Friday, October 25, 2019

How many tangerines fit in this room?

How a person answers simple questions can tell a lot about what type of thinker they are.


It's not that you need to know a lot of math to answer this question (it's basic geometry and arithmetic), but rather that people who think quantitatively as part of their day-to-day life can be identified by their attitude towards this question.

There's a big difference between someone who thinks like a quant and someone who can do math on demand, so to speak. Thinking like a quant means that you generally look at the world through the prism of math; that when you're solving a work problem, you're not just applying knowledge from your education, but also something you practice every day. And that practice makes a difference.

 It's like the difference between an athlete (even if amateur) and someone who goes to gym class.

To illustrate, consider your typical "lone inventor can upset entire industry" story, in particular this one that was in the last Fun With Numbers.
I didn't read the article, but from the photo [which is deceptive; in the article the 1500-mile battery is bigger, though still small enough to make the result non-credible] we can see that the '1500-mile battery' volume is about 2 liters, so a little bit of arithmetic ensued:
  1. 1500 miles with better-than-current consumption [a Google search shows current EVs are all over 250 Wh/mi], say 200 Wh/mi: 300 kWh (1.08 GJ)
  2. Volume of battery, from the article photo [estimated by eye], let's say 2 l, so energy density = 540 MJ/l
  3. Current Li-ion battery energy density [Google search]: ~2.5 MJ/l to 5 MJ/l (experimental)
Home inventor creates something 100 to 200 times more energy-dense than current technology (and about 15 times more energy-dense than gasoline)?! Not credible.
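For those who want to play with the numbers, here's a minimal sketch of that arithmetic in Python; the consumption figure, the 2-liter volume, and the comparison densities are the rough estimates above, not measured values:

```python
claimed_range_mi = 1500
consumption_wh_per_mi = 200    # optimistic; current EVs are all over 250 Wh/mi
battery_volume_l = 2.0         # eyeballed from the article photo

energy_wh = claimed_range_mi * consumption_wh_per_mi   # 300,000 Wh = 300 kWh
energy_mj = energy_wh * 3.6e-3                         # 1,080 MJ = 1.08 GJ
density_mj_per_l = energy_mj / battery_volume_l        # 540 MJ/l

li_ion_mj_per_l = (2.5, 5.0)   # rough range for current Li-ion, incl. experimental
gasoline_mj_per_l = 34.0       # approximate volumetric energy density of gasoline

print(f"Implied energy density: {density_mj_per_l:.0f} MJ/l")
print(f"vs. Li-ion: {density_mj_per_l / li_ion_mj_per_l[1]:.0f}x to "
      f"{density_mj_per_l / li_ion_mj_per_l[0]:.0f}x denser")
print(f"vs. gasoline: {density_mj_per_l / gasoline_mj_per_l:.0f}x denser")
```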
Are we to believe that the journalists can't do the simple search and arithmetic needed to raise these concerns? Or that they expect none of their audience to? (The second question assumes the journalists know the battery can't work but are willing to write the clickbait headlines anyway, because they assume their credibility won't be questioned by an innumerate audience.)

Back to the tangerines, and a tale of three people.

Person one gets confused by the question, takes a while to think in qualitative terms (sometimes verbalizing those), then eventually realizes it's a geometry question and with more or less celerity solves it. Person one can do math "on demand," but doesn't think like a quant.

Person two grasps the geometric nature of the problem immediately, estimates the size of the room and of an average tangerine, reaches for a calculator, and gives an estimate. Person two "groks" the problem and is a quant thinker.

Person three sketches out the same calculation as person two, but then adds a twist: instead of a calculator, person three reaches for a spreadsheet, to create a model where the parameters can be varied to allow for sensitivity analysis. Person three is an advanced version of a quant thinker, a model-based thinker.
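Person three's model might look something like the following sketch (in Python rather than a spreadsheet); the room dimensions, tangerine diameter, and packing efficiency are placeholder assumptions, there precisely so they can be varied:

```python
from math import pi

def tangerines_in_room(room_l_m=5.0, room_w_m=4.0, room_h_m=2.5,
                       tangerine_diameter_cm=7.0, packing_efficiency=0.64):
    """Estimate how many tangerines fit in a room.

    All parameters are assumptions to be varied: room dimensions in meters,
    tangerine diameter in cm, and packing efficiency (0.64 is roughly random
    close packing of spheres; 0.74 would be the densest regular packing).
    """
    room_volume_l = room_l_m * room_w_m * room_h_m * 1000       # liters
    r_cm = tangerine_diameter_cm / 2
    tangerine_volume_l = (4 / 3) * pi * r_cm ** 3 / 1000        # liters
    return packing_efficiency * room_volume_l / tangerine_volume_l

# Point estimate plus a crude sensitivity analysis over the diameter
print(f"Base case: {tangerines_in_room():,.0f} tangerines")
for d in (6.0, 7.0, 8.0):
    print(f"diameter {d} cm -> {tangerines_in_room(tangerine_diameter_cm=d):,.0f}")
```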




Saturday, October 19, 2019

Fun with numbers for October 19, 2019

(Yes, yet another tweet-recycling post. When I unfroze the blog, the reason was that I'd been tweetstorming what should have been blog posts, so now I'm refactoring ideas from Twitter, with — one hopes — improvements.)


Negative [effect on carbon capture]


Via Thunderf00t, who manages to find the occasional bad product gem amongst the many non-bad products he "busts!" by not understanding engineering (or pretending not to), we learn of Negative, a captured-carbon bracelet.*


Enter basic math, illusion exits stage left.

Say Bay Area Bob commutes from San Francisco to Palo Alto (100 mi roundtrip), 5 days/week (500 mi/week) on a 25 MPG car; that's 20 gallons of gasoline burned per week.

Gasoline is a complicated mixture, but let's simplify by treating it as 100% iso-octane (2,2,4-trimethylpentane), C8H18; let's simplify further by assuming a perfect stoichiometric burn, so 1 kg of iso-octane generates 3.1 kg of CO2.

Gasoline has a density of 0.7489 kg/l or 2.835 kg/gal; this generates 8.75 kg(CO2)/gal(gasoline), so a weekly commute creates 175 kg of CO2.

Say that bracelet is 25 g of pure carbon. That corresponds to 1/1910th of the carbon in a single one-week commute for Bob. (175 kg of CO2 contain 47.7 kg of carbon.)
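A sketch of that arithmetic, with the commute, the fuel economy, and the iso-octane simplification as explicit assumptions:

```python
# Molar masses (g/mol)
M_C, M_H, M_O = 12.011, 1.008, 15.999
M_isooctane = 8 * M_C + 18 * M_H               # C8H18, ~114.2 g/mol
M_CO2 = M_C + 2 * M_O                          # ~44.0 g/mol

co2_per_kg_fuel = 8 * M_CO2 / M_isooctane      # ~3.08 kg CO2 per kg fuel

density_kg_per_gal = 0.7489 * 3.78541          # ~2.835 kg/gal
co2_per_gal = co2_per_kg_fuel * density_kg_per_gal   # ~8.75 kg CO2/gal

weekly_miles = 100 * 5                         # SF-Palo Alto roundtrip, 5 days/week
weekly_gallons = weekly_miles / 25             # 25 MPG car
weekly_co2_kg = weekly_gallons * co2_per_gal   # ~175 kg CO2

carbon_in_co2_kg = weekly_co2_kg * M_C / M_CO2  # ~47.7 kg of carbon
bracelet_g = 25                                 # assumed mass of pure carbon
print(f"Weekly CO2: {weekly_co2_kg:.0f} kg, of which carbon: {carbon_in_co2_kg:.1f} kg")
print(f"Bracelet is 1/{carbon_in_co2_kg * 1000 / bracelet_g:.0f} of one week's carbon")
```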

I'm sure every Bay Area Bob will be sporting one of these Negative bracelets.

What about other hydrocarbons? Given the small mass differences between alkanes, alkenes, and alkynes, we can take a look at the CO2 per kg(hydrocarbon) with a simple calculation:


Note that the maximum CO2 per kg is when the fuel is pure carbon, at 3.67 kg(CO2) per kg(C). So the approximation above (for Bob) isn't too bad.
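Here's a sketch of that calculation, assuming complete combustion of the alkane (CnH2n+2), alkene (CnH2n), and alkyne (CnH2n-2) families:

```python
M_C, M_H, M_O = 12.011, 1.008, 15.999
M_CO2 = M_C + 2 * M_O

def co2_per_kg(n_carbon, n_hydrogen):
    """kg of CO2 produced per kg of CnHm fuel, assuming complete combustion."""
    fuel_mass = n_carbon * M_C + n_hydrogen * M_H
    return n_carbon * M_CO2 / fuel_mass

for n in (2, 4, 8, 12, 16):
    alkane = co2_per_kg(n, 2 * n + 2)
    alkene = co2_per_kg(n, 2 * n)
    alkyne = co2_per_kg(n, 2 * n - 2)
    print(f"C{n:<2d}: alkane {alkane:.2f}, alkene {alkene:.2f}, "
          f"alkyne {alkyne:.2f} kg CO2 per kg fuel")

print(f"Pure carbon: {M_CO2 / M_C:.2f} kg CO2 per kg C")
```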

-- -- -- --
*Another annoying habit of TF is glossing over the math, usually to the point where his approximations accumulate into nonsense, and occasionally into significant technical errors.



Much ado about Ruby Rose's petite physique.


One of the criticisms of Batwoman that might have some merit is that a petite person like Ruby Rose is not credible as an action hero; that a punch from her not-very-muscular arms would not knock out a 250-lb henchman. To which I reply: as opposed to not-exactly-Schwarzenegger Ben Affleck or Christian Bale throwing said 250-lb henchman clear across a parking lot with a single arm? Pah!

This scene, where Batwoman gets shot with a pistol, led to some comments on how she would have been thrown backwards into the air. Because "momentum," say the people who love science but can't do math (or won't actually bother to learn the science they profess to "love").


The batsuit is bulletproof (has been all along); assuming that it completely distributes the force of the impact over the roughly 1/4 square meter of her torso front, there's little effect, as can be seen from the change in velocity of the system:

Say Batwoman (Ruby Rose + suit) = 50 kg, and the bullet (looks like a .45 ACP) is 15 g at a muzzle velocity of 250 m/s; conservation of momentum gives an after-impact speed of (0.015 * 250)/(50.015) = 0.075 m/s, or less than 0.3 km/h, a very small change in velocity for Batwoman that can easily be countered by a braced position.
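The same calculation as a sketch, with the mass and velocity figures above as assumptions:

```python
m_batwoman = 50.0    # kg, Ruby Rose plus suit (assumed)
m_bullet = 0.015     # kg, roughly a .45 ACP bullet
v_bullet = 250.0     # m/s, approximate muzzle velocity

# Perfectly inelastic collision: the suit stops the bullet.
v_after = (m_bullet * v_bullet) / (m_batwoman + m_bullet)
print(f"Delta-v: {v_after:.3f} m/s = {v_after * 3.6:.2f} km/h")
```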

An alternative way to see the limited effect:

Consider that the bullet is stopped by the suit and loses all its velocity while pushing the suit back 5 cm. Assuming constant force, the stopping time is t = 2s/v = 2(0.05)/250 = 0.0004 s, for an acceleration a = v/t = 625,000 m/s^2 and a force F = 9375 newtons (about 956 kgf, but only for 400 microseconds), which spread over the 1/4 square meter of her torso is a pressure of 0.38 kgf/cm^2, about that of a light finger poke (again, for only 400 microseconds).
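And a sketch of that alternative calculation, with the 5 cm stopping distance and the 1/4 square meter of suit area as the assumptions:

```python
m_bullet = 0.015        # kg
v_bullet = 250.0        # m/s
stop_distance = 0.05    # m, assumed crush distance of the suit
suit_area = 0.25        # m^2, assumed area over which the suit spreads the load

# Constant deceleration: time to stop over distance s is t = 2s/v
t_stop = 2 * stop_distance / v_bullet               # 0.0004 s
a = v_bullet / t_stop                               # 625,000 m/s^2
force_n = m_bullet * a                              # 9,375 N
force_kgf = force_n / 9.80665                       # ~956 kgf
pressure_kgf_cm2 = force_kgf / (suit_area * 1e4)    # ~0.38 kgf/cm^2

print(f"Stop time: {t_stop * 1e6:.0f} microseconds")
print(f"Force: {force_n:.0f} N ({force_kgf:.0f} kgf)")
print(f"Pressure over the suit: {pressure_kgf_cm2:.2f} kgf/cm^2")
```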

And a tip of the hat to old-style scifi machinery (no labels on buttons or indicators):




Flexagons. Not the hexa ones.





A late entry: more battery nonsense.




Via eevblog, we learn of yet another life-changing momentous innovation by a lone inventor squashed by the Big Industry Conformance Bureau:


I didn't read the article, but from the photo we can see that the '1500-mile battery' volume is about 2 liters, so a little bit of arithmetic ensued:
  1. 1500 miles with better-than-current consumption (say 200 Wh/mi): 300 kWh (1.08 GJ)
  2. Volume of battery, from the article photo, let's say 2 l, so energy density = 540 MJ/l
  3. Current Li-ion battery energy density: ~2.5 MJ/l to 5 MJ/l (experimental)
Home inventor creates something 100 to 200 times more energy-dense than current technology (and about 15 times more energy-dense than gasoline)?!

Nope, not credible.

(Note: apparently the photo is deceptive, and the actual "1500 mile battery" is larger, so only 9 times more energy-dense than current technology. That's just as non-credible, especially the idea that car manufacturers would be able to stop small electronics makers from adopting a technology that would allow smaller batteries in laptops and longer times between charges in cell phones. Added Oct 21.)

Wednesday, October 9, 2019

Fun with numbers for October 9, 2019

Rotten Tomatoes and Batwoman


The day after the pilot, a familiar pattern emerges:


Using the same math as these two previous posts, it's 198,134,550 (almost 200 million) times more likely that the critics are using the opposite criteria to those of the audience than that both are using the same criteria.

A couple of days later, more data is available:


This data makes the case even more stark: it's now 2,924,953,580,108 (almost three trillion!) times more likely that the critics are using the opposite criteria to those of the audience than that both are using the same criteria.

And today (with a tip of the Homburg to local vlogging nerd Nerdrotics), it's even worse:


With this data (ah, the joys of reusable models, even if "model" is a bit of a stretch for something so simple, relatively speaking), we get that it's now 11,028,450,795,963,200 (eleven quadrillion!) times more likely that the critics are using the opposite criteria to those of the audience than that both are using the same criteria.
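As an illustration of the kind of likelihood-ratio calculation involved, here's a sketch that treats each critic review as an independent Bernoulli draw; both the model and the counts below are illustrative assumptions, not the actual Rotten Tomatoes figures or necessarily the exact model from the earlier posts:

```python
from math import comb

def likelihood_ratio(critic_pos, critic_total, audience_score):
    """Ratio of P(critic reviews | critics use the opposite criteria) to
    P(critic reviews | critics use the same criteria as the audience),
    with each critic review an independent Bernoulli draw whose 'positive'
    probability is either (1 - audience_score) or audience_score.
    """
    k, n, p = critic_pos, critic_total, audience_score
    same = comb(n, k) * p ** k * (1 - p) ** (n - k)
    opposite = comb(n, k) * (1 - p) ** k * p ** (n - k)
    return opposite / same

# Hypothetical counts: 30 of 40 critics positive, 28% audience score
print(f"{likelihood_ratio(30, 40, 0.28):,.0f} times more likely")
```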

For what it's worth, I liked the pilot, despite my nit-picking it on twitter:




Aerobics: The Paper that Started The Craze.



Here are three explanations that all match the data:

I. The official story: running develops cardiovascular endurance. This is the story that led to the aerobics explosion, to jogging, and to all sorts of "cardio" nonsense. Note that this story is isomorphic to "playing basketball makes people taller."

II. The selection effect story: people with good cardiovascular systems can run faster than those without. This is the "tall people are better at basketball than short people" version of the story.

III. The "athletes are better at both" story: people who have athletic builds (strong muscles, large thoracic capacity, low body fat) are better at both running and cardiovascular fitness because of that athleticism.

Most likely the result is a combination of these three effects, or in expensive words, the three variables (cardiovascular fitness, muscular development, and running ability) are jointly endogenous. Note also the big excerpt from Body By Science at the end of this post.

Let's take a closer look at that table:


Not that I'm questioning Cooper's data (okay, I am), but isn't it strange that there are no cases where, say, a runner with a distance of 1.27 mi had a VO2max of 33.6? That the discrete categories on one side map into non-overlapping categories on the other? No boundary errors? That's an unlikely scenario.

Also, no data about the distribution of the 115 research subjects over the five categories. That would be interesting to know, since the bins for the distance categories are clearly selected at fixed distance intervals, not as representatives of the distribution of subjects. (It would be extremely suspicious if the same number of subjects happened to fall into each category. But if they don't, that's informative and important to the interpretation of the data.)

I know this was the 60s; on the other hand, the 60s were the first real golden age of large-scale data processing (with those "computer" things) and a market research explosion.

One of the factors that confounds these "cardio" results is that training for a specific test makes you better at that test. Another is that strengthening the muscles that are used in a specific motion makes that motion less demanding and therefore puts less strain on the cardiovascular system.

This excerpt from Body By Science illustrates both of these confounds:




Grant Sanderson (3 Blue 1 Brown) on prime number spirals





A late addition: Elon Musk promises PowerPacks for CA



Which brings up two thoughts:

a. Is "just waiting on permits" the new "funding secured"?
 
b. Each PowerPack has 210 kWh of capacity, so one charges ~3 Teslas, assuming they're low on charge but not at zero. (A typical tank truck carries ~11,000 gal, which tops up 733 15-gal gas tanks. Just FYI.)
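The arithmetic behind (b), as a sketch; the per-car top-up and gas-tank sizes are the rough assumptions above:

```python
powerpack_kwh = 210
charge_needed_kwh = 70      # assumed top-up per Tesla: low on charge, but not at zero
cars_per_powerpack = powerpack_kwh / charge_needed_kwh

tank_truck_gal = 11_000
car_tank_gal = 15
cars_per_truck = tank_truck_gal / car_tank_gal

print(f"Cars charged per PowerPack: ~{cars_per_powerpack:.0f}")
print(f"Cars topped up per tank truck: ~{cars_per_truck:.0f}")
```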

Friday, October 4, 2019

Fun with numbers for October 4, 2019

It's flu season, let's talk product diffusion


One of the classic marketing models people learn in innovation classes is basically a SIR(1) model without the R part: the Bass model of product diffusion.

The idea is that some fraction $a$ of the consumers are "innovators" who adopt a product without social pressure, while another fraction $b$ are "imitators" who adopt a product when they see others with it. The fraction $x$ of the market that has adopted the product at a given time is given by the following differential equation

$\dot x = (a  + b x)(1-x)$, 

and the behavior looks like a traditional product life-cycle curve (an S-shaped curve):
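A minimal numerical sketch of that curve, integrating $\dot x = (a + b x)(1-x)$ with a forward Euler step; the coefficients $a$ and $b$ below are illustrative, not estimated from data:

```python
def bass_adoption(a=0.01, b=0.4, t_max=30.0, steps=3000):
    """Integrate dx/dt = (a + b*x)*(1 - x) with a forward Euler step.

    a: 'innovation' coefficient, b: 'imitation' coefficient (illustrative values).
    Returns (times, adoption fractions), each of length steps + 1.
    """
    dt = t_max / steps
    times, xs = [0.0], [0.0]
    x = 0.0
    for i in range(1, steps + 1):
        x += (a + b * x) * (1 - x) * dt
        times.append(i * dt)
        xs.append(x)
    return times, xs

times, xs = bass_adoption()
for t in (5, 10, 15, 20, 25, 30):
    i = int(t / 30.0 * 3000)
    print(f"t = {t:2d}: cumulative adoption x = {xs[i]:.2f}")
```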




The process for a viral infection is similar: some people get the virus from the environment (those would be the $a$ fraction), some get it from contact with other people (those would be the $b$); the infection process has a third element, recovery, which we ignored here.



Growth confusion and punditry, part 1


Pundits throwing around growth numbers seem to be unaware that even very small differences in growth rates compound into significant differences over time.
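To illustrate, consider two economies starting at the same size, one growing at 2% a year and the other at 3% (hypothetical numbers):

```python
start = 100.0
for years in (10, 20, 30, 50):
    low = start * 1.02 ** years
    high = start * 1.03 ** years
    print(f"{years:2d} years: 2% -> {low:6.1f}, 3% -> {high:6.1f}, "
          f"gap {100 * (high / low - 1):.0f}%")
```

One percentage point of growth leaves a gap of about 63% after fifty years.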




Growth confusion and punditry, part 2


A pundit: "it's important to get the economy's high-growth period first, so that the slower growth starts from a higher number." (Paraphrased.)

Me: Gah! Multiplication is commutative. The order doesn't matter; what matters is that the high-growth period be the longer one.

Consider two periods of lengths $t_1$ and $t_2$, with associated growth rates $r_1$ and $r_2$. Starting from some value $x_0$, the result of period 1 before period 2 is:

$\left( x_0 \, e^{r_1 t_1} \right) \, e^{r_2 t_2}$,

and the result of period 2 before period 1 is

$\left( x_0 \, e^{r_2 t_2} \right) \, e^{r_1 t_1}$,

in other words, the same result.

These pundits get paid to go on television and say these things and to write them in Op-Eds. And influential people take them seriously. The innumeracy is staggering.



Having some fun with Tesla data


Downloaded some historical data from Yahoo Finance (yes, I have other, better sources, but this one is public and can be shared) and played around with smoothing. Here's a nice view of the TSLA closing price for the last year using the same triangular smoothing I did for my bodyweight (in other words, a second-order moving average of (5,5)):
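The smoothing itself is just a five-point simple moving average applied twice; here's a minimal sketch, assuming a list of daily closing prices taken from the downloaded CSV (the numbers below are placeholders):

```python
def moving_average(series, window=5):
    """Trailing simple moving average; the first window-1 points are dropped."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]

def triangular_smooth(series, window=5):
    """Second-order moving average: an SMA of an SMA, i.e. a triangular kernel."""
    return moving_average(moving_average(series, window), window)

# Hypothetical closing prices; in practice, read the Close column of the CSV.
closes = [310.0, 305.5, 298.2, 301.7, 295.0, 289.4, 292.1, 288.0, 284.3, 280.9]
print(triangular_smooth(closes))
```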



Throughout the first half of 2019 Tesla boosters on Twitter were fully convinced that this would be the year that heralded the end of the internal combustion engine car. In reality, this seems to be the year in which Tesla's financial shenanigans are likely to bring its valuation to a more appropriate level.

CYA statement: I have no personal position on Tesla and will not initiate one in the next 72 hours. This is not intended as financial advice and represents my personal views (of making fun of Tesla boosters) not those of my employer or our clients.

Also:

(Yes, it's sarcastic. Very, very sarcastic.)



Yet another infrastructure photo



Wednesday, October 2, 2019

When nutrition and fitness studies attempt science with naive statistics

A little statistics knowledge is a dangerous thing.

(Inspired by an argument on Twitter about a paper on intermittent fasting, which exposes the problem of blind trust in "studies" when those studies are held to lower statistical standards than market research has used since at least the 1970s.)

Given an hypothesis, say "people using intermittent fasting lose weight faster than controls even when calories are equated," any market researcher worth their bonus and company Maserati would design a within-subjects experiment. (For what it's worth, here's a doctor suggesting within-subject experiments on muscle development.)

Alas, market researchers aren't doing fitness and nutrition studies, mostly because market researchers like money and marketing is where the market research money is (also, politics, which is basically marketing).

So, these fitness and nutrition studies tend to be between-subjects: take a bunch of people, assign them to control and treatment groups, track some variables, do some first-year undergraduate statistics, publish paper, get into fights on Twitter.

What's wrong with that?

People's responses to treatments aren't all the same, so the variance of those responses, alone, can make effects that exist at the individual level disappear when aggregated by naive statistics.

Huh?

If everyone loses weight faster on intermittent fasting, but some people just lose it a little bit faster and some people lose it a lot faster, that difference in response (to fasting) will end up making the statistics look like there's no effect. And what's worse, the bigger the differences between different people in the treatment group, the more likely the result is to be non-significant.

Warning: minor math ahead.

Let's say there are two conditions, control and treatment, $C$ and $T$. For simplicity, assume there are two segments of the population: those who have a strong response $S$ and those who have a weak response $W$ to the treatment. Let the fraction of $W$ be represented by $w \in [0,1]$.

Our effect is measured by a random variable $x$, which is a function of the type and the condition. We start with the simplest case, no effect for anyone in the control condition:

$x_i(S,C) = x_i(W,C) = 0$.

By doing this our statistical test becomes a simple t-test of the treatment condition and we can safely ignore the control subsample.

For the treatment conditions, we'll consider that the $W$ part of the population has a baseline effect normalized to 1,

$x_i(W,T) = 1$.

Yes, no randomness. We're building the most favorable case to detect the effect and will show that population heterogeneity alone can hide that effect.

We'll consider that the $S$ part of the population has an effect size that is a multiple of the baseline, $M$,

$x_i(S,T) = M$.

Note that with any number of test subjects, if the populations were tested separately the effect would be significant, as there's no error. We could add some random factors, but that would only complicate the point, which is that even in the most favorable case (no error, both populations show a positive effect), the heterogeneity in the population hides the effect.

(If you slept through your probability course in college, skip to the picture.)

If our experiment has $N$ subjects in the treatment condition, the expected effect size is

$\bar x = w + (1-w) M$

with a standard error (the standard deviation of the sample mean) of

$\sigma_{\bar x} =  (M-1) \,\sqrt{\frac{w(1-w)}{N}} $.

(Note that because we actually know the mean, this being a probabilistic model rather than a statistical estimation, we see $N$ where most people would expect $N-1$.)

So, the test statistic is

$t = \bar x/\sigma_{\bar x} = \frac{w + (1-w) M}{(M-1) \,\sqrt{\frac{w(1-w)}{N}}}$.

It may look complicated, but it's basically a three parameter analytical function, so we can easily see what happens to significance with different $w,M,N$, which is our objective.

Because we're using a probabilistic model where all quantities are known, the test statistic is distributed Normal(0,1), so the critical value for, say, 0.95 confidence, single-sided, is given by $\Phi^{-1}(0.95) = 1.645$.

To start simply, let's fix $N= 20$ (say a convenience sample of undergraduates, assuming a class size of 40 and half of them in the control group). Now we can plot $t$ as a function of $M$ and $w$:
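A sketch of that calculation: it evaluates $t$ over a grid of $M$ and $w$ values and flags (with an asterisk) the combinations that fall below the 1.645 critical value:

```python
from math import sqrt

def t_stat(w, M, N=20):
    """Test statistic for the heterogeneous-response model above."""
    mean = w + (1 - w) * M
    se = (M - 1) * sqrt(w * (1 - w) / N)
    return mean / se

critical = 1.645   # Phi^{-1}(0.95), single-sided
Ms = (10, 20, 50, 100, 200)

print("w\\M " + "".join(f"{M:>8d}" for M in Ms))
for w in (0.5, 0.7, 0.9, 0.95, 0.99):
    row = ""
    for M in Ms:
        t = t_stat(w, M)
        row += f"{t:7.2f}" + ("*" if t < critical else " ")
    print(f"{w:4.2f}" + row)

# '*' marks parameter combinations where the (real, positive) effect
# would be reported as non-significant at the 95% level.
```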


(The seemingly-high magnitudes of $M$ and $w$ are an artifact of not having any randomness in the model. We wanted this to be simple, so that's the trade-off.)

Recall that in our model both sub-populations respond to the treatment and there's no randomness in that response. And yet, for a small enough fraction of the $S$ population and a large enough multiplier effect $M$, our super-simple, extremely favorable model shows non-significant effects, even using a single-sided test (the most favorable test) at the lowest confidence level most journals accept, 95% (also the most favorable choice).

Let's be clear what that "non-significant effects" means: it means that a naive statistician would look at the results and say that the treatment shows no difference from the control, in the words of our example, that people using intermittent fasting don't lose weight faster than the controls.

This, even though everyone in our model loses weight faster when intermittent fasting.

Worse, the results are less and less significant the stronger the effect on the $S$ population relative to the $W$ population. In other words, the faster the weight loss of the highly-responsive subpopulation relative to the less-responsive subpopulation, when both are losing weight with intermittent fasting, the more the naive statistics shows intermittent fasting to be ineffectual at producing weight loss.

Market researchers have known about this problem for a very long time. Nutrition and fitness practices (can't bring myself to call them sciences) are now repeating errors from the 50s-60s.

That's not groovy!