Saturday, March 21, 2020

Fun with numbers for March 21, 2020

Recycling some tweets on the third day of California shelter-in-place. Weather is nice:




I really don't like these "flatten the curve" diagrams posing as science



Maybe it’s just me, but this diagram strikes me as a number of unsupported, unquantified statements presented as if it were some sort of quantitative model based on real data:
  1. Axes have labels but no scales… so all we can measure is the relative magnitudes. Is that high peak at (D,A) 1%, 10%, 25%, or 90% of the population? Does it happen in a week, a month, or a year?
  2. A/B = 415/110, so this undefined intervention lowers the peak by (415 − 110)/415 ≈ 73.5%. How many patients is that? Do these measures really slow down infection rates this much? Assuming no change to recovery speed, that’s roughly a 4-fold reduction from an unidentified intervention.
  3. E/D = 475/280, so this undefined intervention delays the peak by (475 − 280)/280 ≈ 70%. So if D is a month, this delays the peak a further three weeks, not long enough for a vaccine; if D is a year, that’s another 8 months, presumably enough.
  4. B is still greater than C, so what happens when the slowed-down process crosses the health system capacity? Rationing/triage, or does this mean bodies littering the streets? The overflow is (B − C)/C = (110 − 83)/83 ≈ 33% over capacity, but answering that question requires absolute numbers, not relative ones; since there are no numbers, there’s no real meaning.
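For the record, here's the arithmetic behind those ratios. The values of A, B, C, D, and E are relative measurements taken off the image in arbitrary units; they're my readings of the figure, not data from its authors:

```python
# Relative measurements off the "flatten the curve" image (arbitrary units):
# A, B are peak heights, C the capacity line, D, E the times of each peak.
A, B, C = 415, 110, 83
D, E = 280, 475

peak_reduction = (A - B) / A     # how much the intervention lowers the peak
peak_fold = A / B                # the same reduction, as a fold change
peak_delay = (E - D) / D         # how much later the second peak arrives
over_capacity = (B - C) / C      # flattened curve still exceeds capacity by this

print(f"peak lowered by {peak_reduction:.1%} (a {peak_fold:.1f}-fold reduction)")
print(f"peak delayed by {peak_delay:.0%}")
print(f"flattened peak still {over_capacity:.0%} over capacity")
```

All of which tells us nothing about actual patients, dates, or deaths, which is the point.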

All the calculations above are just to show that if we’re to take a chart seriously we need real numbers and real details; the figure above is a qualitative “let’s hope this convinces people to wash hands and stay away from others” exercise masquerading as a technical model.

BY ALL MEANS, WASH YOUR HANDS, DON’T TOUCH YOUR FACE, AND STAY AWAY FROM OTHERS, because that makes sense. I've been doing it for as long as I can remember.



The information we're getting is preliminary and we're treating it as dogma


From a study of Italian testing:


The internal consistency of this test is 75% (25% of the time the test doesn’t agree with itself on retesting); this doesn’t mean that the test is 75% accurate, because accuracy is measured relative to the underlying condition. But consistency does cap accuracy: whenever the test disagrees with itself, at least one of the two results must be wrong, so at least 12.5% of individual results are wrong and accuracy is at most 87.5%. (Sample size appears small, but for Medicine this is almost their version of "big data.")


A more general point about COVID-19 testing


It's easy to show that missing covariates lead to panic-inducing overestimates. The following numbers are not COVID-19 data, just an illustration:


Sometimes I despair of what people try to do with small amounts of data, and then the sarcasm comes out:
How can anyone deny this calamity?! In less than two months the entire population of the Earth will test positive. 
In 100 days, over 8 trillion people will test positive. That's about 70 times the total number of humans who've ever lived!!!! 
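The sarcasm above is just naive exponential extrapolation with no saturation. A sketch with made-up numbers: the 250,000 starting count and the 19% daily growth rate are chosen only to roughly reproduce the tweet's punchlines, not actual COVID-19 data:

```python
# Naive exponential extrapolation: compound a daily growth rate forever,
# question nothing. Hypothetical inputs: 250,000 "positives" growing 19%/day.
positives, daily_growth = 250_000, 0.19
world_population = 7.8e9            # approximate 2020 world population

counts = [positives * (1 + daily_growth) ** day for day in range(101)]

crossover = next(day for day, n in enumerate(counts) if n > world_population)
print(f"everyone on Earth 'tests positive' by day {crossover}")
print(f"day 100 projection: {counts[100]:.1e} people")
```

A constant-rate exponential has to be wrong eventually; the only question is when the missing covariates (susceptible population, behavior change, testing capacity) bend the curve.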



TSLA twitter, always good for a laugh



No matter what the stock does or at what price it's trading, Ross always says "buy." One wonders how he charges 2-and-20 to his clients to give advice of this quality.



Richard "Hamster" Hammond drives a Tesla Model X



And gets very excited at adding one mile every few seconds at a Tesla Supercharger. (We can see on the touchscreen that the Supercharger is delivering 65 kW, and Tesla claims 310 Wh/mi,* so that averages out to about 17 seconds per mile of range.) Not to be a spoilsport, but a gas pump adds about 28 miles of range per second (3 L/s into a 35 MPG car).
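For what it's worth, the back-of-the-envelope math. The 65 kW, 310 Wh/mi, 3 L/s, and 35 MPG figures are the ones from the video and the text; the rest is arithmetic:

```python
# Supercharger: seconds to add one mile of range
charge_power_w = 65_000            # shown on the touchscreen
consumption_wh_per_mi = 310        # Tesla's claimed efficiency
sec_per_mile_ev = consumption_wh_per_mi / charge_power_w * 3600
print(f"Supercharger: {sec_per_mile_ev:.1f} s per mile of range")

# Gas pump: miles of range added per second
pump_rate_l_per_s = 3.0
mpg, liters_per_gallon = 35, 3.785
miles_per_sec_gas = pump_rate_l_per_s / liters_per_gallon * mpg
print(f"Gas pump: {miles_per_sec_gas:.0f} miles of range per second")
```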

Then there's a small blur fail that reveals Hammond isn't really driving under the speed limit:


That's okay, Mr. Hammond, no one else is either.

- - - - -
* If you believe that number, you're exactly the kind of investor I'm targeting with a new product structured mostly with 2020 pandemic cat bonds; act now, supplies are limited.**

** CYA statement: this is a facetious offer, expressing derision for Tesla's number, not a proffer of a tradable security structured from out-of-the-money cat bonds.



Some videos to watch while the economy tanks around us



Grant Sanderson of 3blue1brown gave a talk at Berkeley about having people engage with math. The gist is that people want relevance and/or a story. That's good advice, but I think 3B1B's problem is that his audience is self-selected. In other words, that's how you engage an audience that's predisposed to look for and watch math videos. Still, good points.



Experimentboy is back, with thermal cameras. Very fun stuff.



PhysicsGirl suggests fun experiments to keep us from losing our minds while we wait to be moved to FEMA camps or be turned into Soylent Green.


YouTube affords the überdorks amongst us the opportunity to watch talks waaaaay above our expertise, something that in real life would be embarrassing, not to mention logistically difficult. So here are some links to:

Caltech. MIT-West, as some people who went to a technical school in Massachusetts call it.

Stanford Institute for Theoretical Physics. Fair warning: Susskind eats cookies when he talks, so there's spraying in some videos (all Susskind videos, really).

Institute for Advanced Studies at Princeton.

Nasa Jet Propulsion Laboratory.

Art talks at the Louvre, the Musée d'Orsay, the British Museum, the Smithsonian Institution, and the Museum of Fine Arts in Boston, a small town in a hard-to-spell state.




Live long and prosper.

Sunday, March 15, 2020

Fun with geekage while social distancing for March 15, 2020

(I'm trying to get a post out every week, as a challenge to produce something intellectual outside of work. Some* of this is recycled from Twitter, as I tend to send things there first.)


Multicriteria decision-making gets a boost from Covid-19



A potential upside (among many downsides) of the coronavirus covid-19 event is that some smart people will realize that there's more to life choices than a balance between efficiency and convenience and will build [for themselves if not the system] some resilience.

In a very real sense, it's possible that PG&E's big fire last year and follow-up blackouts saved a lot of people the worst of the new flu season: after last Fall, many local non-preppers stocked up on N95 masks and home essentials because of what chaos PG&E had wrought in Northern California.



Anecdotal evidence is a bad source for estimates: coin flips


Having some fun looking at small-numbers effects, or how unreliable anecdotal evidence really can be as a source of estimates.

The following is a likelihood ratio of various candidate estimates versus the maximum likelihood estimate for the probability of heads given a number of throws and heads of a balanced coin; because there's an odd number of flips, even the most balanced outcome is not 50-50:


This is an extreme example of small numbers, but it captures the problem of using small samples, or in the limit, anecdotes, to try to estimate quantities. There's just not enough information in the data.
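A minimal sketch of the computation behind these charts; the 3-heads-in-5-flips sample is an illustrative choice, not necessarily the one plotted:

```python
# Likelihood ratio of candidate probabilities of heads vs the MLE,
# for a hypothetical small sample: 3 heads in 5 flips (MLE = 0.6).
def likelihood(p, heads=3, flips=5):
    # the combinatorial C(flips, heads) term cancels in any ratio
    return p ** heads * (1 - p) ** (flips - heads)

p_mle = 3 / 5
for p in (0.4, 0.5, 0.6, 0.7):
    lr = likelihood(p) / likelihood(p_mle)
    print(f"p = {p}: likelihood ratio vs MLE = {lr:.2f}")
```

Even the fair coin comes out about 90% as likely as the maximum likelihood estimate: five flips carry almost no information about the coin's bias.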

This is the numerical version of the old medicine research paper joke: "one-third of the sample showed marked improvement; one-third of the sample showed no change; and the third rat died."

Increasing sample size makes for better information, but can also exacerbate the effect of a few errors:


Note that the number of errors necessary to get the "wrong" estimate goes up: 1 (+1/2), 3, 6.



Context! Numbers need to be in context!



I'm looking at this pic and asking myself: what is the unconditional death rate for each of these categories; i.e. if you're 80 today in China, how likely is it you don't reach March 15, 2021, by all causes?

Because that'd be relevant context, I think.



Estimates vs decisions: why some smart people did the wrong thing regarding Covid-19



On a side note, while some people choose to lock themselves at home for social distancing, I prefer to find places outdoors where there's no one else. For example: a hike on the Eastern span of the Bay Bridge, where I was the only person on the 3.5 km length of the bridge (the only person on the pedestrian/bike path, that is).




How "Busted!" videos corrupt formerly-good YouTube channels


Recently saw a "Busted!" video from someone I used to respect and another based on it from someone I didn't; I feel stupider for having watched the videos, even though I did it to check on a theory. (Both channels complain about demonetization repeatedly.) The theory:


Many of these "Busted!" videos betray a lack of understanding (or fake a lack of understanding for video-making reasons) of how the new product/new technology development process goes; they look at lab rigs or technology demonstrations and point out shortcomings of these rigs as end products. For illustration, here's a common problem (the opposite problem) with media portrayal of these innovations:


It's not difficult to "Bust!" media nonsense, but what these "Busted!" videos do is ascribe the media nonsense to the product/technology designers or researchers, to generate views, comments, and Patreon donations. This is somewhere between ignorance/laziness and outright dishonesty.

In the name of "loving science," no less!



Johns Hopkins visualization makes pandemic look worse than it is



Not to go all Edward Tufte on Johns Hopkins, but the size of the bubbles on this site makes the epidemic look much worse than it is: Spain, France, and Germany are completely covered by bubbles, while their cases are
0.0167 % for Spain
0.0070 % for Germany
0.0067 % for France
of the population.



Cumulative numbers increase; journalists flabbergasted!



At some point someone should explain to journalists that cumulative deaths always go up, it's part of the definition of the word "cumulative." Then again, maybe it's too quantitative for some people who think all numbers ending in "illions" are the same scale.



Stanford Graduate School of Education ad perpetuates stereotypes about schools of education


If this is real, then someone at Stanford needs to put their ad agency "in review." (Ad world-speak for "fired with prejudice.")





Never give up; never surrender.


- - - - -
* All.

Friday, March 6, 2020

Fun with numbers (and other geekage) for March 6, 2020

More collected tweeterage and other social media detritus.


MSNBC doesn't care about getting numbers right


And water is wet and fire burns... Okay, this one is particularly egregious. It starts on twitter, with a person who doesn't understand the difference between millions and trillions:


But then, Brian Williams and NYT Editorial Board member Mara Gay put it up in a discussion of Bloomberg's failed presidential bid, and agree with it (video here):


The problem here isn't so much that anchors and producers at MSNBC can't do this basic math, it's that they don't care enough about getting the numbers right to ask a fact-checker to check them. Note that they had the graphic made in advance, and this was a scripted segment, so they didn't just extemporize and make an error. They didn't care enough about the numbers to check them.

And, given their response, they still don't care. This is sad.



A puzzle that's going around, solved correctly


Saw this on Twitter, and a lot of snark with it:


Apparently some people have difficulty with this puzzle, drawing a line in B that's parallel to the bottom of the bottle (perhaps they think the water is frozen?). But many of the people who mock those who draw that parallel line draw a horizontal line that is too low, creating a triangle.

Here's the correct solution:


As with all math problems, even very simple ones like this, the right approach is to do the math, not to try to guess and hand-wave your way to a probably-wrong solution.



In their haste to badmouth Millennials, finance researchers misstate their results


I saw this "Millennials are bad with money" article on Yahoo Finance, got the original report (PDF), and found a glaring problem with their data. (The table notes make it clear they're saying a conjunction, 'AND,' not a 'GIVEN THAT' conditional.)


My guess is that despite the table notes and the 'AND,' what they're measuring is the proportion of people who answered the three questions correctly GIVEN THAT they self-described as having high finance literacy, I.O.W. that's 19% of the 62%, not 19% of the 9041 Millennials. That would make the population in the conjunction 1065, whereas the number of people who got the three right answers is 1447; so about 4% of Millennials are money-smart[ish] but think they aren't.
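Under that 'GIVEN THAT' reading, the arithmetic works out as follows. The 9,041 respondents, 62%, 19%, and 1,447 figures are the report's numbers; the rest is derived:

```python
respondents = 9041
self_assessed_high = 0.62       # fraction self-describing high financial literacy
correct_given_high = 0.19       # report's 19%, read as a conditional
all_three_correct = 1447        # respondents who answered all three correctly

smart_and_confident = round(correct_given_high * self_assessed_high * respondents)
smart_but_not_confident = all_three_correct - smart_and_confident
print(f"high self-assessed literacy AND all three correct: {smart_and_confident}")
print(f"all three correct but not self-assessed high: {smart_but_not_confident} "
      f"({smart_but_not_confident / respondents:.1%} of Millennials)")
```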

But if you're going to get snarky about other people's issues with money, maybe write your tables and table notes a bit more carefully…

About the financial literacy of Millennials, these were the three multiple-choice questions:
  1. Suppose you had $\$100$ in a savings account, and the interest rate was 2% per year. After 5 years, how much do you think you would have in the account if you left the money to grow? Answers: a) More than $\$102$; b) Exactly $\$102$; c) Less than $\$102$; d) Do not know; e) Refuse to answer.
  2. Imagine that the interest rate on your savings account was 1% per year and inflation was 2% per year. After 1 year, how much would you be able to buy with the money in this account? Answers: a) More than today; b) Exactly the same; c) Less than today; d) Do not know; e) Refuse to answer.
  3. Please tell me whether this statement is true or false. “Buying a single company’s stock usually provides a safer return than a stock mutual fund.” Answers: a) True; b) False; c) Do not know; d) Refuse to answer.
These questions are extremely simple, which makes the low incidence of correct answers troubling.



Science illustration lie factor: 71 million


How bad can science illustrations get? Let's ask the Daily Express from the UK:


We don't need to calculate to see that that meteor is much larger than 4.1 km, but if we do calculate (I did), we realize they exaggerated the volume of that meteor by just a hair under SEVENTY-ONE MILLION-FOLD:
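Since diameters scale as the cube root of volumes, a 71-million-fold volume exaggeration pins down the linear exaggeration too. A quick check, taking the 71 million figure and the 4.1 km size from the text:

```python
volume_lie_factor = 71e6
linear_lie_factor = volume_lie_factor ** (1 / 3)  # diameter scales as cube root of volume
true_diameter_km = 4.1                            # the asteroid's reported size
depicted_diameter_km = true_diameter_km * linear_lie_factor

print(f"linear exaggeration: {linear_lie_factor:.0f}x")
print(f"implied depicted diameter: {depicted_diameter_km:.0f} km")
```

In other words, the illustration depicts something the size of a dwarf planet, not a 4.1 km rock.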


To put that lie factor into perspective, here's the Harvester Mothership from Independence Day: Resurgence, which has only a lie factor of 50 (linear, because that's the dimensionality of the problem here):




Fun with our brains: the Stroop interference test


From a paper on the effect of HIIT and keto on BDNF production and cognitive performance, shared on Twitter by intermittent fasting and low carb advocate (and responsible for at least 50% of my fat loss) P.D. Mangan, we learn that people with metabolic syndrome show improvement in their cognitive executive function when on a ketogenic diet, and even more if interval training is added.

To measure cognitive executive function they use a Stroop interference test, which is a fun example of our brains' limitations, so here's an example:


The test compares the speed with which participants can state the colors of the words in the columns inside the box: on the left the color and the word are congruent (the word is the name of the color of the text for that word), on the right the color and the word are incongruent (the word is the name of a color, but not the color of the text for that word).

Other than color-blind people, almost everyone takes less time and makes fewer mistakes with the congruent column than with the incongruent one. That's because the brain's CEO (executive function) has to suppress the reading and process the color in the incongruent case. This is easy to see by comparing the test with the two extras: speed on the incongruent column is about the same as that of reading the words in the Extra 1 column, while the speed of stating the colors of the Extra 2 column is much faster (and less error-prone) than on the incongruent column.

(The paper also measures BDNF, the chemical usually associated with better executive function, directly, by drawing blood and doing an ELISA test; but it's interesting to know that diet and exercise may make you a more disciplined thinker and to see that in the numbers for an actual executive function test, not just the serum levels.)




Technically, Target isn't lying, it's 4 dollars off



But I've never seen that $\$$11.99 'regular' price for this coffee, which would make it the only coffee in the entire aisle not to have a regular price of $\$$9.99. All the other sale signs say 'Save $\$$2,' for what it's worth…



Destin 'Smarter Every Day' Sandlin visits a ULA rocket factory



And, on twitter, ULA CEO Tory Bruno gets a dig into SpaceX's Texas operations:




Live long and prosper!

Sunday, March 1, 2020

Fun with COVID-19 Numbers for March 1, 2020

NOTA BENE: The coronavirus disease COVID-19 is a serious matter and we should be taking all reasonable precautions to minimize contagion and stay healthy. But there's a lot of bad quantitative thinking that's muddling the issue, so I'm collecting some of it here.


Death Rate I: We can't tell, there's no good data yet.


This was inspired by a tweet by Ted Naiman, MD, whose Protein-to-Energy ratio analysis of food I credit for at least half of my weight loss (the other half I credit P. D. Mangan, for the clearest argument for intermittent fasting, which convinced me); so this is not about Dr Naiman's tweet, just that his was the tweet I saw with a variation of this proposition:

"COVID-19 is 'like the flu,' except the death rate is 30 to 50 times higher."

But here's the problem with that proposition: we don't have reliable data to determine that. Here are two simple arguments that cast some doubt on the proposition:

⬆︎ How the death rate could be higher: government officials and health organizations under-report the number of deaths in order to contain panic or to minimize criticism of themselves; it's also possible that some deaths from COVID-19 are attributed to conditions that were aggravated by COVID-19, for example being reported as deaths from pneumonia.

⬇︎ How the death rate could be lower: people with mild cases of COVID-19 don't report them and treat themselves with over-the-counter medication (to avoid getting taken into forced quarantine, for example), hence there's a bias in the cases known to the health organizations, towards more serious cases, which are more likely to die.

How much we believe the first argument applies depends on how much we trust the institutions of the countries reporting, and... you can draw your own conclusions!

To illustrate the second argument, consider the incentives of someone with flu-like symptoms and let's rate their seriousness or aversiveness, $a$, as a continuous variable ranging from zero (no symptoms) to infinity (death). We'll assume that the distribution of $a$ is an exponential, to capture thin tails, and to be simple let's make its parameter $\lambda =1$.

Each sick patient will have to decide whether to seek treatment other than over-the-counter medicine, but depending on the health system that might come with a cost (being quarantined at home, being quarantined in "sick wards," for example); let's call that cost, in the same scale of aversiveness, $c$.

What we care about is how the average aversiveness that is reported changes with $c$. Note that if everyone reported their $a$, that average would be $1/\lambda = 1$, but what we observe is a self-selected subset, so we need $E[a | a > c]$, which we can compute easily, given the exponential distribution, as

\[
E[a | a > c]
=
\frac{\int_{c}^{\infty} a \, f_A(a) da }{1 - F_A(c)}
=
\frac{\left[ - \exp(-a)(a+1)\right]^{\infty}_{c}}{\exp(-c)}
= c + 1
\]
Note that the probability of being reported is $\Pr(a>c) = \exp(-c)$, so as the cost of reporting goes up, a vanishingly small percentage of cases are reported, but their severity increases [linearly, but that's an artifact of the simple exponential] with the cost. That's the self-selection bias in the second argument above.

A plot for $c$ between zero (everyone reports their problems) and 5 (the cost of reporting is so high that only the sickest 0.67% risk reporting their symptoms to the authorities):


Remember that for all cases in this plot the average aversiveness/seriousness doesn't change: it's fixed at 1, and everyone has the disease, with around 63% of the population having less than the average aversiveness/seriousness. But, if the cost of reporting is, for example, equal to twice the aversiveness of the average (in other words, people dislike being put in involuntary quarantine twice as much as they dislike the symptoms of the average seriousness of the disease), only the sickest 13.5% of people will look for help from the authorities/health organizations, who will report a seriousness of 3 (three times the average seriousness of the disease in the general population).*
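The $E[a | a > c] = c + 1$ result and the percentages above are easy to check by simulation; a minimal pure-Python Monte Carlo sketch (so the match is only to a couple of decimal places):

```python
import random

random.seed(42)
# aversiveness a ~ Exp(lambda = 1) for everyone in the population
samples = [random.expovariate(1.0) for _ in range(1_000_000)]

for c in (0.0, 1.0, 2.0):
    reported = [a for a in samples if a > c]   # only those with a > c seek treatment
    frac = len(reported) / len(samples)        # should be exp(-c)
    mean = sum(reported) / len(reported)       # should be c + 1
    print(f"c = {c}: {frac:.1%} report, mean reported severity = {mean:.2f}")
```

At c = 2 the simulation reproduces the example in the text: roughly 13.5% of the sick report, with a mean reported severity near 3.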

With mixed incentives for all parties involved, it's difficult to trust the current reported numbers.


Death Rate II: Using the data from the Diamond Princess cruise ship.


A second endemic problem is arguing about small differences in the death rate, based on small data sets. Many of these differences are indistinguishable statistically, and to be nice to all flavors of statistical testing we're going to compute likelihood ratios, not rely on simple point estimate tests.

The Diamond Princess cruise ship is as close as one gets to a laboratory experiment in COVID-19, but there's a small numbers problem. In other words we'll get good estimates when we have large scale, high-quality data. Thanks to @Clarksterh on Twitter for the idea.

Using data from Wikipedia for Feb 20, there were 634 confirmed infections (328 asymptomatic) aboard the Diamond Princess, and as of Feb 28 there were 6 deaths among those infected. The death rate is 6/634 ≈ 0.0095.

(The ship's population isn't representative of the general population, being older and richer, but that's not what's at stake here. This is about fixating on the point estimates and small differences thereof. There's also a delay between the diagnosis and the death, so these numbers might be off by a factor of two or three.)

What we're doing now: using $d$ as the death rate, $d = 0.0095$ is the maximum likelihood estimate, so it will give the highest probability for the data, $\Pr(\text{6 dead out of 634} | d = 0.0095)$. Below, we calculate and plot the likelihood ratio between that probability and the computed probability of the data for other candidate death rates, $d_i$.**

\[LR(d_i) = \frac{\Pr(\text{6 dead out of 634} | d = 0.0095)}{\Pr(\text{6 dead out of 634} | d = d_i)}\]


We can't reject any rates between 0.5% and 1.5% with any confidence (okay, some people using single-sided point tests with marginal significance might narrow that a bit, but let's not rehash old fights here), and that's a three-fold range. And there are still a lot of issues with the data.

On the other hand...

It's easy to see that the COVID-19 death rate is much higher than that of the seasonal flu (0.1%): using the data from the Diamond Princess, $LR(0.001) =  3434.22$, which should satisfy both the most strong-headed frequentists and Bayesians that these two rates are different. Note that $LR(0.03) = 510.01$, which shows that the Diamond Princess data also rules out a 3% death rate. (Again, noting that the numbers might be off by a factor of two or three in either direction due to the delay in diagnosing the infection and between diagnosis and recovery or death.)
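These likelihood ratios can be reproduced in a few lines using the binomial likelihood from the footnote; since the big combinatorial term cancels in the ratio, we only need the $d^6 (1-d)^{628}$ part, computed in logs for numerical comfort:

```python
from math import exp, log

def log_likelihood(d, deaths=6, infections=634):
    # log of d^deaths * (1 - d)^(infections - deaths); the C(634, 6)
    # term is common to numerator and denominator and cancels in the ratio
    return deaths * log(d) + (infections - deaths) * log(1 - d)

d_mle = 6 / 634                        # maximum likelihood estimate, ~0.0095
for d in (0.001, 0.005, 0.015, 0.03):
    lr = exp(log_likelihood(d_mle) - log_likelihood(d))
    print(f"LR({d}) = {lr:,.2f}")
```

Candidate rates near the MLE give ratios close to 1 (can't be rejected); 0.1% and 3% give ratios in the hundreds or thousands.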

As with most of these analyses, disaggregate clinical data will be necessary to establish these rates, which we're estimating from much less reliable [aggregate] epidemiological data.



Stay safe: wash hands, don't touch your face, avoid unnecessary contact with other people. 



- - - - - 

* A friend pointed out that there are some countries or subcultures where hypochondria is endemic and that would lead to underestimation of the seriousness of the disease; this model ignores that, but anecdotally I've met people who get doctor's appointments because they have DOMS and want the doctor to reassure them that it's normal, prescribe painkillers and anti-inflammatories, and other borderline psychotic behavior...


** We're just computing the binomial here, no assumptions beyond that:

$\Pr(\text{6 dead out of 634} | d = d_i) = C(634,6) \, d_i^6 (1-d_i)^{628}$,

and since we use a ratio the big annoying combinatorials cancel out.

Thursday, February 27, 2020

Learning and understanding technical material – some thoughts

Learning technical material


From my YouTube subscriptions, the image that inspired all this:


Ah, MIT teaching, where professors get former students who they consult for/with to teach all their classes, while still getting their teaching requirement filled…

(For what it's worth, students probably get better teaching this way, given the average quality of MIT engineering professors' teaching.)

These are not the typical MIT/Stanford/Caltech post-docs or PhD students teaching the classes of their Principal Investigators or Doctoral Advisors. These are business associates of Tom Eagar, who get roped into teaching his class "as an honor." (In other words, for free.)

Note that there is such a thing in academia as "organizing a seminar series," which some professors do (for partial teaching credit), formally different from "teaching a class" (full teaching credit). Doing the former for the credit of the latter… questionable, but sadly common in certain parts of academe.

On the other hand, as most MIT faculty and students will confirm, technical learning is 0.1% lectures, 0.9% reading textbook/notes, 9% working through solved examples, 90% solving problem sets, so all this "who teaches what" is basically a non-issue. (These numbers aren't precise estimates, just an orders-of-magnitude reference used at MIT.)


That's probably the major difference between technical fields and non-technical fields, that all the learning (all the understanding, really) is in the problem-solving. Concepts, principles, and tools only matter inasmuch as they are understood to solve problems.

(Sports analogy: No matter how strong you are, no matter how many books you read and videos you watch about handstand walks, the only way to do handstand walks is to get into a handstand, then "walk" with your hands.)

Which brings us to the next section:


Understanding technical material


There are roughly five levels of understanding technical material, counting 'no knowledge or understanding at all' as a level; the other four are illustrated in the following picture:


The most basic knowledge is that the phenomenon exists, perhaps with some general idea of its application. We'll be using gravity as the example, so the lowest level of understanding is just knowing that things under gravity, well, fall.

This might seem prosaic, but in some technical fields one meets people whose knowledge of the technical material in the field is limited to knowing the words but not their meaning; sometimes these people can bluff their way into significant positions simply by using a barrage of jargon on unsuspecting victims, but generally can be discovered easily by anyone with deeper understanding of the material.

A second rough level of knowledge and understanding is a conceptual or qualitative understanding of a field; this is the type of understanding one gets from reading well-written and correct mass-market non-fiction. In other words, an amateur's level of understanding, which is fine for amateurs.

In the case of gravity this would include things like knowing that gravity is different on different planets, that there's some relationship with the mass of the planet, and that on a given planet objects of different masses fall at the same rate (with some caveats regarding friction and fluid displacement forces).

The big divide is between this qualitative level of understanding (which in technical fields is for amateurs, though it's also the level some professionals decay to by not keeping up with the field and not keeping their learned skills sharp) and the level at which a person can operationalize the knowledge to solve problems.

Operational understanding means that we can solve problems using the material. For example, we can use the formula $d= 1/2 \, g \, t^2$ to determine that a ball bearing falling freely will drop 4.9 m in the first second. We can also compute the equivalent result for the Moon, using $g_{\mathrm{Moon}} = g/6$, so on the Moon the ball bearing would only fall 82 cm in the first second.
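A minimal sketch of those two computations:

```python
g_earth = 9.8                  # m/s^2, Earth surface gravity
g_moon = g_earth / 6           # Moon gravity, roughly 1/6 of Earth's

def drop_distance(g, t):
    # distance fallen from rest in time t: d = (1/2) g t^2
    return 0.5 * g * t ** 2

print(f"Earth, first second: {drop_distance(g_earth, 1):.2f} m")       # 4.90 m
print(f"Moon,  first second: {drop_distance(g_moon, 1) * 100:.0f} cm")  # 82 cm
```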

This level of understanding is what technical training (classes, textbooks, problem sets, etc) is for. It's possible to learn by self-study, of course, since that's a component of all learning (textbooks were the original MOOCs), but the only way to have real operational understanding is to solve problems.

There's a level of understanding beyond operational, typically reserved for people who work in research and development, or the people moving the concepts, principles, and tools of the field forward. Since that kind of research and development needs a good understanding of the foundations of (and causality within) the field, I chose to call it deep understanding, but one might also call it causal understanding. Such an understanding of gravity would come from doing research and reading and publishing research papers in Physics, rather than applying physics to solve, say, engineering problems.


An example: Sergei Krikalev, the time-traveling cosmonaut


The difference between qualitative understanding and operational understanding can be clarified with how each level processes the following tweet:


More precise data can be obtained from the linked article and that's what we'll use below.*

Qualitative understanding: Special Relativity says that when people are moving their time passes slower than that of people who are stationary; the 0.02 seconds in the tweet come from the ISS moving around the Earth very fast.

(There are a lot of issues with that explanation; for example: from the viewpoint of Krikalev the Earth was moving while he was stationary, so why is Krikalev, instead of the Earth, in the future? Viascience explains this apparent paradox here.)

Operational understanding: the time dilation experienced in a frame moving at speed $v$ relative to a reference frame is given by $\gamma(v) = (1 - (v/c)^2)^{-1/2}$. The ISS moves at approximately 7700 m/s, so that dilation is $\gamma(7700) = 1.00000000032939$. When we apply this dilation to the total time Krikalev spent at the ISS (803 days, 9 hours, and 39 minutes = 69,413,940 s) we get that an additional 0.0228642576966 seconds passed on Earth during that time.

Because we have operational understanding of time dilation, we could ask how much in the future Krikalev would have traveled at faster speeds (not on the ISS, since its orbit determines its speed). We can see that if Krikalev had moved at twice the ISS speed, he'd have been 0.0914570307864 seconds younger. At ten times the speed, 2.2864181341266 seconds younger. And at 10,000 times the speed – over 25% of the speed of light – almost 28 days younger.

As a curiosity, we can use that $\gamma(7700)$ to compute kinetic energy, $E_k(v) = (\gamma(v)-1) \, mc^2$, or more precisely, since we don't have the mass, the specific energy, $E_k(v)/m = (\gamma(v)-1) \, c^2$. At its speed of 7.7 km/s the ISS and its contents have the specific energy of ethanol (30 MJ/kg) or seven times that of an equivalent mass of TNT.
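These computations, as a short script. It uses the rounded v = 7700 m/s, so the last digits differ slightly from the more precise figures quoted above:

```python
from math import sqrt

c = 299_792_458.0                             # speed of light, m/s
mission_s = ((803 * 24 + 9) * 60 + 39) * 60   # 803 d 9 h 39 min, in seconds

def gamma(v):
    # special-relativistic time dilation factor
    return 1 / sqrt(1 - (v / c) ** 2)

v_iss = 7700.0                                # approximate ISS orbital speed, m/s
for mult in (1, 2, 10):
    dt = (gamma(mult * v_iss) - 1) * mission_s
    print(f"{mult}x ISS speed: {dt:.4f} s into the future")

# specific kinetic energy at ISS speed, J/kg: E_k/m = (gamma - 1) c^2
e_specific = (gamma(v_iss) - 1) * c ** 2
print(f"specific energy: {e_specific / 1e6:.0f} MJ/kg")
```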

To say that one understands technical material without being able to solve problems with that understanding is like saying one knows French without being able to speak, read, write, or understand French speech or text. Sacré bleu!

The application is what counts.


- - - - -
* The article also refers to the effect of gravity, noting that it's too small to make any difference (Earth gravity at the ISS's average altitude of ~400 km is 89% of surface gravity; both are weak enough that the General Relativity effect of gravity slowing down time has no measurable impact on Krikalev or, for that matter, on anyone on Earth).

Wednesday, February 19, 2020

Seriously, pulling a Mensa card?

Some thoughts on IQ testing, inspired by someone who pulled an "I have a higher IQ than you" on scifi author TJIC (read his books if you like hard scifi: first, second) and then pulled — I kid you not — a Mensa card. An actual Mensa card.


Setting aside the obvious logical fallacy of "I have a high IQ, therefore what I say is right" (itself evidence of not engaging one's intelligence), there's something funny about claims that IQ, as measured by tests designed for the mass of the population, is somehow a measure of the ability to think about complex or difficult issues.

Note that for mass testing purposes the IQ test as designed is useful, for reasons that will become clear below.

The tests typically consist of a number of simple problems: pattern matching or other tasks of low algorithmic complexity and low computational complexity. This works well to separate people whose intelligence ranges from zero to a standard deviation or two above the mean. In other words, this type of testing separates people who will have serious difficulties, mild difficulties, or no difficulties following basic education (say, up to high school), and identifies people who can do well in education beyond the basics if they choose to.

Because some of the people who do well in these tests go on to do well in situations with high algorithmic and/or computational complexity, IQ metrics (or proxies thereof like the SAT) are used as one of the tools in selection for jobs or education that include such tasks, such as STEM education and jobs.

(Note that it is possible for someone to do badly in IQ tests and still do well in tasks with high algorithmic and/or computational complexity, though that tends to be unlikely and generally happens due to considerations orthogonal to actual intellectual capabilities.)

Because some of the people who do well in IQ tests don't do well once the algorithmic and/or computational complexity increase, using IQ measures as the sole selection tool would be a bad idea; which is why most recruiters look at school transcripts, relevant achievements (like code on Github, Ramanujan's notebooks, billionaire parents*), and other metrics.

The people who do well in IQ tests but not so well in more complex tasks tend to be the ones who join Mensa, which is why it's so funny that anyone would think showing a Mensa card means anything.

Oh, a small thing, though...

The tasks in these tests, themselves, tend to be, well... there's really no nice way to say this, wrong. Just wrong.

Other than word-analogy tests (A : B :: C : ?), which measure vocabulary fluency above all else, pretty much all the pattern-matching tasks in these tests can be coded as "what's the next vector in this sequence of vectors of numbers?", to which anyone with a basic understanding of mathematics would answer "a vector of appropriate dimension with any numbers you want."

For example, consider the following sequence: 1, 1, 2, 3, 5, 8. Which is the next number in the sequence?

It's $e^{\pi^3}$.

Clearly!

That's because that sequence is clearly an enumeration in increasing order of the zeros of the following polynomial:

$(x-1)^2 \, (x-2) \, (x-3) \, (x-5) \, (x-8) \, (x - e^{\pi^3})$.

How about the sequence 1, 1, 1, 1, 1, what number comes next?

Clearly the next number is 5. This is the well-known Cinconacci sequence (five ones followed by the sum of the previous five numbers), after the Tribonacci (three ones followed by the sum of the previous three numbers) and Fibonacci (two ones followed by the sum of the previous two numbers) sequences. The Quatronacci sequence is left as an exercise to the reader.
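The whole family generalizes mechanically; a quick sketch (the function name k_nacci is mine):

```python
def k_nacci(k, n):
    """First n terms of the k-nacci sequence: k ones, then each
    subsequent term is the sum of the previous k terms."""
    seq = [1] * min(k, n)
    while len(seq) < n:
        seq.append(sum(seq[-k:]))
    return seq

print(k_nacci(2, 8))  # Fibonacci:  [1, 1, 2, 3, 5, 8, 13, 21]
print(k_nacci(5, 8))  # Cinconacci: [1, 1, 1, 1, 1, 5, 9, 17]
```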

By the way, only people with limited imagination think that the sequence 1, 1, 2, 3, 5, 8 above could only be the beginning of the Fibonacci sequence. That's a cultural bias towards a specific sequence out of an infinity of possible sequences. (A big infinity, at that: $2^{\aleph_0}$, the cardinality of the continuum.)
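The point that any continuation whatsoever can be justified is easy to make constructively with Lagrange interpolation: fit a polynomial through the six given terms and whatever seventh value you like. A sketch in exact rational arithmetic (names are mine):

```python
from fractions import Fraction

def interpolating_poly(points):
    """Return the Lagrange interpolating polynomial through the given
    (x, y) points, as a callable using exact rational arithmetic."""
    def p(x):
        total = Fraction(0)
        for i, (xi, yi) in enumerate(points):
            term = Fraction(yi)
            for j, (xj, _) in enumerate(points):
                if i != j:
                    term *= Fraction(x - xj, xi - xj)
            total += term
        return total
    return p

# The "IQ test" sequence, continued by 42 just because we can.
seq = [1, 1, 2, 3, 5, 8]
p = interpolating_poly(list(enumerate(seq, start=1)) + [(7, 42)])
print([int(p(x)) for x in range(1, 8)])  # [1, 1, 2, 3, 5, 8, 42]
```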

Note that a smart test-taker will realize that in the infinity of sequences there are some sequences the people who write the test believe are the "right" ones, so a smart test-taker will choose those, thus using both the ability to recognize patterns and the perception of test makers' intellectual limitations.

This is not to say that the tests are useless per se; beyond being good at the separation of the lower levels of thinking ability, they measure the ability to follow instructions and to concentrate on a task for some time, both of which are important as they measure brain executive function.**

But, as I've told many a recruiter (as a consultant to the recruiter, not as a candidate), if you want to know whether a candidate can write code or solve math problems, don't bother them with puzzles; give them a coding task or a math problem.

Alas, puzzle interviews have grown to mythological status, so they're here to stay.


- - - - -

* Or as they call them at Hahvahd admissions, high-potential donors.

** There's an old recruitment test, no longer used, with an instruction sheet and a worksheet. The top of the instruction sheet said in large type "read through all the instructions before beginning," then proceeded in regular type with instructions like "1 - draw a line in the worksheet, diagonally from top left to bottom right," and so on for another nine similar instructions; at the bottom of the page it said "turn page to continue" and on the back it said, again in large type, "don't follow instructions 1-10; just write your name in the center of the worksheet and hand that in."

A significant number of people failed the test by doing tasks 1-10 as they read them, ignoring the "read all instructions before beginning" command at the top. This test is no longer used because (a) it's too well-known and (b) people who fail it never want to accept that it's their fault for not following the main instruction to read all instructions before beginning.



AFTERTHOUGHT:

My IQ, you ask? I'm pretty sure, say with 99% probability, that it falls somewhere between 50 and 500. On a good day, of course.


Wednesday, February 12, 2020

Contagion, coronavirus, and charlatans

This post is an illustration of a simple epidemiological model and why some of the ad-hoc modeling of coronavirus that some charlatans are spreading on social media platforms is a nonsensical distraction.


Math of contagion: the SIR-1 model


A simple model for infectious diseases, the SIR-1 model (also known as the Kermack-McKendrick model), is too simple for the coronavirus, but captures some of the basic behavior of any epidemic.

The model uses a fixed population, with no deaths, no natural immunity, no latent period for the disease (when a person is exposed but not infectious; not to be mistaken for what happens with the coronavirus, where people are infectious but asymptomatic), and a simple topology (the population is in a single homogeneous pool, instead of different cities and countries sparsely connected).

There are three states that a given individual can be in: susceptible (fraction of the population in this state represented by $S$), infectious (fraction represented by $I$), and recovered (fraction represented by $R$); recovered means immune, so there isn't recurrence of an infection.

There are two parameters: $\beta$, the contagiousness of the disease, and $\gamma$, the recovery rate. To illustrate using discretized time, $\beta= 0.06$ means that any infectious individual has a 6% chance of infecting another individual in the next period (say, a day); $\gamma= 0.03$ means that any infectious individual has a 3% chance of recovering in the next period.

The dynamics of the model are described by three differential equations:

$\dot S = - \beta S I$;
$\dot I = (\beta S - \gamma) I$;
$\dot R = \gamma I$.

The ratio $R_0 = \beta/\gamma$ is critical to the behavior of an epidemic: if it's lower than one, the infection dies off without noticeable expansion; if it's much higher than one, it becomes a large epidemic.

These differential equations have no closed-form solution, but they're easy enough to simulate and to fit data to. Here are some results for a discretized, 200-period simulation for some values of the parameters $(\beta, \gamma)$, starting with an initial infected population of 1%.
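The simulation itself is a handful of lines; here's a sketch of a discretized (Euler, one step per period) version of the dynamics above (the code and names are mine, not the original used for the charts):

```python
def simulate_sir(beta, gamma, i0=0.01, periods=200):
    """Discretized SIR-1 dynamics: returns the S, I, R fraction
    trajectories over the given number of periods."""
    s, i, r = 1.0 - i0, i0, 0.0
    S, I, R = [s], [i], [r]
    for _ in range(periods):
        new_infections = beta * s * i
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        S.append(s); I.append(i); R.append(r)
    return S, I, R

# The first chart's parameters: R0 = 0.06 / 0.03 = 2.
S, I, R = simulate_sir(0.06, 0.03)
peak = max(I)
print(f"peak infectious fraction: {peak:.3f} at period {I.index(peak)}")
```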

First, a model with an $R_0=2$, illustrating the three processes:


Note that although a large percentage of the population is eventually infected (for $R_0 = 2$, running the model to convergence infects roughly 80% of the population), the number of people infectious at a given time (and presumably also feeling the symptoms of the disease) is much lower, and this is a very important metric, as the number of people sick at a given time determines how effectively health providers can deal with the disease.
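The fraction eventually infected can be checked against the standard SIR final-size relation, $R_\infty = 1 - e^{-R_0 R_\infty}$ (for an almost fully susceptible starting population), which fixed-point iteration solves in a few lines:

```python
import math

def final_size(r0, iters=200):
    """Fraction of the population ever infected, from the SIR final-size
    relation R_inf = 1 - exp(-r0 * R_inf). Start above zero so the
    iteration doesn't stick at the trivial root R_inf = 0."""
    r = 0.9
    for _ in range(iters):
        r = 1.0 - math.exp(-r0 * r)
    return r

print(f"R0 = 2:  {final_size(2):.3f}")    # roughly 0.8, not 100%
print(f"R0 = 24: {final_size(24):.9f}")   # essentially everyone
```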

Next, a model of runaway epidemic (the $R_0 = 24$ is beyond any epidemic I've known; used here only to make the point in a short 200 periods):


In this case, the number of sick people grows very fast, which makes it difficult for the health system to cope with the disease, plus the absence of the sick people from the workforce leads to second-order problems, including stalled production, insufficient logistics to distribute needed supplies, and lack of services and support for necessary infrastructure.

Finally, a model closer to non-epidemic diseases, like the seasonal flu (as opposed to epidemic flu), though the $(\beta,\gamma)$ are too high for that disease; this was necessary for presentation purposes, in order to make the 200-period chart more than three flat lines.


Note how low the number of people infected at any time is. This is why these things tend to die off instead of growing into epidemics: once people start taking precautions, $\beta$ becomes smaller than $\gamma$, which leads to $R_0 < 1$, the condition for the disease to eventually die off.


The problem with estimating ad-hoc models


One of the problems with ignoring the elements of these epidemiological models and calibrating statistical models on early data can be seen when we take the first example above ($\beta=0.06,\gamma=0.03$) and use the first 50 data points to calibrate a statistical model for forecasting the evolution of the epidemic:


As a general rule of thumb, models for processes that follow a S-shaped curve are extremely difficult to calibrate on early data; any data set that doesn't extend at least some periods into the concave region of the model is going to be of questionable value, especially if there are errors in measurement (as is always the case).
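That rule of thumb is easy to demonstrate even without any measurement error: simulate the first example above, fit a log-linear (i.e. exponential-growth) model on the first 50 periods of cumulative infections only, and extrapolate to period 200. A sketch (all names mine; it re-derives the cumulative series from the SIR-1 dynamics):

```python
import math

def cumulative_infected(beta, gamma, i0=0.01, periods=200):
    """Cumulative infected fraction (I + R) from a discretized SIR-1 run."""
    s, i, r = 1.0 - i0, i0, 0.0
    cum = [i + r]
    for _ in range(periods):
        new_infections = beta * s * i
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        cum.append(i + r)
    return cum

cum = cumulative_infected(0.06, 0.03)

# Least-squares fit of log(cum) against time, using ONLY the first 50
# periods -- all of them in the convex region of the S-curve.
n = 50
xs = list(range(n))
ys = [math.log(cum[t]) for t in xs]
xbar, ybar = sum(xs) / n, sum(ys) / n
slope = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
         / sum((x - xbar) ** 2 for x in xs))
intercept = ybar - slope * xbar

forecast_200 = math.exp(intercept + slope * 200)
print(f"actual cumulative fraction at t=200: {cum[200]:.3f}")
print(f"exponential-fit 'forecast' at t=200: {forecast_200:.3f}")
```

The fitted forecast exceeds the entire population, which is of course impossible; that's the nonsensical-prediction failure mode in its purest form.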

Consider that this estimation failure happens with the simplest model (SIR-1), without the complexities of topology (multiple populations in different locations, each with a $(\beta,\gamma)$ of its own, connected by a network of transportation with different levels of quarantine and preventative measures, etc.), possible obfuscation of some data due to political concerns, misdiagnosis and under-reporting due to latency, changes to $\beta$ and $\gamma$ as people's behavior and health services adapt, and many other complications of a real-world epidemic, including second-order effects on health services and essential infrastructure, which change people's behavior as well.

No, that forecasting error comes simply from that rule of thumb, that until the process passes the inflection point, it's almost certain that estimates based on aggregate numbers (as opposed to clinical measures of $\beta$ and $\gamma$, based on analysis of clinical cases; these are what epidemiologists use, by the way) will give nonsensical predictions.

But those nonsensical predictions get retweets, YouTube video views, and SuperChat money.