Si Tacuisses, Philosophus Mansisses: January 2017

Saturday, January 28, 2017

Learning, MOOCs, and production values

Some observations from binge-watching a Nuclear Engineering 101 course online.

Yes, the first observation is that I am a science geek. Some people binge-watch Kim Cardassian, some people binge-watch Netflix, some people binge-watch sports; I binge-watch college lectures on subjects that excite me.

(This material has no applicability to my work. Learning this material is just a hobby, like hiking, but with expensive books instead of physical activity.)

To be fair, this course isn't a MOOC; these are lectures for a live audience, recorded for students who missed class or want to go over the material again.

The following is the first lecture of the course. To complicate things, there are several different courses from UC-Stalingrad with the exact same name: different years of this course, taught by different people. So kudos for the laziness of not even using a playlist for each course. At least IHTFP does that.

(It starts with a bunch of class administrivia; skip to 7:20.)

Production values in 2013, University of California, Berkeley

To be fair: for this course. There are plenty of other UC-Leningrad courses online with pretty good production values. But they're usually on subjects I already know or have no interest in.

PowerPoint projections of scans of handwritten notes; maybe even acetate transparencies. In 2013, in a STEM department of a major research university. Because teaching is, er…, an annoyance?

The professor points out that there's an error in the slide, that the half-life of $^{232}\mathrm{Th}$ is actually $1.405 \times 10^{10}$ years; he could have corrected it before the class (by editing the slide) but decided to mention it in class instead, for reasons...?

The real problem with these slides isn't that handwriting is hard to read or that use of color can clarify things; it's the clear message to the students that preparing the class is a very low priority activity for the instructor.

A second irritating problem is that the video stream is a recording of the projection system, so when something is happening in the classroom there's no visual record.

For example, there was a class experiment measuring the half-life of excited $^{137}\mathrm{Ba}$, with students measuring radioactivity of a sample of $^{137}\mathrm{Cs}$ and doing the calculations needed to get the half-life (very close to the actual number).

For the duration of the experiment (several minutes), this is all the online audience sees:

Learning = 1% lecture, 9% individual study, 90% practice.

As a former and sometime educator, I don't believe in the power of lectures without practice, so when the instructor says something like "check at home to make sure that X," I stop the video and check the X.

For example, production of a radioactive species at a production rate $R$, with radioactive decay with constant $\lambda$, is described by the equation at the top of the highlighted area in the slide above; the instructor presents the solution at the bottom, "to be checked at home." So, I did:

Simple calculus, but makes for a better learning experience. (On a side note, using that envelope for calculations is the best value I've received from the United frequent flyer program in years.)
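For readers following along without the slide, here is a sketch of that kind of check. It assumes the slide's equation is the standard production-and-decay balance and that the sample starts with no radioactive nuclei; the symbol $N(t)$ for the number of nuclei and the initial condition $N(0) = 0$ are assumptions, not taken from the lecture.

\[ \frac{dN}{dt} = R - \lambda N, \qquad N(0) = 0 \quad \Longrightarrow \quad N(t) = \frac{R}{\lambda} \left( 1 - e^{-\lambda t} \right). \]

Substituting the candidate solution back into the equation confirms it:

\[ \frac{dN}{dt} = R \, e^{-\lambda t} = R - \lambda \times \frac{R}{\lambda} \left( 1 - e^{-\lambda t} \right) = R - \lambda N. \]

Under these assumptions $N(t)$ saturates at $R/\lambda$ for large $t$, the level at which decay balances production.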

This, doing the work, is the defining difference between being a passive recipient of entertainment and an active participant in an educational experience.

Two tidbits from the early lectures (using materials from the web):

Binding energy per nucleon explains why heavy atoms can be fissioned and light atoms can be fused but not the opposite (because the move is towards higher binding energy per nucleon):

The decay chains of Uranium $^{235}\mathrm{U}$ and Thorium $^{232}\mathrm{Th}$:

(Vertical arrows are $\alpha$ decay, diagonals are $\beta$ decay.)

Unfair comparison: The Brachistochrone video

It's an unfair comparison because the level of detail is much lower and the audience is much larger; but the production values are very high.

Or maybe not so unfair: before his shameful (for MIT) retconning out of the MIT MOOC universe, Walter Lewin had entire courses on the basics of Physics with high production values:

(I had the foresight to download all Lewin's courses well before the shameful retconning. Others have posted them to YouTube.)

Speaking of production values in education (particularly in Participant-Centered Learning), the use of physical props and audience movement brings a physicality that most instruction lacks and creates both a more immersive experience and longer-term retention of the material. From Lewin's lecture above:

Wednesday, January 25, 2017

Not all people who "love science" are like that

Yes, yet another rant against the "I Effing Love Science" crowd.

Midway through a MOOC lecture on nuclear decay I decided to write a post about production values in MOOCs (in my case not really a MOOC, just University lectures made available online). Then, midway through that post, I started to refine my usual "people who love science" vs "people who learn science" taxonomy; this post, preempting the MOOC post, is the result. Apparently my blogging brain is a LIFO queue (a stack).

Nerd, who, me?

I've posted several criticisms of people who "love science" but never learn any (for example here, here, here, and here; there are many more); but there are several people who do love science and therefore learn it. So here's a diagram of several possibilities, including a few descriptors for the "love science but doesn't learn science" crowd:

The interesting parts are the areas designated by the letters A, B, and C. There's a sliver of area where people who really love science don't learn science to capture the fact that some people don't have the time, resources, or access necessary to learn science, even these days. (In the US and EU, I mean; for the rest of the world that sliver would be the majority of the diagram, as many people who would love science have no access to water, electricity, food, let alone libraries and the internet.)

Area A is that of people who love science and learn it but don't make that a big part of their identity. That would have been the vast majority of people with an interest in science in the past; with the rise of social media, some of us decided to share our excitement about science and technology with the rest of the world, leading to area B.

People in area B aren't the usual "I effing love science" crowd. First, they actually learn science; second, their sharing of the excitement of science is geared towards getting other people to learn science, while the IFLS crowd is virtue signaling.

People in area C are those who learn science for goal-oriented reasons. They want to have a productive education and career, so they choose science (and engineering) in order to have marketable skills. They might have preferred to study art or practice sports, but they pragmatically de-prioritize these true loves in favor of market-valued skills.

As for the rest, the big blob of IFLS people, I've given them enough posts (for now).

- - - - -

Note 1: the reason to follow real scientists and research labs on Twitter and Facebook is that they post about ongoing research (theirs and others'), unlike professional popularizers who post "memes" and self-promotion. Or complete nonsense --- only to be corrected by the much smarter and incredibly nice Destin "Smarter Every Day" Sandlin:

Note 2: For people who still think that if one of two children is a boy, then the probability of two boys is 1/3 (it's not, it's 1/2):

and the frequentist answer is in this post. Remember: if you think a math result is incorrect, you need to point out the error in the derivation. (There are no errors.)

This particular math problem is a favorite of the IFLS crowd, as it makes them feel superior to the "rubes" who say 1/2, whereas in fact that is the right answer. The IFLS crowd, in general, cannot follow the rationales above, though some may slog through the frequentist computation.

Friday, January 13, 2017

Medical tests and probabilities

You may have heard this one, but bear with me.

Let's say you get tested for a condition that affects ten percent of the population and the test is positive. The doctor says that the test is ninety percent accurate (presumably in both directions). How likely is it that you really have the condition?

[Think, think, think.]

Most people, including most doctors themselves, say something close to $90\%$; they might shade that number down a little, say to $80\%$, because they understand that "the base rate is important."

Yes, it is. That's why one must do the computation rather than fall prey to the anchoring-and-adjustment bias.

Here's the computation for the example above:

One-half. That's the probability that you have the condition given the positive test result.

We can get a little more general: if the base rate is $\Pr(\text{sick}) = p$ and the accuracy (assumed symmetric) of the test is $\Pr(\text{positive}|\text{sick}) = \Pr(\text{negative}|\text{not sick}) = r$, then the probability of being sick given a positive test result is

\[ \Pr(\text{sick}|\text{positive}) = \frac{p \times r}{p \times r + (1- p) \times (1-r)}. \]
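As a minimal sketch, the formula above can be written as a short Python function and checked against the ten-percent base rate, ninety-percent accuracy example. The function name `posterior_sick` is only illustrative.

```python
def posterior_sick(p: float, r: float) -> float:
    """Pr(sick | positive) for base rate p and symmetric test accuracy r."""
    true_positive = p * r                # sick and tests positive
    false_positive = (1 - p) * (1 - r)   # not sick but tests positive
    return true_positive / (true_positive + false_positive)


# The example from the text: 10% base rate, 90% accurate test.
example = posterior_sick(p=0.10, r=0.90)
print(f"Pr(sick | positive) = {example:.2f}")
```

With these inputs the true positives (9% of the population) and the false positives (also 9%) are equally numerous, which is where the one-half comes from.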

The following table shows that probability for a variety of base rates and test accuracies (again, assuming that the test is symmetric, that is, the probabilities of a false positive and a false negative are the same; more about that below).

A quick perusal of this table shows some interesting things, such as the really low probabilities, even with very accurate tests, for the very small base rates (so, if you get a positive result for a very rare disease, don't fret too much, do the follow-up).
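A table of this kind can be regenerated with a few lines of Python. This is only a sketch: the grid of base rates and accuracies below is an assumption chosen for illustration, not necessarily the grid used in the original table.

```python
def posterior_sick(p, r):
    """Pr(sick | positive) for base rate p and symmetric test accuracy r."""
    return p * r / (p * r + (1 - p) * (1 - r))


# Assumed illustrative grid (not the values from the original table).
base_rates = [0.0001, 0.001, 0.01, 0.1]
accuracies = [0.90, 0.99, 0.999]


def build_table(base_rates, accuracies):
    """Map each (base rate, accuracy) pair to its posterior probability."""
    return {(p, r): posterior_sick(p, r) for p in base_rates for r in accuracies}


table = build_table(base_rates, accuracies)

# Rows are base rates, columns are test accuracies.
header = f"{'base rate':>10}" + "".join(f"{r:>10.1%}" for r in accuracies)
print(header)
for p in base_rates:
    cells = "".join(f"{table[(p, r)]:>10.4f}" for r in accuracies)
    print(f"{p:>10.4%}{cells}")
```

On this grid, even a 99.9% accurate test for a condition that affects one person in ten thousand gives a posterior of only about 9%.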

There are many philosophical objections to all the above, but as a good engineer I'll ignore them all and go straight to the interesting questions that people ask about that table, for example, how the accuracy or precision of the test works.

Let's say you have a test of some sort (cholesterol, blood pressure, etc.); it produces some output variable that we'll assume is continuous. Then, there will be a distribution of these values for people who are healthy and, if the test is of any use, a different distribution for people who are sick. The scale is the same, but, for example, healthy people have, let's say, blood pressure values centered around 110 over 80, while sick people have blood pressure values centered around 140 over 100.

So, depending on the variables measured, the type of technology available, the combination of variables, one can have more or less overlap between the distributions of the test variable for healthy and sick people.

Assuming for illustration normal distributions with equal variance, here are two different tests, the second one being more precise than the first one:

Note that these distributions are fixed by the technology, the medical variables, the biochemistry, etc; the two examples above would, for example, be the difference between comparing blood pressures (test 1) and measuring some blood chemical that is more closely associated with the medical condition (test 2), not some statistical magic made on the same variable.

Note that there are other ways that a test A can be more precise than test B, for example if the variances for A are smaller than for B, even if the means are the same; or if the distributions themselves are asymmetric, with longer tails on the appropriate side (so that the overlap becomes much smaller).

(Note that the use of normal distributions with similar variances above was only for example purposes; most actual tests have significant asymmetries and different variances for the healthy versus sick populations. It's something that people who discover and refine testing technologies rely on to come up with their tests. I'll continue to use the same-variance normals in my examples, for simplicity.)
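To make the link between overlap and accuracy concrete, here is a sketch under the same-variance normal assumption, with the threshold placed halfway between the two means so that the errors are symmetric. The means and standard deviation are made-up illustrative numbers in arbitrary units, and it is assumed that higher values indicate sickness.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def symmetric_accuracy(mean_healthy: float, mean_sick: float, sigma: float):
    """Threshold at the midpoint; returns (threshold, sensitivity, specificity)."""
    threshold = 0.5 * (mean_healthy + mean_sick)
    # Sick people correctly flagged: test value above the threshold.
    sensitivity = 1.0 - normal_cdf((threshold - mean_sick) / sigma)
    # Healthy people correctly cleared: test value below the threshold.
    specificity = normal_cdf((threshold - mean_healthy) / sigma)
    return threshold, sensitivity, specificity


# Hypothetical tests: same spread, different separation between the means.
_, sens_1, spec_1 = symmetric_accuracy(mean_healthy=0.0, mean_sick=2.0, sigma=1.0)
_, sens_2, spec_2 = symmetric_accuracy(mean_healthy=0.0, mean_sick=4.0, sigma=1.0)
print(f"test 1: r = {sens_1:.3f}")
print(f"test 2: r = {sens_2:.3f}")
```

With these numbers the accuracy $r$ goes from about 0.841 (means two standard deviations apart) to about 0.977 (four standard deviations apart): less overlap, more accurate test.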

A second question that interested (and interesting) people ask about these numbers is why the tests are symmetric (the probability of a false positive equal to that of a false negative).

They are symmetric in the examples we use to explain them, since it makes the computation simpler. In reality almost all important preliminary tests have a built-in bias towards the most robust outcome.

For example, many tests for dangerous conditions have a built-in positive bias, since the outcome of a positive preliminary test is more testing (usually followed by relief since the positive was a false positive), while the outcome of a negative can be lack of treatment for an existing condition (if it's a false negative).

To change the test from a symmetric error to a positive bias, all that is necessary is to change the threshold between positive and negative towards the side of the negative:
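A sketch of that threshold shift, again with same-variance normals and made-up numbers (healthy values centered at 0, sick values centered at 2, standard deviation 1, higher values indicating sickness; all of these are assumptions for illustration):

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def error_rates(threshold, mean_healthy=0.0, mean_sick=2.0, sigma=1.0):
    """Return (false positive rate, false negative rate) for a given threshold."""
    # Healthy person whose value lands above the threshold.
    false_positive = 1.0 - normal_cdf((threshold - mean_healthy) / sigma)
    # Sick person whose value lands below the threshold.
    false_negative = normal_cdf((threshold - mean_sick) / sigma)
    return false_positive, false_negative


fp_mid, fn_mid = error_rates(threshold=1.0)          # midpoint: symmetric errors
fp_shifted, fn_shifted = error_rates(threshold=0.5)  # moved toward the healthy side
print(f"threshold 1.0: FP = {fp_mid:.3f}, FN = {fn_mid:.3f}")
print(f"threshold 0.5: FP = {fp_shifted:.3f}, FN = {fn_shifted:.3f}")
```

Moving the threshold from the midpoint toward the healthy distribution trades false negatives for false positives: here the false positive rate rises from about 0.159 to about 0.309 while the false negative rate falls from about 0.159 to about 0.067.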

In fact, if you, the patient, have access to the raw data (you should, at least in the US where doctors treat patients like humans, not NHS cost units), you can see how far from the threshold you are and look up actual distribution tables on the internet. (Don't argue these with your HMO doctor, though; most of them don't understand statistical arguments.)

For illustration, here are the posterior probabilities for a test that has bias $k$ in favor of false positives, understood as $\Pr(\text{positive}|\text{not sick}) = k \times \Pr(\text{negative}|\text{sick})$, for some different base rates $p$ and probability of accurate positive test $r$ (as above):

So, this is good news: if you get a scary positive test for a dangerous medical condition, that test is probably biased towards false positives (because of the scary part) and therefore the probability that you actually have that scary condition is much lower than you'd think, even if you'd been trained in statistical thinking (because that training, for simplicity, almost always uses symmetric tests). Therefore, be a little more relaxed when getting the follow-up test.
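A sketch of the biased-test computation, using the definition above: with $\Pr(\text{positive}|\text{sick}) = r$, the false negative rate is $1 - r$ and the false positive rate is $k \times (1 - r)$. The function name and the sample values of $k$ are illustrative assumptions.

```python
def posterior_sick_biased(p: float, r: float, k: float) -> float:
    """Pr(sick | positive) when Pr(positive | not sick) = k * Pr(negative | sick)."""
    false_positive_rate = k * (1 - r)
    if not 0.0 <= false_positive_rate <= 1.0:
        raise ValueError("k * (1 - r) must be a valid probability")
    true_positive = p * r
    false_positive = (1 - p) * false_positive_rate
    return true_positive / (true_positive + false_positive)


# Same example as before (p = 10%, r = 90%), with increasing positive bias.
unbiased = posterior_sick_biased(p=0.10, r=0.90, k=1)
double = posterior_sick_biased(p=0.10, r=0.90, k=2)
triple = posterior_sick_biased(p=0.10, r=0.90, k=3)
print(f"k=1: {unbiased:.3f}  k=2: {double:.3f}  k=3: {triple:.3f}")
```

For the running example, the one-half of the symmetric test drops to one-third when false positives are twice as likely as false negatives, and to one-quarter when they are three times as likely.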

There's a third interesting question that people ask when shown the computation above: the probability of someone getting tested to begin with. It's an interesting question because in all these computational examples we assume that the population that gets tested has the same distribution of sick and healthy people as the general population. But the decision to be tested is usually a function of some reason (mild symptoms, hypochondria, job requirement), so the population of those tested may have a higher incidence of the condition than the general population.

This can be modeled by adding elements to the computation, which makes the computation more cumbersome and detracts from its value in making the point that base rates are very important. But it's a good elaboration and many models used by doctors over-estimate base rates precisely because they miss this probability of being tested. More good news there!
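One possible way to add that element, as a rough sketch only: apply Bayes's rule once to get the incidence among those who get tested, then use that incidence as the base rate in the usual computation. The testing probabilities below (half of sick people get tested, one in ten healthy people do) are invented for illustration, and the sketch assumes the test's accuracy does not depend on why someone was tested.

```python
def base_rate_among_tested(p, tested_if_sick, tested_if_healthy):
    """Pr(sick | tested) from the general base rate and the testing probabilities."""
    sick_and_tested = p * tested_if_sick
    healthy_and_tested = (1 - p) * tested_if_healthy
    return sick_and_tested / (sick_and_tested + healthy_and_tested)


def posterior_sick(p, r):
    """Pr(sick | positive) for base rate p and symmetric test accuracy r."""
    return p * r / (p * r + (1 - p) * (1 - r))


p_general = 0.10
# Hypothetical selection into testing: sick people are tested more often.
p_tested = base_rate_among_tested(p_general, tested_if_sick=0.50, tested_if_healthy=0.10)

post_general = posterior_sick(p_general, r=0.90)
post_tested = posterior_sick(p_tested, r=0.90)
print(f"base rate: general {p_general:.3f}, among tested {p_tested:.3f}")
print(f"posterior: general {post_general:.3f}, among tested {post_tested:.3f}")
```

The two base rates can differ a lot (0.10 versus about 0.36 with these made-up numbers), so a base rate estimated from people who were tested overstates the incidence in the general population.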

Probabilities: so important to understand, so thoroughly misunderstood.

- - - - -
Production notes

1. There's nothing new above, but I've had to make this argument dozens of times to people and forum dwellers (particularly difficult when they've just received a positive result for some scary condition), so I decided to write a post that I can point people to.

2. [warning: rant] As someone who has railed against the use of spline drawing and quarter-ellipses in other people's slides, I did the right thing and plotted those normal distributions from the actual normal distribution formula. That's why they don't look like the overly-rounded "normal" distributions in some other people's slides: because these people make their "normals" with free-hand spline drawing and their exponentials with quarter-ellipses. That's extremely lazy in an age when any spreadsheet, RStats, Matlab, or Mathematica can easily plot the actual curve. The people I mean know who they are. [end rant]

Sunday, January 8, 2017

Numerical thinking - A superpower everyone can get

There are significant advantages to being a numerical thinker. So, why isn't everyone one?

Some people can't be numerical thinkers (or won't be numerical thinkers), typically due to one of three causes:

Acalculia: the inability to do calculations; in its pure form a type of brain damage, but more commonly a consequence of a bad educational system.

Innumeracy: lack of mathematical and numerical knowledge, again generally as the result of a bad educational system.

Numerophobia: a fear of numbers and numerical (and mathematical) thinking, possibly an attitude brought on by exposure to the educational system.

On a side note, a large part of the problem is the educational system, particularly the way logic and math are covered in it. Just in case that wasn't clear.

Numerical thinkers get a different perspective on the world. It's like a superpower, one that can be developed with practice. (Logical thinkers have a related, but different, superpower.)

Take, for example, this list of large power generating plants, from Wikipedia:

Left to themselves, the numbers on the table are just descriptors, and there's very little that can be said about these plants, other than that there's a quick drop in generation capacity from the first few to the rest.

When numerical thinkers see those numbers, they see the numbers as an invitation to compute; as a way to go beyond the data, to get information out of that data. For example, my first thought was to look at the capacity factors of these power plants: how much power they really generate as a percentage of their nominal (or "nameplate") power.

Sidenote: Before proceeding, there's an interesting observation I should make here, about operational numerophobia (similar to this older post): in social interactions when this type of problem comes up, educated people who can do calculations in their job, or at least could during their formal education, have trouble knowing where to start to convert a yearly production of 98.8 TWh into a power rating (in MW).

Since this is trivial (divide by the number of hours in one year, 8760, and convert TW to MW by multiplying by one million), the only explanation is yet another case of operational numerophobia. End of sidenote.
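Written out for the 98.8 TWh figure, the conversion is a single line:

\[ \frac{98.8 \ \text{TWh}}{8760 \ \text{h}} \times 10^{6} \ \frac{\text{MW}}{\text{TW}} \approx 11{,}279 \ \text{MW}. \]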

Capacity (or load) factor is like any other efficiency measure: how much of the potential is realized? Here are the results for the top 15 or so plants (depending on whether you count the off-line Japanese nuclear plant):
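As a sketch of the capacity-factor computation: convert the yearly energy into an average power and divide by the nameplate power. The 98.8 TWh figure comes from the text above; the 22,500 MW nameplate rating is an assumed value used here only for illustration, not a number taken from the table.

```python
HOURS_PER_YEAR = 8760      # non-leap year
MW_PER_TW = 1_000_000


def average_power_mw(annual_energy_twh: float) -> float:
    """Average power (MW) equivalent to a yearly energy production (TWh)."""
    return annual_energy_twh * MW_PER_TW / HOURS_PER_YEAR


def capacity_factor(annual_energy_twh: float, nameplate_mw: float) -> float:
    """Fraction of the nameplate power that is actually delivered on average."""
    return average_power_mw(annual_energy_twh) / nameplate_mw


avg_mw = average_power_mw(98.8)
cf = capacity_factor(98.8, nameplate_mw=22_500)   # assumed nameplate rating
print(f"average power: {avg_mw:,.0f} MW")
print(f"capacity factor: {cf:.1%}")
```

With the assumed nameplate rating, the plant delivers on average about half of its nominal power.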

Once these additional numbers are computed, more interesting observations can be made; for example:

The nuclear average capacity factor is $87.7\%$, while the hydro average is just $47.2\%$. That might be partly from use of pumped hydro as storage for surplus energy on the grid (it's the only grid-scale storage available at present; explained in the video below).

That is the power of being a numerical thinker: the ability to go beyond simple numbers and have a deeper understanding of reality. It's within most people's reach to become a numerical thinker; all that's necessary is the will to do so and a little practice.

Alas, many people prefer the easier route of being numerical-poseurs...

A lot of people I interact with pepper their discussions with numbers and even charts, but they aren't numerical thinkers. The numbers and the charts are props, mostly, like the raw numbers on the Wikipedia table. It's only when those numbers are combined among themselves and with outside data (none in this example), information (the use of pumped hydro as grid-level storage), and knowledge (nameplate vs effective capacity, capacity factors) that they realize their potential for informativeness.

A numerical thinker can always spot a numerical-poseur. It's in what they don't do.

- - - -

Bonus content: Don Sadoway talking about electricity storage and liquid metal batteries: