Sunday, December 27, 2015

One of two children is a boy, what's the probability the other is a boy?

It's one-half. Not one-third, one-half.

And no, none of that "both solutions work" or "it depends" or "the Bayesian or frequentist solution" nonsense. It's one-half for Bayesians, for frequentists, and for people who don't know what these categories mean. The only thing the one-third "solution" is good for is an example of the importance of understanding the difference between states of the world and observed events.

I go over the Bayesian solution and explain what's wrong with one-third in the following video:



I had a frequentist video, as well, but I deleted it in the big online cleaning of 2012, so here's a simple version.

There are four possible states of the world $\{(B_1,B_2),(G_1,B_2),(B_1,G_2),(G_1,G_2)\}$, where $B,G$ is the sex and the subscript is the birth order. From each of these states there are two possible events, observing the first or the second child, leading to eight possible state-event pairs:
\[\begin{array}{rlc}
(B_1,B_2) & \rightarrow & B_1 \\
(B_1,B_2) & \rightarrow & B_2 \\
(G_1,B_2) & \rightarrow & G_1 \\
(G_1,B_2) & \rightarrow & B_2 \\
(B_1,G_2) & \rightarrow & B_1 \\
(B_1,G_2) & \rightarrow & G_2 \\
(G_1,G_2) & \rightarrow & G_1 \\
(G_1,G_2) & \rightarrow & G_2
\end{array}\]
The event "one is a boy" means that only four of these state-event pairs are feasible in the universe of possibilities:
\[\begin{array}{rlc}
(B_1,B_2) & \rightarrow & B_1 \\
(B_1,B_2) & \rightarrow & B_2 \\
(G_1,B_2) & \rightarrow & B_2 \\
(B_1,G_2) & \rightarrow & B_1
\end{array}\]
The frequentist answer to "how likely is the state $(B_1,B_2)$?" is computed as the number of favorable pairs, that is pairs including $(B_1,B_2)$, two, divided by the total number of feasible pairs in the universe of possibilities, four. Two divided by four is one-half.
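The counting above can be checked mechanically. Here is a minimal sketch that enumerates the eight state-event pairs, assuming (as in the text) that all pairs are equiprobable; the variable names are illustrative choices, not anything from the original video.

```python
from fractions import Fraction
from itertools import product

# States of the world: (first child, second child), each "B" or "G".
states = list(product("BG", repeat=2))

# From each state, two possible events: we observe child 1 or child 2.
pairs = [(state, state[i]) for state in states for i in (0, 1)]

# Keep only the state-event pairs in which the observed child is a boy.
feasible = [(state, seen) for state, seen in pairs if seen == "B"]

# Favorable pairs are the feasible ones whose state is (B, B).
favorable = [(state, seen) for state, seen in feasible if state == ("B", "B")]

p_both_boys = Fraction(len(favorable), len(feasible))
print(len(pairs), len(feasible), len(favorable), p_both_boys)  # 8 4 2 1/2
```

Counting works here only because the pairs are equiprobable; with unequal probabilities the counts would be replaced by sums of probabilities.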

Why do people fall for the one-third "solution"? Two main reasons, I believe:

1. Understanding the difference between states and events and how to relate information to changes in probabilities is not a simple matter; most people think that they know how to do this better than they actually do.

2. The one-half solution sounds too simple, and therefore doesn't allow the person to affect sophistication. That attitude, together with most people's interest in STEM only as an identity product, is one of the most destructive mental attitudes one can have: it blocks learning.

Here's a different probability puzzle that suffers from the same problems:



In all fairness, in illo tempore when I saw the one-third "solution" I believed it correct, but my much smarter classmate Dave Godes immediately showed me it was wrong. To my credit, I didn't argue – when I'm shown to be wrong, I change my mind. I know, crazy.

- - - -
PS: Yes, I know that the frequentist computation is the total probability (not number) of the favorable pairs divided by the total probability of the feasible pairs. Since this is a simple example and all pairs are equiprobable, counting is equivalent to computing total probability in this computation.

Wednesday, December 23, 2015

Project 2016

Some acquaintances -- who never have time to learn anything new (or to exercise), although they seem to have vast knowledge of TV show storylines and sports events -- have challenged me to blog (their word was "prove") how it's possible to continue to learn stuff after formal education ends.

Hence "Project 2016," in which I'll mostly document the use of books, articles, MOOCs, podcasts, public lectures, and other sources for learning as entertainment.

Yes, learning as entertainment. I have to keep learning new things for my job; those will not be blogged. I like knowing stuff even when there's no monetary payoff to it. In the past I would keep the learning to myself, but since I like to peruse other people's educational blogs, I'll give something back to the community.

My main ludic learning interests are in STEM, economics, and business management. (I work on the quantitative side of business, but there's a lot beyond my area of expertise that I find enjoyable to learn.) There'll be references to books I read and I may occasionally stray into the application of logic and thinking to subjects like fitness, travel, or packing. I might even blog about science popularization.

There will be math (typesetting courtesy of MathJax):
\[
\Pr(\text{math}) = \lim_{x \rightarrow +\infty} \frac{x^{2016}-\log(x)}{x^{2016}}.
\]

Saturday, May 2, 2015

There are 10 types of people in the world...

...those who know base 3, those who think this joke only works with binary, and the rest. The second group is the worst. – My new twitter header.

Yes, another post about people who "like STEM" as a way to assert their in-group identity – as long as they don't actually have to learn any STEM.

The standard joke is "there are 10 types of people in the world, those who understand binary and those who don't." It's been productized in a number of ways, including on a ThinkGeek t-shirt. My version is a little more elaborate, since it uses base 3 coding. The "second group" part has to do with poseurs who make the binary joke without understanding it.

The motivator for the new header was the media circus following the Tesla Power Wall announcement. Namely, all the ignorant bleating about "Tesla killing nuclear power." I made a short tweet-storm about it,



but my issue is not with Tesla or the pundits lacking the basics of electrical and chemical engineering, economics, and the realities of bringing a product to a mass market. These are problems of ignorance, and ignorance can be addressed. That's what education and information are for. Between MOOCs and library books, there are plenty of educational opportunities around.

The problem is the attitude that knowledge, even information (basic tech specs), is not necessary for expressing an opinion. Loudly. As long as it's the right opinion. The opinion that the right people must have. The opinion that must not be questioned.

And that's a serious problem: we live in a society ever more dependent on technology – based on science and engineered into products. And the foundational attitude of science and engineering, respect for physical reality, has been replaced by compliance with an identity-based narrative.

So I now engage in knowledge-based attitude guerrilla: fighting identity-based ignorance by undermining the credibility of that identity. Like so:

– "Yes, yes, science is very important. By the way, why do we need a different flu vaccine every year?" [Silence.] "So, your entire knowledge of evolution is limited to saying creationism is wrong?"

– "Ah, a quote by Neil 'Hayden gift shop' Tyson. How fast are we moving relative to the center of the Earth? Our latitude is around 38 degrees, Earth's radius is around 6400km."

(These are not "gotcha" science questions, they require no more than middle-school education.)
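For the second question, a minimal sketch of the arithmetic, reading "moving relative to the center of the Earth" as the speed due to Earth's rotation and assuming one rotation takes about 24 hours (that period is an assumption added here; the latitude and radius come from the question):

```python
import math

# Values given in the question (both approximate).
latitude_deg = 38.0
earth_radius_km = 6400.0

# Assumption: one rotation takes about 24 hours (a sidereal day is slightly shorter).
rotation_period_h = 24.0

# Distance from the rotation axis at this latitude.
axis_distance_km = earth_radius_km * math.cos(math.radians(latitude_deg))

# Speed = circumference of the circle of latitude / rotation period.
speed_km_h = 2 * math.pi * axis_distance_km / rotation_period_h
print(f"{speed_km_h:.0f} km/h")
```

With these inputs the result comes out at roughly 1,300 km/h.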

There are plenty of sources of motivation out there, like this video by astronaut Samantha Cristoforetti. All that's needed is to fight the attitude that thinking, knowledge, and information are superfluous.

For a successful technological society, reality must take precedence over narrative, for Nature cannot be fooled. (Adapted from the last sentence here.)

Saturday, April 25, 2015

Here we go again... 'extraordinary claims require extraordinary evidence'

I've already posted about the dangers of this quote, but I recently saw it on a Portuguese skeptics web site and decided to give it another go.

Before we get to it, I should clarify that my posts on skeptics/atheists are based on my experience with US atheists and skeptics and their non-American orbiters. The Portuguese group and a recently found YouTube channel seem to be much more interesting.

Ok, so what about the "extraordinary claims require extraordinary evidence" quote, which on the Portuguese web site is attributed to Carl Sagan (though I think Rev. Thomas Bayes got there first) and has been misattributed to Christopher Hitchens in a number of places?


Your rationale, at least in Bayesian terms, is correct...


The statement reflects a quasi-Bayesian view of the world, which I like: say the extraordinariness of a proposition $P$ is to be supported by evidence $E$; what does Sagan's statement mean?

Let $\Pr(P)$ be the prior probability of $P$, that is, the degree to which the person receiving the evidence believes in $P$ prior to the evidence being presented. After the evidence is presented, the proposition can be evaluated by its posterior probability. To compute the posterior probability, $\Pr(P|E)$, we need to know a couple of other things: the conditional probability of the evidence given the proposition, $\Pr(E|P)$, and the unconditional probability of the evidence, $\Pr(E)$.

Whoa! Too much math, I can hear a lot of people who like science as long as they don't have to learn any saying...

Ok, say $P$ is "Bob is a powerlifter" and $E$ is "Bob bench-presses 400Lbs." If about one out of one-hundred people in our social circle are powerlifters, then $\Pr(P) = 0.01$. If Rose tells me that Bob bench-pressed 400Lbs, I need to know a couple of things to determine whether Bob really is a powerlifter:

1. Given that $X$ is a powerlifter, how likely is $X$ to bench 400Lbs? (Most powerlifters can bench 400Lbs, but our social group might have a few weakling powerlifters, usually bodybuilders who are trying to pretend they're athletes.) This gives me the $\Pr(E|P)$.

2. What percentage of our social group benches 400Lbs? Many football and basketball players, who are not powerlifters, bench 400Lbs; there might be several of these in our social group. This gives me the $\Pr(E)$. Note that this includes the powerlifters too; it's a proportion of everybody.

Note that the ratio of these two quantities, $\Pr(E|P)/\Pr(E)$, is a measure of informativeness of the evidence:

Say $\Pr(E|P) = 0.75$, three-quarters of all powerlifters can bench 400Lbs, and $\Pr(E) = 0.75$, meaning that so can the same proportion of everyone. In this case, the evidence is quite useless. It's not informative at all, as can be seen by computing the posterior probability using Bayes's rule:
\[
\Pr(P|E) = \frac{\Pr(E|P) \times \Pr(P)}{\Pr(E)} = \frac{0.75 \times 0.01}{0.75} = 0.01
\]
The "evidence" doesn't change the probability of the proposition; in other words, no information comes from knowing it. (For the purposes of determining $P$, that is; we do learn that Bob can bench 400Lbs, which might be useful when we need a friend to help us move.)

Now, say $\Pr(E|P) = 0.75$, so three-quarters of all powerlifters can bench 400Lbs, same as above, and $\Pr(E) = 0.05$, meaning that only a few people in the social group can do it. This suggests that the evidence is somewhat dispositive, as can be computed by:
\[
\Pr(P|E) = \frac{\Pr(E|P) \times \Pr(P)}{\Pr(E)} = \frac{0.75 \times 0.01}{0.05} = 0.15
\]
Note that the probability increased fifteen-fold. This is strong evidence, but, given the low prior probability, Bob's powerlifter-ness is still very much in question. The 400Lbs bench wasn't "extraordinary"-enough evidence.*
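The two powerlifter computations can be reproduced with a few lines of code. This is a minimal sketch; the function name `posterior` and its parameter names are illustrative choices, and the numbers are the ones used in the text.

```python
def posterior(prior, likelihood, evidence):
    """Bayes's rule: Pr(P|E) = Pr(E|P) * Pr(P) / Pr(E)."""
    return likelihood * prior / evidence


prior = 0.01  # Pr(P): one in a hundred people is a powerlifter

# Case 1: Pr(E|P) = Pr(E) = 0.75, so the evidence carries no information.
uninformative = posterior(prior, likelihood=0.75, evidence=0.75)

# Case 2: Pr(E|P) = 0.75 but Pr(E) = 0.05, so the evidence is informative.
informative = posterior(prior, likelihood=0.75, evidence=0.05)

print(round(uninformative, 4))  # 0.01
print(round(informative, 4))  # 0.15
print(round(informative / prior, 2))  # 15.0
```

The last line is the ratio $\Pr(E|P)/\Pr(E)$: the posterior is just the prior multiplied by it.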

In general, the lower the $\Pr(P)$, i.e. "more extraordinary claims", the higher $\Pr(E|P)/\Pr(E)$ must be for accepting $P$, i.e. "require more extraordinary evidence." So far, so good for Carl Sagan.


... but your understanding of human psychology is lacking


The problem is the lack of indices in those probabilities above. In particular, the lack of an index to separate different probabilities assigned by different people. (Oh, and people also have assorted cognitive biases that make this worse, but we don't need them to make the case so let's stay within the bounds of strict Bayesian rationality.)

What the Sagan quote gets wrong is that what person A thinks is an extraordinary claim and what person B considers an extraordinary claim can be opposed. Typically, when people invoke that quote, what they mean is:

"Claims that contradict that which I and my circle of friends believe in are such proofs of stupidity by those who believe them, that I'll quote Carl Sagan and stop trying to engage in intellectual discussion." 

For what it's worth, I think this is mostly applicable to the US/US-orbiter crowd. The Portuguese web site and the skinny YouTube Brit seem to be actually trying to engage people.

Let's consider Bob's case again, but now the probabilities have a subscript, $A$ or $C$, for Arnold or Cooper.

Arnold only knows bodybuilders and powerlifters (and has taken too much Deca-Durabolin and Dianabol for his brain to be able to tell the difference) so he thinks that almost everyone is a powerlifter, $\Pr_A(P)=0.99$. All powerlifters in Arnold's world can bench well above 400Lbs, so his $\Pr_A(E|P)=1$, and he assumes everyone else is a weakling who can't bench an unloaded bar, so with minimal computation we get $\Pr_A(E)=0.99$.

Cooper, on the other hand, when he's not busy writing flawed papers about the benefits of running [ahem: if you ignore self-selection], believes that most people are not powerlifters, $\Pr_C(P) = 0.01$, but knows quite a few people who can bench 400Lbs (because they played football or do some real exercise in a gym on the sly) and knows only one [pretend] powerlifter who can't bench 400Lbs (probably a bodybuilder), so he thinks that $\Pr_C(E|P)=0.5$ and $\Pr_C(E)=0.45$. (For kicks, compute Cooper's probability that a non-powerlifter can bench 400Lbs.)

Arnold and Cooper are having a discussion about Bob. Arnold, with a thick Austrian accent despite having lived in California for almost fifty years, says:

"Auf kourse Bob ees a powerlivtehr. Ohlmahst eferryvon ees."

Cooper: "No, he's not, and they aren't."

Rose arrives and says Bob benches 400Lbs.

Cooper: "That changes almost nothing."
(He's right: $\Pr_C(P|E) = 0.011$)

Arnold: "Zee? Unwiderlegbar [incontrovertible – ed.] evidenz! Ah'll pahmp yew Ahp!" [Strikes frontal biceps pose.]
(He's right: $\Pr_A(P|E) = 1.00$.)

Cooper, who finds the proposition "Bob is a powerlifter" extraordinary, does require extraordinary evidence to change his mind, and considers the evidence that Rose brought in to be insufficient. Arnold thinks it's dispositive evidence. Any discussion between them that doesn't start by acknowledging that their probabilities are different will be a pointless waste of time.
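Both posteriors follow from the same rule applied to different inputs. A minimal sketch, with illustrative function names, using the law of total probability for Arnold's $\Pr_A(E)$ (the "minimal computation" mentioned above) and the stated values for Cooper:

```python
def total_evidence(prior, p_e_given_p, p_e_given_not_p):
    """Law of total probability: Pr(E) = Pr(E|P) Pr(P) + Pr(E|not P) Pr(not P)."""
    return p_e_given_p * prior + p_e_given_not_p * (1 - prior)


def posterior(prior, p_e_given_p, p_e):
    """Bayes's rule: Pr(P|E) = Pr(E|P) * Pr(P) / Pr(E)."""
    return p_e_given_p * prior / p_e


# Arnold: almost everyone is a powerlifter, all powerlifters bench 400Lbs,
# and nobody else can.
arnold_p_e = total_evidence(prior=0.99, p_e_given_p=1.0, p_e_given_not_p=0.0)
arnold_posterior = posterior(prior=0.99, p_e_given_p=1.0, p_e=arnold_p_e)

# Cooper: values taken directly from the text.
cooper_posterior = posterior(prior=0.01, p_e_given_p=0.5, p_e=0.45)

print(round(arnold_p_e, 2), round(arnold_posterior, 2))  # 0.99 1.0
print(round(cooper_posterior, 3))  # 0.011
```

Same evidence, same rule, and the two end up even further apart in how they read it, because they started from different probabilities.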

A pointless waste of time, that is, if the purpose is to convince. I believe that that was Carl Sagan's purpose, unlike many of the current day best-selling popular-in-America skeptics whose purpose seems to be a convex combination of politicking (and I mean in the sense of promoting a particular political party, not just policies) and monetizing their echo chamber.

Monetizing the echo chamber: like preaching to the choir, but with monetary reward.

(Again, as far as I can tell, not what the Portuguese skeptics or the needs-a-sandwich YouTuber are doing.)


Bonus: the effect of informativeness of evidence




Informativeness in this table is $\Pr(E|P)/\Pr(E)$. So for example, for someone who has $\Pr(P) = 0.000 025$, i.e. is very skeptical about the proposition, to be completely uncertain about $P$, $\Pr(P|E) = 0.5$, you need the evidence to be twenty thousand times more likely if P is true, than overall, $\Pr(E|P) = 20000 \times  \Pr(E)$.
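Since the posterior is the prior times the informativeness, the informativeness needed to reach a given posterior is just their ratio. A minimal sketch; the function name is illustrative, and the priors in the loop other than $0.000025$ are hypothetical values chosen for illustration, not entries from the table.

```python
def required_informativeness(prior, target_posterior):
    """Ratio Pr(E|P)/Pr(E) needed to move `prior` to `target_posterior`."""
    return target_posterior / prior


# The example from the text: a very skeptical prior, target of 0.5.
print(round(required_informativeness(0.000025, 0.5)))  # 20000

# A few hypothetical priors, all with a target posterior of 0.5.
for prior in (0.1, 0.01, 0.001):
    print(prior, round(required_informativeness(prior, 0.5)))
```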

Note that when $\Pr(E|P)/\Pr(E) < 1$, the evidence is against the proposition, in that the probability of observing the evidence given the proposition is lower than the incidence of the evidence in the general population of events.

For kicks, why are some cells greyed-out? Is this a cop-out to avoid showing probabilities above 1, or is there a real reason why informativeness is bound below some limit? Hint: there's a real reason.

-- -- -- --

* Of course, the most dispositive test would be to show Bob a squat rack; if Bob said "what is that?" (normal person) or "it's for doing standing curls, isn't it?" (bodybuilder), that would be proof of non-powerlifter-ness. A powerlifter would say "I'd rather use a cage or a monolift, but sure I'll SHUT UP AND SQUAT!"

Saturday, April 11, 2015

The sky is blue, therefore no vodka on transatlantic flights

Consider a truthful proposition, say "the sky is blue." In this hypothetical, imagine that for historical reasons a majority of the people are indoctrinated to believe that the sky is red; a minority of the people know it's blue.

Now imagine that there's a subgroup of those who believe that the sky is blue who organize and attend conferences, write articles and blog posts, and publish books, all dedicated to making fun of people who don't know that the sky is blue.

Most of the minority who know that the sky is blue find these conferences, articles, and books (like "The Red Delusion" and "Red is not Great") both trivial and mean-spirited: trivial because they don't actually elaborate on the blueness of the sky as a phenomenon; and mean-spirited because when you scrape the thin veneer of interest in the truth, what's left is a group of people mocking those to whom they feel superior.

Imagine that you meet, possibly on a discussion forum, some of these "the sky is blue" activists who are very vocal about the blueness of the sky, but don't know that the color blue maps into a specific range of wavelengths, how the eye senses color, or that the color of the sky results from the scattering of sunlight in the atmosphere. Instead, when topics like these surface, the activists quickly move the discussion to the topic of some other person who believes the sky is red and should be mocked or punished for that. Or ban you from the forum.

Now, imagine that at one of these conferences, or in the articles by some of the least competent writers, you find clearly wrong statements, such as "the oceans are yellow and made of butter" or "golf turf is grass made of little Burberry umbrellas." Or prescriptive non-sequiturs like "because the sky is blue, Absolut vodka should be forbidden on transatlantic flights."

Possibly you'd learn to avoid these people, their conferences and forums, and their books, articles and blog posts.

Possibly. Probably. Maybe definitely.

On a totally unrelated subject, a few friends are puzzled that I don't belong to, or support, any atheist or skeptic organizations, given my lifelong interest in science.

Yeah... mysteries of the universe.

Friday, April 3, 2015

Does "50% below average" convey innumeracy?

Apparently some people believe that saying "fifty percent are below average" shows ignorance of statistics.

There's some ignorance going on, but it tends to belong to those who act as if the phrase is a mathematical tautology. Consider what happens to a group of non-millionaire friends that gets in a room with Bill Gates: all but one person in that room will have below-room-average wealth.
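The room example is easy to check numerically. A minimal sketch with hypothetical net worths (all figures, including the one standing in for the very rich guest, are made up for illustration):

```python
from statistics import mean, median

# Hypothetical net worths in dollars for nine non-millionaire friends.
friends = [50_000, 80_000, 120_000, 150_000, 200_000,
           250_000, 300_000, 400_000, 600_000]

# One very rich person walks in (illustrative figure).
room = friends + [80_000_000_000]

room_mean = mean(room)
room_median = median(room)

# Count how many people in the room are below the room's mean wealth.
below_mean = sum(wealth < room_mean for wealth in room)

print(f"mean: {room_mean:,.0f}  median: {room_median:,.0f}")
print(f"{below_mean} of {len(room)} are below the mean")
```

Half the room is below the median by construction; everyone but one is below the mean.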

Use that example whenever smug people who "like math" as long as understanding it is optional make fun of the "fifty percent are below average" phrase.

There are many real-life cases where the mean (or "average"; added later: see below, note IV) is different from the median (the point in the support of the distribution that has half the probability mass on either side). Understanding this is quite important for many things in life.

Consider independent random events in time. Think, for example, of random customers walking into a store, computer processes generating demand for CPU time, packets in a switching network requesting dispatch or queueing, time of death for certain terminal diseases, or radioactive decay.

If you have random independent events that can happen with some fixed probability per unit time, then the time between those events follows an exponential distribution with a probability density function
\[
f_{T}(t) = \lambda \, \exp(-\lambda \, t)
\]
where the mean time between occurrences of the event is $1/\lambda$. The median of this distribution is $\log(2)/\lambda$, which implies that there's always more probability on the left side of the mean than on the right. To be precise, $63\%$ of all intervals between successive events have a length below $1/\lambda$, the mean interval length.
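Both numbers can be checked directly, and a quick simulation agrees. A minimal sketch, assuming a hypothetical rate `lam = 2.0` (any positive rate gives the same fraction) and an arbitrary seed for reproducibility:

```python
import math
import random

lam = 2.0  # hypothetical rate; the fraction below does not depend on it
mean_interval = 1 / lam
median_interval = math.log(2) / lam

# Exact: Pr(T <= 1/lambda) = 1 - exp(-lambda * (1/lambda)) = 1 - exp(-1).
exact_fraction = 1 - math.exp(-lam * mean_interval)

# Simulation: draw exponential inter-event times and count those below the mean.
rng = random.Random(2015)  # arbitrary seed
samples = [rng.expovariate(lam) for _ in range(200_000)]
simulated_fraction = sum(t < mean_interval for t in samples) / len(samples)

print(f"median {median_interval:.4f} < mean {mean_interval:.4f}")
print(f"exact fraction below the mean: {exact_fraction:.4f}")
print(f"simulated fraction below the mean: {simulated_fraction:.4f}")
```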

"Sixty-three percent are below the mean." And true!

This asymmetry, from skewness of the distribution, also applies to more complex inter-temporal laws with dependent events, like Weibull random variables, and to power laws, which describe many natural, social, and artificial phenomena. Not always $63\%$, obviously.

So, the next time someone mocks the "fifty percent below average" as proof of innumeracy, educate them about the difference between the mean and the median.

-- -- -- --

Note I: Neil nothing-like-Carl-Sagan Tyson apparently uses the phrase to mock other people. This is no surprise, since his schtick is basically the same as Penn & Teller's: mockery of the out-group and praise of the in-group, with no education at all or, occasionally, anti-education.

Note II: $\log(2)$ is the logarithm of $2$ in the natural base $e$. Even though I'm an engineer, I follow the mathematicians' convention and use $\log_{10}$ or $\log_{2}$ to make explicit when I'm not using the natural base.

Note III: Yes, it's always $63\%$, no matter the $\lambda$:
\[
\Pr(T \le 1/\lambda) = \int_{0}^{1/\lambda} \, \lambda \, \exp( - \lambda \, t) \, dt = \Bigg[ - \exp( - \lambda \, t) \Bigg]_{0}^{1/\lambda} = 1 - e^{-1} \approx 0.63.
\]
This has to do with the exponential distribution and its peculiarities. As you can see, unlike many "science" popularizers, I show my work.

Note IV: A family member points out that "average" can be used for many other measures of central tendency (a point I had made in this earlier post), but: (a) pretty much all instances of the use of that phrase that I've seen refer to the mean; and (b) the people who mock the usage I explain are generally not cognizant of the other measures of central tendency, they just want to play the identity game.

Tuesday, March 24, 2015

The danger of weak arguments

Weak arguments are not neutral, they are damaging for technical or scientific propositions.

There's overwhelming evidence for the proposition "Earth is much older than 6000 years." (It's about 4.54 billion years old, give or take fifty million.) Let's say that Bob, who likes science, as long as he doesn't have to learn any, is arguing with Alex, an open-minded young-Earth creationist:

Alex: Earth was created precisely on Saturday, October 22, 4004 B.C., at 6:00 PM, Greenwich Mean Time, no daylight savings.

Bob: That's ridiculous, we know from Science(TM) that the Earth is much older than that.

Alex: What science? I'm willing to listen, but not without details.

Bob: Well, scientists know exactly and it was in Popular Science the other day, too.

Alex: What did the Popular Science article say?

Bob: I forget, but it had two pretty diagrams, lots of numbers, and a photo of Neil DeGrasse Tyson in his office. He has a wood model of Saturn that he made when he was a kid.

Alex: So you don't really know how the age of the Earth is calculated by these scientists, you're just repeating the conclusion of an argument that you didn't follow. Maybe you didn't follow because it's a flawed argument.

Bob: I don't remember, it's very technical, but the scientists know and that's all I need. Why don't you believe in Science(TM)?

Alex: It appears to me that your argument is simply intimidation: basically "if you don't agree with me, I'll tag you with a fashionable insult." Perhaps that's also the argument of the scientists. They certainly sound smug on television, as if they're too good to explain themselves to us proles.

Alex, despite his nonsensical belief about the age of the Earth, is actually right about the form of argument; by presenting a weak argument for a truthful proposition, Bob weakens the case for that proposition. Note that this is purely a psychological or Public Relations issue. Logically, a bad argument for a proposition shouldn't change the truth of that proposition. Too bad people's brains aren't logical inference machines.

(There's a Bayesian argument for downgrading a belief in a proposition when the case presented for that proposition is weak, but a rational person trying to learn in a Bayesian manner the truth of a proposition will do a systematic search over the space of arguments, not just process arguments collected by convenience sampling.)

This is one of the major problems with people who like science but don't learn any: because of the way normal people process arguments and evidence, having many Bobs around helps the case of the Alexes.

A weak argument for a true proposition weakens the public's acceptance of that proposition. People who like science without learning any are fountains of weak arguments.

Let's convince people who "like science" that they should really learn some.

Friday, March 20, 2015

Adventures in science-ing among the general public

I've been running an informal experiment in social situations, based on an example by physicist Eric Mazur:

A light car moving fast collides with a slow heavy truck. Which of the following options is true?

a) The force that the car exerts on the truck is smaller than the force that the truck exerts on the car.


b) The force that the car exerts on the truck is equal to the force that the truck exerts on the car.


c) The force that the car exerts on the truck is larger than the force that the truck exerts on the car.


d) To know which force is larger (that of the car on the truck or that of the truck on the car) we need to know more details, for example the speed and weight (mass, really) of each vehicle.


The majority in my convenience sample pick the last option, d. Included in this sample are people with science and engineering degrees. Most of the people I asked this question can quote Newton's third law of motion: when prompted with "every action has..." they complete it with "an equal and opposite reaction."

So far my convenience sample replicates Mazur's results.

But unlike his measurement (which was made with those classroom clickers that universities use to avoid hiring more faculty and having smaller, more personalized class sessions), mine sometimes comes with arguments, explanations, and resistance.

And here's the interesting part: the farther the person's training or occupation is from science and technology, the stronger their objections and attempts to argue for d, even as they quote Newton. I don't think this is the Dunning-Kruger effect. It's more like a disconnect between concept, principle, meaning, and application.

It's not like linking concepts to principles and meaning and then applying those concepts is important, right? Especially in science and engineering...
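For the record, the answer is (b), and the application can be sketched numerically. The following is a minimal illustration with hypothetical masses and speeds, assuming a one-dimensional, perfectly inelastic head-on collision with no external forces; the final velocity comes from momentum conservation, which is Newton's third law summed over the duration of the collision.

```python
# Hypothetical values in SI units; velocities are signed (head-on collision).
m_car, v_car = 1_000.0, 30.0       # light car moving fast
m_truck, v_truck = 10_000.0, -5.0  # heavy truck moving slowly the other way

# Assumption: perfectly inelastic collision, so both end with the same velocity.
v_final = (m_car * v_car + m_truck * v_truck) / (m_car + m_truck)

# Impulse (force integrated over the collision) received by each vehicle.
impulse_on_car = m_car * (v_final - v_car)
impulse_on_truck = m_truck * (v_final - v_truck)

# Equal and opposite impulses, very unequal changes in velocity.
dv_car = v_final - v_car
dv_truck = v_final - v_truck

print(f"impulse on car:   {impulse_on_car:,.0f} N s")
print(f"impulse on truck: {impulse_on_truck:,.0f} N s")
print(f"velocity change, car: {dv_car:.1f} m/s, truck: {dv_truck:.1f} m/s")
```

The forces are equal in magnitude; what differs is what an equal force does to very different masses, which is probably what intuition is tracking when people pick (a) or (d).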

Sunday, March 15, 2015

Discussing technical material ≠ arguing opinions

A problem of discussing [minimally] technical material with educated non-technical people is that they don't understand the difference between arguing opinions and discussing technical material.

This problem becomes much greater when the material is probability and when the example is something that the non-technical persons have been using for a while to assert their mastery of quantitative thinking.

Take, for example, the boy-girl problem: "one of two children is a boy, how likely is it that the other is also a boy?"

The right answer is one-half, though figuring that out requires some minimal understanding of probability, namely the difference between states and events and the mechanics of using prior and conditional probability to compute a posterior probability.

That computation is not the point.

The point is that even after this explanation, even in-person, some people think that they can argue for $1/3$. And that verb, "argue," is the problem.

Given a mathematical derivation yielding a result you don't like, the first step in a discussion of the result has to be pointing out the error in the derivation. My video does that for the $1/3$: the error is assigning "prior" probabilities after observing an event, in particular an informative event. (It's at the end of the computation because I need to introduce the basics of probability thinking first.)
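The prior/conditional/posterior mechanics can be sketched in a few lines. This is an illustration, not necessarily the exact setup used in the video: it assumes the four states are equally likely a priori and that the observed child is equally likely to be the first or the second.

```python
from fractions import Fraction

states = ["BB", "BG", "GB", "GG"]

# Prior over states of the world, assigned BEFORE any observation.
prior = {s: Fraction(1, 4) for s in states}

# Assumption: the observed child is equally likely to be either one, so
# Pr(observe a boy | state) is the fraction of boys in that state.
likelihood = {s: Fraction(s.count("B"), 2) for s in states}

# Pr(observe a boy), by the law of total probability.
evidence = sum(likelihood[s] * prior[s] for s in states)

# Bayes's rule for each state.
posterior = {s: likelihood[s] * prior[s] / evidence for s in states}

print(evidence)  # 1/2
print(posterior["BB"])  # 1/2
```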

But the people arguing for $1/3$ after that video never think they have to find the error; they either want both solutions to be valid (and don't understand why that's a problem, which is much more worrisome than not knowing how to think about probability) or appeal to some form of authority, like "I saw the $1/3$ on SciShow and they have millions of views" (which is an even bigger problem and one that is widespread, probably a consequence of how science is being popularized).

For a successful technological society, reality must take precedence over self-esteem, for nature cannot be fooled, paraphrasing a much smarter person (last sentence of report).

Software I use - part of a new computer decision process

Trying to decide whether to update (by buying a new one) my MacBook Pro, get a new MacBook Air, or switch platforms to Linux or even Windows. So I listed the software I use, and the first observation is that unless I'm willing to spend a lot of money on new programs, I'm hard-locked to the Mac platform...

TexShop. I write mostly in LaTeX. In the past I used LaTeX only for research but now I make almost all my handouts and discussion documents in LaTeX. (When I don't, they are almost always InDesign one- or two-page documents.) I know that there are WYSIWYG environments for people who want to write in a Word-like environment, but being a long-time programmer I prefer to edit LaTeX source code.

R. This is my main programming environment, having replaced Stata and MATLAB. I considered using Octave or Python, but in the end R is the best combination for my needs.

Mathematica. Every so often I need to do some tedious calculus, so I trust Mathematica for that. (When I do more than a few pages of calculus by hand, there's usually a missing sign or a transposed fraction somewhere.)

TextWrangler. Heir to the venerable BBEditLite, it's my mainstay text editor. I use it for all text that is not LaTeX, including programming, web posts, drafts of long emails, and outliner for talks. (I don't use a specialized outliner program for the reasons I gave in this post.)

Keynote. I used to use it mostly as a projector management system, with all content created on other tools, but now I use it for about one-quarter to one-third of all slides. Integration with iTouch and iPad allows for good control (which, I'm told, has existed in the Windoze ecosystem for several years now…).

Numbers. Not as good as Excel for most tasks that a manager would use a spreadsheet for, but it's a simple way to mock up quick models for class demonstrations. Anyone doing serious spreadsheet work must use Excel, though, since Apple seems intent on leaving the professionals behind. Really.

Pages. Although I don't use Microsoft Word as a text editor, I occasionally work with people who do. It's hard to believe that a word processor in 2015 doesn't allow facing pages (odd/even pages); were I to use a word processor rather than LaTeX, this would mean Word, not Pages. Apparently Apple is intent on leaving even school reports to Microsoft...

Adobe Illustrator. My main drawing program, for diagrams and illustrations. Even though there are now some minimally acceptable drawing tools in Keynote, they are still very weak compared to Illustrator.

Adobe InDesign. When I need to make diagrams that include a lot of text and not a lot of drawing, I prefer InDesign to Illustrator. InDesign is also my program of choice for making compact handouts, of the type I send for remote discussions or distribute at speaking events. (In the old days, I used to make my teaching handouts with InDesign, but once I went for long handouts, I switched to LaTeX.)

Adobe Photoshop. I use it for final production on many slides, though a little less now as I move towards a simpler aesthetic. It also serves as my photo editor, not that I edit photos that often.

Magical number machine. A good calculator for quick arithmetic, which I used to do with an HP calculator, but gave that away in my last physical decluttering. I also use it to do arithmetic on the projection screen while using boards or flip charts.

LaTeXiT. Quick LaTeX rendering for inclusion in diagrams or slides.

Voila. Page capture on steroids; can capture entire web pages as well. It has some minor editing affordances, but I do all image editing in Photoshop.

Screenflow. Captures screen, mic, and camera, for webinar-style videos. I use it for all sorts of video editing as well. Haven't opened iMovie since I got Screenflow.

VLC. Because Apple's video players are terrible.

NetNewsWire. My RSS feed reader. I could move to the cloud, and have considered that, but for now I'm happy with this. I only open it once a day, in the morning, to get a sense of what's going on.

Google Chrome. It's less of a background hog than Safari, which isn't saying much, really.

Skype. To communicate with people. Despite Microsoft's best efforts to make it unusable, the network I have on Skype is still strong enough for me to use it.

Kindle app. I have lots of Kindle books, so this is a no-brainer. (I replaced a lot of paper books with Kindle books in the 2013 declutter, using the rule that if I was likely to reread a book and its Kindle price was low, I'd rather have the electronic copy and the free physical space.)

iBooks. I also have a lot of ePubs and even some Apple iBooks, so this is again a no-brainer. I think iBooks manages multimedia content better than the Kindle.

iBooks Author. Maybe. I'm considering using this to release an interactive version of some of my teaching materials, but the limited platform (Apple only ecosystem) and the volatility of the eLearning technologies are a concern.

Simple Comic. It reads comic book formats, of course, but also some other formats such as 7z, which can be useful under certain circumstances. Also, I have a number of old comics in .cbr format, for nostalgia's sake.

iTunes. For now my music player; it's acceptable when fed through a quality DAC. Its strong point is organization, though that's just relative to competitors: as far as art music is concerned, no program works well, just passably.

iPhoto, soon to be replaced with Photos. To organize photos; not really a serious competitor to Photoshop when it comes to editing them.

That's it. No Handbrake for a new laptop since they no longer have optical drives (though I might install it for video file conversion, which it does very well); no email program, since I use web interfaces to keep email checking to a minimum; and no games, since I have the three I play on my phone, iTouch, and iPad (falling tiles, mahjong, and solitaire).

Sunday, February 8, 2015

Science popularization has an identity problem

Some influential science popularizers are doing a disservice to public understanding of science and possibly even to science education.

Yes, it's a strong statement. Alas, it's a demonstrable one.

With the caveats that I enjoy the Mythbusters show, especially the recent series with their back-to-origins style, and that this post is not specifically about them, the recent episode about The A-Team presented an almost-perfect example of the problem.

"Stoichiometry."

Midway through the episode Adam uses this word. It's an expensive way of saying "mass balancing of chemical equations" (not how it was described in the show). And then, well... and then Jamie proceeded to not use stoichiometry.

To be concrete: they were exploding propane. Jamie tried mixing it with pure oxygen and got a big explosion. Then they mention stoichiometry. At this point, what they should have done was to introduce some basic chemistry.

The propane molecule has 3 carbon and 8 hydrogen atoms, $\mathrm{C}_{3} \mathrm{H}_{8}$. It burns with molecular oxygen, $\mathrm{O}_{2}$, yielding carbon dioxide, $\mathrm{C} \mathrm{O}_{2}$, and water vapor, $\mathrm{H}_{2} \mathrm{O}$.

Chemists represent reactions with equations, like this:

$\mathrm{C}_{3} \mathrm{H}_{8} + \mathrm{O}_{2} \rightarrow \mathrm{C} \mathrm{O}_{2} + \mathrm{H}_{2} \mathrm{O}$

This equation is unbalanced: for example, there are three carbons on the left-hand side, but only one on the right-hand side. By changing the proportions of reagents, we can get both sides to match:

$\mathrm{C}_{3} \mathrm{H}_{8} + \mathbf{5} \, \mathrm{O}_{2} \rightarrow \mathbf{3} \, \mathrm{C} \mathrm{O}_{2} + \mathbf{4} \, \mathrm{H}_{2} \mathrm{O}$
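The balance can be checked mechanically by counting atoms on each side. Here is a minimal Python sketch of that check; the helper name atom_totals and the dictionary encoding of the formulas are illustrative assumptions, not anything from the show or from a chemistry library.

```python
from collections import Counter

def atom_totals(side):
    """Total atoms of each element on one side of a reaction.

    `side` is a list of (coefficient, formula) pairs, where a formula
    is a dict mapping element symbol to atoms per molecule.
    """
    totals = Counter()
    for coefficient, formula in side:
        for element, count in formula.items():
            totals[element] += coefficient * count
    return totals

PROPANE = {"C": 3, "H": 8}
OXYGEN = {"O": 2}
CARBON_DIOXIDE = {"C": 1, "O": 2}
WATER = {"H": 2, "O": 1}

# Unbalanced version: every coefficient is 1.
unbalanced_ok = (atom_totals([(1, PROPANE), (1, OXYGEN)])
                 == atom_totals([(1, CARBON_DIOXIDE), (1, WATER)]))

# Balanced version: C3H8 + 5 O2 -> 3 CO2 + 4 H2O.
reactants = atom_totals([(1, PROPANE), (5, OXYGEN)])
products = atom_totals([(3, CARBON_DIOXIDE), (4, WATER)])
balanced_ok = reactants == products

print("unbalanced equation balances:", unbalanced_ok)
print("balanced equation balances:", balanced_ok)
print("atoms per side:", dict(reactants))
```

Both sides of the balanced equation carry 3 carbon, 8 hydrogen, and 10 oxygen atoms, while the version with all coefficients equal to one fails the check.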

Once we have this balance, we can determine that we need 160 grams of oxygen for each 44 grams of propane. For this we need to look up the atomic masses (to compute molar masses) of carbon (12 g/mol), hydrogen (1 g/mol) and oxygen (16 g/mol). (*)
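The arithmetic behind the 160 grams and 44 grams is short enough to spell out. The Python sketch below uses the rounded atomic masses quoted above; the function name molar_mass and the dictionary encoding are illustrative assumptions.

```python
# Rounded atomic masses in g/mol, as quoted in the text.
ATOMIC_MASS = {"C": 12, "H": 1, "O": 16}

def molar_mass(formula):
    """Molar mass in g/mol of a formula given as {element: atom count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())

PROPANE = {"C": 3, "H": 8}
OXYGEN = {"O": 2}

# Balanced reaction: 1 mol of propane burns with 5 mol of molecular oxygen.
propane_grams = 1 * molar_mass(PROPANE)   # 3*12 + 8*1 = 44
oxygen_grams = 5 * molar_mass(OXYGEN)     # 5 * (2*16) = 160
ratio = oxygen_grams / propane_grams

print(f"{oxygen_grams} g of oxygen per {propane_grams} g of propane")
print(f"oxygen-to-propane mass ratio: {ratio:.2f}")
```

The ratio comes out at about 3.64 grams of oxygen per gram of propane.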

Back on the Mythbusters, after mentioning stoichiometry, Jamie starts trying out different proportions of propane to oxygen. If he had actually used stoichiometry he'd already have the proportions calculated, as I did above: 160/44, or roughly 3.6 times as much oxygen as propane by mass. No need to experiment with different proportions.

(Yes, there's a lot of experimentation in engineering, but no engineer ignores the basic scientific foundations of her field. Chemical engineers don't figure out mass balances by trial and error; they use trial and error after exhausting the established science.)

This illustrates a major problem in the way science is being popularized: to a segment of the educated and interested audience, science is an identity product. Like a Prada bag or a sports franchise logo on a t-shirt, they see science as something that can signal membership in a desired group and exclusion from undesirable groups.

Hence the word "stoichiometry" inserted in a show that doesn't actually use stoichiometry.

"Stoichiometry" here is, like the sports franchise logo, purely a symbol. The audience learns the word, in the sense that they can repeat it, but not the concept, let alone the principles and the tools of stoichiometry. The audience gains a way to signal that they "like" science, but no actual knowledge. Like a sedentary person who wears "team colors" to watch televised games.

Some successful science popularizers pander to this "like, not learn, science" audience, instead of trying to use that audience's interest in science to educate them.

So what, most people will ask. It's the market working: you give the audience what they want. And there's no question that selling science as identity is good business. Shows like House MD, Bones, The Big Bang Theory, all take advantage of this trend. Gift shops at science museums cater to the identity much more than the education: a look at their sales typically finds much more logo-ed merchandise than chemistry sets or microscopes.

(Personal anecdote: despite having three science museums nearby, I had to use the web to get a real periodic table poster. A printable simple table from Los Alamos National Lab.)

"Liking" science without learning it is bad for society:

1. Crowds out opportunities for education. People have limited time (and money) for their hobbies and activities. If they spend their "science budget" on identity, they won't have any left for actual science learning. Many more people read Feynman's two autobiographies than his Lectures On Physics or his popular physics books.

2. Devalues the work of scientists and engineers, by presenting a view of science that excludes the hard work of learning and the value of the knowledge base (trial-and-error in lieu of mass balance calculations, for example). Some people end up thinking that science is just another type of institutional credential (or celebrity worship) instead of being validated by physical reality.

3. Weakens science education. Some people who go into science expect it to be easy and entertaining (in the purely ludic sense), instead of hard but rewarding (deriving satisfaction from really understanding something), as that's what the popularization depicts. They then want schools to match those expectations. While colleges may not want to simplify science and engineering classes, they put pressure on faculty for more "engaging" teaching: less technical, more show. (**)

4. As science becomes more of an identity product to some people, and increasingly perceived as identity-only by others, it becomes more vulnerable to non-scientific identity threats, such as derailing a major scientific and technical achievement in space exploration by talking about sartorial choices and sociological forces in academia.


So, what can we do?

First, we should recognize that an interest in science, even if currently trending towards identity, can be channeled into support for science and science education. As societal trends go, a generalized liking for science is better than most alternatives.

Second, there are plenty of sources of information and education that can be used to learn science. There's a broad variety of online resources for science education at different levels of knowledge, free and accessible to anyone with an internet connection (or indeed a library card; books were the original MOOCs).

Third, current "science as identity" popularizers may be open to educating their audiences. Contacting them, offering feedback, and using social media to otherwise proselytize for science (as in scientific knowledge and thinking like a scientist) might induce them to change their approach.

The most important thing anyone can do, though, is to try to get people who "like" science to understand that they should really learn some.

(Final note on the A-Team episode: Adam should have played Murdock, not Hannibal.)

- - - -
(*) I learned to do this on my own as a kid, but the material was covered in ninth grade chemistry. (A long time ago in a country far away, in ninth grade you chose a technical or artistic area in school; mine was 'chemical technology' because my school didn't have electronics.) A side-effect of my early interest in chemistry is that I have quasi-Brezhnevian eyebrows: you burn them off five or six hundred times, they grow back with a vengeance.

(**) Some schools protect their main reputation-building degrees by creating non-technical versions of the technical courses and bundling them into subsidiary degrees. So, for example, they have information technology courses, which sound like computer science courses but are in fact nothing like them.

Another approach is the encroachment of humanities, arts, and social sciences "breadth" requirements into science and engineering degrees. When I studied EECS in Europe, we had five years of math, physics, chemistry, and engineering courses. A similar degree in the US has four years and usually a minimum of one-year-equivalent of those "breadth" requirements, though some people can have more than two-year-equivalent by choosing "soft engineering" courses like "social impact of computers."

Friday, January 9, 2015

Three lessons from teaching MBAs in 2014


Use longer, content-heavier handouts; integrate local and up-to-date content; and show numbers and math.


Change 1: Longer and content-heavier handouts

The only significant complaint from previous cohorts was regarding the lack of a textbook. I post a selection of materials to the course support intranet (consultancy reports, managerial articles, academic papers, book chapters), but a few students always remark on the lack of a unifying text for the class.

(There's no unifying text because -- in my never humble opinion -- most Consumer Behavior textbooks are written from a consumer psychology point of view, while I prefer a more marketing engineering point of view.)

Taking that into consideration, I made longer, denser handouts, each like a book chapter rather than just support for in-class activities. The class is participant-centered, with minimal lecturing, so these longer handouts help students feel that they have a coherent framework to fall back on.

Handouts changed from a median size of four pages of mostly diagrams, in 2012, to a median size of eighteen pages of text, diagrams, and numbers, in 2014. (Just a reminder, since there's some confusion about it, that handouts and slides serve different purposes.)


Change 2: More local content

I used local content in most class sessions: local products, merchandising from local retailers, and examples from local advertising. In particular, using outdoor ads (billboards) from around the campus allowed students to recognize their location, for a little a-ha moment that improves mood.

The main advantage of local content is student familiarity with it. Examples are more effective when students don't have to learn new brands, new product categories, and other regional differences. A disadvantage is additional preparation work, but that work also signals to the students the instructor's commitment to the class.

A secondary advantage of local content is as evidence of instructor competence. Local content, and up-to-date content, requires confidence, ability, and practice. For this reason alone, it's worth the additional work, even if old or foreign examples would be equally good for teaching.


Change 3: More numerical content

The rise of analytics is a highly visible trend in marketing; marketing courses are therefore increasingly quantitative. Still, most Consumer Behavior courses shy away from math.

Our course was different: there were plenty of numbers and models. I did most of the work, not the students, since the objective was not to teach them analytics; but I did do the work, so the students were shown modern marketing techniques rather than a lot of hand-waving.

For example, to illustrate the effects of memory on different types of advertising timing, I used a computer simulation of a learning model: instead of rules-of-thumb for media planning, students saw how learning and forgetting rates change the effectiveness of blitz versus pulsing media timing.
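To give a flavor of that kind of simulation, here is a minimal Python sketch. It is not the model used in class: the update rule (a fixed fraction forgotten each period, a fixed fraction of the remaining gap learned on each exposure), the rates, the twelve-period horizon, and the two schedules are all assumptions chosen for illustration.

```python
def simulate_awareness(schedule, learn_rate=0.3, forget_rate=0.1):
    """Awareness after each period for a 0/1 advertising schedule.

    Assumed model: each period a fraction `forget_rate` of awareness
    decays; an exposure then closes a fraction `learn_rate` of the gap
    between current awareness and full awareness (1.0).
    """
    awareness = 0.0
    path = []
    for exposed in schedule:
        awareness *= 1.0 - forget_rate
        if exposed:
            awareness += learn_rate * (1.0 - awareness)
        path.append(awareness)
    return path

PERIODS = 12
# Same budget (four exposures), different timing.
blitz = [1 if t < 4 else 0 for t in range(PERIODS)]        # periods 0-3
pulsing = [1 if t % 3 == 0 else 0 for t in range(PERIODS)]  # periods 0, 3, 6, 9

blitz_path = simulate_awareness(blitz)
pulsing_path = simulate_awareness(pulsing)

print(f"blitz:   peak {max(blitz_path):.2f}, final {blitz_path[-1]:.2f}")
print(f"pulsing: peak {max(pulsing_path):.2f}, final {pulsing_path[-1]:.2f}")
```

With these assumed rates, the blitz schedule reaches a higher peak, while the pulsing schedule retains more awareness at the end of the twelve periods; changing learn_rate and forget_rate changes the size of both gaps.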

(References to technical materials were provided for students wishing to learn more, of course.)


Results

Despite objectively covering more material than before and using harder assessment measures, student grades were higher. In other words, these changes achieved their primary objective: students learned more material and learned it better.

Class dynamics were better than before, though they were pretty good in previous years. When I pick up my teaching evals in 2016 (they're on paper), I'll know whether I kept my top-5 ranking from 2012.

Addendum: In short events since the MBA class, I replicated these three changes, yielding performance improvements along all dimensions: participant learning (as measured by in-event exercises), participant experience (as measured by client-run event evaluations), follow-up contact with the participants, and word-of-mouth.