Friday, July 5, 2019

A family has two children. One is a boy. Now, do the math!


Problem


A family has two children. One is a boy. How likely is it that the other child is a boy?


Popular yet wrong solution


"There are four possible cases: two boys, a boy and a girl, a girl and a boy, and two girls. But because one child is a boy, it can't be the last case (two girls), so there are only three cases. Therefore the probability is one-third."

This solution is popular. It has been used by, among others, Nassim Nicholas Taleb (in a since-deleted tweet), vlogbrother Hank Green in an old SciShow episode (IIRC), probability instructors trying to show bored undergraduates how interesting their class is, and people interviewing job candidates.

This solution is fun because it's counter-intuitive, and for the same reason it looks like a smart solution.

This solution is wrong.

It's wrong because after we use "one is a boy" to eliminate the possibility of a family with two girls, we can no longer divide the probability equally among the remaining three possibilities. Dividing probability equally is justified when we have no information, but not once information has already been used to change the set of possibilities.

The more attentive reader will notice that this is the same error most people make in the Monty Hall three-door problem. As a general rule, it's a bad idea to try to solve math problems by hand-waving.

If it's a math problem, do the math.*


Frequentist approach


Let's say we have a large number of cases, 4000 families for example. That's 1000 families for each combination of children: $(B,B), (B,G), (G,B)$, and $(G,G)$. Now we observe one child, picked at random, in each family and count how many of the observed children are boys:

1000 $(B,B)$ families yield a total of 1000 boys;
1000 $(B,G)$ families yield a total of 500 boys;
1000 $(G,B)$ families yield a total of 500 boys;
1000 $(G,G)$ families yield a total of 0 boys.

We have a total of 2000 observed boys, and 1000 of these boys come from the case when the family has two boys, $(B,B)$. Half the time we observe a boy the underlying family has two boys; therefore the probability of a second boy is 1/2.
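The count above can be reproduced in a few lines of Python. This is only a sketch of the same arithmetic, under the assumptions already stated in the text (1000 families of each type, and the observed child picked at random within each family); the names `FAMILIES_PER_TYPE` and `expected_observed_boys` are made up for the illustration.

```python
from fractions import Fraction

# Assumption from the text: 1000 families for each combination of children.
FAMILIES_PER_TYPE = 1000
family_types = [("B", "B"), ("B", "G"), ("G", "B"), ("G", "G")]


def expected_observed_boys(family, count):
    """Number of observed boys when one child per family is observed at random."""
    share_of_boys = Fraction(family.count("B"), len(family))
    return share_of_boys * count


observed = {f: expected_observed_boys(f, FAMILIES_PER_TYPE) for f in family_types}

total_boys = sum(observed.values())                 # the "possibles"
boys_from_two_boy_families = observed[("B", "B")]   # the "favorables"
prob_two_boys = boys_from_two_boy_families / total_boys

print(total_boys, prob_two_boys)  # 2000 1/2
```

Using `Fraction` keeps the result exact, so the ratio comes out as 1/2 rather than a floating-point approximation.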

If instead of 4000 we had generic $N$ families, and called them "cases," this argument would be the frequentist derivation of the result. In frequentist parlance, the 2000 total boys are called the "possibles" and the 1000 boys from $(B,B)$ are called the "favorables." The probability is calculated as the ratio of favorables to possibles.

(The frequentist approach is how most people learn about probability and combinatorics.)
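For readers who prefer to see the generic-$N$ version run, here is a hedged Monte Carlo sketch. It assumes, as the counting argument does, that each child is a boy or a girl with equal probability, independently, and that the observed child is picked at random; the function name and the seed are arbitrary choices for the illustration.

```python
import random


def estimate_two_boys_given_observed_boy(num_families, seed=0):
    """Estimate Pr(two boys | the randomly observed child is a boy)."""
    rng = random.Random(seed)
    observed_boys = 0                # the "possibles"
    observed_boys_with_brother = 0   # the "favorables"
    for _ in range(num_families):
        # Assumption: each child is B or G with probability 1/2, independently.
        family = (rng.choice("BG"), rng.choice("BG"))
        seen = rng.randrange(2)      # observe one of the two children at random
        if family[seen] == "B":
            observed_boys += 1
            if family[1 - seen] == "B":
                observed_boys_with_brother += 1
    return observed_boys_with_brother / observed_boys


estimate = estimate_two_boys_given_observed_boy(200_000)
print(round(estimate, 3))
```

With a couple of hundred thousand families the estimate should land close to 1/2; the exact digits depend on the seed.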


Bayesian approach


Frequentist arguments become unwieldy with more elaborate problems, so we can use this puzzle to illustrate a more elegant approach, Bayesian inference.†

First let's call things by their name: $(B,B), (B,G), (G,B)$, and $(G,G)$ are the unobserved states of the world. "One is a boy," which we'll represent by $B$, is an observed event.

Some events are uninformative, for example "one is blond," in that they don't help answer the question. Others like "one is a boy," $B$, are informative, because they help answer the question. But how can we tell?

Event $B$ is informative because it happens with different probabilities in different states of the world; therefore observing $B$ gives information about what states we're more likely to be in:

$\Pr(B|(B,B)) = 1$;
$\Pr(B|(B,G)) = 1/2$;
$\Pr(B|(G,B)) = 1/2$;
$\Pr(B|(G,G)) = 0$.

We don't know the unobserved state of the world (that is, in which of those four states the family in question falls), so in this situation we can assign equal probabilities to all four (we could look up demographics tables and confirm the numbers, but let's keep this simple):

$\Pr((B,B)) = \Pr((B,G)) = \Pr((G,B)) = \Pr((G,G)) = 1/4$.

What we want is the probability of the state $(B,B)$ given that we have observed the event $B$; this is the conditional probability $\Pr((B,B)|B)$, which can be computed using the Bayes formula,

\[
\Pr((B,B)|B) = \frac{\Pr(B|(B,B)) \Pr((B,B))}{\Pr(B)}.
\]
Because $\Pr(B)$ trips up a lot of people, let's be clear about what it is: it's the probability that you will observe a boy in general, not in this particular case. It's sometimes called the a-priori probability or the unconditional probability. This is the probability that if we picked a two-child family at random and then picked one of the children at random, that child would be a boy. It's not "one, because we observe a boy," a common error.

To compute $\Pr(B)$ we must consider all four states of the world and add up ("integrate over the space of states" in expensive wording) the probability of observing a boy in each of these states weighted by the probability of the state itself:

$\begin{array}{rl}\Pr(B) =& \Pr(B|(B,B)) \Pr((B,B)) + \\
 & \Pr(B|(B,G)) \Pr((B,G)) +  \\
& \Pr(B|(G,B)) \Pr((G,B)) + \\
&\Pr(B|(G,G)) \Pr((G,G)) \\
=& 1/2
\end{array}$

(Unsurprisingly, it's 1/2, since half of the children are boys.)
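For anyone who wants the arithmetic spelled out, plugging the conditional probabilities and the equal $1/4$ priors from above into that sum gives

\[
\Pr(B) = 1 \cdot \frac{1}{4} + \frac{1}{2} \cdot \frac{1}{4} + \frac{1}{2} \cdot \frac{1}{4} + 0 \cdot \frac{1}{4} = \frac{1}{4} + \frac{1}{8} + \frac{1}{8} + 0 = \frac{1}{2}.
\]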

Now we can compute our quantity of interest $\Pr((B,B)|B)$ by substituting the numbers into the Bayes formula. In fact, we can do that for all the states:

$\Pr((B,B)|B) = 1/2$;
$\Pr((B,G)|B) = 1/4$;
$\Pr((G,B)|B) = 1/4$;
$\Pr((G,G)|B) = 0$.

(As they used to say in the Soviet Union, trust but verify: check those numbers to be sure.)
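One way to do that verification is to let a few lines of Python apply the Bayes formula to all four states. This is a minimal sketch that uses only the numbers given above (equal priors of 1/4 and the four conditional probabilities of observing a boy); the variable names `prior`, `likelihood`, `evidence`, and `posterior` are just labels chosen for the illustration.

```python
from fractions import Fraction

states = [("B", "B"), ("B", "G"), ("G", "B"), ("G", "G")]

# Prior: equal probability for each unobserved state of the world.
prior = {s: Fraction(1, 4) for s in states}

# Likelihood Pr(B | state): chance that a randomly observed child is a boy.
likelihood = {s: Fraction(s.count("B"), 2) for s in states}

# Pr(B): sum over states of Pr(B | state) * Pr(state).
evidence = sum(likelihood[s] * prior[s] for s in states)

# Bayes formula, applied to every state.
posterior = {s: likelihood[s] * prior[s] / evidence for s in states}

for s in states:
    print(s, posterior[s])
```

The posteriors come out as 1/2, 1/4, 1/4, and 0, and they sum to one, as probabilities over all the states should.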



If it's a math problem, do the math.




-- -- -- --
* "Do the math" means apply the rules of math, not just the notation and numbers.

† There's a bit of a schism in statistical modeling between frequentists and Bayesians. I'll let you figure out which side I'm on.