Bayes's Theorem Archives - Probably Overthinking It

The Girl Born on Tuesday

January 31, 2026 AllenDowney

Some people have strong opinions about this question:

In a family with two children, if at least one of the children is a girl born on Tuesday, what are the chances that both children are girls?

In this article, I hope to offer

A solution to one interpretation of this question,
An explanation of why the solution seems so counterintuitive,
A discussion of other interpretations, and
An implication of this problem for teaching and learning probability.

Let’s get started.

One interpretation

One reason this problem is contentious is that it is open to multiple interpretations. I’ll start by presenting just one – then we’ll get back to the ambiguity.

First, to avoid real-world complications, let’s assume an imaginary world where:

Every family has two children.
50% of children are boys and 50% are girls.
All days of the week are equally likely birth days.
Genders and birth days are independent.

Second, we will interpret the question in terms of conditional probability; that is, we’ll compute P(B|A), where

A is “at least one of the children is a girl born on Tuesday”, and
B is “both children are girls”.

Under these assumptions and this interpretation, the answer is unambiguous – and it turns out to be 13/27 (about 48.1%).
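One way to check this number is to enumerate the sample space directly. The following is a minimal sketch under the assumptions above; the names `children`, `families`, and `TUESDAY`, and the choice of 2 as the label for Tuesday, are mine and not part of the original problem.

```python
from fractions import Fraction
from itertools import product

# Each child is a (gender, day) pair; day 2 stands for Tuesday (an arbitrary label).
children = list(product(['girl', 'boy'], range(7)))   # 14 equally likely kinds of child
families = list(product(children, repeat=2))          # 196 equally likely families

TUESDAY = 2

def is_tuesday_girl(child):
    gender, day = child
    return gender == 'girl' and day == TUESDAY

# Event A: at least one child is a girl born on Tuesday.
event_a = [fam for fam in families if any(is_tuesday_girl(c) for c in fam)]

# Event A and B: families in A where both children are girls.
both_girls = [fam for fam in event_a if all(gender == 'girl' for gender, _ in fam)]

answer = Fraction(len(both_girls), len(event_a))
print(len(event_a), len(both_girls), answer)   # 27 13 13/27
```

Of the 196 equally likely families, 27 satisfy A, and 13 of those have two girls.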

But why?

This problem is counterintuitive because it elicits confusion between causation and evidence.

If a family has a girl born on a Tuesday, that does not cause the other child to be a girl.
But the fact that a family has a girl born on Tuesday is evidence that the other child is a girl.

To see why, imagine two families: the first has one girl and the other has ten girls. Suppose I choose one of the families at random, check to see whether they have a girl born on Tuesday, and find that they do.

Which family do you think I chose?

If I chose the family with one girl, the chance is only 1/7 (about 14%) that she was born on Tuesday.
If I chose the family with ten girls, the chance is about 79% that at least one of them was born on a Tuesday.
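These two numbers can be reproduced with a short calculation. This is a sketch that assumes birth days are independent and uniform over seven days; the helper `prob_at_least_one_tuesday` is a name I made up for illustration, and the final step assumes the two families were equally likely to be chosen.

```python
from fractions import Fraction

def prob_at_least_one_tuesday(n_girls):
    """Probability that at least one of n_girls was born on Tuesday,
    assuming birth days are independent and uniform over 7 days."""
    return 1 - Fraction(6, 7) ** n_girls

p_one = prob_at_least_one_tuesday(1)     # exactly 1/7
p_ten = prob_at_least_one_tuesday(10)    # about 0.786
print(p_one, float(p_ten))

# If the two families are equally likely a priori, Bayes's rule gives
# the probability that I chose the family with ten girls.
posterior_ten = p_ten / (p_one + p_ten)
print(float(posterior_ten))
```

Under those assumptions, the family with ten girls is the better guess, with a posterior probability of roughly 85%.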

And that’s the key to understanding the problem:

A family with more than one girl is more likely to have one born on Tuesday. Therefore, if a family has a girl born on a Tuesday, it is more likely that they have more than one girl.

That’s the qualitative argument. Now we’ll make it quantitative – with Bayes’s Theorem.

Bayes’s Theorem

Let’s start with four kinds of two-child families.

kinds = ['Boy Boy', 'Boy Girl', 'Girl Boy', 'Girl Girl']

Under our simplifying assumptions, these combinations are equally likely, so their prior probabilities are equal.

from fractions import Fraction
import pandas as pd

prior = pd.Series(Fraction(1, 4), kinds)
display(prior, 'prior')

	prior
Boy Boy	1/4
Boy Girl	1/4
Girl Boy	1/4
Girl Girl	1/4

Now for each kind of family, let’s compute the likelihood of a girl born on Tuesday:

If there are two boys, the probability of a girl born on Tuesday is 0.
If there is one girl, the probability she is born on Tuesday is 1/7.
If there are two girls, the probability at least one is born on Tuesday is 1 - (6/7)**2.
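The last likelihood can also be checked by counting. Here is a small sketch that enumerates the 49 equally likely pairs of birth days for two girls; the label 2 for Tuesday is an arbitrary choice of mine.

```python
from fractions import Fraction
from itertools import product

TUESDAY = 2  # arbitrary label for Tuesday among days 0..6
day_pairs = list(product(range(7), repeat=2))            # 49 equally likely pairs
with_tuesday = [pair for pair in day_pairs if TUESDAY in pair]

likelihood_two_girls = Fraction(len(with_tuesday), len(day_pairs))
print(likelihood_two_girls)                              # 13/49
```

The count is 13 rather than 14 because the pair where both girls are born on Tuesday is only one outcome.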

Let’s put those values in a list.

p = Fraction(1, 7)
likelihood = [0, p, p, 1 - (1-p)**2]
likelihood

[0, Fraction(1, 7), Fraction(1, 7), Fraction(13, 49)]

To compute the posterior probabilities, we multiply the prior and likelihood, then normalize so the results add up to 1.

posterior = prior * likelihood
posterior /= posterior.sum()
display(posterior, 'posterior')

	posterior
Boy Boy	0
Boy Girl	7/27
Girl Boy	7/27
Girl Girl	13/27

The posterior probability of two girls is 13/27. As always, Bayes’s Theorem is the chainsaw that cuts through the knottiest problems in probability.

Other versions

Everything so far is based on the interpretation of the question as a conditional probability. But many people have pointed out that the question is ambiguous because it does not specify how we learn that the family has a girl born on a Tuesday.

This objection is valid:

The answer depends on how we get the information, and
The statement of the problem does not say how.

There are many versions of this problem that specify different ways you might learn that a family has a girl born on a Tuesday, and you might enjoy the challenge of solving them.

In general, if we specify the process that generates the data, we can use simulation, enumeration, or Bayes’s Theorem to compute the conditional probability given the data.
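As an example of the simulation approach, here is a rough sketch for the interpretation used so far, where we simply condition on the event. The function name `simulate`, the number of families, and the seed are arbitrary choices of mine.

```python
import random

def simulate(n_families=200_000, seed=17):
    """Estimate P(both girls | at least one girl born on Tuesday) by simulation."""
    rng = random.Random(seed)
    n_condition = 0   # families with at least one girl born on Tuesday
    n_both = 0        # ... of which both children are girls
    for _ in range(n_families):
        # Each child is (is_girl, day), with day 2 standing for Tuesday.
        kids = [(rng.random() < 0.5, rng.randrange(7)) for _ in range(2)]
        if any(is_girl and day == 2 for is_girl, day in kids):
            n_condition += 1
            if all(is_girl for is_girl, _ in kids):
                n_both += 1
    return n_both / n_condition

estimate = simulate()
print(estimate, 13 / 27)
```

With this many families, the estimate should land within a percentage point or so of 13/27.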

But what should we do if the data-generating process is not uniquely specified?

One option is to say that the question has no answer because it is ambiguous.
Another option is to specify a prior distribution of possible data-generating processes, compute the answer under each process, and apply the law of total probability.

Some of the people who choose the second option also choose a prior distribution so that the answer turns out to be 1/2. In my view, that is a correct answer to one interpretation, but that interpretation seems arbitrary – by choosing different priors, we can make the answer almost anything.
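To make the dependence on the data-generating process concrete, here is a sketch of one alternative process, chosen by me for illustration: someone picks one of the two children at random and reports that child's gender and birth day. Under that assumed process, learning "girl born on Tuesday" yields 1/2 rather than 13/27.

```python
from fractions import Fraction
from itertools import product

children = list(product(['girl', 'boy'], range(7)))   # day 2 stands for Tuesday
families = list(product(children, repeat=2))          # 196 equally likely families

matching = 0      # (family, reported child) combinations that yield the report
both_girls = 0    # ... in which both children are girls
for family in families:
    for reported in family:            # each child is equally likely to be reported
        if reported == ('girl', 2):
            matching += 1
            if all(gender == 'girl' for gender, _ in family):
                both_girls += 1

print(Fraction(both_girls, matching))   # 1/2
```

The difference from the event-based calculation is that a family with two girls is no longer favored: the report describes one specific child, so the other child is a girl with probability 1/2.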

I prefer the interpretation I presented, because

I believe it is what was intended by the people who posed the problem,
It is consistent with the conventional interpretation of conditional probability,
It yields an answer that seems paradoxical at first, so it is an interesting problem,
The apparent paradox can be resolved in a way that sheds light on conditional probability and the idea of independent events.

So I think it’s a perfectly good problem – it’s just hard to express it unambiguously in natural language (as opposed to math notation).

But you don’t have to agree with me. If you prefer a different interpretation of the question, and it leads to a different answer, feel free to write a blog post about it.

What about independence?

I think the girl born on Tuesday carries a lesson about how we teach. In introductory probability, students often learn two ways to compute the probability of a conjunction. First they learn the easy way:

P(A and B) = P(A) P(B)

But they are warned that this only applies if A and B are independent. Otherwise, they have to do it the hard way:

P(A and B) = P(A) P(B|A)

But how do we know whether A and B are independent? Formally, they are independent if

P(B|A) = P(B)

So, in order to know which formula to use, you have to know P(B|A). But if you know P(B|A), you might as well use the second formula.

Rather than check independence by conditional probability, it is more common to assert independence by intuition. For example, if we flip two coins, we have a strong intuition that the outcomes are independent. And if the coins are known to be fair, this intuition is correct. But if there is any uncertainty about the probability of heads, it is not, because the first outcome is evidence about that probability, and therefore about the second outcome.
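Here is a small worked example of the coin case. It is a sketch with made-up numbers: I assume the same coin is flipped twice and that its probability of heads is either 2/5 or 3/5, with the two possibilities equally likely.

```python
from fractions import Fraction

# Hypothetical setup: the coin's probability of heads is either 2/5 or 3/5,
# and we consider the two possibilities equally likely.
biases = {Fraction(2, 5): Fraction(1, 2), Fraction(3, 5): Fraction(1, 2)}

p_first = sum(w * b for b, w in biases.items())        # P(A): first flip is heads
p_both = sum(w * b * b for b, w in biases.items())     # P(A and B): both flips are heads
p_second_given_first = p_both / p_first                # P(B|A)

print(p_first, p_second_given_first)                   # 1/2 13/25
```

By symmetry, P(B) is also 1/2, but P(B|A) is 13/25, so the two flips are not independent even though neither flip causes the other.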

The coin example, along with Monty Hall, Bertrand’s Boxes, and many more, demonstrates the real lesson of the girl born on Tuesday: our intuition for independence is wildly unreliable.

Which means we might want to rethink the way we teach it.

In general

Previously I wrote about a version of this problem where the girl is named Florida. In general, if we are given that a family has at least one girl with a particular property, and the prevalence of the property is p, we can use Bayes’s Theorem to compute the probability of two girls.

I’ll use SymPy to represent the priors and the probability p.

from sympy import Rational

prior = pd.Series(Rational(1, 4), kinds)
display(prior, 'prior')

	prior
Boy Boy	1/4
Boy Girl	1/4
Girl Boy	1/4
Girl Girl	1/4

Here are the likelihoods in terms of p.

from sympy import symbols

p = symbols('p')

likelihood = [0, p, p, 1 - (1-p)**2]
likelihood

[0, p, p, 1 - (1 - p)**2]

And here are the posteriors.

posterior = prior * likelihood
posterior /= posterior.sum()

for kind, prob in posterior.items():
    print(kind, prob.simplify())

Boy Boy 0
Boy Girl -1/(p - 4)
Girl Boy -1/(p - 4)
Girl Girl (p - 2)/(p - 4)

So the general answer is (p-2) / (p-4).

If we plug in p = 1/7, we get 13/27 again.

prob = posterior['Girl Girl'].subs({p: Rational(1, 7)})
prob

Or for the girl named Florida, let’s assume one girl out of 1000 is named Florida.

prob = posterior['Girl Girl'].subs({p: Rational(1, 1000)})
prob
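As a cross-check that does not depend on SymPy, the closed form can be evaluated with exact fractions. This is a minimal sketch; `prob_two_girls` is a helper name I introduced for illustration.

```python
from fractions import Fraction

def prob_two_girls(p):
    """Posterior probability of two girls, given at least one girl with a
    property of prevalence p, using the closed form (p - 2) / (p - 4)."""
    p = Fraction(p)
    return (p - 2) / (p - 4)

print(prob_two_girls(Fraction(1, 7)))      # 13/27
print(prob_two_girls(Fraction(1, 1000)))   # 1999/3999
```

The second value is just slightly less than 1/2.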

The following figure shows the probability of two girls as a function of the prevalence of the property.

import numpy as np
import matplotlib.pyplot as plt

xs = np.linspace(0, 1)
ys = (xs-2) / (xs-4)

plt.plot(xs, ys)
plt.xlabel('Prevalence of the property')
plt.ylabel('Conditional probability of two girls')

[Figure: conditional probability of two girls as a function of the prevalence of the property]

If the property is rare – like the name Florida – the conditional probability is close to 1/2. If the property is common – like having a name – the conditional probability is close to 1/3.

Objections

Here are some objections to the “girl born on Tuesday” problem along with my responses.

You have to model the message, not just the event

Objection.
The statement “at least one child is a girl born on Tuesday” should not be treated as a bare event in a probability space. It should be treated as the outcome of a random process that generates messages or facts we learn. Therefore, the probability space must include not only family composition, but also the mechanism by which that information is produced. Any solution that conditions only on the family outcomes is incomplete.

Response.
I agree that if the problem is interpreted as conditioning on a message (something that is said, reported, or chosen from among several true statements), then the reporting mechanism matters and must be modeled explicitly. However, I don’t think such a mechanism is required in all cases. It is standard and meaningful to interpret a question as conditioning on an event – an extensional property of outcomes – without introducing an additional random variable for how the information was obtained. That is the interpretation I adopt here.

Without a specified selection rule, symmetry forces the answer to 1/2

Objection.
If the problem does not specify how the information was obtained, then we must assume a symmetric rule for selecting which true statement is revealed. Under that assumption, conditioning on “at least one boy” or “at least one girl” must give the same answer, and applying the law of total probability forces the posterior probability to equal the prior. Therefore, the correct answer must be 1/2.

Response.
This conclusion follows only if we assume that the conditioning is on a message chosen from a symmetric set of alternatives. Under that interpretation, the result does depend on the selection rule, and 1/2 is a valid answer for one particular choice of rule. But if the conditioning is on an event rather than a message, there is no requirement that different events form a symmetric partition or that the law of total probability be applied across them in this way. Under the event-based interpretation, the argument forcing 1/2 does not apply.

The problem is ambiguous and therefore has no answer

Objection.
Because the problem does not specify how we learn that there is a girl born on Tuesday, it is fundamentally ambiguous. Since different interpretations lead to different answers, the question has no single correct solution.

Response.
It’s true that the problem is ambiguous as stated in natural language. One option is to declare it unanswerable. Another is to resolve the ambiguity by adopting a conventional default interpretation. I choose the latter: I interpret the question as a conditional probability defined on an explicit probability model and make that interpretation clear by enumerating the sample space. Under that interpretation, the answer is unambiguous and, in my view, interesting and instructive – even if other interpretations lead to different answers.

You are changing the sampling procedure

Objection.
Some people object that the 13/27 result comes from changing how families are selected. Conditioning on “at least one child is a girl born on Tuesday” oversamples families with more girls, so the conditional distribution no longer represents the original population of two-child families. From this perspective, the result feels like an artifact of biased sampling rather than a genuine probability update.

Response.
That description is accurate, but it is not a flaw. Conditioning is biased sampling: evidence changes the distribution of outcomes. Families with more girls really are more likely to satisfy the condition, and the conditional probability reflects that fact.

The day of the week seems irrelevant

Objection.
Tuesday has nothing to do with gender, so it feels wrong that adding this detail should change the probability. Since the day of the week does not cause a child to be a girl, it seems irrelevant to the question.

Response.
This objection reflects a common confusion between causal independence and evidential relevance. While the day of the week does not cause the other child’s gender, it provides evidence about the number of girls in the family. Evidence can change probabilities even when there is no causal connection.

The result depends on unrealistic independence assumptions

Objection.
The solution assumes that genders and days of the week are independent and uniformly distributed, which is not true in the real world. If those assumptions are relaxed, the answer changes.

Response.
That is correct, but those assumptions are not the source of the puzzle. Relaxing them changes the numerical value of the answer, but not the underlying logic. The same kind of reasoning applies under more realistic models.
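To illustrate, here is a sketch of the same Bayesian update with the probability of a girl left as a parameter. The function `posterior_two_girls` and the value 49/100 are hypothetical choices of mine, and the two children are still assumed to be independent.

```python
from fractions import Fraction

def posterior_two_girls(g, p):
    """P(two girls | at least one girl with the property), where g is the
    probability that a child is a girl and p is the prevalence of the
    property among girls. Children are still assumed independent."""
    prior = {'BB': (1 - g) ** 2, 'BG': (1 - g) * g, 'GB': g * (1 - g), 'GG': g * g}
    likelihood = {'BB': 0, 'BG': p, 'GB': p, 'GG': 1 - (1 - p) ** 2}
    unnorm = {kind: prior[kind] * likelihood[kind] for kind in prior}
    return unnorm['GG'] / sum(unnorm.values())

# With the original assumptions we recover the earlier answer.
print(posterior_two_girls(Fraction(1, 2), Fraction(1, 7)))            # 13/27

# With a hypothetical 49% chance of a girl, the number shifts but the logic does not.
print(float(posterior_two_girls(Fraction(49, 100), Fraction(1, 7))))
```

With the hypothetical 49% value, the answer moves from about 48% to about 47%, and the reasoning is unchanged.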

The problem is artificial or pathological

Objection.
Some readers reject the problem not because the calculation is wrong, but because the setup feels artificial or unlike how information is learned in real life. From this view, the problem is a trick rather than a meaningful probability question.

Response.
Whether this is a flaw or a feature depends on the goal. The problem is artificial, but it is intended to expose how unreliable our intuitions about conditional probability and independence can be. In that sense, its artificiality is what makes it pedagogically useful. The underlying issue – determining how evidence bears on hypotheses – comes up in real-world problems all the time. And getting it wrong has real-world consequences.

Bayesian Dice

August 9, 2021 AllenDowney

This article is available in a Jupyter notebook: click here to run it on Colab.

I’ve been enjoying Aubrey Clayton’s new book Bernoulli’s Fallacy. The first chapter, which is about the historical development of competing definitions of probability, is worth the price of admission alone.

One of the examples in Chapter 1 is a simplified version of a problem posed by Thomas Bayes. The original version, which I wrote about here, involves a billiards (pool) table; Clayton’s version uses dice:

Your friend rolls a six-sided die and secretly records the outcome; this number becomes the target T. You then put on a blindfold and roll the same six-sided die over and over. You’re unable to see how it lands, so each time your friend […] tells you only whether the number you just rolled was greater than, equal to, or less than T.
Suppose in one round of the game we had this sequence of outcomes, with G representing a greater roll, L a lesser roll, and E an equal roll:
G, G, L, E, L, L, L, E, G, L
Clayton, Bernoulli’s Fallacy, pg 36.

Based on this data, what is the posterior distribution of T?

Computing likelihoods

There are two parts of my solution: computing the likelihood of the data under each hypothesis, and then using those likelihoods to compute the posterior distribution of T.

To compute the likelihoods, I’ll demonstrate one of my favorite idioms, using a meshgrid to apply an operation, like >, to all pairs of values from two sequences.

In this case, the sequences are

hypos: The hypothetical values of T, and
outcomes: The possible outcomes each time we roll the die.

hypos = [1,2,3,4,5,6]
outcomes = [1,2,3,4,5,6]

If we compute a meshgrid of outcomes and hypos, the result is two arrays.

import numpy as np

O, H = np.meshgrid(outcomes, hypos)

The first contains the possible outcomes repeated down the columns.

array([[1, 2, 3, 4, 5, 6],
       [1, 2, 3, 4, 5, 6],
       [1, 2, 3, 4, 5, 6],
       [1, 2, 3, 4, 5, 6],
       [1, 2, 3, 4, 5, 6],
       [1, 2, 3, 4, 5, 6]])

The second contains the hypotheses repeated across the rows.

array([[1, 1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3, 3],
       [4, 4, 4, 4, 4, 4],
       [5, 5, 5, 5, 5, 5],
       [6, 6, 6, 6, 6, 6]])

If we apply an operator like >, the result is a Boolean array.

O > H

array([[False,  True,  True,  True,  True,  True],
       [False, False,  True,  True,  True,  True],
       [False, False, False,  True,  True,  True],
       [False, False, False, False,  True,  True],
       [False, False, False, False, False,  True],
       [False, False, False, False, False, False]])

Now we can use mean with axis=1 to compute the fraction of True values in each row.

(O > H).mean(axis=1)

array([0.83333333, 0.66666667, 0.5       , 0.33333333, 0.16666667,
       0.        ])

The result is the probability that the outcome is greater than T, for each hypothetical value of T. I’ll name this array gt:

gt = (O > H).mean(axis=1)

The first element of the array is 5/6, which indicates that if T is 1, the probability of exceeding it is 5/6. The second element is 2/3, which indicates that if T is 2, the probability of exceeding it is 2/3. And so on.

Now we can compute the corresponding arrays for less than and equal.

lt = (O < H).mean(axis=1)

array([0.        , 0.16666667, 0.33333333, 0.5       , 0.66666667,
       0.83333333])

eq = (O == H).mean(axis=1)

array([0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667,
       0.16666667])

In the next section, we’ll use these arrays to do a Bayesian update.

The Update

In this example, computing the likelihoods was the hard part. The Bayesian update is easy. Since T was chosen by rolling a fair die, the prior distribution for T is uniform. I’ll use a Pandas Series to represent it.

import pandas as pd

pmf = pd.Series(1/6, hypos)
pmf

1    0.166667
2    0.166667
3    0.166667
4    0.166667
5    0.166667
6    0.166667

Now here’s the sequence of data, encoded using the likelihoods we computed in the previous section.

data = [gt, gt, lt, eq, lt, lt, lt, eq, gt, lt]

The following loop updates the prior distribution by multiplying by each of the likelihoods.

for datum in data:
    pmf *= datum

Finally, we normalize the posterior.

pmf /= pmf.sum()

1    0.000000
2    0.016427
3    0.221766
4    0.498973
5    0.262834
6    0.000000

Here’s what it looks like.

As an aside, you might have noticed that the values in eq are all the same. So when the value we roll is equal to T, we don’t get any new information about T. We could leave the instances of eq out of the data, and we would get the same answer.
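That claim is easy to check. The following sketch rebuilds the likelihoods and runs the update twice, once with the full sequence and once with the E outcomes removed; the function `posterior` and the single-letter encoding of the data are my own conveniences, not part of the original notebook.

```python
import numpy as np

hypos = np.arange(1, 7)
outcomes = np.arange(1, 7)
O, H = np.meshgrid(outcomes, hypos)

# Likelihood of each kind of outcome under each hypothetical value of T.
likelihoods = {
    'G': (O > H).mean(axis=1),
    'L': (O < H).mean(axis=1),
    'E': (O == H).mean(axis=1),
}

def posterior(sequence):
    """Posterior over T = 1..6 from a uniform prior and a sequence of G/L/E."""
    pmf = np.full(len(hypos), 1 / 6)
    for symbol in sequence:
        pmf = pmf * likelihoods[symbol]
    return pmf / pmf.sum()

full = posterior('GGLELLLEGL')       # the sequence from the book
without_eq = posterior('GGLLLLGL')   # the same sequence with the E outcomes dropped
print(np.allclose(full, without_eq))   # True
```

Because the likelihood of E is the same under every hypothesis, it multiplies every element by the same constant, which normalization removes.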

The Left-Handed Sister Problem

July 23, 2021 AllenDowney

Suppose you meet someone who looks like the brother of your friend Mary. You ask if he has a sister named Mary, and he says “Yes I do, but I don’t think I know you.”

You remember that Mary has a sister who is left-handed, but you don’t remember her name. So you ask your new friend if he has another sister who is left-handed.

If he does, how much evidence does that provide that he is the brother of your friend, rather than a random person who coincidentally has a sister named Mary and another sister who is left-handed? In other words, what is the Bayes factor of the left-handed sister?

Let’s assume:

Out of 100 families with children, 20 have one child, 30 have two children, 40 have three children, and 10 have four children.
All children are either boys or girls with equal probability, one girl in 10 is left-handed, and one girl in 100 is named Mary.
Name, sex, and handedness are independent, so every child has the same probability of being a girl, left-handed, or named Mary.
If the person you met had more than one sister named Mary, he would have said so, but he could have more than one sister who is left-handed.

I’ll post a solution only when someone replies to this tweet with a correct answer!

UPDATE: Here it is.

If you like this sort of thing, you might like the new second edition of Think Bayes.
