Lions and tigers and bears

Here’s another Bayes puzzle:

Suppose we visit a wild animal preserve where we know that the only animals are lions and tigers and bears, but we don’t know how many of each there are.

During the tour, we see 3 lions, 2 tigers, and one bear. Assuming that every animal had an equal chance to appear in our sample, estimate the prevalence of each species.

What is the probability that the next animal we see is a bear?
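One way to get a quick answer is a minimal sketch that assumes a uniform Dirichlet prior over the three prevalences (one pseudo-count per species). That prior is an assumption of this sketch, not part of the problem statement, and the function name `dirichlet_posterior_mean` is made up for illustration. Under that prior, the posterior is Dirichlet with parameters equal to the observed counts plus one, and the probability that the next animal is a given species is its posterior mean.

```python
from fractions import Fraction


def dirichlet_posterior_mean(counts, prior_alpha=1):
    """Posterior mean prevalence under a symmetric Dirichlet prior.

    Assumption: every species starts with the same pseudo-count `prior_alpha`.
    """
    alphas = {species: prior_alpha + n for species, n in counts.items()}
    total = sum(alphas.values())
    return {species: Fraction(a, total) for species, a in alphas.items()}


observed = {"lion": 3, "tiger": 2, "bear": 1}
posterior_mean = dirichlet_posterior_mean(observed)

# Probability that the next animal is a bear, under the assumed prior
p_bear = posterior_mean["bear"]
print(p_bear)  # 2/9
```

A different prior gives a different answer, which is why the more general solutions linked below treat the prior parameters as uncertain.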

Solutions

Will Koehrsen posted an excellent solution here.  His solution is more general than mine, allowing for uncertainty about the parameters of the Dirichlet prior.

And vlad posted another good solution using WebPPL.

Cats and rats and elephants

Now that we have solved the appetizer, we are ready for the main course…

Suppose there are six species that might be in a zoo: lions and tigers and bears, and cats and rats and elephants.  Every zoo has a subset of these species, and every subset is equally likely.

One day we visit a zoo and see 3 lions, 2 tigers, and one bear.  Assuming that every animal in the zoo has an equal chance to be seen, what is the probability that the next animal we see is an elephant?
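Here is one possible way to set this up, as a sketch rather than a definitive solution. It assumes that, within any given subset of species, the prevalences have a uniform Dirichlet prior; that modeling choice and the helper names `sequence_likelihood` and `prob_next` are assumptions of this sketch. For each subset, we compute the probability of the observed sequence (a Dirichlet-multinomial likelihood), weight the subsets by that likelihood, and average the predictive probability of the target species.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

SPECIES = ["lion", "tiger", "bear", "cat", "rat", "elephant"]
observed = {"lion": 3, "tiger": 2, "bear": 1}


def sequence_likelihood(subset, counts):
    """Probability of the observed sequence if exactly `subset` is in the zoo.

    Assumption: uniform Dirichlet prior on prevalences within the subset.
    """
    if any(species not in subset for species in counts):
        return Fraction(0)  # we saw a species this zoo does not have
    k = len(subset)
    n = sum(counts.values())
    numer = factorial(k - 1)
    for c in counts.values():
        numer *= factorial(c)
    return Fraction(numer, factorial(k + n - 1))


def prob_next(target, counts):
    """Posterior predictive probability that the next animal is `target`."""
    n = sum(counts.values())
    numer = Fraction(0)
    denom = Fraction(0)
    for size in range(len(SPECIES) + 1):
        for subset in combinations(SPECIES, size):
            # Every subset has the same prior probability, so it cancels.
            weight = sequence_likelihood(subset, counts)
            denom += weight
            if target in subset:
                numer += weight * Fraction(1 + counts.get(target, 0), size + n)
    return numer / denom


p_elephant = prob_next("elephant", observed)
print(p_elephant, float(p_elephant))  # 31/1218, about 0.025
```

Smaller subsets explain the data better, so the zoo with only lions, tigers, and bears gets the most weight, and the chance of an elephant comes out small.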

The Game of Ur problem

Here’s a probability puzzle to ruin your week.

In the Royal Game of Ur, players advance tokens along a track with 14 spaces. To determine how many spaces to advance, a player rolls 4 dice with 4 sides. Two corners on each die are marked; the other two are not. The total number of marked corners — which is 0, 1, 2, 3, or 4 — is the number of spaces to advance.

For example, if the total on your first roll is 2, you could advance a token to space 2. If you roll a 3 on the next roll, you could advance the same token to space 5.

Suppose you have a token on space 13. How many rolls did it take to get there?

Hint: you might want to start by computing the distribution of k given n, where k is the number of the space and n is the number of rolls.  Then think about the prior distribution of n.
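As a rough sketch of the hint: each die shows a marked corner with probability 1/2, so one roll is Binomial(4, 1/2) and the total after n rolls is Binomial(4n, 1/2). The code below assumes that every roll is applied to the same token and that the prior on n is uniform up to a cutoff. Both are assumptions made for illustration, and the choice of prior is exactly the part that deserves more thought. The names `prob_space_given_rolls` and `posterior_rolls` are hypothetical.

```python
from math import comb


def prob_space_given_rolls(k, n):
    """P(token is on space k after n rolls), with the total ~ Binomial(4n, 1/2)."""
    return comb(4 * n, k) / 2 ** (4 * n)


def posterior_rolls(k, max_rolls=50):
    """Posterior over the number of rolls n, given the token is on space k.

    Assumption: uniform prior over n = 1..max_rolls.
    """
    likelihood = {n: prob_space_given_rolls(k, n) for n in range(1, max_rolls + 1)}
    total = sum(likelihood.values())
    return {n: like / total for n, like in likelihood.items()}


posterior = posterior_rolls(13)
most_likely = max(posterior, key=posterior.get)

for n in range(4, 12):
    print(n, round(posterior[n], 3))
```

With fewer than 4 rolls the token cannot reach space 13, so those values of n get zero probability.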

I’ll post a solution later this week, but I have to confess: I believe my solution is correct, but there is still part of it I am not satisfied with.

[UPDATE November 1, 2018]

Here’s the thread on Twitter where a few people discuss this problem.

And here’s my solution.  As you will see there are still some unresolved questions.

Here’s another solution from Austin Rochford, which estimates the posterior distribution by simulation.

And here’s a solution from vlad, also based on simulation, using WebPPL:

How tall is A?

Here is a series of problems I posed in my Bayesian statistics class:

1) Suppose you meet an adult resident of the U.S. who is 170 cm tall. What is the probability that they are male?

2) Suppose I choose two U.S. residents at random and A is taller than B.  How tall is A?

3) In a room of 10 randomly chosen U.S. residents, A is the second tallest.  How tall is A?  And what is the probability that A is male?

As background: For adult male residents of the U.S., the mean and standard deviation of height are 178 cm and 7.7 cm. For adult female residents the corresponding stats are 163 cm and 7.3 cm.  And 51% of the adult population is female.
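For the first problem, a minimal sketch is a direct application of Bayes's theorem. It assumes that heights within each group are normally distributed with the parameters above; the normal model and the function names are assumptions of this sketch.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))


PRIOR = {"male": 0.49, "female": 0.51}
PARAMS = {"male": (178, 7.7), "female": (163, 7.3)}  # (mean, std) in cm


def prob_male_given_height(height_cm):
    """Posterior probability of male, given an observed height."""
    unnorm = {
        sex: PRIOR[sex] * normal_pdf(height_cm, *PARAMS[sex]) for sex in PRIOR
    }
    return unnorm["male"] / sum(unnorm.values())


p_male = prob_male_given_height(170)
print(round(p_male, 2))  # roughly 0.46
```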

If you solve the problems in order, you can reuse code from the first two to solve the third.
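For the second problem, here is a sketch of a grid approach in plain Python. It assumes heights in each group are normal, so the population is a two-component mixture; the likelihood that A is taller than B, given A's height, is then the mixture CDF evaluated at that height. The grid limits and step are arbitrary choices made for this sketch.

```python
from math import erf, exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))


def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


def mixture_pdf(x):
    """Density of height for a randomly chosen adult (49% male, 51% female)."""
    return 0.49 * normal_pdf(x, 178, 7.7) + 0.51 * normal_pdf(x, 163, 7.3)


def mixture_cdf(x):
    """Probability that a randomly chosen adult is shorter than x."""
    return 0.49 * normal_cdf(x, 178, 7.7) + 0.51 * normal_cdf(x, 163, 7.3)


# Grid of candidate heights for A, from 120 cm to 220 cm in 0.1 cm steps
grid = [120 + 0.1 * i for i in range(1001)]

prior = [mixture_pdf(h) for h in grid]
# Likelihood of "A is taller than B" given A's height is P(B < h)
unnorm = [p * mixture_cdf(h) for h, p in zip(grid, prior)]

prior_mean = sum(h * p for h, p in zip(grid, prior)) / sum(prior)
posterior_mean = sum(h * u for h, u in zip(grid, unnorm)) / sum(unnorm)

print(round(prior_mean, 1), round(posterior_mean, 1))
```

The same pieces extend to the third problem: with 10 people, the likelihood that A is second tallest depends on the mixture CDF raised to a power, times the probability that exactly one person is taller.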

Here’s my solution, using a grid algorithm and the libraries from Think Bayes:

When I tweeted about this problem, I heard from Colin Carroll, who wrote a solution using PyMC:

And vlad posted this solution using WebPPL, a browser-based environment for probabilistic programming:

You can run that solution at WebPPL.

The Dungeons and Dragons problem

Last week I posed this problem in my Bayesian Statistics class:

Suppose there are 10 people in my Dungeons and Dragons club; on any game day, each of them has a 70% chance of showing up.

Each player has one character and each character has 6 attributes, each of which is generated by rolling three 6-sided dice and adding them up.

At the beginning of the game, I ask whose character has the lowest attribute. The wizard says, “My constitution is 5; does anyone have a lower attribute?”, and no one does.

The warrior says “My strength is 16; does anyone have a higher attribute?”, and no one does.

How many characters are in the party?
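One way to approach it, sketched under some stated assumptions: the prior on the number of characters n is Binomial(10, 0.7); the wizard and the warrior are different characters, so n is at least 2; and the data are summarized as "the lowest of the 6n attributes is 5 and the highest is 16". The likelihood of that summary follows from inclusion-exclusion on the range of the rolls. The function names are made up for this sketch.

```python
from itertools import product
from math import comb

# Distribution of the sum of three 6-sided dice, by enumeration
counts = {}
for dice in product(range(1, 7), repeat=3):
    counts[sum(dice)] = counts.get(sum(dice), 0) + 1
pmf_3d6 = {total: c / 216 for total, c in counts.items()}


def prob_between(lo, hi):
    """P(lo <= attribute <= hi) for a single 3d6 roll."""
    return sum(p for total, p in pmf_3d6.items() if lo <= total <= hi)


def likelihood(n_chars, low=5, high=16):
    """P(min attribute == low and max attribute == high) among 6 * n_chars rolls."""
    m = 6 * n_chars
    return (
        prob_between(low, high) ** m
        - prob_between(low + 1, high) ** m
        - prob_between(low, high - 1) ** m
        + prob_between(low + 1, high - 1) ** m
    )


def posterior_party_size(n_players=10, p_show=0.7):
    """Posterior over the number of characters; assumes at least 2 are present."""
    unnorm = {}
    for n in range(2, n_players + 1):
        prior = comb(n_players, n) * p_show**n * (1 - p_show) ** (n_players - n)
        unnorm[n] = prior * likelihood(n)
    total = sum(unnorm.values())
    return {n: u / total for n, u in unnorm.items()}


posterior = posterior_party_size()
posterior_mean = sum(n * p for n, p in posterior.items())

for n, p in posterior.items():
    print(n, round(p, 3))
```

In this sketch the data shift the distribution only slightly: a larger party makes it a little less likely that nobody rolled below 5 or above 16.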

My solution is in this Jupyter notebook:

The six R’s of debugging

In Modeling and Simulation yesterday I presented my six R’s of debugging:

Read: You have to read the code, and read what it really says, not what you think it says.  You have to read the documentation, read the error message, and read the Stack Overflow page that comes up when you Google the error message.

But sometimes the bug is in your head.  If the problem is your misunderstanding, you won’t find it by staring at the code…

Run: You also have to run the code, make some changes, and run the code again.  Sometimes when you are cleaning up the code, you make a change that should have no effect, but it does.  That gives you a hint.

But don’t just make random changes…

Ruminate: Take time to think!  What have you changed since the last time you had a working program?  What is the program doing wrong, and what kind of error could cause it?  What are you assuming that might be wrong?

Question everything, but don’t just sit in silence…

Rubber duck: You have to talk about it.  Find someone who’s willing to listen, and explain the problem to them.  You might figure it out before they have a chance to say a word.  In that case you don’t even need a person.  A rubber duck will do.

Be persistent, but not too persistent…

Rest: If you’ve been at it a while, take a break.  Get away from the computer, do something else, and wait for your blood pressure to come down.  Some of the best places to find bugs are trains, showers, and bed, just before you fall asleep.

Finally, if you are pretty much stuck, you might have to be strategic…

Retreat: Get back to a previous working version and start building up the code.  Take smaller steps this time, or take different steps.  Spend some time building code that helps you debug, like functions that visualize your data structures.

I hope these suggestions are helpful.  There are six things to try, so if you are stuck on one, try another.  Debugging can be frustrating, but it is one of the most useful skills you can develop, and it applies to almost every domain, not just software development.

If you are good at debugging, you can do anything.

New home

Welcome to the new home of Probably Overthinking It, where I write about data science, Bayesian statistics, and occasional other topics.  If you want to read any of my older articles, they are still available at the original location.

Since I am teaching my Bayesian statistics class this semester, I will post some of my examples here soon, and some student projects, too.

One of the reasons I am moving the blog is to get better support for Jupyter notebooks.  Here’s an example from a recent article.  Let’s see how it looks.

Not bad.