SAT math scores: gender difference or selection bias?
The video from my PyData Boston talk is up now:
Resources
- The slides are here
- Run the first notebook (Poincaré problem) on Colab
- Run the second notebook (analysis of SAT data) on Colab
If you want to learn to do this kind of analysis, you can sign up for the January 2026 offering of the Applied Bayesian Modeling Workshop, which I teach along with my colleagues at PyMC Labs.
And as always, you can read Think Bayes in hard copy or free online.
Abstract
Why do male test takers consistently score about 30 points higher than female test takers on the mathematics section of the SAT? Does this reflect an actual difference in math ability, or is it an artifact of selection bias that arises if young men with low math ability are less likely to take the test than young women with the same ability?
This talk presents a Bayesian model that estimates how much of the observed difference can be explained by selection effects. We'll walk through a complete Bayesian workflow, including prior elicitation with PreliZ, model building in PyMC, and validation with ArviZ, showing how Bayesian methods infer latent traits from observed outcomes and separate the signal from the noise.
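To see why selection effects matter, here is a minimal PyMC sketch (not the model from the talk, and not the SAT data) that simulates two groups with identical latent ability, applies stronger selection to one of them, and fits a naive model to the observed scores only. The group labels, selection thresholds, and variable names are illustrative assumptions.

```python
import numpy as np
import pymc as pm
import arviz as az

rng = np.random.default_rng(17)

# Simulate latent math ability for two groups with *identical* distributions
n = 2000
ability = rng.normal(500, 100, size=(2, n))

# Selection: the probability of taking the test rises with ability,
# but group 1 faces a higher threshold (stronger selection)
thresholds = [400, 450]
scores = []
for g in range(2):
    p_take = 1 / (1 + np.exp(-(ability[g] - thresholds[g]) / 50))
    scores.append(ability[g][rng.random(n) < p_take])

# Naive model: fit group means to the observed (selected) scores only
with pm.Model():
    mu = pm.Normal("mu", mu=500, sigma=100, shape=2)
    sigma = pm.HalfNormal("sigma", sigma=100)
    for g in range(2):
        pm.Normal(f"score_{g}", mu=mu[g], sigma=sigma, observed=scores[g])
    idata = pm.sample(random_seed=17)

# The posterior means of mu differ even though the latent means are equal:
# in this simulation, the gap in observed scores is purely a selection artifact
print(az.summary(idata, var_names=["mu"]))
```

This sketch only shows how a naive comparison of observed scores can be misleading; the model in the talk goes further and estimates how much of the real SAT gap selection effects can account for.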