My very busy week
I’m not sure who scheduled ODSC and PyConUS during the same week, but I am unhappy with their decisions. Last Tuesday I presented a talk and co-presented a workshop at ODSC, and on Thursday I presented a tutorial at PyCon.
If you would like to follow along with my very busy week, here are the resources:
Practical Bayesian Modeling with PyMC
Co-presented with Alex Fengler for ODSC East 2025
In this tutorial, we explore Bayesian regression using PyMC – the primary library for Bayesian sampling in Python – focusing on survey data and other datasets with categorical outcomes. Starting with logistic regression, we’ll build up to categorical and ordered logistic regression, showcasing how Bayesian approaches provide versatile tools for developing and evaluating complex models. Participants will leave with practical skills for implementing Bayesian regression models in PyMC, along with a deeper appreciation for the power of Bayesian inference in real-world data analysis. Participants should be familiar with Python, the SciPy ecosystem, and basic statistics, but no experience with Bayesian methods is required.
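As a rough illustration of what ordered logistic regression computes, here is a minimal pure-Python sketch of the standard cumulative-logit likelihood. It is not code from the tutorial and it does not use PyMC; the function names `logistic` and `ordered_logit_probs` are hypothetical. The assumption is the usual parameterization in which P(Y <= k) is the logistic function of (cutpoint k minus the linear predictor), which I believe matches the convention of PyMC's `OrderedLogistic`, though that is worth checking against the PyMC documentation.

```python
import math


def logistic(x):
    """Standard logistic (inverse logit) function."""
    return 1.0 / (1.0 + math.exp(-x))


def ordered_logit_probs(eta, cutpoints):
    """Return P(Y = k) for each of the len(cutpoints) + 1 ordered categories.

    eta is the linear predictor (for example, intercept + slope * x) and
    cutpoints is an increasing list of thresholds on the same scale.
    """
    # Cumulative probabilities P(Y <= k); the last category always reaches 1.
    cumulative = [logistic(c - eta) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for cum in cumulative:
        # Each category's probability is the difference of adjacent cumulatives.
        probs.append(cum - previous)
        previous = cum
    return probs


# Three ordered categories (two cutpoints) for a respondent with eta = 0.
probs = ordered_logit_probs(0.0, [-1.0, 1.0])
print(probs)
```

With a single cutpoint this reduces to ordinary logistic regression, which is why the tutorial can build from one model to the other. In a Bayesian version, `eta` and the cutpoints would get priors and be estimated by sampling rather than fixed by hand.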
The repository for this tutorial is here; it includes notebooks where you can run the examples, and there’s a link to the slides.
And then later that day I presented…
Mastering Time Series Analysis with StatsModels: From Decomposition to ARIMA
Time series analysis provides essential tools for modeling and predicting time-dependent data, especially data exhibiting seasonal patterns or serial correlation. This tutorial covers tools in the StatsModels library including seasonal decomposition and ARIMA. As examples, we’ll look at weather data and electricity generation from renewable sources in the United States since 2004, but the methods we’ll cover apply to many kinds of real-world time series data.
Outline:
- Introduction to time series
- Overview of the data
- Seasonal decomposition, additive model
- Seasonal decomposition, multiplicative model
- Serial correlation and autoregression
- ARIMA
- Seasonal ARIMA
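To make the additive model concrete, here is a minimal pure-Python sketch of classical seasonal decomposition: estimate the trend with a centered moving average, average the detrended values by position in the cycle to get the seasonal component, and treat what is left as the residual. This is not the code from the talk, and the function names are hypothetical. StatsModels provides `seasonal_decompose` for real work; this sketch only shows the idea on a made-up series.

```python
def centered_moving_average(series, period):
    """Estimate the trend; entries are None where the window does not fit."""
    n = len(series)
    half = period // 2
    trend = [None] * n
    for i in range(half, n - half):
        window = series[i - half : i + half + 1]
        if period % 2 == 0:
            # Even period: give the two end points half weight so the
            # window stays centered and the weights sum to `period`.
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[i] = total / period
    return trend


def additive_decompose(series, period):
    """Split series into trend + seasonal + residual (additive model)."""
    trend = centered_moving_average(series, period)
    detrended = [y - t if t is not None else None for y, t in zip(series, trend)]

    # Average the detrended values at each position in the seasonal cycle.
    seasonal_means = []
    for k in range(period):
        values = [d for i, d in enumerate(detrended) if d is not None and i % period == k]
        seasonal_means.append(sum(values) / len(values))

    # Center the seasonal component so it averages to zero over one cycle.
    offset = sum(seasonal_means) / period
    seasonal_means = [s - offset for s in seasonal_means]

    seasonal = [seasonal_means[i % period] for i in range(len(series))]
    resid = [
        y - t - s if t is not None else None
        for y, t, s in zip(series, trend, seasonal)
    ]
    return trend, seasonal, resid


# Made-up quarterly series: a linear trend plus a repeating pattern.
pattern = [2, -1, -3, 2]
series = [i + pattern[i % 4] for i in range(16)]
trend, seasonal, resid = additive_decompose(series, period=4)
print(seasonal[:4])
```

The multiplicative model follows the same steps with division in place of subtraction, which suits data where the seasonal swings grow with the level of the series.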
This talk is based on Chapter 12 of the new edition of Think Stats. Here are the slides.
Unfortunately there’s no video from the talk, but I presented related material in this workshop for PyData Global 2024:
After the talk, Seamus McGovern presented me with an award for being, apparently, the most frequent ODSC speaker!
On Wednesday I flew to Pittsburgh, and on Thursday I presented…
Analyzing Survey Data with Pandas and StatsModels
PyConUS 2025 tutorial
Whether you are working with customer data or tracking election polls, Pandas and StatsModels provide powerful tools for getting insights from survey data. In this tutorial, we’ll start with the basics and work up to age-period-cohort analysis and logistic regression. As examples, we’ll use data from the General Social Survey to see how political beliefs have changed over the last 50 years in the United States. We’ll follow the essential steps of a data science project, from loading and validating data, through exploring, visualizing, modeling, and predicting, to communicating results.
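Age-period-cohort analysis rests on a simple identity: a respondent's birth year is the survey year minus their age, so the same answers can be grouped by when people were born rather than by when they were asked. Here is a minimal pure-Python sketch of that grouping. The records are toy values made up for illustration, not General Social Survey data, and the function names are hypothetical; the tutorial itself uses Pandas for this kind of work.

```python
from collections import defaultdict

# Toy records, made up for illustration:
# (survey year, age, response), where 1 = agrees and 0 = does not.
responses = [
    (1980, 30, 0),
    (1980, 60, 0),
    (2000, 50, 1),
    (2000, 25, 1),
    (2020, 45, 1),
    (2020, 70, 0),
]


def cohort_decade(year, age):
    """Return the decade of birth, using cohort = period - age."""
    birth_year = year - age
    return birth_year // 10 * 10


def fraction_by_cohort(records):
    """Fraction of respondents who agree, grouped by decade of birth."""
    counts = defaultdict(lambda: [0, 0])  # decade -> [agree count, total]
    for year, age, response in records:
        decade = cohort_decade(year, age)
        counts[decade][0] += response
        counts[decade][1] += 1
    return {decade: agree / total for decade, (agree, total) in sorted(counts.items())}


print(fraction_by_cohort(responses))
```

Because cohort is fully determined by period and age, the three effects cannot be separated without extra assumptions, which is part of what makes this kind of analysis interesting.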
Here’s the repository with the notebooks and a link to the slides.
Sadly, the tutorial was not recorded.
Now that I have a moment of calm, I’m getting back to Think Linear Algebra. More about that soon!