Elements of Data Science

I’m excited to announce the launch of my newest book, Elements of Data Science. As the subtitle suggests, it is about “Getting started with Data Science and Python”.

Order now from Lulu.com and get 20% off!

I am publishing this book myself, which has one big advantage: I can print it with a full color interior without increasing the cover price. In my opinion, the code is more readable with syntax highlighting, and the data visualizations look great!

In addition to the printed edition, all chapters are available to read online as Jupyter notebooks, where you can read the text, run the code, and work on the exercises.

Description

Elements of Data Science is an introduction to data science for people with no programming experience. My goal is to present a small, powerful subset of Python that allows you to do real work with data as quickly as possible.

Part 1 includes six chapters that introduce basic Python with a focus on working with data.

Part 2 presents exploratory data analysis using Pandas and empiricaldist — it includes a revised and updated version of the material from my popular DataCamp course, “Exploratory Data Analysis in Python.”

Part 3 takes a computational approach to statistical inference, introducing resampling methods, bootstrapping, and randomization tests.

Part 4 is the first of two case studies. It uses data from the General Social Survey to explore changes in political beliefs and attitudes in the U.S. in the last 50 years. The data points on the cover are from one of the graphs in this section.

Part 5 is the second case study, which introduces classification algorithms and the metrics used to evaluate them — and discusses the challenges of algorithmic decision-making in the context of criminal justice.

This project started in 2019, when I collaborated with a group at Harvard to create a data science class for people with no programming experience. We discussed some of the design decisions that went into the course and the book in this article.

Density and Likelihood: What’s the Difference?

It’s another installment in Data Q&A: Answering the real questions with Python. Previous installments are available from the Data Q&A landing page.

If you get this post by email, the formatting might be broken — if so, you might want to read it on the site.

PMFs and PDFs

It’s another installment in Data Q&A: Answering the real questions with Python. Previous installments are available from the Data Q&A landing page.

If you get this post by email, the formatting might be broken; if so, you might want to read it on the site.

Regrets and Regression

It’s another installment in Data Q&A: Answering the real questions with Python. Previous installments are available from the Data Q&A landing page.
