The Elvis problem revisited

Here’s a problem from Bayesian Data Analysis:

Elvis Presley had a twin brother (who died at birth). What is the probability that Elvis was an identical twin?

I will answer this question in three steps:

  • First, we need some background information about the relative frequencies of identical and fraternal twins.
  • Then we will use Bayes’s Theorem to take into account one piece of data, which is that Elvis’s twin was male.
  • Finally, living up to the name of this blog, I will overthink the problem by taking into account a second piece of data, which is that Elvis’s twin died at birth.

For background information, I’ll use data from 1935, the year Elvis was born, from the U.S. Census Bureau, Birth, Stillbirth, and Infant Mortality Statistics for the Continental United States, the Territory of Hawaii, the Virgin Islands 1935.

It includes this table:

With a few reasonable assumptions, we can use this data to compute the probability that Elvis was an identical twin, given that his twin brother died at birth.

You can see my solution in this Jupyter notebook.
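As a rough sketch of the first two steps only (this is not the notebook's analysis), the Bayesian update can be written in a few lines of Python. The twin rates are the approximate figures quoted in the textbook version of the problem, about 1/300 of births for identical twins and 1/125 for fraternal twins, not values from the 1935 table. The 50% chance that a fraternal twin is male is also an assumption, and the function name is made up for illustration.

```python
from fractions import Fraction

def posterior_identical(prior_identical, p_male_given_fraternal=Fraction(1, 2)):
    """Posterior probability of identical twins, given that the twin is male."""
    prior_fraternal = 1 - prior_identical
    # An identical twin of a boy is always a boy, so that likelihood is 1.
    numerator = prior_identical * 1
    denominator = numerator + prior_fraternal * p_male_given_fraternal
    return numerator / denominator

# Assumed rates from the textbook exercise, not from the 1935 data
rate_identical = Fraction(1, 300)
rate_fraternal = Fraction(1, 125)

# Prior: among twin births, the fraction that are identical
prior = rate_identical / (rate_identical + rate_fraternal)
posterior = posterior_identical(prior)

print(prior, posterior)  # 5/17 5/11
```

With these assumed rates, knowing that the twin was a brother raises the probability from 5/17 to 5/11. Using the 1935 data would change the prior, and the third step (the twin died at birth) is not included here.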

Among U.S. college students, religious attendance is at an all-time low

In the last 30 years, college students have become much less religious. The fraction who say they have no religious affiliation tripled, from about 10% to 30%. And the fraction who say they have attended a religious service in the last year fell from more than 85% to less than 70%.

I’ve been following this trend for a while, using data from the CIRP Freshman Survey, which has surveyed a large sample of entering college students since 1966.

The most recently published data is from “97,753 first-time, full-time students who entered 147 U.S. colleges and universities of varying selectivity and type in the fall of 2018.”

Of course, college students are not a representative sample of the U.S. population. And as rates of college attendance have increased, they represent a different slice of the population over time. Nevertheless, surveying young adults over a long interval provides an early view of trends in the general population.

Religious preference

Among other questions, the Freshman Survey asks students to select their “current religious preference” from a list of seventeen common religions, “Other religion,” “Atheist”, “Agnostic”, or “None.”  

The options “Atheist” and “Agnostic” were added in 2015.  For consistency over time, I compare the “Nones” from previous years with the sum of “None”, “Atheist” and “Agnostic” since 2015.

The following figure shows the fraction of Nones from 1969, when the question was added, to 2018, the most recent data available.

Percentage of students with no religious preference from 1969 to 2018.

The blue line shows data until 2015; the orange line shows data from 2015 through 2018. The gray line shows a quadratic fit.  The light gray region shows a 90% predictive interval.
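For readers who want to try something similar, here is a minimal sketch of a quadratic fit with an approximate 90% band, using NumPy. It is not the code behind the figure: the data are synthetic, the function name is made up, and the band assumes roughly normal residuals and ignores uncertainty in the fitted coefficients, so it is simpler than a proper predictive interval.

```python
import numpy as np

def quadratic_fit_with_band(years, values, z=1.645):
    """Fit a quadratic and return (fitted, low, high) for an approximate 90% band."""
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    x = years - years.mean()                  # center to keep the fit well conditioned
    coeffs = np.polyfit(x, values, deg=2)
    fitted = np.polyval(coeffs, x)
    # Residual standard deviation, with 3 degrees of freedom used by the fit
    resid_std = np.std(values - fitted, ddof=3)
    return fitted, fitted - z * resid_std, fitted + z * resid_std

# Synthetic series for illustration only; these are not the survey results
rng = np.random.default_rng(0)
years = np.arange(1969, 2019)
values = 10 + 0.008 * (years - 1969) ** 2 + rng.normal(0, 1.5, size=len(years))

fitted, low, high = quadratic_fit_with_band(years, values)
inside = np.mean((values >= low) & (values <= high))
print(f"{inside:.0%} of the points fall inside the band")
```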

Since 2015, the total fraction of atheists, agnostics, and Nones has been essentially unchanged. The most recent data point is below the trend line, which suggests that the “rise of the Nones” may be slowing down.

Attendance

The survey also asks students how often they “attended a religious service” in the last year. The choices are “Frequently,” “Occasionally,” and “Not at all.” Respondents are instructed to select “Occasionally” if they attended one or more times, so a wedding or a funeral would do it.

The following figure shows the fraction of students who reported any religious attendance in the last year, starting in 1968. I discarded a data point from 1966 that seems unlikely to be correct (66%).

Percentage of students who reported attending a religious service in the previous year.

About 68% of incoming college students said they attended a religious service in the last year, an all-time low in the history of the survey, and down more than 20 percentage points from the peak.

In contrast with the fraction of Nones, this curve is on trend, with no sign of slowing down.

In previous years I have also reported on the gender gap in religious affiliation and attendance, but the data are not available yet. I will update when they are.

Data Source

The American Freshman: National Norms Fall 2018
Stolzenberg, Eagan, Romo, Tamargo, Aragon, Luedke, and Kang,
Higher Education Research Institute, UCLA, December 2019

This and all previous reports are available from the HERI publications page.

Young Christians are more sex-positive than the previous generation

This is the fifth and probably final in a series of articles where I use data from the General Social Survey (GSS) to explore

  • Differences in beliefs and attitudes between Christians and people with no religious affiliation (“Nones”),
  • Generational differences between younger and older Christians, and
  • Generational differences between younger and older Nones.

In the first article, I looked at changes in religious beliefs and found that younger Christians are more secular in many ways than the previous generation.

In the second article, I looked at views related to law and public policy and found that young Christians are more progressive on most issues than the previous generation.

In the third article, I found that generational differences on most questions related to abortion are small and probably not practically or statistically significant.

In the fourth article, I looked at responses to questions related to priorities and public spending. On many dimensions, younger Christians are moving toward the beliefs of their secular peers, but there are notable exceptions.

In this article, I use the same dataset to explore changes in attitudes related to sex. For details of the methodology, see the first article.

When is sex wrong?

GSS respondents were asked several questions related to their attitudes about sex:

There’s been a lot of discussion about the way morals and attitudes about sex are changing in this country.

  • If a man and woman have sex relations before marriage, do you think it is always wrong, almost always wrong, wrong only sometimes, or not wrong at all?
  • What if they are in their early teens, say 14 to 16 years old? In that case, do you think sex relations before marriage are always wrong, almost always wrong, wrong only sometimes, or not wrong at all?
  • What about sexual relations between two adults of the same sex–do you think it is always wrong, almost always wrong, wrong only sometimes, or not wrong at all?
  • What is your opinion about a married person having sexual relations with someone other than the marriage partner–is it always wrong, almost always wrong, wrong only sometimes, or not wrong at all?

For each of these questions, I count the fraction of respondents who reply “always wrong”.

And I looked at responses to one other sex-related question:

Would you be for or against sex education in the public schools?

Here are the results:

Generational changes in attitudes related to sex.

The blue markers are for people whose religious preference is Catholic, Protestant, or Christian; the orange markers are for people with no religious affiliation.

For each group, the circles show estimated percentages for people born in 1968; the arrowheads show percentages for people born in 1993.

For both groups, the estimates are for 2018, when the younger group was 25 and the older group was 50. The brackets show 90% confidence intervals.

In almost every scenario, young Christians are less likely than the previous generation to say that sex is “always wrong”, and in the cases of homosexual and teen sex, the changes are substantial.

Opposition to premarital sex was already low and did not change as much. Support for sex education was already high and is now an overwhelming majority.

The exception is extramarital sex, where there is practically no generational change: more than 80% of both generations think it is always wrong.

Compared to their Christian peers, the non-religious are more sex-positive by 15-30 percentage points. And their generational changes go in the same direction, with young Nones less likely to think sex in these scenarios is wrong.

But again, extramarital sex is the exception; among the Nones, the small generational change is within the margin of error.

This exception suggests that both groups distinguish between actions that harm people and transgressions of divine law.

Summary

In 2007, when I started writing about religious trends, I thought the increasing number of people with no religious affiliation was hugely underreported. Now, the “rise of the Nones” is well known.

Then, for a while, the story was that people were leaving organized religion, but they were still religious or at least spiritual; that is, they were “believing without belonging”.

More recently, it has become clear that beliefs and attitudes among the Nones are getting more secular.

In this series of articles, I have looked at changes among the ones who are left behind; that is, the decreasing fraction who identify as Christian. On many dimensions, the pattern is the same: young Christians are more secular than the previous generation.

Responses that follow this pattern include:

  • Almost all religious beliefs and activities, except belief in the afterlife.
  • Opposition to sex and sex education, except extramarital sex.
  • Matters of public policy including the legalization of marijuana, pornography, and euthanasia; support for affirmative action; and opposition to the death penalty and school prayer.

Many questions related to public spending follow the same pattern, with younger Christians generally moving toward positions held by their secular peers; the only substantial exception is mass transportation, which has less support among young people in both groups [although this result is so surprising to me that I need more evidence to be confident it is correct].

The most notable exceptions are opposition to gun control and abortion, which show almost no generational changes. Maybe not coincidentally, these exceptions are probably the most politicized topics among the questions I explored.

In summary, we can describe secularization in the U.S. as the sum of two trends, changes in affiliation and changes in belief. Both trends are moving fast, and they are moving in the same direction, away from religion.

A large majority of Americans support legal abortion, at least in some circumstances

This is the third in a series of articles where I use data from the General Social Survey (GSS) to explore

  • Differences in beliefs and attitudes between Christians and people with no religious affiliation (“Nones”),
  • Generational differences between younger and older Christians, and
  • Generational differences between younger and older Nones.

In the first article, I looked at changes in religious beliefs and found that younger Christians are more secular in many ways than the previous generation.

In the second article, I looked at views related to law and public policy and found that young Christians are more progressive on most issues than the previous generation.

In this article, I use the same dataset to explore changes in opinions about abortion. For details of the methodology, see the previous article.

GSS respondents were asked, “Please tell me whether or not you think it should be possible for a pregnant woman to obtain a legal abortion” under different circumstances.

The following figure shows the results.

Generational changes in beliefs about legal abortion

The blue markers are for people whose religious preference is Catholic, Protestant, or Christian; the orange markers are for people with no religious affiliation.

For each group, the circles show estimated percentages for people born in 1968; the arrowheads show percentages for people born in 1993.

For both groups, the estimates are for 2018, when the younger group was 25 and the older group was 50. The brackets show 90% confidence intervals.

Before we look for generational changes, we should notice the starting point: a large majority of Americans support legal abortion, at least in some circumstances.

  • In cases of severe birth defects and pregnancy due to rape, the majority is about 70% of Christians and 90% of the nonreligious.
  • In cases of serious danger to the woman’s health, it’s almost 90% of Christians and nearly all of the nonreligious.

Under other circumstances, opinions are more divided, with support near 40% among Christians and 70% among the Nones.

Looking now at the generational changes, I see only one that is likely to be practically and statistically significant: younger people in both groups are less likely than the previous generation to support legal abortion if there is a chance of serious birth defect.

Even so, there is majority support in both groups, more than 60% among Christians and 80% among Nones at age 25.

In summary:

  • Beliefs about abortion depend substantially on the circumstances;
  • In many circumstances, a large majority of Christians and the non-religious support legal abortion;
  • Even where there is disagreement between the groups, there is substantial diversity of opinion within both groups;
  • Generational changes in these opinions are generally small and within the statistical margin of error.

Young Christians are less religious than the previous generation

This is the first in a series of articles where I use data from the General Social Survey (GSS) to explore

  • Differences in beliefs and attitudes between Christians and people with no religious affiliation (“Nones”),
  • Generational differences between younger and older Christians, and
  • Generational differences between younger and older Nones.

On several dimensions of religious belief, young Christians are less religious than their parents’ generation. I’ll explain the methodology below, but here are the results:

Generational changes in religious belief, comparing people born in 1968 and 1993

The blue markers are for Christians (people whose religious preference is Catholic, Protestant, or Christian); the orange markers are for people with no religious affiliation.

For each group, the circles show estimated percentages for people born in 1968; the arrowheads show percentages for people born in 1993.

For both groups, the estimates are for 2018, when the younger group was 25 and the older group was 50. The brackets show 90% confidence intervals for the estimates, computed by random resampling.
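To illustrate what a confidence interval “computed by random resampling” means, here is a minimal bootstrap sketch using only the standard library. It applies the idea to a simple proportion rather than to the model-based estimates in the figure, and the responses are made up, so treat it as an illustration of the technique, not a reproduction of the analysis.

```python
import random

def bootstrap_ci(responses, iterations=1000, level=0.90, seed=17):
    """Percentile bootstrap interval for the fraction of 1s in `responses`."""
    rng = random.Random(seed)
    n = len(responses)
    # Resample with replacement and recompute the fraction each time
    estimates = sorted(
        sum(rng.choices(responses, k=n)) / n for _ in range(iterations)
    )
    tail = (1 - level) / 2
    low = estimates[int(tail * iterations)]
    high = estimates[int((1 - tail) * iterations) - 1]
    return low, high

# Hypothetical sample: 120 "yes" responses out of 400
responses = [1] * 120 + [0] * 280
low, high = bootstrap_ci(responses)
print(f"estimate 0.30, 90% CI roughly ({low:.2f}, {high:.2f})")
```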

The top row shows the fraction of respondents who interpret the Christian bible literally; more specifically, when asked “Which of these statements comes closest to describing your feelings about the Bible?”, they chose the first of these options:

  • “The Bible is the actual word of God and is to be taken literally, word for word”
  • “The Bible is the inspired word of God but not everything in it should be taken literally, word for word.”
  • “The Bible is an ancient book of fables, legends, history, and moral precepts recorded by men.”

Not surprisingly, people who consider themselves Christian are more likely to interpret the Bible literally, compared to people with no religious affiliation.

But younger Christians are less likely to be literalists than the previous generation. Most of the other variables show the same pattern; younger Christians are less likely to answer yes to these questions:

  • “Would you say you have been ‘born again’ or have had a ‘born again’ experience — that is, a turning point in your life when you committed yourself to Christ?”
  • “Have you ever tried to encourage someone to believe in Jesus Christ or to accept Jesus Christ as his or her savior?”

And they are less likely to report that they know God really exists; specifically, they were asked “Which statement comes closest to expressing what you believe about God?” and given these options:

  • I don’t believe in God
  • I don’t know whether there is a God and I don’t believe there is any way to find out.
  • I don’t believe in a personal God, but I do believe in a Higher Power of some kind.
  • I find myself believing in God some of the time, but not at others.
  • While I have doubts, I feel that I do believe in God.
  • I know God really exists and I have no doubts about it.

Younger Christians are less likely to say they know God exists and have no doubts.

Despite all that, younger Christians are more likely to believe in an afterlife. When asked “Do you believe there is a life after death?”, more than 90% say yes.

Among the unaffiliated, the trends are the same. Younger Nones are less likely to believe that the Bible is the literal word of God, less likely to have proselytized or been born again, and less likely to be sure God exists. But they are a little more likely to believe in an afterlife.

More questions, less religion

UPDATE: Since the first version of this article, I’ve had a chance to look at six other questions related to religious belief and activity. Here are the results:

Generational changes in religious belief, comparing people born in 1968 and 1993

Qualitatively, these results are similar to what we saw before: controlling for period effects, younger Christians are more secular than the previous generation, in both beliefs and actions.

They are substantially less likely to consider themselves “religious” or “spiritual”, and less likely to attend religious services or pray weekly. And they are slightly less likely to participate in church activities other than services.

They might also be less likely to say they have had a life-changing religious experience, but that change falls within the margin of error.

In later articles, I’ll look at trends in other beliefs and attitudes, especially related to public policy. But first I should explain how I generated these estimates.

Methodology

My goal is to estimate generational changes, that is, cohort effects as distinguished from age and period effects. In general, it is not possible to distinguish between age, period, and cohort effects without making some assumptions, because a respondent’s age is determined by the year of interview and the year of birth. So this analysis is based on the assumption that age effects in this dataset are negligible compared to period and cohort effects.

Data from the General Social Survey goes back to 1972; it includes data from almost 65,000 respondents.

To measure current differences between people born in 1968 and 1993, I could select only respondents born in those years and interviewed in 2018. But there are not very many of them.

Alternatively, I could use data from all respondents, going back to 1972, fit a model, and use the model to estimate generational differences. That might work, but it would probably give too much weight to older, less relevant data.

As a compromise, I use data from 1998 to 2018, from respondents born in 1940 or later. This subset includes about 25,000 respondents. But not every respondent was asked every question, so the number of valid responses for most questions is smaller.

For most questions, I discard a small number of respondents who gave no response or said they did not know.

To model the responses, I use logistic regression with year of birth (cohort) and year of interview as independent variables. For questions with more than two responses, I choose one of the responses to study, usually the most popular; in a few cases, I grouped a subset of responses (for example “agree” and “strongly agree”).

I use a quadratic model for the period effect and a cubic model of the cohort effect, using visual tests to check whether the models do an acceptable job of describing the trends in the data.

I fit separate models for Christians and Nones, to allow for the possibility that trends might look different in the two groups (as it turns out they often do).

Then I use the models to generate predictions for four groups: Christians born in 1968 and 1993, and Nones born in the same years. These are “predictions” in the statistical sense of the word, but they are deliberately not extrapolations into cohorts or periods that are not in the dataset; it might be more correct to call them “interpolations”.
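The modeling steps above can be sketched roughly as follows. This is not the notebook's code: it fits the logistic regression with a hand-written Newton's method in NumPy, the data are synthetic, and the function names, centering constants, and true effect sizes are all made up for illustration.

```python
import numpy as np

def design(cohort, year):
    """Intercept, cubic terms in cohort, quadratic terms in year of interview."""
    c = (np.atleast_1d(cohort).astype(float) - 1970) / 10  # centered, in decades
    y = (np.atleast_1d(year).astype(float) - 2008) / 10
    return np.column_stack([np.ones_like(c), c, c**2, c**3, y, y**2])

def fit_logistic(X, outcome, iterations=25):
    """Fit logistic regression coefficients by Newton's method."""
    beta = np.zeros(X.shape[1])
    for _ in range(iterations):
        p = 1 / (1 + np.exp(-X @ beta))
        gradient = X.T @ (outcome - p)
        hessian = X.T @ (X * (p * (1 - p))[:, None])
        beta += np.linalg.solve(hessian, gradient)
    return beta

def predict(beta, cohort, year):
    """Predicted probability of the studied response."""
    return 1 / (1 + np.exp(-design(cohort, year) @ beta))

# Synthetic respondents (made up for illustration; not GSS data)
rng = np.random.default_rng(1)
year = rng.choice(np.arange(1998, 2019, 2), size=20000)
cohort = year - rng.integers(18, 79, size=20000)   # ages 18 to 78
keep = cohort >= 1940                              # born in 1940 or later
year, cohort = year[keep], cohort[keep]
true_logit = 0.5 - 0.03 * (cohort - 1970) - 0.01 * (year - 2008)
outcome = (rng.random(len(cohort)) < 1 / (1 + np.exp(-true_logit))).astype(float)

# Fit one group, then predict for two birth years, both interviewed in 2018
beta = fit_logistic(design(cohort, year), outcome)
p_1968, p_1993 = predict(beta, [1968, 1993], [2018, 2018])
print(f"born 1968: {p_1968:.2f}, born 1993: {p_1993:.2f}")
```

In the actual analysis this would be repeated separately for Christians and Nones, and the difference between the two predictions is the estimated generational change.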

To show how this method works, let’s consider the fraction of Christians who answer that they know God exists, with no doubts. The following figure shows this fraction as a function of birth year (cohort):

Fraction of Christians who say they know God exists, plotted over year of birth

The red dots show the fraction of respondents in each birth cohort. The red line shows a smooth curve through the data, computed by local regression (LOWESS). The gray line shows the predictions of the model for year 2008.

This figure shows that the logistic regression model of birth year does an acceptable job of describing the trends in the data, while also controlling for year of interview.

To see whether the model also describes trends over time, we can plot the fraction of respondents in each year of interview:

Fraction of Christians who say they know God exists, plotted over year of interview

The green dots show the fraction of respondents during each year of interview and the green line shows a local regression through the data. The purple line shows the model’s predictions for someone born in 1968; the pink line shows predictions for someone born in 1993.

The gap between the purple and pink curves is the estimated generational change; in this example, it’s about 3 percentage points.

In summary, this method uses data from a range of birth years and interview years to fit a model, then uses the model to estimate the difference in response between people born in different years, both interviewed in 2018.

The results are based on the assumption that the model adequately describes the period and cohort effects, and that any age effects are negligible by comparison.

You can see all of the details in this Jupyter notebook, and you can click here to run it on Colab.

Please stop teaching people to write about science in the passive voice

You might think you have to, but you don’t and you shouldn’t.

Why you might think you have to

  1. Science is objective and it doesn’t matter who does the experiment, so we should write in the passive voice, which emphasizes the methods and materials, not the scientists.
  2. You are teaching at <a level of education> and you have to prepare students for the <next level of education>, where they will be required to write in the passive voice.

Why you don’t have to

Regardless of how objective we think science is, writing about it in the passive voice doesn’t make it any more objective. Science is done by humans; there is no reason to pretend otherwise.

If you are teaching students to write in the passive voice because you think they need it at the next stage in the pipeline, you don’t have to.

If they learn to write in the active voice now, they can learn to write in the passive voice later, when and if they have to. And they might not have to.

A few years ago I surveyed the style guides of the top scientific journals in the world, and here’s what I found:

  1. None of them require the passive voice.
  2. Several of them have been begging scientists for decades to stop writing in the passive voice.

Here is the style guide from Science, from 1968, and it says:

“Choose the active voice more often than you choose the passive, for the passive voice usually requires more words and often obscures the agent of action.”

Here’s the style guide from Nature:

“Nature journals like authors to write in the active voice (“we performed the experiment…” ) as experience has shown that readers find concepts and results to be conveyed more clearly if written directly.”

From personal correspondence with the production department at the Proceedings of the National Academy of Sciences USA (PNAS), I learned:

“[We] feel that accepted best practice in writing and editing favors active voice over passive.”

Top journals agree: you don’t have to teach students to write in the passive voice.

Why you shouldn’t

As a stylistic matter, excessive use of the passive voice is boring. As a practical matter, it is unclear.

For example, the following is the abstract of a paper I read recently. It describes prior work that was done by other scientists and summarizes new work done by the author. See if you can tell which is which.

The Lotka–Volterra model of predator–prey dynamics was used for approximation of the well-known empirical time series on the lynx–hare system in Canada that was collected by the Hudson Bay Company in 1845–1935. The model was assumed to demonstrate satisfactory data approximation if the sets of deviations of the model and empirical data for both time series satisfied a number of statistical criteria (for the selected significance level). The frequency distributions of deviations between the theoretical (model) trajectories and empirical datasets were tested for symmetry (with respect to the Y-axis; the Kolmogorov–Smirnov and Lehmann–Rosenblatt tests) and the presence or absence of serial correlation (the Swed–Eisenhart and “jumps up–jumps down” tests). The numerical calculations show that the set of points of the space of model parameters, when the deviations satisfy the statistical criteria, is not empty and, consequently, the model is suitable for describing empirical data.

L. V. Nedorezov, “The dynamics of the lynx–hare system: an application of the Lotka–Volterra model”.

Who used the model? Who assumed it was satisfactory? And who tested for symmetry?

I don’t know.

Please don’t teach students to write like this. It’s bad for them and anyone who has to read what they write, and it’s bad for science.

Handicapping pub trivia

Introduction

The following question was posted recently on Reddit’s statistics forum:

If there is a quiz of x questions with varying results between teams of different sizes, how could you logically handicap the larger teams to bring some sort of equivalence in performance measure?

[Suppose there are] 25 questions and a team of two scores 11/25. A team of 4 scores 17/25. Who did better […]?

One respondent suggested a binomial model, in which every player has the same probability of answering any question correctly.

I suggested a model based on item response theory, in which each question has a level of difficulty, d, each player has a level of efficacy, e, and the probability that a player answers a question correctly is

expit(e-d+c)

where c is a constant offset for all players and questions and expit is the inverse of the logit function.
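In Python, this formula might look like the following minimal sketch. The function name prob_correct and the default value of c are made up for illustration.

```python
import math

def expit(x):
    """Inverse of the logit function: maps any real number to (0, 1)."""
    return 1 / (1 + math.exp(-x))

def prob_correct(e, d, c=0.0):
    """Probability that a player with efficacy e answers a question of difficulty d."""
    return expit(e - d + c)

print(prob_correct(0, 0))                       # 0.5
print(prob_correct(1, 0) > prob_correct(0, 1))  # True
```

When efficacy equals difficulty and c is zero, the player has a 50% chance of answering correctly; more capable players and easier questions push the probability up.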

Another respondent pointed out that group dynamics will come into play. On a given team, it is not enough if one player knows the answer; they also have to persuade their teammates.

Me (left) at pub trivia with friends in Richmond, VA. Despite our numbers, we did not win.

I wrote some simulations to explore this question. You can see a static version of my notebook here, or you can run the code on Colab.

I implement a binomial model and a model based on item response theory. Interestingly, for the scenario in the question they yield opposite results: under the binomial model, we would judge that the team of two performed better; under the other model, the team of four was better.

In both cases I use a simple model of group dynamics: if anyone on the team gets a question, that means the whole team gets the question. So one way to think of this model is that “getting” a question means something like “knowing the answer and successfully convincing your team”.
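Under the binomial model with this group rule, one quick check is to back out the per-player probability implied by each team's score. This is only a point estimate that ignores sampling noise, and the function names are made up; it is a sketch, not the analysis in the notebook.

```python
def team_prob(p, size):
    """Probability that at least one of `size` players gets a question,
    if each player independently gets it with probability p."""
    return 1 - (1 - p) ** size

def implied_player_prob(correct, total, size):
    """Per-player probability that would produce the team's observed score."""
    team_rate = correct / total
    return 1 - (1 - team_rate) ** (1 / size)

p_two = implied_player_prob(11, 25, 2)    # team of two scored 11/25
p_four = implied_player_prob(17, 25, 4)   # team of four scored 17/25
print(f"team of two: {p_two:.3f}, team of four: {p_four:.3f}")
```

The implied per-player probabilities are about 0.252 for the team of two and 0.248 for the team of four, so by this rough measure the team of two did slightly better, but the gap is small compared to the noise in 25 questions.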

Anyway, I’m not sure I really answered the question, other than to show that the answer depends on the model.