CDC Misinformation on Girls and Violence

This post is related to the Youth Suicide Rise project

CDC announced that 'dramatic increases' in exposure to violence led to teenage girls being 'engulfed in a growing wave' of violence. The CDC failed to provide compelling evidence for this extraordinary assertion and omitted data that contradicts it.

Every two years the CDC administers the Youth Risk Behavior Survey (YRBS) asking a sample of U.S. high school students many questions about their lives. On February 13, the CDC Division of Adolescent and School Health (DASH) held a press conference about findings from the 2021 YRBS.

The primary theme of the conference was that teenage girls and sexual minorities are in a state of crisis due to an unprecedented surge of mental health problems and violence.

Note: My critique of the CDC press release is not about mental health. There is robust evidence of substantial and sustained increases in most YRBS measures of trauma and suicide over the past decade, there is no obvious reason to suspect that these YRBS increases are an artifact of a major flaw in YRBS administration, and adolescent suicide as well as clinical depression and anxiety nearly doubled during the past decade.

Girls: Overwhelming Wave of Violence

The emphasis was especially strong on girls and violence: the press was told by deputy director of CDC that "America’s teen girls are engulfed in a growing wave of sadness, violence, and trauma" and that "Over the past decade, teens, especially girls, have experienced dramatic increases and experiences of violence" -- after which the director of DASH reinforced the message by announcing that "Teen girls are experiencing record high levels of violence" and implied CDC knows why mental health problems are rising: "Our teenage girls are suffering through an overwhelming wave of violence and trauma, and it’s affecting their mental health."

The CDC conference led to widespread news media coverage that warned the public about a supposedly massive rise of violence in the lives of girls, with violence mentioned directly in numerous headlines such as Teen girls 'engulfed' in violence and trauma [Washington Post], CDC Says Teen Girls Are Caught in an Extreme Wave of Sadness and Violence [NBC NY], CDC sees alarming rise in violence, sadness in teen girls [CBS News] and Teen girls and LGBTQ+ youth plagued by violence and trauma [NPR].

Evidence Presented by CDC

The entire evidence provided by CDC in support of supposedly 'dramatic' increases of violence against girls is represented by two assertions in its press release U.S. Teen Girls Experiencing Increased Sadness and Violence:

1 in 5 (18%) experienced sexual violence in the past year—up 20% since 2017, when CDC started monitoring this measure.

More than 1 in 10 (14%) had ever been forced to have sex—up 27% since 2019 and the first increase since CDC began monitoring this measure.

Before we address the crucial distinction between survey results and reality, we need to note the following:

The relative change figures given to journalists by CDC were evidently calculated from prevalence figures rounded to whole numbers.

Calculating with rounded figures is not the same as rounding the calculation -- it can lead to huge inaccuracies.

For example, a change from "11" to "14" could actually be a change from 11.49 to 13.50, giving a relative increase of 17% instead of 27%.

The relative change figures given to journalists by CDC and widely disseminated to the public could therefore be very different from properly calculated results.
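The rounding problem described above can be made concrete with a minimal sketch. It assumes only that each published prevalence was rounded to the nearest whole percent; the function names are illustrative and not taken from any CDC tool.

```python
def rounding_interval(rounded, half_width=0.5):
    """Range of unrounded values that would be published as `rounded`."""
    return rounded - half_width, rounded + half_width


def relative_change_bounds(old_rounded, new_rounded):
    """Smallest and largest relative change (in percent) consistent with
    two prevalence figures that were each rounded to a whole number.
    The upper ends of the intervals are limits, not attainable values."""
    old_lo, old_hi = rounding_interval(old_rounded)
    new_lo, new_hi = rounding_interval(new_rounded)
    smallest = (new_lo - old_hi) / old_hi * 100
    largest = (new_hi - old_lo) / old_lo * 100
    return smallest, largest


# Relative change computed directly from the rounded figures 11 and 14.
naive = (14 - 11) / 11 * 100
low, high = relative_change_bounds(11, 14)

print(f"computed from rounded figures: {naive:.1f}%")
print(f"range consistent with the rounding: {low:.1f}% to {high:.1f}%")
```

For the "11" to "14" example, the change computed from rounded figures is about 27%, while the underlying change could lie anywhere from roughly 17% to roughly 38%.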

Note: The assertion that this is "the first increase since CDC began monitoring this measure" is false -- the measure increased (and decreased) repeatedly in the past, even by as much as 15%. Perhaps CDC meant to say this was the first time an increase was statistically significant, but statistical significance was a topic the CDC officials avoided during the conference and within the press release.

Sexual Violence

The YRBS item labeled Sexual Violence by CDC was introduced in 2017 and asks:

During the past 12 months, how many times did anyone force you to do sexual things that you did not want to do? (Count such things as kissing, touching, or being physically forced to have sexual intercourse.)

Note: I will not address here the problematic formulation and the consequent uncertainty about how students interpret the question; I will also not discuss any distinction between harassment and violence.

In 2017, 15.2% of girls reported experiencing Sexual Violence; in 2019 it was 16.6%; and in 2021 this further increased to a rounded 18% (anything from 17.5 to 18.4).

In other words "record-high violence" means that the measure was two or three percentage points higher in 2021 than it was in 2017 and 2019, the only other times it was measured.

It is unclear if the 2021 result is significantly different from the 2017 result. The 2021 DATA SUMMARY & TRENDS REPORT does define the term "significantly" (p < 0.05) and uses it to describe increases in mental health indicators, but it does not state that the sexual violence increase was significant.

Note: Given that standard errors were around 0.75 percentage points for this measure in 2017 and 2019, much of the 2021 increase could have been simply due to random sampling. This is a separate issue from the problem of compromised sampling that I discuss later.
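To get a feel for how much sampling error matters here, the following rough sketch computes an approximate 95% confidence interval for the difference between two survey years. It rests on loud assumptions: the 2021 standard error has not been published, so it is simply set equal to the roughly 0.75 points cited for 2017 and 2019; the years are treated as independent samples; and a normal approximation is used. A real analysis would need the full data and the survey design.

```python
import math


def difference_with_ci(p_old, p_new, se_old, se_new, z=1.96):
    """Difference between two independent estimates (in percentage points)
    with an approximate 95% confidence interval (normal approximation)."""
    diff = p_new - p_old
    se_diff = math.sqrt(se_old ** 2 + se_new ** 2)
    return diff, diff - z * se_diff, diff + z * se_diff


SE_REPORTED = 0.75      # approximate standard error cited for 2017 and 2019
SE_2021_ASSUMED = 0.75  # ASSUMPTION: the 2021 standard error is not published

for year, p_old in (("2017", 15.2), ("2019", 16.6)):
    # 17.5 and 18.4 bracket the values hidden behind the rounded '18%'.
    for p_2021 in (17.5, 18.4):
        diff, lo, hi = difference_with_ci(
            p_old, p_2021, SE_REPORTED, SE_2021_ASSUMED
        )
        print(f"{year} -> 2021 ({p_2021}%): "
              f"diff {diff:+.1f} pts, 95% CI {lo:+.1f} to {hi:+.1f}")
```

Under these assumptions the margin of error on each difference is about 2 percentage points, the same order of magnitude as the reported increases themselves.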

Strangely, the online version of YRBS administered in spring 2021 (named ABES) found the prevalence to be only 15.3%, essentially the same as on YRBS 2017.

During the conference, CDC officials never mentioned the much lower ABES result, even though it preceded the fall YRBS only by roughly 6 months.

Lifetime Rape Measure

The YRBS rape question does not measure current risk of rape -- it is a lifetime measure.

When we compare YRBS answers from girls in 9th grade with those in 12th grade, it is clear that a great majority of high school girls reporting rape to YRBS were raped already before high school. In fact for three decades nearly 1 out of 10 high school freshmen girls consistently told YRBS that they were raped at some previous point in their lives.

This means that the YRBS measure of lifetime rape risk can increase even if current risk of rape decreases for high school girls.

It is therefore a fallacy for the CDC to misinterpret the lifetime rape item as if it measured current levels of sexual violence.
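A toy calculation can illustrate how a lifetime measure can rise while current risk falls. All numbers below are hypothetical and chosen only for illustration; the model assumes a constant annual risk during high school and an average of two years spent in high school at survey time.

```python
def lifetime_prevalence(pre_hs_prevalence, annual_risk, years_in_hs):
    """Share of students ever victimized, given the share victimized
    before high school and a constant annual risk during high school."""
    never_victimized = (1 - pre_hs_prevalence) * (1 - annual_risk) ** years_in_hs
    return 1 - never_victimized


# HYPOTHETICAL cohorts: the later one entered high school with a higher
# prior prevalence but faces a LOWER annual risk during high school.
earlier = lifetime_prevalence(pre_hs_prevalence=0.09, annual_risk=0.010, years_in_hs=2)
later = lifetime_prevalence(pre_hs_prevalence=0.10, annual_risk=0.008, years_in_hs=2)

print(f"earlier cohort lifetime prevalence: {earlier:.1%}")
print(f"later cohort lifetime prevalence:   {later:.1%}")
```

In this made-up scenario the lifetime figure rises from about 10.8% to about 11.4% even though the annual risk during high school fell from 1.0% to 0.8%.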

Furthermore, aggregate measures should increase much more slowly than current measures -- after all a 10% increase in your salary does not boost your lifetime earnings by 25% within two years. See Did Rape of Teenage Girls Double in 2 Years? in the Appendix for more details.

And yet the increase in sexual violence, which is a past year measure, was only roughly 5 to 11 percent between 2019 and 2021 (from 16.6% to '18%' -- we do not know exactly because of the rounding to whole numbers by CDC).

Indeed the absolute increase was about the same for lifetime rape as it was for past year sexual violence: roughly 2.5 percentage points -- 25 extra girls per 1000. And yet the vast majority of the reported sexual violence is likely harassment such as unwanted kissing and touching -- far from forced sexual intercourse.

So how could the sexual violence trend explain the lifetime rape trend?

And once again, as with the sexual violence item, the online version of YRBS administered in spring 2021 found the prevalence of lifetime rape to be much lower: 10.4% versus '14%' on the fall YRBS.

Dating Violence Reversal

A slight detour (it will make sense soon):

In 2019 there was a curious divergence on the two YRBS measures of dating violence: sexual dating violence increased substantially, reversing the previous substantial decline; physical dating violence showed no such reversal.

Note: the sexual dating violence item is distinct from the sexual violence item; the latter measures abuse by anyone while the former measures only abuse by dating partners.

Why would one increase substantially but not the other?

A closer examination of the data reveals an even greater mystery:

While there was no change from 2017 to 2019 in the frequency of missing answers to the physical dating violence question, there was a huge increase for the sexual dating violence question: from 7% missing answers in 2017 to 24% in 2019.

How could missing answers increase so massively?

Questionnaire Censorship

It turns out that the 2019 YRBS was afflicted by massive censorship: items about sexual violence were removed from many YRBS questionnaires.

This can be seen in the huge increase of missing answers in 2019 for all the sexual violence items. For example, in 2017, 4% of the students did not answer the sexual violence by anyone question, but in 2019 this share rose to 25%. For rape, the frequency of missing answers rose from 2% to 18%.

One would expect the CDC to inform the public clearly and prominently about such a fundamental issue affecting the integrity of YRBS, but in fact you have to search various CDC documents carefully to finally find admission of this censorship buried inconspicuously among tons of other information (see Appendix).

Compromised Samples

The censorship of YRBS questionnaires severely undermines the validity of the sample for the censored items, because the sampling ceases to be random.

If the question was omitted mainly in conservative states and districts, the girls excluded from the sample may be less willing to admit they were raped, or have a more restricted view of what counts as sexual intercourse, or simply be less likely to have been raped.

Sample bias due to questionnaire censorship could cause substantial increases on the sexual violence measures. It could also explain why the online uncensored version of YRBS (ABES) did not find any increases at all.
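The mechanism can be sketched with a small, deliberately simplified calculation. The group sizes and prevalences below are entirely hypothetical, and the sketch ignores YRBS statistical weights; it only shows that removing respondents from a lower-prevalence group raises the measured prevalence even when nothing changes within either group.

```python
def pooled_prevalence(groups):
    """Unweighted prevalence across groups.
    `groups` is a list of (number_of_respondents, prevalence) pairs."""
    total = sum(n for n, _ in groups)
    return sum(n * p for n, p in groups) / total


# HYPOTHETICAL numbers, for illustration only.
kept_question = (7000, 0.13)      # schools that always asked the item
later_censored_full = (3000, 0.08)  # schools that asked it before censorship
later_censored_left = (500, 0.08)   # the few such schools still asking it

before = pooled_prevalence([kept_question, later_censored_full])
after = pooled_prevalence([kept_question, later_censored_left])

print(f"measured prevalence before censorship: {before:.1%}")
print(f"measured prevalence after censorship:  {after:.1%}")
```

With these invented figures the measured prevalence moves from 11.5% to about 12.7% purely because of who was allowed to answer.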

Note: CDC has not yet released response rates for individual items from the 2021 YRBS, but it is highly unlikely that the massive censorship of questionnaires in 2019 did not continue in 2021.

Omissions of Declines in Violence

The YRBS 2021 DATA SUMMARY & TRENDS REPORT released by CDC on the day of the conference included information for only 6 of the 13 items the CDC classifies as measures of Behaviors that Contribute to Violence.

Note: A few of the CDC 'Violence' items, such as Electronic Bullying, have only an indirect and tenuous link to violence. Items like Attempted Suicide and Drunk Driving were not included among Behaviors that Contribute to Violence by CDC, perhaps because public discourse on teenage violence seldom includes self-inflicted and unintentional violence.

What were some of the violence measures omitted by CDC in their summary document?

Since its inception, YRBS has asked students whether they were involved in a physical fight during the past year.

In 2011, 24% of girls said they were involved in a physical fight during the past year; by 2019 this had declined to 15%.

The 2019 figure is a three-decade historic low, less than half the risk of the early 1990s.

The likelihood of a fight on school grounds also decreased greatly for girls.

The prevalence of reported school fights also reached a three-decade low in 2019, far below the risks in the 1990s.

We do not know the 2021 YRBS results for physical fighting because the CDC withheld them from all the documents released during the conference.

The frequency of injuries from school fights declined so much -- down to 2.9% in 2015 from 4.2% in 2007 -- that CDC discontinued this YRBS item.

Exposure of girls to physical dating violence also declined, per the YRBS measure implemented in 2013.

And so did exposure to sexual dating violence, per YRBS.

Note: in 2019 this item was afflicted by a large number of missing answers due to questionnaire censorship.

These decreases were not only due to declines in dating but also due to the lowering risks of violence for those girls who did date during the past year.

The two dating violence indicators were included in the 2019 YRBS DATA SUMMARY & TRENDS REPORT but they were removed from the 2021 YRBS DATA SUMMARY & TRENDS REPORT.

The omission by CDC of the four historically low measures helped to ensure that no journalists would question the CDC narrative of record high levels of violence.

Contrary evidence from ABES

During the YRBS press briefing, CDC officials omitted any mention of the Adolescent Behaviors and Experiences Survey (ABES), which was essentially YRBS administered online: there the lifetime prevalence of rape was 10.4% (instead of '14%' on YRBS 2021) and recent sexual violence victimization was 15.3% (instead of '18%').

Unlike YRBS, the ABES was not affected by massive censorship: missing answer rates were below 2%.

If CDC dismisses ABES results because of the switch to online administration, then why does CDC accept unquestioningly the results of the 2021 YRBS despite its switch from spring to fall administration?

Neither the switch to online administration, nor the switch to fall administration, should have a major effect on measures such as lifetime rape prevalence.

The massive censorship of specifically sexual victimization questions, on the other hand, could easily have had a major effect.

Imagine you have survey A without censorship that produced no surprising victimization results, and survey B that produced highly surprising victimization results but only on the measures severely affected by censorship. Would you decide to completely ignore survey A and instead accept unquestioningly the extraordinary results from survey B as being accurate depictions of reality?

Summary

Assertions by CDC officials that girls are engulfed in a growing wave of violence that reached record-high levels are based entirely on two sexual violence items -- one that is not a measure of current risk and one that was added in 2017. Both indicators rely on samples severely compromised by widespread censorship of YRBS questionnaires. The sexual victimization increases on the censored 2021 YRBS are contradicted by results from the uncensored ABES survey (online version of YRBS).

Meanwhile, YRBS measures that showed large and sustained declines of violence were omitted by CDC from its press release and from its 2021 YRBS documents.

CDC should rigorously investigate the possibility of substantial increases in recent risks of sexual violence against girls. CDC officials should not, however, misrepresent improper statistics and flawed reasoning as scientific proof of their own extraordinary conclusions; nor should CDC officials impose invalid generalizations upon the public and omit crucial data and information when they present evidence to news media.

Appendix

2021 YRBS Switch to Fall Administration

The CDC failed to warn journalists that there was a switch from spring to fall in the administration of YRBS and that this places in doubt the degree of comparability with previous results for some items since student conduct and victimization can be subject to considerable seasonal variation.

This is particularly problematic for YRBS questions that ask about the past week or past month. Even questions about past year experiences could be affected, since kids tend to remember recent events better than older events.

2019 YRBS Questionnaire Censorship

CDC admitted censorship of the national 2019 YRBS questionnaires in its publication 2019 YRBS Overview and Methods:

---

Finally, three survey measures had relatively large amounts of missing data in 2019: forced sex (approximately 2,400 observations), sexual dating violence (approximately 3,400 observations), and attempted suicide with injury (approximately 4,900 observations). Most of these missing data can be attributed to some selected schools administering YRBS questionnaire versions that did not include these questions. Consequently, not all students in the national sample were given the opportunity to answer these questions and were counted as missing.

--- [https://www.cdc.gov/healthyyouth/data/yrbs/pdf/2019/su6901-H.pdf p.26]

Note that this incorrectly omits mention of the "Experienced sexual violence by anyone" item, where "Data were missing for 3,439 students for this variable, mostly attributed to the use of different versions of the YRBS questionnaire that did not include the sexual violence questions in certain selected schools".

Furthermore, the note conflates the low response rate on the suicide question with the new censorship problem. In reality, the suicide question always had a problem with low response rates, likely due to poor formulation (kids who didn't attempt suicide may simply skip it, because it starts with the words "If you attempted suicide during the past 12 months").

Did Rape of Teenage Girls Double in 2 Years?

To better understand the extreme implications of the CDC narrative that lifetime rape truly increased by a quarter due to a current wave of violence, one needs to realize that this would mean the rape of teenage girls actually doubled within two years.

This is because we then have, looking at the cohort surveyed in both 2019 and 2021,

1.25 N = 0.75 N + 0.25 N + x N

where N is the expected lifetime rape prevalence (same as in 2019), 0.75 N a lower bound on the prevalence that occurred already before HS, 0.25 N the corresponding upper bound on the prevalence that would ordinarily occur during HS, and x N the additional prevalence in 2020-2021 needed to raise lifetime rape by a quarter. Solving gives x = 0.25, so rapes in high school would have to roughly double (from 0.25 N to 0.5 N).
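The arithmetic behind this equation can be checked with a short sketch. It assumes, as the equation does, that the prevalence from before high school is unchanged between cohorts, so the entire increase must come from the high school years; the function name and the alternative input values are illustrative only.

```python
def required_hs_multiplier(lifetime_increase, pre_hs_share):
    """Factor by which in-high-school prevalence must grow in order to raise
    lifetime prevalence by `lifetime_increase`, when `pre_hs_share` of the
    baseline lifetime prevalence N occurred before high school.
    Both arguments are fractions of N."""
    hs_share = 1 - pre_hs_share   # baseline share occurring during HS
    extra = lifetime_increase     # x in the equation above
    return (hs_share + extra) / hs_share


# The case in the text: a 25% lifetime increase with 75% occurring pre-HS.
print(required_hs_multiplier(0.25, 0.75))

# Illustrative variations: the CDC's 27% figure, and a larger pre-HS share.
for increase, pre_hs in ((0.27, 0.75), (0.25, 0.80)):
    factor = required_hs_multiplier(increase, pre_hs)
    print(f"increase {increase:.0%}, pre-HS share {pre_hs:.0%}: x{factor:.2f}")
```

Because 0.75 N is a lower bound on the pre-HS share, a larger share only raises the required multiplier.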

Even if we take into account some cohort differences, the 'past year' rape risk would have to increase substantially more than the 25% for lifetime risk.

Another way to look at it is to consider absolute differences: between 2019 and 2021, the increase in the number of girls reporting rape is the same as the increase in the number of girls reporting sexual violence (roughly 25 extra girls per 1000 for both). So the great majority of the offenses comprising the increase in sexual violence would have to be forced sexual intercourse!

If we are to explain the increase by cohort differences, we would need a huge jump between cohorts just two years apart (especially since a quarter of the girls were still in high school 2 and a half years later).

Any way one looks at it, a 27% increase in lifetime rape over a mere 2 years is extraordinary and cannot be explained by the at most roughly 11% increase in the current rate of sexual violence.

2021 YRBS Substance Abuse

The 2021 YRBS Executive Summary (p.2) declares prominently that "Across almost all measures of substance use [...] female students are faring more poorly than male students."

This seems doubtful and is currently unverifiable without the full release of YRBS data.

A large part of the substance use questions on YRBS consists of questions regarding hard drugs and tobacco, and in 2019 it was boys who had the higher risks on most of these. It would be very strange if nearly all these items reversed on gender risk in just 2 years.

Either this is yet another example of how the CDC agenda of prioritizing girls results in careless misinformation or there may be yet another flaw in the administration of the 2021 YRBS.

2021 YRBS Bullying Decline

The 2021 YRBS indicated a large decline in school bullying prevalence (from 20% to 15%).

It could be that students are kinder to each other due to their experience of social isolation during the pandemic.

A simpler explanation, however, is that they spent fewer days in school and so were less likely to be victimized by bullying.

If so, we can expect the bullying measure to return to 2019 rates on the 2023 YRBS.

It will not surprise me then if this leads to widespread news headlines decrying a massive increase in school bullying.

2013 YRBS Bullying Anomaly

To better understand the need for caution regarding YRBS questionnaire changes (not to mention censorship), consider the following example.

There was a double anomaly on the 2013 YRBS: bullying of girls rose considerably while bullying of boys declined considerably; and the NCVS showed substantial declines in bullying of girls.

It turns out that on the 2013 YRBS the bullying question was, for the first time, directly preceded by two new questions on dating violence.

Bullying by dating partners is often not counted as 'bullying' in popular usage -- in fact CDC itself has in the past excluded abuse by dating partners (as well as siblings) from its definition of bullying.

Since YRBS uses the word "bullying" in its question, this common usage is bound to affect how students read the question. Girls in particular may have omitted, prior to 2013, harassment by dating partners, not thinking of it as bullying. When, however, girls are reminded of abuse by boyfriends in the two questions just prior to the bullying item, they will be more likely to classify bullying behavior by boyfriends for what it is -- bullying.

Needless to say, girls experience considerably higher dating violence rates than boys, so this would explain the split in gender trends on the 2013 YRBS. And since NCVS did not change its questionnaire this way in 2013, only YRBS would show this split. Finally, since the 2015 YRBS questionnaire remained the same in this regard, there was no 'correction' in 2015 of the rise in 2013.

While it is impossible to tell for certain without administering a split YRBS survey (with and without the two new questions preceding bullying), it is at least an explanation that would fit all the aspects of the 2013 mystery.

Correction (Mar 12): The Sexual Dating Violence and the Physical Dating Violence graphs were mistakenly for boys and girls together -- I corrected both graphs.

Note (March 17): Professor John Santelli (Columbia University) pointed out that CDC determines YRBS statistical weights based on grade (plus sex and race/ethnicity), not age. So unless CDC changed its methodology specifically for the 2021 fall administration, any results that depend heavily on age, even those from questions asking about "the past 12 months", could be affected considerably by the spring to fall switch as the students will be on average about 6 months younger than they were in the 2019 YRBS survey.

2 comments:

Anonymous, 3/16/2023 02:38:00 AM
Thanks for this important exploration. The sampling issue may be even worse than you suggest in your report. High school students are systematically YOUNGER in the fall than they are in the spring. So shifting YRBS sampling from spring to fall is likely to create systematic bias in the prevalence of behaviors and experiences. Ouch! John Santelli js2637@columbia.edu

Anonymous, 3/17/2023 02:42:00 AM
Yes, and one question is whether CDC adjusted the weights to ensure a similar average age to previous YRBS. We will find out once CDC releases the full data (I'm too tired of asking and getting no response from CDC).

The Shores of Academia

Thursday, February 23, 2023