Friday, February 7, 2020

Youth Suicide Rise: Rate Fluctuation




Note: this is part of the Youth Suicide Rise project.


Could the doubling of suicide rates within a decade be the result of natural fluctuation within the data?

Given the limited amount of relevant data, separating systematic patterns from mere 'random noise' is essentially impossible.  It is possible, however, to estimate reasonable upper bounds on the role of chance by examining year-to-year fluctuation.

Let us start by looking at child suicide rates since 1999:



The graph shows little volatility: both the mean and the median of the year-to-year changes are a little over 1 death per million, while the yearly rates themselves range from 11 to 24 deaths per million.

The difference between 2007 and 2017 amounts to 13 deaths per million, which is roughly 10 times as large as the average change between consecutive years.
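The comparison above can be sketched in a few lines of Python. This is only an illustration: the function name yearly_changes and the toy_rates values are made up for the example and are not the CDC figures behind the graph.

```python
from statistics import mean, median


def yearly_changes(rates):
    """Return the absolute difference between each pair of consecutive yearly rates."""
    return [abs(later - earlier) for earlier, later in zip(rates, rates[1:])]


# Made-up rates (deaths per million) for illustration only, NOT the CDC data.
toy_rates = [10.0, 11.0, 10.5, 12.5, 12.0]

changes = yearly_changes(toy_rates)
typical_change = mean(changes)
total_shift = abs(toy_rates[-1] - toy_rates[0])

print("year-to-year changes:", changes)
print("mean change:", typical_change, "median change:", median(changes))
print("first-to-last shift as a multiple of the mean change:", total_shift / typical_change)
```

With the real yearly rates in place of toy_rates, the last line would give the "roughly 10 times" figure discussed above.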

To see this better, let us look at the magnitudes of year-to-year changes:


Light blue shows the magnitude of each change, whether positive or negative; dark blue shows the yearly rates.

Note that the largest change (3.3 deaths) occurred after rates had already risen greatly, so it is only 15.9% of the preceding year's rate.

We can see that year-to-year changes are a small fraction of the preceding year's rate:


Here I deliberately left the scale maximum at 100%, since 100% represents the previous year's rate; each displayed value is the absolute difference between the two years, expressed as a percentage of that rate.

The average change is about 8% of the previous year's rate, and no change exceeded 20%.

The proportionally largest change occurred in 2008, with a 17.7% jump over 2007.  The doubling between 2007 and 2017 (a 100% increase) therefore amounts to about 6 times the largest proportional year-to-year change on record.
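For readers who want to reproduce this kind of calculation, here is a minimal sketch. The helper relative_changes and the toy_rates values are hypothetical and chosen only so that the series doubles from start to end; the single figure taken from the post is the 17.7% maximum.

```python
def relative_changes(rates):
    """Absolute year-to-year change as a percentage of the preceding year's rate."""
    return [100 * abs(later - earlier) / earlier
            for earlier, later in zip(rates, rates[1:])]


# Made-up rates (deaths per million) that double overall, NOT the CDC data.
toy_rates = [10.0, 12.5, 16.0, 20.0]

pct_changes = relative_changes(toy_rates)
largest = max(pct_changes)
overall = 100 * (toy_rates[-1] - toy_rates[0]) / toy_rates[0]

print("percentage changes:", pct_changes)
print("overall rise as a multiple of the largest yearly change:", overall / largest)

# The post's own figures: a 100% rise against a 17.7% maximum yearly change.
post_ratio = 100 / 17.7
print("ratio for the post's figures:", round(post_ratio, 2))
```

The last ratio comes out a little above 5.6, which the text rounds to "about 6".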

Another method to differentiate short-term fluctuations from long-term trends is to look at aggregate data over a longer time period.

Let us begin with the average of rates over 5-year periods:




We can see that the suicide rate jumped more than 50% from 2003-07 to 2013-17.
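A sketch of how such period averages could be computed follows. The function block_averages and the sample values are illustrative only. Aligning the blocks to the most recent year and dropping leftover early years is an assumption of this sketch, not something stated in the post.

```python
from statistics import mean


def block_averages(years, rates, size=5):
    """Average the rates over consecutive non-overlapping blocks of `size` years.

    Blocks are aligned to the final year; leftover early years that do not
    fill a whole block are dropped (an assumption made for this sketch).
    """
    blocks = []
    end = len(rates)
    while end - size >= 0:
        start = end - size
        label = f"{years[start]}-{years[end - 1]}"
        blocks.append((label, mean(rates[start:end])))
        end = start
    return blocks[::-1]  # oldest block first


# Made-up rates (deaths per million) for illustration only, NOT the CDC data.
years = list(range(2006, 2018))
rates = [8.0, 9.0,
         9.0, 10.0, 11.0, 10.0, 10.0,
         14.0, 15.0, 16.0, 17.0, 18.0]

blocks = block_averages(years, rates)
first_avg, last_avg = blocks[0][1], blocks[-1][1]
increase_pct = 100 * (last_avg - first_avg) / first_avg

print(blocks)
print("increase between first and last block (%):", increase_pct)
```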

Next, we can look at rolling 5-year averages, where consecutive intervals overlap:



We now see a very stable upward trend since the 2006-10 interval -- except that it seems to be slightly accelerating at the very end.
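Averages over overlapping intervals can be sketched as a simple moving average. The function name rolling_average and the sample numbers below are made up for the example; the window length is a parameter, so the same helper covers both 5-year and 3-year intervals.

```python
from statistics import mean


def rolling_average(rates, window=5):
    """Mean of every run of `window` consecutive yearly rates (overlapping intervals)."""
    return [mean(rates[i:i + window]) for i in range(len(rates) - window + 1)]


# Made-up rates (deaths per million) for illustration only, NOT the CDC data.
toy_rates = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

five_year = rolling_average(toy_rates, window=5)
three_year = rolling_average(toy_rates, window=3)

print("5-year rolling averages:", five_year)
print("3-year rolling averages:", three_year)
```

A series of n yearly values yields n - window + 1 averages, so a shorter window gives more points but smooths less.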

We can see that 3-year intervals also give a very smooth trend (since 2007):



To summarize, the U.S. is so large a country that there is not enough fluctuation in yearly child suicides to justify the dismissal of the 2007-2017 doubling as being largely the result of mere chance.


Notes:


Admittedly, the conclusion is fairly obvious from just looking at the first graph of child suicide rates each year.  We will soon, however, encounter data where fluctuations can be far more important, such as when we examine suicide rates of young girls or suicide methods resulting in deaths.  The above is thus useful for getting familiar with the methods involved, in a manner easily understood even by readers who know nothing about statistics.

On the other hand, we also avoided more advanced statistical tools -- with only 15 data points the use of more sophisticated methods may mislead rather than enlighten. 

The role of anomalies within the 2007-2017 period is a separate issue, with 2008 being an obvious example.  We will look at anomalies more closely in the next post.


Technical Notes:

I start with 1999 because that is the first year available under the ICD-10 classification that CDC displays by default, and going further back would offer little additional advantage.

When examining natural fluctuations, it needs to be kept in mind that these can also be affected by time -- think of the rapidly increasing child population in the 1960s -- and therefore the consideration of periods before 1999 may lead to needless complications rather than insight.

We will of course look at data from earlier time periods soon, once we move on to examine historical context.

The average of rates over N years is not the same as the rate for the N-year period.  In fact 'population' in any given year is not in itself a well-defined concept, but this does not matter much if the counting method is consistent over the years and there are no extreme changes in population.

Similarly, the rate over a 5-year period may be computed differently depending on its definition; usually, though, we multiply each yearly rate by that year's population, then divide the sum by the total of all the populations.

In our case the population has varied only slightly (namely by less than 1% from year to year) so we need not worry about these details.
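The difference between the two definitions can be illustrated with a small sketch. The function names and all numbers below are invented for the example and are not CDC figures.

```python
def mean_of_rates(rates):
    """Plain average of the yearly rates, ignoring population sizes."""
    return sum(rates) / len(rates)


def pooled_rate(rates, populations):
    """Population-weighted rate: sum of rate * population over total population."""
    weighted_sum = sum(rate * pop for rate, pop in zip(rates, populations))
    return weighted_sum / sum(populations)


# Made-up yearly rates (deaths per million) for illustration only.
rates = [10.0, 20.0]

# Very different populations: the two definitions disagree noticeably.
uneven = pooled_rate(rates, [1_000_000, 3_000_000])

# Populations differing by 1%: the two definitions nearly coincide.
nearly_equal = pooled_rate(rates, [1_000_000, 1_010_000])

print("mean of rates:", mean_of_rates(rates))
print("pooled rate, uneven populations:", uneven)
print("pooled rate, nearly equal populations:", nearly_equal)
```

When populations change by only about 1% from year to year, as in the toy second case, the weighted rate stays very close to the plain average, which is why the distinction can be ignored here.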

The rate values are rounded to single decimal digits when displayed, but not when used in calculations; therefore the percentage changes may differ slightly from those that would be computed with the rounded values shown in the first graph.

The graphs were produced by the free and open-source LibreOffice software.  I admit part of the reason behind the graphing bonanza above is that I am getting familiar with the Chart feature of LibreOffice Calc -- so far I like the crisp look of the graphs.

Data Sources:



