Guest Post by Willis Eschenbach
Well, I’ve been thinking for a while about how to explain what I think is wrong with how climate trend uncertainties are often calculated. Let me give it a shot.
Here, from a post at the CarbonBrief website, is an example of some trends and their claimed associated uncertainties. The uncertainties (95% confidence intervals in this instance) are indicated by the black “whisker bars” that extend below and above each data point.
Figure 1. Some observational and model temperature trends with their associated uncertainties.
To verify that I understand the graph, here is my own calculation of the Berkeley Earth trend and uncertainty.
Figure 2. My own calculation of the Berkeley Earth trend and uncertainty (95% confidence interval), from the Berkeley Earth data. Model data is taken directly from the CarbonBrief graphic.
So far, so good: I’ve replicated their Berkeley Earth results.
And how are that trend and the uncertainty calculated? It’s done mathematically using a method called “linear regression”. Below are the results of a linear regression, using the computer program R.
Figure 3. Berkeley Earth surface air temperature, with seasonal variations removed. The black/yellow line is the linear regression trend.
The trend is shown as the “Estimate” of the change in time listed as “time(tser)” in years, and the uncertainty per year is the “Std. Error” of the change in time.
This gives us a temperature trend of 0.18°C per decade (shown in the “Coefficients” as 1.809E-2 °C per year), with an associated uncertainty of ±0.004°C per decade (shown as 3.895E-4 °C per year).
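For readers who want to see what a linear regression does to get those two numbers, here is a minimal sketch. It is in Python rather than R, it uses synthetic data (not the Berkeley Earth record), and the function name `ols_trend` is hypothetical. It computes the ordinary least-squares slope and its standard error, which is the same kind of “Estimate” and “Std. Error” that a regression summary reports.

```python
import math
import random

def ols_trend(t, y):
    """Return (slope, standard error of slope) for a least-squares fit of y on t."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    # Residual sum of squares around the fitted line
    rss = sum((yi - (intercept + slope * ti)) ** 2 for ti, yi in zip(t, y))
    # Standard error of the slope, using n - 2 degrees of freedom
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se

# Synthetic monthly series (an assumption for illustration only):
# a 0.018 °C/year trend plus random noise, 40 years long
random.seed(0)
t = [1980 + m / 12 for m in range(480)]
y = [0.018 * (ti - 1980) + random.gauss(0, 0.15) for ti in t]

slope, se = ols_trend(t, y)
# Multiply per-year values by 10 to express them per decade
print(f"trend: {slope * 10:.3f} °C/decade")
print(f"1-sigma: ±{se * 10:.4f} °C/decade")
```

Note that this standard error measures only the scatter of the points around the fitted line.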
So … what’s not to like?
Well, the black line in Figure 3 is not the record of the temperature. It’s the record of the temperature with the seasonal variations removed. Here’s an example of how we remove the seasonal variations, this time using the University of Alabama at Huntsville Microwave Sounding Unit (UAH MSU) lower troposphere temperature record.
Figure 4. UAH MSU lower troposphere temperature data (top panel), the average seasonal component (middle panel), and the residual with the seasonal component removed (bottom panel).
The seasonal component is calculated as the average temperature for each month. It repeats year after year for the length of the original dataset. The residual component, shown in the bottom panel, is the original data (top panel) minus the average seasonal variations (middle panel).
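The decomposition just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used for Figure 4: the function name `remove_seasonal` is hypothetical, the series is assumed to be monthly with no missing values and at least one full year of data, and the example data are synthetic.

```python
import math

def remove_seasonal(values, period=12):
    """Split a series into (seasonal, residual) using the mean of each calendar month."""
    sums = [0.0] * period
    counts = [0] * period
    for i, v in enumerate(values):
        sums[i % period] += v
        counts[i % period] += 1
    # Average value for each month of the year
    monthly_mean = [s / c for s, c in zip(sums, counts)]
    # Repeat the monthly averages for the length of the record
    seasonal = [monthly_mean[i % period] for i in range(len(values))]
    # Residual = original data minus the seasonal component
    residual = [v - s for v, s in zip(values, seasonal)]
    return seasonal, residual

# Synthetic three-year monthly record: a seasonal cycle plus a slow drift
data = [10 * math.sin(2 * math.pi * m / 12) + 0.0015 * m for m in range(36)]
seasonal, residual = remove_seasonal(data)
print(seasonal[:12])
print(residual[:12])
```

By construction, the seasonal and residual components add back up to the original data.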
Now, this residual record (actual data minus seasonal variations) is very useful. It allows us to see minor variations from the average conditions for each month. For example, in the residual data in the bottom panel, we can see the temperature peaks showing the 1998, 2011, and 2016 El Ninos.
To summarize: the residual is the data minus the seasonal variations.
Not only that, but the residual trend of 0.18°C per decade shown in Figure 3 above is the trend of the data itself minus the trend of the seasonal variations. (The seasonal variations trend is close to but not exactly zero, because of the end effects based on exactly when the data starts and stops.)
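The subtraction of trends follows from the fact that a least-squares slope is linear in the data. Here is a small Python check of that identity, using synthetic data and a hypothetical helper named `slope`. The numbers are illustrative only and are not taken from any of the datasets above.

```python
import math

def slope(t, y):
    """Ordinary least-squares slope of y against t."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    return sxy / sxx

# Synthetic 40-year monthly record (assumed): seasonal cycle plus a linear trend
n_months = 480
t = [m / 12 for m in range(n_months)]
data = [math.sin(2 * math.pi * m / 12) + 0.018 * t[m] for m in range(n_months)]

# Average for each calendar month, repeated over the record
monthly_mean = [sum(data[k::12]) / len(data[k::12]) for k in range(12)]
seasonal = [monthly_mean[m % 12] for m in range(n_months)]
residual = [d - s for d, s in zip(data, seasonal)]

trend_data = slope(t, data)
trend_seasonal = slope(t, seasonal)
trend_residual = slope(t, residual)

# The residual trend equals the data trend minus the seasonal trend
print(trend_residual, trend_data - trend_seasonal)
```

The two printed values agree to within floating-point rounding. How far the seasonal trend sits from zero depends on the size of the seasonal cycle and on where the record starts and stops.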
So … what is the uncertainty of the residual trend?
Well, it’s not what is shown in Figure 3 above. Following the rules of uncertainty, the uncertainty of the difference of two values, each with an associated uncertainty, is the square root of the sum of the squares of the two uncertainties. But the uncertainty of the seasonal trend is quite small, typically on the order of 1e-6 or so. (This tiny uncertainty is due to the standard errors of the averages of each monthly value.)
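The square-root-of-the-sum-of-the-squares rule is easy to try out. Here is a minimal Python sketch; the function name `difference_uncertainty` and the value used for the data uncertainty are hypothetical, chosen only to show the arithmetic. The rule as written assumes the two uncertainties are independent of each other.

```python
import math

def difference_uncertainty(sigma_a, sigma_b):
    """Uncertainty of (a - b), assuming the two uncertainties are independent."""
    return math.sqrt(sigma_a ** 2 + sigma_b ** 2)

# Hypothetical values, for illustration only
sigma_data = 0.03      # made-up uncertainty of the data trend
sigma_seasonal = 1e-6  # order of magnitude quoted above for the seasonal trend

combined = difference_uncertainty(sigma_data, sigma_seasonal)
print(combined)
```

When one of the two uncertainties is this small, the combined value is indistinguishable from the larger one.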
So the uncertainty of the residual is basically equal to the uncertainty of the data itself.
And this is a much larger number than what is usually calculated via linear regression.
How much larger? Well, for the Berkeley Earth data, on the order of eight times as large.
To see this graphically, here’s Figure 2 again, but this time showing both the correct (red) and the incorrect (black) Berkeley Earth uncertainties.
Figure 5. As in Figure 2, but showing the actual uncertainty (95% confidence interval) for the Berkeley Earth data.
Here’s another example. Much is made of the difference in trends between the UAH MSU satellite-measured lower troposphere temperature trend and ground-based trends like the Berkeley Earth trend. Here are those two datasets, with their associated trends and the uncertainties incorrectly calculated via linear regression of the data with the seasonal variations removed. The uncertainties shown are one standard deviation, also known as one-sigma (1σ) uncertainties.
Figure 6. UAH MSU lower troposphere temperatures and Berkeley Earth surface air temperatures, along with the trends showing the linear regression uncertainties.
Since the uncertainties (transparent red and blue triangles) don’t overlap, this would look like the two datasets have statistically different trends.
However, when we calculate the uncertainties correctly, we get a very different picture.
Figure 7. UAH MSU lower troposphere temperatures and Berkeley Earth surface air temperatures, along with the trends showing the correctly calculated uncertainties.
Since the one-sigma (1σ) uncertainties basically touch each other, we cannot say that the two trends are statistically different.
CODA: I’ve never taken a statistics class in my life. I am totally self-taught. So it’s possible my analysis is wrong. If you think it is, please quote the exact words that you think are wrong, and show (demonstrate, don’t simply claim) that they are wrong. I’m always happy to learn more.
As always, my best wishes to everyone.
w.