Friday, April 16, 2010

This is Figure 5 from the 1998 article:
ME Mann, RS Bradley, MK Hughes. "Global-scale temperature patterns and climate forcing over the past six centuries". Nature, 1998

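### Hockey stick replication
### data source: "nhmean.txt"
### (the code below assumes it has already been read into a data frame 'hockey'
###  with columns date, raw, recon, lower, upper)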

##get rid of 0s
hockey$raw[hockey$raw==0] <- NA 
hockey$recon[hockey$recon==0] <- NA
### Get 1961- mean to center chart

x1 <- subset(hockey$raw, hockey$date > 1961)
meanx1 <- mean(x1, na.rm = TRUE)  ## used below to place the axis ticks and the mean line

#### extend plot margins for labels

#plot data and create new ylab numbers
plot(hockey$date, hockey$raw, type="l", col = "red", ylim = c(-1.1, 0.8), las=1, ylab="Departure in temperature (C)\nfrom the 1961 to 1990 average", xlab="Year\n Chris Miner Geo 299B ", yaxt= "n")
axis(2, at=meanx1, las=1, labels="0.0")
axis(2, at=meanx1-0.5, las=1, labels="-0.5")
axis(2, at=meanx1+0.5, las=1, labels="0.5")
axis(2, at=meanx1-1, las=1, labels="-1.0")

##create "standard error" polygon
polygon(x= c(hockey$date, rev(hockey$date)), y=c(hockey$lower, rev(hockey$upper)), col="grey", border=FALSE)

###### add text
rect(1550, -0.9, 1997, -1.1, border = TRUE, col = "white")
text(1550, -1.02, "Data from thermometers (red) and from tree rings,\ncorals, ice cores and historical records (blue).", pos = 4, adj = 0)
text(1700, 0.7, "NORTHERN HEMISPHERE")

#### add trend line

lines(hockey$date, hockey$recon, col="dodgerblue3") ### go dodgers
abline(h=meanx1) #### mean value line

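### function to deal with lowess' problem with NAs blahhhh
lowess.na <- function(x, y, f = 2/3, ...) {
  ## keep only the pairs where neither x nor y is missing
  x1 <- subset(x, (!is.na(x)) & (!is.na(y)))
  y1 <- subset(y, (!is.na(x)) & (!is.na(y)))
  lowess(x1, y1, f, ...)
}
lines(lowess.na(hockey$date, hockey$recon, f = 0.04), lwd = 2)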

In many ways the lines of the climate debate can be drawn within the frame of one graph. In 1998, Mann, Bradley and Hughes published the now famous 'hockey stick' reconstruction of 600 years of temperature patterns across the Northern Hemisphere. The graph itself is innocently buried within a dense scholarly discourse full of eigenvalues and principal component analysis. As a figure, it is one of many in a short article. As a representation of the data, it could be said to be the least visually appealing of the figures presented and to contain the least information. However, what it does do is summarize the thrust of the entire article and visually provide near-conclusive evidence that there is a historically unique change in climate and that it is caused by human activity.

The questions the article sets out to answer are fundamental to the climate debate: Is the earth getting warmer? If it is, is that change within the normal variability of long-term trends? And finally, is human activity involved in that change? The 'hockey stick' graph, whether or not the authors intended it to, answers all of these questions, and it does so forcefully and emphatically. It depicts a clear monotonic growth in temperature over the 20th century. This growth has gone beyond the visible trends seen in prior centuries, and, most importantly, the beginning of the current trend seems to correlate exactly with the growth of industry in the northern hemisphere. Yet, and this is the reason we may question the authors' intentions, the graph is stunningly clear and conclusive, while the text of the article speaks of the uncertainty and provisional nature of the findings.
The graph itself has given rise to its own controversy: it is either a global fraud, one part of a scientific discourse, or the philosopher’s stone of climate change. Critics point to the highly aggregated nature of the data, with layers of uncertainty built one upon another. They argue that it smooths and attenuates global trends, which exaggerates the data from modern thermometer readings; that its data are filled with measurement error; and that it is a highly non-random sample of both the proxies and the raw temperature data, which introduces more severe autocorrelation than the authors admit, along with problems of endogeneity in the temperature readings. The defenders claim that the methodology and data were open to inspection, that levels of uncertainty were well explicated in that study and in following ones, that further studies have built on this evidence, and finally that whatever reasonable level of uncertainty you put on the data, something worrying is going on.

However, leaving the climate debate aside and focusing on the 'hockey stick' itself, the most telling critique of the authors of the original study might not be that their study is flawed but that they underestimated or ignored the power that the visual representation of data can have. In this debate, the graphics overpowered the words. The graph gave a strong impression of certainty not echoed in the text, inviting exactly the problem that IPCC recommendations warned of: "More consistent estimates of the endpoints of a range for any variable would minimize misunderstandings and reduce the likelihood that interest group could misunderstand or misrepresent the findings". These misunderstandings run across interest groups both for and against the article's findings, for uncertainty does not favor either side of the debate in this case. Just as the problem might be overstated, it could also be much worse, as scholars have pointed out. This fact has been lost in the debate sparked by the visual representation of Mann et al.'s findings.

Though I only have access to the already aggregated, mean-centered data, below is a brief discussion of some shortcomings in the ‘hockey stick’ graph and an attempt at improvement. First are the wholly arbitrary elements of the graph: the 0 point line drawn through the authors’ chosen point, the coloring, the scale of the axes, and the combination of a time-series line plot, a mean-centered trend line, and the backdrop of the ‘confidence intervals’. The scale of the axes gives the impression of a much greater magnitude of change. The 0 point of the y axis seems chosen for visual effect rather than to represent an aspect of the data (why 1961? Why not 1902, the whole timeline?). The uncertainty in the graph is grayed out and is simply a background feature.
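
To make the point about the 0 line concrete, here is a minimal sketch in R. It uses a synthetic series (not the nhmean.txt data), and `recenter` is a hypothetical helper written only for this illustration. It re-expresses the same series as departures from two different reference windows and checks that the two versions differ by a single constant, so the choice of baseline moves the zero line without changing the shape of the curve.

```r
# Hypothetical helper: departures from the mean of a reference window [from, to].
recenter <- function(values, years, from, to) {
  in_window <- years >= from & years <= to
  values - mean(values[in_window], na.rm = TRUE)
}

# Synthetic series for illustration only (NOT the nhmean.txt data).
set.seed(1)
years  <- 1902:1995
values <- 0.005 * (years - 1902) + rnorm(length(years), sd = 0.1)

late_base <- recenter(values, years, 1961, 1990)  # 1961 to 1990 baseline
full_base <- recenter(values, years, 1902, 1995)  # whole-record baseline

# The difference between the two versions is the same at every year:
# only the position of the zero line has moved.
offset <- late_base - full_base
range(offset)
```

Which window is the "right" one is a judgment call; the sketch only shows that the choice affects where zero sits, and therefore how much of the curve appears above or below it.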

Yet in my view, one of the most visually misleading aspects of the graph is the trend line drawn through the mean of the data points. First of all, these are not observations; they are point estimates. In a frequentist approach (which I’m assuming they’re taking), if portraying the level of uncertainty is high on our agenda, then a mean trend line is likely in this case to give an overly confident visual impression. For instance, the standard deviation intervals represented in gray in the graph tell us little about how confident we are of where the true value lies. What they tell us is that, if our assumptions are reasonably accurate, then 95% or 97.5% (etc.) of the time such a confidence interval will cover the true value. We do not know the probability of where that point lies, or how likely it is to be at the center or the extreme of that confidence interval. Further, no uncertainty is given for the thermometer temperature readings. They are treated as if they were an accurate census of the population. However, the thermometer readings are as much a sample from a population as the proxy data. Thus the thermometer readings are at risk for all of the problems suffered by the proxy data: spatial and serial autocorrelation, measurement error, missing data, etc.
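
The frequentist reading of a confidence interval can be illustrated with a small simulation. This is only a sketch under simple assumptions (independent normal draws, a made-up true mean, and a t-based interval), none of which come from the article. It repeats the sampling many times and counts how often the 95% interval covers the true value.

```r
set.seed(42)
true_mean <- 0.2    # made-up "true" value, for illustration only
n         <- 30     # observations per simulated sample
n_sims    <- 2000   # number of repeated samples

# For each simulated sample, build a 95% t interval and record
# whether it contains the true mean.
covers <- replicate(n_sims, {
  x  <- rnorm(n, mean = true_mean, sd = 0.5)
  ci <- t.test(x, conf.level = 0.95)$conf.int
  ci[1] <= true_mean && true_mean <= ci[2]
})

# Share of intervals that covered the true value (should be close to 0.95)
coverage <- mean(covers)
coverage
```

The 95% describes the long-run behavior of the procedure across repeated samples. For any single interval, such as the gray band at a given year, it does not say where inside the band the true value is likely to sit.
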
Below is a WEAKLY ATTEMPTED improvement on the original graph:

I'm trying to capture a little more of the uncertainty. First, since the proxy data cover nearly the whole time span, the thermometer readings are left out, as overlaying them obscures the trend told by the proxy data. Second, the 0 line is centered at the mean of the thermometer data, as the graph claims to tell us the deviation from the post-industrial temperatures. Finally, the upper and lower bounds are highlighted with lowess lines (I still need to play with the smoothing, as some of the points are outside the bounds).
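
A rough sketch of how such a plot could be put together is below. It runs on a synthetic stand-in data frame (not the real nhmean.txt values) that reuses the column names from the code earlier in this post. The smoothing span `f` and the colors are arbitrary choices, and this is not necessarily the exact code behind the figure.

```r
# Synthetic stand-in for the data (NOT the real nhmean.txt values);
# column names follow the ones used earlier in this post.
set.seed(7)
date       <- 1400:1995
recon      <- cumsum(rnorm(length(date), sd = 0.02))
half_width <- seq(0.5, 0.2, length.out = length(date))
hockey <- data.frame(
  date  = date,
  recon = recon,
  lower = recon - half_width,
  upper = recon + half_width,
  raw   = ifelse(date >= 1902, recon + rnorm(length(date), sd = 0.1), NA)
)

# Zero line: mean of the thermometer (raw) series
zero <- mean(hockey$raw, na.rm = TRUE)

# Smooth the bounds rather than the point estimates
f  <- 0.1   # smoothing span, an arbitrary value that needs tuning
lo <- lowess(hockey$date, hockey$lower, f = f)
hi <- lowess(hockey$date, hockey$upper, f = f)

# Proxy reconstruction in the background, smoothed bounds on top
plot(hockey$date, hockey$recon, type = "l", col = "grey60",
     ylim = range(hockey$lower, hockey$upper), las = 1,
     xlab = "Year", ylab = "Reconstructed departure (C)")
lines(lo, lwd = 2, col = "dodgerblue3")
lines(hi, lwd = 2, col = "dodgerblue3")
abline(h = zero, lty = 2)   # mean of the thermometer data
```

A smaller `f` makes the smoothed bounds follow the raw bounds more closely, while a larger one gives a cleaner envelope but lets more of the reconstruction wander outside it.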
