Replication Problems: Mannian Verification Stats

If anyone feels like sticking needles in their eyes, I’d appreciate assistance in trying to figure out Mannian verification statistics. Even when Mann posts up his code, replication is never easy since they never bothered to ensure that the frigging code works. Or maybe they checked to see that it didn’t work. UC’s first post on the matter wondered where the file c:\scozztemann3\newtemp\nhhinfxxxhad was. We still have no idea. This file is referred to in the horrendously written verification stats program and it may be relevant.

With UC’s help, I’ve been able to replicate quite a bit of the CPS program (the EIV module remains a mystery).

I’ve been testing verification stats with the SH iHAD reconstruction. I mentioned previously that Mannian splicing does not always use the larger proxy networks: if a smaller network yields “better” RE stats, the additional proxies are ignored. This Mannian piece of cherry picking is justified in the name of avoiding “overfitting”, although it is actually just the opposite. It reminds me of the wonderful quote from Esper 2003 (discussed here):

this does not mean that one could not improve a chronology by reducing the number of series used if the purpose of removing samples is to enhance a desired signal. The ability to pick and choose which samples to use is an advantage unique to dendroclimatology.

Mining promoters would like a similar advantage, but, for some reason, securities commissions require mining promoters to disclose all their results.

Mann’s reconstruction archive in 2008, as with MBH98, only shows spliced versions – some habits never change, I guess. But in the SH iHAD run, the AD1000 network remains in use right through to the 20th century, with all proxies starting later than AD1000 being ignored – all in the name of not “overfitting”. But the long run of values from a consistent network is very handy for benchmarking and, with much help from UC’s Matlab runs, I’ve managed to very closely replicate the SH iHAD reconstruction from first principles, as shown below – this graphic compares a version archived at Mann’s FTP site with my emulation.

For comparison, here is an excerpt from Mann SI Figure S5d (page 11), which has an identical appearance.

You can download an original digital version of this reconstruction (1000-1995) as follows:

url="http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/data/reconstructions/cps/SH_had.csv"
had=read.csv(url)
temp=!is.na(had[,2])
estimate=ts(had[temp,2],start=min(had[temp,1]))

A digital version of the “target” instrumental is also at Mann’s website and can be downloaded as follows:

url="http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/data/instrument/iHAD_SH_reform"
target=read.table(url)
target=ts(target[,2],start=1850,end=1995)

The reported verification statistics for the SH iHAD reconstruction are also archived and can be downloaded as follows (load the package indicated). BTW this is a nice package for reading Excel sheets into R.

library(xlsReadWrite)
url="http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/cps-validation.xls"
download.file(url,"temp.xls",mode="wb")
test=read.xls("temp.xls",colNames=TRUE,sheet=14,type="data.frame",colClasses="numeric")
count=apply(!is.na(test),2,sum);count
temp=!is.na(test[,1])
stat=test[temp,!(count==0)]
name1=c("century", c(t( outer(c("early","late","average","adjusted"),c("RE","CE","r2"), function(x,y) paste(x,y,sep="_") ) )) )
names(stat)=name1[1:ncol(stat)]
stat[stat$century==1000,]
# century early_RE early_CE early_r2 late_RE late_CE late_r2 average_RE average_CE average_r2 adjusted_RE adjusted_CE adjusted_r2
# 1000 0.0746 -1.663 0.3552 0.7194 0.1475 0.303 0.397 -0.758 0.3291 0.397 -0.758 0.3291

Given digital versions of the reconstruction and the “target”, it should be simplicity itself to obtain standard dendro verification statistics. But, hey, this is hardcore Team. First, Mann does some Mannian smoothing of the instrumental target. Well, we’ve managed to replicate Mannian smoothing and can follow him through this briar patch.

library(signal) # used for smoothing and must be installed
source("http://data.climateaudit.org/scripts/mann.2008/utilities.txt")
cutfreq=.1;ipts=10 #ipts set as 10 in Mann lowpass
bf=butter(ipts,2*cutfreq,"low");npad=1/(2*cutfreq);npad
smooth=ts( mannsmooth(target,M=npad,bwf=bf), start=1850)
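For readers who don’t want to pull in the Climate Audit utilities, the padded zero-phase lowpass can be illustrated with a minimal Python sketch. A first-order recursive filter stands in here for the 10-pole Butterworth, and the reflection padding is schematic, so treat this as an illustration of the technique, not a port of mannsmooth:

```python
def lowpass_zero_phase(x, alpha=0.2, npad=5):
    """Zero-phase lowpass: reflect-pad, filter forward, filter backward, trim.

    A first-order recursive filter y[t] = alpha*x[t] + (1-alpha)*y[t-1]
    stands in for the 10-pole Butterworth; running it in both directions
    cancels the phase lag, as filtfilt does.
    """
    # reflect the series about its endpoints to soften edge effects
    padded = x[npad:0:-1] + list(x) + x[-2:-npad - 2:-1]

    def onepass(seq):
        out = [seq[0]]
        for v in seq[1:]:
            out.append(alpha * v + (1 - alpha) * out[-1])
        return out

    smoothed = onepass(onepass(padded)[::-1])[::-1]  # forward, then backward
    return smoothed[npad:npad + len(x)]
```

The npad=1/(2*cutfreq) padding in the R code above plays the same role as the reflection padding here: it gives the filter something beyond the endpoints to work with.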

Now the “early-miss” verification stats, using a simple (and well-tested) program to do the calculations:

rbind( unlist(stat[stat$century==1000, grep("early",names(stat))]),
  unlist(verification.stats(estimator=estimate,observed=smooth,calibration=c(1896,1995),verification=c(1850,1895))[c(2,5,4)]) )
# early_RE early_CE early_r2
#[1,] 0.0745930 -1.6633600 0.3551960
#[2,] 0.2883958 -0.8940804 0.1432888

And for the “late-miss” stats:
rbind( unlist(stat[stat$century==1000, grep("late",names(stat))]),
  unlist(verification.stats(estimator=estimate,observed=smooth,calibration=c(1850,1949),verification=c(1950,1995))[c(2,5,4)]) )
# late_RE late_CE late_r2
#[1,] 0.719441 0.1474790 0.3030280
#[2,] 0.804556 0.4111129 0.4549566

These should match the early_ and late_ values, but don’t. The inability to replicate the r2 values is particularly troubling, since these are not affected by the various scaling transformations. I simply haven’t been able to get the reported verification r2 values using many permutations.
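For reference, the verification statistics being computed above have simple textbook definitions: RE benchmarks the squared errors against the calibration-period mean, CE against the (tougher) verification-period mean, and r2 is the squared Pearson correlation. Here is a generic Python sketch of those definitions (not a port of the verification.stats function used above):

```python
def verification_stats(obs_cal, obs_ver, est_ver):
    """RE, CE and r2 over a verification period.

    RE = 1 - SSE/SS(dev. from calibration mean): RE > 0 beats a
    'climatology' forecast of the calibration mean.
    CE = 1 - SSE/SS(dev. from verification mean): a stricter benchmark.
    r2 is the squared Pearson correlation, insensitive to scale/offset.
    """
    n = len(obs_ver)
    sse = sum((o - e) ** 2 for o, e in zip(obs_ver, est_ver))
    cal_mean = sum(obs_cal) / len(obs_cal)
    ver_mean = sum(obs_ver) / n
    re = 1 - sse / sum((o - cal_mean) ** 2 for o in obs_ver)
    ce = 1 - sse / sum((o - ver_mean) ** 2 for o in obs_ver)
    est_mean = sum(est_ver) / n
    cov = sum((o - ver_mean) * (e - est_mean) for o, e in zip(obs_ver, est_ver))
    var_o = sum((o - ver_mean) ** 2 for o in obs_ver)
    var_e = sum((e - est_mean) ** 2 for e in est_ver)
    r2 = cov * cov / (var_o * var_e)
    return re, ce, r2
```

The scale-invariance of r2 is exactly what makes its irreproducibility the troubling part: no amount of rescaling of the target should change it.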

Since the reconstruction ties to both the digital and graphic versions, perhaps the archived instrumental version http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/data/instrument/iHAD_SH_reform is not the same as the c:\scozztemann3\newtemp\shhinfxxxhad used in the verification program.

The code for the verification stats is at
http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/code/codeveri/veri1950_1995sm.m and http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/code/codeveri/veri1850_1895sm.m. They seem to have learned their programming style from Hansen: the code is replete with steps that don’t seem to have any function, with unhelpful comments made less helpful in places by inaccuracy and, most of all, with an almost total lack of mathematical understanding and organization in its implementation.

43 Comments

  1. Jeff Alberts
    Posted Nov 25, 2008 at 10:00 PM | Permalink

    That Esper quote is a classic. Still amazes me how anyone can say such a thing with a straight face.

  2. Louis Hissink
    Posted Nov 26, 2008 at 1:08 AM | Permalink

    Esper’s quote makes my eyes glaze over, as I’m a mining type. I think it’s called “high grading” in mining lingo.

  3. Jean S
    Posted Nov 26, 2008 at 3:27 AM | Permalink

    on the matter wondered where the file c:\scozztemann3\newtemp\nhhinfxxxhad was.

    These nh/sh/gl l/h infxxx -files seem to be “high” and “low” -splits of the instrumental data. They are (at least some of them) prepared in the code
    http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/code/codeprepdata/doindexxx.m
    Before “Mannian filtering”, there seems to be some type of “standardization” (zero mean, unit variance in 1850-1995???). Maybe that’s causing a problem?

    • Posted Nov 26, 2008 at 4:49 AM | Permalink

      Re: Jean S (#3),

      line 203 of doindexinf.m is

      %save('/holocene/s1/zuz10/work1/temann/newtemp/nhhinfxxxihad','highf','-ascii')

      but I can’t find

      c:\scozztemann3\newtemp\nhhinfxxxhad

      from any m-file. Line 177 of veri1850_1895sm.m

      RE=1-ccw/aaw

      indicates that act1 indeed is instrumental reference, Mannian
      smoothed at line 58. Of course, comment

      %%%% Apply the zero-phase butterworth filter but only to the proxies

      at line 48 confuses me a bit 😉

      Reconstructions are then in

      t1=load(strcat('c:\scozztemann3\z1308\veri\nhrech\recon',num2str(istep)));
      t2=load(strcat('c:\scozztemann3\z1308\veri\nhrecl\recon',num2str(istep)));

      (lines 74-75), smoothed (again?) at line 123.

    • Posted Nov 26, 2008 at 5:24 AM | Permalink

      Re: Jean S (#3),

      Before “Mannian filtering”, there seems to be some type of “standardization” (zero mean, unit variance in 1850-1995???). Maybe that’s causing a problem?

      Yep,

      doindexxx.m standardizes temperatures at 23-28, then saves the result to

      /holocene/s1/zuz10/work1/temann/newtemp/stdtempn95infxxx

      the same file is then loaded at line 55, Mannian smoothed, and saved to /holocene/s1/zuz10/work1/temann/newtemp/nhhinfxxx and /holocene/s1/zuz10/work1/temann/newtemp/nhlinfxxx – no longer unit variance because of the smoothing.

  4. pete m
    Posted Nov 26, 2008 at 4:29 AM | Permalink

    Even when Mann posts up his code, replication is never easy since they never bothered to ensure that the frigging code works.

    lol you just made my day. I think that little frustration has been brewing for just a couple of years.

    I guess Mann just doesn’t like people to replicate his work – strange behaviour for a scientist.

  5. John A
    Posted Nov 26, 2008 at 6:47 AM | Permalink

    Is it me or is Mann obsessed with smoothing?

    • Jean S
      Posted Nov 26, 2008 at 7:24 AM | Permalink

      Re: John A (#7),
      Well, actually the operation mentioned in #6 is not Mannian “smoothing”, but Mannian “splitting” 😉 The instrumental series is split into “high” and “low” parts using Mannian filtering. This operation is then undone (!!!) in the beginning of veri1850_1895sm.m when these two “splits” are added together to form the series “act1”, which, of course, needs to be Mannian smoothed (in other words, to get a new “low” split called “lowf” and assigned to “act1” after filtering) 🙂 If you followed me this far, you may ask what is the difference between “nhlinfxxxhad” and “lowf”. Well, “nhlinfxxxhad” is apparently obtained by Mannian smoothing with frequency = 0.05 whereas “lowf” has the frequency = 0.10. I wonder if these guys took part in the Obfuscated Matlab Code Contest 😉
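      The round trip described here is easy to see in a toy Python sketch: however the lowpass is defined, the “high” split is just the residual, so adding the two splits back together returns the original series exactly, which is what makes recombining them in veri1850_1895sm.m a no-op. A centered moving average stands in for the Mannian filter here:

```python
def split_high_low(x, window=5):
    """Split a series into a 'low' part (a centered moving average standing
    in for the Mannian lowpass) and a 'high' part (the residual)."""
    half = window // 2
    low = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        low.append(sum(x[lo:hi]) / (hi - lo))
    high = [v - l for v, l in zip(x, low)]
    return low, high

series = [0.3, 1.2, -0.4, 0.8, 2.1, 1.7, 0.2, -0.9, 0.5, 1.1]
low, high = split_high_low(series)
# by construction the split is lossless: low + high recovers the input
recombined = [l + h for l, h in zip(low, high)]
```

      The only way the splits fail to recombine exactly is if they were produced with different filter settings, which is just what the 0.05/0.10 frequency confusion suggests.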

      • Posted Nov 26, 2008 at 8:04 AM | Permalink

        Re: Jean S (#9),

        Well, actually the operation mentioned in #6 is not Mannian “smoothing”, but Mannian “splitting”

        Oh, that’s true, sorry. I thought they are double-smoothing, but I didn’t read the code carefully enough 😉

        Comments such as

        stdkeepers=stdtemps95infxxx; % because standardization have done in doannsst.m

        won’t help in understanding what is going on 😉

  6. Posted Nov 26, 2008 at 7:17 AM | Permalink

    Smoothing annihilates* information.

    *ORIGIN late Middle English (originally as an adjective meaning [destroyed, annulled] ): from late Latin annihilatus ‘reduced to nothing,’ from the verb annihilare, from ad- ‘to’+ nihil ‘nothing.’ The verb sense [destroy utterly] dates from the mid 16th cent.

  7. Steve McIntyre
    Posted Nov 26, 2008 at 7:56 AM | Permalink

    UC or Jean S, I presume that, after all the pointless Mannianisms, one should end up with a smooth of 0.1 (which is what I used, after scratching my head). But that doesn’t yield the reported values.

    In addition to the above pointless Mannianisms, the program veri1950_1995sm (which really should only be a few lines) has any number of other feints to wrongfoot the unwary. It calculates a Butterworth filter in the line butter(5,lowlim,'low') but doesn’t use it anywhere that I can see. It includes the same comment “smooth instrumental series” for both reconstructions and instrumental.

  8. Steve McIntyre
    Posted Nov 26, 2008 at 8:32 AM | Permalink

    OK, the following function should replicate doindexxx.m. Requires package signal and function mannsmooth (which replicates lowpassmin).

    mannsplit=function(x,frequency=0.05,reference=c(1850,1995),ipts=10) {
    index=(reference[1]:reference[2])-tsp(x)[1]+1
    rescaled=(x-mean(x[index]))/sd(x[index]) #standardize over reference period, per doindexxx.m
    bf=butter(ipts,2*frequency,"low")
    npad=1/(2*frequency) #was 1/(2*cutfreq); cutfreq is not defined inside the function
    lowf=mannsmooth(rescaled,M=npad,bwf=bf) #smooth the standardized (not raw) series
    highf=rescaled-lowf
    mannsplit=list(x=rescaled,lowf=lowf,highf=highf)
    mannsplit}

    More later.

  9. Steve McIntyre
    Posted Nov 26, 2008 at 8:35 AM | Permalink

    #7. All this smoothing runs into one of the Santer/Nychka issues – the degrees of freedom falls with smoothing. Mann purports to allow for autocorrelation but simply “guessed” (as far as I can tell) an autocorrelation factor, which is far less than observed in the data.
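    The degrees-of-freedom point can be made concrete with the standard AR(1) adjustment n_eff ≈ n(1-ρ)/(1+ρ), where ρ is the lag-one autocorrelation. A minimal Python sketch of the textbook estimate (not Mann’s actual adjustment):

```python
def lag1_autocorr(x):
    """Sample lag-one autocorrelation of a series."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(n - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den

def effective_n(x):
    """Effective sample size under an AR(1) error model:
    n_eff = n * (1 - rho) / (1 + rho).
    Smoothing drives rho toward 1, collapsing the degrees of freedom."""
    rho = lag1_autocorr(x)
    return len(x) * (1 - rho) / (1 + rho)
```

    Guessing a small autocorrelation instead of estimating it from the smoothed series inflates n_eff, and with it the apparent significance of the verification statistics.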

  10. Steve McIntyre
    Posted Nov 26, 2008 at 8:59 AM | Permalink

    Jean S and UC, I’ve got a theory as to what’s going on. Data is written into /holocene/s1/zuz10/work1/temann/newtemp/nhhinfxxx in BOTH programs (doindexxx.m and doindexinf.m):

    save('/holocene/s1/zuz10/work1/temann/newtemp/nhhinfxxx','highf','-ascii') % doindexxx.m

    save('/holocene/s1/zuz10/work1/temann/newtemp/nhhinfxxx','highf','-ascii') % doindexinf.m

    The program prepinputforrecon.m comments out doindexinf:

    % This is master code to prepare input data

    %doindexinf; % preparing high-f/low-f hemispherical land(iCRU) or land+ocean(iHAD) mean surface temperature
    % series based on our infilled global instrumental gridbox.

    doindexxx; % preparing high-f/low-f hemispherical land(CRU) or land+ocean(HAD) mean surface temperature
    % series from Brohab et al’s CRUTem3v/HadCRUT3v.

    MAKEPROXY; % preparing proxy data

    INPUTall; % preparing input data of reconstruction based on all proxy network

    INPUTscreen; % preparing input data of reconstruction based on screened proxy network

    Since it looks like a somewhat different instrumental target series has been used for the verification stats, I wonder if they’ve actually used the doindexinf.m version – the one where Mann foregoes using CRU hemisphere composites and calculates his own Mannian hemisphere averages. That’s sort of what he did in MBH98 and the leopard doesn’t seem to change his spots.

  11. Posted Nov 26, 2008 at 12:02 PM | Permalink

    Steve,

    the one where Mann foregoes using CRU hemisphere composites and calculates his own Mannian hemisphere averages.

    This one lost me a bit, I am aware of the 08 paper problems with r but real temps weren’t used for correlation in 98??

    Great Esper quote, pretty well sums it all up.

    • Jean S
      Posted Nov 26, 2008 at 3:03 PM | Permalink

      Re: Jeff Id (#15),
      Mann calculates his own target “NH mean” temperature series in MBH9X. It is based on retaining only a few principal components and on fewer gridcells than are actually available for most years (and the grid areal weighting is also wrong, if I recall correctly). So the target series is not the CRU NH mean temperature estimate available back then.

  12. Posted Nov 26, 2008 at 2:09 PM | Permalink

    I guess the easiest way to get all this running is to make C:\holocene\s1\zuz10\allproxy1209 etc. folders on one’s own computer. I almost got everything working with prepinputforrecon.m. But after a few minutes of computer crunching I get the error message

    ??? Undefined function or variable ‘clihybrid’.

    so I googled clihybrid.m, one hit, brings me back to CA, http://www.climateaudit.org/?p=3844#comment-300690 . LOL 🙂

  13. Robinedwards
    Posted Nov 26, 2008 at 2:37 PM | Permalink

    I’ve just downloaded http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/data/instrument/iHAD_SH_reform, as mentioned in Steve’s posting, and had a look at it graphically. The period covered is from 1849 to 2006 (from memory). Have other readers looked at it? For those who have not, it is clearly /not/ reasonable to believe that a simple linear model is adequate to describe it.

    Please remember that I am not addressing some fundamental aspects of the data – such as its reliability, its internal consistency, its archiving quality and its transparency. Questions may arise from any or all of these considerations. I am merely accepting the data as being someone’s attempt to publish genuine scientific information, and I am assuming diligent and honest endeavour on the part of the publisher of the data.

    The question then arises, “What is a plausible underlying model for this data set?”. I propose that to hypothesise a simple linear model is to ignore some blatantly obvious properties of the data.

    Several simple models could be put forward. My initial choice is a dummy variable model having three dummies (or groups). The first ends at 1920, the next at 1970 and the last at 2006. These subgroups (and there are many more probably equally reasonable group divisions that could be proposed), when applied to the data, show a decreasing trend for the first group, then a step change to an effectively stable regime, with the final group showing a steep rising trend to which it would be difficult to ascribe a second order effect.

    It’s not difficult to produce confidence intervals for the fitted plot, which naturally has a far lower residual mean square than the simple regression. After all, I’ve been seriously judgemental with my model!

    What is striking is that the data “appear” to exhibit points of very rapid change (either a step, or a change from stability to a regime of very marked and sustained temperature increase).

    One feels that /something/ fundamental must underlie these rather simple observations. Could it be a real-world effect, or is it an artifice of the data generation and reporting technology?

    We are in need of more erudite comment and perhaps investigation by Steve, I think! It looks hockey-stick like, but starting in 1970 or ’71. MM would like that, but then, it’s his data.

    Robin

  14. Jean S
    Posted Nov 26, 2008 at 2:53 PM | Permalink

    Steve, with your emulation, could you produce a figure similar to S11(A) but for SH and iHAD?

    BTW, the verification stats for SH iHAD (“SH full IHAD”-sheet in cps-validation.xls) do change after AD1000. So am I missing something, or is it the case that although the AD1000 network remains fixed in the actual reconstruction, the proxy networks change in the “late” and “early” “validation” reconstructions?? This would be strange even by Mannian standards.

  15. Steve McIntyre
    Posted Nov 26, 2008 at 3:22 PM | Permalink

    #19. Not quite. In MBH98, Mann calculates averages for his “dense” (1084 gridcells) and “sparse” networks (219 cells) and uses these as targets. Even I got a bit worn out in this bramble patch and didn’t quite figure out the purpose of using these rather than usual CRU hemispherical results, but I presume that it “improved” his results a little.

    #18. Yes. (At least I’m very close to being able to do so.)

    #18. My guess is that the changes derive from a point that UC mentioned before. In each recon step, Mann re-applies Mannian smoothing with the truncated network. Because the Butterworth filter has a frequency component to it, when you chop off some early data e.g. in going from an AD1000 start to an AD1100 start, that doesn’t just change the smoothed series at the early end, it also changes it a little at the closing end. So the verification stats change a little, even though the network itself is unchanged. Yes, it’s an excellent Mannianism. 🙂
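    The truncation effect is easy to demonstrate: in a zero-phase (forward-backward) recursive filter, the forward pass carries memory of the earliest values all the way to the end of the series, so dropping early data perturbs the smoothed values at the closing end as well. A toy Python sketch, with a first-order recursive filter standing in for the Butterworth:

```python
import math

def zero_phase(x, alpha=0.2):
    """First-order recursive filter applied forward then backward (zero phase)."""
    def onepass(seq):
        out = [seq[0]]
        for v in seq[1:]:
            out.append(alpha * v + (1 - alpha) * out[-1])
        return out
    return onepass(onepass(x)[::-1])[::-1]

# the same series, with and without its first 10 points
x = [math.sin(0.3 * i) + 0.1 * i for i in range(60)]
full = zero_phase(x)
trunc = zero_phase(x[10:])
# the closing values differ even though the late data are identical
end_shift = abs(full[-1] - trunc[-1])
```

    With Mannian padding rather than this toy filter the exact size of the end-effect differs, but the principle is the same: re-smoothing a truncated network produces slightly different late values.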

  16. Jean S
    Posted Nov 26, 2008 at 3:41 PM | Permalink

    #20: Hmmm… hard to believe that the end-point condition would cause such big changes. Especially when it does not seem to cause the same thing in the early steps. Also, there are separate columns under the title “adjusted to overfitting” which remain unchanged after AD1100 (i.e. are identical to the AD1100 “average”).

  17. Steve McIntyre
    Posted Nov 26, 2008 at 4:53 PM | Permalink

    #21. There’s something else that I’m checking on. The networks in “late miss” and “early miss” ‘experiments’ do not appear to be the same. The look-up table in gridproxy.m is different in the two cases.

  18. Steve McIntyre
    Posted Nov 26, 2008 at 5:04 PM | Permalink

    #22. In the AD1000 network, there are 25 proxies in each of the networks, but no two of the networks are the same.

    In total, 31 proxies are selected into one or other of the “early miss”, “late miss” or “whole” networks, with a different 6 excluded from each of the networks as used. This methodology looks like another bit of opportunistic selection that is not allowed for in the “significance” tests.

    There’s a possibility that I can tie down the verification stats by running these weird network variations – I’ll see.

  19. Steve McIntyre
    Posted Nov 26, 2008 at 7:07 PM | Permalink

    I’ve tried another couple of ways of getting Mannian verification stats without success.

    If one applies the masks “enough” and “landmask” to the infilled temperature data, one gets the Mannian iCRU and iHAD versions – these are the infilled Mannian versions of CRU and HAD. The archived versions of iCRU and iHAD at http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/data/instrument exactly match the results calculated from the http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/data/instrument/infilled2poles_1850.mat – so this doesn’t help the replication of SH_iHAD verification statistics.

    The early-miss, late-miss and whole versions of SH_iHAD AD1000 are a lot different (due to presence/absence of individual proxies). The difference is illustrated below and this is interesting in other contexts. However, none of the versions yield the reported verification stats. It looks like another Mannian snare for the unwary.

    • Jean S
      Posted Nov 27, 2008 at 5:31 AM | Permalink

      Re: Steve McIntyre (#24),
      Interesting. “Early-Miss” is way out of their “self consistent uncertainties” for the period of interest (1000-1099), being almost 0.5C above. And this is obtained by simply using a shorter instrumental period for calibration. This absurd situation is also observed in Figure S11. In S11a, there is about 1C (!!!) difference between “Early” and “Late” reconstructions in the early steps… How can anyone take these seriously?

  20. Posted Nov 27, 2008 at 2:29 AM | Permalink

    Dr. Mann, could you please update http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/code/codeprepdata/ with clihybrid.m ? ( Are we the first ones who tried to run this code? )

    • Posted Dec 1, 2008 at 1:10 AM | Permalink

      Re: UC (#25),

      http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/code/codeprepdata/Readme.txt :

      ***Correction (29 Nov 2008): clihybrid.m added to directory.

      Thanks, Dr. Mann. I can now run prepinputforrecon.m, but I had to manually add folders such as

      C:\holocene\s1\zuz10\work1\temann\zzrecon1209\nhglfulihad\highf ,

      Do you want me to make a turn-key version? Something like I did with the hockey stick? Or would you prefer a GUI?

      • Jean S
        Posted Dec 1, 2008 at 4:37 AM | Permalink

        Re: UC (#39),
        Ah, frequency 0.05 was used this time. Can anyone keep track of which smoothing (frequency 0.05/0.1) is used in which operation? Especially, how were the final series used to calculate the verification stats in veri1850_1895sm.m actually smoothed?

  21. Imran
    Posted Nov 27, 2008 at 5:21 AM | Permalink

    Steve
    Nothing to do with this post, but I was wondering if you had ever turned your attention to the statistics associated with surveys. I was reading about a recently released survey done in partnership with HSBC and how Nicholas Stern was stating this was proof of a global mandate to get politicians to act on climate change.

    http://news.bbc.co.uk/2/hi/science/nature/7748247.stm

    I had a look at some of the details which can be reached though HSBC main web page (www.hsbc.com)

    When looking at the data, apart from the obvious question as to whether a survey of 11,000 respondents makes a ‘mandate’, a striking conclusion is that, among those surveyed, there has been a 30% drop in those who would change their lifestyle compared to 18 months ago – something which doesn’t get reported at all. I was just interested in whether you had ever analysed this kind of statistical data – from a conclusions point of view, or from a process point of view (eg. how did they pick the respondents etc).

  22. Steve McIntyre
    Posted Nov 27, 2008 at 8:26 AM | Permalink

    #27. Speaking of “self-consistent uncertainties”:

    Mann et al 2008:

    Uncertainties were estimated from the residual decadal variance during the validation period based (32 – Mann et al JGR 2007, 42 – Wahl and Ammann, 2007) on the average validation period r^2 (which in this context has the useful property that, unlike RE and CE, it is bounded by 0 and 1 and can therefore be used to define a “fraction” of unresolved variance).

    Mann et al JGR 2007:

    Uncertainties were diagnosed from the variance of the verification residuals as in M05 – [Mann et al J Clim 2005].

    Mann et al 2005:

    We therefore also used a highly conservative estimate of unresolved variance provided by 1- r^2 (along with the more conventional RE) to estimate statistical uncertainties as conservatively as possible.

    Although MBH99 uncertainties remain a complete mystery, MBH98 residuals can be derived from calibration r^2 as previously discussed here noting an analysis by Jean S:

    Jean S. re-opened the matter by sending me the following graph (slightly redrawn here by me) showing a link between MBH98 confidence intervals in each step and the calibration r^2 statistic (described by Mann as the calibration beta statistic). Jean S estimated the calibration sigma using the archived calibration r^2 statistics using the formula:

    sigma.hat = sqrt( (1 - r^2[calibration]) * var(instrumental) )

    However, if one applies this formulation to the present case with verification r^2, it doesn’t give anything remotely resembling the reported uncertainties. I sent an email to Mann et al 2008 reviewer, Gerald North, about this and he told me to “move on”, sort of like a traffic cop, I guess.
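    The reverse-engineered formula is simple enough to state in code; a minimal Python sketch, using the population variance for illustration (whether n or n-1 was used is one of the unknowns):

```python
import math

def mbh_sigma(r2_cal, instrumental):
    """Calibration sigma implied by Jean S's reverse-engineered formula:
    sigma.hat = sqrt((1 - r2) * var(instrumental))."""
    n = len(instrumental)
    m = sum(instrumental) / n
    var = sum((v - m) ** 2 for v in instrumental) / n  # population variance
    return math.sqrt((1 - r2_cal) * var)
```

    As noted above, plugging the verification r^2 into this formula does not give anything resembling the reported Mann et al 2008 uncertainties.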

    • Posted Nov 27, 2008 at 11:05 AM | Permalink

      Re: Steve McIntyre (#28),

      to estimate statistical uncertainties as conservatively as possible.

      This hand-waving is funny. Like Hegerl in reply to Schneider,

      We account for uncertainty in temperature reconstructions as fully as possible.

      And then, they are like they’ve never heard of Brown / Sundberg work. La la la, I can’t hear you.

    • Spence_UK
      Posted Nov 28, 2008 at 8:23 AM | Permalink

      Re: Steve McIntyre (#28),

      We therefore also used a highly conservative estimate of unresolved variance provided by 1- r^2 (along with the more conventional RE) to estimate statistical uncertainties as conservatively as possible.

      I wonder if he could shoehorn the term “conservative” in there a few more times. If not, does this count as unprecedented conservativeness?

      Since the hockey team like to test their verification statistics on pathological cases rather than more conventional methods, I wonder if Mann would be kind enough to demonstrate the consequences of his CI calculation on the anecdotal reconstruction examples in Rutherford 2005? He must approve of those examples as he has relied on them before.

  23. GeneII
    Posted Nov 27, 2008 at 12:58 PM | Permalink

    If anyone feels like sticking needles in their eyes, I’d appreciate assistance in trying to figure out Mannian verification statistics.

    “Do not worry about your difficulties in mathematics. I can assure you mine are still greater.”
    ~Einstein

  24. UK John
    Posted Nov 27, 2008 at 2:49 PM | Permalink

    If Steve and UC et al. cannot make it work, then it doesn’t!

    Perhaps this is the paper you should do

    “Replication Problems: Mannian Verification Stats”

    Abstract:- “it’s like poking your eyes out with needles”, the observational behaviour of frustrated statisticians!
    The paper gives comment and goes to prove how daft the world can be.

  25. Steve McIntyre
    Posted Nov 27, 2008 at 3:30 PM | Permalink

    I think that one part of this conundrum can be resolved. Although the program http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/code/codeprepdata/doindexxx.m says that a frequency of 0.05 is used,

    frequency = 0.05;
    [smoothed0,icb,ice,mse0] = lowpassmin(data,frequency);

    collateral information indicates that a frequency of 0.1 was used for this step.

    A smoothed instrumental series in the matrix NH.land.mat matches freq 0.1 but not freq 0.05. Plus freq 0.1 corresponds to a Figure as shown below.

  26. GeneII
    Posted Nov 27, 2008 at 9:03 PM | Permalink

    I know this is completely off topic (my apologies for that) and it will probably be deleted. But this is the first positive economic news I’ve seen and I wanted you fellas to see it too.
    Renowned economic pessimist Nouriel Roubini approves of Obama’s picks,…

    Newsweek: Your view of the economic future is often a bit less than optimistic. What does Obama’s team signal about what could be coming?
    Roubini: Look, he wants to get things done, so he’s choosing a really terrific team.

  27. Poha
    Posted Nov 28, 2008 at 8:59 AM | Permalink

    Roger Bacon (1214-1294):
    “Quatuor vero sunt maxima comprehendendæ veritatis offendicula, quæ omnem quemcumque sapientem impediunt, et vix aliquem permittunt ad verum titulum sapientiæ pervenire: videlicet fragilis et indignæ auctoritatis exemplum, consuetudinis diuturnitas, vulgi sensus imperiti, et propriæ ignorantiæ occultatio cum ostentatione sapientiæ apparentis.”
    (There are four barriers blocking the road to truth: submission to unworthy authority, the influence of custom, popular prejudice, and concealment of one’s ignorance with a technical show of wisdom.)

  28. theduke
    Posted Nov 28, 2008 at 9:55 AM | Permalink

    “We are not sceptical enough about the data.”

    I guess she’s not made it to Climate Audit.

    http://www.smh.com.au/news/opinion/miranda-devine/beware-the-church-of-climate-alarm/2008/11/26/1227491635989.html

  29. RLGrin
    Posted Nov 28, 2008 at 10:11 AM | Permalink

    I wouldn’t be surprised if Mann et al intentionally published bad data to send everyone on a wild goose chase. For what it’s worth: DON’T trust – verify everything.

  30. Steve McIntyre
    Posted Dec 1, 2008 at 2:04 PM | Permalink

    #40. Yes, I can confirm that 0.05 smoothing was used in newtemp. As others have observed, it’s pretty hard to tell, because the frequency is set to both 0.05 and 0.1 at different points in the doindexxx.m program. This split seems to get undone later in the CPS operations, but perhaps not in the EIV operations. What a mess.

    For reference: in the newtemp/ directory,
    HAD_NH_reform is identical to the version at Mann’s website;
    tempn95infxxx is identical once again, just under a new name;
    stdtempn95infxxx is the same series, re-scaled;
    nhlinfxxx is obtained by 0.05 smoothing of stdtempn95infxxx;

    and, mutatis mutandis, for SH and GL.

    Here’s a plot of their “target” smoothed temperature series. How many degrees of freedom in Nychka terms are in this sucker? I’d be surprised if it was out of the single digits.

  31. Posted Jan 13, 2009 at 9:21 AM | Permalink

    # century early_RE early_CE early_r2 late_RE late_CE late_r2 average_RE average_CE average_r2 adjusted_RE adjusted_CE adjusted_r2
    # 1000 0.0746 -1.663 0.3552 0.7194 0.1475 0.303 0.397 -0.758 0.3291 0.397 -0.758 0.3291

    At least the r2 problem can be solved now ( see http://www.climateaudit.org/?p=4833 )

    Here’s

    [e_recon(vals1:vale1) inst(vals1:vale1) l_recon(vals2:vale2) inst(vals2:vale2)]

    from calc_error.m, and

    load e_recon.txt
    >> corrcoef(e_recon(:,1),e_recon(:,2)).^2

    ans =

    1.0000 0.3030
    0.3030 1.0000

    >> corrcoef(e_recon(:,3),e_recon(:,4)).^2

    ans =

    1.0000 0.3552
    0.3552 1.0000

    which matches the archived values.

  32. Steve McIntyre
    Posted Jan 13, 2009 at 3:58 PM | Permalink

    #42. Excellent.

    UC, can you post up the full e_recon and l_recon series for this example?

  33. Posted Jan 14, 2009 at 2:26 PM | Permalink

    u=calc_error('SH','','iHAD');

    http://signals.auditblogs.com/files/2009/01/early1.txt

    http://signals.auditblogs.com/files/2009/01/late1.txt

    http://signals.auditblogs.com/files/2009/01/inst1.txt

One Trackback

  1. […] Even now work continues, attempting to make sense of Mann’s hodge-podge of code and data (see this and this).  Hence the “peer review” failed. Their review might as well consisted […]