Why did Steig use a cut-off parameter of k=3?

A question that Jean S inquired about before we were so rudely interrupted. The explanation in Steig et al was:

Principal component analysis of the weather station data produces results similar to those of the satellite data analysis, yielding three separable principal components. We therefore used the RegEM algorithm with a cut-off parameter k=3…. A disadvantage of excluding higher-order terms (k>3) is that this fails to fully capture the variance in the Antarctic Peninsula region. We accept this tradeoff because the Peninsula is already the best-observed region of the Antarctic.

I’ve sent an inquiry to one of the coauthors on what they mean by “separable principal components” and how they determined that there were three, as opposed to some other number. I would have thought that “significant” would be a more relevant term than “separable”, but we’ll see.

In the figure below, I show AWS trends for regpar = 1, 2, 3, 4, 5, 6, 16, 32. It turns out that 3 was a very fortuitous choice, as this proved to yield the maximum AWS trend, something that will, I’m sure, astonish most CA readers. For regpar=1, the trend was negative. (I can picture a line of argument as to why 1 is a better choice than 3 or 4, which I’ll visit on another day.) As k increased, the trend returned towards 0. Thus k, selected to be 3 no doubt from the purest of motives, yielded the maximum trend. I guess that was a small consolation for the bitter disappointment of failing to “fully capture the variance in the Antarctic Peninsula region” and it was definitely gracious of Steig and Mann to acquiesce in this selection under the circumstances.

ASW Trends under different regpar parameters (“RegEM Truncated PC”)

The graphic shows results for a method slightly varied from RegEM TTLS – let’s call it RegEM Truncated PC. I’ll explain the differences tomorrow. RegEM TTLS is a pig as regpar increases. RegEM TTLS yields rank-k reconstructions; “RegEM Truncated PC” also yields rank-k reconstructions, which were only about 1% different in benchmarks. For regpar 1-6, RegEM TTLS has a similar pattern, but so far we haven’t run RegEM TTLS with higher regpar values as it will be VERY slow. (Jeff Id is going to try.)
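For anyone who wants to experiment, here is a minimal R sketch of the truncated-PC idea (my own toy version, not Steig’s or Schneider’s code): take the SVD of a complete, centered anomaly matrix, keep the first k modes, and regress the average of the rank-k reconstruction on time. The real problem has missing values, which is what RegEM is for, so treat this strictly as an illustration of how the trend can depend on k.

trend_at_rank <- function(X, k) {
  # X: centered months x stations anomaly matrix (complete, no gaps)
  s  <- svd(X)
  Xk <- s$u[, 1:k, drop = FALSE] %*% diag(s$d[1:k], k) %*%
        t(s$v[, 1:k, drop = FALSE])    # rank-k approximation of X
  avg <- rowMeans(Xk)                  # average anomaly series
  yrs <- seq_len(nrow(X)) / 12         # time in years
  coef(lm(avg ~ yrs))["yrs"]           # trend in deg C per year
}
# e.g. sapply(c(1:6, 16, 32), function(k) trend_at_rank(X, k))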

I’ve got a bit of a personal interest as to why they excluded the PC4. Seems like something we’ve visited before, doesn’t it.

Feb 24 Update: Jeff Id reports:

I ran RegEM up to regpar=14; after that, the methods it uses to set up TTLS don’t work and the regpar is truncated back down. Speed wasn’t too bad (each iteration in the loop takes about 3 seconds), but the trend didn’t converge easily. After 100 iterations the high-order system hadn’t converged but was creeping closer. At regpar=8 it did converge, after about 65 iterations.

136 Comments

  1. Bugs
    Posted Feb 24, 2009 at 1:57 AM | Permalink

    No abuse or denigration of scientists here, no siree, sir.

    • Gerry Morrow
      Posted Feb 24, 2009 at 7:52 AM | Permalink

      Re: Bugs (#1), Bugs, a group of scientists who have put forward a theory that the world is warming, and that the reason it’s warming is MMGW, put forward a paper that purports to show that the Antarctic is indeed warming, using PCA with a cut-off parameter of k=3 and no explanation of why they chose this cut-off parameter. Subsequent analysis of the data shows that this particular cut-off parameter produces the result which best proves their case. Nothing wrong so far, except of course the lack of explanation for the choice of cut-off parameter. Even Inspector Clouseau of the Sûreté would have his suspicions raised at this amazing coincidence and would no doubt “Suspect nobody and suspect everybody”.

  2. David Wright
    Posted Feb 24, 2009 at 2:15 AM | Permalink

    “Separable” probably means that their eigenvalues are well-separated from the others. (E.g. in the list 105, 99, 93, 66, 59, 52, 45, etc., the first three values might be said to be well-separated from the others.) “Significant” is a loaded word with a precise meaning in statistics, whereas the choice of how many PCs to keep is unavoidably arbitrary.

    Of course, for precisely that reason one generally uses PCA only as a sort of heuristic guide, and in a rigorous analysis doesn’t throw away any components unless one has to. If your calculations of trend as a function of the number of retained components are correct, then the fact that the trend disappears as more components are retained is quite remarkable, and the fact that they chose exactly the number that maximizes the trend is quite suspicious.

  3. Kohl Piersen
    Posted Feb 24, 2009 at 3:29 AM | Permalink

    Bugs comments –

    No abuse or denigration of scientists here, no siree, sir.

    A bit of sarcasm is “abuse” or “denigration” ??

    Honestly, you’d have to be something of a marshmallow to be too worried about this!

  4. Kohl Piersen
    Posted Feb 24, 2009 at 3:31 AM | Permalink

    And another thing – by comparison with what passes for comment at certain warming blogs, this is insignificant. Don’t you think?

  5. Chris JH
    Posted Feb 24, 2009 at 3:47 AM | Permalink

    No abuse or denigration of scientists here, no siree, sir.

    Yet another comment devoid of technical content. Do you have anything substantive to say on the choice of three PCs? I for one would love to see a detailed answer to Steve’s question, perhaps you could provide one?

  6. Posted Feb 24, 2009 at 4:07 AM | Permalink

    I’d love to know why r=4 was not significant and r=2 was significant…just for my peace of mind.

  7. rafa
    Posted Feb 24, 2009 at 4:12 AM | Permalink

    I did not follow the previous posts about the Steig paper, so my apologies if my question has already been answered. Years ago I read about stopping rules (Bartlett, etc.) to determine the number of principal components; is this discussed in the Steig paper?

  8. Geoff Sherrington
    Posted Feb 24, 2009 at 5:24 AM | Permalink

    Diverting for a moment to the satellite data analysis, I can remember what it was like in the early days. Landsat 1 (launched 1972), well after the Steig sequence starts, used 6-bit spectral channel recording. That is, the range of any colour, including the IR channel, was split into 64 bins. Reconstructing temperatures from this to give differences of 0.1 deg Celsius, today, has to be suspect. Maybe they did not use Landsat, but it is indicative of the technology. Earlier satellites included the Nimbus and Tiros series (first one 1959). When sending streams of digital data to Earth, available technical resources were sorely strained and the amount of data transmission was restricted to a bare minimum.

    My son, recalling Landsat imagery work, writes of “data at Curtin University about 1980 which had 30% of the pixels being white or black. It was noise from faulty detectors, transmission losses, and magnetic tape failures. One Landsat had a defective detector strip. I would then stitch the data to maintain the correct geographical position and note in the metadata file on the tape header and footer to NOT USE these areas. Who has read the metadata files?

    “I never saw a Landsat image that was 100% correct.

    “I note the number of people with impressive titles now, but considering the social climate at the time – with rumours of deliberate downgrading of capability for missile accuracy purposes – ask the people who sat laboriously cleaning these images if they represent “reasonable quality data?”

    The suggestion is that before investment of too much quality statistical effort into early satellite figures, it might be rewarding to revisit their accuracy.

  9. Posted Feb 24, 2009 at 5:59 AM | Permalink

    Well, the graph really explains why k=3 was chosen. Climate scientists are apparently more predictable than the climate.

    Congratulations on your website running again! 😉

  10. dearieme
    Posted Feb 24, 2009 at 6:11 AM | Permalink

    “..something that will, I’m sure, astonish most CA readers.” Not so much astonished, more flabbergasted, or even gobsmacked.

  11. curious
    Posted Feb 24, 2009 at 6:25 AM | Permalink

    Re: Geoff at 8 – On another thread (Deconstructing the Steig AWS Reconstruction by Phil comment 207 at
    February 11th, 2009 at 2:29 pm) there was reference to the Comiso paper:

    “Variability and Trends in Antarctic Surface Temperatures from In Situ and Satellite Infrared Measurements
    JOSEFINO C. COMISO”

    concluding an accuracy of 3 deg C rms for more recent satellite IR data, 1979 to 1998. In the paper trends are quoted to 3 decimal places, with tolerance ranges also to 3 decimal places. Am I right in thinking this depends on the assumptions that the inaccuracy in the measures remains constant with time and that the available resolution is good to 3 decimal places? Sorry if this is a naive/basic question. I looked out for a response on that thread but didn’t catch it; any thoughts appreciated over there, especially, if this is the underlying assumption, on how likely it is to be correct. Thanks

    Also from a quick skim of that thread some of the images have gone from comment 210 Jeff C.:
    February 11th, 2009 at 8:07 pm.

    Good to see CA back online again – will make a contribution to WU.

  12. Craig Loehle
    Posted Feb 24, 2009 at 7:52 AM | Permalink

    Here’s a guess as to why 3 PCs. If you are looking for the strongest “signal”, then 3 gives it. The fact that this might be spurious escapes those doing data dredging.

  13. Steve Geiger
    Posted Feb 24, 2009 at 8:00 AM | Permalink

    Principal component analysis of the weather station data produces results similar to those of the satellite data analysis, yielding three separable principal components.

    Does that portion indicate some sort of ‘testing’ of the AWS outcome vs. the satellite data to choose the best fit? If so, would not this be an acceptable approach?

  14. Patrick M.
    Posted Feb 24, 2009 at 8:05 AM | Permalink

    but so far we haven’t run RegEM TTLS with higher regpar values as it will be VERY slow

    This is their out. Not only will they claim they didn’t have enough computing power to calculate PC4, they will then suggest that if their research budget were bigger they would have calculated more PCs.

    😉

    • Patrick M.
      Posted Feb 24, 2009 at 8:07 AM | Permalink

      Re: Patrick M. (#15),

      To clarify: “their” = Steig et al, (and yes I realize the quote was Steve not Steig).

  15. Spence_UK
    Posted Feb 24, 2009 at 8:15 AM | Permalink

    Hmm. I need to tread carefully in what I say here, but I just came across this link on another blog, and the cartoon within seemed strangely apt:

    http://www.gocomics.com/nonsequitur/2009/02/24/

    The only thing is… shhh… is that I saw the link on stroppy biologist PZ Myers’ blog, y’know, the one who threw all his toys out of the pram about Steve’s blog, then subsequently admitted that he didn’t know anything about CA and refused to visit it. It is just sooo fitting that he should post that up today.

  16. Edward
    Posted Feb 24, 2009 at 9:03 AM | Permalink

    snip – as a matter of editorial policy, I do not permit attempts to debate AGW from first principles on every thread or every thread would be identical.

  17. bender
    Posted Feb 24, 2009 at 9:20 AM | Permalink

    Like I said, they used Prunuspicker’s rule.

  18. Kenneth Fritsch
    Posted Feb 24, 2009 at 9:25 AM | Permalink

    It would appear that the differences between the AWS trends resulting from the selection of regpar parameters 2, 3 and 4 are not great, but I am wondering, based on my own dabbling with 95% CIs, whether choosing 3 gives a trend statistically significantly different from zero whilst using 2 and 4 does not.
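    For what it’s worth, the check is a few lines of R on a monthly average series avg from one of the runs (plain OLS; a serious version would widen the CI for autocorrelation):

    yrs <- seq_along(avg) / 12
    fit <- lm(avg ~ yrs)
    confint(fit, "yrs", level = 0.95)  # "significant" if the interval excludes zero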

  19. bender
    Posted Feb 24, 2009 at 9:26 AM | Permalink

    “Separability” is a subjective term that connotes both (1) regionally contrasting patterns of spatial loading, and (2) independence of the components in terms of mechanistic interpretation. To some extent all orthogonal decompositions are going to lead to “separable” components, so it is a little bit misleading. But like I say, it’s a subjective term. It’s prose. Nothing really turns on it.

  20. Ross McKitrick
    Posted Feb 24, 2009 at 9:54 AM | Permalink

    Separability is a mathematical term that refers to independence of partial derivatives with respect to subgroups of variables in a multivariate function. It is used in economic modeling a lot. If, for instance, a firm’s cost function is defined as a function of thousands of input prices, the demand equation for input X can be simplified by assuming it is a function of the price of X itself, as well as price indexes of a few related groups, and is otherwise independent of all the other prices. The various kinds of independence lead to refinements like weak separability, strong separability, quasi-separability etc. PCs are orthogonal to each other, but I don’t know what ‘separability’ means in this context.

    I agree with Kenneth that while k=3 maximizes the trend, the result is pretty stable across k=2-6. But the reversal for k=1 merits some comment. Also, as k goes up to 16, 32 etc. you’re approaching the full rank, so it’s odd that the supposed signal vanishes. I seem to remember some other controversy where observing an emergent signal as k increased led to great huffing and puffing about the evils of truncating k too soon.

  21. Mark T
    Posted Feb 24, 2009 at 10:02 AM | Permalink

    From PC3 down to PC6 they’re all unphysical, so in that sense there are really only 2 legit PCs. Half of each of the remaining PCs is an artifact of their infilling for the early portion of the recon. Since the data is obviously non-stationary, the recon (decomp) should be done in blocks at least.

    Mark

  22. Steve McIntyre
    Posted Feb 24, 2009 at 10:38 AM | Permalink

    I’ve identified a possible interpretation of “separable” EOFs in local dialect.

    Smith et al 1996 cited by Tapio Schneider 2001 in a different context says:

    Computing EOFs from the OI anomalies for the tropical Pacific, the first mode accounts for 43.7% of the variance while the second accounts for 10.0%. Only these modes (see North et al 1983) are well separated from other modes. ..The remaining tropical Pacific modes are likely degeneratively mixed with variances that begin at 4.3% (mode 3) and gradually decrease. Although the changes after mode 2 are small, the flat areas of the eigenvalue curve help group eigenvalues that are similar.

    (Gerry) North et al 1982 stated:

    An obvious difficulty arises in physically interpreting an EOF if it is not even well-defined intrinsically. This can happen for instance if two or more EOFs have the same eigenvalue. It is easily demonstrated that any linear combination of the members of the degenerate multiplet is also an EOF with the same eigenvalue. Hence in the case of a degenerate multiplet one can choose a range of linear combinations which .. are indistinguishable in terms of their contribution to the average variance…Such degeneracies often arise from a symmetry in the problem but they can be present for no apparent reason (accidental degeneracy).

    The “accidental degeneracy” in question arises when eigenvalues are close but theoretically different, and the sampling error blurs them.
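    North et al’s rule of thumb makes the blur concrete: an eigenvalue estimated from n independent samples has a sampling error of roughly lambda*sqrt(2/n), and neighboring eigenvalues closer together than their error bars should be treated as an effectively degenerate multiplet. A quick sketch in R (lambda sorted in decreasing order):

    north_separated <- function(lambda, n) {
      err <- lambda * sqrt(2 / n)  # North et al sampling error estimate
      gap <- -diff(lambda)         # spacing down to the next eigenvalue
      gap > err[-length(err)]      # TRUE = mode stands clear of its neighbour
    }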

    Here is a graph of eigenvalues for something that I’ll explain in a day or two, showing cases where eigenvalue couplets have identical values.
    As you can see, the 2nd and 3rd eigenvalues are identical. They are not “separable”, but, according to some definitions of “significant”, they might well be “significant”.


    Note: this is simulated and my guess is that some of the very close couplets are probably multiplets.

  23. Steve McIntyre
    Posted Feb 24, 2009 at 10:46 AM | Permalink

    there are really only 2 legit PCs

    I think that it’s premature to even say this.

    • Mark T
      Posted Feb 24, 2009 at 10:54 AM | Permalink

      Re: Steve McIntyre (#25), LOL! Well, I was speaking in terms of possibility, not actuality, i.e., there are only two with a chance of being legit! Point taken, however. 🙂

      Mark

  24. Jean S
    Posted Feb 24, 2009 at 10:53 AM | Permalink

    Steve, do you have any collection of the AVHRR data? The supposed “observational” (1982-) part of Steig’s data is obviously obtained either by PCA or some other “cut-off” method so it is not of much use for determining the eigenvalue spectrum of the original satellite data (BTW, is there a possibility that the raw AVHRR data is already rank 3?). Also, have you tried to calculate the spectrum based on the surface & AWS (gridded) data?

    • Steve McIntyre
      Posted Feb 24, 2009 at 12:59 PM | Permalink

      Re: Jean S (#26),

      Nope. I don’t have any AVHRR data. It’s hard to keep up with Steig and Schmidt on this topic 🙂:

      On Jan 23, I wrote to Steig (later he criticized me for allegedly not asking for the data)

      Dear Dr Steig, In your article you refer to the development of “50-year-long, spatially complete estimate of monthly Antarctic temperature anomalies.” Could I please have a digital copy of this data. Also I presume that this output was derived from corresponding monthly gridded versions of T_IR and T_B and I would appreciate a copy of these (or equivalent) input data as well. Regards. Steve McIntyre

      On Jan 23, Steig replied:

      I have always intended to provide all the material on line; I wasn’t allowed to do this before the paper was published. I would have done it already but have been busy answering emails. I should have these up on line next week. Eric

      On Jan 23, I also wrote to Comiso, the AVHRR guy asking for AVHRR data. Comiso replied:

      Thanks for your request for AVHRR surface temperature IR data. I am actually planning to have the entire data set archived in the near future and as soon as I get the associated document that describes the data and discusses the errors and caveats completed. The data are indeed on a gridded monthly basis. I will let you know how to access them in the web as soon as they are archived and ready to be downloaded. Best Wishes,
      Joey Comiso

      The following week, Steig provided an incomplete archive with only the rank-3 AVHRR version and posted at CA in very inflammatory terms. On Feb 2 at RC he said:

      [Response: ALL of the data that were used in the paper, and EXACTLY the code used in our paper have been available for a long time, indeed, long before we published our paper. This is totally transparent, and attempts to make it appear otherwise are disingenuous.

      Later that day he reiterated:

      [Response: Yes, I think we do a fine job. We could do better, but we do very very well. I’ve never had trouble getting data that I need from others. Indeed, our Nature study was based entirely on freely available data and code.–eric]

      and reiterated again:

      Response:… I released an electronic version of our data and links to all the original data and code almost as soon as our paper was published. Anyone paying attention would know that

      On Feb 8, Nicolas Nierenberg asked at RC:

      Dr. Steig has said that he is willing to provide the data to legitimate researchers. My response is to simply post what he would provide. I still haven’t heard from Dr. Schmidt what the objection is to that concept. Also as to the specifics in Dr. Steig’s paper. I believe that there is probably sufficient information on AWS trends. However I don’t think there is sufficient information to reproduce the gridded AVHRR temperature results. They are quite dependent on corrections for clouds, and manipulation to produce temperature values as I understand it.

      Gavin now had a different story. Without batting an eye, he said that the data was unavailable but Comiso was working on it.

      [Response: Joey Comiso is apparently working on making that available with appropriate documentation – patience. – gavin]

      On Feb 16, Comiso was still working on it.

      [Response: Joey Comiso is apparently working on the data preparation along with sufficient explanation to make it usable. I have no particular insight into the timetable. If all the stations being averaged together are coherent, then pre-averaging shouldn’t make that much difference. If instead there are some dud stations/bad data mixed in, they will corrupt the average and reduce the coherence with the far-field stations. Bad stations in the standard approach should be incoherent to any variability in the far field stations and so shouldn’t affect the result much. You’d be better off removing them rather than trying to average them away I would think. – gavin]

      On Feb 24, we haven’t heard anything more, so I guess that he’s still working on it. The only explanation that I can put on the delay is that someone inadvertently overwrote the version used in the Nature study – not expecting anyone to be actually interested in the data – and now that it’s in the public eye they are working hard to re-create the version that gave the reported results. It’s something that can easily happen if archiving data as used is not the sort of thing that you pay attention to. And it can be hard to figure out exactly. But maybe it’s some other reason. But if the data was on someone’s hard drive, why wouldn’t they just put it online with a read program? It would only take a few hours max.

      But bottom line – right now the monthly AVHRR data is not available anywhere.

      • bender
        Posted Feb 24, 2009 at 1:21 PM | Permalink

        Re: Steve McIntyre (#37),

        the monthly AVHRR data is not available

        Apparently, “availability” is a subjective concept. Isn’t it convenient to have such flexibility? I guess it’s another “situation that is unique to climate science”. Am I piling on yet?

      • Dave Andrews
        Posted Feb 24, 2009 at 2:39 PM | Permalink

        Re: Steve McIntyre (#37),

        This is probably a naive question, but if the data is not available in any archivable form how did they arrive at their initial conclusions based on the data?

        • bender
          Posted Feb 24, 2009 at 3:04 PM | Permalink

          Re: Dave Andrews (#41),

          if the data is not available in any archivable form

          It’s “available” in the sense that if you could guess what the authors did you could obtain the same data that they did through similar means. It’s the kind of “availability” that is designed to prevent competing scientists from tracking too closely what you are doing. Unfortunately, this is the same model of transparency that is being used to drive global carbon policy. IPCC feels no obligation to audit results. Just summarize them.

      • Jean S
        Posted Feb 24, 2009 at 4:13 PM | Permalink

        Re: Steve McIntyre (#37),

        The only explanation that I can put on the delay is that someone inadvertently overwrote the version used in the Nature study

        Well, I don’t think so … I believe we do have the AVHRR version that was fed into RegEM. Why do I believe so? Well, just subtract the 1982- part of the AVHRR reconstruction from the detrended “reconstruction” 😉 On the other hand, I have a nagging feeling that the version used was not the one that was supposed to be fed into RegEM…

        • Steve McIntyre
          Posted Feb 24, 2009 at 10:46 PM | Permalink

          Re: Jean S (#43),

          Cool. Here are differences for 6 gridcells. The same thing definitely went into both meatgrinders. I think that Jean S is on to something here – RegEM output contains the input data – so for the Steig dataset to be RegEM output, it had to be smoothed already before it went in. It sure doesn’t make a lot of sense.

        • Ryan O
          Posted Feb 25, 2009 at 11:16 AM | Permalink

          Re: Steve McIntyre (#44),

          The same thing definitely went into both meatgrinders. I think that Jean S is on to something here – RegEM output contains the input data – so for the Steig dataset to be RegEM output, it had to be smoothed already before it went in. It sure doesn’t make a lot of sense.

          .

          As you indicate, the really strange thing is that the AVHRR 1982-2006 part (which should correspond to actual input data) is also a 3PC construction. So it seems that the published “reconstructions” are actual outputs of RegEM, and therefore contain the inputs + infilled values of the corresponding reconstruction AWS/AVHRR/AVHRR (detrended). This would imply that AVHRR data was somehow “cut off” to rank 3 before feeding into RegEM.

          .
          Wouldn’t this be due to the processing for cloud masking? They remove data that differs from the climatological mean by 10C or greater. This will leave gaps. They would then infill the gaps and use the actual+infilled as input (which it seems they did not do) or simply find 3 PCs, reconstruct the whole thing, and use that as input (which it seems they did do). Either way, they would have to do something to account for the gaps.
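          In R the masking step, as described, would be something like this, with temp a months x gridcells matrix of raw temperatures (the actual procedure works on daily 5 km data and isn’t public, so this is only the monthly caricature):

          mon  <- rep(1:12, length.out = nrow(temp))
          clim <- apply(temp, 2, function(x)
            ave(x, mon, FUN = function(v) mean(v, na.rm = TRUE)))
          masked <- ifelse(abs(temp - clim) >= 10, NA, temp)  # drop 10C departures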

        • Hugo M
          Posted Feb 25, 2009 at 11:42 AM | Permalink

          Re: Ryan O (#48),

          Wouldn’t this be due to the processing for cloud masking? They remove data that differs from the climatological mean by 10C or greater. This will leave gaps.

          While CA was down, I read somewhere that clouds over Antarctica are distributed rather unevenly, with clouds preferring to cover West Antarctica.

        • Ian
          Posted Feb 25, 2009 at 12:26 PM | Permalink

          Re: Ryan O (#48), I understood this satellite covers the Antarctic 15 times a day. Without knowing more about how they process the data I guess we won’t know, but given some of those wind speeds recorded, and the IR images I’ve seen, I’d be surprised if the cloud cover masking caused a significant dropout.

          Ian

        • Ryan O
          Posted Feb 25, 2009 at 12:54 PM | Permalink

          Re: Ian (#50), True, but the number of passes/day doesn’t affect the persistence of clouds.
          .
          Also, their masking procedure is not necessarily always related to cloud cover. They removed points that differed from the climatological mean by 10C or more. Sometimes this is due to clouds; other times it may be due to errors/missing values in the raw data or other things.
          .
          Regardless, the percentage of time clouds are present is significant (abstracts):
          .
          http://www.agu.org/pubs/crossref/2005/2005GL023782.shtml
          .
          http://ieeexplore.ieee.org/Xplore/login.jsp?url=/ielx5/7695/21048/00977070.pdf?arnumber=977070
          .
          http://mclean.ch/climate/Cloud_Antarctic_Pen.htm
          .
          Re: Hugo M (#49), Yeah. The clouds are most significant along the coastlines, sea ice, and West Antarctica . . . the areas that show warming. The plateau has the least average cloud cover, and, perhaps coincidentally, shows the least warming.

      • Taphonomic
        Posted Mar 1, 2009 at 11:08 AM | Permalink

        Re: Steve McIntyre (#37),

        This raises the question of why Nature published this paper at all (not to mention gave it a cover) when the paper violates Nature’s own editorial policies for “Availability of data and materials”
        (available at: http://www.nature.com/authors/editorial_policies/availability.html ) which clearly state:

        “An inherent principle of publication is that others should be able to replicate and build upon the authors’ published claims. Therefore, a condition of publication in a Nature journal is that authors are required to make materials, data and associated protocols promptly available to readers without preconditions. Any restrictions on the availability of materials or information must be disclosed to the editors at the time of submission. Any restrictions must also be disclosed in the submitted manuscript, including details of how readers can obtain materials and information. If materials are to be distributed by a for-profit company, this should be stated in the paper.”

        “Supporting data must be made available to editors and peer-reviewers at the time of submission for the purposes of evaluating the manuscript. Peer-reviewers may be asked to comment on the terms of access to materials, methods and/or data sets; Nature journals reserve the right to refuse publication in cases where authors do not provide adequate assurances that they can comply with the journal’s requirements for sharing materials.”

        “After publication, readers who encounter refusal by the authors to comply with these policies should contact the chief editor of the journal (or the chief biology/chief physical sciences editors in the case of Nature). In cases where editors are unable to resolve a complaint, the journal may refer the matter to the authors’ funding institution and/or publish a formal statement of correction, attached online to the publication, stating that readers have been unable to obtain necessary materials to replicate the findings.”

        It was bad enough that Steig refused to provide the actual code for replication. As the AVHRR data are not available and apparently were not available to the reviewers, one has to ask how and why this paper got published when it violates Nature’s policies.

        • bender
          Posted Mar 1, 2009 at 1:40 PM | Permalink

          Re: Taphonomic (#62),

          how and why did this paper get published when it violates Nature’s policies?

          A relaxation of scientific standards when it comes to alarmist pseudoscience, possibly resulting from an alarmist ideology amongst Nature‘s editors? Dunno. Good question.

    • RomanM
      Posted Feb 25, 2009 at 7:01 AM | Permalink

      Re: Jean S (#26), Re: Steve McIntyre (#44),

      The supposed “observational” (1982-) part of Steig’s data is obviously obtained either by PCA or some other “cut-off” method so it is not of much use for determining the eigenvalue spectrum of the original satellite data (BTW, is there a possibility that the raw AVHRR data is already rank 3?)

      I think that you may be on to something here. This would answer my question about why the AWS reconstructions had actual embedded data and the satellite reconstruction did not.

      As Steve’s graph in comment #44 indicates, the difference between the original recon and the detrended recon is a straight line in the 1982 to 2006 time period and wiggly lines prior to 1982. I don’t have time right now to delve further, but my preliminary R results indicate that the wiggly part is also a 3PC result.

      I will hazard a guess that the raw data was NEVER fed directly into RegEM. It appears to me that the satellite data was independently crunched into 3 PCs and the reconstructed 1982-2006 result used as input into the RegEM procedure. The “detrended” reconstruction seems to have used the residuals from a regression line fitted to each of the 5509 faux data sequences as the input.

      If the procedure was actually done this way, this could seriously understate the uncertainty in the satellite data, producing results that would not only limit the possibility of producing trends that differed substantially in the various geographic regions of the Antarctic, but also overstate the “significance” of the trend results that actually were obtained.

      Got a seminar to go to this morning. I will try to look at it more later.

      • Jean S
        Posted Feb 25, 2009 at 7:55 AM | Permalink

        Re: RomanM (#46),
        Yes, that’s about my understanding of the situation right now. The “wiggly part” (infilling, prior to 1982) is always 3PC (rank 3), and as discussed earlier, this is due to the regpar=3 option in RegEM (which creates rank 3 outputs). As you indicate, the really strange thing is that the AVHRR 1982-2006 part (which should correspond to actual input data) is also a 3PC construction.

        So it seems that the published “reconstructions” are actual outputs of RegEM, and therefore contain the inputs + infilled values of the corresponding reconstruction AWS/AVHRR/AVHRR (detrended). This would imply that AVHRR data was somehow “cut off” to rank 3 before feeding into RegEM. There is also another minor mystery in the data sets. AVHRR (and AVHRR det) is clearly zero mean on the period 1982-2006. But the PCA reconstruction is not.

      • Posted Feb 26, 2009 at 7:21 AM | Permalink

        Re: RomanM (#46),

        please see this link below.

        Satellite Temperature Trend Also Halved by Simple Regridding

        The new guy gets lost in the threads here sometimes, but I believe graph 1 proves that you don’t need the whole dataset. I was already able to redo the entire satellite reconstruction in a regridded fashion from 3 PCs.

  25. peter vd berg
    Posted Feb 24, 2009 at 11:11 AM | Permalink

    this link http://medicine.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pmed.0020124

    is a study in the medical field but can easily be applied to Steig and any other publications

  26. Jesper
    Posted Feb 24, 2009 at 11:18 AM | Permalink

    Steve, What are the actual eigenvalues?

  27. Keith Herbert
    Posted Feb 24, 2009 at 11:21 AM | Permalink

    Sorry, but AWS is not in the glossary so I looked it up; I’m assuming it’s not the Alien Workshop.
    Also I’m thinking this text below the graph is just a typo

    ASW Trends under different regpar parameters (“RegEM Truncated PC”)

    and it’s really AWS again.

  28. Shallow Climate
    Posted Feb 24, 2009 at 11:36 AM | Permalink

    Re Ross McKitrick (#22):
    Would someone please comment on the negative trend at k=1, since it was said to merit some comment? Also, please, what is the “other controversy” to do with “truncating k too soon”?

  29. compguy77
    Posted Feb 24, 2009 at 11:41 AM | Permalink

    Steve,

    My guess would be that the response you get on the topic of separable (if you indeed get one) would be based on the explanation identified by Lubos in a prior thread (http://www.climateaudit.org/?p=5287#comment-328818)

  30. Keith Herbert
    Posted Feb 24, 2009 at 11:47 AM | Permalink

    Thanks Cliff. I also found it here AWS

  31. Posted Feb 24, 2009 at 12:09 PM | Permalink

    Re David Wright #2, John A #6, Rafa 37, Bender #19, Kenf #20,

    “Preisendorfer’s Rule N”, invoked by MBH and discussed on CA at length already (use the CA search engine), is in fact not unreasonable as an objective criterion for separating significant PCs from noise PCs. It basically objectifies the eyeball “scree” rule (see Wikipedia on PCA).

    However, it assumes a full panel of data. The large gaps in this data set could make it difficult to apply.

    Preisendorfer was a climatologist/statistician who wrote a book in the 1970s about using principal component methods in climate. “Rule N” is just a reference to the 14th of several alphabetically identified cutoff rules he considered, and has nothing to do with the sample size.

    (I’m not saying MBH used the rule correctly, just that they commendably invoked it.)
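    For concreteness, here is a minimal Monte Carlo version of the rule in R: compare the observed eigenvalue spectrum against the 95th percentile of spectra from same-sized white-noise matrices. This is my sketch of the idea, not Preisendorfer’s own recipe, and it assumes a complete n x p matrix X, which is exactly what this data set lacks.

    rule_n <- function(X, nsim = 200) {
      n <- nrow(X); p <- ncol(X)
      obs   <- prcomp(X, scale. = TRUE)$sdev^2   # observed eigenvalues
      noise <- replicate(nsim,
        prcomp(matrix(rnorm(n * p), n, p), scale. = TRUE)$sdev^2)
      obs > apply(noise, 1, quantile, probs = 0.95)  # TRUE = retain that PC
    }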

    Steve: Hu, let me clarify this a little. There are two quite separate PC operations in MBH – temperature grid and tree ring. They specifically report use of this method only for deciding how many temperature PCs to retain and relevant code can be observed. They do not specifically report use of this method for tree ring networks and the observed pattern of retained tree ring PCs cannot be even approximately replicated using Preisendorfer’s Rule N. The first specific mention of Preisendorfer in connection with tree rings was in response to MM papers, where use of this method identified bristlecones as a pattern. (It’s a jump to go from being a pattern to a temperature signal, but niceties like that were of zero interest to the Team.) It is impossible to replicate retained PCs in the Stahle/SWM and/or Vaganov networks with Rule N – a point that the Team blithely ignores.

  32. Layman Lurker
    Posted Feb 24, 2009 at 12:25 PM | Permalink

    #34

    “…it assumes a full panel of data. The large gaps in this data set could make it difficult to apply.”

    It was applied after the data was infilled, was it not?

    • Peter
      Posted Feb 24, 2009 at 1:05 PM | Permalink

      Re: Layman Lurker (#36), LL, I’m assuming Hu means a full panel of observational data as opposed to created data.

  33. BillA
    Posted Feb 24, 2009 at 1:43 PM | Permalink

    Being a bear of very little brain, I was uncertain about the terms “eigenvalue” and “PC”. I looked for a fairly basic explanation and found this (the example is about language analysis):
    What is an eigenvalue?
    http://jalt.org/test/bro_10.htm

    It has a section on choosing the number of eigenvalues. This method may have been used in the analysis being discussed here.
    “Using a theory-based approach. The fourth approach discussed here is to use the number of factors that your theory would predict.”

    I suppose that “PC” means “principal component”.

  34. Posted Feb 24, 2009 at 11:35 PM | Permalink

    Another comment about availability of the data, posted at RealClimate on 11 February 2009 (note no name on the reply):

    “Reply: the raw data are public; the processed data (i.e. cloud masking) are not yet, but will be in due course. So relax”

  35. Eric
    Posted Feb 26, 2009 at 4:05 PM | Permalink

    First of all I want to say my last name is not Steig.

    I am not a statistics expert and have no experience with principal components analysis. I see a lot of speculation on this page about why Steig and Schneider did what they did.

    My knowledge of PCA is what I got off of this web page.

    http://www.statsoft.com/textbook/stfacan.html

    On this page there is an illustration of how to pick the optimum number of principal components. It looks at the percentage of variation explained by each of the principal components. There are 2 stopping criteria suggested, the Kaiser criterion and the Scree criterion.

    So far, I haven’t seen these criteria used on this blog to determine whether the first 3 eigenvalues represent a good choice. I may have missed it.

    Could someone who has been working with this data supply the necessary data to use the above criteria to determine whether the first 3 eigenvalues represent a proper choice of the number of principal components to use?
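    In R the two criteria on that page are close to one-liners each, given a complete data matrix X (Kaiser applies to eigenvalues of the correlation matrix; the scree test is done by eye):

    ev <- prcomp(X, scale. = TRUE)$sdev^2                # eigenvalues
    which(ev > 1)                                        # Kaiser: retain these
    screeplot(prcomp(X, scale. = TRUE), type = "lines")  # scree test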

    • Posted Feb 26, 2009 at 4:52 PM | Permalink

      Re: Eric (#54),

      If you use the R scripts SteveM has provided on various posts, they will download all the data which is currently available, which you can then copy to other formats.

    • Jean S
      Posted Feb 26, 2009 at 4:59 PM | Permalink

      Re: Eric (#54),

      Could someone who has been working with this data, supply the necessary data to use the above criteria determine whether the first 3 eigenvalues represents a proper choice of the number of principal components to use?

      See here (#26) and here (#37)

  36. Mike Lorrey
    Posted Feb 28, 2009 at 9:31 AM | Permalink

    My cousin Andrew Lorrey (http://www.gns.cri.nz/nzpaleoclimate/pdfs/2008_06_AUS_INTIMATE_meeting_report.pdf) has identified six separate climate regions for the two islands of New Zealand. I find it amazing that Steig can only find three climate regions in all of Antarctica.

    • Steve McIntyre
      Posted Feb 28, 2009 at 10:20 AM | Permalink

      Re: Mike Lorrey (#57), Mike, remind your cousin to archive his speleothem data.

  37. David Cauthen
    Posted Mar 1, 2009 at 8:59 AM | Permalink

    Steve,

    The ONLY weakness at CA from my perspective is the lack of periodic summary expositions of issues such as done by Jeff^2 at WUWT on Steig et al. Just a note in the suggestion box.

  38. Posted Mar 1, 2009 at 1:44 PM | Permalink

    Steve,

    If you don’t mind, can you tell me what convergence limit you used in generating your trends and how many iterations we are looking at? From memory is fine; I don’t need a perfect number.

    I realize it’s a different algorithm but I’m trying to understand why TTLS trends in higher order aren’t a perfect match. So far my thought is that the planar fit estimates are giving a substantially different minimum due to stiffness. I don’t think it’s well suited for higher order problems with large sections of missing values.

    • Steve McIntyre
      Posted Mar 1, 2009 at 10:55 PM | Permalink

      Re: Jeff Id (#64), Jeff, I did one run with regpar=3 and tol=.001, but otherwise used tol=0.01. I’ll have to look up run lengths.

  39. Posted Mar 1, 2009 at 2:43 PM | Permalink

    RE Taphonomic #62, quoting Nature policy statement:

    “After publication, readers who encounter refusal by the authors to comply with these policies should contact the chief editor of the journal (or the chief biology/chief physical sciences editors in the case of Nature). In cases where editors are unable to resolve a complaint, the journal may refer the matter to the authors’ funding institution and/or publish a formal statement of correction, attached online to the publication, stating that readers have been unable to obtain necessary materials to replicate the findings.”

    Great find! I trust Steve will be pursuing this avenue before long.

    RE Steve, #37, what is rank 3 AVHRR data? Is this the TIR-based recon that Steig has indeed posted? If not, you say he did provide it — did he send it just to you? If this is not the raw AVHRR data, how does it differ?

    At least Comiso sounds cooperative. Any news from him? The paper has been out for over a month now.

    • Steve McIntyre
      Posted Mar 1, 2009 at 10:50 PM | Permalink

      Re: Hu McCulloch (#65),

      the rank 3 AVHRR is what Steig posted, the 600×5509 data set that Roman observed to have rank 3. This is all we have. Obviously raw monthly AVHRR data couldn’t have rank 3. How else does it differ? No idea.

      As to what they’re doing. Dunno. You’d think that they’d have been able to locate the data used for the Nature article by now, but I guess they’re still working on it. One would think that it would be better practice to prepare your data before you publish the article, than afterwards.

      As others have observed, it looks like they fed the rank 3 data into the meatgrinder – and, if so, this would be a bit of an embarrassment. Maybe they’re going to do nothing and hope that it all goes away.

    • Steve McIntyre
      Posted Mar 1, 2009 at 10:54 PM | Permalink

      Re: Hu McCulloch (#65), I’ve used this policy before. But they still wouldn’t require Mann to archive the stepwise results from MBH98 which to this day remain unreported and unreplicable in detail (both ourselves and Wahl and Ammann can get things that look like it.)

  40. Posted Mar 1, 2009 at 11:29 PM | Permalink

    RE Steve, #66,

    Aha, you mean matrix rank! I somehow thought that this was a level of processing: Rank1 = raw data, rank2 = daily averages over 1KM grids with TOBS, rank3 = monthly averages over 25KM grids with FILNET, etc.

    So in fact, all we have is the 600X5509 TIR recon file on Steig’s website (which we now know has only 3 non-zero eigenvalues, and therefore matrix rank 3), plus a link to a website that has the specific AVHRR data they used on it somewhere. Better than nothing, but still not enough to replicate the TIR recon, as required by the Nature guidelines cited by Taphonomic.

    But now that the article has been out and lionized by the press for over a month, perhaps it is time for the Team to “Move On!”

  41. Jean S
    Posted Mar 2, 2009 at 4:26 AM | Permalink

    So in fact, all we have is the 600X5509 TIR recon file on Steig’s website (which we now know has only 3 non-zero eigenvalues, and therefore matrix rank 3)

    Yes, this is the case. The file seems to be the RegEM output. Now, the way they seem to have used RegEM (ttls, regpar=3) automatically produces a rank-three reconstruction for the imputed values. However, the RegEM output also contains the input values (the part of the reconstruction that was known), that is, post-1982 values (in the AVHRR output) should be input to RegEM. The Jeffs and others have reported that they can reproduce the RegEM output (Steig’s file) from these values (plus the surface station values of course). Now these input AVHRR values also have rank three, which seems unlikely, to say the least, if the input was truly “raw”, only slightly modified data.
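    The rank check itself is easy in R once Steig’s file is loaded into a matrix, say recon (600 x 5509): count the singular values that stand above numerical noise.

    d <- svd(recon)$d
    sum(d > max(d) * 1e-8)  # effective rank; 3 for the archived file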

    • Posted Mar 2, 2009 at 8:24 AM | Permalink

      Re: Jean S (#70),

      I think that means the data went through the bone sorter before the meat grinder.

      SteveM made the point that it isn’t proven and he’s right, but it’s too close. Besides, putting the huge (5509+42) x 600 matrix in RegEM would be difficult. I should try it but I like this computer.

      • Jeff C.
        Posted Mar 2, 2009 at 11:12 AM | Permalink

        Re: Jeff Id (#71), So we have the bone sorter that reduced the AVHRR 5509 data series to three PCs and their coefficients. I wonder how they handled the missing data of cloud-masked intervals? We know they didn’t use RegEM alone or we would still have the measured data.

        Then we have the RegEM meat grinder with 3 AVHRR PCs and the 42 surface series. It leaves the 82 to 06 portions of the AVHRR PCs alone since there are no missing values, but adds a prefix for the 57-81 period to the three AVHRR PCs from the surface data.

        The three PCs from the RegEM meatgrinder are then expanded back out to 5509 series using the coefficients from the bone sorter first step.

        They aren’t infilling using three PCs, they’re replacing using three PCs. I can see the thinking with not having 5509+42 series run through RegEM, but why limit the input to RegEM to 3 AVHRR PCs? They could have had the bone sorter output as many PCs as they wanted to include in the input of RegEM, to retain the spatial detail of the AVHRR data. Use 8 AVHRR PCs as an example. They still could have had RegEM use regpar=3 to construct the 57-81 portion if they wanted, but they could have used the 8 PCs to expand the AVHRR series back out to 5509 points.

        The paper talks of k=3 and infilling but it is very misleading. The bone sorter acted as a spatial filter, not an infiller.

  42. Posted Mar 2, 2009 at 8:36 AM | Permalink

    RE SM #67,

    I’ve used this policy before. But they still wouldn’t require Mann to archive the stepwise results from MBH98 which to this day remain unreported and unreplicable in detail

    But maybe there’s a new Editor in Chief now who thinks the rule is important. It’s worth a try, anyway.

    • bender
      Posted Mar 2, 2009 at 8:47 AM | Permalink

      Re: Hu McCulloch (#72),
      Steig’s response is easy to predict: everything is already available, code, data, everything. You think the Editor of Nature’s gonna check to see if that’s true? They have rules, but with no intent to enforce them. They presume a culture of compliance – which is an incorrect presumption in the case of AGW alarmist pseudoscience.

  43. Posted Mar 2, 2009 at 9:13 AM | Permalink

    Re Jean S #70, Jeff Id #71,

    I can see that a 5509+42 X 600 data matrix, not to mention the implied (5509+42) X (5509+42) moment matrix, would be a bit cumbersome.

    I still don’t understand RegEM, but wouldn’t it have made sense for them to have first reduced the 5509X600 TIR matrix to some small number k ≥ 3 of PCs, thus reducing the RegEM matrix to (k+42) X 600? If they are, as they say, just using the TIR covariances to “guide” the recon, while only using the manned station data as predictors, nothing would be lost by this except (indirectly) geographic detail.

    Just finding the eigenvalues of a 5509 X 5509 moment matrix could be daunting, but didn’t Steve mention a trick whereby the SVD of a matrix can be found from that of its transpose, thereby reducing the eigenvalue problem to 600X600? In any event, there are only 300 dates of adequate TIR readings (1982-2006 according to the paper), so in fact this would just be a 300X300 problem.

    This is starting to sound easy!
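    In R the trick looks like this: for X with n = 600 rows and p = 5509 columns, eigendecompose the small n x n matrix XX′ and recover the right singular vectors from it (toy data here stands in for the TIR anomalies):

    n <- 600; p <- 5509
    X <- matrix(rnorm(n * p), n, p)           # stand-in for the TIR matrix
    e <- eigen(X %*% t(X), symmetric = TRUE)  # a 600 x 600 problem, not 5509 x 5509
    d <- sqrt(pmax(e$values, 0))              # singular values of X
    U <- e$vectors                            # left singular vectors
    V <- t(X) %*% U %*% diag(1 / d)           # right singular vectors, p x n
    # sanity check on a small case: all.equal(d, svd(X)$d)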

    • Posted Mar 2, 2009 at 10:17 AM | Permalink

      Re: Hu McCulloch (#74),

      I have to say, if you don’t get it there’s not much hope for me. I do have the advantage of being able to single step through the R code though.

      I still don’t understand RegEM, but wouldn’t it have made sense for them to have first reduced the 5509X600 TIR matrix to some small number k ≥ 3 of PCs, thus reducing the RegEM matrix to (k+42) X 600?

      Once I realized that RegEM used covariance to weight the infilling, it made good sense. I think it was a pretty smart way to break down a complicated problem, aside from the fact that its area weighting is otherwise uncontrolled and basically unverified. I haven’t proven that it is an equivalent method to using all the data because I’ve had some trouble writing NaN values into a CSV file that Matlab understands. Easiest thing in the world in C but I don’t know R well enough.

      RomanM, SteveM or Jean can probably do it in a couple of minutes but the extra quotes or spaces or whatever Matlab doesn’t like are stopping me. I’m pretty stubborn though.

      Did you see the correlation vs distance plots Jeff C and I did at WUWT this weekend?

      • Hugo M
        Posted Mar 2, 2009 at 10:42 AM | Permalink

        Re: Jeff Id (#75), regarding how to export NaN values from R:

        # straight quotes, a file name, and row.names=FALSE keep Matlab happy
        # (the default writes a quoted row-name column, which chokes csvread)
        x <- c(1, 2, 3, NA, 5, 6, 7, NA, 9)
        write.csv(x, file = "x.csv", na = "NaN", row.names = FALSE)

        • Posted Mar 2, 2009 at 10:48 AM | Permalink

          Re: Hugo M (#76),

          I did something very similar to that. I’ll try your way, but from memory it was like R put quotes around it or something.

  44. Ryan O
    Posted Mar 2, 2009 at 5:50 PM | Permalink

    Made the following comment at RC. Wonder if it will appear or be moderated to nonexistence (my guess is the latter):

    Gavin says: “If people want to make specific points, they should make them.”
    .
    With all due respect, he did make a specific point: The method used by the authors smeared the peninsula warming over the interior. Or, to put it slightly differently, the low number of PCs used was insufficient to properly capture the geographical distribution of the temperature trends.
    .
    Stating that the authors “realized that the disadvantage of not including higher order terms” led to an inaccurate depiction of peninsula warming in no way addresses the statement that the failure to include higher-order terms transferred peninsula warming to the interior. The followup that “there are ad hoc rules” to determine this is similarly irrelevant. It doesn’t matter what the ad hoc rules are . . . if the application of one or more of those rules resulted in an inaccurate geographic distribution of temperature trends, then the rule was either wrong, inappropriately used, or both.
    .
    And if as you imply the higher-order terms are contaminated by artifacts (and thus cannot be used) – while simultaneously the lower-order terms are shown to be insufficient to accurately depict the geographical distribution of trends – then the obvious conclusion is that the available information is insufficient to support the main conclusion of the paper: heretofore unreported significant warming in West Antarctica.

    • bender
      Posted Mar 2, 2009 at 9:01 PM | Permalink

      Re: Ryan O (#80),
      The western Antarctic lack of warming is a short-term anomaly; it’s weather noise. The eastern peninsular warming is evidence of the “robust” deterministic GHG-forced trend. To follow his logic it helps to start from those assumptions. Read “the Blackboard”. Lucia’s discussing that very issue at the moment.

  45. Posted Mar 2, 2009 at 6:06 PM | Permalink

    Thanks Ryan, If they delete your comment, it will be the fourth one today.

  46. Posted Mar 2, 2009 at 7:13 PM | Permalink

    Ryan, amazingly enough it got through. The reply was this.

    [Response: We have a situation where we don’t have complete information going back in time. The information we do have has issues (data gaps, sampling inhomogeneities, possible un-climatic trends). The goal is to extract enough information from the periods when there is more information about the spatial structure of temperature covariance to make an estimate of the spatial structure of changes in the past. Since we are interested in the robust features of the spatial correlation, you don’t want to include too many PCs or eigenmodes (each with ever more localised structures) since you will be including features that are very dependent on individual (and possibly suspect) records. Schneider et al (2004) looked much more closely at how many eigenmodes can be usefully extracted from the data and how much of the variance they explain. Their answer was 3 or possibly 4. That’s just how it works out. The fact is that multiple methods (as shown in the Steig et al paper) show that the West Antarctic long term warming is robust and I have seen no analysis that puts that into question. You could clearly add in enough modes to better resolve the peninsular trends, but at the cost of adding spurious noise elsewhere. The aim is to see what can be safely deduced with the data that exists. – gavin]

    This makes no sense to me.

    • Michael Jankowski
      Posted Mar 2, 2009 at 7:16 PM | Permalink

      Re: Jeff Id (#82), Love how Gavin took the opportunity to throw in that word “robust.”

    • Greg F
      Posted Mar 2, 2009 at 7:39 PM | Permalink

      Re: Jeff Id (#82),

      This makes no sense to me.

      Three step program to understanding.

      1. Stand in front of a mirror.
      2. Put your arms over your head.
      3. Move your arms vigorously.

      Hope this helps.

    • Ryan O
      Posted Mar 2, 2009 at 8:30 PM | Permalink

      Re: Jeff Id (#82), Haha!
      .
      I posted a followup based on Gavin’s non-answer. Let’s see if that gets through. 🙂

  47. John M
    Posted Mar 2, 2009 at 8:15 PM | Permalink

    Gotta love it!

    Steve Mc at JeffId’s place:

    However, it’s one that I find of considerable interest as the number of retained PCs was a battleground issue in the debate over MBH, where Mann developed various pretexts for increasing the number of retained PCs to get the bristlecones (the covariance PC4) into the mix. I for one will be intrigued to see how they formulate a rule that limits to 3 PCs in this case, while mandating 4 or more PCs in the MBH NOAMER case

    Gavin:

    Their answer was 3 or possibly 4. That’s just how it works out.

    And sometimes cooling is warming…

    • bender
      Posted Mar 2, 2009 at 8:53 PM | Permalink

      Re: John M (#85),

      I for one will be intrigued to see how they formulate a rule that limits to 3 PCs in this case, while mandating 4 or more PCs in the MBH NOAMER case

      “That’s just how it works out.” Three is what is required to obtain that which is (presumed to be) robust: the spatially correlated warming trend in Antarctica. Four is what is required for NOAMER. Different Preisendorfer numbers for different continents.
      .
      Gavin’s not stupid. He knows how to evade your attempts to pin him down.

      • Posted Mar 2, 2009 at 9:16 PM | Permalink

        Re: bender (#88),

        In cases where you’re looking for a trend, fewer PCs make sense. Had they pre-weighted the stations for area, 1 PC would work fine (mountain ranges would still get blurred). In a case where we’re relying on covariance to paste the correct trends on, there must be more PCs. They are simply not equivalent cases and Gavin knows this, which just makes me madder.

        I’m going to snip myself now because …

        • bender
          Posted Mar 2, 2009 at 9:21 PM | Permalink

          Re: Jeff Id (#90),

          and gavin knows this

          Mann knows it. I’ll bet you Gavin doesn’t. Remember, this is the same guy who interpreted his negative correlation coefficients (in S09) as though they were positive. He has a short attention span for methodological detail when it comes to fly-swatting.

        • Martin Sidey
          Posted Mar 3, 2009 at 2:54 AM | Permalink

          Re: Jeff Id (#90),

          Couldn’t the point be made that the issue here is not mathematics or statistics but the physics of the climate in Antarctica? The mathematics of the Steig paper has been analyzed and it produces physically unrealistic and unreasonable results. References to ad hoc rules created by Schneider (or even Gauss or Newton) do not answer the issue that the results Steig has produced are unphysical.

        • Posted Mar 3, 2009 at 11:49 PM | Permalink

          Re: Martin Sidey (#92),

          I’m no climatologist, the weather is not something I’ve studied beyond the internet. All I know is that the math presented seems to be flawed in a simplistic fashion…..again. My only other real experiences are M08, Santer (a bit more subtle) and this one. I don’t know what to do with it, when I try to discuss it with RC my comments are cut. The claims of superior climate knowledge that I’ve read don’t make sense to me. Gavin told me once the data is the data, I say math is math. The point you make about references seems right to me, this is a different situation entirely from Schneider or previous team papers but I’m still the rookie so others may have different opinions.

  48. bender
    Posted Mar 2, 2009 at 8:49 PM | Permalink

    Actually, Gavin’s reply makes sense to me, if by “robust features of the spatial correlation” you take that to mean the presumed GHG forcing trend.
    .
    As he often does, Gavin uses circular logic in a reply that is long enough that the circularity eludes the casual reader.
    .
    He artfully dodges the substantive issue: what makes him think this spatially correlated trend is “robust”?

  49. Jean S
    Posted Mar 14, 2009 at 6:58 AM | Permalink

    Any news about AVHRR data? Comiso still working on it?

    • Ryan O
      Posted Mar 14, 2009 at 8:18 AM | Permalink

      Re: Jean S (#94), Apparently.
      .
      Even having the processed data still leaves a lot of wiggle room for Steig & Co. Small changes in cloud masking technique result in large changes in final results (compare early Comiso, Monaghan, and Steig), and monthly data cannot be cloud masked. All of that processing has to be done before converting to monthly and before going from the 5km grid to Steig’s 50km grid.
      .
      Still, it would be very very nice to see if the reduction of the data to 3 PCs introduced significant error.

      • Jeff C.
        Posted Mar 14, 2009 at 9:21 PM | Permalink

        Re: Ryan O (#95),

        Ryan, did you get your daily data from NSIDC? I’m playing with monthly mean AVHRR data from the UWisc website. It already has the default cloud mask applied using CASPR. Got it converted to 50 x 50 km cells and have a continent layout almost identical to Steig’s.

        Unsurprisingly, it didn’t look anything like Comiso, Monaghan, Chapman, or Steig. I think I made a very interesting find today regarding how Comiso “adjusts” his data. I’m starting to get something that resembles his recon.

        Jeff Id and I have been working together on this and will write something up once complete. Let me know if you are interested, perhaps we can pool our knowledge.

        • Ryan O
          Posted Mar 19, 2009 at 6:12 PM | Permalink

          Re: Jeff C. (#96), I’m up to 1998 so far. 🙂 I’ll have it all in about 2 weeks at this rate. I haven’t started playing with it, yet. I have no idea how long it will take me to make anything useful out of it.
          .
          I can say I am awfully curious about the Comiso adjustment! 😉

        • Jeff C.
          Posted Mar 20, 2009 at 11:59 AM | Permalink

          Re: Ryan O (#97), I should state first that I’m not sure this is what he did. What I have noticed is that the trends I originally saw didn’t look like the trends shown in the Steig SI (not the reconstruction, but the pre-RegEM trends shown in Figure S1c) or like those attributed to Comiso in Monaghan.

          Once I applied a rudimentary surface calibration to the data (I forced the South Pole cells to equal the Amundsen-Scott data, then offset all other points for that month by the same amount), I got something that looks quite similar to those shown in the Steig SI and Monaghan.

          It might be entirely coincidental. I’ll put some plots up later today so you can see what I mean.
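          .
          In case anyone wants to replicate, the offset step I described amounts to this in R (a sketch; avhrr, amundsen and pole.cells are placeholder names, not variables from any actual script):

            # Force the South Pole cells to match Amundsen-Scott each month, then
            # shift every other cell in that month by the same additive offset.
            # 'avhrr' is a months x cells matrix; 'amundsen' a monthly vector.
            calibrate.to.pole <- function(avhrr, amundsen, pole.cells) {
              offset <- amundsen - rowMeans(avhrr[, pole.cells, drop = FALSE])
              sweep(avhrr, 1, offset, "+")    # one offset per month, all cells
            }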

        • Ryan O
          Posted Mar 20, 2009 at 1:42 PM | Permalink

          Re: Jeff C. (#98), Interesting. Did you do that correction before or after converting to anomalies?

        • Jeff C.
          Posted Mar 20, 2009 at 5:01 PM | Permalink

          Re: Ryan O (#99), The surface calibration was done to the temps; I converted to anomalies after the calibration. Here is my trend comparison to some of Comiso’s published results. They look fairly close, but that doesn’t mean I’m doing it right; it could be a coincidence. I need to dig into this more.

        • Ryan O
          Posted Mar 21, 2009 at 4:03 PM | Permalink

          Re: Jeff C. (#100),
          .
          Close enough to show that he did something similar. For Steig, there’s an additional step of removing any AVHRR data that “differs from the climatological mean” by more than 10F. I could make some guesses as to what that means, but none of them would be a sure thing. Regardless, once the cloud masking is done, all the images seem to look very similar. So I would venture a guess that Comiso’s procedure is not much different from CASPR.
          .
          At this point, it seems clear to me that something is wrong with the cloud masking procedures.
          .
          Suspicious Item #1: The areas showing the strong positive trends are also the areas with the greatest cloud cover – and, hence, the areas most modified by cloud masking.
          .
          Suspicious Item #2: The coasts are universally warming, and warming strongly. Both UAH and RSS show cooling or no trend in the ocean surrounding most of Antarctica with the exception of the Peninsula area. By what physical mechanism could the oceans cool and the immediately adjacent land warm strongly? I think you can see this in your plot, because it looks like you have a few pixels dominated by ocean response scattered around the coast.
          .
          This doesn’t just go for Steig, it goes for Comiso and Monaghan, too.
          .
          So now the problem is, how would cloud masking result in a positive trend? I have a guess, but nothing to confirm it yet.
          .
          The AVHRR data set isn’t just from one instrument. It’s from several. For the times we’re talking about above, there were 4 separate AVHRR instruments that were used to gather the data:
          .
          NOAA-7 23 July 1981 – 31 December 1984
          NOAA-9 1 January 1985 – 7 November 1988
          NOAA-11 8 November 1988 – 31 December 1994
          NOAA-14 1 January 1995 – 31 December 2000
          .
          If any one of them has a problem with one of the channels (most likely Channel 4, since cloud masking seems to rely on it the most), that could result in an offset that would show up as a trend when the cloud masking is done.
          .
          Current theory. Don’t know if there’s any validity to it (I never did download the monthly data).
          .
          BTW, I’m up to 1999 on the raw data. 760 GB so far . . . 🙂
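          .
          A toy R illustration of the changeover worry (numbers invented): splice two instrument records that differ only by a constant bias, and an ordinary least-squares fit reads the step as a trend.

            # 240 months of flat, noisy "temperatures"; add a 1 C bias to the
            # second instrument's half of the record and fit a straight line.
            set.seed(4)
            months <- 1:240
            temps  <- rnorm(240, mean = -30, sd = 2)
            temps[121:240] <- temps[121:240] + 1   # offset at the "changeover"
            coef(lm(temps ~ months))[2] * 120      # spurious warming, C per decade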

        • Jeff C.
          Posted Mar 21, 2009 at 5:47 PM | Permalink

          Re: Ryan O (#103), I read somewhere (might have been Monaghan) that Antarctica is subject to temperature inversions where cold air is trapped below cloud cover. This means that the clouds are systematically warmer than the surface, thus leading to artificially warm temps in the AVHRR data if not masked properly.

          Regarding Steig’s “differs from the climatological mean by more than 10 deg C” (should be C, not the F quoted above), Jeff Id asked Gavin about the meaning of “climatological mean”. His exact response was “The average temperatures for that month over the whole satellite record”. I take that to mean that they throw out daily data where the anomaly exceeds +/- 10 deg C for a cell on any given day. Sounds like you throw out a lot of real data using that as a screen. Exceeding the norm by 10 deg C is not that uncommon even in mild climates.

          Good point about the different spacecraft, problems cropping up, and drift. That was what led me to use a surface calibration. The idea was to use a good known reference to normalize the AVHRR data and bring continuity over time. If you had to pick a reference, Amundsen-Scott is ideal. It has a complete record (no missing months), it is in an area without much physical variation for quite a distance, and you don’t really need to worry about time of day since the sun is at the same elevation angle all day long.

        • Ryan O
          Posted Mar 22, 2009 at 11:54 AM | Permalink

          Re: Jeff C. (#104), Along with those, there may also be an issue with the equatorial crossing time drift. The measured temperature depends on time of observation (hmm . . . sounding like Hansen now . . . 🙂 ). Regardless, the acquisition time shifts over a satellite’s life. It would be interesting to see if there are discontinuities in the U Wisc AVHRR data at satellite changeovers, since there would be a sudden shift in the acquisition time.
          .
          On your forced surface calibration, did you use a single offset (1 number) for the entire series, or did you allow the number to change based on the monthly difference between AVHRR and Amundsen? If it was the latter, it is more curve-fitting than calibration. And if the latter is what Comiso actually did, it is an entirely physically invalid way of performing a calibration. It’s not bad as an order-of-magnitude approximation, but that is definitely not how any of these guys present their information.
          .
          The proper way to do this is to develop an expression – either from first principles or empirically – that allows you to predict the ground-station temperature from AVHRR and vice versa. For it to be a valid calibration, this expression cannot vary from point to point. “Cases” – like “ice”, “snow”, “water”, “dirt”, etc. – can be built into it, but the same expression must be applied at each point, and the “case” must be evaluated in the same manner at each point. It may not work perfectly (Shuman’s calibration did not), and that is okay so long as the problems with the calibration are explicitly quantified, as Shuman did.
          .
          Failure to do it in this manner makes the output dependent on arbitrary correction factors without any physical reason to select one scheme over another.
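          .
          The contrast, as a minimal R sketch (synthetic numbers; avhrr.obs and stn.obs are hypothetical aligned series):

            # A calibration is one expression, fit once and applied identically at
            # every point and month; a fresh offset each month is curve-fitting.
            set.seed(2)
            avhrr.obs <- rnorm(240, -30, 10)            # synthetic AVHRR temps
            stn.obs   <- 0.9 * avhrr.obs + rnorm(240)   # synthetic ground temps
            fit <- lm(stn.obs ~ avhrr.obs)              # single global expression
            to.surface <- function(x) unname(coef(fit)[1] + coef(fit)[2] * x)
            to.surface(-45)                             # same rule everywhere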

  50. Hu McCulloch
    Posted Mar 21, 2009 at 7:06 AM | Permalink

    RE Jeff C, #100,
    In the third graph, what do you mean by “using AVHRR monthly means from UW website”? Does this mean Steig’s AVHRR-based recon from his U. Washington website? If so, this is not supposed to be AVHRR data per se, but rather surface station data, which has merely been interpolated with the aid of covariances computed from AVHRR data.

    I gather your first 2 graphs are truly AVHRR data, with no reference to surface temperatures other than the S Pole zero reference.

    So far I’ve received no reply from either Steig or Comiso to my request a few days ago for their data. I’ll give them a full week before sending them a reminder.

    • Jeff C.
      Posted Mar 21, 2009 at 10:08 AM | Permalink

      Re: Hu McCulloch (#101), I should have made that more clear. UW is University of Wisconsin http://stratus.ssec.wisc.edu/products/appx/appx.html ; these are monthly mean data files from the AVHRR instruments from 1982 to 2004. Jeff Id and I have been using this data to try to re-create something close to the input AVHRR data set Comiso supplied to Steig. The third plot is my trend map using that data from the U Wisconsin.

      Since we don’t have Comiso’s dataset, all we have are trend plots that have shown up in a few papers for comparison. The first plot is from Monaghan et al, and is a trend plot using Comiso’s dataset from 1982 to 2001. The second plot is from Steig’s SI and claims to be the input AVHRR data from 1982-1999 (although it looks very similar to the Monaghan plot despite ending in 1999).

      The third plot is my trend map using the Wisconsin AVHRR dataset from 1982 to 2001 for comparison to Comiso’s shown above. I’m trying to see if I can get my trend plot to look like Comiso’s. At first, they looked quite different. Once I applied a calibration using the surface data, I got the plot above. My calibration forces the AVHRR cells over the South Pole to agree with the Amundsen-Scott surface record from the South Pole; I then offset the other cells in the same month by the same amount.

      I’m not sure how Comiso came up with plots 1 and 2; he may use a surface calibration or he may not. I’m experimenting with various approaches using the Wisc dataset to see how close I can get to Comiso.

  51. Layman Lurker
    Posted Mar 21, 2009 at 7:48 PM | Permalink

    Ryan, the +/- 10C rule is something that should have been testable in areas with corresponding surface station or AWS data. Surely this must have been done somewhere? Do you have enough data that it could be checked out now?

  52. Hu McCulloch
    Posted Mar 22, 2009 at 7:17 AM | Permalink

    RE Jeff C, #104,

    I wonder if there are any locations (“New Seattle Land”?) that are more often cloudy than clear? Then this recipe might be throwing out the clear days and keeping the cloudy ones.

    Even if not, it would still make a difference if the rule were applied iteratively — is the mean for each cell recomputed after the “cloudy days” have been removed, and then each day rechecked relative to the new mean until convergence?

    Also, is the 10 degrees taken in both directions, or just one? I.e., if clouds are believed to make the TIR reading too hot (or cold), wouldn’t the reading be discarded only if the reading is unusually high (or low)?

    It’s nice that Gavin is responsive about these details, but it’s the responsibility of the authors themselves to explain what they did, either in direct answers to inquiries or in a general statement made publicly available. Some friend of theirs who runs a blog and thinks he knows the answer is just second-hand information.

    • Jeff C.
      Posted Mar 22, 2009 at 6:52 PM | Permalink

      Re: Hu McCulloch (#106), This is from Monaghan:

      AVHRR temperature records must be used with caution as they are only valid for clear-sky conditions, an issue that can be problematic in the coastal Antarctic regions where conditions are more often cloudy than not [e.g., Guo et al., 2003].

      As Ryan noted, the Comiso plots shown in #100 have a warming belt around the coast. This could be from inadequate cloud masking, but since these are trends it would mean the cloud cover has increased over time or the masking methodology has degraded over time. It is interesting that Monaghan’s recon has no coastal warming band.

      Regarding the +/- 10 deg C window (both directions), that also piqued my interest. It assumes that clouds are randomly hotter or colder than the surface temperature. Most of the papers I have found speak of the clouds being warmer than the surface. By windowing both a hot side and a cold side it might add a warm bias, although I can’t explain the time dependency that would be required to induce a warming trend.

      Re: Ryan O (#107), In my “calibration” I used a single offset for all cell locations for a specific month. For example, if the AVHRR at the South Pole was 10 deg warmer than the surface temperature, I offset every cell for that month by -10 deg C. The thinking was that if there was a systematic drift over time or an offset from one spacecraft to another, forcing the South Pole cells to equal the surface temp at the South Pole would remove the error. This assumes the error is equal at all measurement locations for a given month (by no means a given).

      • Posted Mar 22, 2009 at 8:01 PM | Permalink

        Re: Jeff C. (#108),

        Most of the papers I have found speak of the clouds being warmer than the surface. By windowing both a hot side and a cold side it might add a warm bias. Although I can’t explain any time dependancy that would be required to induce a warming trend.

        What Jeff is saying is that the offset could be warm without creating a trend, because the algorithm as described applies equally to the early data and the late data. I also have been thinking about this.

        Here’s the question and answer I had with Gavin last month.

        Gavin, Please describe the meaning of climatology in this statement in methodologies. Is it meaning RegEm reconstructions of the temperature stations?

        “We make use of the cloud masking in ref. 8 but impose an additional restriction that requires that daily anomalies be within a threshold of +/-10 C of climatology.”

        [Response: The average temperatures for that month over the whole satellite record. Nothing to with the reconstruction. – gavin]

        To me, if the daily anomaly is clipped according to +/- 10C from the monthly average it must be an iterative process since each clip changes the average. Perhaps that’s why we can’t have the data yet, they’re still iterating… 😉

        Anyway, good stuff Jeff. Redoing the trend according to the pole station makes sense. It also makes me think after some time with this data that there was an undisclosed recalibration step to the TIR data. Another set of code we need to see.
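        .
        If the clipping really is iterative, my guess is it looks something like this in R (a sketch of the procedure as I read it, not their code):

          # Drop daily values more than 10 C from the monthly climatology,
          # recompute the climatology from the survivors, and repeat until the
          # mask stops changing. One grid cell, one calendar month.
          clip.to.climatology <- function(x, window = 10, max.iter = 100) {
            keep <- rep(TRUE, length(x))
            for (i in seq_len(max.iter)) {
              clim     <- mean(x[keep])
              new.keep <- abs(x - clim) <= window
              if (identical(new.keep, keep)) break   # converged
              keep <- new.keep
            }
            x[keep]
          }
          clip.to.climatology(c(-42, -35, -33, -31, -30, -12))  # toy example

        Note that in the toy example a value dropped on one pass gets re-admitted on a later pass as the climatology moves, which is exactly why the rule is ambiguous without the code.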

      • Ryan O
        Posted Mar 23, 2009 at 7:17 PM | Permalink

        Re: Jeff C. (#108),
        .

        In my “calibration” I used a single offset for all cell locations for a specific month. For example, if the AVHRR at the South pole was 10 deg warmer than the surface temperature, I offset every cell for that month by -10 deg C. The thinking was that if there was a systemic drift over time or offset from one spacecraft to another, forcing the South pole cells to equal the surface temp at the south pole would remove the error. This assumes the error is equal at all measurement locations for a given month (by no means a given).

        .
        I thought that’s what you meant. If that is what Comiso did, the problem is this: why would the offsets change from month to month? If the ground station and the satellite are measuring the same quantity, he should be able to apply a single offset for the entire series. If they are not measuring the same quantity, then he has to find the expression to transform one to the other . . . and then apply a single offset (if necessary).

  53. Posted Mar 23, 2009 at 8:34 PM | Permalink

    I’ve been running some plots of the AVHRR relative to the surface station data at the same coordinates. Visually the comparison is interesting.

    Know Your Data

  54. Hu McCulloch
    Posted Mar 24, 2009 at 7:43 AM | Permalink

    RE Jeff Id, #111,
    Interesting plots, Jeff.

    You give plots for both 0200, “night”, and 1400, “day”. Are these local times or Greenwich times? If the latter, they would not necessarily correspond to day and night, even in temperate or tropical latitudes. But below the Antarctic Circle, even local “day” and “night” do not necessarily indicate whether or not the sun is up, but merely which direction the sun (or moon) is shining from.

    • Posted Mar 24, 2009 at 8:47 AM | Permalink

      Re: Hu McCulloch (#112),

      Hu, This is from the NSIDC

      This data set consists of AVHRR retrievals of surface and cloud properties as well as radiative fluxes for the period 1982 – 2004 over the Arctic and Antarctic at a 25 km resolution. The image times are 1400 and 0400 (Arctic) or 0200 (Antarctic) local solar times. Results are calculated on a twice-daily basis, but only monthly mean images and area-averaged values are currently online.

      You’re right though. Perhaps I should just say the times rather than the day/night description.

      For those who want to read more, the data set is here.

      http://stratus.ssec.wisc.edu/products/appx/appx.html

  55. RomanM
    Posted Mar 24, 2009 at 11:23 AM | Permalink

    I have also been looking at the AVHRR data. Using an adaptation of a nice simple script of Jeff’s for viewing the data for each month sequentially on a map, I found some strange patterns in the data.

    In particular, I took the 1400 data set and created anomaly series in the usual manner by subtracting monthly means for each grid value. No other manipulation was done. When plotted, the values for March from 1982 to 2000 display spiral-shaped “arms” emanating from the South Pole. For example:

    (If the plot doesn’t post properly, it can be found here).

    I didn’t notice any such patterns in any other month.
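    .
    For reference, the anomaly step is just this (an R sketch; grid is a placeholder for the months x cells matrix):

      # Subtract each grid cell's own monthly mean. Rows are months in
      # calendar order (Jan, Feb, ...); columns are grid cells.
      monthly.anomalies <- function(grid) {
        month <- rep_len(1:12, nrow(grid))
        for (m in 1:12) {
          rows <- which(month == m)
          grid[rows, ] <- sweep(grid[rows, , drop = FALSE], 2,
                                colMeans(grid[rows, , drop = FALSE]), "-")
        }
        grid
      }

    So the spirals are in the data itself, not an artifact of the anomaly calculation.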

  56. Steve McIntyre
    Posted Mar 24, 2009 at 11:59 AM | Permalink

    Hi, guys. We really need to set up a wiki to host the various retrieval scripts for an individual article. Pending that, I’d be happy to keep them in CA/scripts/steig if people wish to send me relevant scripts for uploading.

  57. Hu McCulloch
    Posted Mar 24, 2009 at 12:15 PM | Permalink

    Re #114-116,
    Steve — Welcome back!
    Roman — It’s really odd that there are 14 arms. 12 or 16 might easily arise from some data processing quirk, but why 14??

    • Posted Mar 24, 2009 at 12:34 PM | Permalink

      Re: Hu McCulloch (#117),

      The arms are from orbital passes (a polar orbiter completes roughly 14 orbits per day, which would explain the count); I also counted, expecting 12 or 24, but nope. It makes sense though. I don’t know why the NSIDC kept the data for this month, but they discuss the effects at the link above.

      Re: Steve McIntyre (#116), Glad you’re back Steve, it sounds like it was a great trip.

      • RomanM
        Posted Mar 24, 2009 at 1:15 PM | Permalink

        Re: Jeff Id (#118),

        Most of the March pictures display a similar pattern, not just a single month. If this is from orbital passes, then the orbit positions must be pretty static from day to day to produce such well-defined patterns from the daily averages.

      • Kenneth Fritsch
        Posted Mar 24, 2009 at 1:17 PM | Permalink

        Re: Jeff Id (#118),

        Those whirls make for a pretty picture. Can we call it a work of artifact?

    • Steve McIntyre
      Posted Mar 24, 2009 at 1:11 PM | Permalink

      Re: Hu McCulloch (#117),

      Hu, you must be teasing us with this question as the answer is really quite obvious.

      Fourteen-arm, Four legs, & ten faces Heruka Chakrasamvara. url

      In the center of this grand mandala, fourteen-armed, ten-headed, blue-complexion Heruka is standing in the warrior posture on a throne. His expression is wrathful and he embraces his consort with his two principal hands. url

      Santa Claus may live at the North Pole, but obviously Heruka Chakrasamvara (blue complexion, 14 arms) lives at the South Pole.

  58. Jeff C.
    Posted Mar 24, 2009 at 3:26 PM | Permalink

    Welcome back Steve!

    Re: RomanM (#120), I did not realize that the whirl pattern shows up only in March. It also seems to appear only in the 1400 data set; I have not seen it at all in the 0200 data set.

    There is a discussion of a similar phenomenon on this page here, with an example copied below. It is caused by contamination of one data channel with that of another channel. NOAA-16 from 2001 to 2005 is listed as the only time period affected, so this problem may be similar but presumably doesn’t have the same cause. I think Jeff Id is correct that the arms correspond to orbital passes.

    Figure 1. Northern Hemisphere Composite at 1400 Hours Showing Patches of Bad Data

    The page linked above has an excellent summary of known data problems and the instances of missing data.

  59. Hu McCulloch
    Posted Mar 24, 2009 at 4:18 PM | Permalink

    RE Steve, #119 —
    A prime example of what we’ve been missing for the last 2 weeks!
    This accounts for the blue complexion at the center of the picture as well…

  60. Steve McIntyre
    Posted Mar 26, 2009 at 3:00 PM | Permalink

    Here’s a 14-armed image showing satellite paths.

    • RomanM
      Posted Mar 26, 2009 at 3:14 PM | Permalink

      Re: Steve McIntyre (#124),

      Does this produce a time of observation problem? How would they handle that?

  61. John Ritson
    Posted Apr 27, 2009 at 7:17 AM | Permalink

    Can anyone guess where I found this simple rule of thumb?

    “5) How can you tell whether you have included enough PCs?

    This is rather easy to tell. If your answer depends on the number of PCs included, then you haven’t included enough.”

  62. Hu McCulloch
    Posted Apr 28, 2009 at 1:55 PM | Permalink

    RE Bender, #128,
    John Ritson’s quote strikes me as very pertinent. The original question was, “Why did Steig use a cut-off parameter of k = 3?”
    The answer, as found by Jeff, Jeff, Ryan and Roman, is shaping up to be that it was because k > 3 gives a very different, undesired answer. (I haven’t checked their progress lately, but they were going strong when I last looked.)
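    .
    The rule is easy to operationalize: recompute the answer at successive values of k and look at the differences. A self-contained toy in R (synthetic data, echoing the sketch upthread):

      # If the answer (here, a mean trend) still moves as k increases, not
      # enough PCs were included. Synthetic data only.
      set.seed(3)
      X  <- outer(seq(0, 1, length.out = 200), rep(0.5, 30)) +
            matrix(rnorm(200 * 30), 200, 30)
      sv <- svd(scale(X, scale = FALSE))
      answer <- sapply(1:8, function(k) {
        Xk <- sv$u[, 1:k, drop = FALSE] %*% diag(sv$d[1:k], k, k) %*%
              t(sv$v[, 1:k, drop = FALSE])
        coef(lm(rowMeans(Xk) ~ seq_len(200)))[2]
      })
      round(diff(answer), 5)   # near-zero differences = "enough" PCs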

    • bender
      Posted Apr 28, 2009 at 7:40 PM | Permalink

      Re: Hu McCulloch (#129),
      That’s your interpretation of what he said. I interpreted his comment slightly differently. This IS Dr. Ritson, correct? I think he’s snidely implying that it’s – duh – obvious why 3 was chosen: because with k greater than 3 the answer doesn’t change “significantly”. If that is what he’s saying, I’m asking him to confirm and clarify with a follow-up.

  63. Posted Apr 28, 2009 at 10:48 PM | Permalink

    Re #130,
    I don’t recall any past encounters with Ritson, but it seems to me that he and Ryan are referring to the irony that Gavin and Caspar, in their attempted RC put-down of skeptics, have put their finger on the essential problem with Steig, Mann & co.

  64. John Ritson
    Posted Apr 28, 2009 at 11:15 PM | Permalink

    Re #131
    Hu,
    You have read the post exactly as I intended.

  65. Steve McIntyre
    Posted Apr 29, 2009 at 9:40 PM | Permalink

    I agree entirely with the above observation. I realize that it’s hard to locate things in the sprawl of Climate Audit, but this issue was discussed at some length here http://www.climateaudit.org/?p=5401 , with a number of references to Ammann and Wahl on this point. This post raised the issue, and Jeff Id used its graphic in his first analyses. The collective analysis has advanced since late Feb, but this post grabbed the issue pretty well in retrospect.

    I’m glad that others appreciate the irony. The point goes well beyond a realclimate post. The realclimate post previewed Wahl and Ammann 2007, which was relied upon by IPCC to assert that Mann’s various errors didn’t “matter”, a point contested by Wegman. There’s an interesting backstory on IPCC which I’ll get into some time. The wording on this point in the final report was never presented in a draft version to reviewers; the drafts said that the impact of these issues was “unclear”. The only reviewer who objected to this phrase on the record was Mann himself. Ammann lobbied behind the scenes and, against IPCC regulations, IPCC has refused to release Ammann’s secret review comments. The claim that Mann’s errors didn’t “matter” was inserted in the published version without ever having been put to third-party reviewers. The Review Editor said that he destroyed all his comments on this point.

    • Posted Apr 30, 2009 at 9:02 AM | Permalink

      Re: Steve McIntyre (#134),

      The Review Editor said that he destroyed all his comments on this point.

      Is that anonymous or anomalous review?

    • Pat Frank
      Posted Apr 30, 2009 at 12:09 PM | Permalink

      Re: Steve McIntyre (#134), Yet one more instance of the pattern of dishonesty typifying the IPCC.

  66. Posted May 25, 2009 at 12:32 PM | Permalink

    RE # 126-133, 135, it appears that the John Ritson of #126 is being confused by Bender and perhaps some others here with the D.M. Ritson who works with Mann and Ammann etc.

    In fact, John R. has turned up an excellent statement by Caspar and Gavin explaining, in a different context, why Steig ’09 is so wrong.

4 Trackbacks

  1. By Don’t Eat the Fish « The Air Vent on Mar 1, 2009 at 4:58 AM

    […] Above: a graph from Steve McIntyre of ClimateAudit where he demonstrates how “K=3 was in fact a fortuitous choice, as this proved to yield the maximum AWS trend, something that w…“ […]

  2. […] (2)  From Josefino Comiso (NASA Goddard Space Flight Center), in reply to request for data by McIntyre on January 23 (source): […]

  3. By Steig’s “Tutorial” « Climate Audit on Jan 4, 2011 at 12:52 AM

    […] I observed earlier this year here, if you calculate eigenvalues from spatially autocorrelated random data on a geometric shape, you […]

  4. […] the basis that flawed use of statistical analysis also lands authors in this circle, we might find a certain Dr Eric Steig here, still arguing his […]