FOI: The “Final” Answer on Jones et al 1990

I wrote again on Apr 17, 2007 about my FOI request, observing that part (B) had not been answered: the identification of the stations used as comparanda in the calculations of Jones et al 1990.

Thank you for your courtesy and attention in this matter, which has successfully resolved part (A) of my request. However part (B) remains outstanding and I re-iterate my previous request for this information:

A) the identification of the stations … for the following three Jones et al 1990 networks:
1. the west Russian network
2. the Chinese network
3. the Australian network

B) identification … of the stations used in the gridded network which was used as a comparandum in this study

Thank you for your attention.
Regards, Steve McIntyre

I received the following reply:

In your email of 17 April 2007, you re-iterated your request from your email of 12 March 2007, to see

“B) identification … of the stations used in the gridded network which was used as a comparandum in this study”

I have been in conversation with Dr. Jones and have been advised that, in fact, we are unable to answer (B) as we do not have a copy of the station data as we had it in 1990. The station database has evolved since that time and CRU was not able to keep versions of it as stations were added, amended and deleted. This was a consequence of a lack of data storage comparable to what we have at our disposal currently.

I have been advised that the best equivalent data available is within the current version of CRUTEM3(v) or CRUTEM2(v). The latter is still available on the CRU web site, though not updated beyond 2005.

These latest versions are likely different from what was used in 1990. Australia and China have both released more data since then – it is likely that much of this was not digitized in 1990. Dr. Jones acknowledges that the grid resolution is now different, but this is again due to greater disk storage available.

The details of our updating of the raw station data is discussed in the following article:
Jones, P.D. and Moberg, A., 2003: Hemispheric and large-scale surface air temperature variations: An extensive revision and an update to 2001. J. Climate 16, 206-223.

This is, in effect, our final attempt to resolve this matter informally. If this response is not to your satisfaction, I will initiate the second stage of our internal complaint process and will advise you of progress and outcome as appropriate. For your information, the complaint process is within our Code of Practice and can be found at:

1.2750!uea_manual_draft_04b.pdf

Yours sincerely

David Palmer
Information Policy Officer
University of East Anglia

57 Comments

  1. cbone
    Posted Apr 19, 2007 at 11:07 AM | Permalink

    Looks like a fair and honest reply to me. It would seem to be the electronic version of ‘the dog ate my homework’.

    Or to put it another way, it is a ‘plausible’ excuse for why they won’t give you the data.

    Once again I am stunned that a key paper, relied upon by the IPCC for a critical adjustment to the temperature record is essentially unverifiable. This isn’t the way I was taught how science works.

  2. Reid
    Posted Apr 19, 2007 at 11:11 AM | Permalink

    This is a variant of “the dog ate my homework” defense.

    Does the IPCC have a stated policy on non-repeatable “peer reviewed” papers?

  3. Roger Dueck
    Posted Apr 19, 2007 at 11:16 AM | Permalink

    Reid, their policy appears to be “we accept their word for it and we’ve moved on”.

  4. Steve Sadlov
    Posted Apr 19, 2007 at 11:23 AM | Permalink

    So therefore, it would not be academically honest to use it.

  5. Willis Eschenbach
    Posted Apr 19, 2007 at 11:32 AM | Permalink

    That’s a crock. The list you are looking for is not more than a couple of pages, it wasn’t heaved to make computer storage room.

    The main problem is, they seem to think that you are looking for the data, rather than a list of the stations used. They say:

    I have been in conversation with Dr. Jones and have been advised that, in fact, we are unable to answer (B) as we do not have a copy of the station data as we had it in 1990. The station database has evolved since that time and CRU was not able to keep versions of it as stations were added, amended and deleted. This was a consequence of a lack of data storage comparable to what we have at our disposal currently.

    I have been advised that the best equivalent data available is within the current version of CRUTEM3(v) or CRUTEM2(v). The latter is still available on the CRU web site, though not updated beyond 2005.

    These latest versions are likely different from what was used in 1990. Australia and China have both released more data since then – it is likely that much of this was not digitized in 1990. Dr. Jones acknowledges that the grid resolution is now different, but this is again due to greater disk storage available.

    The details of our updating of the raw station data is discussed in the following article:
    Jones, P.D. and Moberg, A., 2003: Hemispheric and large-scale surface air temperature variations: An extensive revision and an update to 2001. J. Climate 16, 206-223.

    Note that they go on and on about releasing more data, that the data was not digitized in 1990, and grid resolution, and updating the raw data … you need to write them back and clarify that what you have requested is a list of the stations used, and not the data itself.

    w.

  6. Gary
    Posted Apr 19, 2007 at 11:55 AM | Permalink

    And ethically would require publication of currently held data with all of the metadata any replicator/auditor of methods would need.

  7. KevinUK
    Posted Apr 19, 2007 at 12:17 PM | Permalink

    Steve M

    Surely with this reply and the problems you’ve uncovered to date there must now be a bona fide case for completely ignoring this original work by Jones et al and for funding an up-to-date re-analysis of the UHI effect. I recommend that such a re-analysis be done by someone who does not have a vested interest in its outcome, i.e. someone who currently has no connections with paleoclimatology (e.g. Tyndall Centre). Do you think Edward Wegman would be interested? If the UK Research Councils can find ⡲ million to fund research into ‘managing uncertainties in complex models’ (see Unthreaded #8) then surely they can afford to fund this work?

    KevinUK

  8. Steve McIntyre
    Posted Apr 19, 2007 at 12:35 PM | Permalink

    #5. Willis, it is hard for them to keep their stories straight. A list of stations from that period is about 250 KB in size, which obviously fit comfortably onto computers in 1990. I’m sure that the station list was backed up onto a diskette. They’re just trying to be annoying.

  9. John Lang
    Posted Apr 19, 2007 at 12:39 PM | Permalink

    The foremost authority on climate at the University of East Anglia had to explain to the University’s information policy officer that he did not archive the data used in the most important and most cited paper he has ever published in his long and important academic career.

    No wonder they are trying to ease him out.

    I’m sure the information officer is now telling the story to other academics in the professor’s lounge and the town pub. Eyebrows will be raised in olde english fashion.

  10. KevinUK
    Posted Apr 19, 2007 at 1:05 PM | Permalink

    #8 Steve M

    Once again as is your choice you are being polite. IMO Phil Jones is at best incompetent and at worst is deliberately avoiding disclosure of this information because he knows that you are about to blow a big hole in his boat below the water-line. He knows that as a result his AGW boat will no longer float and will inevitably sink, preferably to the bottom of the Marianas trench where it belongs.

    KevinUK

  11. Stan Palmer
    Posted Apr 19, 2007 at 1:07 PM | Permalink

    This was a consequence of a lack of data storage comparable to what we have at our disposal currently.

    Tape storage was widely available and used for archival purposes from the 1970s. Archival records were undoubtedly being kept for student results back then. If a student from East Anglia requested a transcript of his/her marks, I wonder what the reply would be.

  12. DW
    Posted Apr 19, 2007 at 1:31 PM | Permalink

    The local MP for Norwich is Ian Gibson, who was Chair of the Commons Science and Technology Committee. As the UK Government is taking AGW fully on board, and Mr Gibson has been a thorn in the side of the Blair Government (he was sacked from the Chair of the Committee), I wonder if he might investigate…

    Oh, and he used to be Dean of Science at the University of East Anglia too…

  13. richardT
    Posted Apr 19, 2007 at 1:53 PM | Permalink

    I strongly suspect that a survey of scientists would reveal that many could not produce the raw data from a paper they published in 1990.

  14. Steve Sadlov
    Posted Apr 19, 2007 at 1:54 PM | Permalink

    Dr. Gibson’s web site:

    http://www.iangibsonmp.co.uk/

  15. Steve McIntyre
    Posted Apr 19, 2007 at 1:58 PM | Permalink

    Richard T, if the issue was merely the identification of the stations in the 1990 paper, it would be one thing. But Jones has also refused to identify the stations in the present HadCRU data set even pursuant to a FOI request. Would most scientists do that?

  16. bernie
    Posted Apr 19, 2007 at 2:19 PM | Permalink

    I think DW has the right idea. A well-asked question at question time in the House of Commons that links the British Government’s AGW policies to the missing Jones data would put the proverbial cat among the proverbial pigeons.
    The target would of course be future funding for UEA. The more I think about it, the more I think a direct conversation between Steve and Dr. Gibson may be in order.

  17. 2dogs
    Posted Apr 19, 2007 at 2:44 PM | Permalink

    The good thing about this reply is they don’t contest the legitimacy of your request; they might be estopped from claiming otherwise later.

    The reply should be to request a list of stations in the current database, any information they have as to when they were added, plus any records they may have of any changes or deletions to the database since 1990.

  18. per
    Posted Apr 19, 2007 at 2:52 PM | Permalink

    for the record, I am with RichardT. I don’t think it unreasonable for 17 year old data to be lost, incomprehensible, etc., if it wasn’t archived (published) at the time. I don’t think it unreasonable that scientists just chuck out their original data/ lab note books after a decade, unless they have good reason.

    The UK research councils have a code of practice for data that says it must be archived for seven years. I think that is reasonable. It is pretty unreasonable of people to deny you access within that timespan.

    17 year old data ?
    nope !
    per

  19. bernie
    Posted Apr 19, 2007 at 2:59 PM | Permalink

    per:
    The problem with such a position is that policy makers are relying on something critical, yet it cannot be checked. The answer I guess is to redo the work. But in the mean time …we will trust Prof. Jones….

  20. Richard deSousa
    Posted Apr 19, 2007 at 3:06 PM | Permalink

    With all those weasel words David Palmer used he must be a lawyer… LOL

  21. Steve McIntyre
    Posted Apr 19, 2007 at 3:11 PM | Permalink

    #18, per, I’m dubious. We’re talking 240K of data presumably on a diskette. I don’t believe that Jones would go and throw out his old diskettes without knowing what’s on them. I have old diskettes too. I wouldn’t just throw them out without knowing what’s on them. It’s easier to copy them onto a permanent file labelled “old” or whatever without trying to figure out what’s on them.

    In any event, I obtained a listing of 1994 stations from Jones in 2003. So if we’re talking about approximating the 1990 roster, this reply would justify the use of that roster (as opposed to the 2006 roster as they proposed). If even this 1994 roster is presently unavailable to Jones, then it’s fortunate that I’ve retained the information, as Jones would then have disposed of the information within the last few years in the face of at least one request.

  22. bruce
    Posted Apr 19, 2007 at 3:22 PM | Permalink

    Maybe I am paranoid, but I worry that what is going on here is that original data is being ‘lost’ so that the only data on the record is the ‘adjusted’ data that presents a revisionist view of world temperature history.

    Remember: “We must get rid of the Medieval Warm Period”

    And as observed in numerous posts, they seem to be getting rid of the inconveniently warm 1930s.

    “Trust British Climate Scientists? Sure Can!”

    For non-Australians, that comment reflects an ad for British Paints that went:

    “Trust British Paints? Sure Can!”

  23. richardT
    Posted Apr 19, 2007 at 3:37 PM | Permalink

    #22 Bruce

    You are being paranoid. There is no risk of the Medieval Warm Period being forgotten. Just this week there were several talks and posters discussing the MWP, its cause, effect, timing and spatial extent at the EGU.

  24. Don Keiller
    Posted Apr 19, 2007 at 3:42 PM | Permalink

    re #12 and #16. Steve, I live in the UK and I am an academic. As a non-UK citizen you might have difficulty in getting an MP to ask a Parliamentary question on your behalf. I on the other hand would have a better chance. If you want to compose a suggestion to me, I would pass it on as best I could.

  25. jaye
    Posted Apr 19, 2007 at 3:56 PM | Permalink

    Losing data for a 17 year old paper is not acceptable considering the weight given to the paper’s conclusions. The only true recourse is to consider the results null and void, then redo the study. Another example that supports this theory: Academics = amateurs.

  26. Reid
    Posted Apr 19, 2007 at 4:21 PM | Permalink

    For Jones’ claim that the data is missing to be credible, one has to assume implicitly that he is a computer incompetent. As Steve M. states, it is a small core dataset (est. 240K). It should be available either in hard copy format somewhere on the planet or on the CP/M-based Kaypro portable PC that Jones was using in 1990 and is still using today as his remote terminal device for FTP internet access. This is a sarcastic ad hom attack but the loss of the data that Steve is requesting is not plausible.

  27. Neil Fisher
    Posted Apr 19, 2007 at 4:24 PM | Permalink

    Is it reasonable to archive a complete dataset that may be quite large (at least in terms of storage available at the time) for 17 years? I don’t think it is; that would almost be like asking them to supply any papers they referenced.

    Is it reasonable to archive references to what data was used and where/how it was obtained for 17 years? I don’t believe that would be unreasonable – in fact, I would have thought that such information would be part of the original paper (although perhaps not published in full).

    Steve, I think you should reiterate that it is not the actual data that you require, but rather a complete description of what data was used.
    If they are unable to supply the data after 17 years, then fair enough. If they are unable to tell you what they did to what data after 17 years, then the paper should be retracted – it seems to me that if they are unable to specify what they did and where the data came from, even after all this time, then they would not have been able to do so at the time of publication, either. And *that* means their paper is, and always has been, unable to be replicated. In short, if that is the case, it ain’t science!

  28. Nicholas
    Posted Apr 19, 2007 at 4:39 PM | Permalink

    The bottom line is, I think he should have published the list of stations at the time in the supplemental material provided with the paper, or perhaps within the paper itself. If I remember correctly, approx. 50 pairs of stations were analyzed? If he had done so, then it would not be possible to lose the list, as it would be “archived” along with the paper. The actual temperature data itself is another issue; that should have been permanently archived too, however its loss is slightly less critical, since one can at least try to replicate his conclusions with modern data, as long as one has that list.

    If I were Steve, I would attempt to replicate his conclusions using my own list of stations. If I couldn’t I would publish that result and use it as proof that his original result is invalid, and claim that unless he can come up with a list of stations and/or data to prove otherwise, that conclusion would have to stand. Of course, that would take significant time and effort, but UHI is an important issue to get right, so it could be worth it.

  29. jae
    Posted Apr 19, 2007 at 4:43 PM | Permalink

    This is a sarcastic ad hom attack but the loss of the data that Steve is requesting is not plausible.

    Does this mean you are 90 percent confident?

  30. Reid
    Posted Apr 19, 2007 at 4:59 PM | Permalink

    Re #29: “Does this mean you are 90 percent confident?”

    I’m 90 percent confident Jones has upgraded his original 300 baud acoustic coupled modem.

  31. Stan Palmer
    Posted Apr 19, 2007 at 5:18 PM | Permalink

    re 21

    Long before 1990, institutions were backing up user account data to tape on a daily basis. In the place where I worked, daily data would be retained for a month or so and then be culled so that only a weekly archive would be retained. This data could be used to restore important data that had been lost or corrupted. We had one inveterate hacker who set a trap that erased an entire directory. On another occasion, a self-important manager took it upon himself to delete an entire user account containing very important programs. This data was easily recovered from the retained archive. This was in the mid to late 80s.

  32. bernie
    Posted Apr 19, 2007 at 5:21 PM | Permalink

    At issue is not when the data was collected but when the paper that was based on the data became “infamous”. Given that was in the 90s, the misplacing of the data is even more suspect, IMO.

  33. Reid
    Posted Apr 19, 2007 at 5:28 PM | Permalink

    Hey Phil Jones,

    I know you’re reading this. Hand over the data. We know you have it.

    Do we do this the easy way, or do we take this to the NAS like the Hockey Stick?

    Phil, it’s over. You had a good run. IPCC chapter writer and all but the jig is up. You’ve been Climate Audited!

    NEXT!!!

  34. Don Casada
    Posted Apr 19, 2007 at 8:15 PM | Permalink

    Re: 18.

    Baloney.

    I still have raw and processed data used in studies at Oak Ridge National Laboratory on stuff that few are interested in – pump and valve mechanical failure data from the late 80’s and early 90’s. If others were relying heavily on that work, instead of just having two separate copies here at my house, I’d have it archived at multiple, physically separate sites to ensure that it wasn’t lost.

  35. Steve McIntyre
    Posted Apr 19, 2007 at 8:26 PM | Permalink

    The other possibility is that Jones may have sent the information to the U.S. – after all, he says that he’s sent all his present data to GHCN. So even if Jones has lost or destroyed the data, maybe it still exists in the U.S. It will be amusing to try to find out. Maybe we can also find out what site lists Phil hasn’t destroyed. It’s not as though the problem is going to go away just because they’re playing silly.

  36. Al
    Posted Apr 19, 2007 at 9:33 PM | Permalink

    I have data by the mountain from 1987 (my first year as a researcher) _and_ the duplicate copy of every page of my lab notebooks. The original lab notebooks are ‘on file’ with my thesis advisor, and a third set is archived at his house. There were notebooks on that same shelf from 1957 – and I’d assume they were kept. My advisor would regard the loss of any as both incompetence and professional misconduct. I tend to agree.

    There are also full “appendices” of all of our publications archived. As in, how exactly the data analysis was performed; the exact versions of Excel, DOS, and QBASIC used were noted down, for crying out loud.

    As far as I’m concerned, “losing data” is the equivalent of leaving the flipping scalpel inside the incision.

    This is for data in physical chemistry in an area where no one is likely to contest _anything_. Use more advanced techniques and refine the data – sure. But the ‘theory’ was all pretty set in stone.

  37. Brad Culver
    Posted Apr 19, 2007 at 10:06 PM | Permalink

    1990: Motorola introduces the 68040 microprocessor.

    1990: IBM announces its RISC Station 6000 family of high performance workstations.

    1990: Digital Equipment introduces a fault-tolerant VAX computer.

    1990: Cray Research unveils an entry-level supercomputer, the Y-MP2E, with a starting price of $2.2M.

    1990: Microsoft introduces Windows 3.0.

    1990: Lotus wins its look and feel suit against Paperback Software’s spreadsheet program.

    1990: IBM ships the PS/1, a computer for consumers and home offices.

    1990: IBM announces the System 390 (code name Summit), its mainframe computer for the 1990s.

    1990: Microsoft’s fiscal year revenue ending 6/30/90 exceeds $1B.

    1990: NCR abandons its proprietary mainframes in favor of systems based on single or multiple Intel 486 and successor microprocessors.

    1990: Apple introduces its low-end Macintoshes: The Classic, LC and IISI.

    1990: Intel launches a parallel supercomputer using over 500 860 RISC microprocessors.

    1990: Sun Microsystems brings out the SPARCstation 2.

    1990: Microsoft along with IBM, Tandy, AT&T and others announced hardware and software specifications for multimedia platforms.

    1990: The first SPARC compatible workstations are introduced.

    I remember back in 1990 most universities were transitioning from VAX systems to Unix “supercomputers” for serious computing. PC banks were available – PS/1’s were all over campus. Disk space in 1990 was not a major issue. Maybe the list of sites and their data is on an old 9-track tape somewhere?

  38. Steve McIntyre
    Posted Apr 19, 2007 at 10:17 PM | Permalink

    #36. This is not the first instance of “lost” data. Crowley lost the data as used in Crowley and Lowery 2000 other than a smoothed version and wasn’t able to recall where he got the digital versions. He blamed the rigors of the move from Texas A&M to Duke and thought it was unreasonable for me to have expected the data to have survived such an arduous journey.

  39. Willis Eschenbach
    Posted Apr 20, 2007 at 12:15 AM | Permalink

    As a quick test of the size of the information that we are talking about, I made a 2,000 station dummy copy of the China station identification document from HadCRUT. That’s about the number of stations in the entire GISS network. The dummy list contains all of the information: latitude, longitude, WMO number, and all the rest.

    75 kilobytes, uncompressed … and zipped? (ZIP compression was invented in 1986 …)

    4 kilobytes.

    w.
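The size test described in comment 39 can be roughly reproduced. What follows is only a sketch under stated assumptions: the record layout (WMO-style number, latitude, longitude, name), the helper names `build_dummy_station_list` and `roster_sizes`, and the synthetic values are all made up for illustration and are not the HadCRUT station document format. The compression uses Python's standard `zlib` (DEFLATE) rather than the ZIP tool used in the comment, so the compressed size will not match the figure quoted there.

```python
import zlib

FLOPPY_360K_BYTES = 360 * 1024  # nominal capacity of a 360 KB diskette (assumed yardstick)


def build_dummy_station_list(n_stations=2000):
    """Return a fixed-width ASCII roster with one made-up record per station."""
    lines = []
    for i in range(n_stations):
        wmo_id = 10000 + i                        # hypothetical WMO-style number
        lat = -90.0 + (i * 37 % 18000) / 100.0    # hypothetical latitude
        lon = -180.0 + (i * 91 % 36000) / 100.0   # hypothetical longitude
        name = f"STATION_{i:04d}"                 # hypothetical station name
        lines.append(f"{wmo_id:5d} {lat:7.2f} {lon:8.2f} {name:<14}\n")
    return "".join(lines).encode("ascii")


def roster_sizes(n_stations=2000):
    """Return (uncompressed_bytes, compressed_bytes) for the dummy roster."""
    raw = build_dummy_station_list(n_stations)
    packed = zlib.compress(raw, 9)  # maximum DEFLATE compression level
    return len(raw), len(packed)


if __name__ == "__main__":
    raw_bytes, packed_bytes = roster_sizes()
    print(f"uncompressed: {raw_bytes / 1024:.1f} KB")
    print(f"compressed:   {packed_bytes / 1024:.1f} KB")
    print(f"fits on a 360 KB diskette: {raw_bytes < FLOPPY_360K_BYTES}")
```

Because the dummy records are synthetic and highly regular, the compressed figure says little about a real roster; the uncompressed size is the more meaningful number when judging whether a station list of this length would have strained 1990-era storage.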

  40. hadenough
    Posted Apr 20, 2007 at 12:46 AM | Permalink

    You know what?

    I am having great trouble attributing any credibility at all for the ‘work’ put out by Dr Phil Jones, Dr Michael Mann, the Hockey Team, RC, the IPCC, Algore, Tim Lambert, et al. There is a never-ending parade of:

    – ‘lost data’
    – flagrant disregard of REQUIRED archiving policies
    – acceptance of poor practice by supposedly respectable scientific journals
    – unwillingness to allow replication
    – refusal to access expert competency, eg in statistics, signal processing etc.
    – refusal to respond to reasonable questions
    – obfuscation
    – reliance on ‘peer review’ as a gold quality seal of approval when it is clear that the ‘peer reviewed’ documents were looked over by mates, and never subjected to expert independent scrutiny.
    – acquiescence and acceptance by climate scientists, even defense, of clearly dodgy practice
    – instead of balance, we see nothing but unquestioning support for AGW, and attacks for those asking reasonable questions.
    – refusal to challenge dodgy practice on the AGW side
    – use of patronising language in response to reasonable questions
    – totally unwarranted claims of ‘consensus’ of climate scientists when it is obvious that the only possible consensus is that there is not a consensus.
    – use of term ‘denialists’ to denigrate sincere questioners
    – intemperate, exaggerated language as acknowledged by Algore, Stephen Schneider
    – revisionism of the past – remove the MWP, the hot decade of the 30s
    – ‘adjustments’ to the 20thC record accounting for almost all of the claimed warming

    How can it be that the MSM and the bulk of scientists just accept this incredible state of affairs? It is truly extraordinary to me.

    And to think that we are being asked to make massive economic changes and dislocations costing perhaps trillions of dollars, all based on an elusive tissue of what look increasingly to me to be lies.

  41. Francois Ouellette
    Posted Apr 20, 2007 at 7:02 AM | Permalink

    The key thing here is not so much the data. It’s all about reproducibility. If you’d ask me for the raw data for an experiment that I, or especially one of my students, did back in 1990, I might have a bit of trouble locating it, although I have kept most of my lab books (not all, because they sometimes belong to your employer, and you can’t leave with them). But the details of the experiment are all in the paper, and if someone needed to know more, I would gladly help them, even if it was to refute my results. In Jones’ case, the “data” are not so much the “results” of the experiment, as its starting point. The list of stations would be the equivalent to the description of an experimental setup. So the station list is crucial in repeating the “experiment”. Not to have provided it at the time, and not to have kept it archived, shows a rather grave lack of professionalism.

    I know that if I had published such a paper, I would still have the station list somewhere because, as Willis has pointed out, this would not require any sort of massive storage. I mean, the fundamental principle behind a scientific publication is that the reader should be able to reproduce what you’ve done. If a reputed scientist such as Phil Jones, whose work was already relevant for IPCC reports back in 1990, has been so sloppy as to carelessly lose a list of stations used in an important paper, it speaks volumes about the credibility of such reports, and Phil Jones’ whole corpus of work.

  42. DaleC
    Posted Apr 20, 2007 at 8:22 AM | Permalink

    Re this article, which claims that Tony Blair and Angela Merkel are harassing George Bush to toe the consensus line, perhaps some US citizen should tell George to tell Tony that when his UEA scientists cough up with the raw data so that the conclusions can be properly tested and checked, then there might be something to discuss.

  43. Steve Sadlov
    Posted Apr 20, 2007 at 9:50 AM | Permalink

    RE: #37 – I was messing with workstations, DECs, SPARC 1s, etc, in 1990. I was using email, ftp’ing, and participating in newslists. Had a Mac at home. Got my first SPARC 2 in ’92 and at that point started to see a couple web sites. 1993 I got a laptop (a TI! those were the days…. LOL). By ’94, I was using an early primitive browser, etc ….

  44. Jim Edwards
    Posted Apr 20, 2007 at 10:24 AM | Permalink

    It’s ironic that alarmists keep tagging skeptics as being akin to Big Tobacco scientists denying the link between smoking and cancer.

    This series of exchanges over data [and how to interpret the NAS report…] reminds me of tobacco spokesman Nick Naylor’s debate with his son in the excellent movie Thank You For Smoking. Naylor tells his son he has ‘won’ their unwinnable debate over which ice cream is best, chocolate or vanilla. Naylor’s son [the skeptic] replies, “But you haven’t convinced me.”

    Naylor’s response, “I’m not after you; I’m after them…[the public]”

    Naylor [Jones, et al] appears to be winning at the moment.

  45. Reid
    Posted Apr 20, 2007 at 10:36 AM | Permalink

    I have a name for the phenomenon, McIntyre’s Certainty Principle.

    The greater the importance of the AGW study, the greater the certainty that the data isn’t available for auditing.

  46. Ken Fritsch
    Posted Apr 20, 2007 at 10:54 AM | Permalink

    Re: #41

    If a reputed scientist such as Phil Jones, whose work was already relevant for IPCC reports back in 1990, has been so sloppy as to carelessly lose a list of stations used in an important paper, it speaks volumes about the credibility of such reports, and Phil Jones’ whole corpus of work.

    “It speaks volumes about the credibility of such reports” and says that it is ok to be skeptical about such works. I think that Steve M’s digging, even without culminating in revealing the data sought, in these cases serves a bigger need, which is to be skeptical, and that need goes beyond any questions of motivation or scientific ethics and expectations.

    This process and data involving Jones must have been judged at the time to be something more important and far ranging in its implications than a single paper on climate, but the handling of it would indicate that its owners might have thought otherwise.

  47. John Hekman
    Posted Apr 20, 2007 at 11:23 AM | Permalink

    I still have the deck of punchcards with the regression model I used in my thesis in 1975.

    Want to replicate my results?

  48. Michael Jankowski
    Posted Apr 20, 2007 at 11:38 AM | Permalink

    Every piece of my relatively unimportant grad research in the early 90s could be found in multiple locations and formats. Some of it was absurdly redundant. I would have software-generated data on the computer, which would remain archived in that software file format. I’d hand-write the numbers into my lab book. Then I’d enter them into spreadsheets so that I could work with them – spreadsheets which would be on a network drive, at least one floppy disk, end up on my home computer, etc. When I got a new computer, that data was copied over. When I backup the hard drive every 6 months, it gets backed up.

    I’m freaking disorganized, but I’ve got dozens of copies of my grad research (both data from publishable and non-publishable experiments alike). How come I’m capable of handling what is essentially meaningless data for over a decade, and these top-level researchers with their important contributions to science are so incompetent?

  49. Willis Eschenbach
    Posted Apr 20, 2007 at 12:25 PM | Permalink

    As y’all may recall, I wrote back regarding the FOI request I had made for Phil Jones’ list of stations used for HadCRUT3. Unlike Steve M., I was not looking for 17 year old data, but current data. Here’s what I received today …

    Information Services Directorate
    University of East Anglia
    Norwich NR4 7TJ England
    Telephone 01603 456161
    Direct Dial
    01603 593523
    Fax 01603 591010
    Email foi@uea.ac.uk

    Mr. Willis Eschenbach
    HI 96743 USA
    20 April 2007

    Dear Mr. Eschenbach

    FREEDOM OF INFORMATION ACT 2000 – INFORMATION REQUEST
    (FOI_07-04)

    Further to your email of 14 April 2007 in which you re-stated your request to see

    “a list of stations used by Jones et al. to prepare the HadCRUT3 dataset… I am asking for:
    1) A list of the actual sites used by Dr. Jones in the preparation of the HadCRUT3 dataset, and
    2) A clear indication of where the data for each site is available. This is quite important, as there are significant differences between the versions of each site’s data at e.g. GHCN and NCAR.”

    In your note you also requested “the name and WMO number of each site and the location of the source data (NCAR, GHCN, or National Met Service)”,

    I have contacted Dr. Jones and can update you on our efforts to resolve this matter.

    We cannot produce a simple list with this format and with the information you described in your note of 14 April. Firstly, we do not have a list consisting solely of the sites we currently use. Our list is larger, as it includes data not used due to incomplete reference periods, for example. Additionally, even if we were able to create such a list we would not be able to link the sites with sources of data. The station database has evolved over time and the Climate Research Unit was not able to keep multiple versions of it as stations were added, amended and deleted. This was a consequence of a lack of data storage in the 1980s and early 1990s compared to what we have at our disposal currently. It is also likely that quite a few stations consist of a mixture of sources.

    I have also been informed that, as the GHCN and NCAR are merely databases, the ultimate source of all data is the respective NMS in the country where the station is located. Even GHCN and NCAR can’t say with precision where they got their data from as the data comes not only from each NMS, but also comes from scientists in each reporting country.

    In short, we simply don’t have what you are requesting. The only true source would be the NMS for each reporting country. We can, however, send a list of all stations used, but without sources. This would include locations, names and lengths of record, although the latter are no guide as to the completeness of the series.

    This is, in effect, our final attempt to resolve this matter informally. If this response is not to your satisfaction, I will initiate the second stage of our internal complaint process and will advise you of progress and outcome as appropriate. For your information, the complaint process is within our Code of Practice and can be found at:

    1.2750!uea_manual_draft_04b.pdf

    Yours sincerely

    David Palmer
    Information Policy Officer
    University of East Anglia

    Man, these guys are hysterical. They are currently producing and updating the HadCRUT3 temperature database every month, but they say they “do not have a list consisting solely of the sites we currently use” … so … do they just pick sites at random every month to update the database?

    I mean, when I was a kid on the cattle ranch, the cowboys used to say, “Partner, you can piss on my boots … but you can’t convince me it’s raining” …

    But then further along in the letter, they offer to send a list consisting solely of the sites they currently use … go figure.

    Of course, they won’t say where the data is coming from, claiming that ummm, well, the GHCN and NCAR folks don’t know either, and it’s all just too very hard, or something of that nature.

    So, first I’ll take them up on their offer of the list of stations, then I’ll deal with their pathetic excuses …

    This is better than the circus … the clowns are so much funnier.

    w.

  50. KevinUK
    Posted Apr 20, 2007 at 2:02 PM | Permalink

    Willis

    “This is better than the circus … the clowns are so much funnier.”

    This is what John A needs to appreciate about Numberwatch. As you’ve observed, just as at UEA CRU, the clowns are very significant at Numberwatch, and they (Numberwatch and UEA CRU) just wouldn’t be the same without them :-).

    KevinUK

  51. MrPete
    Posted Apr 20, 2007 at 5:18 PM | Permalink

    #39 Willis’ test is **entirely** reasonable. Yes, there is no way that Dr Jones’ data was too large to be archived.

    In 1990, I worked for a scientific/demographic research organization. We were jazzed to have a bookshelf with 30 10 MB portable cartridge disks — 300+ MB of data on a shelf!!! 😀

    How much could we store in that? An unbelievable amount of data (for the time). Textual measurement data is incredibly compressible. That’s how, in 1990, we were able to fit a detailed street address map of the entire USA on a single CD.

    This situation is quite simple actually:

    * Data were not properly archived

    * Therefore, a less-than-adequate number of backup copies were made (Dr Jones, if keeping the data only on his personal machine, was putting his entire career at risk, whether he knew that or not!)

    * Now, backups can’t be found. Time to delete all research based on that data, since IT WAS NEVER AUDITED OR PEER REVIEWED. That’s right, even if the *papers* were reviewed, we are assured by the University that the data itself is not and never has been available to anyone other than Dr Jones.

    My only positive hope and suggestions, seriously given:

    — Run a Google Desktop search of all computers in Dr Jones’ possession. Perhaps the data is hiding in an obscure place.

    — Use Restorer2000 to search the “deleted” areas of all disk drives that are now or ever have been used by Dr Jones

    If these attempts to recover the data fail, then the research results should be withdrawn. Science abhors non-repeatable “research”.

  52. tc
    Posted Apr 21, 2007 at 11:42 AM | Permalink

    MrPete #51 hits the nail on the head. Thank you, MrPete for stating the fatal flaw in a way that is crystal clear to layman and scientist alike.

    Now, backups can’t be found. Time to delete all research based on that data, since IT WAS NEVER AUDITED OR PEER REVIEWED. That’s right, even if the *papers* were reviewed, we are assured by the University that the data itself is not and never has been available to anyone other than Dr Jones.

    You state the perfect answer to the “peer reviewed” mantra of climate researchers. The answer is: THE DATA WAS NEVER AUDITED OR PEER REVIEWED. You point out the distinction between “peer reviewed papers” and “peer reviewed data”.

    Folks at ClimateAudit are leaders in exposing the lack of auditing and peer review of data used in benchmark research papers in climate studies. Folks here have made similar statements, and I think your statement is one of the best.

    In trying to explain to the public, to the media, and to other scientists the fatal flaw in the foundation of climate research, this type of short clear statement is ideal for cutting through the fog of anthropogenic hot air (AHA).

  53. Sudha Shenoy
    Posted Apr 22, 2007 at 5:45 AM | Permalink

    As MrPete says:

    That’s right, even if the *papers* were reviewed, we are assured by the University that the data itself is not and never has been available to anyone other than Dr Jones.

    (emphasis added.)

    I am gobsmacked. Completely gobsmacked. One person’s say-so & officials require millions of people to spend heaven only knows how much — ?! This is well beyond funny. How much do people outside ClimateAudit know? Shouldn’t they be alerted?

  54. Posted Apr 22, 2007 at 10:55 AM | Permalink

    At one point in my career I was responsible for retaining government records. That meant I planned carefully how and when to destroy the data. Keeping data longer than required by law exposed us to audit risk. I destroyed all data just as soon as I legally could. It was my duty to my organization.

    There is an underlying assumption here that data decays or evaporates spontaneously unless it is carefully shepherded. In my experience the problem with data storage is just the opposite. Data records are hardy and resistant – rather like an anthrax spore. If data is in fact gathered and stored, office workers will instinctively preserve it. The originals will get filed and duplicates will be created. If you need to get rid of it you must expend some effort.

  55. John F. Pittman
    Posted Apr 22, 2007 at 11:37 AM | Permalink

    I agree with #54. I have the same problem. One can find an unbelievable number of copies, and revisions, and find them in places you would never expect. The more important, the more interesting, the more it gets reproduced…like anthrax spores.

    In an empirical sense, doesn’t this mean the data you are looking for is unimportant and uninteresting? Or perhaps I should say the author, empirically, found his own work unimportant and uninteresting?

  56. Terry
    Posted Apr 22, 2007 at 8:05 PM | Permalink

    This is actually an opportunity for an easy publication.

    Get the new data and analyze it using the techniques originally used by Jones. This will provide an out-of-sample test of Jones’ results.

    If the results are different, then Jones 1990 is obsolete and the paleoclimate community will, of course, hail the new results as cutting edge, and demand that we move on from Jones’ antiquated work.

  57. Posted Apr 26, 2007 at 12:51 AM | Permalink

    Re. #54, 55 (and all the previous posts)

    As a researcher myself, I try to keep a good record of everything I do. This usually works for a period of time (generally until submission!), though notes are invariably less than 100% complete (even if it’s 99.9%). Some very honest and thorough researchers are not as good as they should be at record-keeping… it’s far too much like admin.

    And yes, files tend to replicate themselves in a dozen different places, but in my experience this happens for similar versions of code, with no way of telling where the file you actually used is. Also, labs get periodically cleared out. Old formats (7″ floppies, DAT, microfiche) get archived for a while, then disposed of. Sometimes they’re backed up by a student/technician onto a general “Old” disk somewhere, but the fact that a backup exists doesn’t make people aware of it, or where it is.

    As #41 says, the information required to reproduce the work (though the data would need to be unearthed in this case) should be in the paper. If it’s not, then that’s the fault of both referees and the journal editor. If there’s insufficient information to reproduce the work, then there should be no problems in getting a fully open-source attempt to reproduce it published (as #56 rightly states).

    Maybe that’s the way to go?

3 Trackbacks

  1. […] global temperatures. Moreover, data is often not properly archived, whether early studies (eg. Jones et al 1990), or later ones (e.g. Kaufman et al […]

  2. […] receipt of the data, I did a number of posts at CA on the Chinese network e.g. here here here here here, analysis that we now know that Jones was monitoring. One of the few mentions of Climate Audit in […]

  3. […] on April 3, 2007, my follow-up on the part where they remained unresponsive and their final refusal. Immediately on receipt of this information, I wrote some interesting posts on Chinese stations […]