Estimating Station Biases and Comparing to GISS Homogeneity Adjustments

If you had the task of choosing where to put a climate monitoring thermometer here at the USHCN climate station of record #469683 in Winfield, WV, where would you choose to put it?

Certainly the parking lot would not be a good choice. Maybe up in the grassy area behind the security fence? That would be my choice. Winfield is classified as a “rural” station, so the grassy area would be a bit more representative of the area. It would also remove the sensor from the heat sinks of the parking lot and the building.

But then there’s that cabling issue with the MMTS sensor, which this station has: it is a bit tough to trench through the parking lot up to the grass. So that leaves only one “logical” choice for placement.

[Photo: closeup view of the station]

Surfacestations.org volunteer surveyor Michael Caplinger captured this location in his recent survey of West Virginia stations. As NOAA has already established with their training manual for the Baltimore USHCN station, rooftops are a far less than ideal place, and tend to create new temperature records where none actually exist.

In the survey form he submitted, Mr. Caplinger says:

“The new lock and dam opened in 1997. Prior to construction the weather station was possibly located about 100 yards West-Southwest, on land removed/altered for new lock. Reported coordinates appear incorrect for current location.”

According to NCDC’s MMS database, it appears that the MMTS came into being in August 1986, as prior to that they list the equipment type as “unknown”. That’s a good bet for the conversion date from the Stevenson Screen, as MMTS did not start being implemented until the mid-1980s.

Also from MMS, an indication of the likely date of roof placement, when the lat/lon and elevation changed significantly:

[1999-09-22] 2007-06-10 38.527220 (38°31’37″N) -81.916110 (81°54’57″W)
GROUND: 611 FEET N/400/FEET PUTNAM 03 – SOUTHWESTERN EASTERN (+5)
Location Description: LOCK AND DAM, OUTSIDE & 1.2 MI SW OF PO AT REDHOUSE, WV

[1986-08-30] 1999-09-22 38.533330 (38°31’59″N) -81.916670 (81°55’00″W)
GROUND: 571 FEET — PUTNAM 03 – SOUTHWESTERN EASTERN (+5)
Location Description: LOCK AND DAM, OUTSIDE & 1.2 MI SW OF PO AT REDHOUSE, WV

In looking at the temperature record from NASA GISS, one sees what appears to be a step function around 1986, when the station changed to MMTS, seen in the data plot:

[Image: original GISTEMP plot]

I downloaded the data, and there is an entire year of missing data in 1986, and the data resumes in 1987. This coincides with the equipment change noted in the NCDC MMS record on 8-30-1986. When I plotted the data and ran some curve fits and baseline value analysis on the two data segments, the differences became more apparent:

[Plot: raw GISS data with fits to the pre- and post-1986 segments]

The baseline values of the two curve segments, pre and post 1986, differ by 0.51°C. The slopes also differ significantly.
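For readers who want to try this kind of two-segment comparison themselves, here is a minimal sketch. It assumes that "baseline value" means the mean of each segment and that the curve fit is an ordinary least-squares line; the post does not say exactly which fits were used. The function names and the synthetic data are illustrative only and are not the Winfield record.

```python
import numpy as np

def segment_stats(years, temps):
    """Mean and least-squares slope (deg C per year) of one segment, skipping NaNs."""
    years = np.asarray(years, dtype=float)
    temps = np.asarray(temps, dtype=float)
    ok = ~np.isnan(temps)
    slope, _intercept = np.polyfit(years[ok], temps[ok], 1)
    return temps[ok].mean(), slope

def compare_segments(years, temps, break_year):
    """Compare the data before and after break_year (the break year itself is excluded)."""
    years = np.asarray(years)
    temps = np.asarray(temps, dtype=float)
    before, after = years < break_year, years > break_year
    mean_b, slope_b = segment_stats(years[before], temps[before])
    mean_a, slope_a = segment_stats(years[after], temps[after])
    return {"baseline_step": mean_a - mean_b,
            "slope_before": slope_b,
            "slope_after": slope_a}

# Synthetic stand-in for a station record, NOT the Winfield data:
# flat at 12.0 C before the break, flat at 12.5 C after it.
years = np.arange(1905, 2008)
temps = np.where(years < 1986, 12.0, 12.5).astype(float)
temps[years == 1986] = np.nan  # missing year at the equipment change
result = compare_segments(years, temps, break_year=1986)
print(result)
```

With real data, the annual means from the GISS station page would replace the synthetic `temps` array, with missing years set to NaN.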

Looking at the GISTEMP plot for Homogenized data, you can note that the data has been shifted upwards a bit in the past, but the step function at 1986 remains:

[Image: original GISTEMP plot of the homogenized data]

When I plot the homogenized data, it can be clearly seen that there has been no change to the 1987 to 2007 segment of data, but that the 1905-1985 segment has been adjusted such that the early 20th century is a bit warmer, dramatically changing the slope for that segment.

[Plot: homogenized data with fits to the 1905-1985 and 1987-2007 segments]

The baseline difference between the two segments is now smaller, at 0.31°C.

Here is the complete data set, with before and after Homogenization adjustment applied by GISS:

[Plot: complete data set before and after the GISS homogenization adjustment]

Note that unlike some other adjustments of rural stations we’ve seen, where the past has been adjusted cooler (such as Cedarville, CA), in this case the past has been adjusted to be warmer, resulting in a slight cooling trend for the last century.

It makes no sense to me why GISS would adjust the past warmer. What could account for it? Certainly population growth wouldn’t be a factor, especially for a rural station. UHI doesn’t make any sense either.

Just for fun, I thought I’d try an experiment in data adjustment based on what I know about this station’s history. That isn’t much, but we do know these two dates:

1986 – MMTS installed, and likely moved closer to building due to cable issues
1999 – MMTS moved to rooftop of new locks building, based on lat/lon and elevation change

So based on that history, and having a handle on some other biases I’ve seen at the 500+ USHCN stations I’ve examined thus far, I decided to provide some offsets, based on what I believe a reasonable estimate of the bias might be:

1986-1998 = 0.5°C for MMTS to building proximity
1999-2007 = 1.0°C for MMTS on rooftop
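A minimal sketch of how offsets like these could be applied to a series of annual means. It assumes the offsets are warm biases to be subtracted from the raw values in each period, which is a reading of the text rather than something stated outright. The names `OFFSETS`, `apply_offsets` and `trend_per_century` and the flat example series are hypothetical.

```python
import numpy as np

# (first_year, last_year, estimated warm bias in deg C), taken from the list above
OFFSETS = [(1986, 1998, 0.5), (1999, 2007, 1.0)]

def apply_offsets(years, temps, offsets=OFFSETS):
    """Return a copy of temps with each period's estimated bias subtracted."""
    years = np.asarray(years)
    adjusted = np.asarray(temps, dtype=float).copy()
    for first, last, bias in offsets:
        in_period = (years >= first) & (years <= last)
        adjusted[in_period] -= bias
    return adjusted

def trend_per_century(years, temps):
    """Least-squares trend in deg C per 100 years, skipping NaNs."""
    years = np.asarray(years, dtype=float)
    temps = np.asarray(temps, dtype=float)
    ok = ~np.isnan(temps)
    return np.polyfit(years[ok], temps[ok], 1)[0] * 100.0

# Illustrative flat series, NOT the Winfield data
years = np.arange(1980, 2008)
raw = np.full(years.shape, 12.0)
adjusted = apply_offsets(years, raw)
print(trend_per_century(years, raw), trend_per_century(years, adjusted))
```

The same `trend_per_century` call can be run on the GISS homogenized series to compare slopes, which is the comparison that matters here.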

Applying those adjustments and comparing to the GISS Homogeneity adjustment we get this:

[Plot: station-history offsets compared with the GISS homogeneity adjustment]

Applying my station-history-based estimates of placement bias as offsets post 1987, I come quite close in slope to the GISS homogeneity adjustment. My slope (dark blue) is actually just a tiny bit cooler than GISS. Some might say that my method uses too much “guesstimating”. But how is it any worse, really, than applying a broad-brush algorithm blindly to the data, adjusting the far past without dealing with the step function that was introduced when the MMTS was installed? While my method is spur of the moment, it does have something the GISS adjustment doesn’t: adjustments based on known history and a known measurement environment. GISS certainly does not know the history or measurement environment for the period in which its automated algorithm applied adjustments. NCDC doesn’t have the station history for that period online either.

Looking for another nearby rural station to compare to, the closest I found was Spencer, WV, 58 kilometers away. It also has a cooling trend, a bit sharper, and most likely has not been placed on top of a concrete building, though its current location, at a Water Purification Plant, is also not the best:

While Spencer’s present placement at a water plant (since 2005) probably would take the “rural” portion out of the record, the previous portion of the station history appears to be truly rural. Up until 1995, it spent most of its life at USDA SOIL CONSERV, WITHIN & 0.5 MI SE OF PO AT SPENCER, WV. From experience, I tend to view places such as Ag farms like this as being fairly good sites that don’t get much if any encroachment. Thus I would tend to believe the Spencer, WV record as showing a true cooling.

So the question is, can we use station photographs and station history, combined with some bias estimates that should be quantifiable either by experiments or direct measurements on site to come up with a more realistic adjustment for USHCN stations? While this is only one example that appears to work, I think the idea bears exploring.

This entry was written by Anthony Watts, posted on Jun 17, 2008 at 10:25 AM, filed under Surface Record.

28 Comments

M.Villeger

Posted Jun 17, 2008 at 10:46 AM | Permalink

Title spelling mistake “homOgeneity”.

Another perfect illustration of the problem, bravo!

REPLY: Thanks, fixed. -Anthony
TheDude

Posted Jun 17, 2008 at 12:40 PM | Permalink

lol, who said scientists have to English write good.
Wolfgang Flamme

Posted Jun 17, 2008 at 1:34 PM | Permalink

Quite obviously there are compromises and compromises.
steven mosher

Posted Jun 17, 2008 at 2:17 PM | Permalink

I double checked this site AW.

It’s a good example of divergence between nightlights ratings and
population rating for urbanity.
Scott Lurndal

Posted Jun 17, 2008 at 2:31 PM | Permalink

The URL to surface stations in the 4th paragraph is incorrect.
Larry Sheldon

Posted Jun 17, 2008 at 5:39 PM | Permalink

Scott Lurndal is correct, works better.
Dennis Wingo

Posted Jun 17, 2008 at 6:07 PM | Permalink

Steve/Anthony

I really don’t understand this cabling issue. Data integrity is far more important than convenience of placement and we do live in the 21st century. I could design in a day a climate monitoring system powered by solar panels/batteries that could transmit data wirelessly back to a cheap router placed in a window.

Why is it that data integrity is sacrificed to the poor design of these systems?
Jonathan Schafer

Posted Jun 17, 2008 at 7:35 PM | Permalink

#7

Well, probably in part because these stations are weather monitoring stations, not climate monitoring stations. As such, the quality of the data is probably a bit less important. After all, for a weather station, does it really matter if it’s off by .5C or not on a daily basis? The data only become really important when used for other purposes, such as trending for climate. So while I agree that this could easily be solved, I can certainly understand why they likely aren’t interested in spending the money.

In the end, what it really means is that the provenance of the data must always be questioned until it is proven correct.
Anthony Watts

Posted Jun 18, 2008 at 12:13 AM | Permalink

RE7 Dennis,

The cabling issue has to do with the upgrade methodology from the Stevenson Screen/ mercury thermometer to an electronic one. The display/readout is not weatherproof and thus MUST be indoors.

The cable to the sensor then is the key issue. Here is why:

1. The NWS COOP manager is usually given a day or less to complete the job

2. They don’t have the budget for much work beyond use of hand tools such as shovels, picks, etc.

3. When encountering a barrier such as a concrete walkway, driveway, or other surface/object they can’t easily trench around or tunnel under with hand tools, they have to make a choice.

4. The choice usually ends up being putting the MMTS sensor closer to the building where they have to bring the cable inside to the display/readout.

5. Once a government program exists, and methods, documents, and inventory are in stock, it is very difficult to change as technology moves on. Witness how long the Space Shuttle ran on PDP 11/03 computers while PC’s ran circles around them for size and computing power before an upgrade occurred.

For an example of how the barrier of a driveway can put a temp sensor much closer to a building, see this post I made today:

http://wattsupwiththat.wordpress.com/2008/06/17/how-not-to-measure-temperature-part-65/

6. As the network has been slowly upgraded to MMTS electronic sensors since 1985, a slow warm bias has been creeping in due to building proximity. Yet nobody except Roger Pielke Sr. and I are talking much about it. Look for it in the literature and little if anything is found.

MMTS is the ugly bastard child of temperature data cum climate change nobody wants to acknowledge.
Anthony Watts

Posted Jun 18, 2008 at 12:15 AM | Permalink

Mosh #4

There is a paper in here somewhere waiting to be published.
notanexpert

Posted Jun 18, 2008 at 5:57 AM | Permalink

Would you say that this result tends to support the accuracy of the GISS homogeneity adjustment?

Since your method takes advantage of knowing details about the station history, it would seem that it should do better than a completely generic algorithm. And yet the results are very similar. Not too shabby a result for the GISS adjustment, it would seem.

(BTW, the second to last plot doesn’t enlarge when clicked.)
steven mosher

Posted Jun 18, 2008 at 6:02 AM | Permalink

re 10. no paper, I’m just noting that there are sites in the us that are
rural LT. 5K pop, but have dim or bright ratings by nightlights. this site
is one of them. i could generate an entire list
Anthony Watts

Posted Jun 18, 2008 at 10:38 AM | Permalink

RE 11 that graph enlargement is fixed, do a refresh and try again. Thanks for pointing it out.

As for your question, I’d say it says that a guy with an understanding of the measurement environment, some photos, some site knowledge and history can do at least as well or possibly better than GISS once the technique is refined. In this case GISS gets the overall trend close to being right, but it was done with an adjustment that is wrong headed.

It is like math class, the answer is right, but the methodology is all wrong to get the answer. That is why you have to show your work, if the answer is right but the method is all wrong you still fail.

Look at Cedarville, CA (linked in article) for an example where GISS algorithms really fail. Point is, each site is different; broad brush adjustments don’t work for every site.

Mosh can make a list.
Steve McIntyre

Posted Jun 18, 2008 at 11:07 AM | Permalink

#11. As I’ve said on many occasions, the ROW adjustments are done differently and with very inferior metadata. The US situation looks much “better” than the ROW.
bill white

Posted Jun 18, 2008 at 1:09 PM | Permalink

Having been through Winfield WV many times in my travels, I can confirm Anthony’s suspicions that the roof of the dam structure may not be the best place for a station. There are lots of farm fields, grassy areas, and even a few town parks that would seem to be better.

Winfield is also a lot less rural than it used to be. U.S. Route 35 runs right past the locks & dam and through the town – it handles 15-16,000 vehicles per day, 35% of which are tractor-trailers. These vehicles often back up for more than a mile along U.S. 35, while waiting to turn right onto State Route 34.

In addition, Toyota opened up a 1.2 million sqft engine and transmission factory in Buffalo WV in 1998, a facility which now employs more than 900 people. It is about seven miles from the dam on the other side of the Kanawha River (downstream).
Sam Urbinto

Posted Jun 18, 2008 at 1:41 PM | Permalink

Jonathan Schafer #8 “I can certainly understand why they likely aren’t interested in spending the money.”

What, 50 E per remote thermometer that nobody has to touch and is accurate and self calibrating and reports temperatures every second to a central computer? I can certainly understand they aren’t interested in that either.

We wouldn’t want to put the folks doing the adjustments out of a job after all.
Thor

Posted Jun 18, 2008 at 1:59 PM | Permalink

This seems to happen in many places; the original temperature measurements station is removed and then replaced with a different one, in a slightly different location – with different surroundings.

Are there any known stations, where the new station was installed before the old one was removed, and where the temperatures were recorded with an overlap of, say, a year or two? Or even a month or two?
Dennis Wingo

Posted Jun 18, 2008 at 3:38 PM | Permalink

#9 Anthony

Yea I understand but with the future of our civilization at stake, we can certainly do better today. I have been designing embedded computers and sensor systems for over 20 years and it would be an almost trivial exercise to build a set of climate monitoring sensors that could be literally anywhere and could upload their readings to store and forward satellites in a manner that would completely free the climate monitoring system from dependence on terrestrial infrastructure.

I am working on some interesting sensorweb technology that could be used to rapidly set up a global climate monitoring network at a cost of about $1-2k per node (most of that cost is the solar power system)

Dennis
Robinedwards

Posted Jun 18, 2008 at 4:24 PM | Permalink

Interesting stuff! And worrying :-((

Is it possible to obtain the data that you used to prepare the graphs, Anthony? I would very much like to look at the actual numbers regarding the step functions, because I am very interested in detection of abrupt changes in temperature time series. Are you able to release the numbers, please?

Robin

REPLY: The data is on the bottom of the pages at GISS that I link to (see the original GISTEMP plot)
Geoff Sherrington

Posted Jun 19, 2008 at 3:30 AM | Permalink

Re 10 Anthony Watts at Winfield

“There is a paper in here somewhere waiting to be published.”

A cigarette paper waiting to be smoked?
notanexpert

Posted Jun 19, 2008 at 6:13 AM | Permalink

#13 I looked at the Cedarville link. Did you calculate the straight line trend of the two series in the final graph and compare them? Also, with regard to comparing such trend lines, what would you use as a threshold for good agreement?

I take your basic point about the importance of method over result. As Wegman put it, WRONG METHOD + RIGHT ANSWER = BAD SCIENCE. (Or something close to that.) And yet in science results ultimately do prove the method. Suppose you were to examine thousands of sites in this manner and in every case the algorithm made “wrong-headed” adjustments but always got the trend right? Would you still consider the method “all wrong?” Perhaps the main take away of your work should be not that the algorithm is wrong-headed, but that getting the absolute temperatures right is not a necessary intermediate step when calculating corrected trend lines. Strictly from a trend point of view, whether you adjust older temperatures warmer or newer ones cooler by a commensurate amount is of no consequence, is it?

Stepping back a bit, considering that this algorithm looks only at temperature series and geographic location, how could it have any conception of right or wrong (headed) adjustments? It seems obvious on its face that expanding such an algorithm to incorporate detailed site-specific data could only improve it. But if you had that data, why would you need such an algorithm at all? You would just apply the corrections directly and be done with it.

It seems to me that the only valid way to assess an algorithm like this is to accept the inherent limitations of its design. If the initial formulation was something like, “What can we do about making site adjustments without actually having site-specific data,” then it seems a bit daft to criticize it for not making use of site-specific data.

I don’t mean to take any position on whether the GISS algorithm is accurate or not. Given its limitations maybe it can’t possibly be accurate in a systematic way. But your criticisms seem to be off. If you repeated this work at many sites and demonstrated many instances of error then you could make the basic scientific point that, well, whatever the method, it doesn’t work and you have the data to prove it. But here you’ve given one result and the algorithm’s trend line matched it. (And in the Cedarville case you don’t give trend lines.) The fact that the intermediate data (the absolute temperatures) seem wrong is not in itself proof that the method is “all wrong.”

#14 Apologies for making you repeat yourself. I’ve only been coming to this site regularly for a few weeks and have only just scratched its surface. (BTW, do you have an acronym list anywhere? ROW = rest of world?)

REPLY: The station history files have always been available, and could be used instead of the lights and brightness ratings Hansen uses now
Sam Urbinto

Posted Jun 19, 2008 at 1:31 PM | Permalink

RestOftheWorld yep
ChrisJ

Posted Jun 20, 2008 at 12:00 AM | Permalink

Hmmm. That homeland security camera appears to be on the grass side of the pavement… Sigh. Thanks. best regards, -chris
Geoff Sherrington

Posted Jun 20, 2008 at 3:37 AM | Permalink

Re # 21 notanexpert

Many scientific nerves in my body scream “NO” at the approach of using bad data and hoping it averages out. That’s only one step away from astrology.

“Suppose you were to examine thousands of sites in this manner and in every case the algorithm made “wrong-headed” adjustments but always got the trend right?”

The 4-colour map problem might have been “solved” by throwing around thousands of numbers, but many of us regard that as a second class proof.

One tries not to work with trends, because they ultimately need pinning to a reference value. It is much better to work in absolutes. Trends in sea level height, for example, can be detected (perhaps) in the noise at some sites, but other sites give different trends. Isostasy, etc, many other effects exist to affect the trend – when the absolute value is the final need.

Keep logging, Anthony.
MarkW

Posted Jun 20, 2008 at 5:01 AM | Permalink

Anthony,

You can cut NASA some slack on the shuttle computers. I’m involved in the aeronautics industry, and I can assure you that getting a computer designed and then certified for flight is a difficult and expensive process. I can imagine that building one for the shuttle would be even more difficult. So as long as the computers were doing the job, I don’t see why they would even think of upgrading them.

About the only reason to upgrade would be lack of spare parts, as the equipment gets obsolete, or an increase in what the computer is expected to do.
Anthony Watts

Posted Jun 21, 2008 at 11:31 AM | Permalink

Geoff, you sent me an email a couple of weeks ago and I had intended to respond, but my email application crashed and I’ve lost email correspondence and contacts. Can you resend?
Geoff Sherrington

Posted Jun 21, 2008 at 9:36 PM | Permalink

Re #26 Anthony

I’m battling combined EvID and Vundo on Vista just now, so I know what crashes mean. (Anyone help? Vista and the anti-virus people have not cracked it yet).

Never mind the last email because I’ve made some progress in mapping UHI in Melbourne. It was done by BoM some years ago, I know the name of the then project leader and I am in the process of asking for the results.

Motivation is to add more ROW to USA data. Australia has lots of high quality station data. Do you have the CD of 1,200 stations with daily recording starting as early as 1860s? If not I’ll try to get you one. It’s the least I can do to reward your time, persistence, skill and diligence. I’ve been waiting for the present term of satellite temps to lengthen and be de-controversialised because then you can do better comparisons.

I also have a paper on surveying the Earth including data on satellite errors (unpub). About 12 pages of .pdf. Shall I send it or are you still having bug problems?

sherro1@optusnet.com.au
Jeremy

Posted Jun 24, 2008 at 11:21 AM | Permalink

Here’s my question: Why was the decision made to *replace* old temperature data collection methods with MMTS? Wouldn’t it have been smarter to have a period of 10-20 years where both old and new systems were run in parallel for verification purposes? It seems that data would be very very useful, even if taken using older methods.