Thursday, October 15, 2015

Global Land Surface Databank: Version 1.1.0 Release

In June 2014, the first version of the global databank was released (Rennie et al., 2014), which included data from nearly 50 different sources and an algorithm to resolve duplicate stations and piece together complete temperature time series. Since then, there have been monthly updates appending new data to existing stations. Thanks to user feedback, along with the additional analysis described below, minor changes were introduced to the merge program to ensure the most accurate data were incorporated in the final product. This, along with updates to current sources, required a small change to the versioning system. The remainder of this post highlights the changes implemented in the global land surface databank, version 1.1.0. More information about the structure of the databank, including sources, formats, and merge algorithm, can be found on the databank website.

Updates to Stage 1 and Stage 2 Data 
The databank design includes six data stages, ranging from the original observations to the final quality-controlled and bias-corrected products. For the purposes of this update, only three stages were modified: digitized data (Stage One), data converted to a common format (Stage Two), and the merged dataset (Stage Three).

The highest priority source is the Global Historical Climatology Network – Daily (GHCN-D) dataset (Menne et al. 2012). In June 2015, GHCN-D underwent a large update, which included a new average temperature element (TAVG), along with the addition of 1,400 stations that are part of the World Meteorological Organization’s (WMO) Regional Basic Climatology Network (RBCN). Because these stations are important for real time updates, it was necessary to include this new version in the latest merge.

Further assessment was also done on one of our sources, known as “russsource.” This source contained over 36,000 stations reporting maximum and minimum temperature. While the original format was consistent across all stations, it was discovered that this source actually comprised 27 individual sources. It was decided to split these up and place them individually in the merge, following the source hierarchy defined by the databank working group. Because of some duplication with sources used in GHCN-D, only 20 of the 27 sources were included. In addition, station IDs were brought into the Stage Two data so that the merge’s ID test could be applied. The same was done for the source known as “ghcnsource.”
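To illustrate the idea of a priority-ordered source hierarchy with an ID test, here is a minimal sketch. The names, data layout, and logic are purely illustrative assumptions, not the databank's actual merge code:

```python
# Illustrative sketch: merge sources in descending priority order, using a
# station-ID pre-check to drop duplicates supplied by higher-priority sources.
# All names and structures here are hypothetical.

def merge_by_hierarchy(sources):
    """sources: list of (source_name, stations) in descending priority.
    Each station is a dict with an 'id' key; higher-priority data wins."""
    merged = {}
    for source_name, stations in sources:
        for station in stations:
            sid = station["id"]
            # ID test: if a higher-priority source already supplied this
            # station ID, treat the candidate as a duplicate and skip it.
            if sid in merged:
                continue
            merged[sid] = {"source": source_name, **station}
    return merged

ghcnd = [{"id": "USW00013722", "tmax": 21.0}]
russ01 = [{"id": "USW00013722", "tmax": 20.9},
          {"id": "RSM00027612", "tmax": 5.1}]
result = merge_by_hierarchy([("ghcn-daily", ghcnd), ("russsource-01", russ01)])
```

In this sketch the duplicated station keeps its GHCN-D values because that source sits higher in the hierarchy, while the station unique to the lower-priority source is still carried through.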

Other than the above, no additional sources were added to the source hierarchy. One source, however, was removed (crutem4), because its use as a last-resort source was causing excessive station duplication: the data provider's bias corrections had altered the data enough that candidate stations from crutem4 matched their respective target stations in the metadata tests, yet were declared unique by the data tests. To avoid this excessive duplication, the source was removed.

Changes to Merge Algorithm
The merge algorithm, as described by Rennie et al. (2014), underwent no code changes. However, a couple of thresholds were modified in order to maximize the amount of data in the final recommended product. The thresholds are defined in a configuration file that is required for the program to run.

The first step of the merge algorithm compares the metadata of a target and candidate station, including each station's latitude, longitude, elevation, and name. A quasi-probabilistic comparison is made, and the result is a metadata metric between 0 and 1. In version 1.0.0, this metric needed to pass a threshold of 0.50 in order to be considered for merging. Analysis showed that too many stations were being pulled through, forcing merges between stations that should not have been merged. As a result, a stricter threshold of 0.75 was applied to avoid this issue.
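A toy example may help fix ideas about what a metadata metric in [0, 1] and a 0.75 threshold look like in practice. The weights, distance scale, and name comparison below are invented for illustration; the real algorithm's tests and weightings are described in Rennie et al. (2014):

```python
import math

# Hypothetical sketch of a metadata similarity metric in [0, 1].
# Weights, the 50 km distance scale, and the name test are illustrative only.

def metadata_metric(target, candidate, scale_km=50.0):
    # Great-circle distance between the two stations (haversine formula).
    lat1, lon1 = math.radians(target["lat"]), math.radians(target["lon"])
    lat2, lon2 = math.radians(candidate["lat"]), math.radians(candidate["lon"])
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    dist_km = 2 * 6371.0 * math.asin(math.sqrt(a))
    # Map distance to a similarity in [0, 1]: 1 at zero distance, decaying out.
    geo_sim = math.exp(-dist_km / scale_km)
    # Crude name similarity stand-in: exact match or not.
    name_sim = 1.0 if target["name"] == candidate["name"] else 0.5
    return 0.7 * geo_sim + 0.3 * name_sim

a = {"lat": 40.0, "lon": -105.0, "name": "BOULDER"}
b = {"lat": 40.0, "lon": -105.0, "name": "BOULDER"}
metric = metadata_metric(a, b)
# Identical metadata gives a metric of 1.0, clearing the 0.75 threshold;
# a station ~111 km away scores far lower and would be rejected.
```

Under a sketch like this, tightening the threshold from 0.50 to 0.75 rejects near-miss pairs (e.g. same name but ~100 km apart) that a looser threshold would have pulled through.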

In addition, once a candidate station is chosen to merge with a target station, it needs to fill a gap of at least 60 months (5 years) in order to be added to the target station. It was determined that this requirement was too strict: target stations with short gaps in their data were not being filled in by qualifying candidate stations. This gap threshold has therefore been reduced to 12 months.
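The gap criterion can be sketched as follows. The data layout and function names are illustrative assumptions, not the databank's code; only the 60-month and 12-month thresholds come from the text above:

```python
# Minimal sketch of the gap criterion: a candidate is merged only if it can
# fill a run of at least `min_gap` consecutive months that are missing in
# the target series but present in the candidate. Layout is illustrative.

MISSING = None
MIN_GAP_MONTHS = 12  # v1.1.0 threshold, reduced from 60 in v1.0.0

def longest_fillable_gap(target, candidate):
    """Longest run of months missing in `target` but present in `candidate`
    (both are equal-length monthly temperature lists)."""
    best = run = 0
    for t, c in zip(target, candidate):
        if t is MISSING and c is not MISSING:
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best

def qualifies_for_merge(target, candidate, min_gap=MIN_GAP_MONTHS):
    return longest_fillable_gap(target, candidate) >= min_gap

# A target series with a 14-month gap that the candidate can fill:
target = [10.0] * 24 + [MISSING] * 14 + [10.0] * 24
candidate = [9.8] * 62
# This qualifies under the new 12-month threshold but would have been
# rejected under the old 60-month threshold.
```

This makes the effect of the change concrete: short interruptions of one to five years, common in real station records, now qualify for in-filling rather than being skipped.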

Similar to version 1.0.0, all decisions made were tested against an independent dataset generated from hourly data for US stations available in the Integrated Surface Dataset (Smith et al. 2011). Results show only a small change between the two versions.

Version 1.1.0 of the recommended merge contains 35,932 stations (Figure 1), nearly 4,000 more than v1.0.0 (32,142). Figure 2 shows that the added stations largely reflect the most recent period, with roughly a 10% increase in the number of stations since 1950. It should be noted that there is a drop in coverage prior to 1950 in the new version. In the author's opinion, this reflects the removal of crutem4 as one of the sources: including that source had made candidate stations appear unique, due to differences in its data arising from the data provider's bias corrections. While the number of stations is lower during this period for v1.1.0, the number of gridboxes used in analysis (Figure 3) was either equal to or slightly higher than in v1.0.0.

Stage Three normally includes a merge recommended and endorsed by ISTI, along with variants showing the structural uncertainty of the algorithm. Due to time constraints, these variants are not yet available; they will be provided at a later date.

Figure 1: Location of all stations in the recommended Stage Three component of the databank. The color corresponds to the number of years of data available for each station. Stations with longer periods of record mask stations with shorter periods of record when they are in approximate identical locations.

Figure 2: Station count of recommended merge v1.1.0 by year from 1850 to 2014, compared to version 1.0.0, along with GHCN-M version 3.

Figure 3: Percentage of global coverage with respect to 5 degree gridboxes for the recommended merge v1.1.0 by year from 1850-2014, compared to version 1.0.0, along with GHCN-M version 3.

Saturday, June 6, 2015

Promotional flyers now available in German and Spanish

These have sat on my to-do list for far too long, but I have now finally found time to place the German and Spanish translations of the flyers that were taken to COP in Lima on the website. My thanks to Enric Aguilar, Stefan Brönnimann, Renate Auchmann and Victor Venema for the significant efforts to undertake these translations. Also, a big thanks is due to the NCEI graphics team for their efforts to re-render the original flyers in multiple languages.

The promotional materials are freely available and encouraged for re-use in any forum that may help raise interest in and knowledge of ISTI and its aims. Please feel free to take copies anywhere and everywhere that is relevant.

Thursday, June 4, 2015

The Karl et al. Science paper and ISTI

Note: this post is partly personal opinion.

I suspect when this is being posted at unembargo time there will be a whole slew of stories running on the news media and blogs about the Karl et al. paper in Science (I shall add a link to the actual paper if I remember later). But given the use of the ISTI databank in the analysis - its first high profile use in anger and a testament to all those years of hard work by very many colleagues (principally Jared Rennie) - some may come towards this little quiet corner of the internet. So here are some quick thoughts.

Karl et al. find greater recent-period warming using a new set of land and sea surface temperature records than with the operational versions used to date in NCEI's monitoring products. They conclude that there is no statistical evidence for a slowdown in the rate of warming in the new estimate, apparently calling into question the much discussed 'hiatus'.

Firstly, to be clear, most of the change in trend documented in Karl et al. arises not from the land (the focus of ISTI) but rather from the sea surface temperature dataset changes. These changes relate to their now calculating ship bias adjustments throughout the record, and accounting for the transition from predominantly ships to predominantly buoys since the 1980s. There is no doubt that buoys read colder than ships (attested to in multiple published analyses) - so in not previously accounting for this the prior NCDC analysis had a marked propensity to underestimate sea surface temperature changes in the most recent period. There are other changes in the sea surface temperature dataset documented in Huang et al and Liu et al. These are secondary in terms of recent trends but still important for certain applications. For example, ERSSTv4 likely captures far better ENSO variations prior to 1920 or so. This, however, is a land surface air temperatures blog so I shall wax lyrical no further on the matter of SSTs. I can try to answer questions on ERSSTv4 in the comments (I was a co-author on the ERSST analyses) if you have any burning questions.

So, onto land temperatures. Karl et al. apply the pre-existing pairwise homogenization algorithm used in GHCNv3 to the databank version 1.0.1 release. Effectively this is going from considering these:

to considering these:

The effect of going from the 7,280 stations in GHCNMv3 to applying the same algorithm to the databank (although not all 32,128 stations, as many were too short, isolated, or incomplete - Karl et al. mentions 'double', so somewhere around 15,000 were likely used) is very much smaller than the effect of the sea surface temperature changes, despite the step change in station count and coverage. The most recent period trends in Karl et al. over land exhibit a little more warming (c.10%) than GHCNv3 does, but it's not remotely statistically significant. It'll be interesting to look, down the line, at what proportion of that change arises from improved coverage, what proportion from changes in areas of common sampling, and to consider the effects on common stations and a slew of other analyses. Presumably this will be part of a broader analysis under GHCNv4, which will be built off the databank release, again using PHA. There may be additional innovations, in part arising from the SAMSI/IMAGe/ISTI workshop held in Boulder last summer.

There are two additional questions that arise:

1. Does this analysis obviate the need for ISTI?

Absolutely not.

Without ISTI the land side of Karl et al would not have been possible for starters. But more generally this is but one estimate and we most definitely need multiple estimates. We are also yet to run the PHA algorithm and others through the benchmarks through which additional insights and improvements are expected to accrue. We also know there remain lots and lots of data out there to rescue and incorporate into the holdings and use to get still better estimates of the global, regional and local changes. So, much work to be done and we have only just started to scratch the surface of what is possible.

2. Does it call into question the slew of papers of the recent hiatus / pause / slowdown?

Not really.

The NCDC estimate (and GISS, which uses the same marine and land basis estimates) was already at the low end of the family of available estimates of global mean behaviour, and this simply puts them back within, or just above, those estimates for their trends over 1998 to 2012 / 2014. The slowdown is also now less marked in all of the datasets, in part because of the additional two-and-a-bit years since the periods reported in AR5, during which we appear to be flipping to a positive IPO (this will become clearer with time), which will cause enhanced short-term surface warming.

But, in part this is a question of which hypothesis to test. Karl et al are testing whether there has been a detectable change in the observed trend behaviour. The answer is no, and pretty much was anyway according to a number of prior analyses. The modern period adjustments and innovations in Karl et al. simply strengthen that conclusion.

Arguably the more interesting hypothesis to test is whether the observations are consistent with the family of climate model projections. Here the Karl et al. adjustments take the NCDC dataset from inconsistent (3 sigma) to suspicious (2 sigma). (I am adopting the language of the metrology Guide to the Expression of Uncertainty in Measurement for clarity: in that context the Karl et al. analysis takes us from k=3 to k=2; k=1, within 1 sigma, would be deemed consistent.)
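The k-sigma (coverage factor) language above can be made concrete with a toy calculation. The trend numbers below are invented purely for illustration; they are not values from Karl et al. or any model ensemble:

```python
# Sketch of the GUM coverage-factor comparison: how many ensemble standard
# deviations the observed trend sits from the model-ensemble mean.
# All numbers here are hypothetical, chosen only to illustrate k=3 vs k=2.

def coverage_factor(observed, ensemble_mean, ensemble_sd):
    """Coverage factor k: distance from the ensemble mean in units of
    the ensemble standard deviation."""
    return abs(observed - ensemble_mean) / ensemble_sd

model_mean, model_sd = 0.21, 0.06   # hypothetical trend, deg C per decade
k_old = coverage_factor(0.03, model_mean, model_sd)  # old dataset -> k = 3
k_new = coverage_factor(0.09, model_mean, model_sd)  # revised    -> k = 2
# k <= 1 would be 'consistent', k = 2 'suspicious', k = 3 'inconsistent'.
```

So a modest upward revision of the observed trend, as in this sketch, can move the comparison a full coverage-factor step closer to the model ensemble without making it consistent.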

Furthermore, the questions of mechanistic understanding of decadal variability that all these studies have focussed upon are societally relevant and will improve our understanding of the climate system. Not only that but the insights will be used to improve climate models and therefore future predictions and projections. So, the existing literature on the topic is undoubtedly highly valuable. Doubtless there will be those saying they aren't / weren't.

Concluding remarks

To conclude, and worryingly not for the first time (think tropospheric temperatures in the late 1990s / early 2000s), we find that some substantial portion of a model-observation discrepancy that has caused a degree of controversy is potentially down to unresolved observational issues. There is still an undue propensity for scientists and public alike to take the observations as a 'given'. As Karl et al. attests, even in the modern era we have imperfect measurements.

Which leads me to a final proposition for a more scientifically sane future ...

This whole train of events does rather speak to the fact that we can and should observe in a more sane, sensible and rational way in the future. There is no need to bequeath to researchers in 50 years' time a similar mess. If we instigate and maintain reference-quality networks that make stable, SI-traceable measurements with comprehensive uncertainty chains, such as USCRN and GRUAN, but for all domains and for decades to come, we can have the next generation of scientists focus on analyzing what happened rather than, depressingly, trying to ascertain, inevitably somewhat ambiguously, what happened.

Saturday, February 28, 2015

Promotional flyers now available in French

International volunteers are helping to translate the promotional materials recently distributed at the COP meeting in Lima into additional languages. These will be posted on the website as they become available. Please distribute and use them to promote the Initiative's aims and objectives at relevant venues and meetings.

With thanks to Lucie Vincent of Environment Canada and the graphics team at NOAA's National Climatic Data Center, versions in French are now available.

Tuesday, January 20, 2015

Because the POSTman always delivers ...

We recently had a full teleconference meeting of participants. If you are prone to insomnia the full minutes are available at this link.

The major news is that, after some discussions on the appropriate name for the group ISTI does, indeed, have a new group ... the Parallel Observations Science Team (or POST) led by Victor Venema and Renate Auchmann.

You may recall a number of posts on this subject over at Victor's place. We shall work with colleagues to help further this effort. By being part of the formal ISTI family we will ensure that benefits regarding data holdings, benchmarking, and lessons learnt from this effort are more broadly shared. We always look for win-wins!

We are still looking at populating the parallel measurements database so if you know of any coincident measurements using distinct techniques or looking at spatial variability at the local scale (or both) then please do get in contact. Victor and Renate are also still populating this group (terms of reference here) so if parallel measurements are of interest and you feel you could contribute drop them a line.

More details on this effort can be found on the ISTI website.

Thursday, January 1, 2015

Survey on national homogenised temperature data sets

I've recently run a survey on national homogenised temperature data sets. Whilst this was not an exhaustive survey (as indicated by the number of responses), it is an indication of what's out there and what resources various countries are putting into this work.

Survey reports were received from 18 countries (CHN, CAN, ISR, IRL, SUI, SLO, NOR, HUN, NED, ROM, GBR, AUT, SRB, ESP, CZE, SWE, UKR, AUS) and 1 region (Catalonia). Summary results were as follows:

1. Number of staff involved in homogenisation (full-time equivalent)

Less than 1                                          2 countries
1-2                                                       9
2-4                                                       5
4 or more                                             3

(global and continental data sets are excluded from this - for example, the UK have several people working on the HadCRUT data sets, and the Netherlands on ECA&D and associated projects)

2. Existence of a national homogenised data set

Yes                                                                                         16
Yes but not yet released                                                         1
No national set but a station/regional set                              1
No                                                                                          1

3. Time resolution of data set

Daily                                                                                      8
Monthly                                                                                 7
Mix depending on element                                                    1
Monthly for early data, daily for later                                   1

4. Time resolution of adjustment

Results from this are a little unclear – several responses indicated use of the Vincent methodology, which interpolates adjustments based on monthly values to daily timescales.

Daily                                                                                      4
Monthly                                                                                 11
Monthly for detection, daily for adjustment                          2

5. Elements included

Maximum, minimum and mean temperature                        8
Maximum and minimum temperature                                   5
Mean temperature only                                                          4

(note that ‘maximum and minimum temperature’ implies mean temperature is not homogenised independently – in most cases it can still be calculated based on max/min)

6. Frequency of updating/reassessing homogeneity

Not updated                                                                  6 (in 2 cases, the first data set has only just been completed)
Appended with unadjusted data only                               2
Irregularly                                                                         1
Annually or near-annually                                                4
Intervals longer than 2 years                                            4 (ranging from every 3 to every 10 years)