[Update 9 December 2014]
Summaries online
The summaries of the Climate Dialogue on Climate Sensitivity and Transient Climate response are now online (see links below). We have made two versions: an extended and a shorter version.

Both versions can be downloaded as pdf documents:
Summary of the climate dialogue on climate sensitivity
Extended summary of the climate dialogue on climate sensitivity
[End update]

Climate sensitivity is at the heart of the scientific debate on anthropogenic climate change. In the fifth assessment report of the IPCC (AR5) the different lines of evidence were combined to conclude that the Equilibrium Climate Sensitivity (ECS) is likely in the range from 1.5°C to 4.5°C. Unfortunately this range has not narrowed since the first assessment report in 1990.

An important discussion is what the pros and cons are of the various methods and studies and how these should be weighed to arrive at a particular range and a ‘best estimate’. The latter was not given in AR5 because of “a lack of agreement on values across assessed lines of evidence”. Studies based on observations from the instrumental period (1850-2014) generally arrive at moderate values for ECS (and that led to a decrease of the lower bound for the likely range of climate sensitivity from 2°C in AR4 to 1.5°C in AR5). Climate models, climate change in the distant past (palaeo records) and climatological constraints generally result in (much) higher estimates for ECS.

A similar discussion applies to the Transient Climate Response (TCR) which is thought to be more policy relevant than ECS.

We are very pleased that the following three well-known contributors to the general debate on climate sensitivity have agreed to participate in this Climate Dialogue: James Annan, John Fasullo and Nic Lewis.

The introduction and guest posts can be read online below. For convenience we also provide PDFs:

Introduction climate sensitivity and transient climate response
Guest blog James Annan
Guest blog John Fasullo
Guest blog Nic Lewis

To view the dialogue of James Annan, John Fasullo, and Nic Lewis following these blogs click here.

Climate Dialogue editorial staff
Bart Strengers, PBL
Marcel Crok, science writer

Introduction climate sensitivity and transient climate response

This dialogue focuses on the Equilibrium Climate Sensitivity (ECS) and the Transient Climate Response (TCR). Both summarize the global climate system’s temperature response to an externally imposed radiative forcing (RF), expressed in W/m2. ECS is defined as the equilibrium change in annual mean global surface temperature following a doubling of the atmospheric CO2 concentration, while TCR is defined as the annual mean global surface temperature change at the time of CO2 doubling following a linear increase in CO2 forcing over a period of 70 years. Both metrics have a broader application than these definitions imply: ECS determines the eventual warming in response to stabilization of atmospheric composition on multi-century time scales, while TCR determines the warming expected at a given time following any steady (and linear) increase in forcing over a 50- to 100-year time scale. TCR is a useful metric next to ECS because it can be estimated more easily than ECS, and is more relevant to projections of warming over the rest of this century.
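The link between "a doubling of CO2" and "a linear increase in forcing over 70 years" can be made concrete with a small sketch. It assumes the idealised experiment usually used to define TCR, in which CO2 rises by 1% per year (compounded), and a forcing that depends logarithmically on concentration; neither assumption is spelled out in the text above, and the function names are illustrative only.

```python
import math

F_2XCO2 = 3.7  # W/m2 per CO2 doubling (value quoted in this introduction)

def doubling_time_years(growth_rate=0.01):
    """Years needed for CO2 to double at a compound annual growth rate."""
    return math.log(2.0) / math.log(1.0 + growth_rate)

def co2_forcing(year, growth_rate=0.01):
    """Forcing (W/m2) after `year` years, assuming a logarithmic dependence
    of forcing on concentration (an assumption of this sketch)."""
    concentration_ratio = (1.0 + growth_rate) ** year
    return F_2XCO2 * math.log2(concentration_ratio)

print(round(doubling_time_years(), 1))  # close to the 70 years in the definition
print(co2_forcing(35), co2_forcing(70))  # forcing grows linearly in time
```

Under these assumptions an exponential rise in concentration gives a forcing that increases by the same amount every year, which is why the definition can speak of a linear increase in forcing.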

Note that although ECS and TCR are defined in terms of a doubling of the CO2 content, they can be applied to any forcing agent, such as changes in solar radiation and volcanic dust injections (bearing in mind that different types of forcing can have a slightly different temperature response per W/m2). As such, ECS is a measure of the global average temperature response to a change in the Earth’s radiative balance, as characterized by the so-called radiative forcing expressed in W/m2 (e.g. the radiative forcing due to a doubling of CO2 is 3.7 W/m2).
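As a rough illustration of this scaling, the sketch below converts a sustained forcing in W/m2 into an equilibrium warming. It assumes a response that is linear in the forcing and identical for every forcing agent, so it ignores the slightly different responses per W/m2 mentioned above. The function name and the 1 W/m2 example forcing are hypothetical; the two ECS values are simply the ends of the AR5 likely range.

```python
F_2XCO2 = 3.7  # W/m2, forcing from a CO2 doubling (value quoted in the text)

def equilibrium_warming(forcing_wm2, ecs_degc):
    """Equilibrium warming (°C) for a sustained forcing, assuming linearity.

    Hypothetical helper: scales ECS by the ratio of the forcing to the
    CO2-doubling forcing and ignores forcing-specific differences.
    """
    return ecs_degc * forcing_wm2 / F_2XCO2

# Warming per 1 W/m2 of sustained forcing at both ends of the AR5 likely range.
for ecs in (1.5, 4.5):
    print(ecs, round(equilibrium_warming(1.0, ecs), 2))
```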

Lines of evidence for ECS
Figure 1 below shows the ranges and best estimates of ECS in AR5 (IPCC, 2013) based on studies that support different lines of evidence, which are: 1) the observed or instrumental surface, ocean and/or atmospheric temperature trends since pre-industrial time, 2) observed and modelled short-term perturbations of the energy balance such as those caused by volcanic eruptions, included under instrumental in figure 1, 3) climatological constraints by comparing patterns of mean climate and variability in models to observations, 4) climate models, 5) temperature fluctuations as reconstructed from palaeoclimate archives, and 6) studies that combine two or more lines of evidence into one 5-95% (very likely) uncertainty range for ECS.

Likely range of ECS in AR5
In AR5 the different and partly independent lines of evidence are combined to conclude that ECS is likely in the range 1.5°C to 4.5°C (grey area in figure 1) with high confidence.

Figure 1 Ranges and best estimates of ECS based on different lines of evidence, replicated from figure 1 of Box 12.2 in AR5. Unlabeled ranges refer to studies cited in AR4. Bars show 5-95% uncertainty ranges with the best estimates marked by dots. Dashed lines give alternative estimates within one study. The grey shaded range marks the likely 1.5°C to 4.5°C range as reported in AR5, and the grey solid line the extremely unlikely less than 1°C, the grey dashed line the very unlikely greater than 6°C.

In AR4 the range was adjusted slightly upwards to 2–4.5°C, but AR5 reduced the lower bound to 1.5°C, returning to the earlier range of 1.5–4.5°C for ECS. In Box 12.2 in AR5 it was written that: ‘…this change reflects the evidence from new studies of observed temperature change, using the extended records in atmosphere and ocean. These studies suggest a best fit to the observed surface and ocean warming for ECS values in the lower part of the likely range. Note that these studies are not purely observational, because they require an estimate of the response to radiative forcing from models. In addition, the uncertainty in ocean heat uptake remains substantial. Accounting for short term variability in simple models remains challenging, and it is important not to give undue weight to any short time period that might be strongly affected by internal variability.’ In other words, estimates based on (constraints from extended records in) the instrumental period point to lower ECS values, but at the same time one should be careful not to overvalue the instrumental evidence.

Weighing the evidence
In AR5 it is indicated that the peer-reviewed literature provides no consensus on a formal statistical method to combine different lines of evidence. Therefore, in AR5 the range of ECS and TCR is expert-assessed, supported by, as indicated above, several different and partly independent lines of evidence, each based on multiple studies, models and data sets. This expert judgement in AR5 was made with care, but it is not a straightforward procedure. The discussion on how to weigh the different lines of evidence is long-standing, not only in the scientific literature but also in the blogosphere and in reports, and it is still ongoing. For example, Nic Lewis, who takes part in this dialogue and was author/co-author of two studies mentioned in the instrumental category in figure 1, argues that instrumental or empirical approach studies with relatively low ECS values should be given much more weight than the IPCC gave them in AR5 (Lewis and Crok, 2014).

Others argue that the main limit on ECS is that it has to be consistent with palaeoclimatic data, which point to ranges consistent with the IPCC range (Palaeosens, 2012, also mentioned in figure 1) and also in line with the climate models’ likely range of about 2 to 4.5°C (CMIP5). Some argue that palaeoclimatic data point to values in the upper part of the IPCC range (Hansen, 2013). In this dialogue we therefore want to focus first on the following two questions:

1) What are the pros and cons of the different lines of evidence?

2) What weight should be assigned to the different lines of evidence and their underlying studies?

Best estimate
With respect to the best estimate it was reported in AR5 that: “No best estimate for equilibrium climate sensitivity can now be given because of a lack of agreement on values across assessed lines of evidence.” Also, it was concluded that ECS is extremely unlikely less than 1°C (grey solid line in figure 1), and very unlikely greater than 6°C (grey dashed line). So IPCC did not choose between the different lines of evidence with respect to the best estimate, but it was not discussed in much detail why. Therefore, the third and fourth questions we will address are:

3) Why would a lack of agreement between the lines of evidence not allow for a best estimate for ECS?

4) What do you consider as a range and best estimate of ECS, if any?

TCR range in AR5
AR5 concludes with high confidence that the TCR is likely in the range 1°C to 2.5°C, and extremely unlikely greater than 3°C (see figure 2).

Figure 2 Probability density functions, distributions and ranges (5 to 95%) for the TCR from different studies, replicated from figure 2 of Box 12.2 in AR5. The grey shaded range marks the likely 1°C to 2.5°C range, and the grey solid line marks the extremely unlikely greater than 3°C.

TCR is estimated from the observed global changes in surface temperature, ocean heat uptake and RF, the response to the solar cycle, detection/attribution studies identifying the response patterns to increasing GHG concentrations, by matching the AR4 probability distribution for ECS and the results of the CMIP5 model inter-comparison study. Estimating TCR suffers from fewer difficulties in terms of state- or time-dependent feedbacks, and is less affected by uncertainty as to how much energy is taken up by the ocean. But still, there is a debate on the likely range. Again, Nic Lewis argues that studies showing a lower value in figure 2 (Gillett et al. (2013), Otto et al. (2013) and Schwartz (2012)) should be weighted much higher than the others, resulting in substantially lower values for TCR (1.3-1.4°C) than, for example, the average of the CMIP5 models (1.8-1.9°C). Therefore, the questions that will be discussed with respect to TCR are:

5) What weight should be assigned to the different studies mentioned in figure 2?

6) What is your personal range for TCR, if any?

Gillett, N.P., V.K. Arora, D. Matthews, P.A. Stott, and M.R. Allen, 2013. Constraining the ratio of global warming to cumulative CO2 emissions using CMIP5 simulations. J. Clim., doi:10.1175/JCLI-D-12-00476.1.

Hansen, J., M. Sato, G. Russell, and P. Kharecha, 2013: Climate sensitivity, sea level, and atmospheric carbon dioxide. Phil. Trans. R. Soc. A, 371, 20120294, doi:10.1098/rsta.2012.0294.

IPCC Climate Change 2013: The Physical Science Basis (eds Stocker, T. F. et al.) (Cambridge Univ. Press, 2013).

Lewis, N. and M. Crok, 2014: A Sensitive Matter, a report published by the Global Warming Policy Foundation, 67 pp.

Otto, A., F. E. L. Otto, O. Boucher, J. Church, G. Hegerl, P. M. Forster, N. P. Gillett, J. Gregory, G. C. Johnson, R. Knutti, N. Lewis, U. Lohmann, J. Marotzke, G. Myhre, D. Shindell, B. Stevens and M. R. Allen, 2013. Energy budget constraints on climate response. Nature Geosci., 6: 415–416.

Paleosens Members, 2012. Making sense of palaeoclimate sensitivity. Nature, 491: 683–691.

Schwartz, S.E., 2012. Determination of Earth’s transient and equilibrium climate sensitivities from observations over the twentieth century: Strong dependence on assumed forcing. Surv. Geophys., 33: 745–777.

Guest blog James Annan

The sensitivity of the climate to atmospheric CO2 concentrations is obviously an important consideration in any discussion of energy policy and emissions targets. The transient climate response (TCR) is arguably more directly informative regarding the future warming which we will experience due to an (anticipated) increase in CO2 forcing over the 21st century, but the equilibrium sensitivity is more relevant to stabilisation scenarios and long-term change over perhaps 100-200 years (and beyond). For this reason, it has been a major topic of research in climate science for many decades.

One rather fundamental point needs to be clearly understood at the outset of the discussion: there is no “correct” pdf for the equilibrium sensitivity. Such a pdf is not a property of the climate system at all. Rather, the climate sensitivity is a value (ignoring quibbles over the details and precision of the definition) and a pdf is merely a device for summarising our uncertainty over this value. An important consequence of this is that there is no contradiction or tension if different lines of evidence result in different pdfs, as long as their high probability ranges overlap substantially. All that this would mean is that the true value probably lies in the intersection of the various high-probability ranges. Thus the question of weighting different methods higher or lower should not really apply, so long as the methods are valid and correctly applied. If one result generates a range of 1-3.5°C and another study gives 2-6°C then there is no conflict between them.

Adjusting for a bit of over-optimism in each study (i.e. underestimation of their relevant uncertainties) we might conclude in this case that an overall estimate of 1.5-4°C is probably sound - the result in this hypothetical case having been formed by taking the intersection of the two ranges, and extending it by half a degree at each end. Additionally, if one result argues for 2-4°C and another 1-10°C, then the latter does not in any way undermine the former, and in particular it does not represent any evidence that the former approach is overconfident or has underestimated its uncertainties. It may just be that the former method used observations that were informative regarding the sensitivity, and that the latter did not.
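The arithmetic of this hypothetical case can be sketched in a few lines. The function below is only an illustration of the "intersect, then widen" reasoning described above, not a recommended statistical method; the name `combine_ranges` and the `padding` parameter are invented for the example.

```python
def combine_ranges(ranges, padding=0.5):
    """Intersect (low, high) ranges, then widen the overlap by `padding`.

    Returns None when the ranges have no common overlap.
    """
    low = max(r[0] for r in ranges)
    high = min(r[1] for r in ranges)
    if low > high:
        return None
    return (low - padding, high + padding)

# The two hypothetical studies from the text: 1-3.5°C and 2-6°C.
print(combine_ranges([(1.0, 3.5), (2.0, 6.0)]))  # (1.5, 4.0)

# A wide range does not undermine a narrow one: the overlap is the narrow range.
print(combine_ranges([(2.0, 4.0), (1.0, 10.0)], padding=0.0))  # (2.0, 4.0)
```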

A formally superior approach to calculating the overlap of ranges would be to combine all the evidence using Bayes' Theorem (e.g. Annan and Hargreaves 2006). In this paradigm, “down weighting” one line of evidence would really amount to flattening the likelihood, that is, acknowledging that the evidence does not distinguish so strongly between different sensitivities. In principle it is not correct to systematically down weight particular methods or approaches, so long as their uncertainties have been realistically represented. It is more a case of examining each result on its merits. Just as some papers have underestimated their uncertainties, other papers have surely overestimated theirs.
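A minimal sketch of such a Bayesian combination on a grid of sensitivity values is given below. Everything numerical in it is an assumption made for illustration: the two Gaussian likelihoods, the uniform prior, the grid limits, and the choice to "flatten" a likelihood by raising it to a power between 0 and 1. It is not the calculation of Annan and Hargreaves (2006), and it treats the two lines of evidence as independent.

```python
import math

def gaussian_likelihood(centre, sd):
    """Hypothetical likelihood of a sensitivity value s for one line of evidence."""
    return lambda s: math.exp(-0.5 * ((s - centre) / sd) ** 2)

def posterior(grid, prior, likelihoods, weights=None):
    """Posterior on a grid: prior times the product of (flattened) likelihoods.

    A weight below 1 flattens the corresponding likelihood; this is one
    possible way to express down-weighting, assumed here for illustration.
    """
    weights = weights or [1.0] * len(likelihoods)
    unnormalised = []
    for s in grid:
        p = prior(s)
        for like, w in zip(likelihoods, weights):
            p *= like(s) ** w
        unnormalised.append(p)
    total = sum(unnormalised)
    return [p / total for p in unnormalised]

def mean(grid, probs):
    return sum(s * p for s, p in zip(grid, probs))

grid = [0.05 * i for i in range(1, 201)]   # 0.05 to 10.0 °C
uniform_prior = lambda s: 1.0              # assumed prior, not a recommendation
line_a = gaussian_likelihood(2.0, 0.8)     # made-up "lower" line of evidence
line_b = gaussian_likelihood(3.5, 1.5)     # made-up "higher" line of evidence

full = posterior(grid, uniform_prior, [line_a, line_b])
flattened_b = posterior(grid, uniform_prior, [line_a, line_b], weights=[1.0, 0.25])

print(round(mean(grid, full), 2), round(mean(grid, flattened_b), 2))
```

Flattening the second likelihood moves the posterior mean towards the first line of evidence, which is the sense in which down-weighting amounts to saying that a result distinguishes less sharply between sensitivities.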

Recent (~20th century) temperature change
Around ten years ago, Bayesian methods using the observed transient warming over the 20th century (also variously using ocean heat uptake and/or spatial patterns of climate change) became popular, although most researchers concluded that, at that time, these data didn't provide a very tight constraint. Actually, even as far back as 2002, there was enough data to provide useful estimates such as the 1.3-4.2°C of Forest et al (2002), but these results were unfortunately ignored in favour of methods which have since been shown to generate an inappropriate focus on higher values (Annan and Hargreaves 2011).

More recently, as data improves in both quantity and quality, and helped by better understanding of aerosol effects, it is widely agreed that the gradual warming of the climate system points to a sensitivity somewhere at the low end of the traditional IPCC range (e.g. Aldrin et al 2012, Ring et al 2012, Otto et al 2013). One important limitation of these methods is that they typically assume a rather idealised low-dimensional and linear system in which the surface temperature can be adequately represented by global or perhaps hemispheric averages. In reality the transient pattern of warming is likely to be a little different from the equilibrium result, which complicates the relationship between observed and future (equilibrium) warming (e.g. Armour et al 2014).

GCM ensemble-based constraints
Some (including me) have tried to generate constraints based on creating an ensemble of GCM simulations in which parameters of the GCM are varied, and then the models are generally evaluated against observations in some way to see which seem more likely. Unfortunately, the results of these experiments seem to be highly dependent on the underlying GCM, as was first shown by Yokohata et al 2010 and has also been confirmed by others (Klocke et al 2011). Therefore, I no longer consider such methods to be of much use. The underlying problem here appears to be that changing parameters within a given GCM structure does not adequately represent our uncertainty regarding the climate system. An alternative which might have the potential to overcome this problem is to use the full CMIP3/CMIP5 ensemble of climate models from around the world. These models generate a much richer range of behaviour, though debate still rages as to whether this range is really adequate or not (and for what purposes).

Some recent papers which explore the CMIP ensembles have presented arguments that the climate models with the higher sensitivities tend to be more realistic when we examine them in various ways (e.g. Fasullo and Trenberth 2012, Shindell 2014). If these results are correct, then the current moderate warming rate is a bit of an aberration, and so a substantial acceleration in the warming rate can be expected to occur in the near future, sufficient not only to match the modelled warming rate, but even to make up the ground recently lost. It must be noted that these analyses are primarily qualitative in nature, in that they do not provide quantitative probabilistic estimates of the sensitivity (instead merely arguing that higher values are preferred). Thus it is difficult to judge whether they really do contradict analyses based on the recent warming.

Paleoclimate evidence
When averaged over a sufficiently long period of time, the earth must be in radiative balance or else it would warm or cool massively. This enables us to use paleoclimatic evidence to estimate the sensitivity of the climate. The changes to the climate system over the multi-million year time scales that may be considered here are generally far more complicated than just a change in GHG concentrations, including changes to ice sheets, continental drift and associated mountain range uplift, opening and closing of ocean passages, and vegetation changes.

It may be naively assumed or expected that we can just add up the forcings and use the temperature response to determine the equilibrium sensitivity, but model simulations suggest that there is significant nonlinearity in how the climate system responds to the multiple changes that have occurred. For example, Yoshimori et al (2011) found that the combined response to ice sheet changes and the reduction in GHG concentration at the Last Glacial Maximum is not the same as the sum of the responses to each of these forcings in isolation. Therefore, it would be difficult to derive a precise estimate of the sensitivity to CO2 forcing from an analysis of paleoclimatic evidence.

Nevertheless, paleo studies have a number of important consequences for understanding climate change. Firstly, the evidence does help to rule out both very high and very low sensitivities. The global mean temperature has clearly varied by several degrees over long time scales (in tandem with substantial changes to radiative forcings), which can only really be reconciled with an overall sensitivity around the 2-4.5°C level or thereabouts (Rohling et al 2012). Secondly, models do a reasonable job at reproducing this, though they are far from perfect (data limitations make it hard to say quite how bad they are). Thirdly, at more regional scales, models disagree quite substantially both with each other and often with the data, which suggests that future projections might also be some way off. And finally, paleoclimate data also carry a message for how substantial an issue climate change really is. Our recent estimate was that the LGM was 4°C colder than the pre-industrial state (others might argue for a value closer to 6°C) and for this global average change, much of the North American continent and northern Scandinavia were covered in ice sheets several thousand metres thick. Obviously the changes in a warmer future will be rather different, but we can't expect them to be small. Overall, the paleoclimate evidence does not tightly constrain the equilibrium sensitivity but it does provide reasonable grounds for expecting a figure around the IPCC canonical range (which could be used as a prior for Bayesian analyses).

The recent transient warming (combined with ocean heat uptake and our knowledge of climate forcings) points towards a "moderate" value for the equilibrium sensitivity, and this is consistent with what we know from other analyses. Overall, I would find it hard to put a best estimate outside the range of 2-3°C.

Biographical sketch
Originally a mathematician, Dr James Annan has worked in research areas including agriculture, and ocean forecasting. For the past 13 years, he worked in the Japanese climate change research institute FRSGC/FRCGC/RIGC (perhaps better known as the home of the Earth Simulator). His (frequent co-author) wife and he were the two most highly cited scientists based in Japan in the recent IPCC AR5. They left Japan last year, returned to the UK, and will continue to present their research at http://www.blueskiesresearch.org.uk

Aldrin, M., Holden, M., Guttorp, P., Skeie, R. B., Myhre, G., & Berntsen, T. K. (2012). Bayesian estimation of climate sensitivity based on a simple climate model fitted to observations of hemispheric temperatures and global ocean heat content. Environmetrics. doi:10.1002/env.2140

Annan, J. D., & Hargreaves, J. C. (2006). Using multiple observationally-based constraints to estimate climate sensitivity. Geophysical Research Letters, 33(6), 1–4. doi:10.1029/2005GL025259

Annan, J. D., & Hargreaves, J. C. (2011). On the generation and interpretation of probabilistic estimates of climate sensitivity. Climatic Change, 104(3-4), 423–436. doi:10.1007/s10584-009-9715-y

Armour, K. C., Bitz, C. M., & Roe, G. H. (2013). Time-Varying Climate Sensitivity from Regional Feedbacks. Journal of Climate, 26, 4518–4534. doi:10.1175/JCLI-D-12-00544.1

Fasullo, J. T., & Trenberth, K. E. (2012). A Less Cloudy Future: The Role of Subtropical Subsidence in Climate Sensitivity. Science, 338(6108), 792–794. doi:10.1126/science.1227465

Forest, C., Stone, P., Sokolov, A., Allen, M. R., & Webster, M. (2002). Quantifying uncertainties in climate system properties with the use of recent climate observations. Science, 295(5552), 113.

Klocke, D., Pincus, R., & Quaas, J. (2011). On constraining estimates of climate sensitivity with present-day observations through model weighting. Journal of Climate. doi:10.1175/2011JCLI4193.1

Otto, A., Otto, F. E. L., Boucher, O., Church, J., Hegerl, G., Forster, P. M., et al. (2013). Energy budget constraints on climate response. Nature Geoscience, 6(6), 415–416. doi:10.1038/ngeo1836

Ring, M. J., Lindner, D., Cross, E. F., & Schlesinger, M. E. (2012). Causes of the global warming observed since the 19th century. Atmospheric and Climate Sciences, 2, 401. doi:10.4236/acs.2012.24035

Rohling, E. J., Sluijs, A., Dijkstra, H. A., Köhler, P., van de Wal, R. S. W., et al. (2012). Making sense of palaeoclimate sensitivity. Nature, 491(7426), 683–691. doi:10.1038/nature11574

Rose, B. E., Armour, K. C., Battisti, D. S., Feldl, N., & Koll, D. D. (2014). The dependence of transient climate sensitivity and radiative feedbacks on the spatial pattern of ocean heat uptake. Geophysical Research Letters. doi:10.1002/(ISSN)1944-8007

Shindell, D. T. (2014). Inhomogeneous forcing and transient climate sensitivity. Nature Climate Change, 4(4), 274–277. doi:10.1038/nclimate2136

Yokohata, T., Webb, M. J., Collins, M., Williams, K. D., Yoshimori, M., Hargreaves, J. C., & Annan, J. D. (2010). Structural Similarities and Differences in Climate Responses to CO2 Increase between Two Perturbed Physics Ensembles. Journal of Climate, 23(6), 1392. doi:10.1175/2009JCLI2917.1

Yoshimori, M., Hargreaves, J. C., Annan, J. D., Yokohata, T., & Abe-Ouchi, A. (2011). Dependency of Feedbacks on Forcing and Climate State in Physics Parameter Ensembles. Journal of Climate, 24(24), 6440–6455. doi:10.1175/2011JCLI3954.1

Guest blog John Fasullo

Challenges in Constraining Climate Sensitivity: Should IPCC AR5’s Lower Bound Be Revised Upward?

I would like to start by thanking the conveners of Climate Dialogue for their invitation to participate in this forum for discussing Earth’s climate sensitivity and the challenges involved in its estimation. The invitation provides a valuable opportunity to exchange perspectives on what I view as a critically important issue.

As outlined in the editors’ introduction, considerable challenges remain. Exchanges that highlight these, while also promoting potential solutions for dealing with them, are likely to be central to achieving a better understanding of climate. To provide context for my commentary I have taken the liberty of addressing the basis for the decision made in IPCC AR5 to reduce the lower bound of the estimated range of equilibrium climate sensitivity, asking whether the decision was justified at the time of the report and whether the reduction remains warranted given our improved understanding of climate variability since then. In short, I argue that although IPCC’s conservative and inclusive nature may have justified such a reduction at the time of their report, the evidence accumulated in recent years argues increasingly against such a change.

The Challenge
As outlined in the introduction, there are multiple approaches for estimating Earth’s equilibrium climate sensitivity (ECS) and transient climate response (TCR). All attempt to quantify climate feedbacks: changes in the climate system that either enhance (positive feedbacks) or diminish (negative feedbacks) the change in the amount of energy entering the climate system (the planetary imbalance) as a result of some imposed forcing (e.g. increased atmospheric concentrations of carbon dioxide).

To varying extents, the approaches all face common challenges, including uncertainty in observations and the need to disentangle the response of the system to carbon dioxide from the convoluting influences of internal variability and responses to other forcing, such as due to changes in aerosols, solar intensity, and the concentration of other trace gases. It is known that sensitivity estimates derived from both the instrumental and paleo records entail considerable uncertainty arising from such effects (1, 2).

For some approaches, uncertainty in observations is also a primary impediment. Efforts to estimate climate sensitivity from paleoclimate records are a good example. While benefiting from the large climate signals that can occur over millennia, these approaches face the additional challenge of a proxy record that contains major uncertainty (2). Nevertheless, the paleo record provides a vital perspective for evaluating the slowest climate feedbacks.

General circulation models (GCMs) offer a uniquely physical approach for estimating ECS and TCR and readily allow for controlled experimentation, yet their representation of key processes is often lacking (for example, the interaction of aerosols with clouds), and some processes, particularly those acting on low frequency timescales or for which observations are generally unavailable, contain additional uncertainty.

Climatological constraint approaches attempt to relate the spread in these uncertainties across GCMs for some simulated field to a key physical feedback or basic model sensitivity. This approach has led to the developing subject of ‘emergent constraints’ (3,4). Challenges for the approach include the difficulty of establishing statistical confidence in identified relationships, due to a lack of independence across GCMs, and the need to firmly establish a physical basis for why a climatological constraint should act as an indicator of future change. The degree of these challenges may relate to how strongly a given field is tied to surface temperature, as useful insight has been gained for some fields (e.g. snow cover and water vapor, 3,5) but not others (clouds, 6).

The relevance of perturbation studies to climate change is limited by the degree to which they can serve as analogues to climate change, the certainty with which their forcing can be known, and the potentially complex and poorly understood interactions between that forcing and nature (e.g. clouds). So-called combined approaches incorporate two or more of the above methods in an attempt to leverage the strengths of each, but in doing so are also susceptible to their weaknesses. Broadening the discussion to address TCR increases the range of relevant processes to include those governing the rate of heat uptake by various reservoirs, particularly the ocean.

To some extent, the distinctions between ECS estimation methods are artificial. All GCMs have used the instrumental record to select model parameter values that produce plausible climates, and similarly all observational constraints require some implicit ‘model’ of the climate, even if this is simply an energy balance approach. It is my perspective that ultimately further progress in estimating both ECS and TCR can best be made by a combined consideration of the individual approaches and the adoption of a physically-based perspective rooted in narrowing uncertainty in the individual feedbacks that govern sensitivity across a broad range of timescales.
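To make the phrase "simply an energy balance approach" concrete, the sketch below uses the global energy budget relations that are commonly applied to the instrumental record: TCR is approximated by F2x·ΔT/ΔF and ECS by F2x·ΔT/(ΔF − ΔQ), where ΔT is the change in global mean surface temperature, ΔF the change in radiative forcing, and ΔQ the change in the heat uptake of the climate system. These relations are not stated in this post, and the input numbers are invented purely for illustration; they are not observational estimates.

```python
F_2XCO2 = 3.7  # W/m2 per CO2 doubling (value quoted in the editors' introduction)

def energy_budget_tcr(delta_t, delta_f):
    """TCR estimate (°C) from changes in temperature and forcing."""
    return F_2XCO2 * delta_t / delta_f

def energy_budget_ecs(delta_t, delta_f, delta_q):
    """ECS estimate (°C); heat still being taken up (delta_q) is subtracted
    from the forcing because it has not yet been expressed as surface warming."""
    if delta_f <= delta_q:
        raise ValueError("forcing change must exceed heat uptake change")
    return F_2XCO2 * delta_t / (delta_f - delta_q)

# Illustrative inputs only: warming in °C, forcing and heat uptake in W/m2.
delta_t, delta_f, delta_q = 0.75, 1.95, 0.65
print(round(energy_budget_tcr(delta_t, delta_f), 2))
print(round(energy_budget_ecs(delta_t, delta_f, delta_q), 2))
```

Even this minimal calculation embeds modelling choices, such as a single global feedback that does not change over time and a forcing estimate that itself relies on models, which is the sense in which the distinction between "observational" and "model-based" estimates is partly artificial.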

The need for physical understanding
Irrespective of their complexity, all approaches are faced with the challenges of attribution and uncertainty estimation, for which the validity of observations, underlying model, and base assumptions are key issues. It therefore is inappropriate to place high confidence in any single approach. This, together with the fact that the approaches do not all lead to the same estimated range of sensitivity, undermines efforts to provide a single best estimate.

A complicating factor is that definitions of ECS can vary somewhat within the context of each approach, with estimates of ECS being based on a rather limited set of feedbacks as traditionally defined in slab-ocean GCM experiments (so-called fast-physics feedbacks including those in clouds, water vapor, and temperature), an additional level of complexity in the context of fully coupled GCMs and the instrumental record (including changes in the upper ocean, cryosphere, and vegetation), and the influence of very low frequency processes on paleoclimate timescales (involving ice sheets, deep ocean). A focus on specific feedbacks, rather than on ranges for sensitivity, promotes an apples-to-apples comparison across these perspectives.

A challenge to the feedbacks-centric approach however is that existing multi-model GCM archives contain output that only allows for limited exploration of feedbacks on a process level. Computation of key diagnostics (e.g. atmospheric moisture and energy budgets) is not possible given the limited availability of the high frequency data required, and many aspects of model physics remain undocumented. There is also a need to include experiments that isolate individual feedbacks. It is anticipated that with additional improvements in these archives and strategic experimental designs, many of these issues will be addressed in coming years (7).

Simple models: when are they simplistic?
Simple models rooted in statistics can be powerful tools for interpreting complex systems, a potential that relates to understanding both GCMs and the instrumental record. Ideally, if the appropriate statistical “priors” can be found for the free parameters in the models and if the underlying model is adequate, there is the potential for significant insight. In practice however, the approaches can be severely limited by the assumptions on which they’re based, the absence of a unique “correct” prior, and the sensitivity of their methods to uncertainties in observations and forcing (8, 9).
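The dependence on the prior can be illustrated with a toy calculation. The sketch below applies one and the same likelihood under two priors that could each be called "uninformative": uniform in sensitivity S, and uniform in the feedback parameter λ = F2x/S (equivalent to a prior proportional to 1/S² in S). The Gaussian likelihood in λ, its centre and width, and the 0.5 to 20°C grid are all assumptions chosen for the example and are not taken from any of the cited studies.

```python
import math

F_2XCO2 = 3.7  # W/m2 per CO2 doubling

def likelihood(s, lam_obs=1.23, lam_sd=0.4):
    """Made-up Gaussian likelihood expressed in the feedback parameter."""
    lam = F_2XCO2 / s
    return math.exp(-0.5 * ((lam - lam_obs) / lam_sd) ** 2)

def posterior(grid, prior):
    unnormalised = [prior(s) * likelihood(s) for s in grid]
    total = sum(unnormalised)
    return [p / total for p in unnormalised]

def percentile(grid, probs, q):
    """Smallest grid value at which the cumulative probability reaches q."""
    cumulative = 0.0
    for s, p in zip(grid, probs):
        cumulative += p
        if cumulative >= q:
            return s
    return grid[-1]

grid = [0.5 + 0.05 * i for i in range(391)]  # 0.5 to 20.0 °C

flat_in_s = posterior(grid, lambda s: 1.0)             # uniform in sensitivity
flat_in_lambda = posterior(grid, lambda s: 1.0 / s**2)  # uniform in feedback

p95_flat_in_s = percentile(grid, flat_in_s, 0.95)
p95_flat_in_lambda = percentile(grid, flat_in_lambda, 0.95)
print(p95_flat_in_s, p95_flat_in_lambda)
```

With identical data, the prior that is uniform in sensitivity puts noticeably more probability on high values than the prior that is uniform in feedback, so the upper bound of the resulting range is partly a consequence of the prior rather than of the observations.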

Simple models are also problematic in that they are of limited use for hypothesis development and testing. They do not resolve individual feedbacks, and thus how to incorporate them in the approach for future progress mentioned above remains unclear. This is not to say, however, that they offer no potential for hypothesis building. In fact, one hypothesis that has been suggested based on simple models is that the climate record of the past 15 years or so argues for a reduction in the lower bound of our estimated range of ECS, due to the reduced rate at which the surface has warmed and the negative feedbacks it might be viewed as suggesting. Indeed, this hypothesis was found to be sufficiently compelling that IPCC AR5 lowered its lower bound estimate on the likely range for ECS (10). But in retrospect, was this change warranted?

The “Hiatus”: Evidence For Lower Sensitivity?
In the past decade or so there has been a slowdown in the rate of global surface warming. This so-called “hiatus” has been manifested with both seasonal and spatial structure, with greatest surface cooling occurring in the tropical eastern Pacific Ocean in boreal winter and little cooling apparent over land or at high latitudes (9). The apparent slowdown of global surface warming has led some to conclude that evidence for lower climate sensitivity is “piling up” (11). Some have even argued that global warming has stopped.

It is true that, under the assumption of all things being equal, simple models have provided a consistent message regarding the need to lower the likely estimated ranges of sensitivity in order to achieve a best fit to the observational record (12,13). However, per the discussion above, a more physical approach is also essential in order to test this hypothesis and evaluate whether or not the circumstances surrounding the hiatus are indeed suggestive of “all things being equal”. In essence, the physical assumptions underlying this interpretation merit further scrutiny.

If the argument is to be made that recent variability warrants lowering ECS estimates, then clearly a central tenet of that argument is that the planetary imbalance has been mitigated by feedbacks. To reasonably assert that global warming has stopped, the planetary imbalance should be shown to be zero. Such assertions are readily testable across a broad range of independent climate observations and, in fact, a growing body of work has aimed to do just this.

Figure 1: Global ocean heat content from the surface through a) 700 m and b) 2000m with error estimates (bars) based on data from the World Ocean Database (14).

The picture emerging from this work is that surface temperature during the hiatus has not been driven primarily by a reduction in the planetary imbalance due to negative feedbacks but rather by the vertical redistribution of where in the ocean the imbalance is stored. Specifically, the increase in storage in deeper ocean layers has led to a relative reduction in the rate of warming of the upper ocean.

When this vertical structure is averaged out, for example by considering the total ocean heat content (OHC) from the surface to 2000 meters (Figure 1), the data show remarkable constancy in the rate of warming from the 1990s through 2000s. They also show a dramatic shift in how that warming has occurred as a function of ocean depth between decades, with the uppermost layers warming little in recent years in conjunction with rapid warming at depth.

The general lack of strong decadal shifts in total OHC has recently been corroborated by estimates of global thermosteric sea level rise from satellite altimetry, which show remarkable persistence in the rate of thermosteric expansion since 1993 (15). Further, efforts to deduce variability in the planetary imbalance from the satellite record of top of atmosphere radiative fluxes also find little change between the 1990s and 2000s (Richard Allan, personal communication).

The consistent picture that emerges from these various lines of evidence is that any assumption of “all things being equal” with respect to internal variability during the hiatus is invalid and little evidence exists for a role played by reductions in the planetary imbalance due to climate feedbacks. In the context of this exceptionally persistent planetary imbalance, studies suggesting a role for reductions in net forcing as driving the hiatus (16) only heighten the challenge for hypotheses that the hiatus is evidence for a strong negative feedback.

Is such behavior surprising? Not really. As early as 2011, colleagues and I demonstrated that the NCAR CCSM4 reproduced periods analogous to the current hiatus, with hiatus periods accompanying changes in the vertical redistribution of heat driven by winds at low latitudes (17). Subsequent work has shown that similar behavior is evident across a wide range of GCMs. Recent observations have only reinforced the likelihood that the current hiatus is consistent with such simulated periods. The main question that persists relates to the broader context for the hiatus, given the uncertainties surrounding internal variability, and just how unusual such an event may be.

Nature as an ensemble member, not an ensemble mean

Figure 2: The range of decadal trends in global mean surface temperature from the CESM1-CAM5 Large Ensemble Community Project (LE, black and grey lines, 18) along with an observed estimate based on the NOAA-NCDC Merged Land and Ocean Surface Temperature dataset. Also shown are the mean (circle) and range (lines) of simulated planetary imbalance (right axis) from 2000 through 2010 for the 10 members of the LE with greatest cooling (blue) and warming (red).

The NCAR CESM1-CAM5 Large Ensemble (LE) Community Project provides a unique framework for understanding the role of internal variability in obscuring forced changes. It currently consists of 28 ensemble members in simulations of the historical record (1920-2005) and future projections (2006-2080) based on RCP8.5 forcing.

At 4.1°C, the ECS of the CESM1-CAM5 is higher than for most GCMs. Nonetheless, decadal trends from the model track quite closely with those derived from NOAA-NCDC observations (red line), with the model mean decadal trend (thick black line) skirting above and below observed trends about evenly since 1920. In several instances, decadal trends in observations have been at or beyond the LE range including intervals of exceptional observed warming (1945, 1960, 1980) and cooling (1948, 2009). The extent to which these frequent departures from the LE reflect errors in observations, insufficient ensemble size, or biases in model internal variability remains unknown. Nonetheless, there is no clear evidence of the model sensitivity being systematically biased high. Also noteworthy is the fact that the LE suggests that due to forcing, as indicated by the ensemble mean, certain decades including the 2000s are predisposed to a reduced rate of surface warming.

The LE also allows for the evaluation of subsets of ensemble members, such as in Fig 2, where the planetary imbalances for the ten ensemble members with the greatest global surface warming (red) and cooling (blue) trends from 2000-2010 are compared. It is found that no significant difference exists between the two distributions and the mean imbalance for the cooling members is actually greater than for the warming members. Thus the finding of a relatively unchanged planetary imbalance during the recent hiatus period is entirely consistent with analogous periods in LE simulations. While the LE does suggest that recent trends have been exceptional, this is also suggested by the instrumental record itself, which includes exceptional El Niño (1997-98) and La Niña events (2010-2012) at the bounds of the recent hiatus.
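The subset comparison described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping: the temperature and imbalance series below are synthetic random numbers, not LE output, and the variable names, noise levels and the 0.9 W/m2 mean imbalance are assumptions made for the example. Only the ensemble size (28) and the 2000-2010 window come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n_members = 28                     # LE size quoted in the text
years = np.arange(2000, 2011)      # 2000-2010 inclusive

# Synthetic stand-ins (NOT model output): a shared forced warming plus
# member-specific noise for surface temperature, and a planetary
# imbalance drawn independently of the surface temperature.
forced = 0.02 * (years - years[0])                              # degC
gmst = forced + rng.normal(0.0, 0.1, (n_members, years.size))   # degC
imbalance = rng.normal(0.9, 0.3, (n_members, years.size))       # W/m2

# Least-squares linear trend for each member, in degC per decade.
trends = np.polyfit(years, gmst.T, 1)[0] * 10.0

# Ten members with the strongest cooling and the strongest warming trends.
order = np.argsort(trends)
coolest, warmest = order[:10], order[-10:]

# Mean 2000-2010 imbalance per member, then compared across the two subsets.
mean_imbalance = imbalance.mean(axis=1)
print("cooling subset mean imbalance:", mean_imbalance[coolest].mean())
print("warming subset mean imbalance:", mean_imbalance[warmest].mean())
```

With real ensemble output, the two synthetic arrays would be replaced by each member's global mean surface temperature and top-of-atmosphere imbalance; the sorting and subsetting steps stay the same.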

A Path Forward
In my view, a combined effort that makes use of various approaches for constraining sensitivity, with an emphasis on evaluating individual climate feedbacks with targeted observations, provides a viable path forward for reducing uncertainty. Process studies focusing on feedback related fields are also essential and recent efforts have shown consistently that low sensitivity models generally perform poorly and therefore should be viewed as less credible (4, 19, 20). Testing models with paleoclimate archives, where uncertainties in proxy data and forcings are adequately small, is also likely to be essential.

Often lost in the conversation of estimating climate sensitivity is the need for well-understood, well-calibrated, global-scale observations of the energy and water cycles and related analysis systems such as reanalyses to provide a global holistic perspective on climate variability and change. As the hiatus illustrates, such observations can be an invaluable tool for hypothesis testing. Lastly, there is a need to move beyond global mean surface temperature as the main metric for quantifying climate change (21). Improved estimates of ocean heat content have been made possible through data from ARGO drift buoys and improved ocean reanalysis methods. Similar advances are being made across a range of climate indices (e.g. sea level, terrestrial storage) and are likely to be fundamental in providing improved metrics of climate variability and change, evaluating models, and narrowing remaining uncertainties.

Dr. John Fasullo is a project scientist at the National Center for Atmospheric Research in Boulder, CO. He received his B.Sc. degree in Engineering and Applied Physics from Cornell University (1990) and his M.S. (1995) and Ph.D. (1997) degrees from the University of Colorado.

Dr. Fasullo studies processes involved in climate variability and change using both observations and models with a focus on the global energy and water cycles. He has published over 50 peer-reviewed papers dealing with aspects of this work, aimed primarily at understanding variability in clouds, the tropical monsoons, and the global water and energy cycles. His work has centered on identifying strengths and weaknesses across observations and models, and has emphasized the benefits of holistic evaluation of the climate system with multiple datasets, theoretical constraints, and novel techniques. Dr. Fasullo is a member of various committees and science teams, and participated in the IPCC AR4 report that contributed to the award of the Nobel Peace Prize to IPCC in 2007.


  1. Schwartz, S. E. (2012). Determination of Earth’s transient and equilibrium climate sensitivities from observations over the twentieth century: strong dependence on assumed forcing. Surveys in Geophysics, 33(3-4), 745-777.
  2. PALAEOSENS Project Members. (2012). Making sense of palaeoclimate sensitivity. Nature, 491(7426), 683-691.
  3. Hall, A., & Qu, X. (2006). Using the current seasonal cycle to constrain snow albedo feedback in future climate change. Geophysical Research Letters, 33(3).
  4. Fasullo, J. T., & Trenberth, K. E. (2012). A less cloudy future: The role of subtropical subsidence in climate sensitivity. Science, 338(6108), 792-794.
  5. Soden, B. J., Wetherald, R. T., Stenchikov, G. L., & Robock, A. (2002). Global cooling after the eruption of Mount Pinatubo: A test of climate feedback by water vapor. Science, 296(5568), 727-730.
  6. Dessler, A. E. (2010). A determination of the cloud feedback from climate variations over the past decade. Science, 330(6010), 1523-1527.
  7. Meehl, G. A. (2013, December). Update on the formulation of CMIP6. In AGU Fall Meeting Abstracts (Vol. 1, p. 05).
  8. Trenberth, K. E., & Fasullo, J. T. (2013). An apparent hiatus in global warming?. Earth's Future.
  9. Shindell, D. T. (2014). Inhomogeneous forcing and transient climate sensitivity. Nature Climate Change.
  10. Collins, M., R. Knutti, J. Arblaster, J.-L. Dufresne, T. Fichefet, P. Friedlingstein, X. Gao, W.J. Gutowski, T. Johns, G. Krinner, M. Shongwe, C. Tebaldi, A.J. Weaver and M. Wehner, 2013: Long-term Climate Change: Projections, Commitments and Irreversibility. In: Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change [Stocker, T.F., D. Qin, G.-K. Plattner, M. Tignor, S.K. Allen, J. Boschung, A. Nauels, Y. Xia, V. Bex and P.M. Midgley (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA.
  11. Lewis, N. and M. Crok, 2014: A Sensitive Matter, A Report from the Global Warming Policy Foundation, 67 pp.
  12. Lewis, N. (2013). An Objective Bayesian Improved Approach for Applying Optimal Fingerprint Techniques to Estimate Climate Sensitivity*. Journal of Climate, 26(19).
  13. Otto, A., Otto, F. E., Boucher, O., Church, J., Hegerl, G., Forster, P. M., ... & Allen, M. R. (2013). Energy budget constraints on climate response. Nature Geoscience.
  14. Levitus, S., et al. (2012), World ocean heat content and thermosteric sea level change (0–2000 m), 1955–2010, Geophys. Res. Lett., 39, L10603, doi:10.1029/2012GL051106.
  15. Cazenave, A., Dieng, H. B., Meyssignac, B., von Schuckmann, K., Decharme, B., & Berthier, E. (2014). The rate of sea-level rise. Nature Climate Change.
  16. Schmidt, G. A., Shindell, D. T., & Tsigaridis, K. (2014). Reconciling warming trends. Nature Geoscience, 7(3), 158-160.
  17. Meehl, G. A., Arblaster, J. M., Fasullo, J. T., Hu, A., & Trenberth, K. E. (2011). Model-based evidence of deep-ocean heat uptake during surface-temperature hiatus periods. Nature Climate Change, 1(7), 360-364.
  18. Kay, J. E., Deser, C., Phillips, A., Mai, A., Hannay, C., Strand, G., Arblaster, J., Bates, S., Danabasoglu, G., Edwards, J., Holland, M. Kushner, P., Lamarque, J.-F., Lawrence, D., Lindsay, K., Middleton, A., Munoz, E., Neale, R., Oleson, K., Polvani, L., and M. Vertenstein (submitted), The Community Earth System Model (CESM) Large Ensemble Project: A Community Resource for Studying Climate Change in the Presence of Internal Climate Variability, Bulletin of the American Meteorological Society, submitted April 17, Available here: http://cires.colorado.edu/science/groups/kay/Publications/papers/BAMS-D-13-00255_submit.pdf
  19. Huber, M., Mahlstein, I., Wild, M., Fasullo, J., & Knutti, R. (2011). Constraints on Climate Sensitivity from Radiation Patterns in Climate Models. Journal of Climate, 24(4).
  20. Sherwood, S. C., Bony, S., & Dufresne, J. L. (2014). Spread in model climate sensitivity traced to atmospheric convective mixing. Nature, 505(7481), 37-42.
  21. Palmer, M. D. (2012). Climate and Earth’s Energy Flows. Surveys in Geophysics, 33(3-4), 351-357.
Guest blog Nic Lewis

Why is estimating climate sensitivity so problematical?

Climate sensitivity estimates exhibit little consistency. As shown in the Introduction, Figure 1 of Box 12.2 of AR5[i] (reproduced here as Figure 1) reveals that 5–95% uncertainty ranges estimated for equilibrium climate sensitivity (ECS) vary from 0.6–1.0°C at one extreme (Lindzen & Choi, 2011), to 2.2–9.2°C at the other (Knutti, 2002), with medians[ii] ranging from 0.7°C to 5.0°C.

Figure 1. Annotated reproduction of Box 12.2, Figure 1 from AR5 WG1: ECS estimates

Bars show 5–95% uncertainty ranges for ECS, with best estimates (medians) marked by dots. Actual ECS values are given for CMIP3 and CMIP5 GCMs. Unlabelled ranges relate to studies cited in AR4.

The ECS values of CMIP5 general circulation or global climate models (GCMs) – as indicated by the dark blue dots in figure 1 – cover a narrower, but still wide range, of 2.1–4.7°C. So how should one weight the different lines of evidence, and the studies within them?

Climatological constraint studies
All climatological constraint ECS estimates cited in AR5 come from studies based on simulations by multiple variants of the UK HadCM3/SM3 GCM, the parameters of which have been systematically varied to perturb the model physics and hence its ECS values. These are called Perturbed Physics Ensemble (PPE) studies. Unfortunately, the HadCM3/SM3 model, maybe in common with other models, has a structural link – probably via clouds – between ECS and aerosol radiative forcing. As a result, at parameter settings that produce even moderately low ECS values, aerosol cooling is so high that the model climate becomes inconsistent with observations. See Box 1 in this document for details. Therefore, the AR5 climatological constraint studies cannot provide scientifically valid observationally-based ECS estimates: they primarily reflect the characteristics of the HadCM3 GCM.

Categories of study that AR5 downplays
AR5 considers all observational ECS estimates. It concludes, in the final paragraph of section 12.5.3 that estimates based on
  • paleoclimate data reflecting past climate states very different from today
  • climate response to volcanic eruptions, solar changes and other non-greenhouse gas forcings
  • timescales different from those relevant for climate stabilization, such as the climate response to volcanic eruptions
are unreliable, that is, may differ from the climate sensitivity measuring the climate feedbacks of the Earth system today. Another example of estimates based on different timescales (in practice, short-term changes) is satellite measured variations in top-of-atmosphere (TOA) radiation. The discussion of that approach in section refers to uncertainties in estimates of the feedback parameter and the ECS from short-term variations in the satellite period precluding strong constraints on ECS. AR5 also concludes in the final sentence of section that paleoclimate estimates support only a wide 10–90% range for ECS of 1.0–6°C. I agree with these conclusions, certainly for current studies.

Instrumental studies based on multidecadal warming
In essence, the only observational estimates remaining are those based on instrumental observations of warming over multi-decadal periods. In the last two or three decades the anthropogenic signal has risen clear of the noise arising from internal variability and measurement/forcing estimation uncertainty. These studies are therefore able to provide narrower ranges than those from paleoclimate studies. A key change between the 2007 AR4 report and AR5 has been a significant reduction in the best estimate of aerosol forcing, which – other things being equal – points to a reduction in estimates of ECS. However, uncertainties remain large, with the aerosol forcing uncertainty being by some way the most important for ECS estimation.

Useful surface temperature records extend back approximately 150 years (the ‘instrumental period’). Global warming ‘in the pipeline’, representing the difference between transient climate response (TCR), a measure of sensitivity over 70 years, and ECS, is predominantly reflected in ocean heat uptake, calculated from changes in sub-surface temperatures, records of which extend back only some 50 years.

In effect, estimates based on instrumental period warming compare measured changes in temperatures with estimates of the radiative forcing from greenhouse gases, aerosols and other agents driving climate change. Some do so directly through mathematical relationships, but most use relatively simple climate models to simulate temperatures, which can then be compared with observations as the model's parameters (control knobs) are varied. The idea is that the most likely values for ECS (and any other key climate system properties being estimated) are those that correspond to the model parameter settings at which simulations best match observations.

Whichever method is employed, GCMs or similar models have to be used to help estimate most radiative forcings and their efficacy, the characteristics of internal climate variability and maybe other ancillary items. But these uses do not rely on the ECS values of the models involved: GCMs with very different ECS values can provide similar estimates of effective forcings, internal variability, etc.[iii] However, some ECS and TCR studies were based on GCM-derived estimates of anthropogenic warming or recent ocean heat uptake rather than observations. Although those estimates may have taken observational data into account, it is unlikely that they fully did so.

I will consider studies in the Combination category in Figure 1, Box 12.2 together with those in the Instrumental category, since the combination estimates all include an instrumental estimate. I include the unlabelled AR4 Instrumental studies Frame et al (2005), Gregory et al (2002) and Knutti et al (2002) and the unlabelled AR4 Combination study Hegerl et al (2006). I exclude the Lindzen & Choi (2011) and Murphy et al (2009) studies, and also the unlabelled AR4 Forster & Gregory (2006) study, as they are based on satellite measured short-term variations in TOA radiation, an approach deprecated by AR5. (Two of these three studies actually give low, well-constrained ECS estimates.) I exclude Bender et al (2010) and the unlabelled AR4 Combination study Annan & Hargreaves (2006) since they involve the response to volcanic eruptions, an approach deprecated in AR5.

That leaves all the AR4 and AR5 Instrumental and Combination studies that involve estimating ECS from multidecadal warming. They are a mixed bag: AR5 includes sensitivity estimates from flawed observational studies that used unsuitable data, were poorly designed and/or employed inappropriate statistical methodology. Before considering individual studies, I will highlight two particular issues that each affect a substantial number of the instrumental-period warming studies.

Aerosol forcing estimation
Many of the observational instrumental-period warming ECS estimates that were featured in Figure 1, Box 12.2 of AR5, or TCR estimates featured in Figure 10.20.a of AR5, used values for aerosol forcing that either:

a) were consistent with the AR4 estimate; this was substantially higher than the estimate, based on better scientific understanding and observational data, given in AR5;

b) reflected aerosol forcing levels in particular GCMs that were substantially higher than the best estimates given in AR5; or

c) were estimated along with ECS using global mean temperature data.

Any of these approaches will lead to an unacceptable, biased ECS (or TCR) estimation. This is obvious for a) and b). Regarding c), because the time-evolution of global aerosol forcing is almost identical to that from greenhouse gases, it is impossible to estimate both aerosol forcing – which largely affects the northern hemisphere – and ECS (or TCR) with any accuracy without separate temperature data for the northern and southern hemispheres.
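The identifiability problem in c) can be illustrated with a toy calculation. The sketch below is not any study's method: the forcing curve, the sensitivity and aerosol scaling values, and the 80% northern-hemisphere share of aerosol forcing are all hypothetical, and the hemispheres are treated as thermally independent, which is a deliberate oversimplification. It shows that when aerosol forcing has the same time evolution as greenhouse gas forcing, two quite different combinations of sensitivity and aerosol strength produce the same global mean temperature series, while their hemispheric series differ.

```python
import numpy as np

years = np.arange(1900, 2011)

# Hypothetical greenhouse gas forcing (W/m2), growing smoothly to 3.0.
ghg = 3.0 * ((years - 1900) / 110.0) ** 2
# Aerosol forcing pattern with the SAME time evolution, scaled to -1.0 W/m2 at the end.
aer_shape = -ghg / 3.0

def global_temp(sens, aer_scale):
    """Toy response: temperature = sensitivity (degC per W/m2) * total forcing."""
    return sens * (ghg + aer_scale * aer_shape)

def hemi_temp(sens, aer_scale, aer_share):
    """Toy hemispheric response; aer_share is the fraction of global aerosol
    forcing located in that hemisphere (assumed, no inter-hemispheric mixing)."""
    return sens * (ghg + 2.0 * aer_share * aer_scale * aer_shape)

# Two different parameter pairs: low sensitivity with weak aerosol cooling,
# and higher sensitivity with strong aerosol cooling.
low = (0.40, 0.9)
high = (0.56, 1.5)

same_global = np.allclose(global_temp(*low), global_temp(*high))
same_north = np.allclose(hemi_temp(*low, 0.8), hemi_temp(*high, 0.8))
print("global series identical:", same_global)
print("northern hemisphere series identical:", same_north)
```

In this toy setting the global series cannot distinguish the two cases, whereas the northern-hemisphere series can, which is the sense in which separate hemispheric temperature data are needed.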

On my analysis, ECS estimates from Olson et al (2012), Tomassini et al (2007) and the AR4 study Knutti et al (2002) are unsatisfactory due to problem c).

Inappropriate statistical methodology
Most of the observational instrumental-period warming based ECS estimates cited in AR5 use a 'Subjective Bayesian' statistical approach.[iv] The starting position of many of them – their prior – is that all climate sensitivities are, over a very wide range, equally likely. In Bayesian terminology, they start from a ‘uniform prior’ in ECS. All climate sensitivity estimates shown in the AR4 report were stated to be on a uniform-in-ECS prior basis. So are many cited in AR5.

Use of uniform-in-ECS priors biases estimates upwards, usually substantially. When, as is the case for ECS, the parameter involved has a substantially non-linear relationship with the observational data from which it is being estimated, a uniform prior generally prevents the estimate fairly reflecting the data. The largest effect of uniform priors is on the upper uncertainty bounds for ECS, which are greatly inflated.

Instead of uniform-in-ECS priors, some climate sensitivity estimates use ‘expert priors’. These are mainly representations of pre-AR5 ‘consensus’ views of climate sensitivity, which largely reflect estimates of ECS derived from GCMs. Studies using expert priors typically produce ECS estimates that primarily reflect the prior, with the observational data having limited influence.

ECS estimates from the majority of instrumental-period warming based studies – identified below – are seriously biased up by use of unsuitable priors, typically a uniform-in-ECS prior and/or an expert prior for ECS. Unusually, although Aldrin et al (2012) used a Subjective Bayesian method, because its ECS estimates are well constrained they are only modestly biased by the use of a uniform-in-ECS prior (although its estimate using a uniform-in-1/ECS prior appears to reflect the data better).
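The effect of the prior on the upper bound can be shown with a small numerical experiment. The sketch below assumes a hypothetical observation of the feedback parameter F2xCO2/ECS with Gaussian error (the values 2.0 ± 0.5 W/m2/K are invented for illustration) and compares the posterior under a uniform-in-ECS prior with that under a uniform-in-1/ECS prior, evaluated on the same ECS grid. It is not a reproduction of any cited study, and the uniform-in-1/ECS prior is used here only as a comparison case.

```python
import numpy as np

F2XCO2 = 3.71  # W/m2, forcing from doubled CO2 (AR5 value)

# Hypothetical observation of the feedback parameter lambda = F2xCO2 / ECS.
LAM_OBS, LAM_SD = 2.0, 0.5  # W/m2/K, illustrative only

ecs = np.linspace(0.1, 20.0, 20000)   # wide ECS range, degC
lam = F2XCO2 / ecs                    # non-linear link between ECS and the data
likelihood = np.exp(-0.5 * ((lam - LAM_OBS) / LAM_SD) ** 2)

def percentile(pdf, q):
    """ECS value at cumulative probability q for an unnormalised density on the grid."""
    cdf = np.cumsum(pdf)
    cdf /= cdf[-1]
    return float(np.interp(q, cdf, ecs))

# Uniform prior in ECS: the posterior is proportional to the likelihood.
post_uniform_ecs = likelihood
# Uniform prior in 1/ECS: expressed as a density in ECS it is proportional to 1/ECS^2.
post_uniform_inv = likelihood / ecs**2

for name, post in [("uniform in ECS", post_uniform_ecs),
                   ("uniform in 1/ECS", post_uniform_inv)]:
    print(name, "median:", round(percentile(post, 0.5), 2),
          "95th percentile:", round(percentile(post, 0.95), 2))

p95_uniform_ecs = percentile(post_uniform_ecs, 0.95)
p95_uniform_inv = percentile(post_uniform_inv, 0.95)
```

Because the likelihood flattens out at high ECS (where F2xCO2/ECS changes very little), the uniform-in-ECS posterior keeps substantial weight in the upper tail, and its 95th percentile comes out well above that of the uniform-in-1/ECS case.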

Which instrumental warming studies are unsatisfactory, and why?
I will give just very brief summaries of serious problems that affect named studies and render their ECS estimates unsatisfactory.

Frame (2005) – ocean heat uptake incorrectly calculated; uses GCM-estimated anthropogenic warming not directly observed temperatures; ECS estimate badly biased by use of a uniform prior for ocean effective diffusivity (a measure of heat uptake efficiency) as well as for ECS.

Gregory (2002) – external estimate of forcing increase used was under half the AR5 best estimate.

Hegerl (2006) – ECS estimate dominated by one derived from the Frame (2005) study.

Knutti (2002) – poor aerosol forcing estimation [see c) above]; used a very weak pass/fail test to compare simulations with observations; estimate biased up by erroneous ocean heat content data and use of uniform prior for ECS.

Libardoni & Forest (2013) – ECS estimates largely reflect the expert prior used; surface temperature data badly used; and the relationships of its estimates using different datasets are unphysical.

Lin (2010) – forcing increase is too small (ignores strong volcanic forcing at start of simulation period) and assumed TOA imbalance excessive. Non-standard treatment of deep ocean heat uptake.

Olson (2012) – poor aerosol forcing estimation [see c) above]. Instrumental estimate using uniform prior for ECS almost unconstrained; Combination estimate dominated by the expert prior used.

Schwartz (2012) – The upper, 3.0–6.1°C, part of its ECS range derives from a poor quality regression using one of six alternative forcings datasets; the study concluded that dataset was inconsistent with the underlying energy balance model.

Tomassini (2007) – poor aerosol forcing estimation [see c) above]; ECS estimates badly biased by use of a uniform prior for ocean effective diffusivity and alternative uniform and expert priors for ECS.

For anyone who wants more details, I have made available, here, a fuller analysis of all the AR5 instrumental-period-warming based studies shown in Box 12.2, Figure 1 thereof, including Combination studies.

Which instrumental warming studies are satisfactory?
After setting aside all those instrumental-period-warming based studies where I find substantive faults, only three remain: Aldrin et al (2012), Lewis (2013) [solid line Box 12.2 Figure 1 range using improved diagnostic only] and Otto et al (2013). These all constrain ECS well, with best estimates of 1.5–2.0°C. Ring et al (2012), cited in AR5 but not shown in Box 12.2 Figure 1 as it provided no uncertainty ranges, also appears satisfactory. Its best estimates for ECS varied from 1.45°C to 2.0°C depending on the surface temperature dataset used.

Transient climate response estimation
Turning to TCR estimates cited in AR5, the story is similar. The ranges from AR5 studies are shown in Figure 2. As for ECS, I will give very brief summaries of serious problems that affect named studies and render their TCR estimates unsatisfactory.

Figure 2. 5–95% TCR ranges from AR5 studies featured in Figure 10.20.a thereof

Libardoni & Forest (2011) – estimates largely reflect the ECS expert prior used; surface temperature data badly used; and the relationships of its estimates using different datasets are unphysical.

Padilla (2011) – poor aerosol forcing estimation [see c) above]; reducing uncertainty about aerosol forcing by using only post 1970 data lowers range from 1.3–2.6°C to 1.1–1.9°C. Its TCR estimate is sensitive to the forcing dataset and does vary logically with ocean mixed layer depth.

Gregory & Forster (2008) – regressed global temperature on anthropogenic forcing (excluding years with strong volcanism) over 1970–2006. That period coincided with the upswing half of the Atlantic Multidecadal Oscillation cycle, to which 0.1–0.2°C of the 0.5°C temperature rise was probably attributable. Regressing over 70 years using AR5 forcings gives a TCR best estimate of 1.3°C.

Tung (2008) – based on the response to the 11 year solar cycle. Section 10.8.1 of AR5 warns that its estimate may be affected by different mechanisms by which solar forcing affects climate.

Rogelj (2012) – neither a genuine observational estimate, nor published. The study imposes a PDF for ECS that reflects the AR4 likely range and best estimate, which together with the ocean heat uptake data used would have determined a PDF for TCR, with other data having very little influence.

Harris (2013) – an extension of the Sexton (2012) climatological constraint study to include recent climate change. Same problem: the study's TCR estimate mainly reflects the characteristics of the HadCM3/SM3 model, due to its structural link between ECS (& hence TCR) and aerosol forcing.

Meinshausen (2009) – TCR estimate is based on a PDF for ECS matching the AR4 best estimate and range. Finds a similar range using observations, but uses the high AR4 aerosol forcing estimate as a prior. The study appears to observationally constrain that prior weakly, probably because it attempts to constrain many more parameters than the 9 degrees of freedom it retains in the observations.

Knutti & Tomassini (2008) – uses same model setup, data and statistical method as the Tomassini (2007) ECS study, but estimates TCR instead. Same substantial problems as for that study.

I provide a more detailed analysis of AR5 TCR studies here.

On my analysis, only the Gillett et al (2013), Otto et al (2013) and Schwartz (2012) studies are satisfactory. Those studies give well-constrained best estimates for TCR of 1.3-1.45°C, averaging around 1.35°C.

Energy budget studies
It is instructive to consider the robust 'energy budget' method of estimating ECS (and by extension TCR), which involves fewer assumptions and less use of models than most others. In the energy budget method, external estimates – observationally based so far as practicable – of all components of forcing and heat uptake, as well as of global mean surface temperature, are used to compute the mean changes in total forcing, ∆F, in total heat uptake, ∆Q, and in surface temperature, ∆T, between a base period and a final period. Climate sensitivity may then be estimated as:

$$\mathrm{ECS} = F_{2\times\mathrm{CO_2}} \, \frac{\Delta T}{\Delta F - \Delta Q} \qquad (1)$$

where F2xCO2 is the radiative forcing attributable to a doubling of atmospheric CO2 concentration.

Strictly, Equation (1) provides an estimate of effective climate sensitivity rather than equilibrium climate sensitivity, according to the definitions in AR5. However, in practice the two terms are used virtually synonymously in AR5.

Total heat uptake by the Earth's climate system – the rate of increase in its heat content, very largely in the ocean – necessarily equals the net increase in energy flux to space (the Earth's radiative imbalance). As AR5 states (p.920), Eq.(1) follows from conservation of energy. AR5 also points out that TCR represents a generic climate system property equalling the product of F2xCO2 (taken as 3.71 W/m2 in AR5) and the ratio of the response of global surface temperature to a change in forcing taking place gradually over a ~70 year timescale. If most of the increase in forcing during a longer period occurs approximately linearly over the final ~70 years – as is the case over the instrumental period – then it likewise follows that:

$$\mathrm{TCR} = F_{2\times\mathrm{CO_2}} \, \frac{\Delta T}{\Delta F} \qquad (2)$$

The base and final periods each need to be at least a decade long, to reduce the effects of internal variability and measurement error. To obtain reliable and well-constrained estimation, one should choose base and final periods that capture most of the increase in forcing over the instrumental period and are similarly influenced by volcanic activity and internal variability, particularly multidecadal fluctuations. On doing so, best estimates for ECS and TCR using the forcing and heat uptake estimates given in AR5 and surface temperature records from the principal datasets are in line with those given above from studies that I do not find fault with. In fact, they lie in the lower halves of the 1.5–2.0°C ECS and 1.3-1.45°C TCR bands I quoted.
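The energy budget arithmetic is simple enough to write out directly. The following is a minimal sketch, taking ECS as F2xCO2 × ∆T / (∆F − ∆Q) and TCR as F2xCO2 × ∆T / ∆F. The function names are made up for this example, and the ∆T, ∆F and ∆Q inputs are placeholder numbers chosen only to show the calculation; they are not the AR5 estimates referred to above. Only F2xCO2 = 3.71 W/m2 is taken from the text.

```python
F2XCO2 = 3.71  # W/m2, forcing from doubled CO2 (value quoted from AR5 in the text)

def energy_budget_ecs(d_t, d_f, d_q, f2x=F2XCO2):
    """Effective climate sensitivity (degC): F2xCO2 * dT / (dF - dQ)."""
    if d_f - d_q <= 0:
        raise ValueError("dF - dQ must be positive for a meaningful estimate")
    return f2x * d_t / (d_f - d_q)

def energy_budget_tcr(d_t, d_f, f2x=F2XCO2):
    """Transient climate response (degC): F2xCO2 * dT / dF."""
    if d_f <= 0:
        raise ValueError("dF must be positive for a meaningful estimate")
    return f2x * d_t / d_f

# Placeholder changes between a base period and a final period
# (hypothetical values, NOT the AR5 estimates):
d_t = 0.75   # change in global mean surface temperature, degC
d_f = 1.95   # change in total forcing, W/m2
d_q = 0.45   # change in total heat uptake, W/m2

ecs = energy_budget_ecs(d_t, d_f, d_q)
tcr = energy_budget_tcr(d_t, d_f)
print(f"ECS = {ecs:.2f} degC, TCR = {tcr:.2f} degC")
```

Since ∆Q is positive while the system is still taking up heat, the ECS estimate from the same inputs is always larger than the TCR estimate. A full analysis would also propagate the uncertainties in ∆T, ∆F and ∆Q, which this sketch omits.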

Note that Otto et al (2013), of which I am a co-author, was an energy budget study. It used average forcing estimates derived from CMIP5 GCMs rather than the AR5 estimates (which were not available at the time).

Raw model range
Before evaluating the estimates of ECS from the CMIP3 (AR4) and CMIP5 (AR5) GCMs, I will first briefly discuss the TCR values that CMIP5 models exhibit. The AR5 projections of warming over the rest of this century should primarily reflect those TCR values.

Transient response is directly related to ECS, but lower on account of heat uptake by the climate system. CMIP5 GCMs have TCRs varying from 1.1°C to 2.6°C, averaging around 1.8°C – much higher than the sound observationally-based best estimates of 1.3-1.45°C. Moreover, about half the GCMs exhibit increases in transient sensitivity as forcing increases continue[v], so average CMIP5 projections of warming over the 21st century are noticeably higher than would be expected from their TCR values.
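
Under the energy budget approximation, and assuming the forms ECS = F2xCO2 × ∆T / (∆F − ∆Q) and TCR = F2xCO2 × ∆T / ∆F, the link between the two quantities can be written out explicitly. This is only a sketch of the relationship; the behaviour of individual GCMs need not follow it exactly.

```latex
\frac{\mathrm{TCR}}{\mathrm{ECS}}
  = \frac{F_{2\times\mathrm{CO_2}}\,\Delta T / \Delta F}
         {F_{2\times\mathrm{CO_2}}\,\Delta T / (\Delta F - \Delta Q)}
  = \frac{\Delta F - \Delta Q}{\Delta F}
  = 1 - \frac{\Delta Q}{\Delta F}
```

Since heat uptake ∆Q is positive while forcing is rising, the ratio is below one, and the larger the share of the forcing change taken up as heat, the further TCR falls below ECS.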

Feedbacks in GCMs
ECS in GCMs follows from the climate feedbacks they exhibit, which on balance amplify the warming effect of greenhouse gases.[vi] The main feedbacks in these models are water vapour, lapse rate, albedo and cloud feedbacks. Together, the first three of these imply an ECS of around 2°C. The excess of model ECS over 2°C comes primarily from positive cloud feedbacks and adjustments, with nonlinearities and/or climate state dependency also having a significant impact in some cases.
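
The approximate relation given in footnote [vi], ECS = F2xCO2 / (3.2 − sum of feedbacks), can be turned into a small calculation. This is a sketch under that approximation only; the function names are hypothetical, and the 2°C and 3°C targets are the round figures mentioned in this post for the first three feedbacks and for the CMIP5 average respectively.

```python
F2XCO2 = 3.71   # W/m^2, AR5 value quoted in the text
PLANCK = 3.2    # W/m^2 per deg C, Planck response from footnote [vi]


def ecs_from_feedbacks(feedback_sum):
    """ECS (deg C) implied by a given sum of feedbacks (W/m^2 per deg C)."""
    net_response = PLANCK - feedback_sum
    if net_response <= 0:
        raise ValueError("feedback sum must be below the Planck response")
    return F2XCO2 / net_response


def feedbacks_for_ecs(ecs):
    """Sum of feedbacks (W/m^2 per deg C) needed to produce a given ECS."""
    return PLANCK - F2XCO2 / ecs


no_feedback_ecs = ecs_from_feedbacks(0.0)
print(f"No feedbacks: ECS = {no_feedback_ecs:.2f} deg C")
for target in (2.0, 3.0):
    print(f"ECS {target:.1f} deg C needs feedbacks of "
          f"{feedbacks_for_ecs(target):.3f} W/m^2 per deg C")
```

Because the feedback sum sits in the denominator, each additional increment of positive feedback raises ECS by more than the previous one, which is why cloud feedback uncertainty has such a large effect on the upper end of the model range.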

Problems with clouds
Reliable observational evidence for cloud feedback being positive rather than negative is lacking. A section of AR5 discussed attempts to constrain cloud feedback from observable aspects of present-day clouds but concluded that "there is no evidence of a robust link between any of the noted observables and the global feedback".

Cloud characteristics are largely 'parameterised' in GCMs – calculated using semi-heuristic approximations rather than derived directly from basic physics. Key aspects of cloud feedback vary greatly between different models. GCMs have difficulty simulating clouds, let alone predicting how they will change in a warmer world, with different cloud types having diverse influences on the climate. Figure 3 shows how inaccurate CMIP5 models are in representing even average cloud extent; over much of the Earth's surface cloudiness is too low in most models.[vii]

Figure 3. Error in total cloud fraction (TCF) for 12 CMIP5 GCMs. (TCF)sat = averaged MODIS and ISCCP2. Source: Patrick Frank, poster presentation at American Geophysical Union, Fall Meeting 2013: Propagation of Error and The Reliability of Global Air Temperature Projections

Although the overall effects of cloud behaviour on cloud feedback and hence on climate sensitivity are impossible to work out from basic physics and not currently well constrained by observations, the realism of GCM climate sensitivities can be judged from how their simulated temperatures have responded to the increasing forcing over the instrumental period. However, there is a complication.

Problems with aerosols
On average, GCMs exhibit significantly stronger (negative) aerosol forcing than the AR5 best estimate of -0.9 W/m2 in 2011 relative to 1750. Averaged over CMIP5 models for which aerosol forcing has been diagnosed, its change over 1850 to 2000 appears to be around 0.4–0.5 W/m2 more negative than per AR5's best estimate.[viii] In GCMs, much more of the positive greenhouse gas forcing would have been offset by negative aerosol forcing than per the AR5 best estimates, leaving a relatively weak average increase in net forcing. That depresses the simulated temperature rise over the instrumental period. With a weak forcing increase, GCMs needed to have high sensitivities in order to match the warming experienced from the late 1970s until the early 2000s. If aerosol forcing is actually smaller and the models had correctly reflected that fact, they would – given their high sensitivities – have simulated excessive warming.

If aerosol forcing is close to AR5's best estimate, there is little doubt that most of the models are excessively sensitive. But what if AR5's best estimate of aerosol forcing is insufficiently negative? The uncertainty range of the AR5 aerosol forcing estimate is very wide, and probably encompasses all GCM aerosol forcing levels. At present, one cannot say for certain that average GCM aerosol forcing is excessive.
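
The argument that stronger negative aerosol forcing requires a higher sensitivity to match the same warming can be illustrated numerically. The sketch below uses the two 1850–2000 aerosol forcing figures from footnote [viii]; the temperature change, the non-aerosol forcing change and the function name are hypothetical values chosen purely for illustration, and the simple ratio F2xCO2 × ∆T / ∆F is assumed as the measure of transient sensitivity.

```python
F2XCO2 = 3.71          # W/m^2, AR5 value quoted in the text
AR5_AEROSOL = -0.74    # W/m^2, AR5 best estimate, 1850-2000 (footnote [viii])
CMIP5_AEROSOL = -1.23  # W/m^2, CMIP5 average per Shindell et al (footnote [viii])


def sensitivity_needed(delta_t, other_forcing, aerosol_forcing):
    """Transient sensitivity (deg C) required to produce delta_t of warming."""
    net_forcing = other_forcing + aerosol_forcing
    if net_forcing <= 0:
        raise ValueError("net forcing change must be positive")
    return F2XCO2 * delta_t / net_forcing


# Hypothetical inputs, NOT observational estimates.
delta_t = 0.7        # deg C of warming to be matched
other_forcing = 2.5  # W/m^2 change in all non-aerosol forcings

difference = round(CMIP5_AEROSOL - AR5_AEROSOL, 2)
with_ar5 = sensitivity_needed(delta_t, other_forcing, AR5_AEROSOL)
with_cmip5 = sensitivity_needed(delta_t, other_forcing, CMIP5_AEROSOL)
print(f"Aerosol forcing difference: {difference} W/m^2")
print(f"Sensitivity needed with AR5 aerosols:   {with_ar5:.2f} deg C")
print(f"Sensitivity needed with CMIP5 aerosols: {with_cmip5:.2f} deg C")
```

With the same warming to explain, the more negative aerosol forcing leaves a smaller net forcing and therefore implies a higher sensitivity, whatever the specific illustrative inputs.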

Too fast warming once aerosol forcing stabilised
Fortunately, there is general agreement that aerosol forcing has changed little – probably by no more than ±0.15 W/m2 – since the end of the 1970s. By comparison, over 1979–2012 other forcings increased by about 1.3 W/m2. So by comparing model-simulated global warming since 1979 with actual warming, we can test whether the sensitivity of the CMIP5 GCMs is realistic without worrying too much about aerosol forcing uncertainty. Figure 4 shows that warming comparison over the 35 years 1979–2013, a period that is long enough to be used to judge the models. Virtually all model climates warmed much faster than the real climate, by 50% too much on average. Moreover, this was a period in which the main source of multidecadal internal variability in global temperature, the Atlantic Multidecadal Oscillation (AMO), had a positive influence (see, e.g., Tung and Zhou, 2013). Without its positive influence on the real climate, which was not generally included in GCM simulations, the average excess of CMIP5 model warming over actual would have been far more than 50%.

Figure 4. Modelled versus observed decadal global surface temperature trend 1979–2013
Temperature trends in °C/decade. Source: http://climateaudit.org/2013/09/24/two-minutes-to-midnight/. Models with multiple runs have separate boxplots; models with single runs are grouped together in the boxplot marked ‘singleton’. The orange boxplot at the right combines all model runs together. The red dotted line shows the actual increase in global surface temperature over the same period per the HadCRUT4 observational dataset.
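
A trend comparison of this kind can be sketched with an ordinary least-squares fit. The two series below are synthetic straight lines whose slopes were deliberately chosen to differ by 50%, mirroring the average excess quoted above; they are not HadCRUT4 or CMIP5 data, and the function name is hypothetical.

```python
def trend_per_decade(years, values):
    """Ordinary least-squares linear trend, in units of values per decade."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return 10.0 * sxy / sxx


years = list(range(1979, 2014))  # the 35 years 1979-2013

# Synthetic anomalies in deg C (hypothetical, for illustration only).
observed = [0.016 * (y - 1979) for y in years]
modelled = [0.024 * (y - 1979) for y in years]

obs_trend = trend_per_decade(years, observed)
mod_trend = trend_per_decade(years, modelled)
excess = mod_trend / obs_trend - 1.0
print(f"Observed trend: {obs_trend:.2f} C/decade")
print(f"Modelled trend: {mod_trend:.2f} C/decade")
print(f"Model excess:   {excess:.0%}")
```

With real data, each model run would supply its own series, and the spread of the resulting trends is what the boxplots in Figure 4 summarise.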

Over the slightly shorter 1988–2012 period, Figure 9.9 of AR5, reproduced here as Figure 5, shows an even more striking difference in trends in tropical lower tropospheric temperature over the oceans. The median model temperature trend (shown along the x-axis: the y-axis is not relevant here) is three times that of the average of the two observational datasets, UAH and RSS.

Figure 5. Reproduction of Figure 9.9 from AR5 WG1

Decadal trends for the 1988–2012 period in tropical (20°S to 20°N) lower tropospheric temperature (TLT) over the oceans are shown along the x-axis. Coloured symbols are from CMIP5 models. The black cross (UAH) and black star (RSS) show trends per satellite observations. Other black symbols are from model-based data reanalyses. All but two CMIP5 models exhibit higher TLT trends than UAH and RSS.

To summarise, the ECS and TCR values of CMIP5 models are not directly based on observational evidence and depend substantially on flawed simulations of clouds. Moreover, in the period since aerosol forcing stabilised ~35 years ago most models have warmed much too fast, indicating substantial oversensitivity. I therefore consider that little weight should be put on evidence from GCMs (and the related feedback analysis) as to the actual levels of ECS and TCR.

To conclude, I would summarise my answers to the questions posed in the Introduction as follows:

1. Observational evidence is preferable to that from models, as understanding of various important climate processes and the ability to model them properly is currently limited.

2. Little weight should be given to ECS evidence from the model range or climatological constraint studies. Of observational evidence, only that from warming over the instrumental period should be currently regarded as both reliable and able usefully to constrain ECS, in accordance with the conclusions of AR5. Studies that have serious defects should be discounted.

3. The major disagreement between ECS best estimates based on the energy budget, of no more than about 2°C, and the average ECS value of CMIP5 models of about 3°C, seems to me the main reason why the AR5 scientists felt unable to give a best estimate for ECS. All the projections of future climate change in AR5 are based on the CMIP5 models. Giving a best estimate materially below the CMIP5 model average could have destroyed the credibility of the Working Group 2 and 3 reports. As it is still difficult, given the uncertainties, to rule out ECS being as high as the CMIP5 average, I do not criticise the lack of a best estimate in AR5. However, I think a more forthright and detailed explanation of the reasons was called for. I would have liked a clear statement that most model sensitivities lay towards the top of the uncertainty range implied by the AR5 forcing and heat uptake estimates.

4. The soundest observational evidence seems to point to a best estimate for ECS of about 1.7°C, with a 'likely' (17-83%) range of circa 1.2–3.0°C.

5. Following a detailed analysis of all studies featured in AR5, the only TCR estimates that I consider deserve significant weight are those from the Otto, Gillett and Schwartz studies.

6. The soundest observational evidence points to a 'likely' range for TCR of about 1.0–2.0°C, with a best estimate of circa 1.35°C.
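
To show how a median best estimate and a 'likely' (17–83%) range arise from uncertain inputs, the sketch below samples the energy budget relation ECS = F2xCO2 × ∆T / (∆F − ∆Q) with Gaussian input uncertainties. All means and standard deviations are hypothetical and do not reproduce the ranges quoted above; discarding samples with a non-positive denominator and using nearest-rank percentiles are simplifying assumptions.

```python
import random
import statistics

F2XCO2 = 3.71  # W/m^2, AR5 value quoted in the text


def sample_ecs(n, seed=0):
    """Draw n ECS samples from hypothetical Gaussian input distributions."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        delta_t = rng.gauss(0.7, 0.1)    # deg C (hypothetical)
        delta_f = rng.gauss(2.0, 0.4)    # W/m^2 (hypothetical)
        delta_q = rng.gauss(0.5, 0.15)   # W/m^2 (hypothetical)
        net = delta_f - delta_q
        if net > 0:  # simplification: drop unphysical draws
            samples.append(F2XCO2 * delta_t / net)
    return samples


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list."""
    index = int(round(p / 100.0 * (len(sorted_values) - 1)))
    return sorted_values[index]


ecs_samples = sorted(sample_ecs(20000))
median = statistics.median(ecs_samples)
p17 = percentile(ecs_samples, 17)
p83 = percentile(ecs_samples, 83)
print(f"Median {median:.2f} C, 17-83% range {p17:.2f} to {p83:.2f} C")
```

Because the uncertain quantity ∆F − ∆Q sits in the denominator, the resulting distribution is skewed: the upper bound lies further from the median than the lower bound does, which is the same asymmetry seen in the ranges given in points 4 and 6.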

Nic Lewis is an independent climate scientist. He studied mathematics and physics at Cambridge University, but until about five years ago worked in other fields. Since then he has been researching in climate science and in areas of statistics of relevance to climate science. Over the last few years he has concentrated mainly on the problem of estimating climate sensitivity and related key climate system properties. He has worked with prominent IPCC lead authors on a key paper in the area. He is also sole author of a recent paper that reassessed a climate sensitivity study featured in the IPCC AR4 report, showing that the subjective statistical method it used greatly overstated the risk of climate sensitivity being very high. Both papers are cited and discussed in the IPCC’s recently released Fifth Assessment Report.

Aldrin, M., M. Holden, P. Guttorp, R.B. Skeie, G. Myhre, and T.K. Berntsen, 2012. Bayesian estimation of climate sensitivity based on a simple climate model fitted to observations of hemispheric temperatures and global ocean heat content. Environmetrics;23: 253–271.

Annan, J.D. and J.C. Hargreaves, 2006. Using multiple observationally-based constraints to estimate climate sensitivity. Geophys. Res. Lett., 33: L06704.

Forest, C.E., P.H. Stone and A.P. Sokolov, 2006. Estimated PDFs of climate system properties including natural and anthropogenic forcings. Geophys. Res. Lett., 33: L01705.

Forster, P.M.D., and J.M. Gregory, 2006. The climate sensitivity and its components diagnosed from Earth radiation budget data. J. Clim., 19: 39–52.

Frame, D.J., B.B.B. Booth, J.A. Kettleborough, D.A. Stainforth, J.M. Gregory, M. Collins and M.R. Allen, 2005. Constraining climate forecasts: The role of prior assumptions. Geophys. Res. Lett., 32: L09702.

Gillett, N.P., V.K. Arora, D. Matthews, P.A. Stott, and M.R. Allen, 2013. Constraining the ratio of global warming to cumulative CO2 emissions using CMIP5 simulations. J. Clim., doi:10.1175/JCLI-D-12–00476.1.

Gregory, J.M., R.J. Stouffer, S.C.B. Raper, P.A. Stott, and N.A. Rayner, 2002. An observationally based estimate of the climate sensitivity. J. Clim., 15: 3117–3121.

Gregory, J.M. and P.M. Forster, 2008. Transient climate response estimated from radiative forcing and observed temperature change. J. Geophys. Res., 113: D23105.

Harris, G.R., D.M.H. Sexton, B.B.B. Booth, M. Collins, and J.M. Murphy, 2013. Probabilistic projections of transient climate change. Clim. Dynam., doi:10.1007/s00382–012–1647-y.

Hegerl, G.C., T.J. Crowley, W.T. Hyde, and D.J. Frame, 2006. Climate sensitivity constrained by temperature reconstructions over the past seven centuries. Nature;440: 1029–1032.

Knutti, R., T.F. Stocker, F. Joos, and G.-K. Plattner, 2002. Constraints on radiative forcing and future climate change from observations and climate model ensembles. Nature, 416: 719–723.

Knutti, R. and G.C. Hegerl, 2008. The equilibrium sensitivity of the Earth's temperature to radiation changes. Nature Geoscience;1: 735–743.

Lewis, N., 2013. An objective Bayesian, improved approach for applying optimal fingerprint techniques to estimate climate sensitivity. J. Clim., 26: 7414–7429.

Libardoni, A.G. and C.E. Forest, 2011. Sensitivity of distributions of climate system properties to the surface temperature dataset. Geophys. Res. Lett., 38: L22705.

Libardoni, A.G. and C.E. Forest, 2013. Correction to ‘Sensitivity of distributions of climate system properties to the surface temperature dataset’. Geophys. Res. Lett., doi:10.1002/grl.50480.

Lin, B., et al., 2010: Estimations of climate sensitivity based on top-of-atmosphere radiation imbalance. Atmos. Chem. Phys., 10: 1923–1930.

Lindzen, R.S. and Y.S. Choi, 2011. On the observational determination of climate sensitivity and its implications. Asia-Pacific J. Atmos. Sci.;47: 377–390.

Meinshausen, Malte, Nicolai Meinshausen, William Hare, Sarah C. B. Raper, Katja Frieler, Reto Knutti, David J. Frame, Myles R. Allen, 2009: Greenhouse gas emission targets for limiting global warming to 2°C. Nature, doi: 10.1038/

Murphy, D.M., S. Solomon, R.W. Portmann, K.H. Rosenlof, P.M. Forster, and T. Wong, 2009. An observationally based energy balance for the Earth since 1950. J. Geophys. Res. Atmos.,114: D17107.

Olson, R., R. Sriver, M. Goes, N.M. Urban, H.D. Matthews, M. Haran, and K. Keller, 2012. A climate sensitivity estimate using Bayesian fusion of instrumental observations and an Earth System model. J. Geophys. Res. Atmos.,117: D04103.

Otto, A., F. E. L. Otto, O. Boucher, J. Church, G. Hegerl, P. M. Forster, N. P. Gillett, J. Gregory, G. C. Johnson, R. Knutti, N. Lewis, U. Lohmann, J. Marotzke, G. Myhre, D. Shindell, B Stevens and M. R. Allen, 2013: Energy budget constraints on climate response. Nature Geoscience, 6, 415–416.

Ring, M.J., D. Lindner, E.F. Cross, and M.E. Schlesinger, 2012. Causes of the global warming observed since the 19th century. Atmos. Clim. Sci., 2: 401–415.

Rogelj, J., M. Meinshausen and R. Knutti, 2012. Global warming under old and new scenarios using IPCC climate sensitivity range estimates. Nature Climate Change, 2, 248–253

Schwartz, S.E., 2012. Determination of Earth's transient and equilibrium climate sensitivities from observations over the twentieth century: Strong dependence on assumed forcing. Surv. Geophys., 33: 745–777.

Sexton, D.M. H., J.M. Murphy, M. Collins, and M.J. Webb, 2012. Multivariate probabilistic projections using imperfect climate models part I: outline of methodology. Clim. Dynam., 38: 2513–2542.

Shindell, D.T. et al, 2013. Radiative forcing in the ACCMIP historical and future climate simulations, Atmos. Chem. Phys., 13, 2939-2974

de Szoeke, S.P. et al, 2012. Observations of Stratocumulus Clouds and Their Effect on the Eastern Pacific Surface Heat Budget along 20°S. J. Clim, 25, 8542–8567.

Tomassini, L., P. Reichert, R. Knutti, T.F. Stocker, and M.E. Borsuk, 2007. Robust Bayesian uncertainty analysis of climate system properties using Markov chain Monte Carlo methods. J. Clim., 20: 1239–1254.

Tomassini, L. et al, 2013. The respective roles of surface temperature driven feedbacks and tropospheric adjustment to CO2 in CMIP5 transient climate simulations. Clim. Dynam., doi:10.1007/s00382-013-1682-3.

Tung, K-K and J Zhou, 2013. Using data to attribute episodes of warming and cooling in instrumental records. PNAS, 110, 6, 2058–2063

[i] References to AR5 are to the Working Group 1 report of the IPCC Fifth Assessment, except where the context requires otherwise.

[ii] The 50% probability point, which the target of the estimate is considered equally likely to lie above or below. All the best estimates I quote are medians, unless otherwise stated.

[iii] For instance, one can compute instantaneous radiative forcing (RF) for GHG without a GCM, using line-by-line calculations. But in order to estimate effective radiative forcing (ERF) one needs a GCM to compute how the atmosphere reacts to the presence of the GHG and what effect that has on the TOA radiative balance. Whilst the ratio of the derived ERF to RF will not be totally independent of the GCM's ECS, as a first approximation it will be. And in fact the estimated ratio is close to unity for most forcing agents.

[iv] Aldrin et al (2012), Libardoni & Forest (2013), Olson et al (2012), Tomassini et al (2007) and, of the unlabelled AR4 studies, Annan & Hargreaves (2006), Frame et al (2005), Hegerl et al (2006), Knutti et al (2002) and (dashed bar only) Forster & Gregory (2006).

[v] Figure 1 of Tomassini et al (2013) shows that the global mean temperature increase in the second 70 years of the “1pctCO2” experiment exceeds that in the first 70 years by significantly more than is accounted for by emerging "warming in the pipeline" for 8 of the 14 models analysed. Gregory and Forster (2008), Table 1, also showed similar behaviour for between 5 and 10 (rounding of the stated ratios precludes precise enumeration) of the 12 models analysed.

[vi] Broadly, ECS = F2xCO2/ (3.2 - Sum of feedbacks), 3.2 representing the Planck response of increased radiation from a warmer Earth.

[vii] A peer-reviewed study, de Szoeke et al (2012), likewise found that simulations of the climate of the twentieth century by CMIP3 models had ~50% too few clouds in the area investigated (south-eastern tropical Pacific ocean), and thus far too little net cloud radiative cooling at the surface.

[viii] Shindell et al (2013) estimated the average change in total aerosol forcing from 1850 to 2000 for the CMIP5 models it analysed at -1.23 W/m²; the corresponding best estimate in AR5 is -0.74 W/m².