The rapid growth of climate research provides an unprecedented evidence base for observing the effects of climate change across the globe.
However, the sheer volume of published studies means that evaluating this evidence as a whole presents a daunting challenge.
In our new study, published in Nature Climate Change, we used machine learning methods to assess, classify, and map more than 100,000 peer-reviewed studies of climate impacts.
Our results show that the impact of man-made warming on average temperature and precipitation can already be felt by 85% of the world’s population and across 80% of the world’s land area.
The results also highlight an “attribution gap” between countries in the global north and south due to a relative lack of research on climate impacts in less developed countries.
Since the Intergovernmental Panel on Climate Change (IPCC) published its first assessment report in 1990, the number of studies relevant to observed climate impacts published per year has increased by more than two orders of magnitude.
The first part of the IPCC’s latest assessment report – published in August – alone refers to more than 14,000 scientific articles.
This exponential growth in peer-reviewed scientific publications on climate change is already pushing manual expert assessments to their limits.
In this study, we developed a machine-learning approach – an algorithm that can recognize not only whether a study is about climate impacts, but also the locations it mentions, the climate impact driver – whether the impacts were caused by temperature or precipitation changes – and the type of impact described.
To do this, we use a state-of-the-art deep language representation model called “BERT”. The model captures the context-dependent meaning of text, which allows it to extract the information we are looking for from each study we analyze.
We trained our algorithm using “supervised learning”, which involved our team hand-coding more than 2,000 documents. The algorithm was then able to replicate the classification decisions made by humans well. Its predictions are not perfect, of course, but our approach allows us to explicitly assess the uncertainty surrounding them.
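To give a flavour of this supervised-learning setup – though not the model from our paper, which fine-tunes BERT – a hand-coded training set can drive even a very simple classifier. The sketch below uses a from-scratch naive Bayes model; the category labels and example abstracts are invented purely for illustration:

```python
from collections import Counter, defaultdict
import math

def train_nb(docs, labels):
    """Fit a naive Bayes text classifier from hand-labelled documents."""
    counts = defaultdict(Counter)      # word counts per class
    class_totals = Counter(labels)     # documents per class
    for text, label in zip(docs, labels):
        counts[label].update(text.lower().split())
    vocab = {w for c in counts.values() for w in c}
    return counts, class_totals, vocab

def predict_nb(model, text):
    """Return the most probable class, using add-one smoothing."""
    counts, class_totals, vocab = model
    n_docs = sum(class_totals.values())
    best, best_lp = None, -math.inf
    for label, n in class_totals.items():
        lp = math.log(n / n_docs)                  # class prior
        total = sum(counts[label].values())
        for w in text.lower().split():
            if w in vocab:                          # ignore unseen words
                lp += math.log((counts[label][w] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

# Toy hand-coded training set (invented for illustration)
docs = ["observed warming reduced wheat yields",
        "heat waves increased mortality in cities",
        "we simulate future emissions scenarios",
        "a new satellite sensor design"]
labels = ["impact", "impact", "other", "other"]
model = train_nb(docs, labels)
prediction = predict_nb(model, "warming reduced crop yields")
```

In practice, a contextual model such as BERT far outperforms a bag-of-words classifier on this task, because the same word can signal different categories depending on its surrounding context.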
From our experience with documents double-coded by different human coders, we can testify that human classification is not free of error or disagreement either. How the performance of machine-learning models compares with that of human coders is an interesting area for further research.
In total, our algorithm identified 100,000 studies documenting ways in which human and natural systems have been affected by climate and weather. These papers were drawn from the academic databases Web of Science and Scopus. We did not filter the studies by their quality or by the prestige of the journal in which they were published.
The chart below shows how large – and how rapidly growing – the body of scientific articles documenting climate impacts is. The blue shading reflects the uncertainty of our machine-learning approach.
Interactive chart of the number of climate impact studies identified by the algorithm in the scientific literature for 1986-2019. The blue shading indicates the uncertainty. Credit: Max Callaghan.
The weight of the evidence
Where possible, our approach extracts the location that each study focuses on. This allows us to map how these studies are distributed across the globe, as shown below.
Each cell has a “weighted study” score, with the darker areas on the map indicating where evidence is denser – that is, where more studies refer to that grid cell.
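A minimal sketch of how such a “weighted study” score could be accumulated on a regular grid is shown below. The 2.5-degree cell resolution and the even split of each study’s weight across its mentioned locations are illustrative assumptions, not necessarily the exact scheme used in the paper:

```python
import math

def grid_cell(lat, lon, res=2.5):
    """Map a coordinate to the (row, col) index of a res-degree grid cell."""
    return (math.floor((lat + 90) / res), math.floor((lon + 180) / res))

def weight_evidence(study_locations):
    """Accumulate weighted study counts per grid cell.

    Each study contributes a total weight of 1, split evenly across the
    grid cells that its extracted place names fall into."""
    weights = {}
    for places in study_locations:
        cells = {grid_cell(lat, lon) for lat, lon in places}
        if not cells:
            continue
        for c in cells:
            weights[c] = weights.get(c, 0) + 1 / len(cells)
    return weights

# Two toy studies: one mentions only Berlin, one mentions Berlin and Paris
studies = [[(52.5, 13.4)], [(52.5, 13.4), (48.9, 2.4)]]
weights = weight_evidence(studies)
```

A study mentioning two places thus contributes half a “study” to each corresponding cell, so a cell’s score reflects both how many studies refer to it and how specifically they do so.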
For example, almost all grid cells in Europe have several studies documenting climate impacts. However, there are some areas – especially in Africa – where the distribution of evidence is much sparser.
Use the filters to switch between climate drivers – temperature or precipitation – and types of impact – e.g. “mountains, snow and ice” or “rivers, lakes and soil moisture”. These give a sense of how evidence of different types of impact is distributed, but also highlight some limitations of our method.
Interactive map showing the weight of evidence of climate impacts for grid cells across the globe. The darker shading shows a greater weight of evidence. The filters can be used to specify the climate driver and the category of impact. Credit: Max Callaghan.
Because our documents were categorized using a machine-learning model, there will be some that are misclassified. For example, filtering documents to “coastal and marine ecosystems” concentrates the darker spots on coastal and marine areas, but some dark spots remain inland.
Upon inspecting where our algorithm worked better and worse, we noticed that documents about fish, especially salmon – which migrate to the sea before returning to freshwater to spawn – were sometimes misclassified between “terrestrial and freshwater ecosystems” and “coastal and marine ecosystems”.
Many studies document the effects of rising temperatures in a particular sector – such as crop yields, human health or biodiversity – without necessarily showing, in the same study, whether this temperature rise can be attributed to human influence on the climate.
Our algorithm cannot assess whether each study formally attributes the observed changes to man-made climate change. Doing so would likely require a human expert to read the entire paper – an increasingly difficult task given the ever-growing literature base.
In our study, we pursue a different, data-driven approach to the detection and attribution question. Our algorithm extracts documented impacts and their respective drivers – in our case, temperature and precipitation – at the grid cell level. We then use methods based on physical climate science to assess detectable and attributable trends.
Using a well-established method, we assess detectable trends and their attribution to man-made climate change in the period 1950-2018 based on observational and climate model evidence at the grid cell level.
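The core logic of such a detection test can be sketched in a few lines: fit a linear trend to the observed series and ask whether it falls outside the range of trends generated by unforced climate variability. The comparison against control-run slopes below is a simplified stand-in for the formal detection-and-attribution statistics used in the study:

```python
import statistics

def linear_trend(values):
    """Ordinary least-squares slope of a series against its time step index."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = statistics.fmean(values)
    num = sum((i - t_mean) * (v - v_mean) for i, v in enumerate(values))
    den = sum((i - t_mean) ** 2 for i in range(n))
    return num / den

def trend_detectable(observed, control_runs):
    """A trend is 'detected' if the observed slope lies outside the range
    of slopes produced by unforced control simulations."""
    obs = linear_trend(observed)
    ctl = [linear_trend(run) for run in control_runs]
    return obs > max(ctl) or obs < min(ctl)

# Toy example: a steady 0.02-per-year warming over 69 years (1950-2018)
# versus two invented control runs with no underlying trend
observed = [0.02 * i for i in range(69)]
controls = [[0.0] * 69, [0.001 * (-1) ** i for i in range(69)]]
detected = trend_detectable(observed, controls)
```

Attribution then asks the further question of whether forced model simulations – those including human influence – reproduce the detected trend, which this sketch does not cover.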
We can show that rising temperatures are attributable to human influence almost everywhere that we have data.
The picture for precipitation is less clear. We have fewer grid cells with sufficient data to analyze in this way, there are fewer cells where trends fall outside natural variability, and there are some cells – for example, in west Africa – where precipitation has fallen significantly even though climate models project an increasing trend. We do not expect such cells to show trends attributable to human influence.
By combining our large literature review with physical climate information, we can then provide an assessment of where climate impacts associated with changes in temperature or precipitation may be due to man-made climate change.
You can see this in the figure below: cells are colored pink where the selected driver shows an attributable trend, with darker pink cells showing where several studies refer to that location and climate driver.
Overall, when considering either average temperature or precipitation, we show that attributable trends cover 80% of the world’s land area, home to 85% of the world’s population.
Interactive map (top) and bar graph (bottom) showing areas where a trend can be attributed to man-made climate change (pink), with darker shading indicating a larger number of supporting studies. Gray shading indicates a non-climatic driving force for the trend. The bar graph shows how these results break down across high-income, upper-middle-income, lower-middle-income and low-income countries. The map filters can be used to specify the climate driver and the impact category, while the bar graph filters indicate whether the results are summarized by population or area. Credit: Max Callaghan.
For the majority of grid cells, attributable trends in temperature or precipitation coincide with a large body of evidence on how these trends affect human and natural systems. However, this is not the case everywhere. The bar graph shows how this varies across countries in different income classifications.
For example, considering all temperature-driven impacts, 90% of people living in high-income countries live in areas where trends can be attributed to human influence on the climate. Nearly 90% of these live in areas with a large number of studies on the effects of these trends on human and natural systems.
In low-income countries, the share of people living in areas with attributable trends is 72%. Of these, however, only 22% live in areas with substantial evidence of how temperatures affect human and natural systems.
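The population shares quoted above come from combining three layers per grid cell: population, whether the trend is attributable, and the number of supporting studies. A toy sketch of that bookkeeping follows – note that the “high evidence” threshold of ten studies is an invented placeholder, not the paper’s definition:

```python
def coverage_shares(cells, evidence_threshold=10):
    """Compute (a) the population share living under attributable trends,
    and (b) of those, the share also backed by substantial evidence.

    cells: iterable of (population, trend_attributable, n_studies) tuples.
    """
    cells = list(cells)
    total_pop = sum(p for p, _, _ in cells)
    attr = [(p, n) for p, a, n in cells if a]
    attr_pop = sum(p for p, _ in attr)
    evid_pop = sum(p for p, n in attr if n >= evidence_threshold)
    return attr_pop / total_pop, (evid_pop / attr_pop if attr_pop else 0.0)

# Three toy cells: well-studied, under-studied, and no attributable trend
cells = [(80, True, 20), (10, True, 2), (10, False, 0)]
attr_share, evid_share = coverage_shares(cells)
```

In this toy example, 90% of the population lives under an attributable trend, but only part of that group is also covered by dense published evidence – the gap the bar graph in the figure above illustrates for real data.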
We refer to this phenomenon as the “attribution gap”. It should be noted that lower levels of evidence do not mean that climate change does not affect people in low-income countries. The fact that published evidence is sparse — even where we can observe man-made changes in temperature or precipitation — shows that there is an urgent need for more scientific study of the effects of climate change in the global south.
The approach we present in our paper illustrates the potential of deep-learning techniques, combined with different threads of “big data”, to inform scientific assessments of the available evidence – such as those carried out by the IPCC.
We also hope that it will allow a more systematic combination of climate information across scales, bringing together physical climate science – which often operates at the global scale – with highly localized studies of observed climate impacts in specific sectors.
Our database – which we intend to make publicly available – can in theory be continuously updated, and our algorithm can be improved by investing in further supervised learning. In addition, we can continue to integrate information on detectable and attributable changes in climate impact drivers beyond temperature and precipitation alone.
If science advances by standing on the shoulders of giants, then in times of rapidly expanding scientific literature, the shoulders of giants are becoming harder to reach. Our computer-assisted approach to evidence mapping can offer a leg up.
Callaghan, M. et al. (2021) Machine-learning-based evidence and attribution mapping of 100,000 climate impact studies, Nature Climate Change, doi:10.1038/s41558-021-01168-6