Predicting the conservation status of IUCN Data Deficient species
In 2010 Conservation International launched its Search for Lost Frogs, an attempt to find one hundred amphibian species not seen in over a decade. Only four of those species were rediscovered, highlighting both the increasing extinction risk facing amphibians and our limited knowledge of their survival status. Millions of species remain to be discovered, and we lack ecological and distribution information for most of the world’s described species. Given this paucity of knowledge, how can we effectively conserve biodiversity?
My PhD addressed this question by studying one of the world’s foremost conservation tools: the IUCN Red List of Threatened Species. The Red List assigns a category of extinction risk to species based on quantitative criteria, and so far has assessed more than 74,000 species.
After an internship at the IUCN Headquarters in Switzerland in 2009, I became interested in the data (or lack thereof) underpinning the Red List. Indeed, one in six species on the Red List are assessed as Data Deficient due to limited knowledge of their ecology, distribution and population status (see the box on being Data Deficient and figure 1). At this stage it was not known how much Data Deficient species influenced conservation priorities derived from the Red List. In 2010 I embarked on a PhD at the Zoological Society of London and Imperial College London to resolve this question.
Uncertain global patterns of extinction risk
I first studied the potential effect of Data Deficient species on perceived patterns of extinction risk in freshwater invertebrates (Bland et al., 2012). Freshwater invertebrates (crayfish, dragonflies and freshwater crabs) have recently been assessed by the IUCN, but show very high levels of data deficiency: 30-49% of species are assessed as Data Deficient.
I simulated the effects of three scenarios concerning Data Deficient species on patterns of extinction risk. The first scenario assumed that no Data Deficient species were actually threatened, the second that Data Deficient species were as threatened as other species, and the third that all Data Deficient species were threatened.
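For a single group of species, the three scenarios reduce to simple arithmetic on species counts. The sketch below is only an illustration of that arithmetic, not the simulation used in the study; the function name and the example counts are hypothetical.

```python
def threatened_proportion_scenarios(n_assessed, n_threatened, n_data_deficient):
    """Proportion of threatened species under three assumptions about
    Data Deficient (DD) species. All counts refer to one group of species;
    n_assessed includes the DD species."""
    data_sufficient = n_assessed - n_data_deficient
    if data_sufficient <= 0 or n_threatened > data_sufficient:
        raise ValueError("counts are inconsistent")
    return {
        # Scenario 1: no DD species is threatened.
        "none_threatened": n_threatened / n_assessed,
        # Scenario 2: DD species are as threatened as data-sufficient species.
        "same_as_others": n_threatened / data_sufficient,
        # Scenario 3: every DD species is threatened.
        "all_threatened": (n_threatened + n_data_deficient) / n_assessed,
    }


# Hypothetical group: 1000 assessed species, 200 threatened, 400 DD.
example = threatened_proportion_scenarios(1000, 200, 400)
```

With a high share of Data Deficient species, the gap between the first and third scenario becomes very wide, which is why patterns within a group can be masked.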
Globally, patterns of extinction risk among geographical regions and invertebrate families changed little across scenarios. Within continents, however, Data Deficient species completely masked any pattern identifying over-threatened families or countries that could have benefited from conservation actions. Data Deficient species are therefore a considerable source of uncertainty for the Red List.
Predicting extinction risk
Although insufficient for formal red listing, some data do exist on Data Deficient species. For example, we know the distribution of many Data Deficient species, so we can infer exposure to large-scale threats by combining distribution maps with spatial data on human population density or deforestation. We can also collect data on species’ body size or ecology from species descriptions and museum specimens. Predicting extinction risk from such basic data could help us rapidly understand the conservation status of thousands of species worldwide.
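One simple way to picture the overlay of a distribution map with a threat layer is to treat both as cells on a shared grid. The sketch below assumes such a grid representation; the cell coordinates, the density values and the threshold are hypothetical, and real analyses would use proper spatial data formats.

```python
from statistics import mean


def threat_exposure(range_cells, threat_layer, high_threshold):
    """Summarise a threat layer (e.g. human population density) over the
    grid cells that make up a species' range.

    range_cells: set of (row, col) cells occupied by the species.
    threat_layer: dict mapping (row, col) to a threat value.
    high_threshold: value at or above which a cell counts as highly threatened.
    """
    values = [threat_layer[cell] for cell in range_cells if cell in threat_layer]
    if not values:
        raise ValueError("range does not overlap the threat layer")
    return {
        "mean_threat": mean(values),
        "share_high": sum(v >= high_threshold for v in values) / len(values),
    }


# Hypothetical 2 x 2 grid of human population density (people per km2).
human_density = {(0, 0): 5.0, (0, 1): 120.0, (1, 0): 40.0, (1, 1): 300.0}
species_range = {(0, 1), (1, 0), (1, 1)}
exposure = threat_exposure(species_range, human_density, high_threshold=100.0)
```

Summaries of this kind can then serve as predictor variables alongside traits such as body size.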
I first tested this idea on mammals. I collected data for species of known conservation status, and predicted their IUCN Red List status with seven different machine learning models (Bland et al., 2014). Machine learning models are flexible and powerful tools for finding patterns in data sets (see the box on machine learning). The models correctly predicted the status of 94% of threatened species, and reproduced the global distribution of threatened mammals.
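The share of threatened species that a model classifies correctly is usually called sensitivity, and its counterpart for non-threatened species is specificity. Assuming the accuracy figure above is read in that sense, the sketch below shows how both measures are computed from observed and predicted statuses; the example labels are hypothetical.

```python
def sensitivity_specificity(observed, predicted):
    """Return (sensitivity, specificity) for boolean labels,
    where True means 'threatened'."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    pairs = list(zip(observed, predicted))
    true_pos = sum(o and p for o, p in pairs)
    false_neg = sum(o and not p for o, p in pairs)
    true_neg = sum(not o and not p for o, p in pairs)
    false_pos = sum(not o and p for o, p in pairs)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one species in each class")
    return (true_pos / (true_pos + false_neg),
            true_neg / (true_neg + false_pos))


# Hypothetical validation set: 4 threatened and 6 non-threatened species.
observed = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, False, False, False, False, True, False]
sens, spec = sensitivity_specificity(observed, predicted)
```

Reporting both measures matters because a model that labels every species as threatened would reach perfect sensitivity while being of little practical use.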
When applied to Data Deficient species, the models predicted 64% of Data Deficient mammals to be at risk, increasing the proportion of threatened mammals from 22 to 27%. This means that we may have underestimated threat levels in mammals, because poorly-known species tend to be at very high risk of extinction.
Poorly known species typically have small range sizes, and occur in remote areas that can be subject to large threats (e.g. deforestation and mining). Geographic regions containing large numbers of potentially threatened Data Deficient mammals are already conservation priorities, suggesting that poorly known species are reasonably well covered by conservation schemes.
A cost-effective IUCN Red List
In 2013 I visited two CEED hubs (The University of Melbourne and The University of Queensland) to apply my models in a decision-theoretic framework. Surveying and re-assessing all Data Deficient species would cost a minimum of US$300 million, but funding for poorly known and uncharismatic species is very scarce. Cost-effectively understanding the conservation status of Data Deficient species is therefore a priority.
I focused on cost-effectively estimating risk levels in Data Deficient species with the right balance of IUCN and model-based conservation assessments. I extended the method to amphibians, reptiles and crayfish, and also looked at the effect of poor data quality on model outputs. I concluded that models and decision theory could provide large monetary savings for the IUCN Red List (up to 69%). This would enable the cost-effective monitoring of progress towards international biodiversity targets, such as the 2020 Aichi Targets.
I am currently collaborating with CEED hubs to prioritize individual Data Deficient mammal species for field surveys and IUCN Red List assessments. I will use Liana Joseph’s Project Prioritization Protocol to allocate money to the Data Deficient species most in need of conservation attention. I will take into account species’ predicted extinction risk, their evolutionary value, and the financial costs and anticipated success of field surveys.
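A cost-efficiency ranking of this kind can be sketched as follows. The scoring formula (risk times evolutionary value times survey success, divided by cost), the greedy allocation and all species names and figures are assumptions made for illustration; they are not the published protocol or real data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SurveyCandidate:
    name: str
    predicted_risk: float       # model-based probability of being threatened (0-1)
    evolutionary_value: float   # relative weight, higher = more distinct
    success_probability: float  # chance the survey yields an assessment (0-1)
    cost: float                 # survey cost in US$


def efficiency(candidate):
    """Expected benefit per dollar (illustrative scoring, see lead-in)."""
    if candidate.cost <= 0:
        raise ValueError("cost must be positive")
    return (candidate.predicted_risk * candidate.evolutionary_value
            * candidate.success_probability / candidate.cost)


def allocate(candidates, budget):
    """Fund surveys in order of decreasing efficiency until money runs out."""
    funded, remaining = [], budget
    for candidate in sorted(candidates, key=efficiency, reverse=True):
        if candidate.cost <= remaining:
            funded.append(candidate.name)
            remaining -= candidate.cost
    return funded, remaining


# Hypothetical candidates and budget.
candidates = [
    SurveyCandidate("species_a", 0.9, 1.0, 0.5, 10000.0),
    SurveyCandidate("species_b", 0.6, 2.0, 0.8, 40000.0),
    SurveyCandidate("species_c", 0.3, 1.0, 0.9, 5000.0),
    SurveyCandidate("species_d", 0.8, 1.5, 0.7, 30000.0),
]
funded, remaining = allocate(candidates, budget=50000.0)
```

In this example the cheapest survey is funded first even though its species has the lowest predicted risk, which illustrates how cost can outweigh risk in an efficiency ranking.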
Data Deficient species can also represent global patterns of conservation knowledge deficiency. Because the Data Deficient category is similarly applied among different animal and plant groups, we can determine how conservation knowledge is accumulating within each of these groups.
I found that global patterns of conservation knowledge deficiency were very different among groups, and that these were caused by different institutional and historical factors. I also found that Data Deficient species could be used as surrogates of undiscovered species for conservation planning.
Monitoring biodiversity change with limited data is an important challenge for international biodiversity targets. Global indicators such as the IUCN Red List require large amounts of data and funds, and so would benefit most from cost-effective approaches.
Being Data Deficient
Around one sixth of the species assessed by the IUCN are classified as Data Deficient. This assessment is due to a lack of information in a number of categories including taxonomy, geographic distribution, population status or threats. Around 15% of mammals, 25% of amphibians, 19% of reptiles and 49% of freshwater crabs are classified as Data Deficient. Where do you prioritise your conservation effort for freshwater crabs when every other species has insufficient data to enable its conservation status to be determined?
Uncertainty within many groups about the true level of extinction risk of Data Deficient species considerably influences our understanding of patterns of threat because the distribution of Data Deficient species is often taxonomically and spatially biased. Genuinely threatened Data Deficient species may be neglected by conservation programs due to their uncertain conservation status.
Species data sets frequently contain many variables with nonlinear relationships, complex interactions, and missing values. As such, traditional statistical methods may lack the predictive ability we need. Machine learning methods are derived from research into artificial intelligence, and are flexible and powerful tools for finding patterns in data. They rely on few assumptions and can accommodate large amounts of data. A wide range of machine learning algorithms is now available, and their relative performance depends on the study objectives and available data. Some machine learning algorithms output the likelihood of a given outcome, which allows easy interpretation of uncertainty in predicting complex processes. As a result of these properties, machine learning algorithms represent a robust approach for deriving rules of thumb to predict extinction risk in Data Deficient species.
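To show what a likelihood output looks like, the sketch below uses a k-nearest-neighbours classifier, chosen here only because it fits in a few lines; it is not presented as one of the models used in the study. The two predictor variables and all values are hypothetical.

```python
import math


def knn_threat_probability(training, query, k=3):
    """Estimate the probability that a species is threatened as the share
    of threatened species among its k most similar training species.

    training: list of (features, is_threatened) pairs.
    query: feature tuple for the species to predict.
    """
    if k < 1 or k > len(training):
        raise ValueError("k must be between 1 and the number of training species")
    nearest = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    return sum(label for _, label in nearest) / k


# Hypothetical features: (log range size, log body mass).
training = [
    ((1.0, 2.0), True),
    ((1.2, 2.5), True),
    ((1.5, 1.8), True),
    ((4.0, 2.2), False),
    ((4.5, 3.0), False),
    ((5.0, 1.5), False),
]
small_range_species = knn_threat_probability(training, (1.3, 2.1))
intermediate_species = knn_threat_probability(training, (3.0, 2.2))
```

The first species sits among small-ranged threatened species and receives a probability of 1, while the second falls between the two groups and receives an intermediate value, signalling a less certain prediction.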
More info: Lucie Bland firstname.lastname@example.org
Bland LM, B Collen, CDL Orme & J Bielby (2012). Data uncertainty and the selectivity of extinction risk in freshwater invertebrates. Diversity and Distributions 18: 1211-1220.
Bland LM, B Collen, CDL Orme & J Bielby (2014). Predicting the Conservation Status of Data-Deficient Species. Conservation Biology. doi: 10.1111/cobi.12372