Does more data necessarily lead to better outcomes?
Collecting more data is often used as an excuse to delay a decision.
Sometimes new data is important to a decision, sometimes it won’t affect it.
Value of information (VoI) analysis can help decision makers assess the value of collecting more information.
VoI analysis was originally developed in the decision sciences as a means of explicitly calculating the improvement in decision outcomes when new information is first collected prior to making a decision.
VoI analysis provides an important toolbox for improving adaptive management processes.
“We need more data before we can make a decision!” Everyone working around environmental management would have heard someone say this at some stage. Indeed, most scientists faced with an environmental challenge will say something similar. And yet, if you think about it, the logic of this statement is flawed because doing nothing and collecting data is, in itself, a decision!
The alternative to this decision (to do nothing and collect data) is to do something (take a management decision) with the limited information currently available. Yet these alternatives are rarely evaluated and compared for the outcomes they produce: would collecting new data change and improve the management decision compared to not collecting any new data?
These are exactly the types of decision problems that a technique called Value of Information (VoI) analysis can help us with. In this article, I’ll look at this method and how it can be applied to environmental decisions. I’ll provide some of the theory behind the approach, work through a simple case study problem to illustrate how the analysis works, and also look at a couple of examples where VoI analysis has been employed. The aim is to make VoI analysis more accessible to scientists and managers involved in environmental decision-making.
Data, uncertainty and decision-making
Statisticians and ecologists have spent a lot of time thinking about and measuring how much additional data is needed to reduce uncertainty. Data on species’ distributions and trends, environmental threats, and the impact of climate change all provide information that reduces uncertainty about environmental systems. But, from a decision-making point of view, this information is only valuable if it results in better decisions being made. More information will almost always reduce uncertainty, but it will not improve decisions unless it is relevant to the choice between actions. In these cases it may be better to invest resources in management actions right now, rather than in collecting more data (which also uses up resources).
Although it may sound counter-intuitive, there are two main reasons why collecting more data may not improve a decision, even though it reduces uncertainty. First, the best decision may be so obvious that, even with a lot of uncertainty, we are almost certain which action is the best one to choose. For example, it may be so much more cost effective to recover a koala population through reducing mortality than through restoring habitat that reducing uncertainty about the effects of each of these actions on population persistence has little effect on the decision (Maxwell et al, 2015). If this is the case, we would make the same decision regardless of whether we collect new data or not, and so the management benefit of this data is low.
The other main reason is that the expected benefit of alternative actions may be so similar that it really doesn’t matter which action you choose. In this case, collecting more data is unlikely to improve the outcomes of our decisions because, even if it causes us to switch from one action to another, the outcomes will be almost identical (Pannell & Glenn, 2000).
Somewhere in between these extremes, new data and information will be most valuable. However, this intuition doesn’t help us identify whether new data is valuable for a specific problem, or which type of data we should collect. This is where VoI analysis comes to the rescue.
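The two extremes, and the middle ground between them, can be checked with a few lines of arithmetic. The sketch below uses made-up payoffs (they do not come from any of the studies cited here) for two hypothetical actions, A and B, under two equally likely states of the world. The helper `value_of_resolving_uncertainty` is my own illustrative name: it compares the best expected outcome when we must choose before the state is known with the expected outcome when we can choose after the state is revealed, which is the quantity formalised in the sections that follow.

```python
def value_of_resolving_uncertainty(payoffs, probs):
    """payoffs[action] lists one outcome per state; probs are the state probabilities."""
    # Best expected outcome if we must commit to an action before the state is known
    best_now = max(
        sum(p * v for p, v in zip(probs, outcomes)) for outcomes in payoffs.values()
    )
    # Expected outcome if the state is revealed first and we then pick the best action
    best_informed = sum(
        probs[s] * max(outcomes[s] for outcomes in payoffs.values())
        for s in range(len(probs))
    )
    return best_informed - best_now


probs = [0.5, 0.5]

# Hypothetical payoffs (eg, population growth rates) for each action in each state
dominant = {"A": [1.20, 1.30], "B": [0.90, 1.00]}    # A is best in every state
similar = {"A": [1.00, 1.02], "B": [1.01, 1.01]}     # outcomes are nearly identical
in_between = {"A": [0.80, 1.30], "B": [1.10, 1.00]}  # the best action depends on the state

for label, payoffs in [("dominant", dominant), ("similar", similar), ("in between", in_between)]:
    print(label, round(value_of_resolving_uncertainty(payoffs, probs), 3))
```

With these invented numbers, resolving uncertainty is worth nothing when one action dominates, very little when the actions perform almost identically, and considerably more when the best action depends on which state is true.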
What is Value of Information analysis?
VoI analysis was originally developed in the decision sciences (Raiffa & Schlaifer, 1961) as a means of explicitly calculating the improvement in decision outcomes when new information is first collected prior to making a decision. It does this by calculating outcomes assuming that an optimal decision is made without collecting new data and comparing this to outcomes assuming an optimal decision is made after collecting new data.
VoI analysis has been applied extensively in the medical/health and economics literature and practice, where it has led to significant refinements. Examples in medicine include evaluating medical diagnostic technologies and optimising clinical trials, and its use has been shown to improve economic value per patient.
In the environmental sciences there has been a history of using VoI to assess information to inform pollution and toxicology control in a health context (Yokota & Thompson, 2004), but use in the conservation sciences (and more broadly for environmental management) has been a more recent development (Colyvan, 2016).
The disciplines of environmental science and management have a long history of thinking about adaptive management (Walters, 1986). This framework aims to improve decision making over time as new information is collected (see Decision Point #102). In this context, VoI analysis provides an important toolbox for improving adaptive management processes by allowing us to actually value the new information we collect and to choose between different monitoring activities to inform management during the adaptive management process.
Formulating a Value of Information Analysis
As we stated above, VoI is the difference between optimal decisions made with new information and optimal decisions made without that information. Here we are going to write down the formulation for a general VoI problem. There is a bit of maths in it, but for many environmental or ecological problems the associated calculations can be done in a spreadsheet (Canessa et al, 2015).
First of all, think about a case where we are making decisions under uncertainty and we want to know how much better we would do if we eliminated uncertainty entirely (ie, if we were to make decisions under ‘perfect’ information). But, I hear you say, you will never be able to eliminate uncertainty entirely in a real problem. This is true, but thinking of things in this way does provide useful information on the upper bounds for the benefits of information gain and is often the starting point for a VoI analysis. The improvement in outcomes when eliminating uncertainty entirely is what is known as the Expected Value of Perfect Information (EVPI) and calculating that is what we are going to focus on first.
Expected Value of Perfect Information (EVPI)
Based on the concepts discussed above, the very first thing we need to do is calculate the value of making an optimal decision without new information. But before doing that, let’s try to make this a bit more concrete by thinking of an example where we want to choose between two management actions, fire management and habitat restoration, to conserve an endangered bird species. We think that fire management will affect breeding success and habitat restoration will affect adult mortality, but we are unsure by how much, and so the best management action to choose to maximise population growth rates is also uncertain.
In a simple model we could represent this by having two uncertain parameters representing: (1) the amount that fire management increases breeding success, and (2) the amount that habitat restoration reduces adult mortality. In a more general sense we will often have a set of uncertain parameters, s, that link the management actions to an outcome. In the presence of these uncertain parameters we can calculate the value of making an optimal decision as follows

$$\max_a E_s\left[V(a,s)\right] \qquad (1)$$
where V(a,s) is the environmental outcome (eg, population growth rate) under action a and parameter values s (eg, the parameters for the effects of fire management and habitat restoration on breeding success and adult mortality), $E_s$ indicates taking the expectation of V(a,s) over all possible values of the uncertain parameters s (essentially the mean value), and $\max_a$ indicates finding the action that maximises this expectation (mean). This formulation says that we first find the expected (mean) outcome of each management action across all uncertainties and then find the management action that maximises this expected outcome.

Now let’s use the same reasoning to calculate the value of making an optimal decision when we have no uncertainty (ie, we have perfect information). In this case, the value can be calculated as follows

$$E_s\left[\max_a V(a,s)\right] \qquad (2)$$
Here you can see that we first find the management action that maximises V(a,s) for each specific set of values of s (ie, assuming we know it is the true value) before taking the expectation (mean). Note that we still need to take the expectation after finding the optimal management action because a priori we still don’t know what the true parameter values are. Since EVPI is the difference in value with and without perfect information, we can calculate EVPI as equation (2) minus equation (1), such that

$$\mathrm{EVPI} = E_s\left[\max_a V(a,s)\right] - \max_a E_s\left[V(a,s)\right] \qquad (3)$$
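When the uncertain parameters are continuous rather than a handful of discrete scenarios, the two expectations that make up EVPI are often approximated by averaging over random draws of s. The following is a minimal sketch of that idea and is not a method taken from the sources cited in this article: the function `evpi_from_draws`, the toy `growth` model and the uniform distributions for the two management effects are all assumptions chosen purely for illustration.

```python
import random


def evpi_from_draws(value, actions, draws):
    """Approximate EVPI = E_s[max_a V(a,s)] - max_a E_s[V(a,s)] from draws of s."""
    n = len(draws)
    # Value without new information: average each action over the draws, then pick the best
    value_without_info = max(sum(value(a, s) for s in draws) / n for a in actions)
    # Value with perfect information: pick the best action for each draw, then average
    value_with_info = sum(max(value(a, s) for a in actions) for s in draws) / n
    return value_with_info - value_without_info


def growth(action, s):
    """Hypothetical outcome model; s = (fire_effect, habitat_effect)."""
    fire_effect, habitat_effect = s
    if action == "manage_fire":
        return 1.0 + fire_effect
    if action == "restore_habitat":
        return 1.0 + habitat_effect
    return 1.0  # do nothing


rng = random.Random(1)  # fixed seed so the sketch is repeatable
# Assumed uncertainty: both effects average 0.1, but the fire effect is less well known
draws = [(rng.uniform(0.0, 0.2), rng.uniform(0.05, 0.15)) for _ in range(10_000)]
actions = ["do_nothing", "manage_fire", "restore_habitat"]

print(round(evpi_from_draws(growth, actions, draws), 4))
```

For these assumed distributions the exact answer works out to roughly 0.027, so the sampled estimate should land close to that. The order of operations is the whole point: averaging before maximising gives the value without information, and maximising before averaging gives the value with perfect information.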
Expected Value of Partial Perfect Information (EVPPI)
We can actually extend this idea of the value of perfect information to the idea of eliminating uncertainty in only some parameters and not others. This leads to the idea of the expected value of partial perfect information (EVPPI), which is sometimes referred to as the expected value of perfect X information (EVPXI). This might be useful, for instance, if we wanted to know the value of learning about the effect of fire management on breeding success versus learning about the effects of habitat restoration on adult mortality. Again, more generally, we can think of this as learning about a subset of the parameters in the set of parameters s. Let’s assume that we are interested in the value of eliminating uncertainty in a subset of the parameters, Φ, while the rest of the parameters, Ψ, remain uncertain. The equation for EVPPI based on this is

$$\mathrm{EVPPI} = E_{\Phi}\left[\max_a E_{\Psi}\left[V(a,\Phi,\Psi)\right]\right] - \max_a E_{\Phi,\Psi}\left[V(a,\Phi,\Psi)\right] \qquad (4)$$
where the full set of parameters s = (Φ,Ψ). Here, in the second term on the right hand side of equation (4) we take the expectation across uncertainty in both Φ and Ψ, and then find the optimal management action to calculate the optimal outcome with uncertainty in both Φ and Ψ. In the first term on the right hand side of equation (4) we take the expectation across uncertainty in Φ (the set of parameters we want to value the elimination of uncertainty from) after finding the optimal action to calculate the optimal outcome with uncertainty only in Ψ. Note that EVPPI can never exceed EVPI, since we are only partially reducing uncertainty when considering EVPPI; the two are equal only when resolving uncertainty in Φ alone is enough to settle which action is best.
Expected Value of Sample Information (EVSI)
Earlier we mentioned that the assumption of a complete elimination of uncertainty was pretty unrealistic. In fact, we usually go out into the field and collect data that helps reduce uncertainty but does not eliminate it. For example, we may collect field data on breeding success in areas where fire is being managed and where fire is not being managed to estimate the effects of fire management on breeding success. Although collecting more data of this kind will reduce uncertainty, it will never remove uncertainty entirely. Fortunately, EVPI and EVPPI can be extended to deal with this issue and more realistically estimate the value of reducing uncertainty through the collection of data. For this we use the idea of the expected value of sample information (EVSI). We do not go into the detail of the formulation here but, for those interested, the equation for EVSI is

$$\mathrm{EVSI} = E_x\left[\max_a E_{s \mid x}\left[V(a,s)\right]\right] - \max_a E_s\left[V(a,s)\right] \qquad (5)$$
where x is the data (sample information) collected. Here, the second term on the right hand side of equation (5) is the same as for EVPI. On the other hand, the first term on the right hand side calculates the expectation of the outcomes for each management action over all values of s assuming the data are known, and then maximises this expectation.
Finally, it takes the expectation of this value over all possible values for the data, as the data are also uncertain, to provide an estimate of the value of an optimal decision taken after collecting new data.
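For readers who would like to see the steps of an EVSI calculation spelled out, here is a minimal sketch. It borrows numbers from the worked bird example later in this article (a prior probability of 0.8 that breeding success under fire management is 0.7 and of 0.2 that it is 0.95, and an expected growth rate of 1.1 if habitat is restored). Everything else is an assumption of mine: the baseline adult survival of 0.7 and juvenile survival of 0.5 are not stated in the text and were chosen to be consistent with the growth rates quoted there, and the monitoring design (recording how many of n fire-managed nests succeed, treated as binomial data) is hypothetical.

```python
from math import comb

S_A, S_J = 0.7, 0.5                  # ASSUMED baseline adult and juvenile survival
BETA_PRIOR = {0.7: 0.8, 0.95: 0.2}   # breeding success under fire management: prior
HABITAT_EXPECTED = 1.1               # expected growth rate if habitat is restored


def fire_growth(beta):
    """Growth rate under fire management for a given breeding success."""
    return S_A + S_J * beta


def evsi(n_nests):
    """EVSI of observing the number of successes among n_nests fire-managed nests."""
    # Second term: value of the optimal decision with current information
    prior_fire = sum(p * fire_growth(b) for b, p in BETA_PRIOR.items())
    value_now = max(HABITAT_EXPECTED, prior_fire)

    # First term: loop over every possible data set x = k successes
    value_with_data = 0.0
    for k in range(n_nests + 1):
        # Joint probability of each beta value and of observing k successes
        joint = {
            b: p * comb(n_nests, k) * b**k * (1 - b) ** (n_nests - k)
            for b, p in BETA_PRIOR.items()
        }
        p_k = sum(joint.values())    # marginal probability of this data set
        if p_k == 0.0:
            continue
        # Posterior expected growth under fire management, given the data
        post_fire = sum(j / p_k * fire_growth(b) for b, j in joint.items())
        # Choose the best action for this data set, weighted by how likely it is
        value_with_data += p_k * max(HABITAT_EXPECTED, post_fire)
    return value_with_data - value_now


for n in (0, 5, 20, 100):
    print(n, round(evsi(n), 4))
```

Under these assumptions the value of the data is zero when no nests are monitored and rises with sample size towards, but never beyond, the value of learning breeding success under fire management perfectly.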
Applying VoI analysis
To illustrate the idea of EVPI and EVPPI, let us apply it to our hypothetical threatened bird management problem. Let’s assume that we want to choose the management action that maximises the population growth rate of the species and we have a very simple model of population growth that depends on adult female survival, juvenile female survival, and breeding success. We are going to assume that this model looks like this (Pulliam, 1988)

$$\lambda = S_A + S_J\,\beta \qquad (6)$$
where λ is the growth rate, $S_A$ is the annual probability of adult survival, $S_J$ is the annual probability of juvenile survival, and β is the probability of successfully breeding (ie, breeding success). We assume that adults can give birth to a maximum of one juvenile (ie, clutch size = 1). Now, since adult survival and breeding success depend upon the management action, the growth rate is a function of the management action adopted so that

$$\lambda(x) = S_A(x_{habitat}) + S_J\,\beta(x_{fire}) \qquad (7)$$
where $x_{habitat}$ = 0 if habitat is not restored and $x_{habitat}$ = 1 if habitat is restored, and $x_{fire}$ = 0 if fire is not managed and $x_{fire}$ = 1 if fire is managed. Here you will see that adult survival now depends on whether habitat is restored or not and breeding success depends on whether fire is managed or not. In our hypothetical example, we only have resources to restore habitat or manage fire, and not both, so we also need to work with this constraint:

$$x_{habitat} + x_{fire} \le 1 \qquad (8)$$
It may be reasonable to assume that we have a pretty good understanding of the current survival and breeding success parameters if it were a well-studied population. But we may be uncertain about the likely outcomes under the different management actions; maybe this aspect has never been studied in this population?
But there may be evidence from other populations, or from experts (Runge et al, 2011), that gives us some information a priori about what might happen under the different management actions and allows us to assign probabilities to each possible outcome.
To illustrate this idea, let’s assume that under the restore-habitat action there is a 50% chance that we get an adult survival probability of 0.8 and a 50% chance that we get an adult survival probability of 0.9. Then, under the manage-fire action there is an 80% chance that we get a breeding success of 0.7 and a 20% chance that we get a breeding success of 0.95. These possible outcomes are summarised in Table 1, together with the expected values (ie, the weighted mean of the possible outcomes, weighted by the probability that each outcome occurs) for the do-nothing, restore-habitat, and manage-fire actions. This reveals that the action that maximises the expected growth rate is to restore habitat, with an expected growth rate of 1.1 (that is, the value you obtain by applying equation (1) to this problem, which is the value for the second term of the EVPI equation: the expected outcome with uncertainty).
But note that, although managing fire doesn’t provide such good benefits on average as restoring habitat, there is a small probability that you do better than you can ever do by restoring habitat (ie, a 20% chance you get a growth rate of 1.175). Now we will apply EVPI and EVPPI to this problem.
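The expected values described above can be reproduced with a short script. This is a sketch under stated assumptions rather than the exact calculation behind Table 1: the baseline (do-nothing) values for adult survival, juvenile survival and breeding success (0.7, 0.5 and 0.5) are not given in the text, and I have chosen them because they reproduce the growth rates quoted here (1.1 for restoring habitat and 1.175 for the best case under fire management). Actions are coded as pairs (x_habitat, x_fire), and only those that respect the budget constraint are evaluated.

```python
from itertools import product

# Baseline (do-nothing) parameters. ASSUMED: the text does not state them;
# these values reproduce the growth rates quoted in the article.
S_A0, S_J, BETA0 = 0.7, 0.5, 0.5

# Possible parameter values and their probabilities, as given in the text
SA_IF_RESTORED = {0.8: 0.5, 0.9: 0.5}   # adult survival if habitat is restored
BETA_IF_FIRE = {0.7: 0.8, 0.95: 0.2}    # breeding success if fire is managed


def growth_rate(s_a, beta):
    """Growth rate: adult survival plus juvenile survival times breeding success."""
    return s_a + S_J * beta


def expected_growth(x_habitat, x_fire):
    """Expected growth rate for an action coded as (x_habitat, x_fire)."""
    sa_dist = SA_IF_RESTORED if x_habitat else {S_A0: 1.0}
    beta_dist = BETA_IF_FIRE if x_fire else {BETA0: 1.0}
    return sum(
        p_sa * p_b * growth_rate(s_a, beta)
        for s_a, p_sa in sa_dist.items()
        for beta, p_b in beta_dist.items()
    )


# Keep only the actions that satisfy x_habitat + x_fire <= 1
feasible = [x for x in product((0, 1), repeat=2) if sum(x) <= 1]
names = {(0, 0): "do nothing", (1, 0): "restore habitat", (0, 1): "manage fire"}

expected = {names[x]: expected_growth(*x) for x in feasible}
best_action = max(expected, key=expected.get)

for name, value in expected.items():
    print(f"{name}: {value:.3f}")
print("best:", best_action)
```

With the assumed baseline values, restoring habitat has the highest expected growth rate (1.1), managing fire comes second, and doing nothing leaves the population declining.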
In Table 1 we show that the value of the second term of the EVPI equation is 1.1. But what about the first term, which is the outcome when uncertainty is resolved (equation (2))? To calculate this we need to assess what the best strategy would be under each combination of possible outcomes of the habitat restoration and fire management actions. Table 2 summarises this, showing the optimal strategy for each combination of possible outcomes and the expectation of the optimal growth rates (after an optimal decision has been made) across the combinations. This expectation is the first term in the EVPI calculation and has a value of 1.115, so the EVPI is 1.115 – 1.1 = 0.015, or around 1.5%.
That is to say, if uncertainty were resolved entirely, there would be an improvement in management outcomes that would increase the growth rate by a further 1.5% compared to the case where uncertainty is not resolved.
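The EVPI calculation can be laid out step by step in code. As before, this is a sketch that relies on assumed baseline values (adult survival 0.7, juvenile survival 0.5 and breeding success 0.5 when nothing is done), which are not stated in the text but are consistent with the figures of 1.1 and 1.115 quoted above. The script enumerates the four combinations of possible outcomes, finds the best action in each, and takes the probability-weighted mean.

```python
from itertools import product

S_A0, S_J, BETA0 = 0.7, 0.5, 0.5      # ASSUMED baseline values (not given in the text)
SA_IF_RESTORED = {0.8: 0.5, 0.9: 0.5}  # adult survival if habitat is restored
BETA_IF_FIRE = {0.7: 0.8, 0.95: 0.2}   # breeding success if fire is managed
ACTIONS = ["do nothing", "restore habitat", "manage fire"]


def growth(action, sa_restored, beta_fire):
    """Growth rate of an action when the two uncertain parameters take these values."""
    if action == "restore habitat":
        return sa_restored + S_J * BETA0
    if action == "manage fire":
        return S_A0 + S_J * beta_fire
    return S_A0 + S_J * BETA0


# Every combination of possible outcomes, with its joint probability
scenarios = [
    ((sa, beta), p_sa * p_beta)
    for (sa, p_sa), (beta, p_beta) in product(SA_IF_RESTORED.items(), BETA_IF_FIRE.items())
]

# Second term of EVPI: average over scenarios first, then choose the best action
value_without_info = max(
    sum(p * growth(a, sa, beta) for (sa, beta), p in scenarios) for a in ACTIONS
)

# First term of EVPI: choose the best action in each scenario, then average
value_with_info = 0.0
for (sa, beta), p in scenarios:
    best = max(ACTIONS, key=lambda a: growth(a, sa, beta))
    print(f"S_A={sa}, beta={beta}, prob={p:.1f}: {best} ({growth(best, sa, beta):.3f})")
    value_with_info += p * growth(best, sa, beta)

evpi = value_with_info - value_without_info
print(f"EVPI = {value_with_info:.3f} - {value_without_info:.3f} = {evpi:.3f}")
```

Under the assumed baseline values, restoring habitat and managing fire give the same growth rate when adult survival is 0.8 and breeding success is 0.7, so the choice between them in that scenario does not change the result.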
We can extend this analysis to EVPPI and use it to determine whether it would be best to eliminate uncertainty in the effect of habitat restoration on growth rates or to eliminate uncertainty in the effect of managing fire on growth rates.
The second term in the EVPPI equation is the same as for EVPI, so we know that is still 1.1, but the first term changes. Let’s first look at the first term when resolving uncertainty in habitat restoration. In this case, we calculate the growth rate for each potential outcome for the effect of habitat restoration and compare it to the expected outcome for managing fire (acknowledging the effects of managing fire remains uncertain).
This allows us to identify the optimal strategy under each possible outcome for the effect of habitat restoration, assuming the effect of managing fire remains uncertain. The expected value of these optimal outcomes is the first term in the EVPPI calculation (Table 3). Table 3 shows that the first term is 1.1125, so EVPPI for resolving uncertainty in the effect of habitat restoration is 1.1125 – 1.1 = 0.0125 (or 1.25%). A similar analysis can be done for resolving uncertainty in the effect of managing fire on growth rates (Table 4) and the EVPPI for resolving uncertainty in this case is 1.115 – 1.1 = 0.015 (or 1.5%).
The interesting thing you will notice with the analysis above is that the value of resolving uncertainty in the effect of fire management on growth rates is the same as the value of resolving all uncertainty (EVPI): both equal 1.5%. The reason for this is that, once we have resolved uncertainty in the effect of fire management, there is no longer any uncertainty in which management action to take.
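The two EVPPI calculations follow the same pattern and can be sketched in a few lines. Once again the baseline values (adult survival 0.7, juvenile survival 0.5, breeding success 0.5) are my assumptions, chosen to match the figures quoted in the text, and the helper names are illustrative only. In each case one parameter is fixed at each of its possible values in turn while the expectation is still taken over the other.

```python
S_A0, S_J, BETA0 = 0.7, 0.5, 0.5      # ASSUMED baseline values (not given in the text)
SA_IF_RESTORED = {0.8: 0.5, 0.9: 0.5}  # adult survival if habitat is restored
BETA_IF_FIRE = {0.7: 0.8, 0.95: 0.2}   # breeding success if fire is managed
ACTIONS = ["do nothing", "restore habitat", "manage fire"]


def growth(action, sa_restored, beta_fire):
    """Growth rate of an action when the two uncertain parameters take these values."""
    if action == "restore habitat":
        return sa_restored + S_J * BETA0
    if action == "manage fire":
        return S_A0 + S_J * beta_fire
    return S_A0 + S_J * BETA0


def expected_growth(action, sa_dist, beta_dist):
    """Expectation of the growth rate over the supplied parameter distributions."""
    return sum(
        p_sa * p_b * growth(action, sa, beta)
        for sa, p_sa in sa_dist.items()
        for beta, p_b in beta_dist.items()
    )


# Second term (same as for EVPI): best expected outcome with all uncertainty in place
value_without_info = max(expected_growth(a, SA_IF_RESTORED, BETA_IF_FIRE) for a in ACTIONS)

# First term when the habitat restoration effect is resolved (fire effect still uncertain)
first_term_habitat = sum(
    p * max(expected_growth(a, {sa: 1.0}, BETA_IF_FIRE) for a in ACTIONS)
    for sa, p in SA_IF_RESTORED.items()
)

# First term when the fire management effect is resolved (habitat effect still uncertain)
first_term_fire = sum(
    p * max(expected_growth(a, SA_IF_RESTORED, {beta: 1.0}) for a in ACTIONS)
    for beta, p in BETA_IF_FIRE.items()
)

evppi_habitat = first_term_habitat - value_without_info
evppi_fire = first_term_fire - value_without_info
print(f"EVPPI (habitat effect) = {evppi_habitat:.4f}")
print(f"EVPPI (fire effect)    = {evppi_fire:.4f}")
```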
Table 2 shows that uncertainty in the decision is eliminated because: (1) if we know that β = 0.7 under fire management, then the best strategy is always to restore habitat, regardless of the still uncertain effect of habitat restoration on growth rates, and (2) if we know that β = 0.95 under fire management, then the best strategy is always to manage fire, regardless of the still uncertain effect of habitat restoration on growth rates.
On the other hand, if we resolve uncertainty in the effect of habitat restoration on growth rates, the choice of management decision still depends on the still uncertain effects of fire management on growth rates (Table 2) and EVPPI is lower at 1.25%. This demonstrates that when considering the collection of new data for management, we need to consider the effect on the choice of management decision and not just the effect on uncertainty in our understanding of the system. In the boxes throughout this story we have briefly highlighted some recent applications of VoI analysis for decision-making for conservation. The models they use and problems they solve are more complex than the simple illustration above, but the concepts are identical.
The business of valuing information
Decisions relating to conservation are challenging. They often involve high stakes (get it wrong and you might lose a species or ecosystem), inadequate funding and high levels of uncertainty. As I discussed at the beginning, decision makers often request more information to reduce uncertainty before committing to a course of action – before making a decision.
But, as I hope I have made clear in this short article, putting off a decision is actually a decision itself. Delaying a management action to collect more information involves important trade offs: the cost of gathering that information (in time and money) and the potential improvement in the environmental value you are managing for. And if that value is a critically endangered species, then delay or re-routing funds for management to monitoring might see an irreversible loss if the species goes extinct.
VoI analysis enables decision makers to put a value on what might be gained (or lost) by gathering more information. Undertaking a VoI analysis requires a little technical knowhow but it’s a relatively tractable and straightforward process.
While VoI analysis is only beginning to be applied to conservation problems, the approach is already widely used in the corporate world. When it comes to decisions surrounding big investments, business leaders will quickly analyse the value of acquiring new data as opposed to simply going ahead with the investment decision. For example, deciding on whether to proceed with a new mine requires data from geological surveys, exploratory drillings and so forth.
Delaying the decision (to proceed or look elsewhere) comes at a very explicit cost, as does the acquisition of the extra data. The importance of VoI analysis in these situations is clear and such analyses are commonly undertaken. Given this, there is little excuse for environmental managers not to value the alternatives of proceeding with existing information or delaying a decision when it comes to deciding on actions to conserve our irreplaceable biodiversity. So, maybe next time you hear someone say: “We need more data before we can make a decision,” you might ask on what justification such a delay is warranted.
Does better information save more koalas?
What’s the financial benefit of resolving management uncertainty?
VoI analysis was used to evaluate the benefits of resolving uncertainty surrounding the declining koala population in the Koala Coast region of south-east Queensland (Maxwell et al, 2015). We modelled the effectiveness of koala management using current levels of information and compared this to a situation where all uncertainty about birth and death rates, and the effect of forest cover on these rates, was resolved.
We found the optimal management strategies with and without new information on birth and death rates, and the effect of forest cover on these rates, to be very similar. This similarity suggests that resolving uncertainty will have negligible effects on management performance.
Indeed, we found that a 0.034% improvement in the population growth rate is the best we could expect if uncertainty was resolved. When we converted values of information, in terms of population growth rate, into values of information in terms of dollars, we found that if resolving uncertainty costs more than 1.7% of the koala management budget, it would be more cost-effective to allocate that money to direct management action now.
The value of information was low because optimal management decisions were not sensitive to the uncertainties we considered. Decisions were instead driven by a substantial difference in the cost efficiency of management actions. The value of information was up to forty times higher when the cost efficiencies of different koala management actions were similar.
This demonstrates that the value of reducing uncertainty is highest when it is not clear which management action is the most cost efficient.
Using VoI to design adaptive management
Natural resource management is plagued with uncertainty of many kinds, but not all uncertainties are equally important to resolve. The promise of adaptive management is that learning in the short-term will improve management in the long-term; that promise is best kept if the focus of learning is on those uncertainties that most impede achievement of management objectives. In this context, an existing tool of decision analysis, the expected value of perfect information (EVPI), is particularly valuable in identifying the most important uncertainties.
Expert elicitation can be used to develop preliminary predictions of management response under a series of hypotheses, as well as prior weights for those hypotheses, and the EVPI can be used to determine how much management could improve if uncertainty was resolved.
These methods were applied to management of whooping cranes (Grus americana), an endangered migratory bird that is being reintroduced in several places in North America (Runge et al, 2011). The Eastern Migratory Population of whooping cranes had exhibited almost no successful reproduction through 2009. Several dozen hypotheses can be advanced to explain this failure, and many of them lead to very different management responses. An expert panel articulated the hypotheses, provided prior weights for them, developed potential management strategies, and made predictions about the response of the population to each strategy under each hypothesis.
Multi-criteria decision analysis identified a preferred strategy in the face of uncertainty, and analysis of the expected value of information identified how informative each strategy could be. These results provide the foundation for design of an adaptive management program.
Quality trumps quantity
The value of information for conservation planning under sea level rise
Here’s a slight variation on the theme of value of information. This case study, carried out by Rebecca Runting and colleagues, compared different types of information on sea level rise to determine which type gave the best conservation outcomes. They compared how good a conservation plan was depending on whether it was based on expensive high quality (high resolution) data versus using lower quality (low resolution) data. The high quality data costs a lot more to acquire meaning less money is available for purchasing sites for the reserve network. If planners used the cheaper low quality data, they could purchase more sites for the reserve network.
Their analysis (Runting et al, 2013) produced a striking result: for the upper sea-level rise scenario it was worth spending up to 99% of the available budget on acquiring the high quality data. The break-even cost for the mid-range and lower sea-level rise scenarios was also a large proportion of the total budget (with means of 82% and 64% respectively).
In other words, whilst adopting a more accurate approach may mean less land is acquired in terms of overall area (because you’ve spent some of your money on modelling and data), the land that is acquired would be of greater conservation value than the cheaper approaches. This is because the cheaper approaches tend to omit areas of important conservation value in the prioritisation process. Sometimes, using these less accurate outputs meant that you could never achieve the same conservation value (no matter how much money you spent), as the areas that are important for conservation didn’t exist on the map!
Valuing information for management
Improving the future of box-ironbark forests with targeted learning
Value-of-information analysis reveals the expected benefit of reducing uncertainty to a decision maker. Will Morris and colleagues (Morris et al, 2017) performed this analysis on management of box-ironbark forests in Victoria. With three management alternatives (limited harvest/firewood removal, ecological thinning, and no management), managing the system optimally (for 150 years) with the original information would, on average, increase the amount of forest in a desirable state from 19% to 35% (a 16‐percentage point increase). Their VoI analysis revealed that resolving all uncertainty would, on average, increase the final percentage to 42% (a 23‐percentage point increase). However, only resolving the uncertainty for a single parameter was worth almost two‐thirds the value of resolving all uncertainty.
More info: Jonathan Rhodes email@example.com
Canessa S, G Guillera‐Arroita, J Lahoz‐Monfort, D Southwell, D Armstrong, I Chadès, R Lacy, S Converse & O Gimenez (2015). When do we need more data? A primer on calculating the value of information for applied ecologists. Methods in Ecology and Evolution 6:1219-1228.
Colyvan M (2016). Value of information and monitoring in conservation biology. Environment Systems and Decisions 36:302-309.
Maxwell SL, JR Rhodes, MC Runge, HP Possingham, CF Ng & E McDonald-Madden (2015). How much is new information worth? Evaluating the financial benefit of resolving management uncertainty. Journal of Applied Ecology 52:12-20.
Read a discussion on this paper in Decision Point #87
Morris WK, MC Runge & PA Vesk (2017). The value of information for woodland management: updating a state–transition model. Ecosphere 8:e01998.
Read a discussion on this paper in Decision Point #105
Pannell DJ & NA Glenn (2000). A framework for the economic evaluation and selection of sustainability indicators in agriculture. Ecological Economics 33:135-149.
Pulliam HR (1988). Sources, sinks, and population regulation. American Naturalist 132:652-661.
Raiffa H & RO Schlaifer (1961). Applied Statistical Decision Theory. Graduate School of Business Administration, Harvard University, Cambridge, USA.
Runge MC, SJ Converse & JE Lyons (2011). Which uncertainty? Using expert elicitation and expected value of information to design an adaptive program. Biological Conservation 144:1214-1223.
Runting RK, KA Wilson & JR Rhodes (2013). Does more mean less? The value of information for conservation planning under sea level rise. Global Change Biology 19:352-363.
Read a discussion on this paper in Decision Point #67, p10-12.
Walters CJ (1986). Adaptive management of renewable resources. MacMillan, New York, USA.
Yokota F & KM Thompson (2004). Value of information literature analysis: A review of applications in health risk management. Medical Decision Making 24:287-298.