Unfortunately time is short. I have found it more difficult than I expected to make a handful of specific criticisms of the statistics in Bradshaw et al.
I originally aimed to concentrate attention on a few narrow methodological issues when the paper was brought to my attention. I now feel that statistical weaknesses are indicative of a much deeper problem with papers of this sort.
Environmental science should guide environmental policy. Environmental politics should not exercise direct influence on the way scientific evidence is evaluated. If this is allowed to occur there is a constant risk of the underlying science becoming devalued. This has to be resisted. Scientific publishing concerns itself with rigor and objectivity. Scientists do have an obligation to make decision makers more aware of issues such as the wider positive value of forests. There are now many forums available that can allow us to achieve these goals. As individuals we can always find, develop and strengthen outlets that permit us to express subjective opinions freely and explicitly. A weblog such as this may even be an example.
However, the traditional scientific peer review process should not be seconded to an environmental cause. It must have a different role. The strict objectivity of peer review gives us the credentials and the confirmed evidence that we can use in order to make a difference when we express our opinions as scientists.
In this particular case the authors openly state that they deliberately set out to support decisions that strengthen forest conservation. They write that “for conservation to receive wide political and popular attention and priority, especially in the developing world, there needs to be empirical evidence of nature’s role in supporting human well-being. … For centuries it has been vehemently claimed, and hotly disputed, that forests provide natural protection from floods … Yet, because the claim lacks broad-scale empirical support, the development and implementation of clear flood-mitigation policies regularly stall (FAO & CIFOR, 2005; Calder & Aylward, 2006).” The authors’ stated intention was thus to correct this deficit. I find this quite uncomfortable reading. Scientific evidence doesn’t fall out of a data set as a result of a political necessity to provide it.
Why does the claim of a link between deforestation and flooding “lack empirical support”? I don’t believe that it does. There is ample evidence on which to base a consensus. Flooding, and more specifically the loss of life and property associated with flooding, is the result of a broad set of factors that combine in complex, case-specific ways. These combinations (that could be called the “cause” of flooding) are often quite well understood at the local level when any specific event is investigated. However, the “causes” of floods in general can rarely be analyzed successfully by looking for global-scale correlations. The number of interactions involved prevents purely statistical analyses from incorporating enough of our current understanding to cast new light on the issue. Contemporary, process-based hydrological models now capture the dynamics of flooding remarkably well. Water does move in predictable ways, even if higher powers, including politicians, do not. There are relatively few deep scientific mysteries to be resolved.
The intrinsic difficulty is still in effectively communicating what is already known to decision makers. Flooding is largely predictable (at least in a probabilistic sense). Nevertheless, poor political decisions are constantly made that ignore both science and common sense. They will continue to be made if decision makers can claim that a scientific consensus has not been reached and then use the resulting confusion as a justification to act on non-scientific agendas.
In some specific circumstances forest cover can indeed mitigate the effects of intense rainfall events. This is especially true in the case of small-scale, local events. In other cases flood plains are inundated regardless of whether the upper parts of watersheds are forested. Politicians sometimes use deforestation in order to distract attention from their own poor planning in heavily populated flood plains. Sound environmental management involves both protecting forests and ensuring high-quality urban planning with sophisticated risk management in the flood plains. Distracting attention from the complexities of the trade-offs involved by setting up false dichotomies based on questionable analysis of evidence is unhelpful. Floods cost lives. We must try to get this right.
Here are a few quick maps based on the authors’ data.
The authors investigated two statements: (i) flood frequency is correlated with the total forest cover (natural and plantation) and/or (ii) flood frequency is correlated with the total forest cover loss over the period of interest. Only floods caused by heavy or brief torrential rain were included; those caused by typhoons, cyclones, dam breakage and tsunamis were excluded because they represent events that originate independently of landscape characteristics.
It should be clear that total forest cover must be correlated generally with climate. Absolute forest loss is positively correlated with the size of the country, with large countries clearly losing more. Proportional forest loss is negatively correlated with the size of the country, with small countries losing more forest. So the authors attempted to control for these effects statistically by including them in an initial model.
As I read the paper, this analysis looks increasingly odd. Instead of concentrating on “offsetting” effects such as total area, population or rainfall they threw them directly into the statistical model. The use of “random effect” to refer to country level soil moisture is also extremely odd. On initial reading I thought that country id must have been used as a random factor in a GLMM that looked at within country effects, sensu hierarchical mixed effects models as used in social science. I could find no clue that the authors had investigated any random effects as I understand the term.
Using the data set that I made available for download in a previous post on this weblog, I fitted one of the models cited in the paper in R.
Call:
glm(formula = nfloods ~ area + slope + rain + degrad + for2000 +
    forloss, family = gaussian(link = "sqrt"), data = d)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-12.570   -4.091   -1.526    2.090   38.183

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  7.377e-01  6.697e-01   1.102  0.27666
area         6.295e-07  3.268e-07   1.926  0.06054 .
slope        1.182e-01  2.022e-01   0.585  0.56171
rain         9.215e-04  3.056e-04   3.015  0.00425 **
degrad       1.925e-06  7.054e-07   2.729  0.00909 **
for2000     -1.732e-06  5.392e-07  -3.211  0.00247 **
forloss      8.676e-05  5.967e-05   1.454  0.15307

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 74.80764)

    Null deviance: 15817.0 on 50 degrees of freedom
Residual deviance:  3291.4 on 44 degrees of freedom
(5 observations deleted due to missingness)
In other words, at first sight there is no significant effect of deforestation at this scale: the coefficient for forloss has a p-value of 0.15. I am still not sure what the authors’ analysis actually involved. The (often justifiable) use of information criteria to choose between models appears sophisticated, and the lack of any mention of statistical significance testing is sensible. In this particular setting, however, the contemporary feel of the analysis mainly serves to make it more difficult for the uninitiated to penetrate.
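For readers unfamiliar with the approach, here is a minimal sketch of what fitting a Gaussian GLM with a square-root link and then comparing candidate models by an information criterion might look like in R. The data frame below is simulated purely for illustration; it is not the data set used in the paper, the coefficients are invented, and only three of the predictor names are borrowed from the model above. It should not be read as a reconstruction of the authors' procedure.

```r
# Simulated stand-in data. All values are invented for illustration and
# are NOT the data analysed in the paper.
set.seed(1)
n <- 56
d <- data.frame(
  rain    = runif(n),  # hypothetical, rescaled to 0..1
  for2000 = runif(n),  # hypothetical, rescaled to 0..1
  forloss = runif(n)   # hypothetical, rescaled to 0..1
)

# Build the response on the square-root scale, so that a sqrt link is the
# "true" model. forloss is given no effect at all in this simulation.
eta <- 3 + 1.5 * d$rain - 0.8 * d$for2000
d$nfloods <- eta^2 + rnorm(n, sd = 0.5)

# Two candidate models: with and without the forest-loss term.
full    <- glm(nfloods ~ rain + for2000 + forloss,
               family = gaussian(link = "sqrt"), data = d)
reduced <- glm(nfloods ~ rain + for2000,
               family = gaussian(link = "sqrt"), data = d)

# Coefficient table (estimates, standard errors, t and p values).
print(summary(full)$coefficients)

# Information-criterion comparison: lower AIC is preferred.
print(AIC(full, reduced))
```

Both routes ask the same question of the forloss term, which is why a model ranking by AIC and a coefficient table are worth reading side by side.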
As my major issue remains that of aggregation errors (for example, what does a median annual rainfall of 616 mm mean for a country like Mexico?), I am reluctant to go into further detail.
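The aggregation problem can be illustrated with a toy calculation. The regional figures below are entirely hypothetical and describe no real country; they simply show how a single country-level median can sit far from the conditions in any one region.

```r
# Hypothetical annual rainfall (mm) for four regions of an imaginary country.
# These numbers are invented to illustrate aggregation, nothing more.
regional_rain <- c(region_a = 150, region_b = 600,
                   region_c = 1800, region_d = 3000)

# The single value a country-level model would see.
country_median <- median(regional_rain)

# The spread that the single value hides.
spread <- unname(range(regional_rain))

cat("country-level median:", country_median, "mm\n")
cat("regional range:", spread[1], "to", spread[2], "mm\n")
```

A flood in the wettest region and a drought in the driest would both be attributed to the same "median" climate in a country-level analysis.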