Building more informative climate layers for species distribution modelling

The Worldclim data set is a fantastic resource for species distribution modelling. Not only does it include quality controlled monthly temperatures (minimum and maximum) and precipitation data but it also provides 16 derived bioclimatic layers that are designed to be universally applicable and interpretable. These layers are quite simple functions of the original data and can be easily recalculated from the raw monthly variables. I have previously used an additional derived variable for modelling aimed at representing the length of the growing season in the tropics. This was the number of months with more than 100mm rainfall. However I was never very happy with the arbitrary nature of this variable. The growing season really should be based on soil water balance in some way.
Mario Gonzalez, who was trained in agronomy has always insisted that actual evapo-transpiration is a key variable that determines both tropical productivity and biodiversity (Gonzalez et al 2004). I implemented a regional “bucket model” for Chiapas some time ago but I did not use it in publications as I was slightly uncertain as to whether the necessary simplifications could be justified. However experience using machine learning to derive SDMs has convinced me that more mechanistic approaches to deriving variables are absolutely necessary in order to avoid finding “pseudo-niches” with no biological justification from combinations of variables that just happen to coincide with points where species are found.
So I looked again at my implementation of a regional water balance model in order to derive layers for input for SDMs.
A visual comparison seems to produce a very convincing match to regional patterns of plant growth as shown by the NASA Blue marble images. I am now fully convinced that with a little additional work verifying output against Modis derived NDVI its use in modelling can be robustly justified.

Here is an animation of a 12 month bucket model.

The units are (very approximately) soil moisture in mm in a profile that assumes 500mm to be saturated and 200mm to be permanent wilting point (i.e no growth possible).

And a visual comparison can be made with the Modis derived Blue Marble layer (more formal analysis with NDVI to come in a later post).

Side by side (note that September and November Blue Marble are missing from the current Nasa WMS.

The visual match is quite compelling, with the exception that a full (soil moisture) bucket in the temperate winter (Dec to Feb) of the Eastern US does not coincide with high NDVI (blue marble greenness) due to cold temperatures at these latitudes. This could be built into a model as well. However the main purpose of the bucket model taken alone is to provide input for modelling species distribution in Mexico where winters are less pronounced in terms of temperature differences.

The R code for the model is available here.

And the low res resampled WorldClim layer for input to this code from here.

There is still a lot of work needed to justify this approach formally, but it certainly looks very promising. A simplified informal argument goes like this.

A key determinant for many plant species is length of the growing season. In the tropics the growing season is limited mainly by water availability rather than temperature.

A conventional rule is that precipitation in the tropics exceeds evapotranspiration during months with over 100mm of rainfall. We have used this variable as an addition to the bioclim variables in previous published modelling studies. However this variable does not take into account the true water balance. It introduces serious bias at higher latitudes and elevations. It will also produce unrealistically sharp boundaries, as does any variable based on an arbitrary categorisation of what is in fact a continuum. Several of the other Bioclim variables suffer from the same defect.

Evapotranspiration is a complex process that integrates all climatic variables. Empirical measurements of pan evaporation produce an estimate of potential evapotranspiration (PET). This can also be calculated from first principles using the Penman Montieth equation and its derivatives. However, actual evapotranspiration (AET) is often much lower than PET as it is affected by soil water availability. Eco-physiologists and micro-meteorologists can measure (and model), AET quite accurately for relatively restricted areas such as fields and some forest canopies using methods such as eddy flux covariance. However, scaling up to larger spatial extents is challenging.

Precise quantitative measures of AET are impossible to produce without detailed knowledge of soil depth, soil texture, cloudiness, albedo among other factors. However, for species distribution modelling we need to include a layer that accurately depicts relative (rather than absolute) differences in the process over appropriate spatial and temporal scales in a realistic manner. It is the pattern that is vital to model accurately, not than the absolute pixel values.

One possibility would be to use Modis derived NDVI. The NDVI is correlated with productivity, as is AET, and areas with high NDVI have higher transpiraation rates than areas with low NDVI. The problem with this is that NDVI is also affected by land use. So it integrates two processes at once. There is an additional issue in irrigated areas. So while NDVI can, and should be considered in models at some point it is not the ideal input to a climate only based model, although taking the highest values provides a useful baseline for validation.

Empirical measures of PET are available from some climate stations. Unfortunately measures of pan evapotranspiration are notoriously inconsistent. This explains the absence of an evapo-transpiration layer in WorldClim. It would be unreliable if based on empirical data alone.

The Priestley-Taylor equation is often used to estimate PET over large areas. Although the simple form takes temperature and latitude as input, the equation does work with an estimate of net radiation so takes into account the seasonal variation of incoming radiation. It is implemented in the EcoHydRology package in R and can be adapted to run over Wordlclim layers.

## Input data

The input to the model is a 36 band Geotiff containing the Worldclim variables minimum temperatures (months 1:12), maximum temperatures (1:12) and precipitation.


## Calculating latitude in radians

The latitude of each pixel can be obtained from the coordinates. The function takes latitude in radians as input.


## Calculating monthly PET

The following code takes the function in EcoHydRology and applies it to a day at the midpoint of each month (assuming 30 day months for simplicity as there is no need for false precision) .

for(i in 1:12){
  if(i==1){PET<1){PET<<-addLayer(PET, evap)}

## Pattern of PET


## Estimating AET using a simple bucket model

Actual evapotranspiration is always lower than potential evapotranspiration. It can be much lower when the soil profile is well below field capacity. As soil moisture falls to close to the permanent wilting point AET will tend to zero, as no water can be extracted from the soil. The biggest challenge involved in calculating AET for large areas is that soil moisture is unknown. It can never be known accurately as so many factors are involved. However, after a run of days in which rainfall exceeds evapotranspiration soil moisture should rise towards field capacity. As this happens AET approaches PET. When rainfall ceases AET should drop towards zero as the soil profile gradually dries out. This is the basis of watershed models such as SWAT. A simple but effective way to represent this uses a “bucket model”. Once the bucket is full further rain leads to run off, and once the bucket is empty no more water can be lost. Watershed models are concerned with run off, so it is important to simulate pulses of input due to heavy rain. However a regional model can ignore this for the purpose at hand and simply add mean daily rainfall and subtract evapotranspiration to the bucket. AET can then be adjusted by an Alpha factor that mutiplies PET by 1 when the bucket is full and zero when empty.
In order to initialise the model a “burnin” period is needed in order to allow the bucket to find its level before monitoring the contents. The model is run at daily time steps and the results recorded at the mid point of each month in order to produce 12 layers to be added to WOrldclim.


for (n in 1:2){
 for (i in 1:359){
   evap<-values(subset(PET,mn))*alpha*0.8 ##A fudge factor for stomatal control.
       if(n>1 && (i%%30)-15==0){
      if(mn==1){AET<1){AET<<-addLayer(AET, NewAET)}

And that very simple model seems to more or less reproduce a useful pattern that has integrated ALL the Worldclim variables in one simulation plus derived radiation patterns that result from latitude.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s