Landsat paths and rows for Southern Mexico
Priority areas for dry forest restoration in La Frailesca
This is a marker for work in progress. Do not cite.
This morning I was contacted in order to provide geographical information for one of our partners in the Reforlan project. The aim of the research is to find priority areas for forest restoration.
The question of how to go about the analysis is thought provoking. I have already mentioned the problem of aggregation error in this weblog. Pooling data in order to find mean values does not work at the scale in which most of the important processes act. The classic example is mean rainfall for Mexico. At the same time, some elements on the landscape cannot be mapped at a fine scale, and others, such as management units (farms or ejidos) form natural aggregation according to social factors.
So perhaps the first step would be determine priority pixels according to a limited number of physical processes that can be resonably accurately mapped at fine resolution (30 m x 30 m) then aggregate to find the number of pixels in each agronomic unit that could be restored. An analysis of social and economic factors at this scale could then be undertaken in order to discover where restoration might be feasible and bring the most social benefit.
The mappable factors at the 30m pixel scale would be
1. Forest cover and recent deforestation: Pixels that have forest cover presumably do not need restoring. Pixels that have been recently deforested may be prioritised, although natural regneration may be a suitable mechanism in some areas close to the original forest.
2. Slope: Areas with steep slopes must be prioritised for restoration in order to avoid soil erosion. Factors such as slope length may be taken into account.
3. Insolation (aspect): South facing slopes that receive intense insolation during the dry winter months may be in need of restoration, but successful restoration may be more challenging.
4. Distance from rivers: Riparian areas may be in greater need of restoration and trees may establish more successfully in moist soil. At the same time these may be the most valuable areas for irrigated agriculture.
5. Conservation status: Reserves should perhaps take priority when restoration is being considered.
The interesting element of the research involves finding a suitable way of combing the layers in order to allow ranks to be calculated. A typical approach is weighted averaging, but this may not allow interactive effects to be included.
The map below shows the estimated current extent of the forest in the region and recent deforestation hotspots.

Forest cover in La Frailesca: Red and yellow areas are recently deforested: Shaded areas are federal reserves.
Areas with the steepest slopes do tend to coincide with remaining forest cover, as would be expected, but some recent deforestation both intentionally and as a result of forest fires has taken place in the steeply sloping “Sierra” region in the West and South West and South of the study region.
North facing slopes receive less direct sunlight during clear winter days and are more likely to be covered in evergreen forest. Fuel dries out on south facing slopes leading to more intense fires. Regeneration may be difficult due to hydric stress.
Most deforestation has occured close to the major rivers.
It may also be useful to evaluate soil type (not shown) although the maps tend to be at a coarse scale.
Once pixels have been ranked according to aptitude and priority for restoration (the two factors may be inversely related in many cases) they may be aggregated to agricultural management units for further analysis of socio-economic factors.
Animación localidades de Bosque Mesofilo de Montaña en Chiapas
Do we see trees or forests?: Knowledge gaps
It would seem reasonable to assume that research aimed to fill a basic knowledge gap should receive more attention than projects aimed at evaluating complex untestable theory. However in the field of tropical forest ecology it not always clear that fundamental basic knowledge gaps are either being identified or filled quickly enough. An issue that has concerned me for many years is the persistent difficulty in accurately tracing and understanding the spatial distribution of tropical organisms, particularly trees and other plants.
Trees are immobile, long lived, highly visible elements of the tropical landscape. It would therefore seem reasonable to assume that we already know a great deal regarding their distribution. However trees form forests. Forests cause numerous logistical difficulties for field research. Tropical forests can be hostile, disorienting, pathless environments. They also contain a complex set of natural resources. Illegal exploitation of some of these resources such as valuable timber or plants with interesting pharmaceutical properties (of various types) may obstruct scientific research. Many tropical forests have also provided refuge at times for members of radical political movements.
Tropical trees can be extremely difficult to identify to species. Flowers, fruits and leaves may be out of reach. Their taxonomy is complex and dynamic. Tropical families are unfamiliar to researchers trained in temperate zones. Many species have converged in terms of leaf shape and characteristics.
This has led to chronic ignorance of the distribution of tropical tree species. Only a small fraction of the total area covered by tropical forest of any description has been studied in any detail. Regional species lists rely on data from a few well known areas, but within any given region distribution patterns are poorly known. Many assumptions are made concerning the nature of anthropogenic influence on areas of forest. Some are justifiable, some tend to exaggerate human impact on naturally resilient forest ecosystems.
Given the poor state of our knowledge it is vitally important that the few reliable data sources that are available are used to their full capacity. However even the best global data sources can only provide a small part of the picture alone. Missouri Botanical Gardens (MOBOT) has one of the largest collections of tropical plant specimens of any herbarium. MOBOT has provided online access to most of this data. Yet even when the whole set of tropical tree collections with known coordinates from this source are mapped on to space large gaps are revealed in the empirically based knowledge they provide regarding tree diversity.
A PostGIS query can be used together with the graticule trick mentioned in the last post to map the density of collections within 0.1 degree (approximately 100 km2) grid squares in the MOBOT Vast data base. The query is provided as a model for counting incidences within grid squares. The data itself is stored locally and forms part of a larger set that we have collated over a period of several years.
CREATE TABLE meso.count with oids as
SELECT g.ncollects as Ncollects, a.* from
grat01 a,
(SELECT d.gid, count(c.id) as ncollects
FROM meso.trees c, grat01 d
WHERE c.the_geom && d.the_geom
AND contains(d.the_geom, c.the_geom)
AND c.source like ‘MOBOT%’
group by d.gid) g
WHERE a.gid=g.gid;
If it is assumed that 100 collections per 100 km2 is a minimum needed to provide a reasonable picture of the distribution of tree species (there may be more than 400 species in each grid square in some regions) the map below shows that only a fraction of the area of mesoamerica can be considered to be well enough typified by this data set to allow distribution patterns to be mapped through the data alone. This contrasts sharply with the floristic atlases available for most of Western Europe based on known occurrences.
Even when the grid squares with less than 50 collections (far fewer than needed) are added to the map there are many completely unknown areas when this data source is taken alone. Where collections have been taken the density is generally below 5 per 100 km2. Direct empirical evidence regarding tree diversity hot spots is simply not available. Comparisons between Chiapas, Yucatan, Belize and Costa Rica cannot be made from the data alone.
In contrast to our poor knowledge regarding forest composition, our knowledge regarding the extent of both forests and deforestation is becoming ever more accurate and detailed as high resolution satelite imagery becomes available. It appears that we are looking at forests, but not seeing the trees.
“Raster” data in PostGIS?
PostGIS does not support raster data as yet. Hopefully there should be raster support soon. Raster layers can be visualised from a local or remote server using geoserver. However having raster layers integrated into a PostGIS data base would be extremely useful for tasks such as overlaying herbarium records with continuous extrapolated climate data or elevation.
I have previously experimented with importing the points of a raster layer. This works quite well. It can be used for simple spatial overlays using proximity operations. However the visual impression produced by the points is not very satisfactory and a query using the contains operator would be more efficient.
QGIS has a plugin for graticule generation. This forms a grid of individual polygons. So if a grid is produced using the graticule plugin of QGIS and then imported into PostGIS using SPIT it is possible to partially mimic a raster layer, although with some computational overhead.
The document mnp-workshop3.pdf showed how worldclim data (http://www.worldclim.org/) can be imported into GRASS and R.
If R is run from GRASS the following lines can be used to import the appropriate raster layers at a chosen resolution into R.
Most of the lines of the script are concerned with calculations on the basic worldclim data. The dataset consists of 36 grids of monthly mean minimum temperature, maximum temperature and precipitation. Some mean annual values may be derived for simplicity.
a<-c(paste(”tmin”,1:12,sep=”"),paste(”tmax”,1:12,sep=”"),paste(”prec”,1:12,sep=”") ,”alt”)
d<-readRAST6(a)
d@data[,1:24]<-lapply(d@data[,1:24],function(x)x/10)
fullgrid(d)<-FALSE
f<-function(x)mean(x[1:12])
d@data[["meanmim"]]<-apply(d@data,1,f)
f<-function(x)mean(x[13:24])
d@data[["meanmax"]]<-apply(d@data,1,f)
f<-function(x)mean(x[1:24])
d@data[["meantemp"]]<-apply(d@data,1,f)
f<-function(x)sum(x[25:36])
d@data[["totprec"]]<-apply(d@data,1,f)
The trick for importing the layers to PostGIS from R is shown below. If the data is in R as a spatialPixelDataFrame the coordinates can be added to the data to form a table.
dd<-data.frame(coordinates(d),d@data)
This table can then be saved to a PostGIS database using ODBC.
library(RODBC)
con<-odbcConnect(”mydb”)
odbcQuery(con,”SET search_path =chiapas, pg_catalog;”)
sqlSave(con,dd,”worldclim”)
###########################################
Now either in pgadmin or running the queries directly through R by placing them inside an odbcQuery function.
ALTER TABLE chiapas.worldclim add column gid serial NOT NULL;
ALTER TABLE chiapas.worldclim ADD CONSTRAINT worldclim_pkey PRIMARY KEY(gid);
ALTER TABLE chiapas.worldclim ADD COLUMN the_geom geometry;
UPDATE chiapas.worldclim
SET the_geom = PointFromText(’POINT(’ || s1 || ‘ ‘ || s2 || ‘)’,4326);
CREATE INDEX
grat30arc_idx_geo on grat30arc USING GIST(the_geom GIST_GEOMETRY_OPS);
CREATE INDEX
worldclim_idx_geo on chiapas.worldclim USING GIST(the_geom GIST_GEOMETRY_OPS);
If a polygon graticule called grat30arc covering the same region with a 30 arch second grid has been produced using the graticule plugin in Qgis and imported into the public schema then the average values of the points within each grid square can then be calculated for the graticule and a new table produced using this as a model query.
CREATE TABLE chiapas.worldclim2 with oids as
SELECT a.*, g.totprec, g.meanmin, g.meanmax, g.elev from
grat30arc a,
(SELECT AVG(c.totprec) as totprec,
AVG(c.meanmim) as meanmin,
AVG(c.meanmax) as meanmax,
AVG(c.alt) as elev,d.gid
FROM chiapas.worldclim c, grat30arc d
WHERE c.the_geom && d.the_geom
AND contains(d.the_geom, c.the_geom) group by d.gid) g
WHERE a.gid=g.gid;
The result is very similar to the visualisation of an arcgrid in arcview when it is viewed in QGis over a shaded relief map. In this case the pixels are approximately 1 km x 1 km (30 arc seconds or 0.008333333 degrees) and the elevation lines up well with the contours derived from a 1:250,000 map.
Spatial indices on the herbarium table and the worldclim2 layer with the following
CREATE INDEX
herb_idx_geo on herb.herbario USING GIST(the_geom GIST_GEOMETRY_OPS);
CREATE INDEX
worldclim2_idx_geo on chiapas.worldclim2 USING GIST(the_geom GIST_GEOMETRY_OPS);
Then the following query runs in less than four seconds to provide 6,000 herbarium records with extrapolated climate data to the nearest km.
SELECT h.*, w.totprec, w.elev,w.meanmin,w.meanmax
FROM herb.herbario h, chiapas.worldclim2 w
WHERE h.the_geom && w.the_geom
AND contains(w.the_geom, h.the_geom);
Or
select h.*, w.totprec, w.elev,w.meanmin,w.meanmax
from herb.herbario h, chiapas.worldclim2 w
where ST_contains(w.the_geom, h.the_geom);
Visualising deforestation:Marques de Comillas
Our ongoing work with Conservation International has led to the first regional forest cover change map that has been derived using a single consistent methodology. The details of the study have not yet been published. However as we work on validation and interpretation of the results a pattern has become very clear.
Previous deforestation studies in Chiapas have usually concentrated attention on well defined study areas. This could produce the impression that deforestation is a homogeneous process over the whole state. In fact study areas selected for deforestation analysis are usually those with the highest rates when compared to the rest of the region. This is quite natural. Many of the areas where deforestation is no longer occurring have already lost a large proportion of their forest cover. However the overall regional deforestation we have quantified is considerably lower than previous studies imply.
Where deforestation has followed the classic pattern of forest conversion to permanent agriculture or pasture the CI methodology, which is based on Landsat imagery, has provided a remarkably good match with high resolution imagery. However where chronic, low level forest disturbance takes place the overall impact of human activities are much less easily quantified at the resolution of Landsat imagery. The difficulty in accurately evaluating forest cover change increases in areas of dry forest. Nevertheless regional patterns are robust.
The clearest deforestation hotspot in the state of Chiapas remains the Marques de Comillas area in the Southern Lacandon. Deforestation and carbon sequestration in this area has previously been studied in some detail by De Jong et al 2000
The two images below are animated gif files which change when clicked on to enlarge them to full size. The show clearly how Landsat based deforestation analysis coincides in this area with the conclusions drawn from visual analysis of recent high resolution imagery. The visual analysis is produced by overlaying our change analysis in Google Earth using Geoserver, and through the use of QGis.
Deforestation:Overlaying vector and raster layers in PostGIS using GRASS
A very common operation in GIS involves overlaying vector polygons on a raster map and calculating the area of each class in the raster map. The results are then added to the attribute table of the vector map. How can this be achieved in PostGIS?
If the raster layer is quite small it could be imported as points and the result achieved through a point in polygon SQL overlay. However that is not the usual way to achieve the operation. When large raster layers derived from the analysis of satelite imagery are of interest the conventional GIS approach is to first rasterise the vector layer. The computations involved in the overlay are then simple and fast.
This example taken from our ongoing work. It overlays a map of agricultural units on a raster layer representing deforestation in the last decade derived from analysis of satelite data. Because deforestation in Chiapas has tended to take place on a rather small scale involving only a few hectares at a time, it is not at all easy to see the large scale regional pattern from an image. Calculating the proportion of the total area of each agricultural unit deforested helps to see the pattern. The map above shows the percentage of the total (forested + non-forested) area of each agricultural unit estimated to have been deforested between 1990 and 2000.
The simplest way to achieve this result with PostGIS is to import the vector layer into GRASS, where the raster is also held. There are a few issues that have to be addressed when doing this. The first is that the gid index is lost when the attributes are imported. As a unique identifier for each polygon is needed during the overlay an index should be added to the table first if the gid is the only unique identifier for the table. This is very simple.
ALTER TABLE chiapas.agros ADD COLUMN id int8;
UPDATE chiapas.agros SET id=gid;
Now enter GRASS in a location with the same projection as the PostGIS data base (typically EPSG:4326)
Now there is another technical issue. Clearly vector processing in any GIS involves a combination of the geometery and attribute tables. We have both already in PostGIS. However in GRASS we use functions that work on GRASS’s own representation of the geometry, so an import is needed. GRASS can use a wide variety of databases as a back end for storing attribute data. Among these is Postgresql. What is not obvious, and potentially quite confusing, is that we already have the data in a Postgresql data base! So, does it all need to be imported again? The answer seems to be yes and no. It is quite possible to link the GRASS topology with the original table. However it is much easier to import it all in the usual way using v.in.ogr. This produces a new table, that could either stored in an external data base or perhaps even within the original PostGIS data base (as a new copy). Of course this attribute only table lacks the geometry column.
For example to set up GRASS to use the original local PostGIS data base as the backend for vector layers
db.connect driver=pg database=”host=localhost,dbname=gisdb2″
db.login user=Duncan pass=******
Note that this step is not necessary. By default Grass will store the table as in dbf format.
A layer can be imported from PostGIS using
v.in.ogr dsn=”PG:host=localhost dbname=gisdb2 user=Duncan” layer=chiapas.agros output=agros type=boundary,centroid
If the raster layer is in another coordinate system, as it is in this case, it will be necessary to exit the location and reproject in the location of the raster layer.
v.proj input=agros location=Global mapset=mesoanalysis output=agros
Then form a raster layer.
v.to.rast input=agros output=agros use=attr column=id layer=1 value=1
Then in GRASS the overlay is achieved quite simply using r.stats and sending the output to a text file.
r.stats -a input=agros,defor >agros.txt
This forms a file in which the first column holds the id of the vector layer, the second the class in the raster layer and the third the area in square meters of each combination. For example…
1 111 1319094.000000
1 112 159201.000000
1 122 1041304.500000
1 211 4026323.250000
1 222 85285437.750000
1 244 89347.500000
1 422 52796.250000
1 444 1749586.500000
2 111 565326.000000
So the task is now done, apart from the job of adding the data back into the original PostGIS table. There are many ways to do this. Often only one, or a few focal classes are of sufficient interest to merit a new column. So an easy way to process the output and send it back to PostGIS involves R.
Connect to the PostGIS data base using RODBC and set the schema
library(RODBC)
con<-odbcConnect(”mydb”)
odbcQuery(con,”SET search_path =chiapas, pg_catalog;”)
Read in the data and set the first column as numeric, removing any missing data (by default it will probably have been interpreted as a factor as GRASS will use a * for missing data).
d<-read.table(”agros.txt”)
d$V1<-as.numeric(as.character(d$V1))
d<-na.omit(d)
Now a subset of the data can be taken for each of the columns of interest, named and added back to the data base. In this case code “111″ represents area that was classified as forest in 1990,2000 and 2005. “122″ pixels that were deforested in the period between 1990 and 2000.
odbcQuery(con,”ALTER TABLE agros ADD COLUMN forest int4;”)
d2<-subset(d,d$V2==111)
d2<-d2[,-2]
names(d2)<-c(”gid”,”forest”)
d2$forest<-round(d2$forest/10000,0)
sqlUpdate(con,d2,”agros”)
odbcQuery(con,”ALTER TABLE agros ADD COLUMN defor2000 int4;”)
d2<-subset(d,d$V2==122)
d2<-d2[,-2]
names(d2)<-c(”gid”,”defor2000″)
d2$defor90<-round(d2$defor2000/10000,0)
sqlUpdate(con,d2,”agros”)
We are rather concerned that this cover change analysis, that has been achieved with the standard methodology used by Conservation International, tends to quite seriously underestimate small scale disturbance and canopy opening. Nevertheless in comparative terms it does demonstrate that absolute deforestation in Chiapas is certainly not a homogeneous process. Extensive deforestation is not occurring, although if the map is zoomed out further to show the Marques de Comillas area a very clear hotspot of deforestation is shown. In this part of Chiapas (Central Valley, Sierra and Altos) comparatively few agricultural units have deforested more than 2% of their total area in the last decade.
The other way of looking at this would be as the proportion of remaining forest lost. This woud tend to emphasise deforestation rates in ejidos and agricultural units with a very small ammount of forest in 1990. However at this level the approach would be somewhat unfair as it would be biased by classification errors in small areas.
To put the map in context a map with the proportion of the total area classified as forest can be used. In R the column can be added using
odbcQuery(con,”ALTER TABLE agros ADD COLUMN propforest float;”)
odbcQuery(con,”update agros set propforest=100*forest/hectares;”)
A surprising number of the agricultural units have over 40% tree cover. This has to be placed in the context of a classification method that draws no distinction between a pixel within a pasture with sufficient cover of secondary vegetation to be classified as forest and untouched primary forest. The next step clearly involves analysis of fragmentation in order to clarify this.
Telecomunicaciones e informática en Ecosur
No creo que sea apto usar este sitio para hacer comentarios con respecto a características de la institución donde trabajo. Nada mas aprovecho una oportunidad para citar un icono cultural.
“Turning and turning in the widening gyre
The falcon cannot hear the falconer;
Things fall apart; the centre cannot hold;
Mere anarchy is loosed upon the world,
The blood-dimmed tide is loosed, and everywhere
The ceremony of innocence is drowned;
The best lack all conviction, while the worst
Are full of passionate intensity.
Surely some revelation is at hand;
Surely the Second Coming is at hand.”
- W. B. Yeats (‘The Second Coming’)
Nicole in the jungle
Configuration of Geoserver
Mapservers have never taken centre stage in my thinking on how to make Ecosur’s geographical data more widely available. When using a geographical data base the underlying information is almost by definition available for analysis. In contrast data served up by a mapserver is usually only suitable for visualisation. However mapservers have an important role in conjunction with PostGIS. The NASA JPL wms server is a an essential provider of satelite images that provide the backdrop that allows a visual intepretation of spatial information.
At a recent workshop Conabio provided me with a very nice shaded relief map for Mexico derived from a digital elevation model. I have been using it locally with Qgis. However I became concerned that it must also be potentially made available for use with POSTGIS at an institutional level. This clearly needs a mapserver.
I have previously wasted a fair ammount of time failing to configure mapservers correctly. So this morning I looked with some trepidation at geoserver. To my relief within an hour I had geoserver configured and the shaded image available to QGIS through a local host WMS.
The steps are extremely simple:
1. Download geoserver as a zip file
2. Extract the contents.
3. Set an environmental variable to point to your implementation of Java. I have sun java 6 installed using the medibuntu repository.
sudo gedit /etc/environment
Add this line
JAVA_HOME=”/usr/lib/jvm/java-6-sun”
4. Reboot and run mapserver using a command that goes something like
/home/duncan/geoserver/bin/startup.sh
5. Point the browser at
http://localhost:8080/geoserver/
Then if everything is working simply follow all the online instructions in order to add layers.
With geoserver started there is a new wms server that QGis can connect to.
It is also easy to include data from geoserver in Google Earth. Choose <add> <image overlay> from the menu. Then the “refresh tab”. Click on WMS parameters and add the geoserver details.
You then have the layer in Google Earth
More videos of Mickey and Nicole.
Boinging on the trampoline again. Nicole is getting better all the time and Mickey has been an expert for months.
And, yet another welly walk with Nicole.
Landscape of the Highlands of Chiapas
While backing up some old information on my hard drive I rediscovered a document I produced for Pronatura at the end of November 2003. It lies somewhere between the status of informal field notes and a formal technical report. I presented it to Pronatura a week after a field trip in a light plane. The idea of the document was to record the discussion during the flight.
This was very much at the end of the “pre-google earth” era. Spatial analysis was still rather more of a specialised activity than it is today and I had only begun to become interested in the subject myself. My previous research was generally non-spatial.
Now a “fly over” of the region can be achieved rather more comfortably and safely with a laptop than a four seater plane. The experience was stomach churning at times with a few moments of sheer terror as the turbulence caused by thermals rising from the central depression hit.
Despite the fact that some of the analytical methods used were hurriedly applied and could certainly be refined, I still generally agree with most of the conclusions in the document.So I have placed it here in case it could be useful as a general introduction to the region. It is not directly citable, but similar statements to those included in the document have been made in peer reviewed work I have published together with Luis Cayuela, although there have been very slight differences in emphasis between my interpretation of patterns and causes of deforestation in the region and those adopted by Luis Cayuela and some other colleagues.
Here is the document in PDF form ….
Postgresql 8.3 and Qgis 0.10.0 Io
Ecosur will soon be providing students and researchers with a networked data base for storing and visualising public domain spatial layers. This database should also store results from ongoing research. I am convinced that the advantages of providing a fully functional geographical data base at zero cost by using PostGIS/Postgresql are compelling. The release of a stable Postgresql 8.3 and Qgis 0.10 have made the task even simpler.
Postgresql 8.3 is now notably faster when running spatial queries than the version I had been using (8.1). It also has very advanced features. These move PostgreSQL up to the power of the premium proprietary enterprise data base, Oracle. They clearly go beyond those found in MySQL.
The only practical challenge I faced with PostgreSQL 8.1 involved a very simple issue of scalability when the database was viewed with Qgis 0.9. Despite some issues with the stability of Qgis (that did worry me, but appear to have been largely resolved) Qgis is a suitable visualisation tool for PostGIS. I have used it extensively since I started experimenting with PostGIS a few months ago and am generally satisfied. However a simple practical issue concerned me.
How could users easily be given access to the parts of a potentially very large database in such a way that they see only the part they need or are allowed to see?
The answer had clearly to be through the use of Postgresql schemas, but the Qgis interface didn’t show them in a user friendly way that matched the need.
This problem is now largely resolved by the new version of Qgis. Schemas now show up in an unfoldable tree structure. As groups of users and individual users can be given access to a subset of schemas it should now be feasible to design a database that hides all the complexity from users and allows them to quickly find the layers they need.
Videos y DVDs con Ubuntu
Algunos estudiantes llevaron discos de Ubuntu Heron despues de mi presentación. Si han probado instalarlo estoy seguro que van a querer usarlo para ver videos. Para conseguir Flash para You Tube la mejor forma es abrir synaptics (sistema-administración-synaptics) y buscar “flashplugin-nonfree”. Luego los codecs se instalan cuando se necesitan.
Para tener Medibuntu (el repositorio para algunos codecs comerciales) la forma mas rapida es con comandos. Abre un terminal y pega este para abrir el archivo de los repositorios.
sudo gedit /etc/apt/sources.list
Luego pega este en el documento al fondo y guardarlo.
deb http://packages.medibuntu.org/ hardy free non-free
Para añadir la clave para activarlo pega este en un terminal.
wget -q http://packages.medibuntu.org/medibuntu-key.gpg -O- | sudo apt-key add - && sudo apt-get update
sudo apt-get update
Y luego nada mas añade los libs.
sudo apt-get install libdvdcss2 libdvdread3
No te asustas con el uso del terminal. Se puede conseguir lo mismo con el GUI, pero es mas facil comunicar la información como comandos.
Upgrade to Ubuntu Heron
I am impressed by Ubuntu Heron. The most important difference is ease of installation. Both Feisty and Gutsy were simple enough to get running, given a reasonable degree of patience. However there were always a few problems to fix. These involved screen resolution sound and video. Getting everything right needed a search which produced multiple answers on the users forums.
A single issue that needs to be resolved by editing a config file is more than enough to get most Windows users, acustomed to preinstalled operating systems, to give up and go strait back to what they know. Thus despite the clear merit of the underlying operating system, uptake remained much slower than it could have been among academic colleagues here.
In contrast my experience installing Heron on a Toshiba Tecra was effortless. It all worked out of the box. What was more impressive was the short amount of time it took to get everything, including a large number of R packages, GRASS, QGis, Google Earth, MySQL, Apache, Postgresql etc all back together. I already knew what I wanted and where to get it, but even so the short time line is worth stressing.
3:15 Open Laptop and remove old hard disk (It was time to go from 80GB to 250GB and this is a great way to make the change without any fear of losing time)
3:20 New hard disk in place.
3: 25 Heron install CD is asking me for details of location, keyboard and partitioning.
3:41 Heron installed and running.
3:45 Reboot in order to use NVidia, this was recognised as necessary after activating advanced visual effects. No manual configuration needed at all.
5:03 R, GRASS, QGis, Google Earth, Flash plugin + codecs, Apache MySQL, Postgresql all installed.
So I will never worry about a hard disk crash again. Providing I have my personally generated data safe somewhere, all the rest is easy. Perhaps it is not so simple on all laptops, but it is just so encouraging to see that Ubuntu is getting within a hair of the grandma friendly Linux.
I occasionally do have need of Windows XP. So this morning I moved the .virtualbox directory over from my old home directory to the new one, after a quick install of virtual box and adding my usename to the virtualbox group. It worked exactly as it had before. The advantage of virtualising windows is so glaringly obvious I am constantly surprised by my colleagues reluctance to protect themselves against virus attack that way. It is understandable to want to continue using what you are used to (Windows XP). However it is very strange not to want to make sure that you can get back to work quickly after a disaster. Virtualising XP gives the confidence that you can be back up and running in less than an hour. As desktop PCs are more or less generic virtual machines can be backed up and moved freely between them even if an IT department hasn’t done what it should and provided a sensible networked option of virtual machines for Windows XP users.
It is worth being aware that Heron comes with Firefox 3 beta preinstalled, which is fine, as it will soon be the official version. However in the meantime you can install Firefox 2 in order to use plugins that don’t yet work with the new version. Be aware that you must close down Firefox 3 in order to run firefox-2 (type firefox-2 on a command line or set up a launcher).
I also noticed that the Heron repositories has up to date versions of other programs that I didn’t realise had progressed. The most important development for me at the moment was that Postgresql8.3 is now the stable version and that QGis is 0.10.0 instead of 0.9.1. Despite still not getting to version 1.0, QGis does look as if it has improved greatly in terms of both stability and appearance. This is encouraging after my recent presentations. QGis 0.9 did have a few uncomfortable issues that I didn’t mention. Also Open Office is better. The most notable improvement is when you try to change the language setting. Anyone who knows the frustrating illogicality of being presented with “thesaurus” when you want to check the spelling of a Spanish document will know what I mean. The options are now in much more sensible places.
Installing Heron
Despite recent posts, I admit to being a long term Windows user myself. I therefore share many of the characteristics of Windows users. The most important of these is that I am naturally extremely conservative and very reluctant to change a laptop when I have things working the way I am used to. I have been using Ubuntu Feisty since late July 2007 without ever bothering to upgrade to Gutsy (although I installed it for my son).
However I decided that the release of Heron really deserved my attention. I was becoming increasingly frustrated with the limited disk space on my duel boot Toshiba Tecra with its original 80 GB HD (40 GB still in an XP partition that I now never use). So today I treated myself to a 250GB external Simpletech ATA hardrive. On getting it home I invalidated the warranty by opening the case, taking out the drive and swopping it for my 80GB ATA drive. I mention the fact that it is ATA as older (my Toshiba was bought in October 2006) or cheaper laptops are likely to have IDE drives. Always check first and be aware that this simple trick will not work if your machine uses IDE. But if you need a new hard drive, and can afford one, simply switching the drives saves all worries about upgrading. If it goes wrong just put the old drive back in.
In went the Heron CD (I decided not to bother with any native windows install this time) and 15 minutes later all the basics were installed without any tweaking at all. I turned on the advanced visual effects. Ubuntu realised I needed an Nvidia driver and installed it. After rebooting I then had all the wobbly windows and spinning cubes if might ever want, if I ever feel like using them. A completely painless, risk free upgrade in no time at all. Now all I need to do is copy the parts of my home director I want to keep back over from the external drive and it is all back where it was, with extras. Two hours maximum for the whole process, including putting back R, GRASS, Qgis MYSQL etc. PostGis is just a little bit more difficult in my case due to the fact that I previously used Postgreql 8.1 and it is time for 8.2. But that detail is fairly tecnical. Contrast the basic experience with a complete re- install of Vista. It would be a nightmare that could last the best part of a week.
Last night I read this very useful and sympathetic account of the install experience from a novices’ perspective http://www.bspcn.com/2008/04/28/the-great-ubuntu-girlfriend-experiment/
As I had handed out several disks of the Heron after the Ecosur seminar I thought I should be aware of these sort of problems. I am quite likely to have a few students coming into my office in the next few days asking for help. The girlfriend gotchas really are all very minor. My experience with the new Heron install was simple to the point of being disappointingly trivial. I missed all the thrills of fixing screen resolution and sound. The only important detail worth mentioning is to make sure that the flash plugin is installed from symantics before going to watch a you tube video. There are then no problems. You simply get a polite message asking for permission to install codecs and there is no need to try to install a gzipped file from adobe. Done.
Preguntas sobre PostGIS y la presentación de ayer.
1. ¿Necesito Linux para usarlo?
No. El modelo es lo que se llama “cliente-servidor”. Es decir que PostGIS es el servidor. Proporciona los datos, pero la visualización de los datos o el procesamiento de las consultas es al nivel del cliente. Tu puedes escoger el cliente que mas te gusta. En este momento hay varios programas clientes de acceso abierto (gratis) que pueden leer, procesar y visualizar información de un servidor de PostGIS. Lo mas fácil de usar es probablemente QGIS, aunque Udig también tiene buenas características. Open Jump es otra alternativa. Todos estos programas se puede instalar en Windows. Los clientes están en un proceso de desarrollo todavía y van a incrementar su utilidad con el tiempo. PostGIS ya tiene toda la funcionalidad de servir datos a clientes con mas capacidades.
Al mismo tiempo programas como Excel y Access pueden vincularse con el servidor para procesar datos no espaciales.
2. ¿Es difícil instalar Post GIS? Necesitamos apoyo técnico externo para hacerlo.
No. Se instala PostGIS con un servidor de Linux en alrededor de una media hora. Se requiere un poco de planificación y esfuerzo adicional para la configuración óptima de las opciones de seguridad. Pero es importante recordar que los usuarios de la información no van a instalar PostGIS. Ellos pueden bajar un archivo qgis.exe del internet y instalarlo bajo Windows en minutos. Si usan Linux también van a instalar Qgis, o Udig, no PostGIS. El uso de Qgis es sencillo e intuitivo. Se puede dar capacitación en su uso practico en el curso de una mañana.
3. Hemos intentado bajar capas de un mapserver, pero parece complicado.
La relación entre PostGIS y un mapserver es potencialmente un elemento potente, pero no se necesita correr un mapserver para usar PostGIS. Un mapserver recibe capas de PostGIS, Asi que es otra capa entre el usuario y servidor. La configuración de un mapserver para correr con PostGIS no es muy difícil. Hay dos mapservers de acceso abierto (mapserver y geoserver) que pueden usarse. Sin embargo un mapserver es realmente una forma de dar información al publico. El primer paso hacia el uso de un mapserver sería asegurar la velocidad de la conectividad al Internet de Ecosur.
4.Estoy acostumbrado de usar ArcView, tengo licencia para usarlo y muchos shapefiles disponibles. ¿Por qué me interesa esta iniciativa?
La ventaja de una base de datos espacial es que toda la información esta guardada en el mismo lugar, bajo el mismo sistema de coordenadas y actualizado. Las capas pueden comunicarse entre ellas. Si ves por ejemplo que las coordenadas del censo nacional no coinciden con los del conteo de población sería posible tratar de resolver el problema. Una vez resuelto todos los usuarios de la base no van a repetir los errores. Usuarios de ArcView pueden bajar la información como shapefiles con un clic y trabajarla al nivel local de sus propios Pcs. Es muy importante enfatizar que la iniciativa no fuerza usuarios de cambiar su forma de trabajar. Cada persona desarrolla sus propias preferencias por software al nivel de sus propia “desktop” y hay que respetar sus decisiones. Pero al nivel del servidor hay que asegurar que todos los datos están mantenidos bajo estándares rigurosos, uniformes y convencionales.
5.¿Cuando podemos contar con un servicio así?
Un prototipo ya esta funcional y ya puede ser muy útil. Se puede proporcionar una herramienta de uso interno en el espacio de una semana. Pero la verdadera potencia de un sistema así viene con mas esfuerzo. Se puede programar portales Web atractivos para el publico. Se puede organizar sistemas amigables para que estudiantes entren sus datos en el sistema sin tener que saber detalles sobre el uso de SQL. El control de calidad y mejoramiento de la información se consigue a través de esfuerzos de revisión meticulosos y programación con datos. Para contar con un sistema del cual podemos estar orgullosos como institución debemos emplear por lo menos una persona con capacidad en programación de bases de datos tiempo completo.
Sería vital la coordinación entre investigadores que producen nuevos datos de campo, las colecciones, LAIGE, informática y la biblioteca.
6. ¿Cuales son la ventajas de Linux para mi como individuo?
Mis propias razones estan expresados aqui.
Presentación de PostGIS en Ecosur
Abajo he incluido las diapositivas de una presentación que voy a dar hoy en Ecosur sobre la potencia de PostGIS como un base de datos para la investigación institucional en Ecosur. Incluyo una introducción sobre la filosofía del movimiento de aceso libre y la potencia del nuevo modelo “profesionalizado” del software de aceso libre.
Los ejemplos del uso de POSTGIS se ven en los videos de You Tube abajo.
Desafortunadamente la calidad del imagen no es muy alto. De todos modos dado la velocidad de la conexión institucional no creo que se pueden ver dentro del institución.
Hurricanes in Chiapas
I have just added the historical hurricane paths to my PostGIS data base from NOAA http://maps.csc.noaa.gov/hurricanes/
The data base has storm tracks from 1851 to 2006. The above maps shows only those since 1990. This is a very valuable resource that is well worth including in the regional data base and looking at in more detail. The interesting element as far as the state of Chiapas is concerned is that very few hurricanes actually track over the state. This is clearer when the whole (150 year) historical data set is included as shown in the figure below. However the moist air masses associated with hurricanes can cause substantial rainfall in the state even when the centre of the hurricane is some distance to the north or east.
When the map is zoomed out further a very clear cut off point emerges, forming an almost horizontal line at around 8 degrees North. This is a feature associated with the Intertropical Convergence Zone (ITCZ),
This rather sharp transition also explains some of the strange patterns in the maps of global circulation model simulations which also show a pronounced band around the equatorial ITCZ. The cut off may alter in extent over the next century as the effects of global warming are felt. If it expands slightly northwards, then rainfall associated with hurricanes and tropical storms may also move, leading to a general decline in rainfall in southern Mexico. This could occur even if hurricane intensity and frequency in the Atlantic basin follows an overall increasing trend.
An example of using PostGIS from R
One of the interesting elements of PostGIS is that it can be run from R using RODBC. I have found this to be very useful, especially when tables are being altered, as it allows the properties of the results of a query to be quickly analysed and checked for sanity before being added to the data base. So in order to work with POSTGIS I usually open QGIS, PgAdmin and an R console and switch between them.
Here is an example of how you might calculate the population density within polygons and add the results to a PostGIS table.
In R first connect to a data base assuming an ODBC connector has been set up
library(RODBC)
con<-odbcConnect(”mydb”)
Now the example will find the sum of the population within polygons representing watersheds. This layer was originallybuilt using r.watershed in GRASS followed by r.to.vect. The automated watershed limitation doesn’t work very well along the coast, but provides a useful set if polygons for the rest of Chiapas, particularly the central valley. First add a new column to hold the result.
odbcQuery(con,”ALTER TABLE chiapas_basins ADD COLUMN poptot int4;”)
Then the query that calculates the total population can be run from R. The result can by looked at first in R to check that it is sensible. This is useful, as I found differences between using the 2000 census data and the 2005 population count that are not attributable to population change or migration, but rather are caused by errors in the geopositioning of the data.
sql<-”select sum(m.z1) as poptot, b.id from chiapas_conteo_2005 m, chiapas_basins b where m.the_geom && b.the_geom AND contains(b.the_geom, m.the_geom) group by b.id;”
d<-sqlQuery(con,sql)
This takes a while to run. I quickly discovered that it is extremely important to make sure that a GIST index is added before attempting point in polygon operations. This can be added to each table in PgAdmin, for example.
CREATE INDEX chiapas_basins_idx_geo on chiapas_basins USING GIST(the_geom GIST_GEOMETRY_OPS);
Now histograms and totals of the population in the watersheds can be produced in R to look at the results before adding them back into the data base.
For example
hist(log(d$poptot))
The population density is typically log normal.
sum(d$poptot)
4255957
This gives Chiapas a total population of 4.25 million which looks about right. As the query returned the results with a column name that has already been set up and has the primary key this line will update the database.
sqlUpdate(con,d,”chiapas_basins”)
Now we need to calculate the area of the watersheds in km2. To do this I first had to insert an srid for Albers conic projection for North America. This can be found here
http://www.spatialreference.org/ref/epsg/102008/
Then some more queries that can either be run in pgAdmin or wrapped up by odbcQuery(con, “some sql statements “) and run from R directly.
odbcQuery(con,”ALTER TABLE chiapas_basins ADD COLUMN areakm2 float4;”)
odbcQuery(con,”UPDATE chiapas_basins SET areakm2 = area(transform(the_geom,9102008))/1000000; “)
odbcQuery(con,”ALTER TABLE chiapas_basins ADD COLUMN density float4;”)
odbcQuery(con,”UPDATE chiapas_basins SET density = poptot/areakm2; “)
The results can then be visualised in QGIS. Whether this is easier than using ARCGIS certainly depends on what you are used to. I am still working out how to achieve fairly basic operations like this in PostGIS. However when I have achieved them once, it is then easy to repeat the operation for other examples, which is why I am posting fairly simple examples like this to the weblog for reference in case they can help others to overcome the intially steep learning curve with PostGIS. Although the geoprocessing wizard in ArcView is very simple to use and the point in polygon operation does seem to run much faster than the corresponding query in PostGIS, this particular operation would still require some additional work with the database tables if it were to be done using ArcView alone.
Importing maps from BERDS: Reprojection using ogr2ogr and shp2pgsql
I am constantly impressed by the BERDS database for Belize. It is powered by POSTGIS, and also offers downloads of some coverages as shapefiles in UTM. I have just downloaded and imported Jan Meerman’s fire risk map into my POSTGIS system. I did it in two steps that I have found to be generally more robust than trying to use ogr2ogr. directly. First reproject to a new shapefile then import the result using shp2pgsql. I am not sure why this process results in fewer errors, but it does seem to work with some difficult imports I have been attempting (although this particular example is very clean).
ogr2ogr -f “ESRI Shapefile” belizefire.shp -t_srs EPSG:4326 -s_srs EPSG:2028 firerisk.shp
shp2pgsql -s 4326 -c belizefire belizefire mydb | psql -d mydb
Deforestation and floods 4: PostGIS raster layers?
One element struck me when looking at the original data compiled by Dartmouth college on flood events. The data base contained a table headed “cause”. This was full of text such as “rain”, “heavy rain”, “torrential rain”. The orange points on the map above are the centres of recorded flood events in thals three years. Note VillaHermosa in Tabasco. The coloured layer is shaded by rainfall intensity (five bands ranging from below 600mm to 2500 mm +).
So, the headline would be that “scientists conclude that flooding is caused by ….. rain!”
This is not in fact as trivial as it seems. Devastating floods can occur in regions where heavy rain is uncommon, but there must be a clear correlation between flood frequency and the sheer amount of annual rainfall a region experiences. I am constantly surprised by how quickly the vegetation in Mexico changes from dry forest to exuberant tropical forest. The transition zone can be a matter of kilometers on the road north from Chiapas to Tabasco. The flooding in Villa Hermosa in 2007 was mainly the result of rainfall falling on the state of Tabasco itself (and a small part of the extreme north of Chiapas). During the dramatic events a very light drizzle was falling in San Cristóbal. The lowlands of Tabasco are almost completely deforested, but the mountains of northern Chiapas are covered in a mixture of secondary forest and cattle pastures. Rains in late October fall on already saturated soil profiles. Vegetation cover could only have played a marginal role in reducing the amount of water flowing out of the highland watersheds in this particular case.
This provided me with an excuse to experiment with ways to move raster layers into POSTGIS. The POSTGIS data base structure is clearly not designed for raster layers, but some quite good results can be obtained simply by moving a fairly coarse raster layer, such as an interpolated rainfall coverage, into POSTGIS as points. This is similar to a SpatialPixelsDataFrame in R. In fact the easy way to set up the coverage is to move the data from a sp object in R to POSTGIS by converting the coordinates into a geometry. When viewed at some scales an interference pattern results, but the coverage can be very useful because clicking the information tool on any point on the map results in information on the pattern of rainfall over twelve months, although the visualzation can use the total annual rainfall.
The first image used Udig. The lower image uses Qgis. Again although I have not put a clear legend in this very quick informal treatment, the rainfall ranges from 800mm per year (bright red)to 4000 mm (dark blue)
La niña and early rainfall
One of my first posts to this weblog mentioned the statistical correlations between low mid pacific sea surface temperatures and early rainfall in Chiapas.
http://duncanjg.wordpress.com/2008/02/06/checking-on-the-el-nino-effect-using-r/
Heavy rains often do not begin in the state until mid May. However this weekend we have experienced a major rain storm (12 April). Persistent rain fell throughout the night and early morning (14 April). Rerunning the code shows that la Niña effect is still strong.
Of course as any linkages are statistical in nature I do not expect my temporary success at long term weather forecasting to continue.
Las Chimalapas: A reserve waiting to be declared?
Watch the animation below carefully (Click on the map to open the animated GIF). First a blue marble image is presented. Then a map of settlements is overlain (red points) Then the current protected areas of southern Mexico are added (blue shading). Finally the registered plant collections are added.
So, what is that large hole in the centre of the map doing? Does no-one live there? Hasn’t it been declared a reserve? Have any botanical collections been made there?
This is the Chimalapas region, in the isthmus of eastern Oaxaca bordering on Chiapas. The vegetation ranges from lowland rainforest in the Atlantic north to tropical dry forests in the Pacific south Cloud forest and montane pine-oak forest lie in between. Rainfall ranges from 1100 mm to over 3500 mm. The whole area totals more than 600,000 ha. It is still in a good state of preservation. The avifauna is apparently the most diverse for any region of comparable size in the country, totalling at least 464 species in the region as a whole (with more than 300 species in the lowland rainforest) representing 44% of the bird species known from Mexico. Important species present in the region include Harpy Eagle Harpia harpyja and several other large eagles. Yet it has not received protected status and is seldom visited by plant collectors.
Giving protected status to the region would seem on the face of it to to be an automatic “no brainer”. There would be few losers. The resources of this uninhabited region are presently not being exploited. Pressures on the region are ever present, but not intense. The publicity associated with protected status might encourage incipient ecoturism in the area. Various conservation organizations such as the TNC and Conservation International are interested in the region and my group in Ecosur are currently looking at deforestation in the region in detail in order to strengthen the arguments.
However, is the protected area model really the best option for the area? I leave the question open in the hope of receiving some comments.
There are a few points that might be discussed.
1. Are governmental resources available in order to ensure that protected status would indeed prevent activities that negatively impact the biodiversity of las Chimalapas?
2. Could protected status have unexpected consequences, such as encouraging unsustainable activities in the the area?
3. Would protected status for las Chimalapas divert resources from initiatives that could be effectively combating more immediate threats?
4. Would protected status interfere with wider long term economic development of the region, specifically in terms of trans isthmus transportation?
It is quite astonishing to notice that the “hole” in question even stands out at a national level when desert regions are included. The only other large areas without inhabitants in Southern Mexico are around Calakmul.
Floods and forest loss 3
Unfortunately time is short. I have found it more difficult than I expected to make a handful of specific criticisms of the statistics in Bradshaw et al.
I originally aimed to concentrate attention on a few narrow methodological issues when the paper was brought to my attention. I now feel that statistical weaknesses are indicative of a much deeper problem with papers of this sort.
Environmental science should guide environmental policy. Environmental politics should not exercise direct influence on the way scientific evidence is evaluated. If this is allowed to occur there is a constant risk of the underlying science becoming devalued. This has to be resisted. Scientific publishing concerns itself with rigor and objectivity. Scientists do have an obligation to make decision makers more aware of issues such as the wider positive value of forests. There are now many forums available that can allow us to achieve these goals. As individuals we can find always find, develop and strengthen, outlets that permit us to express subjective opinions freely and explicitly. A weblog such as this may even be an example.
However, the traditional scientific peer review process should not be seconded to an environmental cause. It must have a different role. The strict objectivity of peer review gives us the credentials and the confirmed evidence that we can use in order to make a difference when we express our opinions as scientists.
In this particular case the authors’ openly state that they deliberately set out to support decisions that strengthen forest conservation. They write that “for conservation to receive wide political and popular attention and priority, especially in the developing world, there needs to be empirical evidence of nature’s role in supporting human well-being. … For centuries it has been vehemently claimed, and hotly disputed, that forests provide natural protection from floods … Yet, because the claim lacks broad-scale empirical support, the development and implementation of clear flood-mitigation policies regularly stall (FAO & CIFOR, 2005; Calder & Aylward, 2006).” The authors stated intention was thus to correct this deficit. I find this quite uncomfortable reading. Scientific evidence doesn’t fall out of a data set as a result of political necessity to provide it.
Why does the claim of a link between deforestation and flooding “lack empirical support”? I don’t believe that it does. There is ample evidence on which to base a consensus. Flooding, and more specifically, the loss of life and property associated with flooding, is the result of a broad set of factors that combine in complex, case specific ways. These combinations (that could be called the “cause” of flooding) are often quite well understood at the local level when any specific event is investigated. However the “causes” of floods in general can rarely be analyzed successfully by looking for global scale correlations. The number of interactions involved prevents purely statistical analyses from incorporating enough of our current understanding to cast new light on the issue. Contemporary, process based, hydrological models now capture the dynamics of flooding remarkably well. Water does move in predictable ways, even if higher powers, including politicians, do not. There are relatively few deep scientific mysteries to be resolved.
The intrinsic difficulty is still in effectively communicating what is already known to decision makers. Flooding is largely predictable (at least in a probabilistic sense). Nevertheless, poor political decisions are constantly made that ignore both science and common sense. They will continue to be made if decision makers can claim that a scientific consensus has not been reached and then use the resulting confusion as a justification to act on non-scientific agendas.
In some specific circumstances forest cover can indeed mitigate the effects of intense rainfall events. This is especially true in the case of small scale, local events. In other cases flood plains are inundated regardless of whether the upper parts of watersheds are forested. Politicians sometimes use deforestation in order to distract attention from their own poor planning in heavily populated flood plains. Sound environmental management involves both protecting forests and ensuring high quality urban planning with sophisticated risk management in the flood plains. Distracting attention from the complexities of the trade offs involved by setting up false dichotomies based on questionable analysis of evidence is unhelpful. Floods cost lives. We must try to get this right.
Here are a few quick maps based on the authors, data.
- Proportion of deforestation in South America
- Proportion of deforestation in Sub Saharan Africa
- Proportion of deforestation in South Asia
- Legend for absolute deforestation (km2)
- Absolute deforestation (global)
- Legend for relative deforestation (% of 2000 cover lost since 1990)
The authors investigated two statements. (i) flood frequency is correlated with the total forest cover (natural and plantation) and/or (ii) flood frequency is correlated with the total forest cover loss over the period of interest. Only floods caused by heavy or brief torrential rain were included; those caused by typhoons, cyclones, dam breakage and tsunamis were excluded because they represent events that originate independently of landscape characteristics.
It should be clear that total forest cover must be correlated generally with climate. Absolute forest loss is positively correlated with the size of the country with large countries clearly losing more. Proportional forest loss is negatively correlated with the size of the country, with small countries losing more forest. So the authors’ attempted to hold for these effects statistically by including them in an initial model.
As I read the paper, this analysis looks increasingly odd. Instead of concentrating on “offsetting” effects such as total area, population or rainfall they threw them directly into the statistical model. The use of “random effect” to refer to country level soil moisture is also extremely odd. On initial reading I thought that country id must have been used as a random factor in a GLMM that looked at within country effects, sensu hierarchical mixed effects models as used in social science. I could find no clue that the authors had investigated any random effects as I understand the term.
Using the data set that I had previously made available for download in a previous post on this weblog I fit one of the models cited in the paper using R.
mod summary(mod)
Call:
glm(formula = nfloods ~ area + slope + rain + degrad + for2000 +
forloss, family = gaussian(link = “sqrt”), data = d)
Deviance Residuals:
Min 1Q Median 3Q Max
-12.570 -4.091 -1.526 2.090 38.183
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 7.377e-01 6.697e-01 1.102 0.27666
area 6.295e-07 3.268e-07 1.926 0.06054 .
slope 1.182e-01 2.022e-01 0.585 0.56171
rain 9.215e-04 3.056e-04 3.015 0.00425 **
degrad 1.925e-06 7.054e-07 2.729 0.00909 **
for2000 -1.732e-06 5.392e-07 -3.211 0.00247 **
forloss 8.676e-05 5.967e-05 1.454 0.15307
—
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for gaussian family taken to be 74.80764)
Null deviance: 15817.0 on 50 degrees of freedom
Residual deviance: 3291.4 on 44 degrees of freedom
(5 observations deleted due to missingness)
AIC: 373.26
In other words, at first sight no significant effect of deforestation at this scale. I am still not sure what the authors’ analysis actually involved but although the (often justifiable) use of information criteria to mediate between models appears sophisticated, and the lack of mention of statistical significance testing is sensible, in this particular setting the contemporary feel to the analysis mainly serves to make it more difficult to penetrate for the uninitiated.
As my major issue remains that of aggregation errors (for example what does median annual rainfall of 616 mm mean for a countrty like Mexico?) I am reluctant to go into further detail.









































