A very common operation in GIS involves overlaying vector polygons on a raster map and calculating the area of each class in the raster map. The results are then added to the attribute table of the vector map. How can this be achieved in PostGIS?
If the raster layer is quite small it could be imported as points and the result achieved through a point in polygon SQL overlay. However that is not the usual way to achieve the operation. When large raster layers derived from the analysis of satelite imagery are of interest the conventional GIS approach is to first rasterise the vector layer. The computations involved in the overlay are then simple and fast.
This example taken from our ongoing work. It overlays a map of agricultural units on a raster layer representing deforestation in the last decade derived from analysis of satelite data. Because deforestation in Chiapas has tended to take place on a rather small scale involving only a few hectares at a time, it is not at all easy to see the large scale regional pattern from an image. Calculating the proportion of the total area of each agricultural unit deforested helps to see the pattern. The map above shows the percentage of the total (forested + non-forested) area of each agricultural unit estimated to have been deforested between 1990 and 2000.
The simplest way to achieve this result with PostGIS is to import the vector layer into GRASS, where the raster is also held. There are a few issues that have to be addressed when doing this. The first is that the gid index is lost when the attributes are imported. As a unique identifier for each polygon is needed during the overlay an index should be added to the table first if the gid is the only unique identifier for the table. This is very simple.
ALTER TABLE chiapas.agros ADD COLUMN id int8;
UPDATE chiapas.agros SET id=gid;
Now enter GRASS in a location with the same projection as the PostGIS data base (typically EPSG:4326)
Now there is another technical issue. Clearly vector processing in any GIS involves a combination of the geometery and attribute tables. We have both already in PostGIS. However in GRASS we use functions that work on GRASS’s own representation of the geometry, so an import is needed. GRASS can use a wide variety of databases as a back end for storing attribute data. Among these is Postgresql. What is not obvious, and potentially quite confusing, is that we already have the data in a Postgresql data base! So, does it all need to be imported again? The answer seems to be yes and no. It is quite possible to link the GRASS topology with the original table. However it is much easier to import it all in the usual way using v.in.ogr. This produces a new table, that could either stored in an external data base or perhaps even within the original PostGIS data base (as a new copy). Of course this attribute only table lacks the geometry column.
For example to set up GRASS to use the original local PostGIS data base as the backend for vector layers
db.connect driver=pg database=”host=localhost,dbname=gisdb2″
db.login user=Duncan pass=******
Note that this step is not necessary. By default Grass will store the table as in dbf format.
A layer can be imported from PostGIS using
v.in.ogr dsn=”PG:host=localhost dbname=gisdb2 user=Duncan” layer=chiapas.agros output=agros type=boundary,centroid
If the raster layer is in another coordinate system, as it is in this case, it will be necessary to exit the location and reproject in the location of the raster layer.
v.proj input=agros location=Global mapset=mesoanalysis output=agros
Then form a raster layer.
v.to.rast input=agros output=agros use=attr column=id layer=1 value=1
Then in GRASS the overlay is achieved quite simply using r.stats and sending the output to a text file.
r.stats -a input=agros,defor >agros.txt
This forms a file in which the first column holds the id of the vector layer, the second the class in the raster layer and the third the area in square meters of each combination. For example…
1 111 1319094.000000
1 112 159201.000000
1 122 1041304.500000
1 211 4026323.250000
1 222 85285437.750000
1 244 89347.500000
1 422 52796.250000
1 444 1749586.500000
2 111 565326.000000
So the task is now done, apart from the job of adding the data back into the original PostGIS table. There are many ways to do this. Often only one, or a few focal classes are of sufficient interest to merit a new column. So an easy way to process the output and send it back to PostGIS involves R.
Connect to the PostGIS data base using RODBC and set the schema
odbcQuery(con,”SET search_path =chiapas, pg_catalog;”)
Read in the data and set the first column as numeric, removing any missing data (by default it will probably have been interpreted as a factor as GRASS will use a * for missing data).
Now a subset of the data can be taken for each of the columns of interest, named and added back to the data base. In this case code “111” represents area that was classified as forest in 1990,2000 and 2005. “122” pixels that were deforested in the period between 1990 and 2000.
odbcQuery(con,”ALTER TABLE agros ADD COLUMN forest int4;”)
odbcQuery(con,”ALTER TABLE agros ADD COLUMN defor2000 int4;”)
We are rather concerned that this cover change analysis, that has been achieved with the standard methodology used by Conservation International, tends to quite seriously underestimate small scale disturbance and canopy opening. Nevertheless in comparative terms it does demonstrate that absolute deforestation in Chiapas is certainly not a homogeneous process. Extensive deforestation is not occurring, although if the map is zoomed out further to show the Marques de Comillas area a very clear hotspot of deforestation is shown. In this part of Chiapas (Central Valley, Sierra and Altos) comparatively few agricultural units have deforested more than 2% of their total area in the last decade.
The other way of looking at this would be as the proportion of remaining forest lost. This woud tend to emphasise deforestation rates in ejidos and agricultural units with a very small ammount of forest in 1990. However at this level the approach would be somewhat unfair as it would be biased by classification errors in small areas.
To put the map in context a map with the proportion of the total area classified as forest can be used. In R the column can be added using
odbcQuery(con,”ALTER TABLE agros ADD COLUMN propforest float;”)
odbcQuery(con,”update agros set propforest=100*forest/hectares;”)
A surprising number of the agricultural units have over 40% tree cover. This has to be placed in the context of a classification method that draws no distinction between a pixel within a pasture with sufficient cover of secondary vegetation to be classified as forest and untouched primary forest. The next step clearly involves analysis of fragmentation in order to clarify this.