Duncan Golicher’s weblog

Research, scripts and life in Chiapas

Posts Tagged ‘raster

“Raster” data in PostGIS?

with one comment


PostGIS does not support raster data as yet. Hopefully there should be raster support soon. Raster layers can be visualised from a local or remote server using geoserver. However having raster layers integrated into a PostGIS data base would be extremely useful for tasks such as overlaying herbarium records with continuous extrapolated climate data or elevation.

I have previously experimented with importing the points of a raster layer. This works quite well. It can be used for simple spatial overlays using proximity operations. However the visual impression produced by the points is not very satisfactory and a query using the contains operator would be more efficient.

QGIS has a plugin for graticule generation. This forms a grid of individual polygons. So if a grid is produced using the graticule plugin of QGIS and then imported into PostGIS using SPIT it is possible to partially mimic a raster layer, although with some computational overhead.

The document mnp-workshop3.pdf showed how worldclim data (http://www.worldclim.org/) can be imported into GRASS and R.

If R is run from GRASS the following lines can be used to import the appropriate raster layers at a chosen resolution into R.

Most of the lines of the script are concerned with calculations on the basic worldclim data. The dataset consists of 36 grids of monthly mean minimum temperature, maximum temperature and precipitation. Some mean annual values may be derived for simplicity.

a<-c(paste(“tmin”,1:12,sep=”"),paste(“tmax”,1:12,sep=”"),paste(“prec”,1:12,sep=”") ,”alt”)
d<-readRAST6(a)
d@data[,1:24]<-lapply(d@data[,1:24],function(x)x/10)
fullgrid(d)<-FALSE

f<-function(x)mean(x[1:12])
d@data[["meanmim"]]<-apply(d@data,1,f)

f<-function(x)mean(x[13:24])
d@data[["meanmax"]]<-apply(d@data,1,f)

f<-function(x)mean(x[1:24])
d@data[["meantemp"]]<-apply(d@data,1,f)

f<-function(x)sum(x[25:36])
d@data[["totprec"]]<-apply(d@data,1,f)

The trick for importing the layers to PostGIS from R is shown below. If the data is in R as a spatialPixelDataFrame the coordinates can be added to the data to form a table.

dd<-data.frame(coordinates(d),d@data)

This table can then be saved to a PostGIS database using ODBC.

library(RODBC)
con<-odbcConnect(“mydb”)
odbcQuery(con,”SET search_path =chiapas, pg_catalog;”)

sqlSave(con,dd,”worldclim”)

###########################################

Now either in pgadmin or running the queries directly through R by placing them inside an odbcQuery function.

ALTER TABLE chiapas.worldclim add column gid serial NOT NULL;
ALTER TABLE chiapas.worldclim ADD CONSTRAINT worldclim_pkey PRIMARY KEY(gid);
ALTER TABLE chiapas.worldclim ADD COLUMN the_geom geometry;

UPDATE chiapas.worldclim
SET the_geom = PointFromText(‘POINT(‘ || s1 || ‘ ‘ || s2 || ‘)’,4326);

CREATE INDEX
grat30arc_idx_geo on grat30arc USING GIST(the_geom GIST_GEOMETRY_OPS);

CREATE INDEX
worldclim_idx_geo on chiapas.worldclim USING GIST(the_geom GIST_GEOMETRY_OPS);

If a polygon graticule called grat30arc covering the same region with a 30 arch second grid has been produced using the graticule plugin in Qgis and imported into the public schema then the average values of the points within each grid square can then be calculated for the graticule and a new table produced using this as a model query.

CREATE TABLE chiapas.worldclim2 with oids as
SELECT a.*, g.totprec, g.meanmin, g.meanmax, g.elev from
grat30arc a,

(SELECT AVG(c.totprec) as totprec,
AVG(c.meanmim) as meanmin,
AVG(c.meanmax) as meanmax,
AVG(c.alt) as elev,d.gid
FROM chiapas.worldclim c, grat30arc d
WHERE c.the_geom && d.the_geom
AND contains(d.the_geom, c.the_geom) group by d.gid) g

WHERE a.gid=g.gid;

The result is very similar to the visualisation of an arcgrid in arcview when it is viewed in QGis over a shaded relief map. In this case the pixels are approximately 1 km x 1 km (30 arc seconds or 0.008333333 degrees) and the elevation lines up well with the contours derived from a 1:250,000 map.

Spatial indices on the herbarium table and the worldclim2 layer with the following

CREATE INDEX
herb_idx_geo on herb.herbario USING GIST(the_geom GIST_GEOMETRY_OPS);

CREATE INDEX
worldclim2_idx_geo on chiapas.worldclim2 USING GIST(the_geom GIST_GEOMETRY_OPS);

Then the following query runs in less than four seconds to provide 6,000 herbarium records with extrapolated climate data to the nearest km.

SELECT h.*, w.totprec, w.elev,w.meanmin,w.meanmax
FROM herb.herbario h, chiapas.worldclim2 w
WHERE h.the_geom && w.the_geom
AND contains(w.the_geom, h.the_geom);

Or

select h.*, w.totprec, w.elev,w.meanmin,w.meanmax
from herb.herbario h, chiapas.worldclim2 w
where ST_contains(w.the_geom, h.the_geom);


Written by Duncan Golicher

July 5, 2008 at 1:24 pm

Deforestation:Overlaying vector and raster layers in PostGIS using GRASS

leave a comment »

A very common operation in GIS involves overlaying vector polygons on a raster map and calculating the area of each class in the raster map. The results are then added to the attribute table of the vector map. How can this be achieved in PostGIS?

If the raster layer is quite small it could be imported as points and the result achieved through a point in polygon SQL overlay. However that is not the usual way to achieve the operation. When large raster layers derived from the analysis of satelite imagery are of interest the conventional GIS approach  is to first rasterise the vector layer. The computations involved in the overlay are then simple and fast.

This example taken from our ongoing work. It overlays a map of agricultural units on a raster layer representing deforestation in the last decade derived from analysis of satelite data. Because deforestation in Chiapas has tended to take place on a rather small scale involving only a few hectares at a time, it is not at all easy to see the large scale regional pattern from an image. Calculating the proportion of the total area of each agricultural unit deforested helps to see the pattern. The map above shows the percentage of the total (forested + non-forested) area of each agricultural unit estimated to have been deforested between 1990 and 2000.

The simplest way to achieve this result with PostGIS is to import the vector layer into GRASS, where the raster is also held. There are a few issues that have to be addressed when doing this. The first is that the gid index is lost when the attributes are imported. As a unique identifier for each polygon is needed during the overlay an index should be added to the table first if the gid is the only unique identifier for the table. This is very simple.

ALTER TABLE chiapas.agros ADD COLUMN id int8;
UPDATE chiapas.agros SET id=gid;

Now enter GRASS in a location with the same projection as the PostGIS data base (typically EPSG:4326)

Now there is another technical issue. Clearly vector processing in any GIS involves a combination of the geometery and attribute tables. We have both already in PostGIS. However in GRASS we use  functions that work on GRASS’s own representation of the geometry, so an import is needed. GRASS can use a wide variety of databases as a back end for storing attribute data. Among these is Postgresql. What is not obvious, and  potentially quite confusing, is that we already have the data in a Postgresql data base! So, does it all need to be imported again? The answer seems to be yes and no. It is quite possible to link the GRASS topology with the original table. However it is much easier to import it all in the usual way using v.in.ogr. This produces a new table, that could either stored in an external data base or perhaps even within the original PostGIS data base (as a new copy). Of course this attribute only table lacks the geometry column.

For example to set up GRASS to use the original local PostGIS data base as the backend for vector layers

db.connect driver=pg database=”host=localhost,dbname=gisdb2″
db.login user=Duncan pass=******

Note that this step is not necessary. By default Grass will store the table as in dbf format.

A layer can be imported from PostGIS using

v.in.ogr dsn=”PG:host=localhost dbname=gisdb2 user=Duncan” layer=chiapas.agros output=agros type=boundary,centroid

If the raster layer is in another coordinate system, as it is in this case, it will be necessary to exit the location and reproject in the location of the raster layer.

v.proj input=agros location=Global mapset=mesoanalysis output=agros

Then form a raster layer.

v.to.rast input=agros output=agros use=attr column=id layer=1 value=1

Then in GRASS the overlay is achieved quite simply using r.stats and sending the output to a text file.

r.stats -a input=agros,defor >agros.txt

This forms a file in which the first column holds the id of the vector layer, the second the class in the raster layer and the third the area in square meters of each combination. For example…

1 111 1319094.000000
1 112 159201.000000
1 122 1041304.500000
1 211 4026323.250000
1 222 85285437.750000
1 244 89347.500000
1 422 52796.250000
1 444 1749586.500000
2 111 565326.000000

So the task is now done, apart from the job of adding the data back into the original PostGIS table. There are many ways to do this. Often only one, or a few focal classes are of sufficient interest to merit a new column. So an easy way to process the output and send it back to PostGIS involves R.

Connect to the PostGIS data base using RODBC and set the schema

library(RODBC)
con<-odbcConnect(“mydb”)
odbcQuery(con,”SET search_path =chiapas, pg_catalog;”)

Read in the data and set the first column as numeric, removing any missing data (by default it will probably have been interpreted as a factor as GRASS will use a * for missing data).

d<-read.table(“agros.txt”)

d$V1<-as.numeric(as.character(d$V1))
d<-na.omit(d)

Now a subset of the data can be taken for each of the columns of interest, named and added back to the data base. In this case code “111″ represents area that was classified as forest in 1990,2000 and 2005. “122″ pixels that were deforested in the period between 1990 and 2000.

odbcQuery(con,”ALTER TABLE agros ADD COLUMN forest int4;”)
d2<-subset(d,d$V2==111)
d2<-d2[,-2]
names(d2)<-c(“gid”,”forest”)
d2$forest<-round(d2$forest/10000,0)
sqlUpdate(con,d2,”agros”)

odbcQuery(con,”ALTER TABLE agros ADD COLUMN defor2000 int4;”)
d2<-subset(d,d$V2==122)
d2<-d2[,-2]
names(d2)<-c(“gid”,”defor2000″)
d2$defor90<-round(d2$defor2000/10000,0)
sqlUpdate(con,d2,”agros”)

We are rather concerned that this cover change analysis, that has been achieved with the standard methodology used by Conservation International, tends to quite seriously underestimate small scale disturbance and canopy opening. Nevertheless in comparative terms it does demonstrate that absolute deforestation in Chiapas is certainly not a homogeneous process. Extensive deforestation is not occurring, although if the map is zoomed out further to show the Marques de Comillas area a very clear hotspot of deforestation is shown. In this part of Chiapas (Central Valley, Sierra and Altos) comparatively few agricultural units have deforested more than 2% of their total area in the last decade.

The other way of looking at this would be as the proportion of remaining forest lost. This woud tend to emphasise deforestation rates in ejidos and agricultural units with a very small ammount of forest in 1990. However at this level the approach would be somewhat unfair as it would be biased by classification errors in small areas.

To put the map in context a map with the proportion of the total area classified as forest can be used. In R the column can be added using

odbcQuery(con,”ALTER TABLE agros ADD COLUMN propforest float;”)

odbcQuery(con,”update agros set propforest=100*forest/hectares;”)

A surprising number of the agricultural units have over 40% tree cover. This has to be placed in the context of a classification method that draws no distinction between a pixel within a pasture with sufficient cover of secondary vegetation to be classified as forest and untouched primary forest. The next step clearly involves analysis of fragmentation in order to clarify this.

Written by Duncan Golicher

July 1, 2008 at 9:04 am

Follow

Get every new post delivered to your Inbox.