One of the tasks that is commonly used in most spatial analysis at some point involves obtaining zonal statistics from a vector on raster overlay. This has always been rather inelegant using GRASS or Arc as it involves several steps. A PostGIS raster query simplifies the work flow.
However in some cases my queries were taking much longer to run than expected using PostGIS2.0. The issue concerned ST_Clip. The function was originally written in pgsql. There is now a C version, but I had installed PostGIS on Ubuntu 12.04 using the ppa:sharpie/postgis-nightly repository. That package has not been rebuilt since October.
So I decided that I just had to compile the SVN version from source. This was very quick and easy after tracking down a handful of missing libraries reported on the first failed make. In case it is useful to others who don’t want to try compiling, here is the deb package that I built last night on Ubuntu 12.04 precise 32 bit.
And one today on the 64 bit machine.
The difference is impressive. The following query, shown on my post a couple of days ago, took over a minute to run, which was forgiveable due to the complexity of the multipolygon geometry that covered several tiles, but rather disappointing.
select binomial, (pvc).value covclass, sum((pvc).count) pcount FROM (SELECT binomial,rid, ST_ValueCount(ST_Clip(rast, geom, true)) pvc from (select binomial,geom from redlist_species where binomial like 'Baccharis_lati%') v, globcov1000x1000 r where st_intersects(rast,geom) group by binomial,rid,geom )foo group by binomial,covclass order by binomial,covclass
Now it takes just under 5 seconds!
So I know what I have been missing out on! Many thanks to Pierre and Bborie for pointing this out to me, and of course for all the work they put into programming PostGIS raster. Now all that I need to really show it off is that stable gdal driver.
An easy way of updating is to install the package using Ubuntu software centre (I apologise for any notice you may get that the package is of “bad quality”, this is due to poor documentation not the quality of the build). and then run the SQL files found in /usr/share/postgresql/9.1/contrib/postgis-2.1 in the data base you want to upgrade from PGAdmin as shown on this post.
The 140 or so complex multipolyons overalain on the original resolution raster now runs through in three minutes.