Archive for September 2011
Nearest neighbour in POSTGIS
The issue to solve is how to find the name of the nearest blue point (settlements with a population over 100) from the red points (plant collections).
Some interesting ways of solving the problem are provided here.
http://www.bostongis.com/?content_name=postgis_nearest_neighbor_generic
There are some slightly easier way using the current version of PostGIS (1.8).
Breaking down the queries by parts. First find all the towns with a population over 100.
We will call this query “w” and we wrap it up as subquery.
(select * from mexico.conteo2005 where pobtot>100) w
Now we choose all the elements we want to see from the herbarium table together with a calculation of the distance between the two points. The original tables are in epsg:4326 so distances are in degrees. This doesn’t matter for our purposes, but it would be easy to include a CRS transform in PostGIS if we do want distances in km.
select h.gid, nom_loc,genus,species, h.the_geom, ST_distance(w.the_geom,h.the_geom) as dist from mexico.oldherbario h,
The search can be restricted to a distance of 0.1 degrees (approximately 10km) using
ST_DWithin(w.the_geom,h.the_geom,0.1)
And we want to order the results first by the herbarium collection ID and then by the distance.
order by h.gid,dist
Roll this all together and we get
select h.gid, nom_loc,genus,species, h.the_geom, ST_distance(w.the_geom,h.the_geom) as dist from mexico.oldherbario h,
(select * from mexico.conteo2005 where pobtot>100) w
where ST_DWithin(w.the_geom,h.the_geom,0.1)
order by h.gid,dist
So now we just need to get the first entry from each group and we have the nearest neighbour (with a population over 100 and within 10 km … the later criteria could be changed easily if needed, but the query would be slower). We can do that using select distinct. taking advantage of the fact that the function will return the first distinct id for each group, and the data is ordered by distance, so this is the nearest neighbour.
Less than 7 seconds is not bad. The query now looks like this
select distinct on(d.gid) d.* from
(select h.gid, nom_loc,genus,species, h.the_geom, ST_distance(w.the_geom,h.the_geom) as dist from mexico.oldherbario h,
(select * from mexico.conteo2005 where pobtot>100) w
where ST_DWithin(w.the_geom,h.the_geom,0.1)
order by h.gid,dist) as d
Finally if we wish to visualise the results we can form a new table or a new view.
create table chiapas.herblocalities as
select distinct on(d.gid) d.* from
(select h.gid, nom_loc,genus,species, h.the_geom, ST_distance(w.the_geom,h.the_geom) as dist from mexico.oldherbario h,
(select * from mexico.conteo2005 where pobtot>100) w
where ST_DWithin(w.the_geom,h.the_geom,0.1)
order by h.gid,dist) as d
If you are going to keep the table in the data base don’t forget to add entries in the geometry_columns table.
INSERT INTO geometry_columns(f_table_catalog, f_table_schema, f_table_name, f_geometry_column, coord_dimension, srid, “type”)
SELECT ”, ‘chiapas’, ‘herblocalities’, ‘the_geom’, ST_CoordDim(the_geom), ST_SRID(the_geom), GeometryType(the_geom)
FROM chiapas.herblocalities LIMIT 1;
Installing PostGIS in Ubuntu Natty
This will install PostGIS and add the R language through PLR in the current version of Ubuntu Natty 11.04
sudo apt-get install qgis
sudo apt-get install postgresql
sudo apt-get install postgresql-8.4-postgis
sudo apt-get install postgresql-8.4-plr
sudo apt-get install pgadmin3
One small step that is necessary is to change the user password for postgres. You can do this with psql. Care needed here. This has to be done correctly. The following line gets you into psql.
sudo -u postgres psql -d template1
Type this (being very careful with quotation marks and the semicolon).
alter user postgres with password 'postgres';
If successul you get a message saying ALTER ROLE (If there is any problem here then retype the single quotation marks to make sure they are simple. WordPress keeps changing them for some reason if they are not in an HTML box).
Now become the postgres user and create a postgistemplate. This is a blank data base into which you will load all the functions and tables needed for POSTGIS. All new data bases created using this template will have these functions loaded.
sudo su postgres
createdb postgistemplate
createlang plpgsql postgistemplate
The PostGIS functions and spatial_ref system are loaded from here on Ubuntu
psql -d postgistemplate -f /usr/share/postgresql/8.4/contrib/postgis-1.5/postgis.sql
psql -d postgistemplate -f /usr/share/postgresql/8.4/contrib/postgis-1.5/spatial_ref_sys.sql
This cpmmand loads in the R language (to use this you must have R installed)
psql -d postgistemplate -f /usr/share/postgresql/8.4/plr.sql
To create a database under this template
createdb -T postgistemplate -O postgres gisdb
You will also have to alter the following config file
sudo gedit /etc/postgresql/8.4/main/postgresql.conf
Find the line with listen_addresses and uncomment it. If you want the database to be open to other users use’*’ instead of localhost
listen_addresses = ‘*’ # what IP address(es) to listen on;
Download a small test database with the countries of the word from this site with wget. Again the file is disguised as a word doc in order to go into the wordpress site.
wget http://duncanjg.files.wordpress.com/2008/09/paises.doc
Restore this database. The database was built using an old version of Postgis, so you will see many errors which you can ignore (See
http://blog.cleverelephant.ca/2010/09/postgis-back-up-restore.html
sudo -u postgres psql gisdb<paises.doc
Now run qgis and connect to your new PostGIS data base.
The trial database with a single countries of the world table can also be downloaded without wget by clicking here.
Updating to Ubuntu 11.04: Solving the sticking cursor
I now expect Ubuntu to work strait out of the box so I was disappointed by a severe bug in the kernel that the current version of Natty uses.
uname -a
Linux duncan-HP 2.6.39-02063904-generic #201108040905 SMP Thu Aug 4 11:04:48 UTC 2011 i686 i686 i386 GNU/Linux
The issue makes the cursor periodically freeze on the screen for up to a second. The system can still be used, but the experience is extremely frustrating. It turns out that a process called kworker is using 100% of the CPU. This can be fixed by following the instructions here, http://ubuntuforums.org/showthread.php?t=1796873.
Typing
sudo echo “options drm_kms_helper poll=N”>/etc/modprobe.d/local.conf
and rebooting solved the problem for me.
Notice that if you install this version you get a new “look and feel”. Not to my taste, but it is easy to go back to the old classic at log in.






