An element of spatial analysis that can be both fascinating and frustrating involves resolving the tension between the attraction of visual pattern matching and the formality of statistical analysis.
Traditional non-spatial statistics in an observational setting relies heavily on correlation analysis. This alone can be uninformative, misleading and often confusing. Modern spatial statistics that attempt to go beyond the limitations of correlation can be obscure and difficult to communicate. At the same time, maps can be difficult to summarize. Presenting data in the form of patterns that overlay each other in space (and sometimes time) is very difficult to achieve convincingly online, and much harder to incorporate into a scientific format that still relies heavily on printed material.
Take the example of the relationship between population density and forest cover in Chiapas. My own experience in the region has led me to some simple, robust conclusions based on the evidence I see around me on a daily basis.
1. The highlands of Chiapas have an extremely high population density that cannot be sustained through subsistence agriculture. Around San Cristobal this is above 100 people per km². Yet a surprising amount of forest cover still remains.
2. The Central Depression has a much lower population density and an entirely agricultural economy. It is largely deforested, even allowing for the fact that satellite imagery can underestimate the amount of remaining forest.
3. The rural population of the highlands has continued to grow in recent years. However, the amount of land dedicated to agriculture in the highlands has slightly decreased, as has the value of agricultural production.
4. Recent deforestation in the highlands has disproportionately affected old-growth, mature forest. Secondary forest in the highlands has tended to expand.
5. The economy of the highlands of Chiapas is growing while the Central Depression is static.
6. Extreme poverty is still more prevalent in the highlands of Chiapas, while chronic rural deprivation is a feature of the Central Depression.
These observations have been the subject of academic discussion and are not always accepted in the way I have stated them. In particular the empirical relationship between population density and deforestation is often still at the centre of the controversy.
This may be due to trivialization of the obviously non-linear (in the statistical sense) pattern I tried to show in the animated gif above (move back up if you didn’t realize it was an animation; I left a longish pause for thought between each frame). Agricultural “frontiers” with no resident population are trivially forested. The agricultural frontier regions also (trivially) show the highest rates of deforestation (deforestation is not shown explicitly here, but see some previous posts). However, the extremely complex dynamics of forest loss, regrowth and consequent structural and compositional change as a result of the management of a culturally determined landscape are often misrepresented and misunderstood. Attempting to linearize this sort of pattern through correlation analysis is never going to provide insight.
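To make the point about linearizing concrete, here is a minimal sketch in Python. The numbers are invented purely for illustration and are not census or satellite data. They simply assume a U-shaped curve in which forest cover is high at zero density, low at intermediate densities and high again at the top of the range. The function name `pearson_r` and both lists are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented values for illustration only (people per km2, % forest cover).
density = [0, 20, 40, 60, 80, 100, 120, 140, 160, 180, 200]
# Assumed U-shaped response: forested at both ends, cleared in the middle.
forest_pct = [90 * ((d - 100) / 100) ** 2 + 10 for d in density]

r = pearson_r(density, forest_pct)
print(f"Pearson r = {r:.3f}")
```

In this toy case forest cover is completely determined by density, yet the correlation coefficient comes out as effectively zero because the curve is symmetric. A real landscape will not be this tidy, but the same mechanism is why a weak correlation says very little about whether a relationship exists.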
The animation above is based on the 2000 census data, which was imported into a PostGIS database and then visualized using QGIS. The polygons are watersheds, produced using r.watershed in GRASS, translated into a vector layer and also exported to PostGIS. The populations within the watersheds can then be calculated using a simple spatial query in PostGIS. I will provide more technical details on these steps in a subsequent post. The size of the points for the population centres is not proportional to population size as such but was chosen for visual clarity.
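Until that post appears, the logic of summing population by watershed can be sketched in plain Python. This is not the actual query, which is not given here. In PostGIS itself the join would normally rely on a spatial predicate such as `ST_Contains`. The watershed shapes, locality coordinates (assumed to be in km) and population figures below are all made up, and the function names are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_area(polygon):
    """Area by the shoelace formula, in squared coordinate units."""
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Invented watersheds: simple rectangles with coordinates in km.
watersheds = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (30, 0), (30, 10), (10, 10)],
}

# Invented localities: (x, y, population).
localities = [
    (2, 3, 500),
    (7, 8, 1500),
    (15, 5, 400),
    (25, 2, 600),
]

# Sum the population of every locality that falls inside each watershed.
totals = {name: 0 for name in watersheds}
for x, y, population in localities:
    for name, polygon in watersheds.items():
        if point_in_polygon(x, y, polygon):
            totals[name] += population
            break  # each locality belongs to at most one watershed

# Convert totals to people per km2.
densities = {
    name: totals[name] / polygon_area(polygon)
    for name, polygon in watersheds.items()
}

for name in watersheds:
    print(name, totals[name], densities[name])
```

A spatial database does the same thing far more efficiently by using a spatial index, and it handles awkward cases such as points lying exactly on a shared boundary, which this sketch ignores.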