Aggregating time series in R: The Iraq body count

Yesterday I was asked whether R could be used to analyse time series. The answer is that of course it can. R is used extensively in the financial sector for analysing complex time series such as stock prices, and I have already included an example using R in the context of climate variability (El Niño). One challenge is that there are many different ways of working with time series and of representing dates, so aggregation can be rather tricky. The zoo package is one of the most powerful tools for working with time series, but it is not always simple to use. I still haven’t got my head around all the different ways to achieve results.
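To make the point about date representations concrete, here is a minimal base R sketch. The example date and the variable names are my own choices for illustration; nothing here comes from the body count data itself.

```r
# Three common ways to hold the same calendar day in R.
d_chr   <- "2008-03-06"                    # plain text, as read from a file
d_date  <- as.Date(d_chr)                  # calendar date, no time of day
d_posix <- as.POSIXct(d_chr, tz = "UTC")   # date-time, here midnight UTC

class(d_date)    # "Date"
class(d_posix)   # "POSIXct" "POSIXt"

# Aggregation needs a grouping key; a year-month label is one simple choice.
month_key <- format(d_date, "%Y-%m")
month_key        # "2008-03"

# Converting between classes is where time zones can bite, so state them.
same_day <- as.Date(d_posix, tz = "UTC") == d_date
same_day         # TRUE
```

Which class is most convenient depends on the data. For daily counts, Date is usually enough, and POSIXct only earns its keep when the time of day matters.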

So here is an example. The code first reads in the data from an online database, then uses tapply to sum the number of casualties per day. The zoo package is then used to aggregate by month, and the monthly totals are plotted. If anyone reading this can suggest better ways to do this, or add a more sophisticated analysis, I would like to know.

(Open this document if the code doesn’t work due to problems with quotation marks: bodycount.doc)

Or try this (again, you might need to retype the quotation marks):

source(url(“”))



x <- zoo(as.vector(a), as.POSIXct(
Deaths <- aggregate(x, f, sum)
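For anyone who wants to try the steps without the online data, here is a minimal sketch on made-up numbers. It is not the original script. The data frame `incidents`, its column names and its values are invented for illustration. It indexes by `as.Date` rather than `as.POSIXct`, and uses zoo's `as.yearmon` as the monthly grouping function, where the lines above use an `f` that is not defined there.

```r
library(zoo)  # assumes the zoo package is installed

# Made-up incident records standing in for the real database (an assumption):
# one row per incident, and several incidents can share a day.
incidents <- data.frame(
  date   = as.Date(c("2003-03-20", "2003-03-20", "2003-03-25",
                     "2003-04-02", "2003-04-02", "2003-04-15")),
  deaths = c(3, 5, 2, 10, 1, 4)
)

# Step 1: sum deaths per day with tapply; names(daily) hold the dates as text.
daily <- tapply(incidents$deaths, incidents$date, sum)

# Step 2: build a zoo series indexed by calendar date.
x <- zoo(as.vector(daily), as.Date(names(daily)))

# Step 3: aggregate by month; as.yearmon maps each date to its year-month.
monthly <- aggregate(x, as.yearmon, sum)
monthly   # Mar 2003: 10, Apr 2003: 15

# Step 4: plot the monthly totals as vertical bars.
plot(monthly, type = "h", xlab = "Month", ylab = "Deaths per month")
```

Passing a function such as `as.yearmon` as the second argument lets `aggregate` work out the groups from the series index itself, so no separate grouping vector has to be kept in step with the data.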


However you get there, the result is shocking. These are documented civilian casualties and I chose the lower estimate.

Even though the trend was initially downwards after the start of the surge, it only bottomed out at around the level seen during the “mopping up” operations in the first few months after the invasion. The time series taken from the compiled online database stops in January 2008, so it doesn’t take into account the recent renewal of violence. Today, Thursday 6 March, there were 86 civilian dead. Two bomb attacks killed 68 in Baghdad alone. Apparently this is not an isolated incident. The trend is sadly upwards again.

Joseph Stiglitz has estimated the price of the war at 3 trillion dollars. I wasn’t even sure what a trillion was until he used the figure. It has twelve zeros; in other words, it is a thousand billion. That works out at about 30 million dollars for each dead Iraqi civilian, enough to have made every one of these people a multimillionaire many times over. No further comment, apart from a heartfelt request to visit the site of those who have worked so hard to compile this important data set, and to re-run the code periodically to check the updated figures.

As an addition, I was saddened to hear that Harvard professor Samantha Power has resigned from Obama’s campaign team this week, apparently for speaking too openly on this issue, among others. I was extremely impressed by her thoughtful, yet emotional, contribution to BBC Radio’s Start the Week, in which she talked about the biography she has written of Sergio Vieira de Mello, a heroic figure who impressed me in every interview I heard with him up to his tragic death in Iraq. The recent news suggests that honest, open-minded expression by academics is still considered a liability, even by apparently honest, open-minded politicians. This is a link to the HARDtalk interview.


One thought on “Aggregating time series in R: The Iraq body count”

  1. I’ll post a few ideas of other ways to play around with this data set. In the meantime, a question. Have you ever done regression analysis on a multi-dimensional array?
