Tuesday, April 6, 2010

Week 1 data visualization in Political Science


In political science, much of the past research on the outbreak of civil wars has relied on aggregate state-level demographic and economic data in cross-national comparisons. Using a handful of explanatory variables such as GDP per capita, ethnic fractionalization, or colonial history, these studies have generally converged on explaining the probability of a nation having a civil war with factors such as natural resource reliance, low per capita GDP, or difficult terrain. Moreover, there seems to be little correlation with religious or ethnic divisions, regime type, or economic inequality.

           
Below is a graph from a highly influential article on civil war onset. Pictured are probabilities of a nation experiencing a civil war, derived from 220+ onsets and a dozen explanatory variables for over 150 nations spanning 40 years. One of the important and somewhat controversial findings is that ethnically divided nations are not significantly more at risk of violence. This graph attempts to summarize that core finding by placing the probability associated with ethnicity in perspective with that associated with different levels of per capita GDP.

How well does this graph present the information? Is it getting in the way of the data, or is it helping to identify patterns? First, it is unclear exactly what it is telling us. For instance, what does the probability mean: is it a lot or a little? It is quite difficult to tell how varying ethnicity changes the probability. What happens when both variables vary together? How much variation is there across nations or regions? Is the graph so abstract and highly aggregated that we lose any feel for substantive significance? Could this be better represented by a simple table? Moreover, might it even be misleading? Is that really the relationship between the two, or is the level of aggregation obscuring important details? What if we had several highly geographically unequal societies in which the civil wars were occurring in the wealthy regions?
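To make the last question concrete, here is a minimal sketch with entirely made-up numbers (the region names, populations, GDP figures, and event counts are hypothetical and not drawn from any dataset). It compares a country's national GDP per capita with the GDP per capita of the places where conflict events actually occur.

```python
# Hypothetical regions of one invented country:
# (name, population, total GDP in dollars, number of conflict events)
regions = [
    ("capital region", 1_000_000, 5_000_000_000, 12),
    ("northern hinterland", 4_000_000, 2_000_000_000, 0),
    ("southern hinterland", 5_000_000, 3_000_000_000, 1),
]


def gdp_per_capita(population, gdp):
    """GDP divided by population."""
    return gdp / population


# The aggregate figure a cross-national study would use.
total_pop = sum(pop for _, pop, _, _ in regions)
total_gdp = sum(gdp for _, _, gdp, _ in regions)
national = gdp_per_capita(total_pop, total_gdp)

# GDP per capita averaged over events, i.e. weighted by where the
# violence actually takes place.
total_events = sum(events for _, _, _, events in regions)
event_weighted = sum(
    gdp_per_capita(pop, gdp) * events for _, pop, gdp, events in regions
) / total_events

print(f"national GDP per capita:       {national:,.0f}")
print(f"event-weighted GDP per capita: {event_weighted:,.0f}")
```

In this invented case the country looks poor in the aggregate, yet nearly all of the violence sits in its wealthiest region, which is exactly the kind of detail a country-level probability cannot show.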

Armed Conflict Location and Events Dataset (ACLED)

The level of aggregation in much of the civil war literature has bothered many scholars. Below is a figure derived from a new data set that attempts to overcome this problem. ACLED catalogs information on individual civil war events along with their locations, dates, participants, context, and outcomes. The ovals represent the activity of various rebel groups, while the map shading represents population density.

There seems to be a high correlation between population density and rebel activity. However, is it population density or the border with Rwanda and Uganda that is the important factor? That is not readily apparent, given the way the non-DRC countries are 'left out'. Secondly, are the colored circles the best way to represent the second layer of information? Are the circles distracting? How dependent are the areas of the circles on outliers, or do they represent a more or less even dispersion?

The last figure uses the ACLED data to analyze the correlation between violent events and variables such as wealth, location of diamond mines, distance from the capital, and ethnic make-up. While the unpublished version is in color, the published version is black and white (what most of the world will see; what would Tukey say?). Unlike the first chart, the disaggregated information allows us to ask questions such as: can we say diamonds are correlated with civil wars when the conflict site is nowhere near the source of diamonds? Yet the figure poses its own potential distortions. For example, it is hard to tell how the size of a bubble relates to the number of war events. To the eye, a few war events take on a disproportionate significance. For instance, the majority of events take place around the capital, Monrovia. However, the figure gives the impression of a greater spread of events across the nation.
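How the bubbles in the published figure were scaled is not stated, so the following is only a sketch of one way such a distortion can arise. It uses hypothetical event counts (not ACLED values) and an assumed minimum bubble radius, and compares each location's share of events with its share of ink on the map.

```python
import math

# Hypothetical event counts per location; not taken from ACLED.
events = {"Monrovia": 90, "site A": 6, "site B": 4}


def area_proportional_radius(count):
    """Radius chosen so that the circle's area equals the count."""
    return math.sqrt(count / math.pi)


def area_share(radii):
    """Fraction of the total bubble area taken by each location."""
    areas = {name: math.pi * r ** 2 for name, r in radii.items()}
    total = sum(areas.values())
    return {name: a / total for name, a in areas.items()}


total_events = sum(events.values())
event_share = {name: n / total_events for name, n in events.items()}

# Area-proportional bubbles: visual share matches event share.
proportional = area_share(
    {name: area_proportional_radius(n) for name, n in events.items()}
)

# Assumed floor on radius, as a plotting tool might apply to keep
# small bubbles visible. The value 3.0 is arbitrary.
MIN_RADIUS = 3.0
floored = area_share(
    {name: max(area_proportional_radius(n), MIN_RADIUS) for name, n in events.items()}
)

for name in events:
    print(
        f"{name:9s} events {event_share[name]:.0%}  "
        f"proportional area {proportional[name]:.0%}  "
        f"floored area {floored[name]:.0%}"
    )
```

With area-proportional scaling the capital's bubble occupies the same share of the ink as its share of events. Once small bubbles are inflated to stay visible, the outlying sites take up far more of the picture than their event counts warrant, which would produce the impression of a wider spread described above.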
