Tuesday, May 11, 2010

Second Verse: The Census Will Be Wrong

This is a follow-on to my post of three days ago: The Census Will Be Wrong. We Could Fix it. My friend Matt sort of set me off with his comments - I think he likes to do that :-)  - and my "longer response" has turned into a post of its own, and inspired another yet to come.

Matt writes: I think the author underplays the risk of political/agenda hijinks... The first thing that came to my mind when I read this was that craptastic Lancet paper about civilian deaths in Iraq. Perception is reality, and statistics have gotten a bad rap from a few bad actors. Them's the breaks.

First of all, this is by no means a flame aimed at my friend, and anyone who says otherwise is itching for a fight. This is intended to be my professional opinion with some lightly researched examples to illustrate the problem.
  1. The current census "head count" is known to be flawed (1), and both Democrats and Republicans attempt to take advantage of this (1). 
  2. The mathematics of statistical resampling are apolitical. It's simply a better way to do it, and less subject to error and bias.
  3. Statistical resampling is a simple concept wrapped in a lot of boring math. The basic idea is to go back, re-check some of the original counts, and fix them.
  4. The Lancet surveys of war casualties in Iraq are arguably flawed, but a flawed paper in no way invalidates a field of mathematics, or for that matter even the methods of that paper. By way of equally flawed logic, we should abandon all automobiles because Chrysler made the "K" car. (That last bit makes more sense than I thought it would.)
  5. The Lancet paper is, if anything, an example of a study that would benefit from resampling. At a glance - and that's literally all I've given it - the war casualty estimates may suffer more from a lack of precision than a lack of accuracy.
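
To make item 3 a little more concrete, here is a toy sketch in Python of the "go back, re-check, and fix" idea. Everything in it is made up for illustration: the blocks, the 0-6% miss rate, the assumption that a recount is exact, and the function names simulate_blocks and adjusted_total. It uses a simple ratio adjustment, which is only one way to do this and is not a description of the Census Bureau's actual procedure.

```python
import random

def simulate_blocks(n_blocks, rng):
    """Make up true populations and flawed head counts for hypothetical blocks."""
    true_counts = [rng.randint(50, 500) for _ in range(n_blocks)]
    # Assumption for illustration: each block's head count misses 0-6% of residents.
    head_counts = [round(t * (1 - rng.uniform(0.0, 0.06))) for t in true_counts]
    return true_counts, head_counts

def adjusted_total(head_counts, rechecked):
    """Scale the full head count by the correction seen in the re-checked blocks.

    rechecked maps a block index to its careful recount.
    """
    sampled_head = sum(head_counts[i] for i in rechecked)
    sampled_recount = sum(rechecked.values())
    return sum(head_counts) * sampled_recount / sampled_head

rng = random.Random(2010)  # fixed seed so the run is repeatable
true_counts, head_counts = simulate_blocks(1000, rng)

# Go back and re-check 100 randomly chosen blocks.
# Assumption: the careful recount gets the block exactly right.
sample = rng.sample(range(1000), 100)
rechecked = {i: true_counts[i] for i in sample}

raw_total = sum(head_counts)
fixed_total = adjusted_total(head_counts, rechecked)
true_total = sum(true_counts)

print("true total:    ", true_total)
print("raw head count:", raw_total)
print("adjusted total:", round(fixed_total))
```

The point of the sketch is that re-checking a random tenth of the blocks is enough to estimate how far off the head count runs, and that estimate can then be applied to the whole count. Nobody chooses the answer; the random sample does.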

There seems to be a common thread here that many people just don't get what statistics can really tell us. Part of that problem is the growing pains of a fairly recent area of mathematics working its way into a culture already stressed with information overload. Another part of the problem is that statistics has been poorly taught, frightening students with the math and failing to convey the meaning behind it.

That last sentence expresses one of my original motivations for this blog: to hell with the math, I want people to understand the meaning. Keeping with this theme, my next post will be about the statistical meaning of Accuracy, Precision, and Bias.

Here are some odds and ends I dug up while researching this post:
-Article about 1999 Supreme Court decision on statistical resampling.
-The 1999 Supreme Court decision on statistical resampling.
-There are some additional comments on the blog of Jordan S. Ellenberg, author of the Washington Post Op-Ed.
-Unrelated Census Hijinx