Saturday, October 30, 2010
I found an interesting series of blog articles, starting here:
What caught my interest isn't so much that it reveals any surprising secrets, but that it is a reasonable mathematical analysis of the value of education and ability, explained in simple terms. Simplified, but it makes a good point all the same.
I am less certain about the author's politics as a whole, which are also interesting, but I don't really have the time or energy right now to evaluate them properly. I do see a certain appeal in his mathematical approach to politics, but I remain skeptical: such an approach could be manipulated to fit specific goals, rather than the math driving the policy.
Saturday, October 23, 2010
Science Fail #1
Labels: humor, Science Fail
I'm borrowing the schtick of another blogger, who has not been active lately. I hope he won't mind?
The idea of a "Science Fail" is to post a statement or comment that fails in a completely self-explanatory way. Editing on my part is not allowed; each quote must fail on its own merits, or lack thereof. Go see Dirty Harry's Science Fail blog for many shining examples.
"So what is this, Reality Show science? Leave to the Americans to get data by blowing something up or smashing something into something else. Keep this up and someday the Moon will be as livable as East Baghdad." (link to source)
If only Terraforming were that easy!
Another tomato thrown by Dan Eastwood at 5:06:00 PM
Wednesday, October 13, 2010
Data Science, Science Data
Labels: data, Rant, science, statistics
[Image: Nature News]
The point of this is not to blow my own horn, but to note that I have a set of skills for managing databases that is nearly independent of my statistical knowledge. The Nature News article points out the problems caused by poor programming skills, but the same problem exists with database skills: some researchers don't understand the basics of recording data in an organized manner, and disorganized data can lead to as many problems as disorganized programming.
It is not too unusual for researchers to bring me data (typically in a spreadsheet), and sometimes I spot specific problems that may be errors in how the data was collected and recorded. This is fairly important, because if the data is wrong then my analysis will be too. Sometimes I can fix these errors for them; other times I have to have the researcher fix the problems, because making the correction requires medical knowledge and familiarity with (or access to) the original data source. Once these bugs have been ironed out, all is well and I do my statistical thing.
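To make that concrete, here is a minimal sketch of the kind of screening I might run on a freshly delivered spreadsheet, written in Python with the pandas library. The file name and column names (study_data.xlsx, age, subject_id) are hypothetical, made up for illustration; the checks themselves are just the routine ones.

    import pandas as pd

    # Hypothetical file and column names, for illustration only.
    df = pd.read_excel("study_data.xlsx")

    # Missing values: a per-column count often reveals half-entered
    # rows or columns that were never filled in.
    print(df.isna().sum())

    # Range checks: values outside plausible bounds are usually
    # data-entry errors (an age of 250, say, or a negative weight).
    print(df[(df["age"] < 0) | (df["age"] > 120)])

    # Duplicated subject IDs can indicate copy & paste mistakes.
    print(df[df.duplicated(subset="subject_id", keep=False)])

None of this replaces knowing the data; it just surfaces the rows worth asking the researcher about.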
There is another sort of error, though, and it is much more subtle: errors in the data that don't really look like errors. When someone brings me their data and nothing is obviously wrong, I probably won't question it, and will proceed with the analysis. There are some common ways this can happen: cut & paste errors, "bad" sorts that scramble the data, and inconsistent data entry, all simple mistakes. Sometimes evidence of these errors shows up during my database management prep work or during the analysis itself. Obviously, if a mistake is found, it gets fixed. However, if my experience with finding errors in the late stages of analysis is any indicator, it seems likely that some of these errors are never found. The "garbage in, garbage out" principle applies, and some of the analyses I've produced were likely garbage, because the data was garbage.
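Some of these subtle errors can still be caught with internal consistency checks, if you think to look for them. A hypothetical example, again with made-up column names (birth_date, visit_date, age, sex): when a dataset records both a birth date and an age, the two can be checked against each other, and a "bad" sort that shuffles one column independently of the others tends to break exactly this kind of agreement.

    import pandas as pd

    # Hypothetical columns: birth_date, visit_date, recorded age, sex.
    df = pd.read_csv("study_data.csv",
                     parse_dates=["birth_date", "visit_date"])

    # Recompute age from the dates. Rows where the recorded age
    # disagrees may have been scrambled by a sort applied to only
    # some of the columns.
    computed_age = (df["visit_date"] - df["birth_date"]).dt.days // 365
    suspect = df[(computed_age - df["age"]).abs() > 1]
    print(suspect)

    # Inconsistent data entry often shows up as near-duplicate
    # category labels ("Male", "male", "M") in a frequency table.
    print(df["sex"].value_counts())

The tolerance of one year allows for the rough day-count conversion; anything beyond that deserves a second look.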
The good news is that this sort of error is unlikely to contribute much to the larger body of scientific knowledge. By the nature of statistics (and with an assumption of some randomness), these subtle errors are unlikely to produce significant results, less likely to agree with other published studies, and certainly unlikely to be verified by follow-up studies. The bad news is that some simple, perhaps even careless, mistakes can ruin months or even years of research: a tremendous waste of effort.
Finally, this brings me to that other set of skills: teaching. Whenever I have the opportunity to work with people who are starting new research projects, I try to teach the basic data skills, the do's and don'ts, to help them get good data and do good research. Not everyone is interested in spreadsheets and databases, but it is not too hard to convince researchers that a little extra effort up front to get good data will pay dividends down the road when it comes to publication. It certainly pays me dividends when it comes to actually doing the statistical analysis, my primary skill, rather than spending hours (or days, or weeks) trying to track down what went wrong with the data, or unknowingly analyzing junk data.
Another tomato thrown by Dan Eastwood at 6:24:00 PM