Introduction
In his 2005 paper "Specification: the pattern that signifies intelligence," William Dembski tries to give a rigorous definition of his concept of Complex Specified Information (CSI). The paper has numerous problems, the most painful of which is repeated equivocation of terms, making it very difficult to read. Once I got past the equivocation, I discovered basic errors in how probabilities are calculated and interpreted. One error in particular is very hard to swallow; anyone with a basic understanding of probability should know better. Dembski has a master's degree in statistics and a PhD in mathematics, so it is reasonable to think he does know better. How could he be so wrong?
Here's the spoiler, in case you don't care to read the whole post:
- The concept of CSI in Dembski (2005) is based on a meaningless number, which is interpreted as a probability even though it is not. As a consequence, CSI cannot have the meaning and interpretation Dembski states. Dembski's math is wrong.
Taking a closer look
Incredulous that such an error might have escaped the author's notice, I decided to take a closer look, to see if there might be some information or context that I was overlooking. This required some study of Algorithmic Information Theory (AIT). Some heavy reading there, but fortunately inference by AIT is actually quite similar to the Maximum Likelihood and Bayesian methods I already know. From Dembski (2005), p. 20:
`M*N*phi_s(T)*P[T|H]`
where `M` and `N` are positive integers representing replicational resources (how many opportunities there are for T to occur), `phi_s(T)` is a positive integer representing the descriptive complexity of a sequence T, and `P[T|H]` is the probability of the sequence T given hypothesis H.
Dembski refers to this as an upper bound on the probability because of a limitation of AIT; you can never be sure there isn't some shorter coding scheme for a sequence, therefore the probability might be smaller under that other coding scheme. That's OK, because the coding scheme is not an issue here. All I really care about is that Dembski says that `M*N*phi_s(T)*P[T|H]` is a probability.
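For concreteness, the quantity can be written out as a simple function (a minimal sketch in Python; the function and argument names are mine, not Dembski's):

```python
def dembski_bound(M, N, phi_s_T, p_T_given_H):
    """Compute the product M*N*phi_s(T)*P[T|H] from Dembski (2005), p. 20.

    M, N        -- positive integers: replicational resources
    phi_s_T     -- positive integer: descriptive complexity of the sequence T
    p_T_given_H -- probability of T under the chance hypothesis H
    """
    return M * N * phi_s_T * p_T_given_H
```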
The paragraph immediately following has Dembski's interpretation of this probability:
[Quoted passage from Dembski (2005), pp. 20-21.]
See also (2) in the addendum.
And now some statistics
That’s what Dembski says about CSI. Now let’s take a moment to review some basic statistics. Consider the following simple binomial probability experiment. I propose to roll a 6-sided die 10 times, and before rolling I will tell you the probability of rolling one or more 6’s. I make the following statement:
- The probability of rolling a “6” is 1/6 or about 0.167, so the probability of rolling at least one "6" in 10 rolls is 10 times that, giving a probability of 10/6 or 1.67.
If you know even a little bit about probability, you ought
to be suspicious of my statement, because probabilities must range between 0
and 1, and 1.67 is greater than 1. This cannot be a probability so my statement
is obviously wrong. A correct statement would be:
- The probability of rolling a “6” is 1/6 or about 0.167, so the expected number of 6’s in 10 rolls is 10 times that, giving the expected value of 10/6 or 1.67.
That is, I expect to roll "6" an average of 1.67 times in 10 tries. This simple experiment is a demonstration of the binomial distribution, and my mistake in the first statement was mistaking the average (expected value) of the number of 6's rolled for the probability of rolling at least one "6". Now let's change the experiment a little: imagine using a 100-sided die instead of a 6-sided die (or use percentile dice!). My first statement now becomes:
- The probability of rolling a “100” is 1/100 or 0.01, so the probability of rolling at least one “100” in 10 rolls is 10 times that, giving a probability of 10/100 or 0.10.
This statement is also wrong for the same reason as before,
but is less obviously wrong because 0.10 is between 0 and 1, which looks like a probability. Someone who doesn't understand probability, or who does not understand the circumstances, might be fooled into thinking this number (0.10) is a probability bounded on [0,1]. The correct interpretation is again the expected number of times "100" is rolled in 10 trials. It could be 0 (zero) or as high as 10, but on average this value will be 0.10, and the number of "100" events observed will follow a binomial distribution.
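A quick simulation makes the distinction concrete (a minimal sketch in Python; the number of experiments is arbitrary):

```python
import random

EXPERIMENTS = 100_000   # number of 10-roll experiments to simulate
ROLLS = 10              # rolls per experiment
SIDES = 100             # the 100-sided die

total_hits = 0      # total "100"s rolled across all experiments
at_least_one = 0    # experiments containing at least one "100"

for _ in range(EXPERIMENTS):
    hits = sum(1 for _ in range(ROLLS) if random.randint(1, SIDES) == SIDES)
    total_hits += hits
    if hits > 0:
        at_least_one += 1

print("average number of '100's per experiment:", total_hits / EXPERIMENTS)  # ~0.10, the expected value N*p
print("fraction with at least one '100':", at_least_one / EXPERIMENTS)       # ~0.0956, the actual probability
```

The average count comes out near 0.10 as expected, but the fraction of experiments with at least one "100" settles near 0.0956. Close, but not the same number, and only the second one is a probability.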
It doesn’t matter if the die has 6
sides, 100 sides, or 10^150 sides, the expected number of events X that occur
with probability `p` in `N` trials is `N*p`, the expectation of a binomial
random variable. This number is NOT a probability.
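In code, the two quantities are easy to compare side by side for the dice examples above (a sketch in Python):

```python
def expected_count(n, p):
    """Expected number of successes in n binomial trials: n*p. NOT a probability."""
    return n * p

def p_at_least_one(n, p):
    """Probability of at least one success in n binomial trials: 1-(1-p)^n."""
    return 1 - (1 - p) ** n

for sides in (6, 100):
    p = 1 / sides
    print(f"d{sides}: expected count = {expected_count(10, p):.3f}, "
          f"P(at least one) = {p_at_least_one(10, p):.4f}")
# d6:   expected count = 1.667, P(at least one) = 0.8385  (the expectation exceeds 1!)
# d100: expected count = 0.100, P(at least one) = 0.0956  (close, but not equal)
```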
See also (3) in the addendum.
How Dembski is wrong
Dembski starts with the probability P[T|H] and multiplies by `M*N` (ignoring `phi_s(T)` for the moment). Just as in my dice examples above, the result is the expectation of the number of T events that will be observed in `M*N` binomial trials with probability of success P[T|H]. Dembski then multiplies by another positive integer `phi_s(T)`, a measure of descriptive length, resulting in a number which has no apparent interpretation at all.
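To see how far from a probability this product can stray, plug in some illustrative numbers (the values below are hypothetical, chosen by me, not taken from Dembski's paper):

```python
# Hypothetical illustrative values -- not from Dembski's paper.
M, N = 10**5, 10**5     # replicational resources
phi_s_T = 10**8         # descriptive complexity of T
p_T_given_H = 1e-15     # P[T|H], probability of T under the chance hypothesis

product = M * N * phi_s_T * p_T_given_H
print(product)  # 1000.0 -- far outside [0,1], so it cannot be a probability
```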
Additionally, since this is not a probability, there is no justification for a result "less than 1/2" being evidence against the hypothesis H (the chance hypothesis).
Edits: After re-reading the final example in the paper (page 23), I see that Dembski himself shows this number can be greater than 1. This is in direct contradiction to page 21, where he uses it as a probability. Dembski seems to be confused about his own creation. The point remains that he refers to this as an upper bound on the probability, when it is neither a probability nor an upper bound.
Conclusion
The concept of Complex Specified Information put forward in Dembski (2005) is fundamentally flawed. CSI is stated to depend on a probability, but the method used does not produce a proper estimate of a probability. As a result, CSI has no meaningful interpretation.
CSI was known to be flawed before I found this additional bug, so my criticism here may be moot. I suspect the only reason no other critics (notably Elsberry, Shallit, and Devine) have written about this previously is that it requires accepting more fundamental errors before it can even be considered.
Epilogue (April 20, 2023): A discussion with Joe Felsenstein prompted this post at Panda's Thumb:
Discussion: Is William Dembski's CSI argument mistaken or merely useless?
I think this resolves the question conclusively, and further discussion in the comments changed my own interpretation too. Briefly, `phi_s(T)` is a ranking, and it shouldn't be used as a number of additional trials as I describe above. Unfortunately for Dembski, that leaves the definition of CSI completely unsalvageable. It's not even an expected value, but a probability multiplied by a ranking, which doesn't seem to have any meaning at all.
Addendum
1) I almost titled this post "Equivocation: the pattern that typifies nonsense", but I think I'll save that zinger for another day.
2) I should note that I am assuming Dembski is correct in his use and interpretation of `phi_s(T)` and `P[T|H]`. Elsberry and Shallit (2011) and Devine (2014) strongly disagree with Dembski's usage of these quantities, and give correct methods.
3) There is a correct way to do the calculation Dembski needs, which results in a probability:
`P["At least one success" | N,p] = 1-((1-p)^N)`
This is a basic probability calculation which Dembski could hardly have avoided learning while studying for a Master's degree in statistics. The probability Dembski seems to want to calculate (after some additional manipulation) is approximately:
`P["E occurs at least once"] ~~ 1-1/(e^(M*N*phi_s(T)*P[T|H]))`
which should be quite accurate for P[T|H] < 0.1.
[corrected 2/23/2016]`P["E occurs at least once"] ~~ 1-1/(e^(M*N*phi_s(T)*P[T|H]))`
which should be quite accurate for P[T|H] < 0.1.
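For anyone who wants to check, here is a sketch in Python comparing the exact formula with the exponential approximation (the input values are hypothetical; the exact version uses `log1p`/`expm1` to avoid floating-point underflow at tiny probabilities):

```python
import math

def p_at_least_once_exact(n_trials, p):
    """Exact: 1 - (1-p)^n, computed via log1p/expm1 so tiny p doesn't underflow."""
    return -math.expm1(n_trials * math.log1p(-p))

def p_at_least_once_approx(n_trials, p):
    """Approximation: 1 - e^(-n*p), accurate for small p."""
    return 1 - math.exp(-n_trials * p)

# Hypothetical values: treat M*N*phi_s(T) as an effective number of trials.
M, N, phi_s_T = 10**5, 10**5, 10**8
p = 1e-20                       # a hypothetical, very small P[T|H]
n = M * N * phi_s_T             # 10^18 effective trials

print(p_at_least_once_exact(n, p))   # ~0.00995
print(p_at_least_once_approx(n, p))  # ~0.00995 -- the two agree closely
```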
References
Dembski, W. A. (2005). Specification: the pattern that signifies intelligence. Philosophia Christi, 7(2), 299-343. http://www.bilimfelsefedin.org/blog/wp-content/uploads/Specification_-_The_Pattern_That_Signifies_Intelligence_-_William_Dembski.pdf
Devine, S. (2014). An algorithmic information theory challenge to intelligent design. Zygon, 49(1), 42-65. http://onlinelibrary.wiley.com/doi/10.1111/zygo.12059/abstract
Elsberry, W., & Shallit, J. (2011). Information theory, evolutionary computation, and Dembski's "complex specified information". Synthese, 178(2), 237-270. http://link.springer.com/article/10.1007/s11229-009-9542-8
Weisstein, Eric W. "Binomial Distribution." From MathWorld--A Wolfram Web Resource. Retrieved February 12, 2016, from http://mathworld.wolfram.com/BinomialDistribution.html
Weisstein, Eric W. "Probability." From MathWorld--A Wolfram Web Resource. Retrieved February 13, 2016, from http://mathworld.wolfram.com/Probability.html
William A. Dembski. (2016, February 4). In Wikipedia, The Free Encyclopedia. Retrieved February 12, 2016, from https://en.wikipedia.org/w/index.php?title=William_A._Dembski&oldid=703225648