Intelligent Design as a Theory of Information

William A. Dembski

Department of Philosophy

University of Notre Dame

Notre Dame, Indiana, USA





Information

In his book Steps Towards Life, Manfred Eigen (1992, p. 12) summarizes the task of origins-of-life research as follows: "Our task is to find an algorithm, a natural law that leads to the origin of information." This summary of origins-of-life research is at once insightful and misguided. It is insightful because it correctly isolates the central problem facing origins-of-life research, to wit, the origin of information. At the same time, it is misguided because it prescribes an unworkable solution for this problem, to wit, algorithms and natural laws. Algorithms and natural laws are utterly incapable of producing information. Indeed, it is an oxymoron to attribute the origin of information to algorithms and natural laws: information is inaccessible from algorithms and natural laws. Eigen is working on the right problem, but looking to the wrong solution. Eigen's insight is to see that the origin of information constitutes the central problem facing origins-of-life research; Eigen's mistake is to think that algorithms and natural laws constitute the solution. In this paper I shall examine Eigen's insight and correct Eigen's mistake. To examine Eigen's insight, I shall explicate the concept of information and connect it to biological reality. To correct Eigen's mistake, I shall introduce intelligent causation and show why it, and not algorithms or natural laws, provides the right way to account for the origin of information.

Let us then begin with information. What is information? The fundamental intuition underlying information is not, as is commonly thought, the transmission of signals across a communication channel, but rather, the ruling out of possibilities. To be sure, when signals are transmitted across a communication channel, invariably a set of possibilities is ruled out, namely, those signals which were not transmitted. But to acquire information remains fundamentally a matter of ruling out possibilities, whether these possibilities comprise signals across a communication channel or take some other form. As Robert Stalnaker (1984, p. 85) puts it, "To understand the information conveyed in a communication is to know what possibilities would be excluded by its truth." Information in the first instance presupposes not some medium of communication, but contingency. For there to be information, there must be a multiplicity of distinct possibilities any one of which might happen. When one of these possibilities does happen and the others are ruled out, information becomes actualized. Indeed, information in its most general sense can be defined as the actualization of one possibility to the exclusion of others.




Complex Information

This definition of information is highly abstract and by itself of little use to biology and science more generally. To render information a useful concept for science we need to do two things: first, provide a means for measuring information; second, introduce a crucial distinction: the distinction between specified and unspecified information. First, let us consider how to measure information. In measuring information it is not enough to count the number of possibilities that were ruled out, and offer this number as the relevant measure of information. The problem is that this simple enumeration of excluded possibilities tells us nothing about how the possibilities under consideration were individuated in the first place. Consider, for instance, the following individuation of poker hands:

(i) A royal flush.

(ii) Everything else.

To learn that something other than a royal flush was dealt (i.e., possibility (ii)) is clearly to acquire less information than to learn that a royal flush was dealt (i.e., possibility (i)). Yet if our measure of information is simply an enumeration of excluded possibilities, then the same numerical value must be assigned in both instances since in both instances a single possibility is excluded.

It follows, therefore, that how we measure information needs to be independent of whatever procedure is used to individuate the possibilities under consideration. And the way to do this is not simply to count possibilities, but to assign probabilities to these possibilities. For a thoroughly shuffled deck of cards, the probability of being dealt a royal flush (i.e., possibility (i)) is approximately .000002 whereas the probability of being dealt anything other than a royal flush (i.e., possibility (ii)) is approximately .999998. Probabilities by themselves, however, are not information measures. Although probabilities properly distinguish possibilities according to the information they contain, nonetheless probabilities remain an inconvenient way of measuring information. There are two reasons for this. First, the scaling and directionality of the numbers assigned by probabilities need to be recalibrated. We are clearly acquiring more information when we learn someone was dealt a royal flush than when we learn someone wasn't dealt a royal flush. And yet the probability of being dealt a royal flush (i.e., .000002) is minuscule compared to the probability of being dealt something other than a royal flush (i.e., .999998). Smaller probabilities signify more information, not less.

The second reason probabilities are inconvenient for measuring information is that they are multiplicative rather than additive. If I learn that Alice obtained a royal flush playing poker at Caesar's Palace and that Bob obtained a royal flush playing poker at the Mirage, the probability that both Alice and Bob were dealt royal flushes is the product of the individual probabilities. Nonetheless, it is convenient for information to be measured additively so that the measure of information assigned to Alice and Bob jointly being dealt royal flushes equals the measure of information assigned to Alice being dealt a royal flush plus the measure of information assigned to Bob being dealt a royal flush.

Now there is an obvious way of transforming probabilities which circumvents both these difficulties, and that is to apply a negative logarithm to the probabilities. Applying a negative logarithm assigns the more information to the less probability and, because the logarithm of a product is the sum of the logarithms, transforms multiplicative probability measures into additive information measures. What's more, in deference to communication theorists it is customary to use the logarithm to the base 2. The rationale for this choice of logarithmic base is as follows. The most convenient way for communication theorists to measure information is in bits. Any message sent across a communication channel can be viewed as a string of 0's and 1's. For instance, the ASCII code represents each character on a typewriter as a string of 0's and 1's (seven bits per character, commonly stored as eight), with whole words and sentences in turn represented as strings of such character strings. In like manner all communication may be reduced to the transmission of sequences of 0's and 1's. Given this reduction, the obvious way for communication theorists to measure information is in number of bits transmitted across a communication channel. And since the negative logarithm to the base 2 of a probability corresponds to the average number of bits needed to identify an event of that probability, the logarithm to the base 2 is the canonical logarithm for communication theorists.

We may now summarize how information is measured as follows. Given a collection of possibilities, and probabilities assigned to those possibilities, the measure of information inherent in one of those possibilities is the negative logarithm to the base 2 of the probability of that possibility. This is harder to say than to conceive, and is really quite straightforward. One terminological point, however, is worth making. As a purely formal object, the information measure I have just described is a complexity measure (cf. Dembski, 1996, ch. 4). It is therefore appropriate to speak of the "complexity of information" and say that the complexity of information increases as the associated information measure increases (and, correspondingly, as the associated probability measure decreases). This notion of complexity is important to biology since it is not just the origin of information that stands in question, but the origin of complex information.
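The measure just summarized can be illustrated with a minimal Python sketch. The function name information_bits and the variable names are illustrative assumptions, not part of the original exposition; the royal-flush probability is computed from the standard count of 4 royal flushes among all five-card hands, which agrees with the approximate figure of .000002 used above.

```python
import math
from math import comb


def information_bits(probability: float) -> float:
    """Return the negative base-2 logarithm of a probability, in bits."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log2(probability)


# Four royal flushes (one per suit) among all C(52, 5) five-card hands.
P_ROYAL_FLUSH = 4 / comb(52, 5)
P_OTHER = 1 - P_ROYAL_FLUSH

# Smaller probability, more information: roughly 19.3 bits versus nearly 0.
bits_royal = information_bits(P_ROYAL_FLUSH)
bits_other = information_bits(P_OTHER)

# Independent events multiply in probability but add in information
# (Alice's royal flush and Bob's royal flush, dealt independently).
bits_two_royals = information_bits(P_ROYAL_FLUSH * P_ROYAL_FLUSH)
```

Here bits_two_royals equals twice bits_royal up to floating-point rounding, which is the additivity property the negative logarithm was chosen to provide.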




Complex Specified Information

With a means of measuring information in hand, we turn now to the distinction between specified and unspecified information. This is a vast and complicated topic whose full elucidation is beyond the scope of this paper, requiring the formulation of a substantial technical apparatus involving both probability and complexity theory. All the painstaking details about specification may be found in my monograph The Design Inference, which I expect to have published next year. Nonetheless, in what follows I shall try to make this distinction intelligible, and offer some hints on how to make it rigorous.

For an intuitive grasp of the difference between specified and unspecified information, consider the following example. Suppose an archer stands 50 meters from a large blank wall with bow and arrow in hand. The wall, let us say, is sufficiently large that the archer cannot help but hit it. Consider now two alternative scenarios. In the first scenario the archer simply shoots at the wall. In the second scenario the archer first paints a target on the wall, and then shoots at the wall, squarely hitting the target in the bull's-eye. Let us suppose that in both scenarios the precise place on the wall where the arrow lands is identical. In both scenarios the arrow might have landed anywhere on the wall. What's more, any place where it might land is highly improbable. It follows that in both scenarios highly complex information is actualized. Yet the conclusions we draw from each scenario are very different. In the first scenario we can conclude absolutely nothing about the archer's ability as an archer. On the other hand, in the second scenario we have evidence of the archer's skill.

The obvious difference between these two scenarios is of course that in the first the information follows no pattern whereas in the second it does. Now the information that tends to interest us as rational inquirers is not the actualization of arbitrary possibilities corresponding to no patterns, but rather the actualization of circumscribed possibilities corresponding to patterns. And indeed, when we speak of information in common parlance, we typically do not mean the actualization of an arbitrary possibility so much as the actualization of a possibility that corresponds to a pattern. In fact, in the study of information, patterns assume so great a significance that the patterns themselves become identified as information. The patterns that represent information, be they linguistic, pictorial, or mathematical, are in common parlance what we mean by information. Yet in the service of clarity it is useful to distinguish information qua the actualization of a possibility from its representation qua some pattern.

All the same, information that corresponds to a pattern still isn't quite enough to constitute specified information. The problem is that patterns can be concocted after the fact so that instead of helping us make sense of information, they are merely read off already actualized information. To see this, consider a third scenario in which an archer shoots at a wall. As before, we suppose the archer stands 50 meters from a large blank wall with bow and arrow in hand, the wall being so large that the archer cannot help but hit it. And as in the first scenario, the archer shoots at the wall while it is still blank. But this time suppose that after having shot the arrow, and finding the arrow stuck in the wall, the archer paints a target around the arrow so that the arrow sticks squarely in the bull's-eye. Let us further suppose that the precise place on the wall where the arrow lands in this scenario is identical with where it landed in the first two scenarios. Since any place where the arrow might land is highly improbable, in this as in the other scenarios highly complex information has been actualized. What's more, since the information corresponds to a pattern, we can even say that in this third scenario highly complex patterned information has been actualized. Nevertheless, we would be wrong to say that highly complex specified information has been actualized. Of the three scenarios, only the information in the second scenario is specified. In that scenario, by first painting the target and then shooting the arrow, the pattern is given independently of the information. On the other hand, in this, the third scenario, by first shooting the arrow and then painting the target around it, the pattern is merely read off the information.

Specified information is always patterned information, but patterned information is not always specified information. For specified information not just any pattern will do. We may therefore distinguish between "good" patterns and "bad" patterns. The "good" patterns will henceforth be called specifications. Specifications are the independently given patterns that are not simply read off information. By contrast, the "bad" patterns will be called fabrications. Fabrications are the post hoc patterns that are simply read off information.

Unlike specifications, fabrications are wholly uninformative. We are no better off with a fabrication than without one. This is clear from comparing the first and third scenarios. Whether an arrow lands on a blank wall and the wall stays blank (as in the first scenario), or an arrow lands on a blank wall and a target is then painted around the arrow (as in the third scenario), any conclusions we draw about the arrow's flight remain the same. In either case chance is as good an explanation as any for the arrow's flight. The fact that the target fixes a pattern in the third scenario makes no difference since this pattern is constructed only after the arrow has flown and landed. Only when the pattern qua target is given in advance of the arrow being shot does a hypothesis other than chance come into play. Thus only in the second scenario does it make sense to ask whether we are dealing with a skilled archer. Only in the second scenario does the pattern constitute a specification. In the third scenario the pattern constitutes a mere fabrication.

The distinction between specified and unspecified information may now be defined as follows: the actualization of a possibility (i.e., information) is specified if independently of the possibility's actualization, the possibility is identifiable via a pattern. If not, then the information is unspecified. Note that this definition implies an asymmetry between specified and unspecified information: specified information cannot become unspecified information, though unspecified information may become specified information. Unspecified information need not remain unspecified, but may become specified as our background knowledge increases. For instance, a cryptographic transmission whose cryptosystem we have yet to break will constitute unspecified information. Yet as soon as we break the cryptosystem, the cryptographic transmission becomes specified information.

What is it for a possibility to be identifiable via an independently given pattern? A full exposition of specification requires a detailed answer to this question. Unfortunately, such an exposition is beyond the scope of this paper. The key conceptual difficulty here is to characterize the independence condition that obtains between patterns and information. This independence condition in turn decomposes into two conditions: (1) a condition of stochastic conditional independence between the information in question and certain relevant background knowledge; and (2) a tractability condition whereby the pattern in question is constructible via the aforementioned background knowledge. Although these conditions make good intuitive sense, they are not easily formalized. For the details refer to my monograph The Design Inference.

If formalizing what it means for a pattern to be given independently of a possibility is difficult, determining in practice whether a pattern is given independently of a possibility is much easier. If the pattern is given prior to the possibility being actualized (as in the second scenario above, where the target was painted before the arrow was shot), then the pattern is automatically independent of the possibility, and we are dealing with specified information. Patterns given prior to the actualization of a possibility are just the rejection regions of statistics. There is a well-established statistical theory that describes such patterns and their use in probabilistic reasoning. These are clearly specifications since having been given prior to the actualization of some possibility, they have already been identified, and thus are identifiable independently of the possibility being actualized.

Many of the interesting cases of specified information, however, are those in which the pattern is given after a possibility has been actualized. This is certainly the case with the origin of life: life originates first and only afterwards do pattern-forming rational agents (like ourselves) enter the scene. It remains the case, however, that a pattern corresponding to a possibility, though formulated after the possibility has been actualized, can constitute a specification. Certainly this was not the case in the third scenario above where the target was painted around the arrow only after it hit the wall. But consider the following example. Alice and Bob are celebrating their fiftieth wedding anniversary. Their six children all show up bearing presents. Each present is part of a matching set of china. There is no duplication of presents, and together the presents form a complete set of china. Suppose Alice and Bob were satisfied with their old set of china, and had no inkling prior to opening their presents that they might expect a new set of china. Alice and Bob are therefore without a relevant pattern to which to refer their presents prior to actually receiving the presents from their children. Nevertheless, the pattern they explicitly formulate only after receiving the presents could be formed independently of receiving the presents (after all, their colluding children formed just such a pattern prior to delivering their presents; so too, the china manufacturer formed this pattern to construct the china in the first place). This pattern therefore constitutes a specification.

But what about the origin of life? Is the origin of life specified? If so, to what patterns does life correspond, and how are these patterns given independently of life's origin? As was just pointed out, pattern-forming rational agents like ourselves don't enter the scene till after life originates. Nonetheless, there are functional patterns to which life corresponds, and which are given independently of the actual living systems. An organism is a functional system comprising many functional subsystems. The functionality of organisms can be cashed out in any number of ways. Arno Wouters (1995) cashes it out globally in terms of viability of whole organisms. Michael Behe (1996) cashes it out in terms of the irreducible complexity and minimal function of biochemical systems. Even the staunch Darwinist Richard Dawkins will admit that life is specified functionally, cashing out the functionality of organisms in terms of genetic reproduction. Thus Dawkins (1987, p. 9) will write: "Complicated things have some quality, specifiable in advance, that is highly unlikely to have been acquired by random chance alone. In the case of living things, the quality that is specified in advance is . . . the ability to propagate genes in reproduction."

Life is specified. Life is also complex. The origin of life is the origin of complex specified information. This then, suitably reformulated, is Manfred Eigen's problem: how to explain the origin of complex specified information. Complex specified information, or CSI for short, is what all the fuss over information has been about in recent years, not just in biology, but within science more generally. It is CSI that the various anthropic principles are trying to explain when they account for the fine-tuning of the universe (cf. Barrow and Tipler, 1986). It is CSI that David Bohm's quantum potentials are extracting when they scour the microworld for what Bohm calls "active information" (cf. Bohm, 1993, pp. 35-38). It is CSI that enables Maxwell's demon to outsmart a thermodynamic system tending towards thermal equilibrium (cf. Landauer, 1991, p. 26). It is CSI that David Chalmers posits in attempting to explain human consciousness (cf. Chalmers, 1996, ch. 8). It is CSI that the mathematician Keith Devlin (1991, p. 1) intends when he writes: "That there is such a thing as information cannot be disputed. . . . After all, our very lives depend upon it, upon its gathering, storage, manipulation, transmission, security, and so on. Huge amounts of money change hands in exchange for information. People talk about it all the time. Lives are lost in its pursuit. Vast commercial empires are created in order to manufacture equipment to handle it. Surely then it is there."

Nor is CSI confined to the domain of science. CSI is indispensable in our everyday lives. The 16-digit number on your VISA card is an example of CSI. The complexity of this number ensures that a would-be thief cannot randomly pick a number and have it turn out to be a valid VISA card number. What's more, the specification of this number ensures that it is your number, and not anyone else's. Even your phone number constitutes CSI. As with the VISA card number, the complexity ensures that this number won't be dialed randomly (at least not too often), and the specification ensures that this number is yours and yours only. All the numbers on our bills, credit slips, and purchase orders represent CSI. CSI makes the world go round. Consequently, CSI is also a fertile field for criminality. CSI is what motivated the villainous Michael Douglas character in the movie Wall Street to lie, cheat, and steal. CSI's total and absolute control was the objective of the monomaniacal Ben Kingsley character in the movie Sneakers. CSI is the artifact of interest in most techno-thrillers. Ours is an information age, and the information that excites us most is CSI.




The Law of Conservation of Information

With this characterization of CSI in hand, I want now to return to Manfred Eigen's central problem: the origin of CSI. Where does CSI come from, and where is CSI incapable of coming from? According to Eigen, CSI comes from algorithms and natural laws. To recall Eigen's dictum: "Our task is to find an algorithm, a natural law that leads to the origin of [complex specified] information." The only question for Eigen is which algorithms and natural laws explain the origin of CSI. The logically prior question of whether algorithms and natural laws are even in-principle capable of explaining the origin of CSI is not one he properly considers. And yet it is a question whose answer vitiates Eigen's entire project. Algorithms and natural laws are in-principle incapable of explaining the origin of information. To be sure, algorithms and natural laws can explain the flow of information. Indeed, algorithms and natural laws are ideally suited for transmitting already existing information. What they cannot do, however, is originate information.

The easiest way to see this is mathematically. From a mathematical point of view algorithms and natural laws are just functions, that is, relations between two sets that associate with every member of one set (called the domain) one, and only one, member of the other set (called the range). As such, the functional relationship is fully deterministic: given an element in the domain, the function determines a unique element in the range. For algorithms the domain comprises the various possible input data, and the range the various possible output data. For natural laws the domain comprises the various possible initial and boundary conditions, and the range the various possible states at subsequent times t. Now suppose we had some CSI j, and a function (qua algorithm or natural law) f that, to quote Eigen again, led to the origin of the [complex specified] information j. This would mean that some element in the domain of f, call it i, when acted on by f, yielded the output j. But this hardly explains the origin of the information j. One problem has been solved by creating another, for now the origin of i must be explained.

Worse yet, the newly created problem is no easier than the one we started with. Functional relationships at best preserve what information is already there, or else degrade it, but they can never add to it. Thus however much information resides in j will be contained in any i that via the function f maps onto j. What's more, if j is specified, then the inverse image under the function f will also be specified (in particular, since i maps onto j via f, i is in this inverse image). In short, if j constitutes complex specified information and f is a function that maps i onto j, then i constitutes specified information at least as complex as j. Thus instead of explaining the origin of CSI, algorithms and natural laws shift the problem elsewhere to a place where the origin of CSI will be at least as difficult to explain.

It is vital to realize that functions can only make the information problem worse. Suppose, for instance, you look at the U.S. Statistical Abstract and find that the average income of a U.S. citizen is so-much-and-so-much. How did this item of information originate? Well, the census bureau had to contact all the U.S. citizens, record their individual incomes, add the incomes all together, and divide by the number of U.S. citizens. To take an average is thus to apply a function: given the input data (all the individual U.S. incomes), the output data is uniquely determined. But more so, to take an average is also to compress data. The information inherent in the record of all individual incomes far exceeds the information inherent in the corresponding average. Taking an average is a standard statistical technique for compressing data. In an information age we are inundated with information. Thus frequently when we look at information, we look at information whose complexity, as a service to the information seeker, has already been drastically compressed.
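The sense in which averaging compresses data can be seen in a small Python sketch. The income figures and variable names below are made up purely for illustration; the point is only that distinct records can map to the same average, so the average alone cannot recover the record it came from.

```python
from statistics import mean

# Two hypothetical income records (illustrative numbers, not census data).
incomes_a = [20_000, 40_000, 60_000]
incomes_b = [40_000, 40_000, 40_000]

# Averaging is a function: each record determines exactly one output.
average_a = mean(incomes_a)
average_b = mean(incomes_b)

# The function is many-to-one: different inputs, identical output.
same_average = average_a == average_b
different_records = incomes_a != incomes_b
```

Both records average to the same figure even though the records differ, which is what it means to say that detail present in the input is absent from the output.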

There is one subtlety we need now to consider, and it is the one on which not just Manfred Eigen, but also Ilya Prigogine, Stuart Kauffman, and indeed the entire Santa Fe Institute group are pinning their hopes. I have just argued that when a function acts to yield information, what the function acts upon has at least as much information as what the function yields. This argument, however, treats functions as mere conduits of information, and does not take seriously the possibility that functions might actually add information. I gave the example of taking an average whereby data is compressed and information is lost. But consider the function which maps library call numbers to their corresponding books. Clearly, there is less information in the call numbers than in the books. Thus here we have a function that is adding information. What's more, it is adding information because the information is embedded in the function itself.

Although this observation seems to undermine my previous argument, in fact it leaves the previous argument virtually unchanged. The point is that instead of the function f now merely serving as a conduit taking information i and yielding information j, the information in f must now itself be taken into account. The way to do this is to employ the universal composition function U, which to an ordered information-function pair (i,f) assigns the information obtained by applying f to i, in this case j. Thus U(i,f) = f(i) = j. Now unlike f, which may well incorporate information, U, the universal composition function, incorporates no information of its own, but serves merely as a conduit for information. By simply taking ordered pairs, and treating the second element as a function applied to the first, U introduces no information of its own. U adds no information. Note that in the case of algorithms U is just a universal Turing machine. The form of my original argument is therefore unchanged: the information j arises by applying U (cf. f in the original argument) to the information (i,f) (cf. i in the original argument). Just as in computer science the distinction between data and programs is not hard and fast, so the distinction between functions and information is not hard and fast. We can thus treat the ordered pair (i,f) as information which via the universal composition function U yields the information j. And now it is clear that the information inherent in (i,f) exceeds that in j. Like a bulge under a rug, the information problem can be shifted around, but it does not go away.
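The universal composition function U(i,f) = f(i) can be written down directly as a minimal Python sketch. The catalog contents, the call number, and the names CATALOG and lookup are hypothetical stand-ins for the library example; only the definition of U itself follows the text.

```python
from typing import Callable, TypeVar

I = TypeVar("I")
J = TypeVar("J")


def U(i: I, f: Callable[[I], J]) -> J:
    """Universal composition: apply the second element of the pair to the first."""
    return f(i)


# Hypothetical library catalog: whatever the lookup function "adds"
# is already stored inside the function's own data.
CATALOG = {
    "QA76.9": "full text of a book on computing",
    "BD215": "full text of a book on epistemology",
}


def lookup(call_number: str) -> str:
    """Map a call number to its book (the function f of the example)."""
    return CATALOG[call_number]


# j arises by applying U to the pair (i, f).
j = U("QA76.9", lookup)
```

U itself contains nothing specific to books or call numbers; it works equally well for any pair, for example U(3, lambda x: x + 1).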

This argument, by employing the universal composition function, is perfectly general. In particular, it answers the attempt by complexity-theorists to account for the origin of information in terms of dynamical systems (for popular accounts of this enterprise see Levy 1992 and Waldrop 1992). Complexity-theorists, especially the Santa Fe Institute group, continue to hope that information can be gotten on the cheap. "Look at all those amazing fractal patterns," we are told. "The incredibly intricate Mandelbrot set is generated by so modest a complex function as h(z) = z^2 + c." To state the matter in this way, however, is misleading. The function h(z) = z^2 + c is simple enough, and even simpler to write down. And granted, it is the crucial element in constructing a graphic depiction of the Mandelbrot set. But that is the point: It is the graphic depiction of the Mandelbrot set that has to be explained, not its existence as an abstract mathematical object. And this graphic depiction has to be constructed.

Pixels on a computer screen have to be assigned coordinates representing complex numbers. The function h(z) = z^2 + c has to be iterated with respect to those coordinates. The trajectory of those iterations needs to be tracked to see if the trajectory stays locally bounded or heads off towards infinity. Given these trajectories, a color has to be assigned to the pixel, black if the trajectory stays locally bounded, white if it heads off to infinity. All of this must be programmed. All of this is information far exceeding the information inherent in simply writing down "h(z) = z^2 + c." The function h(z) = z^2 + c is never the function that produces the pretty graphic depictions of the Mandelbrot set we see in books on fractals. Any function that produces a graphic depiction of the Mandelbrot set will be a complicated algorithm employing a complicated set of input data. Any such algorithm f applied to a data set i can be conjoined as an ordered pair (i,f), and then evaluated by the universal composition function U to produce a graphic depiction of the Mandelbrot set j. But by itself the function h(z) = z^2 + c is too information-poor to produce this graphic depiction of the Mandelbrot set j. Once we examine the precise informational antecedents to j, the illusion that we have generated information for nothing disappears.
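The rendering steps just listed can be sketched in Python as a text-mode picture. Everything beyond the formula h(z) = z^2 + c is an assumption made for illustration: the function names escapes and render, the grid size, the viewing window, the iteration cap, the escape radius of 2, and the convention of starting each trajectory at z = 0 with c taken from the pixel's coordinates. A finite iteration cap can only approximate the test of whether a trajectory stays bounded.

```python
def escapes(c: complex, max_iter: int = 50, bound: float = 2.0) -> bool:
    """Iterate z -> z*z + c from z = 0; report whether |z| ever exceeds bound."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > bound:
            return True
    return False


def render(width: int = 60, height: int = 24,
           re_min: float = -2.0, re_max: float = 1.0,
           im_min: float = -1.2, im_max: float = 1.2,
           max_iter: int = 50) -> list[str]:
    """Assign each character cell a complex coordinate and a 'color'."""
    rows = []
    for row in range(height):
        # Map the row index to an imaginary part (top row = im_max).
        im = im_max - (im_max - im_min) * row / (height - 1)
        cells = []
        for col in range(width):
            # Map the column index to a real part (left column = re_min).
            re = re_min + (re_max - re_min) * col / (width - 1)
            # '#' plays the role of black (bounded), ' ' of white (escaping).
            cells.append(" " if escapes(complex(re, im), max_iter) else "#")
        rows.append("".join(cells))
    return rows


picture = render()
```

Calling print("\n".join(picture)) displays the result. The coordinate grid, the window, the iteration cap, and the coloring rule are all inputs that had to be chosen in addition to the formula.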

The origin of CSI simply cannot be explained as the output of a function, be it an algorithm, a natural law, or whatever. The root problem here is that functions are deterministic, and thus cannot yield contingency. Recall that information becomes realizable only when a multiplicity of distinct possibilities obtains, any one of which might actually happen. The problem with functions is that they invariably yield only a single live possibility. Take a computer algorithm that performs addition. Let us say the algorithm has a correctness proof, so that it performs its additions correctly. Given the input data 2+2, can the algorithm output anything other than 4? Algorithms, and functions more generally, are wholly deterministic. They allow for no contingency, and thus can generate no information. At best functions can shift information around, or lose it, as when data gets compressed. What they cannot do is produce contingency. And without contingency they cannot generate information.

If not by means of functions, how then does contingency arise? Two, and only two, answers are possible here. Either the contingency is a blind, purposeless contingency, which is chance; or it is a guided, purposeful contingency, which is intelligent causation. We shall return to intelligent causation in due course, but for now let us examine whether chance is capable of generating CSI. First notice that pure chance, entirely unsupplemented and left to its own devices, is incapable of generating CSI. Chance can generate complex unspecified information, and chance can generate non-complex specified information. What chance cannot generate is information that is jointly complex and specified.

To see this, consider again our archer friend who fires arrows at a large blank wall. The archer, even if driven purely by chance, is perfectly capable of generating complex unspecified information: the precise place where the arrow hits a large blank wall signifies a highly improbable unspecified event, instancing complex unspecified information (recall that high probability corresponds to low complexity whereas low probability, i.e., high improbability, corresponds to high complexity). Alternatively, if a target is painted on the wall, but the target is so large that the bull's-eye takes up half the area of the wall, then the archer, even if driven purely by chance, will be quite likely to hit the bull's-eye, thereby generating non-complex specified information: hitting the bull's-eye signifies a specified high-probability event, instancing non-complex specified information. What an archer driven purely by chance cannot do is, having painted a minuscule target on the wall, hit the bull's-eye, thereby generating information that is both complex and specified: hitting the bull's-eye of a minuscule target signifies a highly improbable specified event, instancing complex specified information (CSI).

But can't someone simply by chance let fly an arrow and hit a bull's-eye? Not if the target is sufficiently small. At some point the improbabilities become too vast and the specifications too tight for chance to be taken seriously. Just where this point is first reached can be debated, but that there is a probabilistic cut-off beyond which chance becomes an unacceptable explanation is beyond doubt. The universe will experience heat death before random typing at a keyboard produces a Shakespearean sonnet. The French mathematician Emile Borel (1962, p. 28) proposed $10^{-50}$ as a universal probability bound below which chance could definitely be precluded, i.e., any specified event as improbable as this could not be attributed to chance. Borel based his universal probability bound on cosmological considerations, taking into account the opportunities to repeat and observe events through the history and expanse of the universe. Borel's $10^{-50}$ probability bound translates into roughly 170 bits of information. I have proposed a more stringent universal probability bound of $10^{-150}$ based on the number of elementary particles in the universe, the Planck time, and the duration of the universe until its heat death (see Dembski, 1996, ch. 6). A probability bound of $10^{-150}$ translates into roughly 500 bits of information. The bound I propose is more securely justified than Borel's. Given a universal probability bound of $10^{-150}$ we therefore refuse to attribute to chance specified information with a complexity of 500 or more bits. I have yet to encounter CSI with a complexity greater than the 500 bits for which chance is an adequate explanation.
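The conversion from a probability bound to bits used here is the negative base-2 logarithm of the probability. The following sketch simply checks that arithmetic; the function name `bits_for_power_of_ten` is an illustrative assumption. The exact values are about 166 and 498 bits, which the text rounds to 170 and 500.

```python
import math


def bits_for_power_of_ten(exponent: int) -> float:
    """Information, in bits, of an event with probability 10**(-exponent).

    -log2(10**(-k)) = k * log2(10); working with the exponent avoids
    any floating-point underflow for very small probabilities.
    """
    return exponent * math.log2(10)


if __name__ == "__main__":
    print(round(bits_for_power_of_ten(50), 1))   # Borel's bound: 166.1
    print(round(bits_for_power_of_ten(150), 1))  # the 10^-150 bound: 498.3
```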

Biologists by and large do not dispute this claim. Most are agreed that pure chance (the Epicurean hypothesis, as Hume called it) is not an adequate explanation for CSI. Jacques Monod (1972) is one of the few exceptions, arguing that the origin of life, though vastly improbable, can nonetheless be attributed to chance because of a selection effect. Just as the winner of a lottery is shocked at winning, so we are shocked to have evolved. But the lottery was bound to have a winner, and so too something was bound to have evolved. Something vastly improbable was bound to happen, and so, the fact that it happened to us (i.e., that we were selected, hence the name selection effect) does not preclude chance. This is Monod's argument and it is fallacious. It has been refuted by the philosophers John Earman, William Craig, and Richard Swinburne. It has also been refuted by the biologists Wolfgang Stegmüller, Bernd-Olaf Küppers, and Hubert Yockey. Swinburne's refutation is perhaps the most memorable (Swinburne, 1979, p. 138):

Suppose that a madman kidnaps a victim and shuts him in a room with a card-shuffling machine. The machine shuffles ten packs of cards simultaneously and then draws a card from each pack and exhibits simultaneously the ten cards. The kidnapper tells the victim that he will shortly set the machine to work and it will exhibit its first draw, but that unless the draw consists of an ace of hearts from each pack, the machine will simultaneously set off an explosion which will kill the victim, in consequence of which he will not see which cards the machine drew. The machine is then set to work, and to the amazement and relief of the victim the machine exhibits an ace of hearts drawn from each pack. The victim thinks that this extraordinary fact needs an explanation in terms of the machine having been rigged in some way. But the kidnapper, who now reappears, casts doubt on this suggestion. "It is hardly surprising," he says, "that the machine [drew] only aces of hearts. You could not possibly see anything else. For you would not be here to see anything at all, if any other cards had been drawn." But of course the victim is right and the kidnapper is wrong. There is indeed something extraordinary in need of explanation in ten aces of hearts being drawn. The fact that this peculiar order is a necessary condition of the draw being perceived at all makes what is perceived no less extraordinary and in need of explanation.

Selection effects do nothing to render chance an adequate explanation of complex specified information. For a detailed treatment of selection effects and their failure to account for CSI, see Dembski (1996, sec. 6.3).

Most biologists then reject pure chance as an adequate explanation of CSI. The problem here is not simply one of faulty statistical reasoning. Besides flying in the face of every canon of sound statistical reasoning, pure chance is scientifically unsatisfying as an explanation of CSI. To explain CSI in terms of pure chance is no more instructive than pleading ignorance or proclaiming CSI a mystery. It is one thing to explain the occurrence of heads on a coin toss by appealing to chance. It is quite another, as Küppers (1990, p. 59) points out, to follow Monod and take the view that "the specific sequence of the nucleotides in the DNA molecule of the first organism came about by a purely random process in the early history of the earth." CSI cries out for explanation, and pure chance won't do it. Richard Dawkins (1987, pp. 139, 145-146) makes this point eloquently:

We can accept a certain amount of luck in our explanations, but not too much. . . . In our theory of how we came to exist, we are allowed to postulate a certain ration of luck. This ration has, as its upper limit, the number of eligible planets in the universe. . . . We [therefore] have at our disposal, if we want to use it, odds of 1 in 100 billion billion as an upper limit (or 1 in however many available planets we think there are) to spend in our theory of the origin of life. This is the maximum amount of luck we are allowed to postulate in our theory. Suppose we want to suggest, for instance, that life began when both DNA and its protein-based replication machinery spontaneously chanced to come into existence. We can allow ourselves the luxury of such an extravagant theory, provided that the odds against this coincidence occurring on a planet do not exceed 100 billion billion to one.

Dawkins is right. We can allow our scientific theorizing only so much luck. After that we degenerate into handwaving and mystery. A probability bound of $10^{-150}$, or a corresponding complexity bound of 500 bits of information, sets a conservative limit on the amount of luck we can allow ourselves (certainly more conservative than the one Dawkins was just now alluding to).

We may summarize our findings up to this point as follows: (1) Chance generates contingency, but not complex specified information. (2) Functions (e.g., algorithms and natural laws) generate neither contingency, nor information, much less complex specified information. (3) At best functions transmit already present information. Given these three findings, it seems intuitively obvious that no chance-function combination is going to generate information either. After all, functions transmit what they are given, and whatever chance gives a function is not complex specified information. Ergo, chance and functions working in tandem cannot generate information. This intuition is of course exactly right, and I shall provide a theoretical justification for it momentarily. Nevertheless, the sense that functions can sift chance and thereby generate CSI is deep-seated in the scientific community. Trial and error is the basis for all sorts of probabilistic algorithms (e.g., genetic algorithms), and what is trial and error but the sifting of chance by means of a function? What's more, the very Darwinian mechanism of mutation and natural selection is a chance-function combination, in which the variability of the organism provides the chance component, and selection pressure from the environment provides the function component.

The theoretical justification for the inability of chance and functions working in tandem to generate information is virtually the same as the theoretical justification given earlier for the inability of functions by themselves to generate information. Instead of considering a deterministic function f(i) in one variable, we now consider an indeterministic function f(i,w) in two variables where the first variable signifies the object on which the function acts, and the second the randomizing component. We then define the universal composition function U which inputs the object-chance-function ordered triple (i,w,f) and outputs f(i,w) = j, i.e., U(i,w,f) = f(i,w) = j. As in the deterministic case, the universal composition function U incorporates no information of its own, but serves merely as a conduit for information. U adds no information. The formalism just described for combining chance and functions is perfectly general, and accommodates everything from Darwin's mutation-selection mechanism to the probabilistic algorithms of computer science (genetic algorithms being a case in point).

Now suppose we had some CSI j, and an indeterministic function (i.e., chance-function combination) f that, to quote Eigen again, led to the origin of the CSI j. The origin of the CSI j can then be broken into two stages. In the first stage, a chance outcome w occurs. Once w occurs and is fixed, the function f becomes deterministic, i.e., f becomes the function in one variable $f(\cdot, w) = f_w(\cdot)$, w now being treated as a fixed parameter of the function f. In the second stage, the parameterized deterministic function $f_w(\cdot)$ gets applied to some element in its domain, call it i, yielding the item of interest, the CSI j. From this it is clear that neither of these stages can generate CSI. The first stage involves only chance, and therefore, as was argued earlier, cannot generate CSI. The second stage involves no chance, but only a deterministic function, and therefore, as was argued earlier, cannot generate CSI either. Thus at no point in the transition from w to $f_w(\cdot)$ to $f_w(i) = j$ is CSI created. Whatever CSI is inherent in j is therefore already inherent in the indeterministic function f together with the nonrandom element in the domain of f, namely, i. This argument is valid and holds universally. Just as chance or functions left to themselves individually cannot purchase CSI, so too their joint action cannot purchase CSI either.
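The two-stage decomposition can be pictured in code. The sketch below is only an illustration of the formalism: it shows the mechanics of fixing w and obtaining a deterministic one-variable function, and it does not by itself establish the informational claim. The particular function `mutate`, the string it acts on, and the way w is drawn are all hypothetical choices made for the example.

```python
import random
from functools import partial


def U(i, w, f):
    """Universal composition function: U(i, w, f) = f(i, w)."""
    return f(i, w)


def mutate(i, w):
    """A hypothetical indeterministic function f(i, w): overwrite one
    character of the string i. The chance outcome w is a
    (position, letter) pair."""
    position, letter = w
    return i[:position] + letter + i[position + 1:]


def draw_chance_outcome(i, rng):
    """Stage 1: chance alone selects w."""
    return (rng.randrange(len(i)), rng.choice("ACGT"))


if __name__ == "__main__":
    rng = random.Random(0)
    i = "ACGTACGT"
    w = draw_chance_outcome(i, rng)   # stage 1: w occurs and is fixed
    f_w = partial(mutate, w=w)        # f(., w) = f_w(.), now deterministic
    j = f_w(i)                        # stage 2: apply f_w to i
    assert j == U(i, w, mutate)       # same result as the composition U
    assert f_w(i) == f_w(i)           # repeated application gives the same j
    print(w, j)
```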

This result, that neither chance nor functions nor some combination of the two can generate CSI, I call the Law of Conservation of Information, or LCI for short. Though formulated at a high level of mathematical abstraction, LCI has many profound implications for science. Among its immediate corollaries are the following: (1) The CSI within a system closed to outside information always remains constant or decreases. (2) If CSI increases within a system, then CSI was added exogenously. (3) CSI cannot be generated spontaneously, originate endogenously, or organize itself. (4) To explain the CSI within a system is to appeal to a system whose CSI is equal or greater in complexity still (in particular, reductive explanations of CSI are never adequate).




Applying the Theory to Evolutionary Biology

Up to this point I have sketched a theory of complex specified information, and concluded with a general law characterizing the origin of complex specified information, to wit, the Law of Conservation of Information. I want next to apply this theory to evolutionary biology. Before doing so, however, it will be convenient to provide a synonym for the term "function." As I have used the term, function signifies a certain law-like mathematical relation between two sets. In the sequel it will therefore be convenient to use the word "law" to signify functions. If we do this, the Law of Conservation of Information has the following perspicuous formulation: Neither law nor chance nor some combination of the two can generate complex specified information. The reference to functions was useful so long as their mathematical properties were being explicitly cited. But continued reference to them, especially when juxtaposed with chance, will henceforth tend to obscure rather than clarify. Thus in particular we shall refer to deterministic laws as functions of the form f(i) = j and indeterministic laws as chance-function combinations of the form f(i,w) = j with random component w.

In applying the theory of information here developed to evolutionary biology, let us begin by noting that nothing in this theory so far undermines the naturalistic accounts of evolution currently in vogue. All that has been shown so far is that CSI is not a free lunch in the sense that law and chance together cannot generate CSI. But law and chance can take already existing CSI and shift it around. And there is nothing to prevent CSI from being abundant in the universe, and thus to prevent law and chance from expressing CSI in the origin and development of biological systems. With Hubert Yockey (1992, p. 335) we could therefore say that CSI, and by implication life, is axiomatic, and leave it at that. Like the principle of rationality which according to the ancient Stoics pervaded the universe, we could simply treat CSI as a given.

Although this move might be philosophically justified, it remains scientifically unsatisfying. As scientists we want to know how the CSI which supposedly is so abundant in the universe got itself into the organisms we see around us. In reference to the origin of life, we want to know the informational pathway that takes the CSI inherent in a lifeless universe, and translates it into a protobiont. In reference to the development of life, we want to know the informational pathway that takes the CSI inherent in an already existing organism plus its environment, and translates this CSI into an organism of still greater complexity. Even if the origin of CSI admits no scientific explanation, its flow surely does. How then does CSI flow into and out of biological systems?

The answer to this question, at least in broad terms, is clear: The CSI inherent in an organism consists of the CSI acquired at birth together with whatever CSI it acquires during the course of its life. The CSI acquired at birth derives from inheritance with modification (i.e., the CSI acquired at birth is inherited from the parent(s) and consists of the CSI inherent in the parent(s) as modified by chance). The CSI acquired after birth consists of selection (i.e., the environmental pressure that selects some organisms to reproduce and eliminates others before they can reproduce) along with infusion (i.e., the direct introduction of novel information from outside the organism). The Darwinian mechanism admits selection and inheritance with modification, but proscribes infusion. The Lamarckian mechanism, on the other hand, focuses mainly on infusion. Certainly infusion as Lamarck conceived it has largely been discredited. Nevertheless, there is good scientific evidence for non-Lamarckian infusion wherein organic informational structures belonging to one organism are assimilated by another. For instance, it is well-established that bacteria exchange plasmids as a way of developing antibiotic resistance (cf. Amábile-Cuevas et al., 1995, p. 324). On the other hand, Lynn Margulis's idea of symbiosis, where organisms co-opt and assimilate other organisms to form still more complex organisms, remains speculative (cf. Margulis, 1993).

Inheritance with modification, selection, and infusion: these three account for the CSI inherent in biological systems. Together they comprise all the sources of CSI in biology. I want therefore to examine more closely the respective roles of these three sources in contributing to the CSI of an organism. First consider inheritance with modification (alternatively, inheritance and mutation). Inheritance is merely a conduit for already existing information and modification is merely chance operating on the information passing through this conduit. It follows that by itself inheritance with modification is incapable of explaining the increased complexity of CSI that organisms have exhibited in the course of natural history. Inheritance with modification needs therefore to be supplemented.

The most obvious candidate here, of course, is selection. Selection presupposes inheritance with modification, but instead of merely shifting around already existing information, selection also introduces new information. By seizing on advantageous modifications, selection is able to introduce new information into a population. The majority view in biology, known as the neo-Darwinian synthesis, is that selection and inheritance with modification together are adequate to account for all the CSI inherent in organisms. As a parsimonious account of the origin and development of life, this view has much to commend it. Unfortunately, this view places undue restrictions on biological information flow, restrictions which biological systems seem routinely to violate. The problem is that selection and inheritance with modification can only yield very gradual increases in the informational complexity of organisms, whereas many of the increases in the informational complexity of organisms are abrupt and large.

This point deserves careful attention. Suppose that an organism in reproducing generates N offspring, and that of these N offspring M ($1 \le M \le N$) succeed in reproducing. The amount of information introduced through selection is then $-\log_2(M/N)$. Let me stress that this formula is not a case of misplaced mathematical exactness. This formula holds universally and is non-mysterious. Take a simple non-biological example. If I am sitting at a radio transmitter, and can transmit only zeros and ones, then every time I transmit a zero or one, I choose between two possibilities, selecting precisely one of them. Here N equals 2 and M equals 1. The information $-\log_2(M/N)$ thus equals $-\log_2(1/2) = 1$, i.e., 1 bit of information is introduced every time I transmit a zero or one. This is of course as things should be. Now this example from communication theory is mathematically isomorphic to the case of cell-division where only one of the daughter cells goes on to reproduce. On the other hand, if both daughter cells go on to reproduce, then N equals M equals 2, and thus $-\log_2(M/N) = -\log_2(2/2) = 0$, indicating that selection, by failing to eliminate any possibility, failed also to introduce new information. To take another example, imagine you are typing at a keyboard consisting of the twenty-six capital Roman letters. Thus every time you type a key you select one of twenty-six letters. Here N equals 26 and M equals 1. The information $-\log_2(M/N)$ thus equals $-\log_2(1/26) \approx 4.7$, i.e., 4.7 bits of information are introduced every time you type a key. Or consider a dog breeder who from a given litter of seven Boston terriers selects two for reproduction. The dog breeder thus introduces $-\log_2(2/7) \approx 1.8$ bits of information into those Boston terriers selected for reproduction. (In the formula $-\log_2(M/N)$ and throughout these examples I have assumed a uniform probability distribution. This simplifying assumption, however, only strengthens our case: since uniform probability distributions maximize entropy, on average the information introduced through selection will in fact fall below $-\log_2(M/N)$.)
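The worked examples in this paragraph can be reproduced with a few lines of code. This is a minimal sketch of the formula under the same uniform-distribution assumption; the function name `selection_information` is chosen here for illustration.

```python
import math


def selection_information(m: int, n: int) -> float:
    """Bits introduced when M of N equally likely candidates are selected.

    Computes -log2(M/N), written as log2(N/M) so that the M == N case
    returns 0.0 rather than -0.0.
    """
    if not 1 <= m <= n:
        raise ValueError("requires 1 <= M <= N")
    return math.log2(n / m)


if __name__ == "__main__":
    print(selection_information(1, 2))             # one of two signals or daughter cells: 1.0
    print(selection_information(2, 2))             # both daughter cells reproduce: 0.0
    print(round(selection_information(1, 26), 1))  # one key out of twenty-six: 4.7
    print(round(selection_information(2, 7), 1))   # two terriers out of seven: 1.8
```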

It's therefore clear that selection among the offspring of an organism can at most introduce a few bits of information. Cell division, the preeminent form of reproduction, and the only one prior to multi-cellular life, introduces at most one bit of information. Even if an organism could produce $10^{30}$ gametes (so many gametes that their biomass would equal that of the earth), and each of these became a mature organism, and then only one of these mature organisms were selected for further reproduction, the total number of bits of information introduced through selection would in this instance be $-\log_2(1/10^{30}) \approx 100$. A hundred bits of information is far less information than is contained in an average protein.

From these observations it is clear that selection can accumulate a lot of information over successive generations. As is noted in Joklik and Willett's (1976, p. 78) microbiology text, "Within a short period, often as short as 20 minutes, a bacterium can create a complete duplicate of itself, which in turn is capable of duplicating." Over a billion years, at one bit of information introduced every twenty minutes, selection could in principle produce 26 trillion bits of information, certainly enough to handle any conceivable genome. Nonetheless, from these observations it is equally clear that selection can only produce a very limited amount of information at any one generation. 100 bits is certainly too generous. The most fecund breeders with which I am familiar are certain fish whose spawn include a hundred million eggs. A realistic upper limit on the amount of biological information introduced by selection is therefore around 30 bits. For many organisms it is far less. Mammals, for instance, have an upper limit of about 5 bits of information per generation through selection.
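The figures in the last two paragraphs follow from simple arithmetic, sketched below. The 365.25-day year and the helper names are assumptions made for the illustration; the per-generation cap is the case M = 1 of the formula $-\log_2(M/N)$.

```python
import math

MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumed year length


def max_bits_per_generation(offspring: float) -> float:
    """Upper limit on bits from selection when one of N offspring survives."""
    return math.log2(offspring)


def cumulative_bits(years: float, minutes_per_generation: float,
                    bits_per_generation: float) -> float:
    """Total bits accumulated over successive generations."""
    generations = years * MINUTES_PER_YEAR / minutes_per_generation
    return generations * bits_per_generation


if __name__ == "__main__":
    # One bit every twenty minutes for a billion years: about 2.6e13 bits.
    print(f"{cumulative_bits(1e9, 20, 1):.2e}")
    # 10^30 gametes: just under 100 bits in a single generation.
    print(round(max_bits_per_generation(1e30), 1))
    # A hundred million eggs: under 30 bits in a single generation.
    print(round(max_bits_per_generation(1e8), 1))
```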

The preceding analysis gives new urgency to Darwin's (1859, p. 189) famous challenge: "If it could be demonstrated that any complex organ existed, which could not possibly have been formed by numerous, successive, slight modifications, my theory would absolutely break down." In information-theoretic terms, this is to say that if informational jumps of considerably more than thirty bits are required in any one generation, then some means of producing information other than selection must be sought. Have such informational jumps been discovered? Darwin and his disciples believe in the infinite plasticity of organisms to change gradually from one form into another. This belief, however, no longer seems justified.

Perhaps the clearest examples of informational jumps that exceed the power of selection occur in biochemistry. Michael Behe (1996) and Siegfried Scherer (1983) have both examined biochemical systems which if produced by selection need to be produced in a single generation, but whose information requirements exceed what selection can deliver in a single generation. The key feature of these biochemical systems is one Behe calls irreducible complexity. A system is irreducibly complex if it consists of several interrelated components the removal of any one of which leads to the complete loss of function of the system. As an example of irreducible complexity, Behe (1996, p. 43) offers a mousetrap. A mousetrap consists of a platform, a hammer, a spring, a catch, and a holding bar. Remove any one of these five components, and it is impossible to construct a functional mousetrap. Irreducible complexity needs to be contrasted with reducible complexity. A system is reducibly complex if it contains a dispensable component, i.e., a component which can be removed without destroying functionality. An example of a reducibly complex system is a pocket watch. The glass face that covers and protects the dial is not necessary for the watch to keep time. It can be removed without destroying the watch's function (function may be diminished, but it is not lost).

Besides being contrasted with reducible complexity, irreducible complexity needs also to be contrasted with cumulative complexity. A system is cumulatively complex if the components of the system can be arranged sequentially so that the successive removal of components never leads to the complete loss of function. An example of a cumulatively complex system is a city. It is possible successively to remove people and services from a city until one is down to a tiny village, all without losing the cohesiveness of the community, which in this case constitutes functionality. Note, however, that the order in which people and services are removed is important. To remove as the first thing the police and courts from a large city would result in chaos. Observe that it is possible to define cumulative complexity recursively in terms of reducible complexity: A system is cumulatively complex if it is reducibly complex, and if after the removal of some component from the system, the system is again cumulatively complex. It follows that cumulatively complex systems are always reducibly complex. The converse, however, is not the case. Reducibly complex systems may contain an irreducibly complex core, and thus fail to be cumulatively complex. For instance, a pocket watch, though reducibly complex, contains certain ineliminable components without which the watch cannot function, e.g., hour and minute hands, certain gears and springs, and a base to keep all these elements together. Such ineliminable components form the irreducible core of the pocket watch.
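The three definitions can be stated as predicates over a set of components and a test of whether a given subset still functions. The sketch below is one possible reading, not a definitive formalization. The `works` predicates for the mousetrap, pocket watch, and city are toy stand-ins invented for the example, and the recursive definition is given a base case that the text leaves implicit: a functioning system with only `floor` components left counts as cumulatively complex.

```python
from typing import Callable, FrozenSet

System = FrozenSet[str]
Works = Callable[[System], bool]


def is_irreducibly_complex(system: System, works: Works) -> bool:
    """Functions as a whole, but removing any one component destroys function."""
    return works(system) and all(not works(system - {c}) for c in system)


def is_reducibly_complex(system: System, works: Works) -> bool:
    """Functions, and at least one component is dispensable."""
    return works(system) and any(works(system - {c}) for c in system)


def is_cumulatively_complex(system: System, works: Works, floor: int = 1) -> bool:
    """Some order of removal keeps the system functioning at every step.

    Base case (an assumption added here): a functioning system with
    `floor` or fewer components terminates the recursion.
    """
    if not works(system):
        return False
    if len(system) <= floor:
        return True
    return any(
        works(system - {c}) and is_cumulatively_complex(system - {c}, works, floor)
        for c in system
    )


# Toy stand-ins for the examples in the text.
MOUSETRAP = frozenset({"platform", "hammer", "spring", "catch", "holding bar"})
WATCH_CORE = frozenset({"hands", "gears", "springs", "base"})
WATCH = WATCH_CORE | {"glass face"}
CITY = frozenset({"homes", "shops", "police", "courts"})


def trap_works(s: System) -> bool:
    return s == MOUSETRAP            # all five parts are needed


def watch_works(s: System) -> bool:
    return WATCH_CORE <= s           # the glass face is dispensable


def city_works(s: System) -> bool:
    return "homes" in s              # crude proxy for community cohesion


if __name__ == "__main__":
    print(is_irreducibly_complex(MOUSETRAP, trap_works))   # True
    print(is_reducibly_complex(WATCH, watch_works))        # True
    print(is_cumulatively_complex(WATCH, watch_works))     # False: irreducible core
    print(is_cumulatively_complex(CITY, city_works))       # True
```

The same single-removal test in `is_irreducibly_complex` is what a knock-out experiment performs: remove one component at a time and check whether function is lost. The search over removal orders is exponential in the number of components, which is acceptable only for small illustrative systems like these.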

Given these types of complexity (irreducible, reducible, and cumulative), it is clear that selection can account for cumulative complexity. The gradual accrual of information via selection mirrors the retention of function as components are removed in cumulative complexity. Selection has no problem producing cumulative complexity. But what about irreducible complexity? Can selection produce irreducible complexity? Certainly if selection acts with reference to a goal, it can produce an irreducibly complex system. Take Michael Behe's mousetrap, for instance. Given the goal of constructing a mousetrap, one can specify a goal-directed selection process that in turn selects a platform, a hammer, a spring, a catch, and a holding bar, and at the end puts all these components together to form a functional mousetrap. Given a pre-specified goal, selection has no difficulty producing irreducibly complex systems.

But the selection that operates in biology is Darwinian natural selection. And this form of selection operates without goals, has neither plan nor purpose, and is wholly undirected (cf. Miller and Levine, 1993, p. 658). The great appeal of Darwin's selection mechanism was precisely that it would eliminate teleology from biology. Yet by making selection an undirected process, Darwin drastically abridged the type of complexity biological systems could manifest. Henceforth biological systems could manifest only cumulative complexity, not irreducible complexity. Why is this? As Behe (1996, p. 39) explains, "An irreducibly complex system cannot be produced . . . by slight, successive modifications of a precursor system, because any precursor to an irreducibly complex system that is missing a part is by definition nonfunctional. . . . Since natural selection can only choose systems that are already working, then if a biological system cannot be produced gradually it would have to arise as an integrated unit, in one fell swoop, for natural selection to have anything to act on."

Recall that for the complex specified information inherent in organisms, what specifies this information is functionality. The organism as a whole, as well as its various subsystems, are specified in virtue of the respective functions these systems perform. For irreducibly complex systems, however, function is attained only when all components of a system are in place. Moreover, natural selection, insofar as it introduces complex specified information into organisms, must select for function. It follows that natural selection, if it is going to produce an irreducibly complex system, has to produce it all at once or not at all. Of course, this would not be a problem if the amount of information natural selection can produce in a single generation matches or exceeds the amount of information inherent in the irreducibly complex systems of biology. But nothing like this is the case. Whereas natural selection at its very best can introduce about 30 bits of information per generation, the irreducibly complex biochemical systems Michael Behe considers in Darwin's Black Box contain several orders of magnitude more information. These irreducibly complex biochemical systems, like the bacterial flagellum, are protein machines consisting of numerous distinct proteins, each indispensable for the function of the machine (hence the irreducible complexity), and where each individual protein in the machine requires more bits of information than natural selection can conceivably produce in a single generation.

The irreducible complexity of biochemical systems counts decisively against the joint action of selection and inheritance with modification to account for the CSI in biological systems. Because irreducible complexity occurs at the biochemical level, there is no lower level of biological analysis to which the irreducible complexity of biochemical systems might be referred, and at which a Darwinian analysis in terms of selection and inheritance with modification might still hope for success. Undergirding biochemistry is ordinary chemistry and physics, neither of which can account for biological information (cf. Yockey, 1992). Also, whether a biochemical system is irreducibly complex is a fully empirical question: Individually knock out each protein constituting an irreducibly complex biochemical system, and determine whether function is lost. If so, we are dealing with an irreducibly complex system. Mutagenesis experiments of this sort are routine in biochemistry.

If the joint action of selection and inheritance with modification is unable to account for the CSI in biological systems (and specifically for the irreducible complexity of certain biochemical systems like the bacterial flagellum), there remains but one source for the CSI in biological systems, namely, infusion, the direct introduction of novel information from outside the biological system. In principle there is nothing problematic or controversial about infusion. To innovate a given informational structure an organism has informational needs, and these needs can be supplied from outside the organism, either through selection pressures (and therefore indirectly), or by the insertion of ready-to-go information into the organism (and therefore directly). The latter is of course infusion.

Although at this level of generality infusion is unproblematic, it quickly becomes problematic once we start tracing backwards the informational pathways of infused information. Consider for instance what is perhaps the best scientifically confirmed instance of infusion in biology, namely, plasmid exchange among bacteria to develop antibiotic resistance (cf. Amábile-Cuevas et al., 1995, p. 324). Plasmids are small circular pieces of DNA that can easily be exchanged among bacteria of the same species, and are capable of conferring antibiotic resistance. When one bacterium releases a plasmid and another absorbs it, information is infused from one into the other. By itself this is unproblematic. Problems begin, however, when we ask, Where did the bacterium that released the plasmid in turn derive it? There is a regress here, and this regress always terminates in something non-organismal. We can't just keep explaining plasmid infusion into a bacterium by plasmid release from another bacterium. Eventually, as we trace the informational pathway back, we must tell a different kind of story. If, for instance, the plasmid is cumulatively complex, then it could have arisen through selection and inheritance with modification. But if on the other hand it is irreducibly complex, whence could it have arisen?

It will be helpful here to distinguish between symbiotic and abiotic infusion, and correspondingly between endogenous and exogenous information. Symbiotic infusion is the infusion of information from one organism to another; abiotic infusion is the infusion of information not derived from any organism. Correspondingly, endogenous information comprises symbiotically infused information (and thus information already present within biology); exogenous information comprises abiotically infused information (and thus information external to biology). Now regardless of whether plasmids are irreducibly complex or have an irreducibly complex core (the analysis to determine the nature of the complexity of plasmids has to my knowledge not yet been performed), the fact remains that there exist irreducibly complex biochemical systems. What’s more, even though symbiotic infusion may explain how a particular instance of an irreducibly complex biochemical system came to exist in a given organism, it cannot explain how such a system arose in the first place. Because organisms have a finite trajectory back in time, symbiotic infusion must ultimately give way to abiotic infusion, and endogenous information must ultimately derive from exogenous information.




Reconceptualizing Evolutionary Biology

The abiotic infusion of exogenous information is the great mystery confronting modern evolutionary biology. It is Manfred Eigen’s mystery with which we began this paper. Why is it a mystery? Not because the abiotic infusion of exogenous information is inherently spooky or unscientific, but rather because evolutionary biology has failed to grasp the centrality of information to its task. The task of evolutionary biology is to explain the origin and development of life. The key feature of life is the presence of complex specified information (CSI). Caught up in the Darwinian mechanism of selection and inheritance with modification, evolutionary biology has failed to appreciate the informational hurdles organisms need to jump in the course of natural history. To jump those hurdles, organisms require information. What’s more, a significant part of that information is exogenous and must originally have been infused abiotically.

In this section I want briefly to consider what evolutionary biology would look like if information were taken as its central and unifying concept. First off, let’s be clear that the Darwinian mechanism of selection and inheritance with modification will continue to occupy a significant place in evolutionary theory. Nevertheless, the inflated view that grants this mechanism complete and utter dominance in evolutionary theory, namely, that selection and inheritance with modification together account for the full diversity of life, will have to be relinquished. As a mechanism for conserving, adapting, and honing already existing biological structures, the Darwinian mechanism is ideally suited. But as a mechanism for innovating irreducibly complex biological structures, it utterly lacks the informational resources. As for symbiotic infusion, its role within an information-theoretic framework must always remain quite limited, for even though it can account for how organisms trade already existing biological information, it can never get at the root question of how that biological information came to exist in the first place.

Not surprisingly, therefore, the key task an information-theoretic approach to evolutionary biology faces is to make sense of abiotically infused CSI. Abiotically infused CSI is information exogenous to an organism, but which nonetheless gets transmitted to and assimilated by the organism. Two obvious questions now arise: (1) What is the mode of transmission of abiotically infused CSI into the organism? and (2) Where is this information prior to being transmitted? If this information is clearly represented in some empirically accessible non-biological physical system, and if there is a clear informational pathway from this system to the organism, and if this informational pathway can be shown suitable for transmitting this information to the organism so that the organism properly assimilates it, only then will these two questions receive an empirically adequate naturalistic answer. But note that this naturalistic answer, far from eliminating the information question, simply pushes it one step further back, for how did the CSI that was abiotically infused into an organism first get into a non-organism? Because of the Law of Conservation of Information, whenever we inquire into the source of some information, we never resolve the information problem, but only intensify it. This is not to say that such inquiries are unilluminating (contra Dawkins, 1987, pp. 11-13; and Dennett, 1995, p. 153, who think that the only valid explanations in evolutionary biology are reductive, explaining the more complex in terms of the simpler). We learn an important fact about a pencil when we learn a certain pencil-making machine made it. Nonetheless, the information in the pencil-making machine exceeds the information in the pencil. The Law of Conservation of Information guarantees that as we trace informational pathways backwards, we have more information to explain than we started with.

Where then do the informational pathways of life terminate as we trace them backwards? The possibilities are limited. One possibility is that we get nowhere, unable even to begin tracing backwards the information in a biological system. Thus we may discover an irreducibly complex biological system, but be unable to trace it back to any abiotic source of exogenous information (this is by far the most common case in biology; see Behe, 1996, ch. 8). Another possibility is that we can trace the information in a biological system back to an abiotic source of exogenous information, but then can’t trace it back any further. Graham Cairns-Smith (1985; 1986), for instance, has a clay-template theory for the origin of life in which self-replicating clays form templates for carbon-based life. The Cairns-Smith theory is clearly an abiotic infusion theory, with exogenous information represented in (abiotic) clays providing templates for carbon-based life. What the Cairns-Smith theory does not consider is how the exogenous information that was transmitted to carbon-based life from clay templates got into those clay templates in the first place. Needless to say, the Cairns-Smith theory is highly speculative. Still another possibility is that we can trace the information in a biological system all the way back to the initial conditions of the big bang (cf. Corey, 1994). Though this approach appeals to our naturalistic sensibilities, it remains scientifically sterile until a definite informational pathway can be traced back to the big bang. Finally, there is the creationist alternative which traces the information in a biological system to the direct intervention of God. Though this approach appeals to our theistic longings, it remains scientifically sterile until an in-principle argument is offered showing that information inherent in a biological system could not have been contained in any non-biological physical precursor. And even then it’s not clear what sort of God one infers.

In tracing back the informational pathways of life, evolutionary biology does well to avoid speculation, and follow only those informational pathways that can be rigorously traced. To take an analogy, I can rigorously trace the informational pathway issuing in my copy of King Lear through the various extant editions of the play spanning the last four centuries. On the other hand, I cannot rigorously trace the informational pathway issuing in an isolated first-century papyrus fragment. Any story behind this fragment is lost and cannot be reconstructed. Put differently, any relevant informational pathways are blocked and cannot be rigorously traced. In a similar vein, evolutionary biology may progress to the point where it can rigorously trace an informational pathway back to an abiotic source of exogenous information. On the other hand, it may remain stuck on a given irreducibly complex biological structure, and never be able rigorously to trace it back to an abiotic source of exogenous information.

In fine, I propose to reconceptualize evolutionary biology in information-theoretic terms. An evolutionary biology thoroughly cognizant of information theory is one whose chief task is to trace informational pathways. In tracing these informational pathways, evolutionary biology must place a premium on rigor. Detailed informational pathways need to be explicitly exhibited. Moreover, unlike the nebulous informational pathways sketched by Stuart Kauffman and his associates at the Santa Fe Institute, informational pathways need to conform to biological reality, and not to the virtual reality residing in a computer (cf. Kauffman, 1995). Finally, empirical evidence, and not metaphysical prejudice or aesthetic preference, must decide whether an informational pathway exists at all. For instance, the Darwinian preference to cash out taxonomy in terms of genealogy must not be taken as evidence for common descent. To establish common descent requires showing that certain informational pathways connect all organisms. Many of the low-level facts of current evolutionary biology will stay put. What’s more, information theory is sufficiently flexible to accommodate the mechanisms of evolutionary change proposed to date. Nonetheless, their adequacy will have to be evaluated in terms of the information-theoretic constraints to which they are subject. Thus for instance, the Darwinian mechanism can be formulated in information-theoretic terms, but the claim that this mechanism can account for the full diversity of life must be rejected given its inability to produce irreducibly complex systems. Many old questions will remain. Many new questions will arise. But some old questions will have to be discarded. In particular, all reductionist attempts to explain information in terms of something other than information will have to be discarded.





Intelligent Design

Up to this point I have developed a theoretical apparatus for understanding information, I have critiqued the main naturalistic attempts to account for biological information, and I have reconceptualized evolutionary biology in information-theoretic terms. One question, however, remains unanswered, to wit, Whence the origin of complex specified information in biology? Tracing informational pathways back to abiotic sources of exogenous information is as far back as the information trail goes within the framework so far developed. But again, all we’ve really done is push the information problem back, shift its focus, and exchange one information problem for another. To be sure, this need not be a vain exercise. Plasmid exchange, though it represents no more than a shifting around of pre-existing biological information, still gives us tremendous insight into antibiotic resistance. Nonetheless, all such exercises get us no closer to the origin of information.

In what remains of this paper I want to argue that intelligent causation, or equivalently design, properly accounts for the origin of complex specified information. My argument focuses on the nature of intelligent causation, and specifically, on what it is about intelligent causes that makes them detectable. To see why CSI is a reliable indicator of design, we need to probe the nature of intelligent causation. The principal characteristic of intelligent causation is choice. Whenever an intelligent cause acts, it chooses from a range of competing possibilities. This is true not just of humans, but of animals as well as of extra-terrestrial intelligences. A rat navigating a maze must choose whether to go right or left at various points in the maze. When NASA’s SETI researchers attempt to discover intelligence in the extra-terrestrial radio transmissions they are monitoring, they assume an extra-terrestrial intelligence could have chosen any number of possible radio transmissions, and then attempt to match the transmissions they observe with certain patterns as opposed to others. Whenever a human being utters meaningful speech, a choice is made from a range of possible sound-combinations that might have been uttered. Intelligent causation always entails discrimination, choosing certain things, ruling out others.

Given this characterization of intelligent causes, the next question is how to recognize their operation. Intelligent causes act by making a choice. How do we know when an intelligent cause has so acted? A bottle of ink spills accidentally onto a sheet of paper; someone takes a fountain pen and writes a message on a sheet of paper. In both instances ink is applied to paper. In both instances one among an almost infinite set of possibilities is realized. In both instances a choice is made: one possibility is selected and the rest are ruled out. Yet in one instance we infer design, in the other we don’t. What is the relevant difference? Not only do we need to observe that a choice has been made, but we ourselves need also to be able to specify that choice. It’s not enough that one possibility has been chosen and others have been ruled out. We ourselves need to be able to make the same choice. Wittgenstein (1980, p. 1e) illustrated this point as follows: “We tend to take the speech of a Chinese for inarticulate gurgling. Someone who understands Chinese will recognize language in what he hears. Similarly I often cannot discern the humanity in man.”

In hearing a Chinese utterance, someone who understands Chinese not only recognizes that a choice was made from the range of all possible utterances, but is also able to specify the utterance that was made as coherent Chinese speech. Contrast this with someone who does not understand Chinese. In hearing a Chinese utterance, someone who does not understand Chinese also recognizes that a choice was made from the range of all possible utterances, but this time, lacking the ability to understand Chinese, that person is unable to specify the utterance as coherent speech. To someone who does not understand Chinese, the utterance is gibberish. To be sure, uttering gibberish always constitutes a choice from the range of all possible utterances. Nonetheless, gibberish corresponds to nothing we can understand in any language, and so cannot be specified. As a result, gibberish is never taken for intelligent communication, but always for what Wittgenstein calls “inarticulate gurgling.”

This choosing of one among several competing possibilities, ruling out the rest, and specifying the one that was chosen encapsulates how we recognize intelligent causes, or equivalently, how we detect design. Psychologists who study animal learning and behavior have known this all along. For these psychologists, known as learning theorists, learning is discrimination (cf. Mazur, 1990; Schwartz, 1984). To learn a task an animal must acquire the ability to choose behaviors suitable for the task as well as the ability to rule out behaviors unsuitable for the task. Moreover, for a psychologist to recognize that an animal has learned a task, it is necessary not only to observe the animal making the appropriate discrimination, but also to specify this discrimination.

Thus to recognize whether a rat has successfully learned how to traverse a maze, a psychologist must first specify which sequence of right and left turns conducts the rat out of the maze. No doubt, a rat randomly wandering a maze also discriminates a sequence of right and left turns. But by randomly wandering the maze, the rat gives no indication that it can discriminate the appropriate sequence of right and left turns for exiting the maze. Consequently, the psychologist studying the rat will have no reason to think the rat has learned how to traverse the maze. Only if the rat executes the sequence of right and left turns specified by the psychologist will the psychologist recognize that the rat has learned how to traverse the maze. Now it is precisely the learned behaviors we regard as intelligent in animals. Hence it is no surprise that the same scheme for recognizing animal learning recurs for recognizing intelligent causes generally, to wit: choosing one among several competing possibilities, ruling out the others, and specifying the one chosen.

Now this general scheme for recognizing intelligent causes coincides precisely with how we recognize complex specified information. First of all, the basic precondition for information to exist must be established, to wit, contingency. Thus one must establish that any one of a multiplicity of distinct possibilities might actually obtain. Next, one must establish that the possibility chosen after the others were ruled out was also specified. So far the match between this general scheme for recognizing intelligent causation and how we recognize complex specified information is exact. Only one loose end remains: complexity. Although complexity is essential to CSI (corresponding to the first letter in the acronym), its role in this general scheme for recognizing intelligent causation is not immediately obvious. In this scheme a choice is made among several competing possibilities, the rest are ruled out, and the possibility chosen is specified. Where in this scheme does complexity figure?

The answer is that it is there implicitly. To see this, consider again a rat traversing a maze, but now take a very simple maze in which two right turns conduct the rat out of the maze. How will a psychologist studying the rat determine whether it has learned to exit the maze? Just putting the rat in the maze will not be enough. Because the maze is so simple, the rat could by chance just happen to take two right turns, and thereby exit the maze. The psychologist will therefore be uncertain whether the rat actually learned to exit this maze, or whether the rat just got lucky. But contrast this now with a complicated maze in which a rat must take just the right sequence of left and right turns to exit the maze. Suppose the rat must take one hundred appropriate right and left turns, and that any mistake will prevent the rat from exiting the maze. A psychologist who sees the rat take no erroneous turns and in short order exit the maze will be convinced that the rat has indeed learned how to exit the maze, and that this was not dumb luck. With the simple maze there is a substantial probability that the rat will exit the maze by chance; with the complicated maze this is exceedingly improbable. And improbability is precisely what we mean by complexity.
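
The arithmetic behind the two mazes can be made explicit. What follows is a minimal sketch in Python, assuming (the example itself does not say so) that a randomly wandering rat makes each turn as an independent, equally likely choice between left and right, and measuring complexity in bits as the negative base-2 logarithm of the probability. The function names are illustrative only.

```python
import math
from fractions import Fraction


def chance_exit_probability(required_turns: int) -> Fraction:
    """Probability that a randomly wandering rat takes every required turn.

    Assumption (not stated in the text): each turn is an independent,
    equally likely choice between left and right.
    """
    return Fraction(1, 2) ** required_turns


def complexity_in_bits(probability: Fraction) -> float:
    """Complexity as improbability, measured as -log2(probability)."""
    return -math.log2(float(probability))


simple_maze = chance_exit_probability(2)     # two right turns
complex_maze = chance_exit_probability(100)  # one hundred correct turns

print(simple_maze, complexity_in_bits(simple_maze))
print(float(complex_maze), complexity_in_bits(complex_maze))
```

Under these assumptions the simple maze is exited by chance one time in four (2 bits of complexity), whereas the hundred-turn maze is exited by chance with probability 2^-100, roughly 7.9 x 10^-31 (100 bits of complexity).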

This argument for showing that CSI is a reliable indicator of design may now be summarized as follows: CSI is a reliable indicator of design because its recognition coincides with how we recognize intelligent causation generally. In general, to recognize intelligent causation we must observe a choice among competing possibilities, note which possibilities were not chosen, and then be able to specify the possibility that was chosen. What’s more, the competing possibilities that were ruled out must be live possibilities, and sufficiently numerous so that specifying the possibility that was chosen cannot be attributed to chance. In terms of probability, this just means that the possibility that was specified has small probability. In terms of complexity, this just means that the possibility that was specified has high complexity. All the elements in this general scheme for recognizing intelligent causation (i.e., choosing, ruling out, and specifying) find their counterpart in complex specified information (CSI). It follows that CSI pinpoints precisely what we need to be looking for when we detect design.
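
The scheme just summarized can be sketched as a short procedure. The following Python fragment is only an illustration under loudly stated assumptions, not the formal apparatus of this paper: the possibilities are taken to be equally likely, the specification is supplied as a predicate fixed independently of the observed outcome, and the cutoff `threshold_bits` (set here to 10) is a hypothetical value chosen purely for the example. The names `indicates_design`, `all_paths`, and `exit_path` are likewise hypothetical.

```python
import math
from itertools import product
from typing import Callable, Sequence


def indicates_design(
    observed: str,
    possibilities: Sequence[str],
    matches_specification: Callable[[str], bool],
    threshold_bits: float,
) -> bool:
    """Apply the three checks: contingency, specification, complexity.

    Assumptions (hypothetical, for illustration): all possibilities are
    equally likely, and the specification is stated independently of
    the observed outcome.
    """
    distinct = set(possibilities)
    # Contingency: the outcome must be one of several live possibilities.
    if observed not in distinct or len(distinct) < 2:
        return False
    # Specification: the observer must be able to pick out the outcome.
    if not matches_specification(observed):
        return False
    # Complexity: the specified outcomes must be jointly improbable.
    specified = sum(1 for p in distinct if matches_specification(p))
    probability = specified / len(distinct)
    return -math.log2(probability) >= threshold_bits


def all_paths(turns: int) -> list[str]:
    """Every sequence of left (L) and right (R) turns of the given length."""
    return ["".join(p) for p in product("LR", repeat=turns)]


exit_path = "RLRRLRLLRLRR"  # hypothetical 12-turn exit sequence
paths = all_paths(len(exit_path))

learned = indicates_design(exit_path, paths, lambda p: p == exit_path, 10)
lucky = indicates_design("RR", all_paths(2), lambda p: p == "RR", 10)
wandering = indicates_design("L" * 12, paths, lambda p: p == exit_path, 10)
print(learned, lucky, wandering)
```

In this toy setting the rat that follows the specified twelve-turn path passes all three checks, the rat in the two-turn maze fails the complexity check, and the randomly wandering rat fails the specification check.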

As a postscript, let me call the reader’s attention to the etymology of the word “intelligent.” The word “intelligent” derives from two Latin words, the preposition inter, meaning between, and the verb lego, meaning to choose or select. Thus according to its etymology, intelligence consists in choosing between. It follows that the etymology of the word “intelligent” parallels the formal analysis of intelligent causation just given. “Intelligent Design” is therefore a thoroughly apt phrase, signifying that design is inferred precisely because an intelligent cause has done what only an intelligent cause can do, to wit, make a choice.









References


Amábile-Cuevas, Carlos F., Maura Cárdenas-García, and Mauricio Ludgar. 1995. Antibiotic Resistance. American Scientist, 83: 320-329.

Barrow, John D. and Frank J. Tipler. 1986. The Anthropic Cosmological Principle. Oxford: Oxford University Press.

Behe, Michael. 1996. Darwin’s Black Box: The Biochemical Challenge to Evolution. New York: The Free Press.

Bohm, David. 1993. The Undivided Universe: An Ontological Interpretation of Quantum Theory. London: Routledge.

Borel, Emile. 1962. Probabilities and Life, translated by M. Baudin. New York: Dover.

Cairns-Smith, Alexander G. 1985. Seven Clues to the Origin of Life. Cambridge: Cambridge University Press.

Cairns-Smith, Alexander G. and H. Hartman, eds. 1986. Clay Minerals and the Origin of Life. Cambridge: Cambridge University Press.

Chalmers, David J. 1996. The Conscious Mind: In Search of a Fundamental Theory. New York: Oxford University Press.

Corey, Michael A. 1994. Back to Darwin: The Scientific Case for Deistic Evolution. Lanham, Maryland: University Press of America.

Darwin, Charles. 1859. On the Origin of Species, facsimile first edition. Cambridge, Mass.: Harvard University Press, 1964.

Dawkins, Richard. 1987. The Blind Watchmaker. New York: Norton.

Dembski, William A. 1996. The Design Inference: Eliminating Chance through Small Probabilities. Doctoral Dissertation, University of Illinois at Chicago.

Dennett, Daniel C. 1995. Darwin’s Dangerous Idea: Evolution and the Meanings of Life. New York: Simon & Schuster.

Devlin, Keith J. 1991. Logic and Information. New York: Cambridge University Press.

Eigen, Manfred. 1992. Steps Towards Life: A Perspective on Evolution, translated by Paul Woolley. Oxford: Oxford University Press.

Joklik, Wolfgang K. and Hilda P. Willett, eds. 1976. Zinsser Microbiology, 16th ed. New York: Appleton-Century-Crofts.

Kauffman, Stuart. 1995. At Home in the Universe. Oxford: Oxford University Press.

Küppers, Bernd-Olaf. 1990. Information and the Origin of Life. Cambridge, Mass.: MIT Press.

Landauer, Rolf. 1991. Information is Physical. Physics Today, May: 23-29.

Levy, Steven. 1992. Artificial Life: The Quest for a New Creation. New York: Pantheon.

Margulis, Lynn. 1993. Symbiosis in Cell Evolution: Microbial Communities in the Archean and Proterozoic Eons, 2nd ed. New York: Freeman.

Mazur, James E. 1990. Learning and Behavior, 2nd edition. Englewood Cliffs, N.J.: Prentice Hall.

Miller, Kenneth R. and Joseph Levine. 1993. Biology. Englewood Cliffs, N.J.: Prentice-Hall.

Monod, Jacques. 1972. Chance and Necessity. New York: Vintage.

Scherer, Siegfried. 1983. Basic Functional States in the Evolution of Light-driven Cyclic Electron Transport. Journal of Theoretical Biology, 104: 289-299.

Schwartz, Barry. 1984. Psychology of Learning and Behavior, 2nd edition. New York: Norton.

Stalnaker, Robert. 1984. Inquiry. Cambridge, Mass.: MIT Press.

Swinburne, Richard. 1979. The Existence of God. Oxford: Oxford University Press.

Waldrop, M. Mitchell. 1992. Complexity: The Emerging Science at the Edge of Order and Chaos. New York: Simon & Schuster.

Wittgenstein, Ludwig. 1980. Culture and Value, edited by G. H. von Wright, translated by P. Winch. Chicago: University of Chicago Press.

Wouters, Arno. 1995. Viability Explanation. Biology and Philosophy, 10: 435-457.

Yockey, Hubert P. 1992. Information Theory and Molecular Biology. Cambridge: Cambridge University Press.