Intelligent Design as a Theory of Information
William A. Dembski
Department of Philosophy
University of Notre Dame
Notre Dame, Indiana, USA
Information
In his book Steps Towards Life, Manfred Eigen (1992,
p. 12) summarizes the task of origins-of-life research as follows:
"Our task is to find an algorithm, a natural law that
leads to the origin of information." This summary of origins-of-life
research is at once insightful and misguided. It is insightful
because it correctly isolates the central problem facing origins-of-life
research, to wit, the origin of information. At the same time,
it is misguided because it prescribes an unworkable solution for
this problem, to wit, algorithms and natural laws. Algorithms
and natural laws are utterly incapable of producing information.
Indeed, it is an oxymoron to attribute the origin of information
to algorithms and natural laws: information is inaccessible
from algorithms and natural laws. Eigen is working on the right
problem, but looking to the wrong solution. Eigen's insight
is to see that the origin of information constitutes the central
problem facing origins-of-life research; Eigen's mistake
is to think that algorithms and natural laws constitute the solution.
In this paper I shall examine Eigen's insight and correct
Eigen's mistake. To examine Eigen's insight, I
shall explicate the concept of information and connect it to biological
reality. To correct Eigen's mistake, I shall introduce
intelligent causation and show why it, and not algorithms or natural
laws, provides the right way to account for the origin of information.
Let us then begin with information. What is information?
The fundamental intuition underlying information is not, as
is commonly thought, the transmission of signals across a communication
channel, but rather, the ruling out of possibilities. To be sure,
when signals are transmitted across a communication channel, invariably
a set of possibilities is ruled out, namely, those signals which
were not transmitted. But to acquire information remains fundamentally
a matter of ruling out possibilities, whether these possibilities
comprise signals across a communication channel or take some other
form. As Robert Stalnaker (1984, p. 85) puts it, "To understand
the information conveyed in a communication is to know what possibilities
would be excluded by its truth." Information in the first
instance presupposes not some medium of communication, but contingency.
For there to be information, there must be a multiplicity of
distinct possibilities any one of which might happen. When one
of these possibilities does happen and the others are ruled out,
information becomes actualized. Indeed, information in its most
general sense can be defined as the actualization of one possibility
to the exclusion of others.
Complex Information
This definition of information is highly abstract
and by itself of little use to biology and science more generally.
To render information a useful concept for science we need to
do two things: first, provide a means for measuring information;
second, introduce a crucial distinction, namely the distinction
between specified and unspecified information. First, let us
consider how to measure information. In measuring information
it is not enough to count the number of possibilities that were
ruled out, and offer this number as the relevant measure of information.
The problem is that this simple enumeration of excluded possibilities
tells us nothing about how the possibilities under consideration
were individuated in the first place. Consider, for instance,
the following individuation of poker hands:
(i) A royal flush.
(ii) Everything else.
To learn that something other than a royal flush
was dealt (i.e., possibility (ii)) is clearly to acquire less
information than to learn that a royal flush was dealt (i.e.,
possibility (i)). Yet if our measure of information is simply
an enumeration of excluded possibilities, then the same numerical
value must be assigned in both instances since in both instances
a single possibility is excluded.
It follows, therefore, that how we measure information
needs to be independent of whatever procedure is used to individuate
the possibilities under consideration. And the way to do this
is not simply to count possibilities, but to assign probabilities
to these possibilities. For a thoroughly shuffled deck of cards,
the probability of being dealt a royal flush (i.e., possibility
(i)) is approximately .000002 whereas the probability of being
dealt anything other than a royal flush (i.e., possibility (ii))
is approximately .999998. Probabilities by themselves, however,
are not information measures. Although probabilities properly
distinguish possibilities according to the information they contain,
nonetheless probabilities remain an inconvenient way of measuring
information. There are two reasons for this. First, the scaling
and directionality of the numbers assigned by probabilities need
to be recalibrated. We are clearly acquiring more information
when we learn someone was dealt a royal flush than when we learn
someone wasn't dealt a royal flush. And yet the probability
of being dealt a royal flush (i.e., .000002) is minuscule compared
to the probability of being dealt something other than a royal
flush (i.e., .999998). Smaller probabilities signify more information,
not less.
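The two probabilities quoted above can be checked with a short calculation. The sketch below assumes a standard 52-card deck and five-card hands dealt without replacement; the variable names are illustrative only. Under those assumptions the exact probability of a royal flush is about .0000015, which agrees with the figure of .000002 when rounded to six decimal places.

```python
from math import comb

# Number of distinct five-card hands from a standard 52-card deck (assumption).
total_hands = comb(52, 5)

# Exactly four of those hands are royal flushes, one per suit.
p_royal = 4 / total_hands
p_other = 1 - p_royal

print(total_hands)        # 2598960
print(f"{p_royal:.7f}")   # 0.0000015
print(f"{p_other:.7f}")   # 0.9999985
```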
The second reason probabilities are inconvenient
for measuring information is that they are multiplicative rather
than additive. If I learn that Alice obtained a royal flush playing
poker at Caesar's Palace and that Bob obtained a royal
flush playing poker at the Mirage, the probability that both Alice
and Bob were dealt royal flushes is the product of the individual
probabilities. Nonetheless, it is convenient for information
to be measured additively so that the measure of information assigned
to Alice and Bob jointly being dealt royal flushes equals the
measure of information assigned to Alice being dealt a royal flush
plus the measure of information assigned to Bob being dealt a
royal flush.
Now there is an obvious way of transforming probabilities
which circumvents both these difficulties, and that is to apply
a negative logarithm to the probabilities. Applying a negative
logarithm assigns more information to lower probabilities
and, because the logarithm of a product is the sum of the logarithms,
transforms multiplicative probability measures into additive information
measures. What's more, in deference to communication theorists
it is customary to use the logarithm to the base 2. The rationale
for this choice of logarithmic base is as follows. The most convenient
way for communication theorists to measure information is in bits.
Any message sent across a communication channel can be viewed
as a string of 0's and 1's. For instance, the ASCII
code uses strings of eight 0's and 1's to represent
the characters on a typewriter, with whole words and sentences
in turn represented as strings of such character strings. In
like manner all communication may be reduced to the transmission
of sequences of 0's and 1's. Given this reduction,
the obvious way for communication theorists to measure information
is in number of bits transmitted across a communication channel.
And since the negative logarithm to the base 2 of a probability
corresponds to the average number of bits needed to identify an
event of that probability, the logarithm to the base 2 is the
canonical logarithm for communication theorists.
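The transformation just described can be sketched in a few lines. The function name information_bits is hypothetical, and the example again assumes a standard 52-card deck, independent deals for Alice and Bob, and (for the last line) a uniformly random string of eight bits. The sketch shows that the less probable event receives the larger measure, and that the measure for two independent events is the sum of the individual measures.

```python
from math import comb, log2, isclose

def information_bits(p: float) -> float:
    """Information, in bits, of an event with probability p (0 < p <= 1)."""
    return -log2(p)

p_royal = 4 / comb(52, 5)            # probability of a royal flush

bits_royal = information_bits(p_royal)       # roughly 19.3 bits
bits_other = information_bits(1 - p_royal)   # a tiny fraction of one bit

# Independent events: probabilities multiply, information measures add.
bits_both = information_bits(p_royal * p_royal)
print(isclose(bits_both, bits_royal + bits_royal))   # True

# One particular string of eight equiprobable bits carries eight bits.
print(information_bits(1 / 2**8))                    # 8.0
```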
We may now summarize how information is measured
as follows. Given a collection of possibilities, and probabilities
assigned to those possibilities, the measure of information inherent
in one of those possibilities is the negative logarithm to the
base 2 of the probability of that possibility. This is harder
to say than to conceive, and is really quite straightforward.
One terminological point, however, is worth making. As a purely
formal object, the information measure I have just described is
a complexity measure (cf. Dembski, 1996, ch. 4). It is therefore
appropriate to speak of the "complexity of information"
and say that the complexity of information increases as the associated
information measure increases (and, correspondingly, as the associated
probability measure decreases). This notion of complexity is
important to biology since it is not just the origin of information
that stands in question, but the origin of complex information.
Complex Specified Information
With a means of measuring information in hand, we
turn now to the distinction between specified and unspecified
information. This is a vast and complicated topic whose full
elucidation is beyond the scope of this paper, requiring the formulation
of a substantial technical apparatus involving both probability
and complexity theory. All the painstaking details about specification
may be found in my monograph The Design Inference, which I expect
to have published next year. Nonetheless, in what follows I shall
try to make this distinction intelligible, and offer some hints
on how to make it rigorous.
For an intuitive grasp of the difference between
specified and unspecified information, consider the following
example. Suppose an archer stands 50 meters from a large blank
wall with bow and arrow in hand. The wall, let us say, is sufficiently
large that the archer cannot help but hit it. Consider now two
alternative scenarios. In the first scenario the archer simply
shoots at the wall. In the second scenario the archer first paints
a target on the wall, and then shoots at the wall, squarely hitting
the target in the bull's-eye. Let us suppose that in both
scenarios the precise place on the wall where the arrow lands
is identical. In both scenarios the arrow might have landed anywhere
on the wall. What's more, any place where it might land
is highly improbable. It follows that in both scenarios highly
complex information is actualized. Yet the conclusions we draw
from each scenario are very different. In the first scenario
we can conclude absolutely nothing about the archer's ability
as an archer. On the other hand, in the second scenario we have
evidence of the archer's skill.
The obvious difference between these two scenarios
is of course that in the first the information follows no pattern
whereas in the second it does. Now the information that tends
to interest us as rational inquirers is not the actualization
of arbitrary possibilities corresponding to no patterns, but rather
the actualization of circumscribed possibilities corresponding
to patterns. And indeed, when we speak of information in common
parlance, we typically do not mean the actualization of an arbitrary
possibility so much as the actualization of a possibility that
corresponds to a pattern. In fact, in the study of information,
patterns assume so great a significance that the patterns themselves
become identified as information. The patterns that represent
information, be they linguistic, pictorial, or mathematical, are
in common parlance what we mean by information. Yet in the service
of clarity it is useful to distinguish information qua the actualization
of a possibility from its representation qua some pattern.
All the same, information that corresponds to a pattern
still isn't quite enough to constitute specified information.
The problem is that patterns can be concocted after the fact
so that instead of helping us make sense of information, they
are merely read off already actualized information. To see this,
consider a third scenario in which an archer shoots at a wall.
As before, we suppose the archer stands 50 meters from a large
blank wall with bow and arrow in hand, the wall being so large
that the archer cannot help but hit it. And as in the first scenario,
the archer shoots at the wall while it is still blank. But this
time suppose that after having shot the arrow, and finding the
arrow stuck in the wall, the archer paints a target around the
arrow so that the arrow sticks squarely in the bull's-eye.
Let us further suppose that the precise place on the wall where
the arrow lands in this scenario is identical with where it landed
in the first two scenarios. Since any place where the arrow might
land is highly improbable, in this as in the other scenarios highly
complex information has been actualized. What's more,
since the information corresponds to a pattern, we can even say
that in this third scenario highly complex patterned information
has been actualized. Nevertheless, we would be wrong to say that
highly complex specified information has been actualized. Of
the three scenarios, only the information in the second scenario
is specified. In that scenario, by first painting the target
and then shooting the arrow, the pattern is given independently
of the information. On the other hand, in this, the third scenario,
by first shooting the arrow and then painting the target around
it, the pattern is merely read off the information.
Specified information is always patterned information,
but patterned information is not always specified information.
For specified information not just any pattern will do. We may
therefore distinguish between "good" patterns and
"bad" patterns. The "good" patterns
will henceforth be called specifications. Specifications are
the independently given patterns that are not simply read off
information. By contrast, the "bad" patterns will
be called fabrications. Fabrications are the post hoc patterns
that are simply read off information.
Unlike specifications, fabrications are wholly uninformative.
We are no better off with a fabrication than without one. This
is clear from comparing the first and third scenarios. Whether
an arrow lands on a blank wall and the wall stays blank (as in
the first scenario), or an arrow lands on a blank wall and a target
is then painted around the arrow (as in the third scenario), any
conclusions we draw about the arrow's flight remain the
same. In either case chance is as good an explanation as any
for the arrow's flight. The fact that the target fixes
a pattern in the third scenario makes no difference since this
pattern is constructed only after the arrow has flown and landed.
Only when the pattern qua target is given in advance of the arrow
being shot does a hypothesis other than chance come into play.
Thus only in the second scenario does it make sense to ask whether
we are dealing with a skilled archer. Only in the second scenario
does the pattern constitute a specification. In the third scenario
the pattern constitutes a mere fabrication.
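The contrast between the second and third scenarios can be mimicked in a small simulation. This is only a sketch under simplifying assumptions that are not part of the scenarios themselves: the wall is reduced to a line of unit width, a chance-driven archer lands uniformly at random on it, and the target radius, trial count, and random seed are arbitrary illustrative choices. A target fixed before the shot is hit only rarely by chance, whereas a target painted around the arrow is hit every time and so tells us nothing about the archer.

```python
import random

def shoot(rng: random.Random) -> float:
    """A chance-driven shot: a uniformly random position on a wall of unit width."""
    return rng.random()

def hits(position: float, center: float, radius: float) -> bool:
    """True if the arrow lands within the target."""
    return abs(position - center) <= radius

rng = random.Random(0)   # fixed seed so the run is repeatable (arbitrary choice)
radius = 0.01            # hypothetical target radius
trials = 10_000

# Specification: the target is fixed before each shot is taken.
prior_center = 0.5
prior_hits = sum(hits(shoot(rng), prior_center, radius) for _ in range(trials))

# Fabrication: the target is painted around wherever the arrow landed.
post_hits = 0
for _ in range(trials):
    landing = shoot(rng)
    post_hits += hits(landing, center=landing, radius=radius)

print(prior_hits / trials)   # a small fraction, near 2 * radius
print(post_hits / trials)    # 1.0
```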
The distinction between specified and unspecified
information may now be defined as follows: the actualization
of a possibility (i.e., information) is specified if independently
of the possibility's actualization, the possibility is
identifiable via a pattern. If not, then the information is unspecified.
Note that this definition implies an asymmetry between specified
and unspecified information: specified information cannot become
unspecified information, though unspecified information may become
specified information. Unspecified information need not remain
unspecified, but may become specified as our background knowledge
increases. For instance, a cryptographic transmission whose cryptosystem
we have yet to break will constitute unspecified information.
Yet as soon as we break the cryptosystem, the cryptographic transmission
becomes specified information.
What is it for a possibility to be identifiable via
an independently given pattern? A full exposition of specification
requires a detailed answer to this question. Unfortunately, such
an exposition is beyond the scope of this paper. The key conceptual
difficulty here is to characterize the independence condition
that obtains between patterns and information. This independence
condition in turn decomposes into two conditions: (1) a condition
of stochastic conditional independence between the information
in question and certain relevant background knowledge; and (2)
a tractability condition whereby the pattern in question is constructible
via the aforementioned background knowledge. Although these conditions
make good intuitive sense, they are not easily formalized. For
the details refer to my monograph The Design Inference.
If formalizing what it means for a pattern to be
given independently of a possibility is difficult, determining
in practice whether a pattern is given independently of a possibility
is much easier. If the pattern is given prior to the possibility
being actualized (as in the second scenario above, where
the target was painted before the arrow was shot), then the
pattern is automatically independent of the possibility, and we
are dealing with specified information. Patterns given prior
to the actualization of a possibility are just the rejection regions
of statistics. There is a well-established statistical theory
that describes such patterns and their use in probabilistic reasoning.
These are clearly specifications since having been given prior
to the actualization of some possibility, they have already been
identified, and thus are identifiable independently of the possibility
being actualized.
Many of the interesting cases of specified information,
however, are those in which the pattern is given after a possibility
has been actualized. This is certainly the case with the origin
of life: life originates first and only afterwards do pattern-forming
rational agents (like ourselves) enter the scene. It remains
the case, however, that a pattern corresponding to a possibility,
though formulated after the possibility has been actualized, can
constitute a specification. Certainly this was not the case in
the third scenario above where the target was painted around the
arrow only after it hit the wall. But consider the following
example. Alice and Bob are celebrating their fiftieth wedding
anniversary. Their six children all show up bearing presents.
Each present is part of a matching set of china. There is no
duplication of presents, and together the presents form a complete
set of china. Suppose Alice and Bob were satisfied with their
old set of china, and had no inkling prior to opening their presents
that they might expect a new set of china. Alice and Bob are
therefore without a relevant pattern to which to refer their presents
prior to actually receiving the presents from their children.
Nevertheless, the pattern they explicitly formulate only after
receiving the presents could be formed independently of receiving
the presents (after all, their colluding children formed just
such a pattern prior to delivering their presents; so too, the
china manufacturer formed this pattern to construct the china
in the first place). This pattern therefore constitutes a specification.
But what about the origin of life? Is the origin
of life specified? If so, to what patterns does life correspond,
and how are these patterns given independently of life's
origin? As was just pointed out, pattern-forming rational agents
like ourselves don't enter the scene till after life originates.
Nonetheless, there are functional patterns to which life corresponds,
and which are given independently of the actual living systems.
An organism is a functional system comprising many functional
subsystems. The functionality of organisms can be cashed out
in any number of ways. Arno Wouters (1995) cashes it out globally
in terms of viability of whole organisms. Michael Behe (1996)
cashes it out in terms of the irreducible complexity and minimal
function of biochemical systems. Even the staunch Darwinist Richard
Dawkins will admit that life is specified functionally, cashing
out the functionality of organisms in terms of genetic reproduction.
Thus Dawkins (1987, p. 9) will write: "Complicated things
have some quality, specifiable in advance, that is highly unlikely
to have been acquired by random chance alone. In the case of
living things, the quality that is specified in advance is . . .
the ability to propagate genes in reproduction."
Life is specified. Life is also complex. The origin
of life is the origin of complex specified information. This
then, suitably reformulated, is Manfred Eigen's problem: how
to explain the origin of complex specified information. Complex
specified information, or CSI for short, is what all the fuss
over information has been about in recent years, not just in biology,
but within science more generally. It is CSI that the various
anthropic principles are trying to explain when they account for
the fine-tuning of the universe (cf. Barrow and Tipler, 1986).
It is CSI that David Bohm's quantum potentials are extracting
when they scour the microworld for what Bohm calls "active
information" (cf. Bohm, 1993, pp. 35-38). It is CSI that
enables Maxwell's demon to outsmart a thermodynamic system
tending towards thermal equilibrium (cf. Landauer, 1991, p. 26).
It is CSI that David Chalmers posits in attempting to explain
human consciousness (cf. Chalmers, 1996, ch. 8). It is CSI that
the mathematician Keith Devlin (1991, p. 1) intends when
he writes: "That there is such a thing as information
cannot be disputed. . . . After all, our very
lives depend upon it, upon its gathering, storage, manipulation,
transmission, security, and so on. Huge amounts of money change
hands in exchange for information. People talk about it all the
time. Lives are lost in its pursuit. Vast commercial empires
are created in order to manufacture equipment to handle it. Surely
then it is there."
Nor is CSI confined to the domain of science. CSI
is indispensable in our everyday lives. The 16-digit number on
your VISA card is an example of CSI. The complexity of this number
ensures that a would-be thief cannot randomly pick a number and
have it turn out to be a valid VISA card number. What's
more, the specification of this number ensures that it is your
number, and not anyone else's. Even your phone number
constitutes CSI. As with the VISA card number, the complexity
ensures that this number won't be dialed randomly (at least
not too often), and the specification ensures that this number is yours
and yours only. All the numbers on our bills, credit slips, and
purchase orders represent CSI. CSI makes the world go round.
Consequently, CSI is also a ripe field for criminality. CSI
is what motivated the villainous Michael Douglas character in
the movie Wall Street to lie, cheat, and steal. CSI's
total and absolute control was the objective of the monomaniacal
Ben Kingsley character in the movie Sneakers. CSI is the artifact
of interest in most techno-thrillers. Ours is an information
age, and the information that excites us most is CSI.
The Law of Conservation of Information
With this characterization of CSI in hand, I want
now to return to Manfred Eigen's central problem: the
origin of CSI. Where does CSI come from, and where is CSI incapable
of coming from? According to Eigen, CSI comes from algorithms
and natural laws. To recall Eigen's dictum: "Our
task is to find an algorithm, a natural law that leads to the
origin of [complex specified] information." The only
question for Eigen is which algorithms and natural laws explain
the origin of CSI. The logically prior question of whether algorithms
and natural laws are even in-principle capable of explaining the
origin of CSI is not one he properly considers. And yet it is
a question whose answer vitiates Eigen's entire project.
Algorithms and natural laws are in-principle incapable of explaining
the origin of information. To be sure, algorithms and natural
laws can explain the flow of information. Indeed, algorithms
and natural laws are ideally suited for transmitting already existing
information. What they cannot do, however, is originate information.
The easiest way to see this is mathematically. From
a mathematical point of view algorithms and natural laws are just
functions, that is, relations between two sets that associate with every
member of one set (called the domain) one, and only
one, member of the other set (called the range). As such, the
functional relationship is fully deterministic: given an element
in the domain, the function determines a unique element in the
range. For algorithms the domain comprises the various possible
input data, and the range the various possible output data. For
natural laws the domain comprises the various possible initial
and boundary conditions, and the range the various possible states
at subsequent times t. Now suppose we had some CSI j, and a function
(qua algorithm or natural law) f that, to quote Eigen again, led
to the origin of the [complex specified] information j. This
would mean that some element in the domain of f, call it i, when
acted on by f, yielded the output j. But this hardly explains
the origin of the information j. One problem has been solved
by creating another, for now the origin of i must be explained.
Worse yet, the newly created problem is no easier
than the one we started with. Functional relationships at best
preserve what information is already there, or else degrade it, but
they can never add to it. Thus however much information resides
in j will be contained in any i that via the function f maps onto
j. What's more, if j is specified, then the inverse image
under the function f will also be specified (in particular, since
i maps onto j via f, i is in this inverse image). In short, if
j constitutes complex specified information and f is a function
that maps i onto j, then i constitutes specified information at
least as complex as j. Thus instead of explaining the origin
of CSI, algorithms and natural laws shift the problem elsewhere
to a place where the origin of CSI will be at least as difficult
to explain.
It is vital to realize that functions can only make
the information problem worse. Suppose, for instance, you look
at the U.S. Statistical Abstract and find that the average income
of a U.S. citizen is so-much-and-so-much. How did this item of
information originate? Well, the census bureau had to contact
all the U.S. citizens, record their individual incomes, add the
incomes all together, and divide by the number of U.S. citizens.
To take an average is thus to apply a function: given the
input data (all the individual U.S. incomes), the output data
is uniquely determined. But more so, to take an average is also
to compress data. The information inherent in the record of all
individual incomes far exceeds the information inherent in the
corresponding average. Taking an average is a standard statistical
technique for compressing data. In an information age we are
inundated with information. Thus frequently when we look at information,
we look at information whose complexity (as a service to
the information seeker) has already been drastically compressed.
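The point about averaging can be made concrete with a toy calculation. The income figures below are invented for illustration and are not census data. The sketch shows that the average is a function in the sense defined above (each record of incomes determines exactly one average) and that it compresses: distinct records can yield the same average, so the record cannot be recovered from the average alone.

```python
from statistics import mean

# Hypothetical income records (illustrative numbers, not census data).
incomes_a = [20_000, 40_000, 60_000]
incomes_b = [10_000, 50_000, 60_000]

# Each input record determines exactly one output value.
print(mean(incomes_a))   # 40000
print(mean(incomes_b))   # 40000

# The mapping is many-to-one: two different records, one and the same average.
print(incomes_a != incomes_b and mean(incomes_a) == mean(incomes_b))   # True
```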
There is one subtlety we need now to consider, and
it is the one which not just Manfred Eigen, but also Ilya Prigogine,
Stuart Kauffman, and indeed the entire Santa Fe Institute group
are pinning their hopes on. I have just argued that when a function
acts to yield information, what the function acts upon has at
least as much information as what the function yields. This argument,
however, treats functions as mere conduits of information, and
does not take seriously the possibility that functions might actually
add information. I gave the example of taking an average whereby
data is compressed and information is lost. But consider the
function which maps library call numbers to their corresponding
books. Clearly, there is less information in the call numbers
than in the books. Thus here we have a function that is adding
information. What's more, it is adding information because
the information is embedded in the function itself.
Although this observation seems to undermine my previous
argument, in fact it leaves the previous argument virtually unchanged.
The point is that instead of the function f now merely serving
as a conduit taking information i and yielding information j,
the information in f must now itself be taken into account. The
way to do this is to employ the universal composition function
U, which to an ordered information-function pair (i,f) assigns
the information obtained by applying f to i, in this case
j. Thus U(i,f) = f(i) = j. Now unlike f, which may well incorporate
information, U, the universal composition function, incorporates
no information of its own, but serves merely as a conduit for
information. By simply taking ordered pairs, and treating the
second element as a function applied to the first, U introduces
no information of its own. U adds no information. Note that
in the case of algorithms U is just a universal Turing machine.
The form of my original argument is therefore unchanged: the
information j arises by applying U (cf. f in the original argument)
to the information (i,f) (cf. i in the original argument). Just
as in computer science the distinction between data and programs
is not hard and fast, so the distinction between functions and
information is not hard and fast. We can thus treat the ordered
pair (i,f) as information which via the universal composition
function yields the information j. And now it is clear that the information
inherent in (i,f) exceeds that in j. Like a bulge under a rug,
the information problem can be shifted around, but it does not
go away.
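The universal composition function is simple enough to write out directly. In the sketch below the call number, the catalogue contents, and the function name lookup are all hypothetical stand-ins for the library example; only the definition U(i,f) = f(i) comes from the argument above. The body of U contains nothing beyond the act of application, while whatever the output j contains has to be supplied through the pair (i,f), here through the catalogue stored inside f.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")

def U(i: A, f: Callable[[A], B]) -> B:
    """Universal composition: apply the second element of the pair to the first."""
    return f(i)

# Hypothetical catalogue: the mapping from call numbers to books lives inside f.
catalogue = {"QA76.9": "full text of some book on computing ..."}

def lookup(call_number: str) -> str:
    """The function f of the library example: call number in, book out."""
    return catalogue[call_number]

j = U("QA76.9", lookup)
print(j == lookup("QA76.9"))   # True: U(i,f) = f(i) = j
```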
This argument, by employing the universal composition
function, is perfectly general. In particular, it answers the
attempt by complexity-theorists to account for the origin of information
in terms of dynamical systems (for popular accounts of this enterprise
see Levy 1992 and Waldrop 1992). Complexity-theorists, especially
the Santa Fe Institute group, continue to hope that information
can be gotten on the cheap. "Look at all those amazing
fractal patterns," we are told. "The incredibly
intricate Mandelbrot set is generated by so modest a complex function
as h(z) = z^2 + c." To state the matter in
this way, however, is misleading. The function h(z) = z^2 + c
is simple enough, and even simpler to write down. And granted,
it is the crucial element in constructing a graphic depiction
of the Mandelbrot set. But that is the point: It is the graphic
depiction of the Mandelbrot set that has to be explained, not
its existence as an abstract mathematical object. And this graphic
depiction has to be constructed.
Pixels on a computer screen have to be assigned coordinates
representing complex numbers. The function h(z) = z^2 + c has
to be iterated with respect to those coordinates. The trajectory
of those iterations needs to be tracked to see if the trajectory
stays locally bounded or heads off towards infinity. Given these
trajectories, a color has to be assigned to the pixel, black if
the trajectory stays locally bounded, white if it heads off to
infinity. All of this must be programmed. All of this is information
far exceeding the information inherent in simply writing down
"h(z) = z^2 + c." The function h(z) = z^2 + c is
never the function that produces the pretty graphic depictions
of the Mandelbrot set we see in books on fractals. Any function
that produces a graphic depiction of the Mandelbrot set will be
a complicated algorithm employing a complicated set of input data.
Any such algorithm f applied to a data set i can be conjoined
as an ordered pair (i,f), and then evaluated by the universal
composition function U to produce a graphic depiction of the Mandelbrot
set j. But by itself the function h(z) = z^2 + c is too information-poor
to produce this graphic depiction of the Mandelbrot set j. Once
we examine the precise informational antecedents to j, the illusion
that we have generated information for nothing disappears.
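The steps just listed can be sketched as a short program. Several choices below are assumptions introduced only to make the sketch run: the region of the complex plane covered, the grid size, the iteration cutoff of 50, the escape radius of 2, and the use of "#" and a blank in place of black and white pixels. Because only finitely many iterations are checked, "stays bounded" is approximated rather than decided. Even in this reduced form, the rule z^2 + c occupies a single line, and everything else has to be supplied around it.

```python
def stays_bounded(c: complex, max_iter: int = 50, escape_radius: float = 2.0) -> bool:
    """Iterate z -> z**2 + c from z = 0; True if |z| never exceeds escape_radius
    within max_iter steps (a finite approximation of boundedness)."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > escape_radius:
            return False
    return True

def render(width: int = 60, height: int = 24) -> list:
    """Assign each grid cell a complex coordinate and 'color' it by its trajectory."""
    rows = []
    for row in range(height):
        im = 1.2 - 2.4 * row / (height - 1)        # imaginary part, top to bottom
        line = ""
        for col in range(width):
            re = -2.0 + 3.0 * col / (width - 1)    # real part, left to right
            line += "#" if stays_bounded(complex(re, im)) else " "
        rows.append(line)
    return rows

if __name__ == "__main__":
    print("\n".join(render()))
```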
The origin of CSI simply cannot be explained as the
output of a function, be it an algorithm, a natural law, or whatever.
The root problem here is that functions are deterministic, and
thus cannot yield contingency. Recall that information becomes
realizable only when a multiplicity of distinct possibilities obtains,
any one of which might actually happen. The problem with functions
is that they invariably yield only a single live possibility.
Take a computer algorithm that performs addition. Let us say
the algorithm has a correctness proof, so that it performs its
additions correctly. Given the input data 2+2, can the algorithm
output anything other than 4? Algorithms, and functions more
generally, are wholly deterministic. They allow for no contingency,
and thus can generate no information. At best functions can shift
information around, or lose it, as when data gets compressed.
What they cannot do is produce contingency. And without contingency
they cannot generate information.
If not by means of functions, how then does contingency
arise? Two, and only two, answers are possible here. Either
the contingency is a blind, purposeless contingency, which
is chance; or it is a guided, purposeful contingency, which
is intelligent causation. We shall return to intelligent causation
in due course, but for now let us examine whether chance is capable
of generating CSI. First notice that pure chance, entirely unsupplemented
and left to its own devices, is incapable of generating CSI.
Chance can generate complex unspecified information, and chance
can generate non-complex specified information. What chance cannot
generate is information that is jointly complex and specified.
To see this, consider again our archer friend who
fires arrows at a large blank wall. The archer, even if driven
purely by chance, is perfectly capable of generating complex unspecified
information: the precise place where the arrow hits a large blank
wall signifies a highly improbable unspecified event, instancing
complex unspecified information (recall that high probability
corresponds to low complexity whereas low probability, i.e.,
high improbability, corresponds to high complexity). Alternatively,
if a target is painted on the wall, but the target is so large
that the bull's-eye takes up half the area of the wall,
then the archer, even if driven purely by chance, will be quite
likely to hit the bull's-eye, thereby generating non-complex
specified information: hitting the bull's-eye signifies
a specified high-probability event, instancing non-complex specified
information. What an archer driven purely by chance cannot do
is, having painted a minuscule target on the wall, hit the bull's-eye,
thereby generating information that is both complex and specified:
hitting the bull's-eye of a minuscule target signifies
a highly improbable specified event, instancing complex specified
information, i.e., CSI.
But can't someone simply by chance let fly
an arrow and hit a bull's-eye? Not if the target is sufficiently
small. At some point the improbabilities become too vast and
the specifications too tight for chance to be taken seriously.
Just where this point is first reached can be debated, but that
there is a probabilistic cut-off beyond which chance becomes an
unacceptable explanation is beyond doubt. The universe will experience
heat death before random typing at a keyboard produces a Shakespearean
sonnet. The French mathematician Emile Borel (1962, p. 28) proposed
10^-50 as a universal probability bound below which chance
could definitely be precluded, i.e., any specified event as improbable
as this could not be attributed to chance. Borel based his universal
probability bound on cosmological considerations, taking into
account the opportunities to repeat and observe events through
the history and expanse of the universe. Borel's 10^-50
probability bound translates into roughly 170 bits of information. I
have proposed a more stringent universal probability bound of
10^-150 based on the number of elementary particles in the
universe, the Planck time, and the duration of the universe until
its heat death (see Dembski, 1996, ch. 6). A probability bound
of 10^-150 translates into roughly 500 bits of information. The bound
I propose is more securely justified than Borel's. Given
a universal probability bound of 10^-150 we therefore refuse
to attribute to chance specified information with a complexity
of 500 or more bits. I have yet to encounter CSI with a complexity
greater than the 500 bits for which chance is an adequate explanation.
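The conversion from a probability bound to bits is the usual one: an event of probability p carries -log2(p) bits. As a quick check under that assumption, the following sketch (the function name is hypothetical) computes the figure for each bound. The exact values come out near 166 and 498 bits, which the text rounds to 170 and 500.

```python
import math


def bits_from_power_of_ten(exponent: int) -> float:
    """Information, in bits, of an event with probability 10**(-exponent).

    Uses -log2(10**-k) = k * log2(10) to avoid forming tiny floats.
    """
    return exponent * math.log2(10)


for label, exponent in [("Borel's bound", 50), ("proposed bound", 150)]:
    bits = bits_from_power_of_ten(exponent)
    print(f"{label}: 10^-{exponent} is about {bits:.1f} bits")
```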
Biologists by and large do not dispute this claim.
Most are agreed that pure chance, the Epicurean hypothesis
as Hume called it, is not an adequate explanation for CSI.
Jacques Monod (1972) is one of the few exceptions, arguing that
the origin of life, though vastly improbable, can nonetheless
be attributed to chance because of a selection effect. Just as
the winner of a lottery is shocked at winning, so we are shocked
to have evolved. But the lottery was bound to have a winner,
and so too something was bound to have evolved. Something vastly
improbable was bound to happen, and so, the fact that it happened
to us (i.e., that we were selected; hence the name selection
effect) does not preclude chance. This is Monod's argument
and it is fallacious. It has been refuted by the philosophers
John Earman, William Craig, and Richard Swinburne. It has also
been refuted by the biologists Wolfgang Stegmüller, Bernd-Olaf
Küppers, and Hubert Yockey. Swinburne's refutation
is perhaps the most memorable (Swinburne, 1979, p. 138):
Suppose that a madman kidnaps a victim and shuts
him in a room with a cardshuffling machine. The machine shuffles
ten packs of cards simultaneously and then draws a card from each
pack and exhibits simultaneously the ten cards. The kidnapper
tells the victim that he will shortly set the machine to work
and it will exhibit its first draw, but that unless the draw consists
of an ace of hearts from each pack, the machine will simultaneously
set off an explosion which will kill the victim, in consequence
of which he will not see which cards the machine drew. The machine
is then set to work, and to the amazement and relief of the victim
the machine exhibits an ace of hearts drawn from each pack. The
victim thinks that this extraordinary fact needs an explanation
in terms of the machine having been rigged in some way. But the
kidnapper, who now reappears, casts doubt on this suggestion.
"It is hardly surprising," he says, "that
the machine [drew] only aces of hearts. You could not possibly
see anything else. For you would not be here to see anything
at all, if any other cards had been drawn." But of course
the victim is right and the kidnapper is wrong. There is indeed
something extraordinary in need of explanation in ten aces of
hearts being drawn. The fact that this peculiar order is a necessary
condition of the draw being perceived at all makes what is perceived
no less extraordinary and in need of explanation.
Selection effects do nothing to render chance an
adequate explanation of complex specified information. For a
detailed treatment of selection effects and their failure to account
for CSI, see Dembski (1996, sec. 6.3).
Most biologists then reject pure chance as an adequate
explanation of CSI. The problem here is not simply one of faulty
statistical reasoning. Besides flying in the face of every canon
of sound statistical reasoning, pure chance is scientifically
unsatisfying as an explanation of CSI. To explain CSI in terms
of pure chance is no more instructive than pleading ignorance
or proclaiming CSI a mystery. It is one thing to explain the
occurrence of heads on a coin toss by appealing to chance. It
is quite another, as Küppers (1990, p. 59) points out, to
follow Monod and take the view that "the specific sequence
of the nucleotides in the DNA molecule of the first organism came
about by a purely random process in the early history of the earth."
CSI cries out for explanation, and pure chance won't do
it. Richard Dawkins (1987, pp. 139, 145-146) makes this point
eloquently:
We can accept a certain amount of luck in our explanations,
but not too much. . . . In our theory of how
we came to exist, we are allowed to postulate a certain ration
of luck. This ration has, as its upper limit, the number of eligible
planets in the universe. . . . We [therefore]
have at our disposal, if we want to use it, odds of 1 in 100 billion
billion as an upper limit (or 1 in however many available planets
we think there are) to spend in our theory of the origin of life.
This is the maximum amount of luck we are allowed to postulate
in our theory. Suppose we want to suggest, for instance, that
life began when both DNA and its protein-based replication machinery
spontaneously chanced to come into existence. We can allow ourselves
the luxury of such an extravagant theory, provided that the odds
against this coincidence occurring on a planet do not exceed 100
billion billion to one.
Dawkins is right. We can allow our scientific theorizing
only so much luck. After that we degenerate into handwaving and
mystery. A probability bound of 10^-150, or a corresponding
complexity bound of 500 bits of information, sets a conservative
limit on the amount of luck we can allow ourselves (certainly
more conservative than the one Dawkins was just now alluding to).
We may summarize our findings up to this point as
follows: (1) Chance generates contingency, but not complex specified
information. (2) Functions (e.g., algorithms and natural laws)
generate neither contingency, nor information, much less complex
specified information. (3) At best functions transmit already
present information. Given these three findings, it seems intuitively
obvious that no chance-function combination is going to generate
information either. After all, functions transmit what they are
given, and whatever chance gives a function is not complex specified
information. Ergo, chance and functions working in tandem cannot
generate information. This intuition is of course exactly right,
and I shall provide a theoretical justification for it momentarily.
Nevertheless, the sense that functions can sift chance and thereby
generate CSI is deep-seated in the scientific community. Trial
and error is the basis for all sorts of probabilistic algorithms
(e.g., genetic algorithms), and what is trial and error but the
sifting of chance by means of a function? What's more,
the very Darwinian mechanism of mutation and natural selection
is a chance-function combination, in which the variability of
the organism provides the chance component, and selection pressure
from the environment provides the function component.
The theoretical justification for the inability of
chance and functions working in tandem to generate information
is virtually the same as the theoretical justification given earlier
for the inability of functions by themselves to generate information.
Instead of considering a deterministic function f(i) in one variable,
we now consider an indeterministic function f(i,w) in two variables
where the first variable signifies the object on which the function
acts, and the second the randomizing component. We then define
the universal composition function U which inputs the object-chance-function
ordered triple (i,w,f) and outputs f(i,w) = j, i.e., U(i,w,f)
= f(i,w) = j. As in the deterministic case, the universal composition
function U incorporates no information of its own, but serves
merely as a conduit for information. U adds no information.
The formalism just described for combining chance and functions
is perfectly general, and accommodates everything from Darwin's
mutation-selection mechanism to the probabilistic algorithms of
computer science (genetic algorithms being a case in point).
Now suppose we had some CSI j, and an indeterministic
function (i.e., chance-function combination) f that, to quote
Eigen again, led to the origin of the CSI j. The origin of the
CSI j can then be broken into two stages. In the first stage,
a chance outcome w occurs. Once w occurs and is fixed, the function
f becomes deterministic, i.e., f becomes the function in one variable
f(.,w) = f_w(.), w now being treated as a fixed parameter of the
function f. In the second stage, the parameterized deterministic
function f_w(.) gets applied to some element in its domain, call
it i, yielding the item of interest, the CSI j. From this it
is clear that neither of these stages can generate CSI. The first
stage involves only chance, and therefore, as was argued earlier,
cannot generate CSI. The second stage involves no chance, but
only a deterministic function, and therefore, as was argued earlier,
cannot generate CSI either. Thus at no point in the transition
from w to f_w(.) to f_w(i) = j is CSI created. Whatever CSI is
inherent in j is therefore already inherent in the indeterministic
function f together with the nonrandom element in the domain of
f, namely, i. This argument is valid and holds universally.
Just as chance or functions left to themselves individually cannot
purchase CSI, so too their joint action cannot purchase CSI either.
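The two-stage decomposition can be pictured with a small Python sketch. The particular function f below is a hypothetical stand-in for an indeterministic law (the chance outcome w picks a position in a string and a replacement letter), and the input string is likewise only an example. The sketch illustrates the formal step of fixing w and obtaining a one-variable function f_w; it does not attempt to measure information.

```python
import random
from functools import partial


def U(i, w, f):
    """Universal composition function: evaluate f at (i, w), adding nothing."""
    return f(i, w)


def f(i: str, w: int) -> str:
    """Hypothetical indeterministic law: w selects one position of i
    and the capital letter written into that position."""
    position = w % len(i)
    letter = chr(ord("A") + (w // len(i)) % 26)
    return i[:position] + letter + i[position + 1:]


i = "METHINKS"                    # the nonrandom element of the domain
w = random.randrange(10_000)      # stage 1: a chance outcome occurs and is fixed
f_w = partial(f, w=w)             # f(., w) = f_w(.), a function of i alone
j = f_w(i)                        # stage 2: deterministic application

print(j == U(i, w, f))            # same result as the combined formulation
print(f_w(i) == f_w(i))           # with w fixed, repeated evaluation agrees
```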
This result, that neither chance nor functions nor
some combination of the two can generate CSI, I call the Law of
Conservation of Information, or LCI for short. Though formulated
at a high level of mathematical abstraction, LCI has many profound
implications for science. Among its immediate corollaries are
the following: (1) The CSI within a system closed to outside
information always remains constant or decreases. (2) If CSI
increases within a system, then CSI was added exogenously. (3)
CSI cannot be generated spontaneously, originate endogenously,
or organize itself. (4) To explain the CSI within a system is
to appeal to a system whose CSI is equal or greater in complexity
still (in particular, reductive explanations of CSI are never
adequate).
Applying the Theory to Evolutionary Biology
Up to this point I have sketched a theory of complex
specified information, and concluded with a general law characterizing
the origin of complex specified information, to wit, the Law of
Conservation of Information. I want next to apply this theory
to evolutionary biology. Before doing so, however, it will be
convenient to provide a synonym for the term "function."
As I have used the term, function signifies a certain law-like
mathematical relation between two sets. In the sequel it will
therefore be convenient to use the word "law" to
signify functions. If we do this, the Law of Conservation of
Information has the following perspicuous formulation: Neither
law nor chance nor some combination of the two can generate complex
specified information. The reference to functions was useful
so long as their mathematical properties were being explicitly
cited. But continued reference to them, especially when juxtaposed
with chance, will henceforth tend to obscure rather than clarify.
Thus in particular we shall refer to deterministic laws as functions
of the form f(i) = j and indeterministic laws as chance-function
combinations of the form f(i,w) = j with random component w.
In applying the theory of information here developed
to evolutionary biology, let us begin by noting that nothing in
this theory so far undermines the naturalistic accounts of evolution
currently in vogue. All that has been shown so far is that CSI
is not a free lunch in the sense that law and chance together
cannot generate CSI. But law and chance can take already existing
CSI and shift it around. And there is nothing to prevent CSI
from being abundant in the universe, and thus to prevent law and
chance from expressing CSI in the origin and development of biological
systems. With Hubert Yockey (1992, p. 335) we could therefore
say that CSI, and by implication life, is axiomatic, and leave
it at that. Like the principle of rationality which according
to the ancient Stoics pervaded the universe, we could simply treat
CSI as a given.
Although this move might be philosophically justified,
it remains scientifically unsatisfying. As scientists we want
to know how the CSI which supposedly is so abundant in the universe
got itself into the organisms we see around us. In reference
to the origin of life, we want to know the informational pathway
that takes the CSI inherent in a lifeless universe, and translates
it into a protobiont. In reference to the development of life,
we want to know the informational pathway that takes the CSI inherent
in an already existing organism plus its environment, and translates
this CSI into an organism of still greater complexity. Even if
the origin of CSI admits no scientific explanation, its flow surely
does. How then does CSI flow into and out of biological systems?
The answer to this question, at least in broad terms,
is clear: The CSI inherent in an organism consists of the CSI
acquired at birth together with whatever CSI it acquires during
the course of its life. The CSI acquired at birth derives from
inheritance with modification (i.e., the CSI acquired at birth
is inherited from the parent(s) and consists of the CSI inherent
in the parent(s) as modified by chance). The CSI acquired after
birth consists of selection (i.e., the environmental pressure
that selects some organisms to reproduce and eliminates others
before they can reproduce) along with infusion (i.e., the direct
introduction of novel information from outside the organism).
The Darwinian mechanism admits selection and inheritance with
modification, but proscribes infusion. The Lamarckian mechanism,
on the other hand, focuses mainly on infusion. Certainly infusion
as Lamarck conceived it has largely been discredited. Nevertheless,
there is good scientific evidence for non-Lamarckian infusion
wherein organic informational structures belonging to one organism
are assimilated by another. For instance, it is well-established
that bacteria exchange plasmids as a way of developing antibiotic
resistance (cf. Amábile-Cuevas et al., 1995, p. 324).
On the other hand, Lynn Margulis's idea of symbiosis,
where organisms co-opt and assimilate other organisms to form
still more complex organisms, remains speculative (cf. Margulis,
1993).
Inheritance with modification, selection, and infusion: these
three account for the CSI inherent in biological systems. Together
they comprise all the sources of CSI in biology. I want therefore
to examine more closely the respective roles of these three sources
in contributing to the CSI of an organism. First consider inheritance
with modification (alternatively, inheritance and mutation).
Inheritance is merely a conduit for already existing information
and modification is merely chance operating on the information
passing through this conduit. It follows that by itself inheritance
with modification is incapable of explaining the increased complexity
of CSI that organisms have exhibited in the course of natural
history. Inheritance with modification needs therefore to be
supplemented.
The most obvious candidate here, of course, is selection.
Selection presupposes inheritance with modification, but instead
of merely shifting around already existing information, selection
also introduces new information. By seizing on advantageous modifications,
selection is able to introduce new information into a population.
The majority view in biology, known as the neo-Darwinian
synthesis, is that selection and inheritance with modification
together are adequate to account for all the CSI inherent in organisms.
As a parsimonious account of the origin and development of life,
this view has much to commend it. Unfortunately, this view places
undue restrictions on biological information flow, restrictions
which biological systems seem routinely to violate. The problem
is that selection and inheritance with modification can only yield
very gradual increases in the informational complexity of organisms,
whereas many of the increases in the informational complexity
of organisms are abrupt and large.
This point deserves careful attention. Suppose that
an organism in reproducing generates N offspring, and that of
these N offspring M (1 ≤ M ≤ N)
succeed in reproducing. The amount of information introduced
through selection is then -log2(M/N). Let me stress that this
formula is not a case of misplaced mathematical exactness. This
formula holds universally and is non-mysterious. Take a simple
non-biological example. If I am sitting at a radio transmitter,
and can transmit only zeros and ones, then every time I transmit
a zero or one, I choose between two possibilities, selecting precisely
one of them. Here N equals 2 and M equals 1. The information
-log2(M/N) thus equals -log2(1/2) = 1, i.e., 1 bit of information
is introduced every time I transmit a zero or one. This is of
course as things should be. Now this example from communication
theory is mathematically isomorphic to the case of cell-division
where only one of the daughter cells goes on to reproduce. On
the other hand, if both daughter cells go on to reproduce, then
N equals M equals 2, and thus -log2(M/N) = -log2(2/2) = 0,
indicating that selection, by failing to eliminate any possibility,
failed also to introduce new information. To take another example,
imagine you are typing at a keyboard consisting of the twenty-six
capital Roman letters. Thus every time you type a key you select
one of twenty-six letters. Here N equals 26 and M equals 1. The
information -log2(M/N) thus equals -log2(1/26) = 4.7, i.e.,
4.7 bits of information are introduced every time you type a key.
Or consider a dog breeder who from a given litter of seven Boston
terriers selects two for reproduction. The dog breeder thus introduces
-log2(2/7) = 1.8 bits of information into those Boston terriers
selected for reproduction. (In the formula -log2(M/N) and throughout
these examples I have assumed a uniform probability distribution.
This simplifying assumption, however, only strengthens our case:
since uniform probability distributions maximize entropy, on
average the information introduced through selection will in fact
fall below -log2(M/N).)
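The worked examples in this paragraph can be reproduced with a few lines of Python. The function name selection_bits is a hypothetical label for the formula -log2(M/N); like the text, the sketch assumes a uniform probability distribution over the N offspring.

```python
import math


def selection_bits(m: int, n: int) -> float:
    """Bits introduced when M of N equally likely offspring are selected.

    Computes -log2(M/N), written as log2(N/M) so that M = N gives 0.0.
    """
    if not 1 <= m <= n:
        raise ValueError("require 1 <= M <= N")
    return math.log2(n / m)


examples = [
    ("radio transmitter, one of two symbols", 1, 2),
    ("cell division, both daughters reproduce", 2, 2),
    ("one key out of twenty-six letters", 1, 26),
    ("two Boston terriers out of a litter of seven", 2, 7),
]
for label, m, n in examples:
    print(f"{label}: {selection_bits(m, n):.1f} bits")
```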
It's therefore clear that selection among
the offspring of an organism can at most introduce a few bits
of information. Cell division, the preeminent form of reproduction,
and the only one prior to multi-cellular life, introduces at most
one bit of information. Even if an organism can produce 10^30
gametes (so many gametes that their biomass would equal
that of the earth), and each of these became a mature organism,
and then only one of these mature organisms were selected for
further reproduction, the total number of bits of information
introduced through selection would in this instance be -log2(1/10^30)
≈ 100. A hundred bits of information is far less information
than is contained in an average protein.
From these observations it is clear that selection
can accumulate a lot of information over successive generations.
As is noted in Joklik and Willett's (1976, p. 78) microbiology
text, "Within a short period, often as short as 20 minutes,
a bacterium can create a complete duplicate of itself, which in
turn is capable of duplicating." Over a billion years,
at one bit of information introduced every twenty minutes, selection
could in principle produce 26 trillion bits of information, certainly
enough to handle any conceivable genome. Nonetheless, from these
observations it is equally clear that selection can only produce
a very limited amount of information at any one generation. 100
bits is certainly too generous. The most fecund breeders with
which I am familiar are certain fish whose spawn include a hundred
million eggs. A realistic upper limit on the amount of biological
information introduced by selection is therefore around 30 bits.
For many organisms it is far less. Mammals, for instance, have
an upper limit of about 5 bits of information per generation through
selection.
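The figures in the last two paragraphs follow from simple arithmetic, sketched below in Python. The 365.25-day year and the helper names are assumptions made for the illustration. Under the same uniform-selection formula, 10^30 gametes give just under 100 bits, a spawn of a hundred million eggs gives about 27 bits (consistent with the rounded limit of 30), and 5 bits corresponds to selecting one of 32 offspring.

```python
import math

MINUTES_PER_YEAR = 365.25 * 24 * 60   # assumed length of a year


def generations_in(years: float, minutes_per_generation: float) -> float:
    """Number of generations that fit into the given span of years."""
    return years * MINUTES_PER_YEAR / minutes_per_generation


def bits_for_one_survivor(offspring: float) -> float:
    """Bits introduced when exactly one of the offspring is selected."""
    return math.log2(offspring)


# One bit per twenty-minute generation over a billion years.
print(f"{generations_in(1e9, 20):.3g} generations, hence as many bits")

# Per-generation ceilings for the cases mentioned in the text.
print(f"10^30 gametes: {bits_for_one_survivor(1e30):.1f} bits")
print(f"10^8 eggs:     {bits_for_one_survivor(1e8):.1f} bits")
print(f"5 bits corresponds to 1 of {2 ** 5} offspring")
```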
The preceding analysis gives new urgency to Darwin's
(1859, p. 189) famous challenge: "If it could be demonstrated
that any complex organ existed, which could not possibly have
been formed by numerous, successive, slight modifications, my
theory would absolutely break down." In information-theoretic
terms, this is to say that if informational jumps of considerably
more than thirty bits are required in any one generation, then
some means of producing information other than selection must
be sought. Have such informational jumps been discovered? Darwin
and his disciples believe in the infinite plasticity of organisms
to change gradually from one form into another. This belief,
however, no longer seems justified.
Perhaps the clearest examples of informational jumps
that exceed the power of selection occur in biochemistry. Michael
Behe (1996) and Siegfried Scherer (1983) have both examined biochemical
systems which if produced by selection need to be produced in
a single generation, but whose information requirements exceed
what selection can deliver in a single generation. The key feature
of these biochemical systems is one Behe calls irreducible complexity.
A system is irreducibly complex if it consists of several interrelated
components the removal of any one of which leads to the complete
loss of function of the system. As an example of irreducible
complexity, Behe (1996, p. 43) offers a mousetrap. A mousetrap
consists of a platform, a hammer, a spring, a catch, and a holding
bar. Remove any one of these five components, and it is impossible
to construct a functional mousetrap. Irreducible complexity needs
to be contrasted with reducible complexity. A system is reducibly
complex if it contains a dispensable component, i.e., a component
which can be removed without destroying functionality. An example
of a reducibly complex system is a pocket watch. The glass face
that covers and protects the dial is not necessary for the watch
to keep time. It can be removed without destroying the watch's
function (function may be diminished, but it is not lost).
Besides being contrasted with reducible complexity,
irreducible complexity needs also to be contrasted with cumulative
complexity. A system is cumulatively complex if the components
of the system can be arranged sequentially so that the successive
removal of components never leads to the complete loss of function.
An example of a cumulatively complex system is a city. It is
possible successively to remove people and services from a city
until one is down to a tiny village, all without losing the cohesiveness
of the community, which in this case constitutes functionality.
Note, however, that the order in which people and services are
removed is important. To remove as the first thing the police
and courts from a large city would result in chaos. Observe that
it is possible to define cumulative complexity recursively in
terms of reducible complexity: A system is cumulatively complex
if it is reducibly complex, and if after the removal of some component
from the system, the system is again cumulatively complex. It
follows that cumulatively complex systems are always reducibly
complex. The converse, however, is not the case. Reducibly complex
systems may contain an irreducibly complex core, and thus fail
to be cumulatively complex. For instance, a pocket watch, though
reducibly complex, contains certain ineliminable components without
which the watch cannot function, e.g., hour and minute hands,
certain gears and springs, and a base to keep all these elements
together. Such ineliminable components form the irreducible core
of the pocket watch.
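The three definitions, including the recursive one, can be sketched in Python under some loud assumptions. A system is modeled as a set of named components together with a Boolean predicate saying whether a given set of components still functions. The recursion is given a base case that the text leaves implicit: a functioning system with a single component counts as cumulatively complex. The mousetrap, watch, and city predicates are toy models invented for the illustration; in particular, the rule that a city of more than two components needs police is only a stand-in for the remark that the order of removal matters.

```python
from typing import Callable, FrozenSet

System = FrozenSet[str]
Works = Callable[[System], bool]


def is_irreducibly_complex(parts: System, works: Works) -> bool:
    """Functional, and removing any single component destroys function."""
    return works(parts) and all(not works(parts - {p}) for p in parts)


def is_reducibly_complex(parts: System, works: Works) -> bool:
    """Functional, and at least one component is dispensable."""
    return works(parts) and any(works(parts - {p}) for p in parts)


def is_cumulatively_complex(parts: System, works: Works) -> bool:
    """Recursive definition; the single-component base case is assumed."""
    if not works(parts):
        return False
    if len(parts) == 1:
        return True
    return any(
        works(parts - {p}) and is_cumulatively_complex(parts - {p}, works)
        for p in parts
    )


MOUSETRAP = frozenset({"platform", "hammer", "spring", "catch", "holding bar"})
WATCH = frozenset({"hands", "gears", "base", "glass"})
CITY = frozenset({"residents", "shops", "courts", "police"})


def mousetrap_works(parts: System) -> bool:
    return parts == MOUSETRAP


def watch_works(parts: System) -> bool:
    return {"hands", "gears", "base"} <= parts


def city_works(parts: System) -> bool:
    # Toy rule: needs residents; anything larger than a village needs police.
    return "residents" in parts and ("police" in parts or len(parts) <= 2)


print(is_irreducibly_complex(MOUSETRAP, mousetrap_works))
print(is_reducibly_complex(WATCH, watch_works),
      is_cumulatively_complex(WATCH, watch_works))
print(is_cumulatively_complex(CITY, city_works))
```

In this toy model the watch comes out reducibly but not cumulatively complex, because removing the glass leaves a core from which nothing further can be removed.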
Given these types of complexity (irreducible,
reducible, and cumulative), it is clear that selection can
account for cumulative complexity. The gradual accrual of information
via selection mirrors the retention of function as components
are removed in cumulative complexity. Selection has no problem
producing cumulative complexity. But what about irreducible complexity?
Can selection produce irreducible complexity? Certainly if selection
acts with reference to a goal, it can produce an irreducibly complex
system. Take Michael Behe's mousetrap, for instance.
Given the goal of constructing a mousetrap, one can specify a
goal-directed selection process that in turn selects a platform,
a hammer, a spring, a catch, and a holding bar, and at the end
puts all these components together to form a functional mousetrap.
Given a pre-specified goal, selection has no difficulty producing
irreducibly complex systems.
But the selection that operates in biology is Darwinian
natural selection. And this form of selection operates without
goals, has neither plan nor purpose, and is wholly undirected
(cf. Miller and Levine, 1993, p. 658). The great appeal
of Darwin's selection mechanism was precisely that it would
eliminate teleology from biology. Yet by making selection an
undirected process, Darwin drastically abridged the type of complexity
biological systems could manifest. Henceforth biological systems
could manifest only cumulative complexity, not irreducible complexity.
Why is this? As Behe (1996, p. 39) explains, "An irreducibly
complex system cannot be produced . . . by slight,
successive modifications of a precursor system, because any precursor
to an irreducibly complex system that is missing a part is by
definition nonfunctional. . . . Since natural
selection can only choose systems that are already working, then
if a biological system cannot be produced gradually it would have
to arise as an integrated unit, in one fell swoop, for natural
selection to have anything to act on."
Recall that for the complex specified information
inherent in organisms, what specifies this information is functionality.
The organism as a whole, as well as its various subsystems, are
specified in virtue of the respective functions these systems
perform. For irreducibly complex systems, however, function is
attained only when all components of a system are in place. Moreover,
natural selection, insofar as it introduces complex specified
information into organisms, must select for function. It follows
that natural selection, if it is going to produce an irreducibly
complex system, has to produce it all at once or not at all.
Of course, this would not be a problem if the amount of information
natural selection can produce in a single generation matches or
exceeds the amount of information inherent in the irreducibly
complex systems of biology. But nothing like this is the case.
Whereas natural selection at its very best can introduce about
30 bits of information per generation, the irreducibly complex
biochemical systems Michael Behe considers in Darwin's
Black Box contain several orders of magnitude more information.
These irreducibly complex biochemical systems, like the bacterial
flagellum, are protein machines consisting of numerous distinct
proteins, each indispensable for the function of the machine (hence
the irreducible complexity), and where each individual protein
in the machine requires more bits of information than natural
selection can conceivably produce in a single generation.
The irreducible complexity of biochemical systems
counts decisively against the joint action of selection and inheritance
with modification to account for the CSI in biological systems.
Because irreducible complexity occurs at the biochemical level,
there is no lower level of biological analysis to which the irreducible
complexity of biochemical systems might be referred, and at which
a Darwinian analysis in terms of selection and inheritance with
modification might still hope for success. Undergirding biochemistry
is ordinary chemistry and physics, neither of which can account
for biological information (cf. Yockey, 1992). Also, whether
a biochemical system is irreducibly complex is a fully empirical
question: Individually knock out each protein constituting an
irreducibly complex biochemical system, and determine whether
function is lost. If so, we are dealing with an irreducibly complex
system. Mutagenesis experiments of this sort are routine in biochemistry.
If the joint action of selection and inheritance
with modification is unable to account for the CSI in biological
systems (and specifically for the irreducible complexity of certain
biochemical systems like the bacterial flagellum), there remains
but one source for the CSI in biological systems, namely, infusion,
the direct introduction of novel information from outside the
biological system. In principle there is nothing problematic
or controversial about infusion. To innovate a given informational
structure an organism has informational needs, and these needs
can be supplied from outside the organism, either through selection
pressures (and therefore indirectly), or by the insertion of ready-to-go
information into the organism (and therefore directly). The latter
is of course infusion.
Although at this level of generality infusion is
unproblematic, it quickly becomes problematic once we start tracing
backwards the informational pathways of infused information.
Consider for instance what is perhaps the best scientifically
confirmed instance of infusion in biology, namely, plasmid exchange
among bacteria to develop antibiotic resistance (cf. Amábile-Cuevas
et al., 1995, p. 324). Plasmids are small circular pieces of
DNA that can easily be exchanged among bacteria of the same species,
and are capable of conferring antibiotic resistance. When one
bacterium releases a plasmid and another absorbs it, information
is infused from one into the other. By itself this is unproblematic.
Problems begin, however, when we ask, Where did the bacterium
that released the plasmid in turn derive it? There is a regress
here, and this regress always terminates in something non-organismal.
We can't just keep explaining plasmid infusion into a
bacterium by plasmid release from another bacterium; eventually,
as we trace the informational pathway back, we must tell a different
kind of story. If, for instance, the plasmid is cumulatively
complex, then it could have arisen through selection and inheritance
with modification. But if on the other hand it is irreducibly
complex, whence could it have arisen?
It will be helpful here to distinguish between symbiotic
and abiotic infusion, and correspondingly between endogenous and
exogenous information. Symbiotic infusion is the infusion of
information from one organism to another; abiotic infusion is
the infusion of information not derived from any organism. Correspondingly,
endogenous information comprises symbiotically infused information
(and thus information already present within biology); exogenous
information comprises abiotically infused information (and thus
information external to biology). Now regardless of whether plasmids
are irreducibly complex or have an irreducibly complex core (the
analysis to determine the nature of the complexity of plasmids
has to my knowledge not yet been performed), the fact remains
that there exist irreducibly complex biochemical systems. What's
more, even though symbiotic infusion may explain how a particular
instance of an irreducibly complex biochemical system came to
exist in a given organism, it cannot explain how such a system
arose in the first place. Because organisms have a finite trajectory
back in time, symbiotic infusion must ultimately give way to abiotic
infusion, and endogenous information must ultimately derive from
exogenous information.
Reconceptualizing Evolutionary Biology
The abiotic infusion of exogenous information is
the great mystery confronting modern evolutionary biology. It
is Manfred Eigen's mystery with which we began this paper.
Why is it a mystery? Not because the abiotic infusion of exogenous
information is inherently spooky or unscientific, but rather because
evolutionary biology has failed to grasp the centrality of information
to its task. The task of evolutionary biology is to explain the
origin and development of life. The key feature of life is the
presence of complex specified information (CSI). Caught
up in the Darwinian mechanism of selection and inheritance with
modification, evolutionary biology has failed to appreciate the
informational hurdles organisms need to jump in the course of
natural history. To jump those hurdles, organisms require information.
What's more, a significant part of that information is
exogenous and must originally have been infused abiotically.
In this section I want briefly to consider what evolutionary
biology would look like if information were taken as its central
and unifying concept. First off, let's be clear that the
Darwinian mechanism of selection and inheritance with modification
will continue to occupy a significant place in evolutionary theory.
Nevertheless, its complete and utter dominance in evolutionary
theory, the claim that selection and inheritance with modification
together account for the full diversity of life, is an inflated
view of the Darwinian mechanism that will have to be relinquished.
As a mechanism for conserving, adapting, and honing already existing
biological structures, the Darwinian mechanism is ideally suited.
But as a mechanism for innovating irreducibly complex biological
structures, it utterly lacks the informational resources. As
for symbiotic infusion, its role within an information-theoretic
framework must always remain quite limited, for even though it
can account for how organisms trade already existing biological
information, it can never get at the root question of how that
biological information came to exist in the first place.
Not surprisingly, therefore, the key task an information-theoretic
approach to evolutionary biology faces is to make sense of abiotically
infused CSI. Abiotically infused CSI is information exogenous
to an organism, but which nonetheless gets transmitted to and
assimilated by the organism. Two obvious questions now arise:
(1) What is the mode of transmission of abiotically infused CSI
into the organism? and (2) Where is this information prior to
being transmitted? If this information is clearly represented
in some empirically accessible non-biological physical system,
and if there is a clear informational pathway from this system
to the organism, and if this informational pathway can be shown
suitable for transmitting this information to the organism so
that the organism properly assimilates it, only then will these
two questions receive an empirically adequate naturalistic answer.
But note that this naturalistic answer, far from eliminating
the information question, simply pushes it one step further back,
for how did the CSI that was abiotically infused into an organism
first get into a non-organism? Because of the Law of Conservation
of Information, whenever we inquire into the source of some information,
we never resolve the information problem, but only intensify it.
This is not to say that such inquiries are unilluminating (contra
Dawkins, 1987, pp. 11-13; and Dennett, 1995, p. 153, who think
that the only valid explanations in evolutionary biology are reductive,
explaining the more complex in terms of the simpler). We learn
an important fact about a pencil when we learn a certain pencil-making
machine made it. Nonetheless, the information in the pencil-making
machine exceeds the information in the pencil. The Law of Conservation
of Information guarantees that as we trace informational pathways
backwards, we have more information to explain than we started
with.
Where then do the informational pathways of life
terminate as we trace them backwards? The possibilities are limited.
One possibility is that we get nowhere, unable even to begin
tracing backwards the information in a biological system. Thus
we may discover an irreducibly complex biological system, but
be unable to trace it back to any abiotic source of exogenous
information (this is by far the most common case in biology; see
Behe, ch. 8). Another possibility is that we can trace
the information in a biological system back to an abiotic source
of exogenous information, but then can't trace it back
any further. Graham Cairns-Smith (1985; 1986), for instance,
has a clay-template theory for the origin of life in which self-replicating
clays form templates for carbon-based life. The Cairns-Smith
theory is clearly an abiotic infusion theory, with exogenous information
represented in (abiotic) clays providing templates for carbon-based
life. What the Cairns-Smith theory does not consider is how the
exogenous information that was transmitted to carbon-based life
from clay templates got into those clay templates in the first
place. Needless to say, the Cairns-Smith theory is highly speculative.
Still another possibility is that we can trace the information
in a biological system all the way back to the initial conditions
of the big bang (cf. Corey 1994). Though this approach appeals
to our naturalistic sensibilities, it remains scientifically sterile
until a definite informational pathway can be traced back to the
big bang. Finally, there is the creationist alternative which
traces the information in a biological system to the direct intervention
of God. Though this approach appeals to our theistic longings,
it remains scientifically sterile until an in-principle argument
is offered showing that information inherent in a biological system
could not have been contained in any non-biological physical precursor.
And even then it's not clear what sort of God one infers.
In tracing back the informational pathways of life,
evolutionary biology does well to avoid speculation, and follow
only those informational pathways that can be rigorously traced.
To take an analogy, I can rigorously trace the informational
pathway issuing in my copy of King Lear through the various extant
editions of the play spanning the last four centuries. On the
other hand, I cannot rigorously trace the informational pathway
issuing in an isolated first century papyrus fragment. Any story
behind this fragment is lost and cannot be reconstructed. Alternatively,
any relevant informational pathways are blocked and cannot be
rigorously traced. In a similar vein, evolutionary biology may
progress to the point where it can rigorously trace an informational
pathway back to an abiotic source of exogenous information. On
the other hand, it may remain stuck on a given irreducibly complex
biological structure, and never be able rigorously to trace it
back to an abiotic source of exogenous information.
In fine, I propose to reconceptualize evolutionary
biology in information-theoretic terms. An evolutionary biology
thoroughly cognizant of information theory is one whose chief
task is to trace informational pathways. In tracing these informational
pathways, evolutionary biology must place a premium on rigor.
Detailed informational pathways need to be explicitly exhibited.
Moreover, unlike the nebulous informational pathways sketched
by Stuart Kauffman and his associates at the Santa Fe Institute,
informational pathways need to conform to biological reality,
and not to the virtual reality residing in a computer (cf. Kauffman,
1996). Finally, empirical evidence, and not metaphysical
prejudice or aesthetic preference, must decide whether an
informational pathway exists at all. For instance, the Darwinian
preference to cash out taxonomy in terms of genealogy must not
be taken as evidence for common descent. To establish common
descent requires showing that certain informational pathways connect
all organisms. Many of the low-level facts of current evolutionary
biology will stay put. What's more, information theory
is sufficiently flexible to accommodate the mechanisms of evolutionary
change proposed to date. Nonetheless, their adequacy will have
to be evaluated in terms of the information-theoretic constraints
to which they are subject. Thus for instance, the Darwinian mechanism
can be formulated in information-theoretic terms, but the claim
that this mechanism can account for the full diversity of life
must be rejected given its inability to produce irreducibly complex
systems. Many old questions will remain. Many new questions
will arise. But some old questions will have to be discarded.
In particular, all reductionist attempts to explain information
in terms of something other than information will have to be discarded.
Intelligent Design
Up to this point I have developed a theoretical apparatus
for understanding information, critiqued the main naturalistic
attempts to account for biological information, and reconceptualized
evolutionary biology in information-theoretic terms. One question,
however, remains unanswered, to wit, Whence the origin of complex
specified information in biology? Tracing informational pathways
back to abiotic sources of exogenous information is as far back
as the information trail goes within the framework so far developed.
But again, all we've really done is push the information
problem back, shift its focus, and exchange one information problem
for another. To be sure, this need not be a vain exercise. Plasmid
exchange, though it represents no more than a shifting around
of pre-existing biological information, still gives us tremendous
insight into antibiotic resistance. Nonetheless, all such exercises
get us no closer to the origin of information.
In what remains of this paper I want to argue that
intelligent causation, or equivalently design, properly accounts
for the origin of complex specified information. My argument
focuses on the nature of intelligent causation, and specifically,
on what it is about intelligent causes that makes them detectable.
To see why CSI is a reliable indicator of design, we need to
probe the nature of intelligent causation. The principal characteristic
of intelligent causation is choice. Whenever an intelligent cause
acts, it chooses from a range of competing possibilities. This
is true not just of humans, but of animals as well as of extra-terrestrial
intelligences. A rat navigating a maze must choose whether to
go right or left at various points in the maze. When NASA's
SETI researchers attempt to discover intelligence in the extra-terrestrial
radio transmissions they are monitoring, they assume an extra-terrestrial
intelligence could have chosen any number of possible radio transmissions,
and then attempt to match the transmissions they observe with
certain patterns as opposed to others. Whenever a human being
utters meaningful speech, a choice is made from a range of possible
sound-combinations that might have been uttered. Intelligent
causation always entails discrimination, choosing certain things,
ruling out others.
Given this characterization of intelligent causes,
the next question is how to recognize their operation. Intelligent
causes act by making a choice. How do we know when an intelligent
cause has so acted? A bottle of ink spills accidentally onto
a sheet of paper; someone takes a fountain pen and writes a message
on a sheet of paper. In both instances ink is applied to paper.
In both instances one among an almost infinite set of possibilities
is realized. In both instances a choice is made: one possibility
is selected and the rest are ruled out. Yet in one instance we
infer design, in the other we don't. What is the relevant
difference? Not only do we need to observe that a choice has
been made, but we ourselves need also to be able to specify that
choice. It's not enough that one possibility has been
chosen and others have been ruled out. We ourselves need to be
able to make the same choice. Wittgenstein (1980, p. 1e) illustrated
this point as follows: "We tend to take the speech of
a Chinese for inarticulate gurgling. Someone who understands
Chinese will recognize language in what he hears. Similarly I
often cannot discern the humanity in man."
In hearing a Chinese utterance, someone who understands
Chinese not only recognizes that a choice was made from the range
of all possible utterances, but is also able to specify the utterance
that was made as coherent Chinese speech. Contrast this with
someone who does not understand Chinese. In hearing a Chinese
utterance, someone who does not understand Chinese also recognizes
that a choice was made from the range of all possible utterances,
but this time, lacking the ability to understand Chinese,
is unable to specify the utterance as coherent speech. To someone
who does not understand Chinese, the utterance is gibberish.
To be sure, uttering gibberish always constitutes a choice from
the range of all possible utterances. Nonetheless, gibberish
corresponds to nothing we can understand in any language, and
so cannot be specified. As a result, gibberish is never taken
for intelligent communication, but always for what Wittgenstein
calls "inarticulate gurgling."
This choosing of one among several competing possibilities,
ruling out the rest, and specifying the one that was chosen encapsulates
how we recognize intelligent causes, or equivalently, how we detect
design. Psychologists who study animal learning and behavior
have known this all along. For these psychologists, known
as learning theorists, learning is discrimination (cf. Mazur,
1990; Schwartz, 1984). To learn a task an animal must acquire
the ability to choose behaviors suitable for the task as well
as the ability to rule out behaviors unsuitable for the task.
Moreover, for a psychologist to recognize that an animal has
learned a task, it is necessary not only to observe the animal
making the appropriate discrimination, but also to specify this
discrimination.
Thus to recognize whether a rat has successfully
learned how to traverse a maze, a psychologist must first specify
which sequence of right and left turns conducts the rat out of
the maze. No doubt, a rat randomly wandering a maze also discriminates
a sequence of right and left turns. But by randomly wandering
the maze, the rat gives no indication that it can discriminate
the appropriate sequence of right and left turns for exiting the
maze. Consequently, the psychologist studying the rat will have
no reason to think the rat has learned how to traverse the maze.
Only if the rat executes the sequence of right and left turns
specified by the psychologist will the psychologist recognize
that the rat has learned how to traverse the maze. Now it is
precisely the learned behaviors we regard as intelligent in animals.
Hence it is no surprise that the same scheme for recognizing
animal learning recurs for recognizing intelligent causes generally,
to wit: choosing one among several competing possibilities, ruling
out the others, and specifying the one chosen.
Now this general scheme for recognizing intelligent
causes coincides precisely with how we recognize complex specified
information: First of all, the basic precondition for information
to exist must be established, to wit, contingency. Thus one must
establish that any one of a multiplicity of distinct possibilities
might actually obtain. Next, one must establish that the possibility
chosen after the others were ruled out was also specified. So
far the match between this general scheme for recognizing intelligent
causation and how we recognize complex specified information is
exact. Only one loose end remains: complexity. Although
complexity is essential to CSI (corresponding to the first letter
in the acronym), its role in this general scheme for recognizing
intelligent causation is not immediately obvious. In this scheme
a choice is made among several competing possibilities, the rest
are ruled out, and the possibility chosen is specified. Where
in this scheme does complexity figure in?
The answer is that it is there implicitly. To see
this, consider again a rat traversing a maze, but now take a very
simple maze in which two right turns conduct the rat out of the
maze. How will a psychologist studying the rat determine whether
it has learned to exit the maze? Just putting the rat in the
maze will not be enough. Because the maze is so simple, the rat
could by chance just happen to take two right turns, and thereby
exit the maze. The psychologist will therefore be uncertain whether
the rat actually learned to exit this maze, or whether the rat
just got lucky. But contrast this now with a complicated maze
in which a rat must take just the right sequence of left and right
turns to exit the maze. Suppose the rat must take one hundred
appropriate right and left turns, and that any mistake will prevent
the rat from exiting the maze. A psychologist who sees the rat
take no erroneous turns and in short order exit the maze will
be convinced that the rat has indeed learned how to exit the maze,
and that this was not dumb luck. With the simple maze there is
a substantial probability that the rat will exit the maze by chance;
with the complicated maze this is exceedingly improbable. And
improbability is precisely what we mean by complexity.
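The contrast between the two mazes can be made concrete with a short calculation. The sketch below assumes, for illustration only, that a rat wandering at random chooses left or right with equal probability at each junction, independently of earlier choices. It measures complexity in bits as the negative base-2 logarithm of the probability, which is the usual information-theoretic convention. The function names are hypothetical.

```python
import math


def chance_exit_probability(turns, choices_per_turn=2):
    """Probability that a rat choosing uniformly at random at each
    junction makes every one of `turns` required choices correctly."""
    return (1 / choices_per_turn) ** turns


def complexity_bits(probability):
    """Complexity measured as improbability: -log2(p)."""
    return -math.log2(probability)


for turns in (2, 100):
    p = chance_exit_probability(turns)
    print(f"{turns:3d} turns: p = {p:.3g}, complexity = {complexity_bits(p):.0f} bits")
```

On these assumptions the simple maze is exited by chance one time in four (2 bits), while the hundred-turn maze corresponds to 100 bits, a probability of roughly 8 in 10^31.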
This argument for showing that CSI is a reliable
indicator of design may now be summarized as follows: CSI is
a reliable indicator of design because its recognition coincides
with how we recognize intelligent causation generally. In general,
to recognize intelligent causation we must observe a choice among
competing possibilities, note which possibilities were not chosen,
and then be able to specify the possibility that was chosen.
What's more, the competing possibilities that were ruled
out must be live possibilities, and sufficiently numerous so that
specifying the possibility that was chosen cannot be attributed
to chance. In terms of probability, this just means that the
possibility that was specified has small probability. In terms
of complexity, this just means that the possibility that was specified
has high complexity. All the elements in this general scheme
for recognizing intelligent causation (i.e., choosing, ruling
out, and specifying) find their counterpart in complex specified
information (CSI). It follows that CSI pinpoints precisely
what we need to be looking for when we detect design.
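The three elements summarized above (contingency, complexity, and specification) can be sketched as a sequence of checks. This is a hedged illustration and not a formal statement of the argument. The `Observation` class, the `indicates_design` function, and the `min_bits` cutoff are all hypothetical; the sketch also assumes that the live possibilities are equally likely and that the specification is fixed independently of the observed outcome.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass
class Observation:
    outcome: str          # the possibility that was actually realized
    n_possibilities: int  # live alternatives, assumed equally likely


def indicates_design(obs: Observation,
                     specification: Callable[[str], bool],
                     min_bits: float) -> bool:
    """Apply the three checks in order: contingency, complexity,
    specification. `min_bits` is a hypothetical cutoff, not a value
    taken from the text."""
    if obs.n_possibilities < 2:
        return False                       # no contingency, no choice
    bits = math.log2(obs.n_possibilities)  # complexity as improbability
    if bits < min_bits:
        return False                       # chance remains a live option
    return specification(obs.outcome)      # pattern fixed in advance


exit_path = "RL" * 50  # the psychologist's stated route (illustrative)


def matches_exit(path: str) -> bool:
    return path == exit_path


learned = Observation(exit_path, 2 ** 100)    # complex and specified
wandering = Observation("LR" * 50, 2 ** 100)  # complex, not specified
lucky = Observation("RR", 2 ** 2)             # specified, not complex

print(indicates_design(learned, matches_exit, min_bits=50))       # True
print(indicates_design(wandering, matches_exit, min_bits=50))     # False
print(indicates_design(lucky, lambda p: p == "RR", min_bits=50))  # False
```

Only the first case passes all three checks; the other two fail on specification and on complexity respectively, mirroring the wandering rat and the simple maze.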
As a postscript, let me call the reader's
attention to the etymology of the word "intelligent."
The word "intelligent" derives from two Latin words,
the preposition inter, meaning between, and the verb lego, meaning
to choose or select. Thus according to its etymology, intelligence
consists in choosing between. It follows that the etymology of
the word "intelligent" parallels the formal analysis
of intelligent causation just given. "Intelligent Design"
is therefore a thoroughly apt phrase, signifying that design is
inferred precisely because an intelligent cause has done what
only an intelligent cause can do, to wit, make a choice.
References
Amábile-Cuevas, Carlos F., Maura Cárdenas-García,
and Mauricio Ludgar. 1995. Antibiotic Resistance. American
Scientist, 83: 320-329.
Barrow, John D. and Frank J. Tipler. 1986. The
Anthropic Cosmological Principle. Oxford: Oxford University
Press.
Behe, Michael. 1996. Darwin's Black Box:
The Biochemical Challenge to Evolution. New York: The Free Press.
Bohm, David. 1993. The Undivided Universe: An Ontological
Interpretation of Quantum Theory. London: Routledge.
Borel, Emile. 1962. Probabilities and Life, translated
by M. Baudin. New York: Dover.
Cairns-Smith, Alexander G. 1985. Seven Clues to
the Origin of Life. Cambridge: Cambridge University Press.
Cairns-Smith, Alexander G. and H. Hartman, eds.
1986. Clay Minerals and the Origin of Life. Cambridge: Cambridge
University Press.
Chalmers, David J. 1996. The Conscious Mind: In
Search of a Fundamental Theory. New York: Oxford University
Press.
Corey, Michael A. 1994. Back to Darwin: The Scientific
Case for Deistic Evolution. Lanham, Maryland: University Press
of America.
Darwin, Charles. 1859. On the Origin of Species,
facsimile first edition. Cambridge, Mass.: Harvard University
Press, 1964.
Dawkins, Richard. 1987. The Blind Watchmaker.
New York: Norton.
Dembski, William A. 1996. The Design Inference:
Eliminating Chance through Small Probabilities. Doctoral Dissertation,
University of Illinois at Chicago.
Dennett, Daniel C. 1995. Darwin's Dangerous
Idea: Evolution and the Meanings of Life. New York: Simon &
Schuster.
Devlin, Keith J. 1991. Logic and Information.
New York: Cambridge University Press.
Eigen, Manfred. 1992. Steps Towards Life: A Perspective
on Evolution, translated by Paul Woolley. Oxford: Oxford University
Press.
Joklik, Wolfgang K. and Hilda P. Willett, eds. 1976.
Zinsser Microbiology, 16th ed. New York: Appleton-Century-Crofts.
Kauffman, Stuart. 1995. At Home in the Universe.
Oxford: Oxford University Press.
Küppers, Bernd-Olaf. 1990. Information and
the Origin of Life. Cambridge, Mass.: MIT Press.
Landauer, Rolf. 1991. Information is Physical.
Physics Today, May: 23-29.
Levy, Steven. 1992. Artificial Life: The Quest
for a New Creation. New York: Pantheon.
Margulis, Lynn. 1993. Symbiosis in Cell Evolution:
Microbial Communities in the Archean and Proterozoic Eons, 2nd
ed. New York: Freeman.
Mazur, James E. 1990. Learning and Behavior, 2nd
edition. Englewood Cliffs, N.J.: Prentice Hall.
Miller, Kenneth R. and Joseph Levine. 1993. Biology.
Englewood Cliffs, N.J.: Prentice-Hall.
Monod, Jacques. 1972. Chance and Necessity. New
York: Vintage.
Scherer, Siegfried. 1983. Basic Functional States
in the Evolution of Light-driven Cyclic Electron Transport. Journal
of Theoretical Biology, 104: 289-299.
Schwartz, Barry. 1984. Psychology of Learning and
Behavior, 2nd edition. New York: Norton.
Stalnaker, Robert. 1984. Inquiry. Cambridge, Mass.:
MIT Press.
Swinburne, Richard. 1979. The Existence of God.
Oxford: Oxford University Press.
Waldrop, M. Mitchell. 1992. Complexity: The Emerging
Science at the Edge of Order and Chaos. New York: Simon &
Schuster.
Wittgenstein, Ludwig. 1980. Culture and Value,
edited by G. H. von Wright, translated by P. Winch. Chicago:
University of Chicago Press.
Wouters, Arno. 1995. Viability Explanation. Biology
and Philosophy, 10: 435-457.
Yockey, Hubert P. 1992. Information Theory and
Molecular Biology. Cambridge: Cambridge University Press.