Indo-European and the Comparative Method

Everything you ever wanted to know about Proto-Indo-European (and the comparative method), but were afraid to ask!

Kathleen Hubbard's answer to the question "How do we know what we know about Proto-Indo-European and other languages that died out before they were written down?" [Kathleen is assistant professor of linguistics at the University of California, San Diego. She describes herself as a "recovering Indo-Europeanist."] I have also appended some bibliography at the end.

The hard-core Indo-Europeanist may be interested in the TITUS Indo-European Resources project in Stuttgart (eventually in many languages, but currently only German and Spanish).

Okay, in 1786 Sir William Jones announced to the Asiatick Society of 
Calcutta that Sanskrit had to be related to Greek and Latin, touching 
off what would come to be known as the Neogrammarian move from 
philology (the comparison of texts) to what we now consider 
linguistics.

If you were to see a whole huge raft of cognates like the following, 
you might come to the same conclusion (Avestan, the language of the 
Zoroastrian texts, is an ancient Iranian language closely related to 
the ancestor of Persian):

Sanskrit   Avestan   Greek    Latin    Gothic     English

pita                 pater    pater    fadar      father
padam                poda     pedem    fotu       foot
bhratar              phrater  frater   brothar    brother
bharami    barami    phero    fero     baira      bear
jivah      jivo               wiwos    qius       quick ('living')
sanah      hano      henee    senex    sinista    senile
virah      viro               wir      wair       were(wolf) ('man')
                     tris     tres     thri       three
                     deka     decem    taihun     ten
           satem     he-katon centum   hund(rath) hundred

Now, "cognates" means a pair or set of words descended from a common 
ancestor, not just words that happen to look like each other -- e.g. 
"coffee" is not a cognate of kaffe, kahawa, cafe, etc.; that's an 
instance of lots of borrowing of the same word by various languages. 
What we're talking about here are historically related words. When we 
know we've got cognates, we can talk about reconstruction.

Reconstruction revolves around the notion that sound change is 
mechanical and exceptionless. If a proto-/p/ becomes /f/ in a 
daughter language, it does so in regular fashion (that's the 
heuristic you have to use). If there are exceptions, there must be 
some other conditioning factor. Using this assumption, we can 
conclude that some common ancestor produced Sanskrit /bh/, Avestan 
/b/, Greek /ph/ (which is NOT /f/, it's aspirated /p/ at the stage 
we're talking about), Latin /f/, and Germanic /b/. Now the question 
is, what was that common ancestor?
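
To make "regular" a bit more concrete, here is a minimal Python sketch 
that lines up the word-initial consonants from four rows of the table 
above and counts how often each pattern recurs. The function names and 
the rule that "bh" and "ph" each count as one segment are assumptions 
made for this illustration, not part of any standard tool, and four 
words are obviously far too few to prove anything.

```python
from collections import Counter

# Cognate sets copied from the table above, in the order
# (Sanskrit, Greek, Latin, Gothic). Only rows that have a form
# in all four of these languages are used.
COGNATES = {
    "father":  ("pita",    "pater",   "pater",  "fadar"),
    "foot":    ("padam",   "poda",    "pedem",  "fotu"),
    "brother": ("bhratar", "phrater", "frater", "brothar"),
    "bear":    ("bharami", "phero",   "fero",   "baira"),
}

# Assumption: in this transliteration "bh" and "ph" each spell a
# single aspirated consonant, so they are treated as one segment.
DIGRAPHS = ("bh", "ph")


def initial_segment(word):
    """Return the first segment of a word, keeping digraphs together."""
    for digraph in DIGRAPHS:
        if word.startswith(digraph):
            return digraph
    return word[0]


def correspondence_sets(cognates):
    """Count how often each tuple of initial segments recurs."""
    return Counter(
        tuple(initial_segment(form) for form in forms)
        for forms in cognates.values()
    )


sets = correspondence_sets(COGNATES)
for segments, count in sets.items():
    print(segments, count)
```

Each pattern that turns up again and again (here p/p/p/f and 
bh/ph/f/b) is a candidate for a single sound in the ancestor; a 
pattern that shows up only once is where you would start looking for a 
conditioning factor or a borrowing.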

The way we decide what segment must have been there in the proto-
language involves things we know independently about how sounds 
behave, based partly on how sounds alternate synchronically in 
languages (i.e. rules that operate to change one sound to another in 
different contexts during a single stage of a language), partly on 
what we know about acoustics and articulation of speech sounds (which 
tells us what directionality is more or less likely), and partly on 
experience. Pure gold for the historical linguist is ATTESTED 
(written) ancient forms.

For instance, we know that the modern Romance languages (French, 
Italian, Spanish, Portuguese, Romansh, Romanian, etc.) are descended 
from Latin.  And we have lots of attested Latin to work with -- so we 
have clear, unambiguous examples of how some sound changes have 
worked. Likewise in other language families where ancient texts are 
preserved (e.g. ancient religious texts in Semitic etc.). So we have 
some real-life models on which to build our guesses.

So anyway, you reconstruct Proto-Indo-Iranian, and Proto-Germanic, 
and Proto-Balto-Slavic, and Proto-Celtic, and ultimately you have a 
pretty good idea of what -- on the basis of very rigorous analysis -- 
must have been the forms of certain words/roots in 
Proto-Indo-European, before it split up. Now, this method does NOT 
yield reliable results further back than about 10,000 years, because 
beyond that, too much change has occurred for there to be any 
recognizable remnants (that we can be sure about anyway) in attested 
languages. (Pace Greenberg et al., who get lots of popular press.)

One real triumph of this method of reconstruction was the Laryngeal 
Hypothesis: it was known that there were some troublesome places in 
Indo-European where the sound changes seemed not to be behaving in 
their usual regular way; things were happening to vowels and 
sometimes consonants that couldn't be easily explained based on what 
we saw in the attested languages. Ferdinand de Saussure in the late 
19th century said that there had to be a set of segments in the 
proto-language that had not survived in any of the daughter languages 
known at the time -- he was fairly conservative about claiming what 
they must have been, but he pointed out the precise locations where 
they must have occurred. (Most later scholars settled on three of 
them and came to call them laryngeals.) Many years later, when a 
bunch of texts in Turkey were finally decoded and we knew we were 
looking at the ancient Anatolian language Hittite, the oldest 
attested Indo-European language -- voila: there were laryngeals, 
in places where Saussure had predicted they must be just on the basis 
of careful reconstruction.

There are other wrinkles, like you can do internal reconstruction 
under some circumstances, and there are things other than sounds that 
point to common ancestry (morphology, syntax, etc.). And semantic 
change is a really neat thing to trace, though much slipperier than 
sound change. But the general answer to your question is, we know 
what we know about Proto-Indo-European because of the Comparative 
Method, which arose in the 19th century and gives us a rigorous way 
to compare sounds in daughter languages and determine what the 
antecedent sounds must have been.

Oh, and the PIE reconstructions for the above words are (always 
preceded by a star to show they're unattested, followed by a hyphen 
if they're roots that get suffixed, and with hedges if a vowel or 
something is uncertain -- consonants are much easier to reconstruct 
than vowels -- oh yes and @ stands for schwa here):

*p@ter-         father
*ped-           foot
*bhrater-       brother
*bher-          carry
*gwei-          live
*sen-           old
*wi-ro-         man     (derived from *wei@- vital force)
*trei-          three
*dekm-          ten
*dkm-tom-       hundred (derived from *dekm- ten)
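
The notation just described is regular enough to read mechanically. 
The following is a small Python sketch, with hypothetical names 
(Reconstruction, parse_reconstruction) invented for this example, that 
picks apart an entry from the list above: a leading star marks it as 
unattested, a trailing hyphen marks it as a root that takes suffixes, 
and @ is rendered as schwa. Hyphens inside a form are left alone.

```python
from dataclasses import dataclass


@dataclass
class Reconstruction:
    form: str         # the form itself, with '@' rendered as schwa
    unattested: bool  # True when the entry carries a leading star
    is_root: bool     # True when the entry ends in a hyphen


def parse_reconstruction(entry):
    """Split an entry such as '*p@ter-' into its notational parts."""
    unattested = entry.startswith("*")
    is_root = entry.endswith("-")
    body = entry[1:] if unattested else entry
    if is_root:
        body = body[:-1]  # drop only the final hyphen, keep internal ones
    return Reconstruction(body.replace("@", "ə"), unattested, is_root)


for entry in ("*p@ter-", "*wi-ro-", "*dkm-tom-"):
    print(entry, "->", parse_reconstruction(entry))
```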

A Little Bibliography

Anthony Arlotto, Introduction to Historical Linguistics (New York 1971). [particularly good introduction for non-linguists]

Emile Benveniste, Indo-European Language and Society (London 1973). [Contains cultural as well as linguistic material.]

Carl D. Buck, A Dictionary of Selected Synonyms in the Principal Indo-European Languages (Chicago 1949). [A wonderful old reference work. Lists and discusses synonyms and cognates for a variety of ideas (arranged topically) in over 30 Indo-European languages. Now available in an affordable paperback reprint edition.]

N. E. Collinge, The Laws of Indo-European (Amsterdam 1985). [Catalogs real and alleged sound changes in IE families and languages. Fairly technical.]

Antoine Meillet (trans. S. N. Rosenberg), The Indo-European Dialects (Huntsville 1967). [This and the two following works are by one of the great masters of the field, but are still relatively clear and accessible.]
----- (trans. Gordon B. Ford, Jr.), The Comparative Method in Historical Linguistics (Paris 1967).
-----, Introduction à l'étude comparative des langues indo-européennes (Paris 1937).

Holger Pedersen, The Discovery of Language (Bloomington 1959). [Includes historical perspective on how these discoveries were made.]

Andrew Sihler, New Comparative Grammar of Greek and Latin (New York 1995).

Oswald Szemerényi, Comparative-Historical Linguistics: Indo-European and Finno-Ugric (Amsterdam 1993).

Calvert Watkins (ed.), The American Heritage Dictionary of Indo-European Roots (Boston 1985). [Note the extensive introductory essay. Much of the same material can be found in the first and third editions of AHD.]

Werner Winter (ed.), Evidence for Laryngeals (The Hague 1965). [Evidence from the various IE languages bearing on Saussure's laryngeal theory cited above. Highly technical.]

