The Tocharian languages are important objects of study for two principal reasons: first, they offer a window into the spread of Buddhism beyond the better-studied confines of India and Southeast Asia; and second, they bring a new perspective to our understanding of the language of the Indo-Europeans and the migrations of subgroups of this population. The latter point concerns us in this overview.
The relationship of Tocharian to the other Indo-European languages is not an incidental fact of interest only to historical linguists. This relationship in fact has useful pedagogical implications. This can be seen by imagining the ways in which one might approach the learning of Tocharian. On the one hand, since at the time of the Tocharian documents the language seemed to be a relative isolate on a synchronic level, we may study the language in isolation. From that point of view we are free to organize our description of the language in any manner whatsoever. But in doing so, the language will appear a fairly complex jumble of facts, given the number of nominal and verbal paradigms, and we will have little recourse to bring to bear in any coherent sense any knowledge we may have gained from a study of other languages.
On the other hand, we may use a knowledge of Proto-Indo-European (PIE) as an organizing principle. A reader may object that, if he or she does not already know Proto-Indo-European, then this is no benefit at all. In reality this is not so. When a historical linguist speaks of "Proto-Indo-European," in the strictest sense this really just denotes a collection of information about the common traits of a large swath of languages spreading in archaic times from Iceland to India (and now China, as our studies will show). From this viewpoint, understanding some information about Proto-Indo-European shows one how knowledge of certain individual languages --- e.g. English, French, Spanish, German, Latin, Greek, Sanskrit --- can be imported to a larger number of these languages. It is in this regard that knowledge of PIE assists us in studying Tocharian: by stating that Tocharian is like or unlike PIE in certain specific ways, we are thus able to view Tocharian against the backdrop of a large number of well-studied languages. In this way, for example, a Buddhist scholar who knows Sanskrit may not only use his or her knowledge of Sanskrit Buddhist vocabulary in learning Tocharian vocabulary; but he or she may also use Sanskrit grammatical structure as a point of comparison for the understanding of Tocharian grammatical structure. Not only does he or she know that Tocharian is or is not similar to Sanskrit in certain obvious ways, but one finds out how certain features of PIE still extant in Sanskrit --- but not in Tocharian --- may have developed through a logical sequence of steps into what one actually finds in Tocharian. Thus even the differences between Tocharian and other languages like Sanskrit often become logical in a certain manner of speaking. Not only that, but one finds that these similarities and differences are linked to a large number of other languages which the reader may go on to study in the future.
With this in mind, the organization of the present work will pay attention to how Tocharian developed out of PIE. PIE may be viewed as a repository of those linguistic traits which are exhibited by a certain family of languages, or more truthfully as the starting point from which one can more or less systematically derive the traits found in the documented languages of the family. There are in fact two benefits to this approach. The first is that discussed above: this gives us an organizing principle that highlights systematic similarities and differences from a large number of related languages of interest. The second benefit comes from taking a slightly more "substantial" view of PIE, that is, by saying that PIE or something very similar was in fact a real language spoken by a real group of inhabitants of a certain area. Endowing PIE with substance in this way, and adding the postulate that (in an age before telecommunication) divergence in linguistic features corresponds to increasing geographical isolation, the historical linguist may then map linguistic changes onto rather coarse-grained routes of migration. In this way, when the predominantly Buddhist texts of Tocharian fail to tell us about where the Tocharians came from, the structures of the language can speak for themselves and reveal parts of their history otherwise lost in the dust.
Historical linguistics opened up as a scientific discipline with the recognition that sound changes are regular. That is to say, if a sound change occurs in a speech community, that change occurs in a fashion almost exclusively conditioned by phonetic environment, not by the particular word the sound happens to be found in. Thus, for example, the change of a: to o that took Old English ba:n to Modern English bone affected not only the a: of the word ba:n, but all instances of a: throughout the language. Hence ha:m also became home.
Given this regularity, that is, this ability to describe the "laws" of sound changes, the historical linguistic endeavor begins to resemble that of physics. In physics, the great minds discover by hook or crook the rules by which to derive the final states of systems given knowledge of their initial states. Similarly in historical linguistics, and specifically historical phonology, one postulates an initial state -- the phonological system of Proto-Indo-European -- and applies rules of sound change to this initial state in order to arrive at the phonological systems of the daughter languages.
If one knows the sound rules, then one only needs the starting point to arrive at the end result. In historical linguistic terms, one calls the starting point (for the languages that concern us here) Proto-Indo-European. If one knows the rules that take the phonemes of PIE to, say, the phonemes of Latin, then all one needs is the starting point -- the phonological inventory of PIE -- to predict the phonological inventory of Latin. Of course, in practice the situation is a little more difficult, on the one hand because one has to work in reverse (e.g. we already know the phonological system of Latin), and on the other hand because neither the PIE phonological system nor the rules of derivation are known a priori. Thus one must hypothesize a PIE phonological system, hypothesize rules of derivation, and then change one or both in a give-and-take manner until one can cogently arrive at the phonological system of Latin with a minimum of exceptions to the rules. (This is not as ad hoc or "unscientific" as it may appear: after all, in physics one knows the outcome of experiments. Given that, one then has to reason or guess to figure out a law that explains those results. Only then may one proceed to try to predict new results using the law that explained the old results.) These rules are generally in the form of correspondences, stating that some specific PIE sound corresponds to an equally specific Latin sound (perhaps with the attendant necessity of specifying the surrounding environment in which one sound corresponds to the other). The following table shows an example of correspondences between PIE and Latin phonemes.
| PIE | Latin (Init.) | Latin (Med.) | PIE | Latin (Init.) | Latin (Med.) | PIE | Latin (Init.) | Latin (Med.) |
|---|---|---|---|---|---|---|---|---|
| *p | p | p | *b | b | b | *bh | f | b |
| *t | t | t | *d | d | d | *dh | f | d (b) |
| *k | c | c | *g | g | g | *gh | h (f) | g (h) |
| *kw | qu | qu | *gw | v | v | *gwh | f | gu (v) |
This says, for example, that the initial *gh- and the medial *-bh- of PIE *ghabh- correspond respectively to the h- and -b- of Latin habe:re 'have'.
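The way such a table of correspondences is read can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `segment` and `to_latin` are invented here, only the primary Latin outcome of each stop is used (the alternatives in parentheses are ignored), and any sound not listed in the table, vowels included, is simply copied unchanged.

```python
# Stop correspondences copied from the table: PIE -> (Latin initial, Latin medial).
# Assumption: only the primary outcome is kept; parenthesised alternatives are dropped.
PIE_TO_LATIN = {
    "p": ("p", "p"), "b": ("b", "b"), "bh": ("f", "b"),
    "t": ("t", "t"), "d": ("d", "d"), "dh": ("f", "d"),
    "k": ("c", "c"), "g": ("g", "g"), "gh": ("h", "g"),
    "kw": ("qu", "qu"), "gw": ("v", "v"), "gwh": ("f", "gu"),
}

# Try longer symbols first so that "gwh" is not read as "g" + "w" + "h".
_SYMBOLS = sorted(PIE_TO_LATIN, key=len, reverse=True)


def segment(root):
    """Split an ASCII-transcribed root into known stops and leftover characters."""
    segments, i = [], 0
    while i < len(root):
        match = next((s for s in _SYMBOLS if root.startswith(s, i)), None)
        segments.append(match if match else root[i])
        i += len(match) if match else 1
    return segments


def to_latin(root):
    """Apply the stop correspondences; anything not in the table is copied as-is."""
    out = []
    for position, seg in enumerate(segment(root)):
        if seg in PIE_TO_LATIN:
            initial, medial = PIE_TO_LATIN[seg]
            out.append(initial if position == 0 else medial)
        else:
            out.append(seg)  # assumption: vowels and other sounds pass through
    return "".join(out)


print(to_latin("ghabh"))  # hab
```

The output hab matches the stem of Latin habe:re. The sketch says nothing about vowels, endings, or the conditions under which the alternative outcomes appear; a fuller account would need rules for each of these.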
One may ask: where does this get us? After all, we already knew the Latin phonological system! A simple example may illustrate the point. Grimm's Law is the well-known set of rules relating the PIE phonetic inventory to the Germanic phonetic inventory. Simply applying Grimm's Law to the PIE root *ghabh- leads one eventually to Modern English give. Thus Grimm's Law relates Modern English give -- but not have -- to Latin habe:re 'have'. The point: one infers that perhaps the original PIE root *ghabh- must have contained elements of both 'give' and 'have'. This correspondence and others among etymologically related words across the Indo-European languages allow historical linguists to reconstruct a culture based on ties of reciprocal giving for the culture of the Proto-Indo-European speakers, a culture many of whose traits we seem to find preserved in the Greek literary heritage of Homer's Iliad and Odyssey. We thereby gain insight into a culture that left no written records whatsoever, merely by careful consideration of rules relating the sounds of the languages of the daughter cultures!
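For readers who wish to see the mechanics, the consonant shifts of Grimm's Law can be sketched as a single simultaneous substitution. The mapping below is the usual textbook statement of the law rather than anything drawn from the present lesson, the spelling "th" stands in for the Germanic fricative normally written with thorn, and the function name `grimm` is invented for the example. Vowels and all later developments are left untouched.

```python
import re

# Grimm's Law as three parallel shifts, in ASCII transcription.
GRIMM = {
    "p": "f", "t": "th", "k": "h", "kw": "hw",     # voiceless stops -> fricatives
    "b": "p", "d": "t", "g": "k", "gw": "kw",      # voiced stops -> voiceless stops
    "bh": "b", "dh": "d", "gh": "g", "gwh": "gw",  # voiced aspirates -> voiced stops
}

# Longest symbols first, so that "gwh" is matched before "gw" and "g".
_PATTERN = re.compile("|".join(sorted(GRIMM, key=len, reverse=True)))


def grimm(root):
    """Shift every stop exactly once, in one pass over the root."""
    return _PATTERN.sub(lambda m: GRIMM[m.group(0)], root)


print(grimm("ghabh"))  # gab
```

The single pass matters: were the shifts applied one after another, an original *bh would become b, then p, then f. The result gab is only the consonantal skeleton; the further changes that lead to Modern English give are not modelled here.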
In the early period of historical linguistic investigation, familiarity with ancient languages like Latin, Classical Greek, Sanskrit and Gothic, and with their modern relatives such as Italian, Modern Greek, Hindi and German, led scholars to believe that the evolution of languages followed a general principle of phonological and morphological simplification. In postulating the phonological system of PIE, scholars thus posited the most ornate of the systems they had encountered, which happened to be that of Sanskrit. The various points of stop consonant articulation, such as the lips, then had a complete series of four phonemes: *p, *ph, *b, *bh. Similarly for the dental stops, palatals, and velars. Sanskrit was thus supposed to be the only language to keep the stops unchanged, while other languages simply lost one or other of the consonants.
With closer inspection of the relationships among the Indo-European languages, as well as the discovery of hitherto unknown members of the family such as Hittite, whose documents were older than those of any other Indo-European language but whose consonant inventory was far simpler than that of Sanskrit, scholars finally had to abandon the equation of the Sanskrit and PIE systems. In the common understanding of the PIE consonant inventory, there are no voiceless aspirates (e.g. *ph), but only voiceless non-aspirates (e.g. *p), voiced non-aspirates (*b), and voiced aspirates (*bh).
The current reconstruction of the PIE system of stop consonants is given in the following table.
| | Labial | Dental | Palatal | Velar | Labiovelar |
|---|---|---|---|---|---|
| Voiceless | p | t | k' | k | kw |
| Voiced | b | d | g' | g | gw |
| Voiced Asp. | bh | dh | g'h | gh | gwh |
The labiovelar consonants are similar to their velar counterparts, but with a simultaneous rounding of the lips. Examples of their remnants in daughter languages are Latin quid, Hittite kwit, Old English hwæt.
In truth, no ancient Indo-European language maintains all of the Indo-European stop consonants intact. One of the major changes that occurs between the period of PIE and that of the daughter languages is the merger of series (that is, the merger of columns in the preceding table). The principal mergers are that of the palatals with the velars on the one hand, and that of the labiovelars with the velars on the other. These mergers divide the IE languages into two groups, called the centum and satem groups. Specifically, the centum languages merge the palatals with the velars, while the satem languages merge the labiovelars with the velars.
The general situation is outlined in the following chart.
| Mergers | Centum | PIE | Satem |
|---|---|---|---|
| | *kw | *kw | *k |
| | *k | *k | *k |
| | *k | *k' | *k' |
There are similar correspondences for the voiced non-aspirates, e.g. PIE *g, and the voiced aspirates, e.g. PIE *gh. The following chart provides a specific illustration of the above.
| Examples | Latin | PIE | Sanskrit |
|---|---|---|---|
| | quod | *kwod | kát |
| | cruor | *kreuh2 | kravís. |
| | centum | *k'm.tóm | s'atám |
One may liken this process to a much later change which took place within the Romance language family, whereby the c- (pronounced [k]) of Latin centum became the c- (pronounced [s]) of Spanish cien. This shows a similar assibilation process to that found in the satem group, though the change in the Romance group occurred thousands of years later.
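The logic of the centum-satem division can be made explicit with a small sketch. It assumes that a language is described simply by its outcomes for the three voiceless series *kw, *k, and *k', as in the merger chart; the names `merged_pairs` and `group` are invented for the illustration.

```python
# Outcomes of the three PIE voiceless dorsal series, following the merger chart.
CENTUM = {"kw": "kw", "k": "k", "k'": "k"}   # palatals merge with velars
SATEM = {"kw": "k", "k": "k", "k'": "k'"}    # labiovelars merge with velars


def merged_pairs(reflexes):
    """Return the pairs of PIE series that share a single outcome."""
    series = list(reflexes)
    return [
        (a, b)
        for i, a in enumerate(series)
        for b in series[i + 1:]
        if reflexes[a] == reflexes[b]
    ]


def group(reflexes):
    """Classify a set of outcomes as centum, satem, or unclear."""
    palatal_merged = reflexes["k'"] == reflexes["k"]
    labiovelar_merged = reflexes["kw"] == reflexes["k"]
    if palatal_merged and not labiovelar_merged:
        return "centum"
    if labiovelar_merged and not palatal_merged:
        return "satem"
    return "unclear"  # both or neither series merged with the velars


print(merged_pairs(CENTUM), group(CENTUM))  # [('k', "k'")] centum
print(merged_pairs(SATEM), group(SATEM))    # [('kw', 'k')] satem
```

A language in which both series fall together with the velars, or neither does, is reported as unclear; the two-way division in the chart makes no provision for such cases.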
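This merger of PIE series had the effect of giving all the ancient IE languages situated to the east a common linguistic trait, and similarly all those to the west a common trait. In the traditional classification, the western families (Celtic, Germanic, Italic, and Greek) are centum, while the eastern families (Indo-Iranian, Balto-Slavic, and Armenian) are satem.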
The centum and satem dialectal grouping thus correlated with a geographic grouping.
The discovery of Tocharian, however, put an end to this neat correlation between linguistic and geographic affiliation. The Tocharian word for 'hundred' is A känt B kante, showing an initial velar k-. This squarely associates Tocharian with the other centum subfamilies. The latter are in the west, however, while Tocharian alone among them is in the east, thereby undercutting the correlation between geography and the centum-satem division.
The following text is an excerpt from the Buddhist Pun.yavanta-Ja:taka. This text was initially published in Sieg and Siegling's Tocharische Sprachreste, Leipzig 1921. Somewhat later George Lane published the text and translation in English in the Journal of the American Oriental Society, Vol. 67, No. 1 (Jan. - Mar. 1947), 33-53. Though much has been learned about Tocharian in the time since Lane's translation, the article still remains a very useful starting point, providing a translation that generally remains close to the original text.
As Lane points out, the Tocharian manuscript does not present a previously unknown work; the Pun.yavanta-Ja:taka was already known to scholarship via Sanskrit texts. This is often the case with Tocharian manuscripts. What is revealing, however, is that, as Sieg et al. and Lane note, the Tocharian version turns the style of composition on its head. Rather than giving prominence to the actual exploits of the story's protagonists (as is done in the Sanskrit version), the Tocharian version instead gives overwhelming prominence to the stories told before the protagonists start their journey. The Tocharian version also downplays some of the erotic overtones found in the Maha:vastu version, thus showing a different moral perspective. Such changes hold interest for the Buddhist scholar in delineating the fluidity of Buddhist doctrine within the different cultures in which it took hold. But they also hold interest for the Tocharian scholar inasmuch as they are clues to the cultural fabric of the Tocharian peoples themselves.
Note in verse 10 the phrase pon'cäm. sam.sa:ris -- a noun in the genitive takes a modifier in the oblique. This illustrates how the agglutinative structure of Tocharian in the secondary cases begins to cross over into the primary cases. That is, in secondary case phrases, the case marker is generally appended to only one of the elements of the phrase (usually the last element); all other elements of the phrase, following group inflection, employ only the oblique ending, forgoing the secondary case marker. The present phrase displays a similar tendency, even though the genitive is a synthetic case; that is, once the genitive is expressed, other elements in apposition may forgo the overt case marking and default to the oblique.
1 - ka:su n'om-klyu tsras.is's'i s'äk kälymentwam. sätkatär.
yärk yna:n'mune nam poto tsras.s.uneya: p(o)käs. kälpna:l;
yukna:l yma:räk yäslun'cäs, kälpna:l yma:räk ya:tlune.
2 - tsras.is's'i ma:k nis.palntu tsras.is's'i ma:k s'kam. s.n'as.s.en'.
nämsen'c yäslus. tsras.isac, kunsen'c yärkant tsras.isac.
tsras.in' waste wrasas's'i, tsras.is's'i ma: praski nas..
3 - tämyo ka:su tsras.s.une p(o)kam. pruccamo n'i pälskam..
4 - tsras.s.uneyo tämne nes. pras.tam. Siddha:rthes la:nt se Sarva:rthasiddhe bodhisattu sa:mudram. ka:rp, n'emis.im. pran`ka: yes..
5 - n'emintuyo ypic olyiyam. sa:rth Jambudvipac pe ya:muräs., s.pät kom.sa: kn'ukac wram. kälk, s.pät kom.sa: pokena: kälk, s.pät kom.sa: lyomam. kälk.
6 - s.pät kom.sa: wälts pältwa:yo opla:syo wram. opläs. opla: ka:rnma:m. kälkoräs., pän' kursärwa: a:rs.la:syo rarkusa:m. tkana: kälk.
7 - tmäs. ra:ks.tsa:s's'i dvipam. yes., tmäs. yaks.a:s's'i, tmäs. Baladvipam. yes..
8 - tmäs. s'twar-wäkna: a:rs.la:syo rarkun'cäs is.anäs kcäk. s'twar-wäkna: spes.inäs klum.tsäsyo sopis Sa:gares la:nt la:n'ci was.t pa:s.änta:s s'a:wes empeles na:ka:s a:suk kätkoräs., Sa:garem. la:ntäs. cinda:man.i wma:r torim. kälpa:t, pon'cäm. Jambudvipis ekrorn'e wawik.
9 - s'lak s'kam. -- S.a:mnernam.
ma:ski kätka:läm. ktän`ken'c tsras.in' sa:muddrä,
traidha:tuk sam.sa:r tsras.s.uneyo ktän`ken'c kram.s'.
kälpna:ntär torim. puttis'paräm. wärs.s.ältse.
ma:=pärma:t tsru-yärm ya:tal yatsi tsras.s.une.
10 - ma: täprem. sam. pon'cäm. sam.sa:ris ka:ripac sa:spärtwu a:lak wram nas. kosne a:la:sune.
11 - kyalte nes. wrasas's'i sne-wa:wles.u sne-psäl klu s'wa:tsi s.es., kalpavr.ks.äntwam. a:rwar papyätkunt wsa:lu yetweyntu was.lam. s.en'c-äm..
12 - a:la:sa:p klu kropluneya: kalpavr.ks.äntu nakäntäm, kappa:n' pa:kär ta:karäm.
13 - sne-wa:wles.u sne-psäl klu naktäm, s'a:wam. wlesam.tyo psälas's'äl pa:kär ta:kam.
cami a:la:suneyis nu tsras.s.une pratipaks. na:m.tsu. tämyo tsras.s.une n'i a:rkis'os.yam. p(o)kam. pruccamo pälskam..
1 "The good fame of the strong spreads in the ten directions.
Reverence, respect, obeisance, (and) honor (are) to be attained through strength from everyone.
To be conquered quickly (are) enemies. To be obtained quickly (is) prosperity.
2 Of the strong (there are) great riches; of the strong (are) also many relatives.
Enemies bow down before the strong; to the strong come honors.
The strong (are) the protection of creatures; of the strong there is no fear.
3 Therefore strength (is) good (and) in every way the best (thing) in my opinion.
4 "By means of strength thus, at an earlier time, the son of king Siddhartha, the Bodhisattva Sarvarthasiddha descended upon the ocean. He went to the island of jewels. 5 With a caravan to Jambudvipa also having been made in a ship filled with jewels, for seven days he walked up to the neck in water; for seven days with the arms he walked; for seven days in mud he walked; 6 for seven days in water with lotuses with a thousand leaves, ascending from lotus to lotus he went; five leagues he walked through a place covered by snakes. 7 Thereupon he went to the island of the Raksasas, then to the island of the Yaksas, to Baladvipa, he went. 8 Thereupon he traversed the moats covered by four sorts of snakes. Nets with four sorts of Sphatika thread guarding the royal house of king Sagara, the great, awful Nagas having traversed completely, he obtained the Cintamani-stone, the precious, from king Sagara. Of all Jambudvipa the sickness he caused to disappear. 9 And so (in samner-meter):
"The ocean difficult to cross the strong cross.
The threefold world (of) existence by strength the good cross.
The superior obtain precious Buddhahood.
Strength is not capable of performing a disgrace (even) to a small degree.
10 "There is not another thing (which has) become (lit. turned) so for the injury of the entire world as (has) sloth. 11 For formerly of men without work (there) was chaffless rice to eat. In the kalpa-trees ready prepared for them to wear were clothing and ornaments. 12 The rice of the slothful (man) (to be had) by gathering and the kalpa-trees disappeared for them. Miseries (?) were plainly before them. 13 Without work (and) without chaff the rice disappeared for them. By great labor and with chaff a store of grain was for them. 14 Indeed of this, sloth being the opposite, therefore, strength (is) in the world in my opinion altogether the best thing."
The writing system of the Tocharian languages was not a wholly new creation, but an adaptation of a pre-existing script. The Tocharian scribes adopted a certain version of the north Indian Bra:hmi: script. This they subsequently modified to suit the requirements of their own language. A very small number of texts, however, are found in a Manichean script.
The Brahmi script is neither purely alphabetic nor purely syllabic. It is a system of so-called aks.aras. Each consonant has a separate, unique representation. The unmodified sign, however, always represents the consonant followed by the default vowel a. Thus the symbol <p>, in the absence of other modifications, represents [pa] --- much as in English we write p, but say it is the letter 'pee'. If English spelling worked this way, we would write, say, the name Peter as p-ter, since we would pronounce 'pee' whenever we saw the symbol p.
To then tell the reader to pronounce the consonant with a different vowel, a certain symbol would be located above the <p> to denote [pi], a different symbol below to denote [pu], and so on. These symbols placed around the consonants to change the value of the following vowel are the bound forms of the vowels. Each vowel also had its own free form, generally used in word-initial position. The diphthongs ai and au had their own symbols, and were not written as the combination of their constituent elements.
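The principle of the inherent vowel can be modelled in a few lines. The sketch below is a simplification under stated assumptions: each sign is represented as a pair of a consonant and an optional bound vowel, a free vowel is represented as a pair with an empty consonant, and the names `aksara` and `read` are invented for the example. Ligatures and the special diphthong signs are not modelled.

```python
INHERENT_VOWEL = "a"  # the default vowel carried by every unmodified consonant sign


def aksara(consonant, bound_vowel=None):
    """Sound value of one sign: the consonant plus its bound vowel, or plus a by default."""
    vowel = bound_vowel if bound_vowel is not None else INHERENT_VOWEL
    return consonant + vowel


def read(signs):
    """Read a sequence of (consonant, bound_vowel) pairs; a free vowel has consonant ''."""
    return "".join(aksara(consonant, vowel) for consonant, vowel in signs)


print(aksara("p"))        # pa
print(aksara("p", "i"))   # pi
print(aksara("p", "u"))   # pu
print(read([("", "u"), ("p", None), ("t", "i")]))  # upati
```

The last line reads a made-up sequence of signs, not an attested word; it merely shows a free vowel followed by an unmodified and a modified consonant sign.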
The Indic languages are blessed with a wealth of stop consonants; the Tocharian languages, by contrast, lie impoverished in this regard. Tocharian thus had no need, in principle, to use symbols for the voiced aspirates such as gh, dh, bh; nor for the retroflex consonants such as t., d., n.. But this is not to say that Tocharian scribes did not employ them. The scribes were, in fact, often very faithful to the sounds and spellings of the Sanskrit words they borrowed. And as in other Buddhist traditions, so too the Tocharians borrowed a very large inventory of terminology directly from Sanskrit Buddhist texts. Thus the majority of Indic sounds have a graphical representation in some word or other in the Tocharian languages; but these are to be taken as the result of conservative tendencies in spelling and not necessarily as aids to the native Tocharian speaker in reproducing a faithful pronunciation -- much the same as our own tendency in English to keep the long silent gh of words like through.
Hard as it is to believe, there are in fact some sounds in Tocharian which are not to be found in Sanskrit. In particular, there is the voiceless dental affricate ts. This was in fact written by the Tocharians with a ligature of the characters representing t and s, but the sound itself is a single phoneme in Tocharian. Tocharian also possesses a reduced high central vowel denoted ä, since its representation in the Tocharian script involved the placement of two dots above the character for a. Take care to remember, however, that this is merely a convention of scholarly transcription, and it does not represent the German sound ä in words such as Mädchen. The phonetic value was probably closest to IPA [ɨ].
One peculiar feature of Tocharian is that some vowels -- generally i, u, and ä -- could lose their syllabic content in open syllables. When this occurred, the Tocharian scribes would combine the preceding consonant with the following consonant, and write the non-syllabic vowel above the vowel which properly belonged to the following consonant. For example, phonemic /kuse/ was evidently pronounced with a reduced vowel as [kwse], and this latter was represented in writing as <kso!(e)>. Modern scholars generally transcribe this as k(o)se.
Relative to a language like Sanskrit, and even to Proto-Indo-European itself, the phonological system of Tocharian is quite simple. Both Tocharian languages have almost identical phonological systems. In particular, they have the same consonant inventory.
| | Labial | Dental | Palatal | Retroflex | Velar |
|---|---|---|---|---|---|
| Stops | p | t | c | | k |
| Affricates | | ts | | | |
| Sibilants | | s | s' | s. | |
| Liquids | | l | ly | r | |
| Nasals | m | n | n' | | |
| Glides | w | | y | | |
The only voiced consonants are the resonants (liquids and nasals) and glides; all stops, affricates, and sibilants are voiceless. To what degree this classification is actually phonetic, and not merely phonemic, is difficult to say. For example, the stops of native Tocharian words are generally not written with the characters corresponding to the Sanskrit aspirates. But this is no guarantee that the stops themselves did not have some degree of aspiration, much like the p in English pot. We can only be relatively certain that the distinction between aspirate and non-aspirate was not important in Tocharian.
One important distinction is that between palatal and non-palatal consonants. As is clear in the chart, the Tocharian languages have a large palatal inventory, and the distinction between e.g. l and ly is phonemic. This alternation is largely a result of historical processes which will be discussed elsewhere in these lessons.
All consonants can be single (e.g. s.) or doubled (e.g. s.s.), possibly denoting a difference in consonant length. Compare, for example, the distinction between [n] in English pennant and [nn] in English penknife. Doubled consonants are rare, however, in Tocharian A. There is evidence that consonant doubling might not (always) denote consonant length: it appears that ll is a frequent spelling for the single palatal consonant ly. From this and other alternations, it seems likely that doubling consonants is a typical manner of denoting palatalization.
The two Tocharian languages have a common inventory of simple vowels. Their transcription and probable phonetic values are given in the chart below.
| | Transcription: Front | Transcription: Central | Transcription: Back | Phonetic Value: Front | Phonetic Value: Central | Phonetic Value: Back |
|---|---|---|---|---|---|---|
| High | i | ä | u | [i] | [ɨ] | [u] |
| Mid | e | a | o | [e] | [á:] | [o] |
| Low | | a: | | | [a] | |
There is no certain evidence that the Tocharian languages had phonemic vocalic length. Rather, all vowels are phonemically short in both languages. It is important to note in this regard that the symbol a: is merely a convention of transcription -- it does not denote a long vowel, but rather an open, low, back unrounded vowel.
The phonetic value of ä is also poorly understood. Some evidence points to it being a front, mid vowel, though likely very weakly articulated. ä is often found where it is not etymologically expected, being the vowel generally employed to break up difficult consonant clusters.
By the time of the documented Tocharian languages, diphthongs remain only in Tocharian B.
| | Front | Central | Back |
|---|---|---|---|
| High | | | |
| Mid | | | oy |
| Low | | ay, aw | |
The diphthongs are falling diphthongs formed by the addition of one of the semivowels y or w. The diphthongs written with the simple vowel e appear to have had the vocalism of a diphthong with nucleus a: one finds variations <ey> alternating with <ai>, and <eu, e(o), e(o)w> alternating with <au>. Likewise <o(i)> alternates with oy. The diphthongs which existed in Proto-Tocharian were monophthongized in Tocharian A.
Some spellings indicate possible allophonic variation of consonants, that is, variation in the actual phonetic realization of a sound, but which nevertheless does not change the meaning of the form. Though the stops were generally voiceless in word-initial or word-final position, there is some evidence that stops were voiced in certain other environments. For example, the occasional writing of n` for n`k suggests that -k was voiced to [g] in this position. It also seems stops were generally voiced between vowels, or after a consonant but before a vowel. Doubling of stop consonants evidently denotes a voiceless consonant which would otherwise have been voiced between vowels: for example, nätk- 'push' has present stem nättäk-, suggesting that the t remained voiceless between vowels. A summary of the possible allophones of the stop phonemes in various environments is given in the table below.
| Phoneme | #_ or _# | V_V or C_V | N_ |
|---|---|---|---|
| /p/ | p | β | b |
| /t/ | t | d | d |
| /k/ | k | g | g |
The character used for w seems at times to represent the voiced bilabial fricative [β]. The v of Sanskrit, which frequently had a fricative pronunciation, is typically rendered by Tocharian scribes as v, emulating Sanskrit spelling, or as w (e.g. avis' or awis' from Skt. avi:ci). As the table shows, Tocharian p seems also at times to represent [β]. This might explain such spelling alternations as B cpi for more usual cwi 'his'.
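The table of possible stop allophones may be restated as a small lookup. The sketch below rests on several assumptions that go beyond the text: the neighbouring sounds are supplied already segmented, with None marking a word boundary; word-initial position is checked first, then a preceding nasal, then word-final position, then a following vowel; and a stop in any environment not covered by the table is left unchanged. The names `environment` and `realize` are invented for the illustration, and the voiceless doubled stops are not modelled.

```python
# Possible allophones of the stop phonemes, copied from the table.
# "β" is the IPA symbol for the voiced bilabial fricative.
ALLOPHONES = {
    "p": {"edge": "p", "prevocalic": "β", "postnasal": "b"},
    "t": {"edge": "t", "prevocalic": "d", "postnasal": "d"},
    "k": {"edge": "k", "prevocalic": "g", "postnasal": "g"},
}

VOWELS = {"a", "a:", "ä", "e", "i", "o", "u"}
NASALS = {"m", "n", "n'", "n`"}


def environment(before, after):
    """Name the environment of a stop; None stands for a word boundary."""
    if before is None:
        return "edge"        # #_
    if before in NASALS:
        return "postnasal"   # N_ (assumption: takes priority over _#)
    if after is None:
        return "edge"        # _#
    if after in VOWELS:
        return "prevocalic"  # V_V or C_V
    return "other"           # not covered by the table


def realize(stop, before, after):
    """Return the allophone suggested by the table, or the stop itself otherwise."""
    return ALLOPHONES[stop].get(environment(before, after), stop)


print(realize("p", None, "a"))   # p
print(realize("t", "a", "ä"))    # d
print(realize("k", "n`", None))  # g
```

The order of the checks is a modelling choice rather than an established fact: placing the nasal environment ahead of word-final position follows the observation that n` is occasionally written for n`k.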
Given the preceding discussion, it is possible to estimate the actual phonetic value of the Tocharian sounds. These are given in the table below. These can only be approximate at best and are certainly open to revision as the Tocharian languages become better understood.
| Transcription | Example | & |
|---|