The University of Texas at Austin; College of Liberal Arts
Hans C. Boas, Director :: PCL 5.556, 1 University Station S5490 :: Austin, TX 78712 :: 512-471-4566

Syntactic Typology: Studies in
the Phenomenology of Language

Winfred P. Lehmann


6. The Syntax of
Subject-Final Languages

Edward L. Keenan III

6.0. Introduction

In section 6.1 I present a definition of subject-final language and exhibit eight languages which appear to meet the definition. In section 6.2 I present some of the general syntactic properties of subject-final languages. And in section 6.3 I propose a partial answer to the question "Why are so few languages subject-final?"

6.1.0. Definition of Subject-Final Languages

By subject-final language I understand any language in which full noun phrase subjects must follow full noun phrase direct objects in the pragmatically less marked sentence types (which contain both subjects and direct objects) of the language. By pragmatically less marked (henceforth least marked or unmarked) I refer to those sentences which place the fewest restrictions on their contexts of appropriate use. Thus English would not be considered a subject-final language, since in general those sentences which present objects before subjects, like (1a) below, place more restrictions on their context of appropriate use than do sentences like (1b) in which the object follows the subject.

(1) a. Ice cream the children will eat.
  b. The children will eat ice cream.
(1a) is most appropriate in a context in which ice cream is being contrasted with, or mentioned to the exclusion of, something else, say spinach. And since (1b) does not put such restrictions on its context of use, it is (pragmatically) less marked than (1a).

Further, the use of must in the definition of subject-final language rules out of consideration languages like Walbiri (Australia; see Hale 1967), Tagalog (Philippines, Malayo-Polynesian; see Schachter and Otanes 1972), and Ignaciano (Bolivia, Arawakan, Andean-Equatorial Phylum; see Ott and Ott 1967). In these languages changing the relative order of subject and object in unmarked sentences appears to effect no change in the pragmatic markedness of the sentence. Thus attention here is restricted to languages in which the relative order of subject and object is an important property of the language. Any change in relative order of subject and object in the languages considered then will either yield ungrammatical structures or else ones which are more marked (pragmatically) than the original.

Determining which sentences in a language are less marked, however, can be a far from simple task. In the languages discussed below, assessment of the least-marked word order is based, depending on the language, on the following:

In the third case, most of the sources were not specifically concerned with the question of least-marked word order, so I have made the following assumptions: first, that the least-marked sentences will be among the most frequently occurring; second, that the citation order in simple sentences in the grammar is among the least-marked word orders; and, third, that the least-marked sentences have the greatest syntactic distribution. Thus they will in general present the greatest range of marking for tense/aspect, mood, voice, etc.; they will nominalize and embed more easily than more marked sentences; they will be the easiest to question and relativize into, etc. For example, in (1a,b) above from English, one can relativize on the children in the unmarked sentence (1b) but not in the more marked sentence (1a).

(2) a. *the children who ice cream will eat
  b. the children who will eat ice cream

6.1.1. Distribution of Subject-Final Languages

Using the definition given in the previous section, it is clear that very few languages appear to be subject-final. Indeed, Greenberg's first universal (Greenberg 1966: 110) states: "In declarative sentences with nominal subject and object, the dominant order is almost always one in which the subject precedes the object." Greenberg does, however, cite Coeur d'Alene, Siuslaw, and Coos, all Amerindian languages, as presenting the order V(erb) O(bject) S(ubject) as dominant. While lacking substantiating data on these languages, I have found eight other languages which appear to present VOS, in one variant or another, as the least-marked word order. These I refer to as the "sample" throughout my essay. There are, however, no convincing examples of languages whose unmarked word order is either OVS (but see postscript) or OSV, the other two logical possibilities for being subject-final (considering only the relative order of subject, verb, and object). See Pullum 1977 for discussion and rejection of the few tenuous examples of languages suggested as being OSV.
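The three logical possibilities mentioned above can be enumerated mechanically. The following is a minimal sketch in Python, not part of the original study; the function name `is_subject_final_order` is hypothetical, and the check considers only the relative order of S, V, and O, ignoring the pragmatic markedness that the definition in 6.1.0 also requires.

```python
from itertools import permutations


def is_subject_final_order(order: str) -> bool:
    """Return True if the subject (S) follows the object (O) in the order string."""
    return order.index("S") > order.index("O")


# All six logically possible orders of subject, verb, and object.
all_orders = ["".join(p) for p in permutations("SVO")]

# Keep only the orders in which the subject follows the object.
subject_final = sorted(o for o in all_orders if is_subject_final_order(o))
print(subject_final)  # ['OSV', 'OVS', 'VOS']
```

Of the six orders, exactly three place the subject after the object, which matches the VOS, OVS, and OSV possibilities named in the text.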

6.1.2. Some Subject-Final Languages

6.1.3.1. Malagasy (Madagascar; Malayo-Polynesian)

My data on Malagasy are based on a year of field work in situ, continued informant work, and several good grammars (e.g., Malzac 1926). For sentences which present both subject and direct object, the least-marked word order is clearly VOXS, where X is any other full noun phrase required by the verb, such as indirect object, benefactive, locative, instrumental, etc. If several such noun phrases occur, it is possible for some locatives and temporals to occur to the right of the subject. The direct object, however, is rigidly bound to the immediate postverbal position; putting any other noun phrase between the verb and direct object results in resoundingly ungrammatical sentences.

(3) Manasa lamba (ho'an ny ankizy) ny zazavavy.
  wash clothes (for the children) the girl
  'The girl is washing clothes (for the children).'
(4) Nametraka ny harona teo ambon'ny latabatra Rabe.
  placed the basket there on-the table Rabe
  'Rabe put the basket on top of the table.'
The only alternate word order in which the subject would precede the object is SVO. This order, however, is clearly more marked. It occurs rarely in discourse and has a contrastive emphatic effect. (5) would be a natural context of use.
(5) Inona no ataon'ny mpianatra?
  what Ptc. done-by-the students
  'What are the students doing?'
  Ny zazavavy mihira, ny zazalahy mianatra.
  the girl sing the boy study
  'The girls are singing; the boys are studying.'

Of more common occurrence in Malagasy is an SVO order in which the subject is separated from the predicate phrase by the particle dia.

(6) Rasoa dia manasa lamba (izy).
  Rasoa Ptc. wash clothes 3Sg.
  'Rasoa (she) is washing clothes.'
This construction, however, represents a clear, if weak, topicalization of the subject (or any noun phrase). Its distribution is largely limited to main clauses. Sentences with such fronted noun phrases cannot be nominalized, relativized into, etc. The sentence type is thus clearly not among the least marked in Malagasy. See Keenan 1976a for further discussion.

Furthermore, one could question whether (4) above is really among the least-marked sentences (pragmatically speaking; see Keenan 1976a for justification that it is among the least-marked syntactically). Sentence (7) below, the passive variant of (4), would probably be a more usual way to express the idea in (4).

(7) Nopetrahan-d Rabe teo ambon'ny latabatra ny harona.
  placed-by Rabe there on-the table the basket
  'The basket was put on the table by Rabe.'
(7), however, does not present (on the surface) a direct object phrase and so is not one of the sentences on which the definition of subject-final is based. Apparently then the set of least-marked sentences in Malagasy contains fewer ones with direct objects than does the set of least-marked sentences in, say, English. If this set contained no sentences with direct objects in Malagasy, we would not be justified in calling Malagasy a subject-final language. But this is not the case. Sentence (3) cannot be construed in a passive form, since the semantic direct object lamba 'clothes' is indefinite and surface subjects in Malagasy must be definite; that is, they are presupposed referential. See Keenan 1976a for more discussion. Thus (8) is clearly ungrammatical.
(8) *Sasan'ny zazavavy (ho'an'ny ankizy) lamba.
  washed-by-the girl (for-the children) clothes
  'Clothes are being washed (for the children) by the girl.'
Thus the simple sentence in (3), without the oblique noun phrase, is the only unmarked way to express this idea. Malagasy, then, does present unmarked sentences with both subjects and direct objects; in such instances the subject always follows the object.

6.1.3.2. Batak (Toba Dialect, Northern Sumatra, Malayo-Polynesian)

Data here are based on informant work with two native speaker linguists, Dr. Maruli Butar-Butar and Mr. Liberty P. Sihombing, as well as on Silitonga 1973. (Page numbers by examples below all refer to Silitonga 1973.) I consider here only the Toba spoken in Northern Sumatra. There is a large Toba community in Djakarta (Java) which, on the basis of elicitation from one native speaker, appears to use SVO as the least-marked word order. It is likely that this is due to the influence of Bahasa Indonesia.

The unmarked order in active sentences in Toba Batak is VOSX. No noun phrase can intervene between the verb and its direct object. Oblique noun phrases, including indirect objects, follow the subject.

(9) Mangisap sandu nasida di djabu. (p. 21)
  smoke opium they in house  
  'They are smoking opium in the house.'
(10) Mangalean poda guru i tu dakdanak i.
  give advice teacher the to child the
  'The teacher gives advice to the child.'
Here again, however, SVO is a grammatically possible order, as in (11a).
(11) a. Ibana mangisap sandu.
    he smoke opium
    'He is smoking opium.'
  b. Ndang ibana mangisap sandu. (p. 69)
    not he smoke opium
    'He isn't smoking opium.'
But the SVO order again seems contrastive. Thus (11b), the negation of (11a), only denies that he, in distinction to someone else, is smoking opium. It is still implied that someone else is smoking opium. Thus (11a) makes presuppositions on its context of use concerning whether other people than those mentioned in the sentence are smoking opium or not. Hence (11a) is more marked than the corresponding sentence in VOS order. As further support that VOS is the least marked order, SVO sentences occur only rarely in Silitonga (1973) and in my elicited data, while VOS occurs commonly. In addition, Silitonga argues that VOS is the syntactically most basic order, and he derives SVO sentences from VOS ones via a transformation. I conclude then that VOS sentences like (9) and (10) are among the least-marked sentences in Toba Batak. The direct objects in these sentences are, however, indefinite. While it is possible for direct objects to be definite, as in (12) below,
(12) Manghindat poti i baoa i. (p. 40)
  lift case the man the  
  'The man lifted the case.'
it is more usual for sentences in which there is a definite patient to present themselves in either of two ways. First, as in Malagasy, the sentence can be passivized, as in (13).
(13) Dihindat baoa i poti i.
  lift-by man the case the
  'The case was lifted by the man.'
The use of passives, which exist in a considerable variety of forms, is extremely common in Toba Batak (see Silitonga 1973, Ch. 11, for discussion). And some verbs can appear only as passive in simple sentences (although the active form shows up in more complex constructions).
(14) a. *Mananda baoa i ibana.
    Act.-know man the he
    'He knows the man.'
  b. Ditanda ibana baoa i. (p. 27)
    Pass.-known he man the
    'The man is known to him.'
  c. Ise mananda baoa i?
    who Act.-know man the
    'Who knows the man?'
These data suggest then that, as in Malagasy, the set of unmarked sentences in Toba Batak contains fewer sentences presenting direct object phrases than does the corresponding set in English. Nonetheless, those sentences like (9) and (10) with indefinite direct objects appear to be in this set. Furthermore, there is a second type of active construction in which definite direct objects precede subjects and which may be among the unmarked constructions in the language; a particle do is inserted between the direct object and the subject, as in:
(15) Manghindat poti na borat i do baoa i. (p. 30)
  lift case that heavy the Ptc. man the  
  'The man lifted the heavy case.'
Most usually do serves to single out the constituent which precedes it with the force of a cleft construction, as in (16).
(16) Si Bissar do mangisap sandu.
  Art. Bissar Ptc. smoke opium
  'It is Bissar who smokes opium.'
Sentences like (16) presuppose it is known that someone smokes opium and assert merely that the person is Bissar. Hence they are more marked than the simple VOS statement. We expect then that in sentences like (15) it is the whole predicate phrase lift the heavy case which is being contrasted with some other relevant action, such as locking up the room. However, such sentences occur quite frequently, either in active or passive forms, as translations of unmarked declarative sentences of English, so that it appears that this contrastive force of do is hardly felt as a predicate focus. Rather do seems to function as a mere particle separating the predicate phrase from the subject.

I conclude then that active sentences with indefinite objects are clearly among the unmarked sentences in Toba Batak, and that ones with definite objects probably are for certain verbs but not for others.

6.1.3.3. Fijian (Fiji, Malayo-Polynesian)

My data here come principally from a two-term field methods course given by me with the collaboration of a native speaker of Fijian. I wish to thank the participants in that field methods course for their help in procuring the Fijian data. Specifically I have used examples here from papers by Eser Ergavanli, Ammon Gordon, and Lynn Gordon.

The unmarked word order in Fijian appears to be VOXS, as in Malagasy, but with a considerably greater degree of word order freedom as discussed below.

(17) A tauva nai lavo mai na kato na tagane.
  past take Art. money from Art. box Art. man
  'The man took money from the box.'

(The "articles" nai/na above merely indicate that the nouns that follow them are common nouns and not that they are definite. In particular, direct objects occur easily as definite; lavo 'money' above could as well have been glossed as the money.) Simple sentences in Fijian may sometimes, however, appear in either a VOS or a VSO order:

(18) a. Sa kila na taro o Wati.
    Past-3Sg. know Art. question Subj. Wati
    'Wati knows the question.'
  b. Sa kila o Wati na taro.
    Past-3Sg. know Subj. Wati Art. question
    'Wati knows the question.'
The particle o precedes proper noun or independent pronoun subjects, so it is clear that (18b) has order VSO. The VSO order, however, is much more restricted in its distribution than the VOS order. Thus if both subject and object are proper nouns, or both are independent pronouns, the VSO order is not acceptable.
(19) a. Sa kila-i Bale o Wati.
    Past-3Sg. know-Obj. Bale Subj. Wati
    'Wati knows Bale.'
  b. *Sa kila-(i) o Wati Bale.
    Past-3Sg. know-Obj. Subj. Wati Bale
    'Wati knows Bale.'

When the direct object is a proper noun or a pronoun, the verb takes a characteristic suffix, -i for the example chosen. Further, preceding the verb is a pronominal clitic whose form varies as a function of the person and number of the subject.

It would appear then that VOS order is usable in a greater variety of instances than VSO order and so should be considered the less marked of the two. Further, there are two other reasons for considering VOS as the more basic of the two orders.

First, "heavy" noun phrases (with lexical heads) occur easily in the object slot in VOS order, but heavy subjects cannot occur in the VSO order (Gordon 1976). Once again then the distribution of VOS is wider than that of VSO.

(20) a. A raica na tagane ka vakamatea na toa o koya.
    Past see Art. man who kill Art. chicken Subj. he
    'He saw the man who killed the chicken.'
  b. *A raica na tagane ka vakamatea na toa koya.
    Past see Art. man who kill Art. chicken him
    'The man who killed the chicken saw him.'
(Koya in (20b) must be interpreted as the object since it lacks the subject particle o.)

Second, in sentences in which subject and object are both common nouns, and so not distinguished by case marking or verbal "agreements," the informant generally preferred the VOS interpretation of the sentence rather than saying it was ambiguous (although judgments here were not always consistent, so this evidence is less convincing than it might be).

(21) A raica na tagane na yalewa.
  Past see Art. man Art. woman
  'The woman saw the man.'
?'The man saw the woman.'

In addition to VSO, Fijian presents another alternate word order in which the subject precedes the object: SVO. Thus, (22) is an acceptable variant of (21), according to our informant.

(22) Na yalewa a raica na tagane.
  Art. woman Past see Art. man
  'The woman saw the man.'
(22) has only the reading on which the woman is the subject.

It appears likely, however, that the possibility of SVO order here was an artifact of the elicitation situation (in English) and the informant's having lived in America for some years. For example, in a survey of three recent newspaper articles in Fijian (where one might expect English language influence) no SVO sentences were observed, while verb-initial structures were very common. SVO is not the ordinary citation order in the literature. See, for example, Arms 1974 and references cited there. Furthermore, even for the informant SVO has a more limited distribution. Thus from (21) the phrase na tagane 'the man' can be relativized as in (23) below.

(23) na tagane ka raica na yalewa
  Art. man Rel. see Art. woman
  'the man who saw the woman' or
'the man whom the woman saw'
In (23) there is no way to tell whether yalewa 'woman' functions as the subject or the object of the verb, and the relative clause is fully ambiguous as between the two readings. If, however, we could relativize on tagane 'man' from (22), the meaning would be unequivocal since yalewa 'woman' would be left in the preverbal position and hence necessarily be the subject. However, despite this good motivation, our informant in general rejected such structures.
(24) *na tagane ka na yalewa a raica
  Art. man Rel. Art. woman Past see
  'the man whom the woman saw'

So, again, VOS seems the more unmarked of the two structures and I conclude that Fijian is basically a VOS language, but less rigidly so than Malagasy, for instance.

6.1.3.4. Gilbertese (The Gilbert Islands, Micronesia; Malayo-Polynesian)

The data here are based on informant work and Cowell 1951.

In very many respects, both syntactically and morphologically, Gilbertese patterns like Fijian. For example, it has preverbal subject clitics and object "agreement" suffixal to the verb. The least-marked word order is VOSX, where X covers all prepositional noun phrases including indirect objects. As my data on Fijian are considerably more extensive than those for Gilbertese, I merely give an example of subject-final order here and do not discuss Gilbertese otherwise.

(25) E kamatea te naeta te moa.
  it_i kill-it_j Art. snake_j Art. chicken_i
  'The chicken killed the snake.'
My claim that Gilbertese is subject-final is based on two considerations: first, that was the order of elicited material, and, second, it is the cited basic order in Cowell 1951. I lack information, however, as to whether SVO and VSO are grammatical orders, whether they represent topicalized or more marked orders, and so on.

6.1.3.5. Tzeltal (Southern Mexico; Mayan Family, Penutian Phylum)

The data here are based on field notes, elicitation, and texts provided by Penny Brown (see also Brown, forthcoming).

The least-marked word order in Tzeltal appears to be VOXS, with the possibility of locatives occurring after the subject attested in texts for intransitive sentences:

(26) La y- il te'tikil mut ta hamal te Ziak-e.
  Past he- see wild chicken in forest Art. Ziak-Art.
  'Ziak saw a wild bird in the forest.'
SVO, however, is also an attested order, while VSO is not:
(27) 'In te winik- e, la s- mil s- bankil.
  that-one Art. man- Art., Past he- kill his- brother
  'That man killed his brother.'

The VOS order appears clearly to be the less marked of the two orders, VOS and SVO, for the following reasons:

Often, however, there is a pause between the verb phrase and the subject in VOS sentences, which may suggest that VOS is a kind of "afterthought" order. The relative frequency of VOS over SVO, however, argues against this. Further, the SVO sentences in the text also present the subject followed by a pause. What may be more generally true is that any full noun phrase marked on the verb is separated by a pause, usually, from the rest of the sentence. Verbs mark both subjects and indirect objects, with the corresponding full noun phrase present; indirect objects may also be marked by a pause.

(30) La y-ak'- be s-na, te anc, te winik -e
  Past 3-give- to-her 3-house, Art. woman, Art. man -Art.
  'The man gave his house to the woman.'

6.1.3.6. Otomi (Mezquital Dialect, Hidalgo, Mexico; Oto-Manguean Phylum)

All data here come from Hess 1968.

The least-marked word order for transitive sentences is given by Hess as:

Temp.-Foc.-V-O-S/IO-Obl.
(31) Bihě dútu núʔą ra da̧mé.
  he-donned-it his-Pl. clothing that the man
  'That man put on his clothing.'
(32) Pε̌ʔca ʔna ra ngǔ núʔą ra ríko.
  he-has-it one the house that the rich-man
  'That rich man has a house.'

Thus the subject and the indirect object occur in complementary distribution, but either follows the direct object. The immediate preverbal position is what Hess calls a focus position; it can be filled by any major noun phrase, such as the subject, direct object, or locatives. "Focus indicates to the listener by a shift of the item from post-nuclear to pre-nuclear that the speaker is shifting attention to something not already calling for special attention in the preceding clauses" (Hess 1968: 80-81).

It appears clear then that Otomi is subject-final, and that instances of SVO order are focused or pragmatically more marked. Further, VOS is the citation order of examples in Hess. In the two texts given in Hess there are four sentences with full noun phrase subject and direct object; three have VOS order and the other has SOV order, presumably indicating some sort of double focus.

6.1.3.7. Ineseño Chumash (Southern California, Chumash Family, Hokan Phylum)

The data here come entirely from Applegate (1972), whose work is based on the field work of John P. Harrington, done in 1911 and 1919. Ineseño Chumash is now presumed extinct. I am indebted to Pamela Munro for drawing my attention to Applegate's work and for providing a typological sketch of Ineseño Chumash, which summarizes the lengthy description in Applegate. (Page references cited below are to Applegate.)

Applegate (p. 475) states that the "favored, neutral (word) order" in I-Chumash is V-IO-DO-S-Obl., noting (p. 473) that there is some variation in the relative order of IO and DO. He notes further (p. 466) that full noun phrases carry no case marking (excluding obliques) and that relative position after the verb is the primary indicator of grammatical relations (subject, direct object, etc.).

(33) S-ul'iš- it ha- k- tuʔ ha- ʔɨhɨy-ʔ. (p. 471)
  3-grab- me Art.- my- ear Art.- man  
  'The man grabs my ear.'
(34) S-uluaqpey- us- wun ha- weselu ha- mɨy.
  3-chase- 3- 3Pl. Art.- calf Art.- wolf
  'The wolf chases the calves.'
There are, however, two ways, in simple sentences, in which an SVO order may arise (VSO does not seem to be attested at all). First, most major noun phrases may be moved to a preverbal position by a process of topicalization. Subjects, direct objects, and locatives are among those which can be so topicalized:
(35) Šow ha s-ʔuw hi Ponoya.
  pespibata Ptc. 3-eat Ptc. Ponoya
  'Ponoya eats pespibata.'
We may assume that the topicalized order is less basic than the VOS order, since it has the effect of emphasizing the topicalized noun phrase. For example, if the topicalized noun phrases are independent pronouns, they often occur in a special emphatic form when preverbal (p. 484).

Second, apparently it is not possible for more than two full noun phrases to follow the verb. Thus if the sentence has more than two, the subject may occur preverbally, as in (36); but nonsubjects can also be fronted, as in (37).

(36) Ma- ʔeneq hi s-sin'ay ha- malak hi mam' ha- s-ʔawaq.
  Art.- woman Ptc. 3-put Art.- tar Ptc. inside Art.- 3-jug
  'The woman puts tar inside her jug.'
(37) Ma- takak ha- s-am- axšɨs ha- maxal'amɨs hi ʔas'aka.
  Art.- Quail Art.- 3-Indef.- invite Art.- fiesta Ptc. ʔas'aka
  'They invite Quail (to) the fiesta (at) ʔas'aka.'

It appears clear then that for simple sentences the least-marked order is VOS; orders in which the subject precedes the object are either emphatic or grammatically conditioned if the sentence is complex.

6.1.3.8. Baure (Bolivia, Arawakan Family, Andean-Equatorial Phylum)

All data here come from a single article, Baptista and Wallin 1967. (Page numbers in this section all refer to that article.) The least-marked word order appears to be VOS, as illustrated in (38) and (39). However, very few sentences presenting full noun phrase subjects and direct objects occur in Baptista and Wallin texts, so Baure is the least well supported candidate for being subject-final in our sample.

(38) Ro- pónoek- iyo- wo- ni to čor teč ni-šír?
  3Sg.M.- plant- where- Punct.- Incom. the corn that my-son
  'Where did my son plant the corn?'
(39) Wéčon to neč te hir ačó- w to ro-píri.
  fighter the those this man and- Punct. the his-brother
  'This man and his brother are fighting those.'
SVO, however, is a grammatically possible word order:
(40) To šiyé ro- hínokopaw kon to rámpikow teč toéroker.
  the fox 3Sg.M.- see what the he-carry-come that field-man
  'The fox was seeing what the farmer was bringing.'
In my view VOS represents the least-marked order in Baure for the following reasons: first, of the twenty-four clause types formulaically described in Baptista and Wallin, twelve present subjects and direct objects. Of these, eleven present the subject slot after the object slot and one presents SVO order. These orders are described as the most frequent (p. 36), although it is also stated there that the order varies freely. Second, for all the intransitive sentence formulae presented, the subject follows the verb. To my knowledge, basic intransitive order VS associated with transitive order SVO is nowhere attested among the languages of the world. Third, and perhaps most important, clause-initial position is claimed to be emphatic (p. 33), although no exemplified discussion is given regarding the meaning differences between SVO and VOS orders.
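The clause-type counts in the first argument can be restated as a proportion. This is only an arithmetic sketch of the figures quoted above from Baptista and Wallin; the variable names are introduced here for illustration and do not come from the source.

```python
from fractions import Fraction

# Figures as quoted in the text for the formulaically described clause types.
total_clause_types = 24
with_subject_and_object = 12
subject_after_object = 11
svo = 1

# The two subtypes should account for all clause types with subject and object.
assert subject_after_object + svo == with_subject_and_object

share_subject_final = Fraction(subject_after_object, with_subject_and_object)
print(share_subject_final)                   # 11/12
print(f"{float(share_subject_final):.1%}")   # 91.7%
```

These are counts of clause formulae, not of sentence tokens in the texts, so the proportion says nothing direct about discourse frequency.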

I conclude then that the Baure evidence suggests that it is subject-final, but clearly more extensive investigation would be needed to establish this convincingly.

6.2. Some General Properties of Subject-Final Languages

In this section I present several generalizations (Gs) concerning the syntax of subject-final languages.

G-1: Subject-final languages are always verb-initial (but see Postscript).

That is, in the least-marked sentence types in a subject-final language, the verb normally precedes the major noun phrases required by the verb. Thus in intransitive sentences the unmarked word order is VSX (e.g., Batak) or VXS (e.g., Malagasy, Chumash). For the languages in our sample, G-1 needs only sporadic qualifications. Perhaps the most significant is that in Chumash, intransitive sentences with nominal predicates usually have an SV order, although VS order is also possible:

(41) Kay ka wot'. (Applegate 1972: 452)
  he Ptc. chief  
  'He is a chief.'
(42) Šiša-k-pepe? hi kay. (Applegate 1972: 452)
  half-my-brother Ptc. he  
  'He is my half-brother.'
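As a rough consistency check on G-1 for transitive sentences, the least-marked orders reported in section 6.1 can be tabulated and tested. The sketch below is an illustration only: the slot lists are simplified encodings of the formulas given in 6.1 (Otomi's preverbal Temp. and Foc. slots are omitted, and its shared S/IO slot is written as S), and the names `SAMPLE_ORDERS`, `verb_initial`, and `subject_after_object` are hypothetical.

```python
# Least-marked transitive orders as reported in section 6.1, simplified.
# X = other noun phrases, IO = indirect object, Obl = oblique noun phrases.
SAMPLE_ORDERS = {
    "Malagasy": ["V", "O", "X", "S"],
    "Toba Batak": ["V", "O", "S", "X"],
    "Fijian": ["V", "O", "X", "S"],
    "Gilbertese": ["V", "O", "S", "X"],
    "Tzeltal": ["V", "O", "X", "S"],
    "Otomi": ["V", "O", "S", "Obl"],  # assumption: S/IO slot encoded as S
    "Ineseño Chumash": ["V", "IO", "O", "S", "Obl"],
    "Baure": ["V", "O", "S"],
}


def verb_initial(slots):
    """True if the verb is the first slot of the encoded order."""
    return slots[0] == "V"


def subject_after_object(slots):
    """True if the subject slot follows the direct object slot."""
    return slots.index("S") > slots.index("O")


for language, slots in SAMPLE_ORDERS.items():
    print(language, verb_initial(slots), subject_after_object(slots))
```

Every language in the table passes both checks, but only because the encoding covers unmarked transitive orders; the Chumash nominal-predicate sentences in (41) fall outside it.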

G-2: Subject-final languages normally occur in linguistic phyla in which verb-initial languages are common.1

The most serious qualification of G-2 concerns Chumash, presumably a member of the Hokan phylum. The more common order in Hokan would appear to be verb-final, as for Yuman languages generally, such as Digueño and Mojave (see Munro 1974 for discussion). (My knowledge about the distribution of word order types in other Hokan languages is limited. Specifically, I know nothing of the Hokan languages in northern California to which Applegate asserts Ineseño Chumash is related. It is quite generally true however that Amerindian languages of the northwest coast are verb-initial. Thus Salish, e.g., Bella Coola, is dominantly verb-initial. So are Wakashan languages like Nootka, at least some Penutian languages like Chinook, and various isolates like Quileute.) On the other hand, one may infer from Applegate that the affiliation of Chumash with Hokan is not entirely well established: "The Chumashan languages are thought to be related to the Hokan stock, represented primarily in northern California, but this relationship is not an easy one to establish" (1972: 1).

The phyla to which the other languages in my sample belong evidence significant verb-initiality. The best established evidence is for the Malayo-Polynesian (Austronesian) phylum. The four languages cited here from that phylum, Malagasy, Fijian, Gilbertese, and Toba Batak, are all verb-initial. In addition, Philippine languages (Tagalog, Kapampangan, etc.) are all verb-initial, as are the Polynesian languages, Maori, Tahitian, etc.

As regards Tzeltal (Mayan family, Penutian phylum) verb-initial is a common order in Mayan (e.g., Jacaltec; see Craig 1977 for discussion). Elsewhere in Penutian, e.g., Chinookan on the northwest (American) coast, verb-initial occurs. As regards Otomi (Oto-Manguean phylum) the Zapotecan language dialects are verb-initial (see Pickett 1960 for discussion).

Finally, as regards Baure (Arawakan family, Andean-Equatorial phylum) the picture is slightly more complex. Several other Arawakan languages (we know little about the distribution of word order types more generally in Andean-Equatorial) are known to be verb-initial, e.g., Ignaciano (Ott and Ott 1967) and Machiguenga (Snell and Wise 1963). But at least one other Arawakan language, Piro (Matteson 1965), is cited as verb-final. And furthermore, Baure, as well as Ignaciano and Machiguenga, presents several morphosyntactic properties characteristic of verb-final languages as opposed to verb-initial ones. Most of these properties (see below) concern the position of small morphemes which are either completely bound or else have their position rigidly fixed relative to a major constituent (verb or noun phrase) of a sentence. On the assumption that the position of bound or nearly bound morphemes changes more slowly historically speaking than the relative position of unbound constituents, the hypothesis that the verb-initial order in Arawakan is a relatively recent innovation, the bound morphology not yet having "caught up" so to speak, is tempting and worthy of further investigation.

G-3: SVO is a grammatical (although marked) word order in all VOS languages.

This point has been substantiated in section 6.1 above. Here I note only that, as Greenberg has pointed out, SVO is quite generally an "alternate" order available in verb-initial languages.

My next generalization concerns verb agreement. A verb agrees with a full noun phrase in a sentence if it has bound morphemes or clitics whose form varies with the noun class of the full noun phrase.

G-4: If a language is subject-final then either transitive verbs of unmarked sentences agree with no full noun phrase in the sentence or they agree with two noun phrases.

G-5: If transitive verbs in subject-final languages present agreement at all, then they have a prefixal (pre-verb stem) agreement with the subject noun phrase and a suffixal agreement with a nonsubject.

For the languages in our sample, Malagasy and Batak present no verb agreement at all. All the others have a prefixal subject agreement and some form of suffixal object agreement. In Tzeltal and Otomi the suffixal agreement is with the indirect object rather than the direct object. It is unclear from Hess 1968 whether Otomi verbs also inflect for direct object, or whether there is simply a distinction in stem forms for transitive and intransitive verbs. In Baure, object agreement is, according to Baptista and Wallin 1967, optional when the full noun phrase direct object is present, but shows up with pronominal force when the full noun phrase is not present, as in (43):

(43) Pi- kótokoše- ro.
  you grab-will- it
  'You will grab it.'
And for some sentence types, the suffixal proform on the verb must be interpreted as referring to the indirect object if a full noun phrase direct object is present.
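
The agreement facts just described can be tabulated and checked mechanically against G-4 and G-5. What follows is only a minimal sketch in Python: the table `AGREEMENT` and the helper names are hypothetical, and the entries record nothing beyond what the text above states for the eight languages of the sample.

```python
# Agreement markers on transitive verbs, one (position, target) pair per marker.
# An empty list means the language shows no verb agreement.
# The encoding is illustrative; it is not taken from the source grammars.
AGREEMENT = {
    "Malagasy":   [],
    "Batak":      [],
    "Fijian":     [("prefix", "subject"), ("suffix", "object")],
    "Gilbertese": [("prefix", "subject"), ("suffix", "object")],
    "Chumash":    [("prefix", "subject"), ("suffix", "object")],
    "Baure":      [("prefix", "subject"), ("suffix", "object")],
    "Tzeltal":    [("prefix", "subject"), ("suffix", "indirect object")],
    "Otomi":      [("prefix", "subject"), ("suffix", "indirect object")],
}


def satisfies_g4(markers):
    """G-4: the verb agrees with no noun phrase or with two."""
    return len(markers) in (0, 2)


def satisfies_g5(markers):
    """G-5: any agreement is prefixal for the subject, suffixal for a nonsubject."""
    if not markers:
        return True
    has_subject_prefix = ("prefix", "subject") in markers
    has_nonsubject_suffix = any(
        position == "suffix" and target != "subject"
        for position, target in markers
    )
    return has_subject_prefix and has_nonsubject_suffix


exceptions = [
    language
    for language, markers in AGREEMENT.items()
    if not (satisfies_g4(markers) and satisfies_g5(markers))
]
print(exceptions)  # prints [] for the sample as encoded here
```

A language with subject agreement alone, encoded as a single marker, would fail `satisfies_g4`, which is the case the generalization is meant to exclude.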

G-6: Subject-final languages have relatively little nominal case marking.

By nominal case marking I mean that full noun phrases carry affixes or pre- or postpositions which indicate the semantic and/or the grammatical relation they bear to the verb. Thus for no language in my sample is there a general way of distinguishing full noun phrase subjects and objects by nominal case marking. Malagasy and Fijian do however present limited means for making a subject-object distinction by case marking.

In Fijian, proper nouns and the independent personal pronouns (but not common nouns) are preceded by a particle o when they occur as subjects:

(44) E mokut- i Wati o Bale.
  3Sg. hit- Obj. Wati Topic Bale
  'Bale hit Wati.'
This particle is more properly considered a topic marker rather than principally a subject marker, since pronouns and proper nouns, when they are fronted as topics, also carry this particle:
(45) O Wati e mokuta o Bale.
  Topic Wati 3Sg. hit   Bale
  'Wati Bale hit.'
Nonetheless in unmarked sentences o does distinguish pronoun and proper noun subjects from objects. If the subject is a common noun phrase, however, it is not morphologically distinct from the object.

In Malagasy, proper noun human direct objects as well as direct objects beginning with the phoneme /i/ may take a locative-genitive particle an-. For some speakers the use of this particle is optional. Compare (3) and (4), in which the object is unmarked, with (46) below:

(46) Nahita an- dRabe Rasoa.
  saw Acc.- Rabe Rasoa
  'Rasoa saw Rabe.'

Further, it might appear from examples like (30) that in Tzeltal subjects carry a special discontinuous form of the article te ... -je, while nonsubjects have only te. It appears from my data however that the discontinuous form of the article is generally used for the last noun phrase in a sentence, regardless of grammatical role. Thus in (29) Ziak has the discontinuous form of the article even though it may be interpreted as the object, not the subject. The distribution of the discontinuous form of the article would however bear further investigation.

Despite these qualifications, it seems clear from the data that in general in subject-final languages common noun phrase subjects and objects are not distinguished by adpositions or case affixes, and proper noun subjects and objects are only sporadically distinguished by these means.

Furthermore, it is not uncommon in these languages for indirect objects to carry no nominal case marking. Thus full noun phrase indirect objects are most usually unmarked by affixes or adpositions in Malagasy, Chumash, Tzeltal, Otomi, and to judge from very limited data, Baure. On the other hand, such indirect objects are constructed with prepositions in Fijian, Gilbertese, and Batak.

G-7: Subject-final languages are generally prepositional rather than postpositional.

Baure is a partial counterexample to G-7. The adpositions which occur as bound morphemes are clearly postpositions, not prepositions:

(47) Biyónopoekopas̀a pon soratí- ye.
  we-walk-down other town- to
  'We are going by land to another town.'
On the other hand, the postpositional system in Baure does seem to be of limited productivity. Thus postpositions occur only on nouns and do not follow entire noun phrases. (48) and (49) are from Baptista and Wallin 1967: 45:
(48) a. to pon bipér típoreko- ye c̀ic̀a
     the other Class. chicken- in big
     'in the other big chicken'
  b. to no- sóri-ye to ni-ronáneb
     the their- town-in the my-parents
     'in my parents' town'

Furthermore, adpositions which are morphemically independent occur as prepositions:

(49) a. iyowón soratí- ye
     from town- in
     'from within the town'
  b. is̀kón embére
     until tomorrow
     'until tomorrow'
Since postpositions are highly characteristic of verb-final languages, this patterning is quite consistent with my earlier suggestion that Baure (as well as Ignaciano and Machiguenga, which also present bound adpositions as postpositions) has recently changed from a verb-final to a verb-initial language.

Otherwise, all the other languages in the sample here are exclusively prepositional. Yet most of the American Indian languages in the sample seem to have very few unanalyzable prepositions. Thus in Tzeltal, ta is just about the only unanalyzable preposition in the corpus; it translates instrumental and locative prepositions in English. Chumash (see Applegate 1972: 431) may have no unanalyzable prepositions at all. Even locatives are normally presented without prepositions, as in example (37), although the verb may take an affix indicating that a locative element is present. The few candidates for prepositions appear to be derived from independently existing nouns or verbs. Finally, in Otomi there again appear to be only a few indigenous prepositions. Many locatives, for example, are presented without overt locative prepositions. And many of the prepositions which do exist are clearly borrowings from Spanish, as Hess (1968: 145) points out.

Finally, verb-initial languages generally are prepositional (Baure and related Arawakan languages plus a few others such as Quileute on the northwest coast notwithstanding), so it is not likely that prepositionality is directly dependent on the property of being subject-final.

G-8: In subject-final languages noun phrase questions can always be formed by putting the question word, e.g., Who? What? etc., in a preverbal position, provided the question word is not a bound morpheme.
This generalization holds for all the languages in our sample. The Malagasy example below is typical, although it is not true that the question word is separated from the rest of the sentence in all the languages.
(50) Iza no manasa lamba?
  who Ptc. wash clothes
  'Who is washing clothes?'

The proviso to G-8 regarding morphemically independent question words is necessitated by Baure. Some question words, as in (38), occur as bound morphemes on the verb, and these are not fronted in questions. Others, such as Who? and Why? do occur as independent morphemes, however, and so are fronted, taking, curiously, subject and object clitics (at least in the case of Why?).

(51) Ko- ro- pi poékon?
  who-why him- you allower
  'Why do you let him?'

Furthermore, in the languages in my sample, it is not always necessary to front independent question words. For example, both Malagasy and Batak allow certain types of non-subjects to be questioned merely by inserting the appropriate question word in the noun phrase position, as (52) from Malagasy illustrates.

(52) Manasa lamba amin'inona Rasoa?
  wash clothes with-what Rasoa
  'With what is Rasoa washing clothes?'
(52) need not be an echo question.

It is possible that G-8 can be stated in a more general fashion, something along the lines of "focused, or relatively new, information is fronted." I cannot give a general definition of focused constituent but the intent here is to characterize the information role of John in sentences like (53).

(53) It was John whom Mary saw.
Intuitively, (53) presupposes that Mary saw someone and asserts that that person was John. Similarly in questions like Whom did Mary see? it is presupposed that Mary saw someone, and the identity of that person is requested. Several of the languages in the sample here, notably Malagasy, Batak, and Fijian, present constructions which focus on noun phrases in the way (53) focuses on John. And all three languages focus the noun phrase by presenting it in a preverbal position. The Malagasy (54) is illustrative:
(54) Rasoa no manasa lamba.
  Rasoa Ptc. wash clothes
  'It is Rasoa who is washing clothes.'

Data on focusing in the other languages of the sample are lacking, but if they have such focus constructions and they follow the pattern of question formation, as is plausible on semantic grounds, then the generalization of G-8 would be established.

G-9: All subject-final languages present morphemically independent subordinate conjunctions which precede a finite subordinate clause.

(55) from Malagasy is illustrative:

(55) Tsy faly Rabe satria marary Rasoa.
  not happy Rabe because sick Rasoa
  'Rabe is not happy because Rasoa is sick.'
The significance of G-9 however is considerably weakened by the fact that subordinate conjunction plus finite subordinate clause is not a dominant way of expressing subordination in the Amerindian languages of my sample. Subordinate conjunctions exist in Tzeltal, but the corpus contains very few examples, so it does not look like a well-developed category. In Otomi several of the subordinate conjunctions are clearly borrowings from Spanish, so again the category does not look well entrenched in the language. In Baure, those subordinating particles which are morphemically independent clearly do precede finite clauses, but a more usual way to indicate subordination in this limited corpus is by bound suffixes on the verb. Such clauses, however, usually translate headless relatives and embedded questions in English, rather than true adverbial clauses of the "if, when, because" sort. And finally, in Chumash again, while there are a few examples of words analyzable as subordinate conjunctions (see Applegate 1972: 421 for examples), the more common way to express subordinate clauses is by nominalization of sentences. These nominalizations, however, generally translate headless relatives or embedded questions.

The next several generalizations concern the internal structure of the noun phrase:

G-10: In possessive constructions subject-final languages always present full noun phrase possessors after the head (the possessed) noun phrase.

In the Amerindian languages of this sample, the possessor noun phrase is not marked as genitive or introduced by a preposition, and the head noun carries a pronominal prefix or clitic which agrees in person and number with the possessor (something which is otherwise common in Amerindian languages). Otomi (56) is illustrative:

(56) noyá ʔaxwǎ
  his word God
  'God's word'
In the Malayo-Polynesian languages of this sample, heads do not normally agree with possessors; but there is usually some sort of particle between the head and the possessor, as in Malagasy (57):
(57) ny trano- nd- Rabe
  the house- Poss.- Rabe
  'Rabe's house'
(The /d/ in /nd/ above is not part of the possessor morpheme; it is inserted whenever an /n/ and an /r/ would otherwise occur together, in that order.)
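
The insertion rule just stated is simple enough to write out. The sketch below is only an orthographic approximation in Python: the function name `attach` is hypothetical, and the rule is applied to spelled forms rather than to phonemes.

```python
def attach(prefix: str, stem: str) -> str:
    """Join a prefix to a stem, inserting d where n would directly precede r."""
    if prefix.lower().endswith("n") and stem.lower().startswith("r"):
        return prefix + "d" + stem
    return prefix + stem


print(attach("n", "Rabe"))   # ndRabe, compare nd- Rabe in (57)
print(attach("an", "Rabe"))  # andRabe, compare an- dRabe in (46)
```

Stems that do not begin with r are joined without change.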

In Chumash the full noun phrase possessor may be placed before the head for emphasis, and in all the Amerindian languages in this sample pronominal possessors are represented (in the least-marked way) solely by the pronominal prefix or clitic. Thus his word in Otomi would be expressed merely by deleting ʔaxwǎ 'God' from (56).

G-11: In subject-final languages relative clauses always present the head noun to the left of the restricting clause.

(58) from Malagasy is illustrative:

(58) ny zazavavy (izay) manasa lamba
  the girl (that) wash clothes
  'the girl who is washing clothes'

G-12: Subject-final languages do not have relative pronouns.

By relative pronoun I understand some kind of pronominal element which marks the grammatical role (subject, object, etc.) of the position relativized. G-12 clearly holds for the Malayo-Polynesian languages of the sample. All these languages present a morphologically invariable particle which occurs (optionally or obligatorily depending on the language) between the head and the restricting clause, as Malagasy izay in (58). In Fijian and Gilbertese, personal pronouns may be retained directly in the position relativized if the position is constructed with a preposition or is a possessor noun phrase. In Batak only possessor noun phrases allow pronoun retention. Malagasy allows no pronoun retention.

The application of G-12 to the Amerindian languages in this sample is less clear however, in large part due to lack of data. In Baure and Chumash the head noun phrase of relative clauses is followed by the restricting clause, which is in nominalized form and could stand alone as a headless relative (the one that...). The nominalized verb carries an article and various subordination particles, but these particles do not appear to differ according to whether the head functions as a subject or object of the verb. Nor do they otherwise appear to have any pronominal function — that is, they are subordinators used in many contexts besides relative clauses and do not appear to have a referential function. -al- in Chumash is illustrative:

(59) ha- k- ʔuw-muʔ ha ka ha- k-al- aqs̀iyɨk
  Art.- my- food Art. Ptc. Art.- I-Sub.- like
  'my food that I like'

In Tzeltal and Otomi, on the other hand, interrogative pronouns sometimes appear as (part of) the relativizer word(s). Since locative interrogative words, for example, are distinct from those which question subjects and objects, one might (to extrapolate from the limited data) be able to distinguish, e.g., the house that I saw from the house where I live. The interrogative pronouns however are not distinct for subject and object, so to that extent at least the languages lack relative pronouns.

G-13a: All subject-final languages possess articles.

G-13b: With more than chance frequency subject-final languages have definite articles (distinct from the ordinary demonstrative adjectives).
As regards G-13b, Malagasy, Batak, Otomi, Tzeltal, and Baure have definite articles, while Gilbertese, Fijian, and Chumash do not. In Gilbertese, however, there is a singular article (definite or indefinite) and a plural indefinite article. In Fijian, common nouns carry an article, but it appears to give no semantic information, e.g., definiteness, number, or noun class, concerning the noun. And in Chumash almost all common nouns are constructed with an article, although again no semantic information seems to be indicated by it. Its presence does, however, distinguish nominalized verbs from non-nominalized ones.

Concerning other common elements within noun phrases I find no fully universal generalizations. The closest is:

G-14: With much greater than chance frequency numerical expressions precede the nouns they modify.

Only Malagasy is exceptional here. Numerical expressions behave by and large like descriptive adjectives and as such follow the noun (although there are one or two frozen expressions with numerals preceding nouns). In some of the other languages, notably Tzeltal and Gilbertese, the constructions of numerical expressions are complicated by the existence of numeral classifiers.

G-15: With much greater than chance frequency articles precede nouns.

Only Batak is exceptional here in having the article (which also functions as the third person inanimate personal pronoun) follow the noun.

As regards the positioning of descriptive adjectives only the weak generalization is possible that it is slightly more usual for such adjectives to follow nouns than to precede. However the normal order is adjective plus noun in Tzeltal and Otomi, and both orders, adjective before or after noun, are about equally common in Chumash.

And as regards demonstrative adjectives, I have no generalization at all. They precede nouns in Chumash, Otomi, and Baure. They follow nouns in Gilbertese, Fijian, and Batak. And they "frame" noun phrases in Malagasy and Tzeltal, as illustrated in (60) from Malagasy and (61) from Tzeltal:

(60) ity trano fotsy ity
  this house white this
  'this white house'
(61) men k'shk'al winik- i
  that fiery man- that
  'that fiery man'
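
The counts behind G-13b through G-15 and the demonstrative facts can be tallied directly from the statements above. The Python sketch below does so, and adds a hypothetical helper, `tail_probability`, that gives one rough reading of "greater than chance frequency". Its baseline of 0.5 and its treatment of the eight languages as independent trials are assumptions of this sketch, not claims made in the text; several of the languages are genetically related, so independence in particular is doubtful.

```python
from collections import Counter
from math import comb

SAMPLE = ["Malagasy", "Batak", "Fijian", "Gilbertese",
          "Tzeltal", "Otomi", "Chumash", "Baure"]

# Facts as stated in the discussion of G-13b, G-14, and G-15.
HAS_DEFINITE_ARTICLE = {"Malagasy", "Batak", "Otomi", "Tzeltal", "Baure"}
NUMERAL_FOLLOWS_NOUN = {"Malagasy"}
ARTICLE_FOLLOWS_NOUN = {"Batak"}
DEMONSTRATIVE = {
    "Chumash": "precedes", "Otomi": "precedes", "Baure": "precedes",
    "Gilbertese": "follows", "Fijian": "follows", "Batak": "follows",
    "Malagasy": "frames", "Tzeltal": "frames",
}


def tail_probability(n: int, k: int, p: float = 0.5) -> float:
    """Probability of at least k successes in n independent trials.

    The chance level p and the independence of trials are assumptions.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n = len(SAMPLE)
numerals_precede = n - len(NUMERAL_FOLLOWS_NOUN)
articles_precede = n - len(ARTICLE_FOLLOWS_NOUN)

print(f"definite articles: {len(HAS_DEFINITE_ARTICLE)} of {n}")
print(f"numerals precede nouns: {numerals_precede} of {n}")
print(f"articles precede nouns: {articles_precede} of {n}")
print(f"demonstratives: {dict(Counter(DEMONSTRATIVE.values()))}")
print(f"tail probability for 7 of 8: {tail_probability(n, 7):.3f}")
```

Under those assumptions, seven languages of eight corresponds to a tail probability of 9/256, or about 0.035. A different baseline, for instance the worldwide frequency of the property in question, would give a different figure.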

The next generalizations concern the internal structure of verbs and verb phrases.

G-16: Negative elements precede the verb in subject-final languages.

G-16 holds for all the languages in the sample. (55) above from Malagasy is illustrative.

G-17: A causative element precedes the root of the causativized verb in subject-final languages.

In all the languages in the sample except Tzeltal and Otomi, causative constructions are formed by prefixing the verb root with a causative morpheme. (62) below from Malagasy is illustrative.

(62) Mamp- ianatra ny ankizy Rabe.
  cause- study the child Rabe
  'Rabe teaches the children.'
In Tzeltal the only examples I have of causative constructions are constructed with the verb 'give' followed by a nominalization of the causativized verb:
(63) Ya yák' ta mánei.
  Pres. he-give Prep. buying
  'He causes (him) to buy (it).'
Similarly in Otomi an overt verb of causation precedes the causativized verb.

G-18: All subject-final languages have passive forms of verbs (ones in which the object of the active verb functions as the subject).

Sentences (4) and (7) above from Malagasy illustrate active-passive pairs.

G-19: 'Passive' is generally marked in the verbal morphology in subject-final languages.

The only clear exception here is Tzeltal, where passives are constructed from the active verb receive followed by a nominalized form of the "passivized" verb (these nominalized forms occur in a great variety of complex verbal constructions, e.g., the causatives mentioned above).

(64) La y- ich' 'utel yu'un s- tat te Ziak-e.
  Past he- receive bawling-out because his- father Art. Ziak-Art.
  'Ziak was bawled out by his father.'
In Baure 'Passive' appears to be marked on the verb, but more examples illustrating the full verbal morphology are needed to make this conclusion certain.

G-20: Subject-final languages generally do not have overt copulas.

By copula I mean a morphophonemically independent element which has the characteristic properties of stative verbs in its language and which functions as the main verb in sentences with nominal predicates, like John is a thief. Malagasy is typical here:

(65) Mpampianatra Rajaona.
  teacher John
  'John is a teacher.'

Otomi may be a counterexample to G-20. In sentences like (66) below there is a morphophonemically independent element which codes categories of person and, apparently (no full paradigms are given), of tense.

(66) Nugí ma nǎna mrá měngu ʔbεtʔí
  I my mother 3Past native Beti
  'As for me, my mother was a native of Beti.'
It is not clear from Hess (1968) however that mrá in (66) has otherwise the morphology of verbs. Rather it looks as though the apparent copula shows more affinity with nominal constructions than verbal ones, although more complete data would be needed to justify this suggestion. Furthermore, mrá is not present when the main predicate would be adjectival, as in Mary is pretty.

Otherwise both Malagasy and Chumash allow the subject to be separated from a nominal predicate by a particle, but this particle is invariable in form and otherwise has none of the properties of verbs. (41) above from Chumash is illustrative, and the Malagasy construction, (67) below, is clearly a form of topicalization (see section 6.1 for more discussion).

(67) Rabe dia mpampianatra.
  Rabe Ptc. teacher
  'Rabe is a teacher.'

Finally, other possible word order correlates in verb phrases based on Greenberg, such as the relative position of modals, adverbs, and the verb, have proven difficult to verify, since for many of the languages in the sample it appears doubtful that there are distinct grammatical categories of modal and adverb.

Nonetheless, elements which translate modal concepts like obligation, necessity, desire, intention, etc., usually precede the verb; they follow the verb, however, in Chumash and, where they occur as bound affixes, in Baure (where they are independent morphemes in Baure, they occur preverbally).

Manner adverbs, however, are even harder to establish as a basic grammatical category in the languages of the sample. They appear as a distinct category in Malagasy and Fijian, where they follow the verb. They seem to follow the verb in Otomi, but more examples are needed. Certain adverbial-like particles appear suffixal to verbs in Baure. In Tzeltal and Chumash, adverbs appear to be merely verbs and occur preverbally, as in (68) from Chumash.

(68) S- towic̀ ha s- wala-tepet
  it- go-fast Ptc. it- turn
  'It turns fast.'

The few available examples of manner adverbials in Gilbertese appear to occur preverbally, in the same position as negation and other sentential adverbs like truly.

6.3. Explaining the Scarcity of Subject-Final Languages

Subject-final languages, as indicated here, comprise a small and erratically distributed percentage of verb-initial languages. Verb-initial languages are themselves a clear minority among the world's languages, probably not constituting more than 10 percent of the total. Verb-final languages, with various degrees of word order freedom, are the most widely distributed of word order types; SVO languages are a reasonably close second. Together these two types include about 90 percent of the world's languages. Given this distribution, it is natural to wonder why so few languages avail themselves of the syntactic possibility of presenting the subject in final position.

Here I suggest that at least part of the explanation is cognitive, that is, it concerns how humans understand the meanings of sentences, and part is pragmatic, that is, it concerns the purposes for which humans use language. (I do not claim however that this constitutes the whole explanation.) Specifically, I argue first that syntactically "simple" or basic sentences present certain cognitive and pragmatic difficulties for users of subject-final languages. Second, I argue that several major syntactic processes for forming complex (less basic) sentences aggravate the basic difficulties and in addition introduce further cognitive difficulties.

6.3.1. Basic Difficulties

Many might consider that the principle that "old information comes first" is sufficient to explain the preponderance of subject-initial languages, given that most commonly the subject is old information in the sense that it usually refers to some person or object already known to the speaker and addressee(s) at the time of speaking. However, I find this principle, as it stands, insufficient. Specifically, why should old information come first? After all, usually when we communicate we want to say something not specifically known to the addressee. Why not begin then with the novel information and trail off with what is already known? Furthermore, I find no clear sense in which the information in the predicate is new and that in the subject old. What is new in a simple sentence like Ché lives! is not the information in either Ché or lives but rather that the referent of the subject has the property, or performed the action, expressed by the predicate. To understand what is new then we need both the subject and the predicate, so why not put randomly either one first? Or at least why don't languages vary randomly with regard to the choice they make? Below I propose a Relevance Principle which in my view covers perhaps some of the same ground as was intended for the old-information principle but which I think has somewhat greater explanatory force.

The Relevance Principle: The reference of the subject (phrase) determines, in part, the relevance of what is said, regardless of what it is, to the addressee.
In other words, if all we know is what the subject phrase of a sentence is, we have some expectation whether what will be said is of any importance to us, depending on whether the subject phrase refers to an individual (or object) which is of concern to us or not.

Suppose, for example, we are discussing the success of a political meeting and someone claims:

(69) John left the meeting early.
Immediately upon hearing the subject phrase John we have some idea of the relevance, and importance, of whatever is said to our interest. If John is the candidate we are supporting, we are concerned to understand what is said about him; whereas if John is, say, merely the man who was supposed to set up the tables, we might be less concerned to understand what was said about him. In a subject-final language on the other hand we must attend to most of the sentence before we have much idea as to whether the new information is of much concern to us.

Subjects then are old information in the sense mentioned above, but more important they are the topics (of the least-marked sentences in a language) and thus specify what it is that the speaker is talking about. And a major advantage of mentioning the topic first is that the hearer immediately has at least a serious idea of how what is said (whatever it is) will concern him or her. We note further that:

1. Probably in all languages (Dyirbal may be an exception; see Keenan 1976b for discussion) there are many more verbs which take human subjects than any other sort of noun. Thus relatively few verbs in any language require that their subjects refer to times, locations, or even instruments of actions. Now if subjects are, in part, those noun phrases which serve to identify the interest of the addressee in the discussion it is natural (given that we are in general more interested in humans than in other things) that subjects should be dominantly human. (See Givón 1976a for discussion of the relation between topichood and humanness.)

2. Overt (marked) topicalization operations in most languages move noun phrases to the front of the sentence. Thus there may well be instances in which we are primarily interested in a participant in an action who would not normally have been identified by the subject phrase. In such instances languages generally have the means of marking that constituent in some way, very commonly by fronting it (perhaps introducing other changes in the sentence). And the same reason for saying that subjects in unmarked sentences come first usually also explains why topics in general ought to come first — they identify (at least in part) the interest that the addressee may have in understanding what is said. To put it differently, subjects usually come first because they are the topics of the least-marked sentences in a language. And topics in general come first because they determine the relevance of what is said for the addressee.

It should be noted perhaps that relevance for the addressee need not be restricted to mean relevance for the addressee's interests considered independently of the linguistic discourse in which the sentence is uttered. It may well be that identifying the reference of the subject or topic (phrase) functions to allow the addressee to determine the relevance of what is going to be said to what has already been said. Thus as Givón (1976a) has argued, topics have a discourse linking function.

A Principle of Semantic Interpretation: The meaning of the predicate phrase often depends on the reference of the subject.
If this principle is correct, it is cognitively advantageous to present the subject before the predicate (= verb + its closely associated objects), since otherwise the addressee will have to "store" the predicate phrase without fully understanding it until the subject appears.

One might suppose that the meaning of a predicate phrase could be described, say as an event or activity, or a state, independently of who or what was asserted to be in the state or to engage in the activity. But this, it seems to me, is not true. Consider for example a predicate like be strong. As a first approximation at least it can be said that if the subject is animate, as in John is strong or Weightlifters are strong, we indicate that the referent of the subject can exert a lot of force. But if the subject is inanimate, as in This chain is strong or The table is strong, we indicate that the referent of the subject can withstand a lot of force. Similarly the activity denoted by drop on the bed is rather different according as the subject is John, as in John dropped on the bed, or pieces of the ceiling, in Pieces of the ceiling dropped on the bed. And consider the different activities all referred to by run in sentences like The children are running, The fish are running, The buses are running today, This watch is running, The colors are running, The water is running, The stockings are running, My nose is running, etc. In each the exact activity which we interpret running as referring to depends on the nature of the referent of the subject. This fact about language probably reflects a more general fact about our ontology: discrete objects may exist in some sense independently of the properties or activities we ascribe to them on particular occasions, but the existence of properties or activities in the absence of objects which have these properties or engage in the activities is dubious.

It seems to me that, although the point would have to be further researched, in general, considering the entire lexicon of a language, simple nouns are relatively fixed in meaning or reference (though there are certainly genuine cases of ambiguity) and the meaning of verbs and adjectives depends, somewhat subtly perhaps, on what object they are predicated of. Accepting this claim then it seems that the cognitive advantage of a subject-predicate language over a predicate-subject language follows.

Furthermore, it may well be that subject-final languages present, even in simple sentences, a more acute version of the dependency problem. We have already seen that, with the exception of Malagasy and Batak, the subject-final languages present two pronominal affixes whose form varies with the noun class of the subject and either the direct or indirect object; see (34) above from Chumash, for example. Further, the affixes on the verb are clearly pronominal in function, that is, they are referential. Thus if, for example, the agent and patient of an action have already been mentioned in the discourse and we want to assert that He hit her, we would not normally (unless some emphasis or contrast were implied) use the independent pronouns in the full noun phrase positions. Rather we would just use what appears to be the simple verb as it occurs in the sentence with full noun phrases. The pronominal affixes on the verb are sufficient to reference the participants of the action.

But this means, from the point of view of the speaker, that it is necessary to anticipate to a significant extent the subject and object phrases in a sentence when enunciating the verb, in order to know which pronominal affixes to use. And the hearer on the other hand will interpret these affixes as referential, but in many instances their reference won't be determinable until the full noun phrase subjects and objects have been enunciated.

The anticipation and interpretation problems would not appear to arise in Malagasy and Batak, because they lack the pronominal affixes on the verb. But an analogous problem, at least from the point of view of the speaker, does arise.

We have seen that in these two languages the use of various forms of passive verbs is quite common, and in selected instances is clearly the pragmatically least marked option. But this means that at the time of enunciating the verb of the sentence the speaker must have anticipated the agent and patient phrases to know which participant will be placed in the subject slot. In a subject-initial language on the other hand we can mention the subject/topic without yet having decided exactly what we want to say about it. It may be then that a cognitive dependency between verb and subject exists in subject-final languages but not in subject-initial ones. My general suggestion here then is that subject-final languages may be cognitively somewhat more difficult than subject-initial languages, since they require a longer "look-ahead" on the part of the speaker and a longer "unprocessed storage" on the part of the hearer. (A partial counterexample here is afforded by SOV languages which are ergative. Here the case-marking on the subject must anticipate the transitivity of the verb, though nothing more specific about its meaning. I note that nominative-accusative case-marking systems are more widely distributed than ergative-absolutive ones.) What remains open in this analysis, however, is whether it is merely an accident of our small sample of languages that verbs in all of them generally code properties of the subject to a greater extent than verbs in subject-initial languages.
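
The notion of "unprocessed storage" can be given a crude illustration by counting how many words a hearer receives before the subject begins under different constituent orders. The Python sketch below uses the constituents of (69), split by hand and with the adverb early left out. The word count is merely a stand-in chosen for this sketch, not a measure proposed in the text, and the names `CONSTITUENTS` and `words_before_subject` are hypothetical.

```python
# Constituents of (69), "John left the meeting early", without the adverb.
CONSTITUENTS = {
    "S": ["John"],
    "V": ["left"],
    "O": ["the", "meeting"],
}


def words_before_subject(order: str) -> int:
    """Count the words heard before the subject begins, for an order like 'VOS'."""
    count = 0
    for label in order:
        if label == "S":
            return count
        count += len(CONSTITUENTS[label])
    raise ValueError("order must contain S")


for order in ("SVO", "SOV", "VSO", "VOS"):
    print(order, words_before_subject(order))
```

For this sentence the subject-initial orders give 0, VSO gives 1, and VOS gives 3. The gap widens as the object phrase grows longer, which is the pattern the derived difficulties discussed in section 6.3.2 turn on.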

6.3.2. Derived Difficulties

Many of the major processes which form complex structures from simpler ones, and which most if not all languages have to varying degrees, are ones which aggravate the basic difficulties discussed above. Thus they either force the subject farther from the beginning of the sentence ("farther" here relative to the subject's distance from the beginning of a simple or basic sentence) or they increase the amount of material whose meaning or reference cannot be completely evaluated until the subject phrase is processed. Below I discuss several such processes.

First, however, I note that the processes which aggravate the basic difficulties appear to trigger an additional difficulty, which I shall refer to as cognitive dissonance. (To the best of my recollection the term cognitive dissonance was first used in psychology, by Festinger.)

The Principle of Cognitive Dissonance: A language is cognitively dissonant to the extent that principles of semantic interpretation which apply to basic sentences of the language must be modified to yield the correct interpretation for complex sentences.
Thus it will appear in many of the examples to follow that the syntax of complex sentences in subject-final languages differs in certain ways from that of simpler ones, ways which avoid the aggravation of the basic difficulties, but which mean that the assignment of meaning to complex sentences by listeners is not done in analogy with the way simple sentences are interpreted. Thus, somewhat indirectly perhaps, subject-finality is responsible for a more complex set of interpretation "rules" than is needed for subject-initial languages.

6.3.3. Some Nonbasic Sentence Types in Subject-Final Languages

6.3.3.1. Reciprocals

Consider the interpretation of sentences like (70) from English:

(70) The children were hitting each other.
We know from the special form of the object pronoun each other that the recipients of the action are the same as the actors, and immediately upon hearing the object pronoun we know just what set of people this is, since it is necessarily the same as the set referenced by the subject, and the subject has already been mentioned.

In a subject-final language, however, we might expect (70) to be rendered roughly as (71):

(71) Were hitting each other the children.
And indeed this is basically the way reciprocals are expressed in Tzeltal:
(72) La s-mah s-ba- ik te winike- tik.
  Past 3-hit 3-self- Pl. the man- Pl.
  'The men were hitting each other.'
In this type of reciprocal construction, however, we cannot fully interpret (assign a meaning or reference to) the reciprocal pronoun at the moment of hearing. Rather, we must wait until the subject phrase has been enunciated, and only then do we know which individuals were being hit. Reciprocal Formation then (however it is formulated) extends the amount of material whose interpretation must await the subject phrase.

Interestingly however, of the subject-final languages in my sample, only Tzeltal forms reciprocals in just this way. All the other languages in the sample, except possibly Otomi, for which relevant data are lacking, form reciprocals by affixing the verb. Malagasy below is typical:

(73) Mif- anoratra taratasy Rasoa sy Ravelo.
  Recip.- write letter Rasoa and Ravelo
  'Rasoa and Ravelo write each other letters.'

Forming reciprocals by verbal affixing seems to reduce somewhat the distance between the subject and the beginning of the sentence. Moreover, it does not introduce a noun phrase whose interpretation must await that of another noun phrase. It still aggravates the dependency of the predicate's meaning upon the subject, however: in Malagasy, for example, the verb mifanoratra 'to write each other' contains more information than does the nonreciprocal form manoratra 'to write', but that information is not fully determinate, since we still do not know who is receiving letters until the subject phrase appears.

6.3.3.2. Reflexives

Reflexives, as illustrated in (74a) for SVO languages and in (74b) for VOS languages, pose basically the same interpretation problem as do reciprocals.

(74) a. John hit himself.
  b. Hit himself John.
Here, however, four of the eight languages in my sample present an independent pronoun in the object position whose interpretation cannot be made until the subject is enunciated. Malagasy and Tzeltal have specifically reflexive forms of pronouns. (75) from Malagasy is illustrative. The reflexive pronoun in Tzeltal is just the nonplural form of the reciprocal pronoun in (72).
(75) Namono tena Rakoto.
  killed self Rakoto
  'Rakoto killed himself.'
Fijian and Gilbertese lack specifically reflexive pronouns altogether, but use ordinary personal pronouns in the object position, illustrated by (76) from Fijian:
(76) A mokut- i koya o Bale.
  Past hit- Obj. him Topic Bale
  'Bale hit him/himself.'

The interpretation of koya 'him' in (76) is not so dependent on the subject as is the reflexive pronoun in Malagasy or Tzeltal. It might be coreferential with the subject, or it might refer to some other third party already prominent in the context of speech. It is understood like the pronoun him in John thinks that Mary adores him, where him may refer to someone other than John.

Thus, to some extent, Fijian and Gilbertese avoid the dependency of reference problem by sacrificing logical explicitness (in the sense that they cannot unequivocally distinguish in surface between John hit him and John hit himself; see Keenan, forthcoming, for more discussion). Further, the dependency problem is not altogether avoided, since the object pronoun in (76) could be uttered in a context in which no particular individual had been mentioned, as in "What happened at school yesterday?"

In addition, the Malagasy and Fijian types of reflexive construction may create a small amount of cognitive dissonance, as follows: If the subject phrase is an independent pronoun, as in (77a,b) from Malagasy and Fijian respectively, the subject pronoun must be interpreted as different in reference from the object noun phrase.

(77) a.
Namono an- dRabe izy.
killed Acc.- Rabe he
'He killed Rabe.'
*'Rabe killed himself.'
  b.
a mokut- i Bale o koya.
Past kill- Obj. Bale Topic he
'He killed Bale.'
*'Bale killed himself.'
Thus we have here a simple instance of a pronoun following a referential noun phrase of the right sort but with which it cannot be interpreted as coreferential. But elsewhere in the language, in fact almost everywhere, a pronoun occurring after a full noun phrase of the right category can be interpreted as coreferential with it. Thus it appears as though one small principle of semantic interpretation which applies to fairly simple sentences does not apply to more complex ones.

As regards reflexives in the other subject-final languages, Baure and Chumash reflexivize by affixing verbs, much as was done in the instance of reciprocals. Otomi may also reflexivize in this way, but I lack sufficient data to be certain. Batak, however (together with Tagalog and other Philippine languages), provides one of the very rare exceptions to the generalization that, in simple sentences, subjects control reflexivization and nonsubjects are presented as reflexive pronouns or omitted. The most usual way to overtly reflexivize in Batak is to present the sentence as passive in form with the derived subject being the reflexive pronoun.

(78) di- pukkul si Bissar diri-na.
  Pass.- hit Art. Bissar self-his
  'Bissar struck himself.'
(Lit., 'Was struck by Bissar he-self.')
We note that dirina 'himself' is fully pronominal, in that the form of the reflexive varies in person and number with its antecedent. (Dirina is a slightly collapsed form of diri-ni-ia 'self-of-he'.)

Batak then appears to avoid the problem of having to suspend the interpretation of object phrases until the subject phrase is reached, but at the expense of violating an extremely general property of subjecthood — namely, being an expression whose reference is determinable independently of other expressions within the same simple sentence (see Keenan 1976b for more discussion).

Batak does, however, present at least two other ways of expressing reflexives. In some instances at least, it can have recourse to an SVO order, as in (79):

(79) Si Bissar makkaholongi diri-na.
  Art. Bissar love self-his
  'Bissar loves himself.'
To the extent that the subject is not being specifically topicalized the option is cognitively dissonant, since usually fronted noun phrases are understood to be more topical than those in the least-marked sentences.

A further option for presenting reflexive sentences in Batak, as well as in Malagasy, is to use an intransitive though not specifically reflexive form of the verb. Thus both Batak and Malagasy form verbs by affixing (prefixing is dominant) roots which most commonly do not otherwise occur as words in the language. If a given root accepts both transitive and intransitive affixes, as is fairly common, then, when semantically appropriate, the intransitive form may be interpreted reflexively. Thus in Batak the transitive verb 'to wash' is formed from a root plus the prefix maN to yield mandidi; the intransitive form is maridi, and it is the intransitive form which my informant used to translate the English reflexive:

(80) Maridi si Rotua.
  wash Art. Rotua
  'Rotua washes herself.'

Analogous claims hold here for Malagasy. In fact, the total number of verbs in Malagasy which would construct a reflexive with the reflexive pronoun is rather limited. To judge from claims made in early, nineteenth-century, grammars of Malagasy, the use of tena 'body, self' is an innovation as a reflexive pronoun. And certainly in many instances where we expect a reflexive pronoun in English, an intransitive verb form such as misasa 'to wash (intransitive)' would be used rather than the transitive form (manasa 'to wash') plus the reflexive pronoun.

We are left then with the general impression that frequently in subject-final languages specifically reflexive constructions are not well-developed.

6.3.3.3. Heavy Objects

Quite generally the most common position for full sentences to be embedded as noun phrases is the direct object position. Thus, for verbs of saying and thinking, the sentence expressing what is thought or said occurs as an object in languages like English:

(81) John said that Fred left the party early.
In subject-final languages we might expect that (81) would be rendered as (82):
(82) Said that left the party early Fred John.
Such a construction would obviously aggravate the basic problems of subject-finality, however, since the subject is now far removed from sentence-initial position; and, since now many verbs and nouns precede the subject, there is a greater possibility than in simple sentences for elements to depend for their meaning or reference on the subject.

In fact no subject-final language seems to express sentential complements of saying with the word order in (82). Most commonly a subject-final language simply uses a VSO order, illustrated in (83) from Malagasy.

(83) Nihevitra Rabe fa namaky boky ny mpianatra.
  thought Rabe that read book the student
  'Rabe thought that the students were reading books.'
The use of the VSO option is common in Malagasy, Fijian, Chumash, and, to judge from a limited corpus, Otomi. It occurs in elicitation in Tzeltal and occurs in texts in Baure (where however the sentential object is always a direct quotation). The relevant data on Gilbertese are lacking.

The VSO order is, however, cognitively dissonant, since in simple sentences the noun phrase immediately following the verb is normally interpreted as the object, not the subject, of the verb. The use of an SVO order here would also be dissonant to the extent that the subject was not being specifically topicalized. Chumash does, infrequently, present an SVO order here, but in general this is not a common option in the sample.

Another option, which is not cognitively dissonant, though utilized significantly only in Malagasy and Batak, is to present the verb of saying as passive. (84) from Batak is illustrative.

(84) Ndang di- boto ibana na mangisap sandu hamu.
  not Pass.- know he that smoke opium we
  'That we smoke opium isn't known by him.'
It appears in fact that verbs of saying and thinking in Batak always appear in a passive form when construed with a sentence complement.

6.3.3.4. Sentential Objects with Like Subjects

In English, and in many other languages, it is often possible to omit, in one way or another, the subject of a sentential object if it is understood to be the same as the subject of the main verb. Thus we might posit that the underlying syntactic structure of (85a) below is (85b).

(85) a. John wants to close the shop.
  b. John wants [John close the shop]
Since the understood subject of 'close' is the same as that of 'want' we may omit it and obtain a derived sentence of the usual subject-predicate form, where the predicate contains in effect two verbs, one of which (in English) is nonfinite, that is, it does not have the same form it would have if it were the main verb of a simple sentence.

We might expect then that, to the extent that subject-final languages are not dissonant, sentences like (85a) would be presented as:

(86) Wants to close the shop John.
And (86) in fact is a commonly occurring order in a majority of the languages in the sample: Malagasy, Batak, Otomi, Tzeltal, and Chumash. I lack the data on Baure and Gilbertese; Fijian does not in general delete coreferential noun phrases in such contexts but merely pronominalizes them. In all the languages in the sample the embedded verb remains finite. (87) from Batak is illustrative:
(87) Nunga mulai manussi abit si Rotua.
  already begin wash clothes Art. Rotua
  'Rotua has already begun to wash clothes.'
This order obviously aggravates the basic problem, however, especially since, at least in principle, arbitrarily many "higher" verbs may be added to the beginning of the sentence. (88) below from Malagasy illustrates the case of two such higher verbs.
(88) Te- hanaiky hanasa ny zaza Rasoa.
  want- agree-Fut. wash-Fut. the child Rasoa
  'Rasoa wants to agree to wash the child.'

An option which avoids increasing the distance of the subject from the beginning of the sentence is to place the subject immediately after the first, or main, verb. Thus (87) above from Batak could also be rendered as:

(87') Nunga mulai si Rotua manussi abit.
  already begin Art. Rotua wash clothes
  'Rotua has already begun to wash clothes.'
This word order option is also used in Malagasy and Tzeltal. But, as with the previous examples concerning sentence complements, this option is clearly anomalous, since for transitive verbs the noun phrase immediately following the verb is interpreted as the direct object for the least-marked sentences in the language.

6.3.3.5. Conjunction Reduction

Given a sentence formed by the coordinate conjunction of two sentences with the same subject it is possible in most if not all languages to eliminate one of the subject noun phrases. In subject-initial languages the predicate phrase remaining after the deletion of the subject can be naturally conjoined with the predicate phrase of the other sentence to yield a derived sentence with a single subject and a compound predicate phrase, illustrated in (89a,b) below from English:

(89) a. John came early and John left late.
  b. John [came early and left late].
In subject-final languages however, if the result of the deletion is to yield a compound predicate phrase with a single subject, it would have to be the subject of the first sentence which is deleted:
(90) a. Came early John and left late John.
  b. [Came early and left late] John.
Clearly such a pattern of conjunction pushes the derived subject far indeed from sentence-initial position. Surprisingly, however, this pattern is an option in many of our languages: Malagasy, Fijian, Batak, Chumash, and Tzeltal, as in (91a,b) from Malagasy:
(91) a.
Misotro taoka Rabe ary mihinam- bary Rabe.
drink alcohol Rabe and eat- rice Rabe
'Rabe is drinking alcohol and Rabe is eating rice.'
  b.
[Misotro taoka sy mihinam- bary] Rabe.
drink alcohol and eat- rice Rabe
'Rabe is drinking alcohol and eating rice.'
(Sy above conjoins only phrases, never sentences, whereas ary conjoins sentences but not, in general, phrases; see Keenan 1976b for more discussion.)

In discourse, however, it is possible in all these languages, and to my knowledge in all languages, to omit the subject of a sentence when it is the same as the subject of the preceding sentence (though it may not always be possible to conjoin such sentences with an overt coordinate conjunction). Thus in Malagasy:

(92) Misotro taoka Rabe ary mihinam- bary.
  drink alcohol Rabe and eat- rice
  'Rabe is drinking alcohol and (he) is eating rice.'
Such structures, however, with an overt coordinate conjunction, are arguably dissonant to some extent. For normally if X and Y are joined by a coordinate conjunction we expect that X and Y are of the same grammatical category. But (92) exemplifies a sentence conjoined with a mere predicate phrase. Further, the only noun phrase which could be understood as the subject of the entire construction is not in the normal position, sentence-final, for subjects. And in fact in Malagasy (I do not have the pertinent data for other languages) that noun phrase does not behave like a subject. Thus many syntactic operations in Malagasy are restricted in their operation to subjects (see Keenan 1976a for further discussion). For example, in Malagasy, clefting, which moves a noun phrase to the front and inserts the particle no, operates on subjects but not on objects. And (93) can be formed from (91b), indicating that Rabe is the derived subject of (91b):
(93) Rabe no misotro taoka sy mihinam- bary.
  Rabe Cleft. drink alcohol and eat- rice
  'It is Rabe who is drinking alcohol and eating rice.'
But from (92) above we cannot cleft on Rabe, indicating that in fact Rabe does not function as the subject of (92).
(94) *Rabe no misotro taoka ary mihinam- bary
  Rabe Cleft. drink alcohol and eat- rice
  'It is Rabe who is drinking alcohol and eating rice.'

6.3.4. Ambiguity

Finally, I consider a general problem which, in principle, all verb-terminal (verb-initial or verb-final) languages face. If all major noun phrases are on the same side of the verb in unmarked sentences, then any operation which moves or deletes a noun phrase may leave the resulting structure ambiguous according to whether the subject or the object was moved or deleted. Consider, for example (95a,b) below from Fijian.

(95) a.
A raica na tagane na gone yalewa.
Past see Art. man Art. child woman
'The girl saw the man.'
  b.
Na gone yalewa ka a raica na tagane
Art. child woman that Past see Art. man
'the girl who saw the man' or 'the girl whom the man saw'
Given that the relative clause construction (95b) has been formed by moving the subject (or object) to the front, the relative order of noun phrases after the verb can no longer be used to distinguish subject and object. Thus had we relativized on girl from (96) below we would have equally obtained (95b) above:
(96) A raica na gone yalewa na tagane.
  Past see Art. child woman Art. man
  'The man saw the girl.'
We expect then that relative clause constructions like (95b) in Fijian are ambiguous, and in fact they are, as our translation indicates. So also are the corresponding noun phrase questions, which are formed by fronting the question word.
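The way the ambiguity arises can be sketched in a few lines of Python. This is only an illustrative model, not an analysis of Fijian: the `Clause` type and the `relativize` function are hypothetical names introduced here, relativization is assumed to do nothing more than front the head noun phrase and leave a gap, and the strings are simply the lowercase words of (95a), (95b), and (96).

```python
from typing import NamedTuple


class Clause(NamedTuple):
    """A simple verb-initial clause in VOS order, with no case-marking."""
    verb: str   # tense marker plus verb
    obj: str    # object noun phrase (immediately postverbal)
    subj: str   # subject noun phrase (sentence-final)

    def surface(self) -> str:
        return f"{self.verb} {self.obj} {self.subj}"


def relativize(clause: Clause, role: str, rel: str = "ka") -> str:
    """Front the noun phrase bearing `role`, leaving a gap and no pronoun."""
    if role == "subj":
        head, remaining = clause.subj, clause.obj
    elif role == "obj":
        head, remaining = clause.obj, clause.subj
    else:
        raise ValueError("role must be 'subj' or 'obj'")
    # Only one noun phrase is left after the verb, so its position
    # no longer tells us whether it is the subject or the object.
    return f"{head} {rel} {clause.verb} {remaining}"


s95a = Clause(verb="a raica", obj="na tagane", subj="na gone yalewa")
s96 = Clause(verb="a raica", obj="na gone yalewa", subj="na tagane")

from_subject = relativize(s95a, "subj")  # 'the girl who saw the man'
from_object = relativize(s96, "obj")     # 'the girl whom the man saw'

print(from_subject)
print(from_object)
print(from_subject == from_object)
```

Under these assumptions the two derivations converge on the same string, which is the sense in which (95b) is ambiguous.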

There are, however, several ways a language may be constructed so that such ambiguities do not arise. If, for example, subject and object have distinct case-marking, then moving one of them will not create an ambiguity, since the grammatical role (subject, object) of the one that is left will be signaled by its case-marker and hence the noun phrase that was moved (or deleted) will be the one that is missing (from the argument position of the verb from which the movement took place). And because case-marking is prevalent in verb-final languages, such languages do not in general create ambiguities by moving or deleting noun phrases. Verb-initial languages however are generally not endowed with rich case-marking systems; and as we have seen, subject-final languages in particular do not in general distinguish subject and object by case-marking.

A second way out of the problem is to leave a pronominal "trace" in the position from which a full noun phrase was "moved" or "deleted." Thus if (95b) above were rendered literally as the girl that saw her the man, we would know by the relative position of the pronoun and full noun phrase that girl was the object of the verb, not the subject. And in general, verb-initial languages use pronoun-retaining strategies much more than do verb-final languages in relative clause formation (though rarely in question formation; see Keenan and Comrie 1977 for discussion of the former point). And Gilbertese does in fact retain object pronouns under relativization as illustrated in (97a,b,c), thus avoiding the ambiguity in (95b).

(97) a.
E ore-a te mane te aine.
3Sg. hit-3Sg. Art. man Art. woman
'The woman hit the man.'
  b.
Te aine are ore-a te mane
Art. woman that hit-3Sg. Art. man
'the woman who hit the man'
  c.
Te mane are oro-ia te aine
Art. man that hit-him Art. woman
'the man that the woman hit'
However it is not common in this sample for subject or object pronouns to be retained, although noun phrases that would be governed by prepositions more commonly are represented by pronouns when relativized, as particularly in Fijian.
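The disambiguating effect of object pronoun retention can be sketched schematically as follows. The function name `relativize_with_object_trace` is hypothetical, the English words stand in for the literal rendering given above (the girl that saw her the man), and the sketch assumes, as in (97), that only object positions retain a pronoun while subject positions leave a gap.

```python
def relativize_with_object_trace(
    verb: str,
    obj: str,
    subj: str,
    role: str,
    rel: str = "that",
    pronoun: str = "her",
) -> str:
    """Relativize from a VOS clause, retaining a pronoun for objects only."""
    if role == "subj":
        # Subject relativized: gap in final position, object stays postverbal.
        return f"{subj} {rel} {verb} {obj}"
    if role == "obj":
        # Object relativized: a pronoun holds the postverbal object slot,
        # so the remaining full noun phrase must be the subject.
        return f"{obj} {rel} {verb} {pronoun} {subj}"
    raise ValueError("role must be 'subj' or 'obj'")


# 'the girl' as subject of: saw the-man the-girl
girl_as_subject = relativize_with_object_trace("saw", "the man", "the girl", "subj")
# 'the girl' as object of: saw the-girl the-man
girl_as_object = relativize_with_object_trace("saw", "the girl", "the man", "obj")

print(girl_as_subject)
print(girl_as_object)
print(girl_as_subject != girl_as_object)
```

With the pronoun retained, the two relative clauses are no longer string-identical, so the hearer can recover the grammatical role of the head noun.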

A final way of avoiding the ambiguity is simply to restrict the positions which can be moved or deleted. This is particularly common in many languages as regards deletion. Thus in many contexts, many languages allow only subjects to be deleted when coreferential with some other noun phrase. Many fewer languages restrict the movement of noun phrases in a similar way. Both Malagasy and Batak however do. Thus in both of those languages direct objects cannot be directly relativized; only subjects can, as (98a,b) from Malagasy illustrate:

(98) a.
ny zazavavy izay manasa ny lamba
the girl that wash the clothes
'the girl who is washing the clothes'
  b.
*ny lamba izay manasa ny zazavavy
the clothes that wash the girl
'the clothes that the girl is washing'
(98b) could only mean the clothes that are washing the girl in Malagasy. If it is desired to relativize on a patient of an action, the sentence from which relativization takes place must be passivized so that the patient is a derived subject, and then we can say the clothes that were washed by the girl.

On the basis of the data available to me, then, subject-final languages are only partly successful in avoiding the ambiguity problem. Fijian and Tzeltal tolerate the massive ambiguity. Gilbertese (at least in certain contexts — I lack extensive data) avoids the ambiguity by pronoun retention. Malagasy and Batak avoid the problem by restricting the positions which can be moved. I lack sufficient data on Chumash, Otomi, and Baure to know whether they tolerate the ambiguity or not, although nothing in the data suggests that the positions from which relativization can take place are restricted. And since there is no case-marking and, in the few available examples, pronouns are not retained in positions relativized, it would appear that the ambiguity is tolerated.

In conclusion then, it seems that subject-finality poses both cognitive and pragmatic problems for understanding basic sentences, and many quite general means of forming complex structures from simpler ones both aggravate these basic problems and introduce new cognitive problems.

Postscript

The following pieces of data were brought to my attention too late to be included in the chapter:

1. Derbyshire (1977) discusses a Carib language, Hixkaryana (Brazil), whose basic word order is argued to be OVS. The examples given there suggest that Hixkaryana follows more closely a verb-final typology than a verb-initial one, and thus does not pattern in general like the verb-initial languages presented here. We should then restrict our claims here to languages which not only are subject-final in the sense discussed in this paper but also present the other major NPs after the verb.

2. There are at least three other languages not discussed here which are subject-final in our original sense: Tzotzil and K'ekchi, both Mayan, and thus related to Tzeltal (data from Ava Berenstein, personal communication), and Tsou, a Malayo-Polynesian language indigenous to Taiwan (see Tung 1964). Tzotzil and K'ekchi conform well to our generalizations for subject-final languages, patterning generally like Tzeltal but with better-developed passives and subject and object agreement. We know less of Tsou, but superficial inspection of Tung 1964 suggests that Tsou also conforms to our generalizations. It does at least have well-developed passives and both subject and object agreement, and it occurs in a phylum in which verb-initiality is common.

Notes

The work on subject-final languages was supported by a grant (No. 2944) from the Wenner-Gren Foundation for Anthropological Research; the work on Malagasy was supported by an NSF Postdoctoral Fellowship and a grant from Wenner-Gren (No. 2384).

1. A possible counterexample to the claim made in 6.2 is Zenéyze, that variety of Romance spoken in and around Genoa, Italy. It appears that simple sentences without a "theme" or topic have VOS order as in (i) below, cited from Pullum 1977.

(i) U- vend-e i peši a Zêna a Katayniŋ.
  it- sell-s the fishes in Genoa the Catherine
  'Catherine sells the fish in Genoa.'
On the other hand, when the agent is the topic, the order is SVO and the verb agrees with the topic/subject.
(ii) A Katayniŋ a- vend-e i peši a Zêna.
  the Catherine she- sell-s the fishes in Genoa
  'Catherine sells the fish in Genoa.'
Vattuone (1975) argues that the VOS order is syntactically the more basic and that the SVO order is derived by a rule of thematization. Pullum (1977) however argues, convincingly to my mind, that the SVO order is basic and that the VOS order is derived.

A critical point for me in the argument is that the verb-initial structure uses an impersonal verb: it is always third singular masculine regardless of the number and gender of the "subject." Such impersonal constructions are common in European languages. Note (iii) below from Dutch (Kirsner 1976) and (iv) from French.

(iii) Er wordt (door de jongens) gefloten.
  there becomes by the boys whistled
  'There is whistling (by the boys).'
(iv) Il est arrivé un homme et deux femmes.
  it is-3Sg. arrived-Sg.M. a man and two women
  'There arrived a man and two women.'
I consider then that the existence of impersonal constructions of that sort in Zenéyze is not sufficient to class Zenéyze as a subject-final language in the sense established here for two reasons:

First, even if such impersonal constructions are among the pragmatically least marked sentence types in Zenéyze they are clearly (as with other European languages presenting such constructions) a small minority among the unmarked sentence types.

Second, it is part of understanding what we mean by "subject" that the subject is the topic (of unmarked sentences — see section 6.3 for further discussion of that point). Arguably then sentences like (i) in Zenéyze are subjectless, and hence not among the sentences on which my definition of subject-final is based.

I further note that the use of a predicate-first order in presentational or topicless contexts, e.g., as discourse-initial in stories, is not uncommon in Romance. See Givón 1976b for discussion of predicate-first order in Spanish. Similarly in French, newspaper reports not uncommonly begin with a predicate-first order, as in (v) below:

(v)
Ont assisté à la réunion hier à l'Elysée
have attended to the meeting yesterday at the-Elysée
 
Messrs. X, Y, et Z.
Messrs. X, Y, and Z
 
'Attending the meeting yesterday at the Elysée were Messrs. X, Y, and Z.'
(Sentences like (v) are primarily about the existence of a meeting, not about who attended.) Thus to the very limited extent that Zenéyze might be considered subject-final it seems likely that other varieties of Romance could also be considered verb-initial, so G-2 would not be violated.