Linguistics Research Center


The Linguistics Research Center is an organized research unit in the College of Liberal Arts at the University of Texas. Since its founding by Winfred P. Lehmann in 1961, virtually all LRC research projects have involved the processing of human language with the aid of computers. Our goal is to deepen the understanding of language as a structure consisting of symbols, and the computer seems ideally suited to help increase our knowledge of that system for languages of both the past and the present. As computational tools and theory have developed, they have become increasingly useful as aids to the development of language descriptions and the testing of linguistic theory.

The principal activities of the LRC have included machine translation, historical study (especially of Indo-European languages), lexicography, bibliography, and indexing. Our most recent activities revolve around publishing web pages in all these areas, although machine translation is represented by only a few pages.

Website traffic measured approximately 2.4 million page views in 2007, 2.8 million in 2008, 3.3 million in 2009, and 4 million in 2010, with 42% of page requests originating from 200 nations outside the U.S.

Pages in the website are currently organized in the following categories:

Machine Translation

During its first 35 years, research efforts at the LRC concentrated on the Machine Translation (MT) of texts from one human language to another with the aid of computers. Following a brief hiatus, new funding led to the development of a new system with the same name as the old, METAL, but with far better tools for linguists and vastly greater success, resulting in delivery of a production prototype and, later, a full-fledged commercial MT system. MT R&D continued at the LRC, with funding by various sponsors, until the mid-1990s.

Early work on MT at the LRC was supported by grants from the U.S. government; but unlike most MT projects in the U.S., which explored Russian-English translation, the LRC specialized in German-English translation. The system came to be called METAL; whether this stood for "META-Language(s)" or "MEchanical Translation and Analysis of Languages" is knowledge now lost in the mists of time. A personal commentary about this early MT work, by Winfred Lehmann, is found online in Machine Translation at Texas: the Early Years.

Funding by the U.S. government for MT work at the LRC ended ca. 1975, although a very small amount for "documentation & inventory" became available a few years later, after all the systems personnel and most of the linguists had left the LRC. Knowledge of the software and how to run it was all but lost, and in any case major hardware changes at the University were soon to render that knowledge moot, and the system unusable.

Eventually, in 1978-79, the German firm Siemens AG stepped in due to its growing corporate need for high-speed translation. Funding from Siemens enabled the LRC to hire new systems personnel, who delivered a full production prototype by mid-1984, to be followed by full-scale commercial implementation in C++. A personal commentary about this critical phase of system development by Jonathan Slocum, then MT Project Director at the LRC, is found online in Machine Translation at Texas: the Later Years.

As METAL evolved into a commercial product, both for in-house use and for licensing to other firms, Siemens gradually absorbed all system development work; LRC MT personnel became Siemens employees or contractors, or engaged in MT research funded by other sponsors, or found other work at the LRC or elsewhere. In 1993, Siemens funding for METAL development at the LRC came to an end, with general MT research continuing for a year or two thereafter.

Other Projects

Concurrent with and following work in MT, LRC staff members have engaged in a number of independent but often complementary projects, the most significant of late resulting in extensive online publications in the area of Indo-European languages and linguistics:

Somewhat earlier projects, as well as other more recent ones, have resulted in significant print publications, including books that are described in the next section of this brochure.

Projects Leading to Book Publications

From its early years to the present, the LRC has mounted a number of smaller projects resulting in the publication of significant works relating to Indo-European languages and/or their common ancestor, Proto-Indo-European. The hallmark of this work has been the use of computers to transcribe texts and prepare them for publication. Ancient works in Old Indic made available in electronic form include Rgveda-Samhitâ (1970-71) and Shatapatha, Maadhyandina Shaakha (1971), both by W.P. Lehmann and H.S. Ananthanarayana and now included in the TITUS text database.

In a series sponsored by the Committee on Research Activities of the Modern Language Association, grammars of Old Irish, Gothic, and Old French were published. The first of these was An Introduction to Old Irish, by Ruth P.M. & Winfred P. Lehmann (1975). These grammars set new standards for mastery of older languages, especially from the medieval period; they are designed for learning without instructors, though they are used in classroom courses as well.

Two Ph.D. dissertations tied directly to research carried out at the LRC were published in the early 1980s: The Structure of the Merriam-Webster Pocket Dictionary by Robert Amsler (1980), and A Practical Comparison of Parsing Strategies for Machine Translation and Other Natural Language Processing Purposes by Jonathan Slocum (1981).

The Mesoamerican Languages project was begun at the LRC in 1978; languages of interest included Nahuatl and Yucatec Maya. Major grants from the National Science Foundation made possible book-length publications on Nahuatl by Frances Karttunen, including An Analytical Dictionary of Nahuatl in 1983; and a translation grant from the National Endowment for the Humanities made possible production of another book-length publication on Nahuatl, The Art of Nahuatl Speech: The Bancroft Dialogues by Frances Karttunen and James Lockhart (1987).

A prominent example of the LRC using computers to prepare texts for print publication is A Gothic Etymological Dictionary by Winfred P. Lehmann (1986), with bibliography prepared under the direction of Helen-Jo J. Hewitt. The final print-ready version was produced with the aid of a laser printer (exotic new technology, in those days) using, for the various languages included in the entries, approximately 500 special characters -- many of them designed at the LRC. This was the first major etymological dictionary for Indo-European languages to be produced with the aid of computers.

By far the most complex publication of its day was Astadhyayi of Panini by Sumitra M. Katre (1987); this is an English translation, with commentary, of the fundamental Sanskrit grammar by Panini. Prior to this effort, Panini's grammar was available only in the original Sanskrit and in a 19th-century German translation.

A sampling of other books published by LRC staff members in more recent years includes the following:

Educational Activities

As one of its major aims in a university setting, the LRC has trained many specialists in computational procedures, including computational linguistics. Personnel files at the LRC include some of the greatest names in machine translation R&D; and a number of leading members of the University of Texas Computation Center, which later grew to include Information Technology Services, received early training at the LRC. Likewise, a former Chairman of the Department of Computer Sciences was recruited to UT to work at the LRC. Former members of the LRC have occupied important positions at various other universities and institutions throughout the U.S. and abroad (e.g. in Europe, India, China, and Japan).

In recent years, undergraduate and graduate students from a number of UT academic departments have trained and worked at the LRC; a partial listing of these departments includes Asian Studies, Classics, French & Italian, Germanic Studies, Linguistics, and Physics.

Likewise, in recent years (2005-09) four LRC staff members taught undergraduate and graduate courses at UT and elsewhere:

The continued advances in LRC project work testify to the capabilities of LRC staff members, as does their extensive involvement in scholarly publication. The work they carry out is documented in various print and online publications; circulation figures of print publications and website traffic to online resources demonstrate a high level of interest in their academic contributions.

Sponsors and Funding

Research and development at the Linguistics Research Center has been sponsored by numerous institutions, corporations, and individuals. Chief among them:

LRC research grants and contracts over the years have exceeded $10 million in raw numbers; adjusted for a modest 3% inflation rate, this amounts to approximately $25.3 million, averaging $496,000 per year since work began.

Selected Print Publications, 1999-2009

Bauer, B.L.M. "Word Order," New Perspectives on Historical Latin Syntax. Vol. 1. Syntax of the Sentence. Philip Baldi and Pierluigi Cuzzolin, eds. Berlin: Mouton de Gruyter, 2009, pp. 241-316.

Bauer, B.L.M. "Nominal Apposition in Vulgar and Late Latin. At the Cross-Roads of Major Linguistic Changes," Latin vulgaire et latin tardif, Roger Wright, ed. Tübingen: Niemeyer, 2008, pp. 42-50.

Bauer, B.L.M. Archaic Syntax in Indo-European. The Spread of Transitivity in Latin and French. Berlin: Mouton de Gruyter, 2000.

Justus, C.F. "Hittite and Indo-European Gender," Indo-European Perspectives, Journal of Indo-European Studies Monograph 43, ed. by Mark R. V. Southern, 2002, pp. 121-150.

Justus, C.F. "On Language and the Rise of a Base for Counting," General Linguistics 42, 2004 (2002), pp. 17-43.

Krause, T.B. "Perfective through Prefixes in PIE? Perhaps." East Coast Indo-European Conference, Virginia Tech University, May 2004.

Lehmann, W.P., E. Raizen, and H-J.J. Hewitt. Biblical Hebrew: An Analytical Introduction. San Antonio, Texas: Wings Press, 1999.

Lehmann, W.P. Pre-Indo-European, Journal of Indo-European Studies Monograph 41. Washington DC: Institute for the Study of Man, 2002.

Slocum, J. and D. Simms. "Going Online: Problems and Solutions for Indo-Europeanists and Other Linguists," General Linguistics 42, 2004 (2002), pp. 44-60.

Slocum, J. "Early Indo-European Online (EIEOL): Ancient language lessons in web page form," Journal of Indo-European Studies 33, 3 & 4, Fall/Winter 2005, pp. 315-323.

Selected Website Publications, 2002-2011

Bauer, B.L.M. and J. Slocum. Old French Online, 2006.

Harvey, S.L., W.P. Lehmann, and J. Slocum. Old Iranian Online, 2003.

Joseph, B.D., A. Costanzo, and J. Slocum. Albanian Online, 2011.

Kimball, S.E., W.P. Lehmann, and J. Slocum. Hittite Online, 2003.

Krause, T.B. and J. Slocum. Old Church Slavonic Online, 2002.

Krause, T.B. and J. Slocum. Classical Armenian Online, 2003.

Krause, T.B. and J. Slocum. Old Norse Online, 2004.

Krause, T.B. and J. Slocum. Gothic Online, 2005.

Krause, T.B. and J. Slocum. Tocharian Online, 2010.

Lehmann, W.P. and J. Slocum. Latin Online, 2002.

Lehmann, W.P. and J. Slocum. Classical Greek Online, 2002.

Lehmann, W.P. and J. Slocum. New Testament Greek Online, 2003.

Slocum, J. and W.P. Lehmann. Old English Online, 2007.

Slocum, J., et al. Indo-European Lexicon, 2009-11.

Stempel, P. de Bernardo, C. Esser, and J. Slocum. Old Irish Online, 2007.

Thomson, K. and J. Slocum. Ancient Sanskrit Online, 2006.

Vasiliauskiene, V., L. Zalkalns, P. Vanags, and J. Slocum. Baltic Online, 2005, 2007.