Albanian is an Indo-European language spoken mainly in the Balkan Peninsula by approximately five million people. It is the principal and official language of Albania, the principal and a co-official language of Kosovo (with Serbian), and the principal and co-official language of many western municipalities of the Republic of Macedonia (with Macedonian). Albanian is also spoken widely in some areas in Greece, southern Montenegro, southern Serbia, and in some towns in southern Italy and Sicily.
The terms Albania and Albanian are exonyms. The Albanians call themselves Shqiptar, their language shqip, and their country Shqipëria. These words are likely derived from the adverb shqip 'clearly', based on Latin excipere (whence shqipoj 'speak clearly'), though there are alternative explanations. Other languages use a form from earlier *alban- or *arban- (the difference most likely arising from a rhoticism process in Greek). In most of them, the form has the same origin as Eng. Albanian (e.g., It. Albanese, Serb. Albanac, Germ. Albaner, etc.). In Turkish, the Albanians are called Arnavut, derived in some way from arvan-. The terms Albania and Albanian are not to be confused with the area in the Caucasus referred to in ancient texts as Albania or the language spoken there, referred to as Albanian (an ancestor of the modern Udi language spoken in Azerbaijan and a member of a language family with no confirmed connections to the Indo-European language family).
When compared with most of the other Indo-European languages, Albanian's first attestations are rather recent, with the first surviving fragment from the mid-15th century and the first major text from the mid-16th century. For this reason, these lessons cover Albanian from the modern standard language back to earlier attestations, starting with the modern variety to get a grounding in the language and working back to older material.
Albanian forms a separate branch of Indo-European and cannot conclusively be closely connected with any other Indo-European language. There have been attempts to connect Albanian with some of the sparsely attested ancient languages of the Balkans, particularly Illyrian but also Dacian and Thracian. While this is plausible geographically, given that we know the Illyrians lived in an area that includes the modern Albanian-speaking area, there is no concrete linguistic evidence for any of these proposals. Some have proposed a connection between the ancestor of Albanian (without assigning a specific identity to this ancestor) and a Latinized variety of that ancestor that may have ultimately yielded Romanian, as there are several shared words not of Latin origin in both languages.
Mention of the Albanian people and the Albanian language appears rather late in the historical record. The earliest uncontroversial mention of the Albanian people is in Michael Attaleiates's late-11th-century history of the Byzantine Empire, where he refers to the Albanoi taking part in a revolt against Constantinople and the Arvanitai as subjects of the duke of Dyrrachium (modern Durrës, Albania's main port on the Adriatic).
The first mentions of the Albanian language predate its first attestation by several centuries. Elsie (1991) describes a 1285 text in which the investigation of a robbery in Ragusa (modern Dubrovnik, Croatia) refers to a witness who said Audivi unam vocem clamantem in monte in lingua albanesca 'I heard a voice crying in the mountains in the Albanian language'. In the 1308 Anonymi Descriptio Europae Orientalis 'Anonymous description of Eastern Europe', the author writes Habent enim Albani prefati linguam distinctam a Latinis, Grecis et Sclavis ita quod in nullo se inteligunt cum aliis nationibus 'The aforementioned Albanians have a language which is entirely distinct from that of the Latins, Greeks and Slavs such that in no way can they communicate with other peoples'.
While the earliest attested Albanian texts are from over a century later, the existence of Albanian texts is mentioned in 1332 in Directorium ad passagium faciendum (by a French monk whose identity is uncertain): licet Albanenses aliam omnino linguam a latina habeant et diversam, tamen litteram latinam habent in uso et in omnibus suis libris 'The Albanians have a language different from Latin, although they use Latin letters in their books' (note that this could potentially be saying that Albanians just wrote in Latin).
The oldest unambiguously attested Albanian is a single line embedded in a Latin document from 1462. It appears in a letter from Pal Engëlli, a bishop and associate of Skënderbeu, and is a translation of a baptismal formula (formula e pagëzimit) into Geg Albanian:
| Vnte' paghesont premenit Atit et birit et spertit senit |
| 'I baptize you in the name of the father, the son, and the holy spirit' |
| cf. Std. Alb. Unë të pagëzoj në emër të Atit, të Birit, e të Shpirtit të Shenjtë |
Over the following century the attested Albanian "texts" are of similar size, including a single line in a Latin play from 1483 and a short list of Albanian words from 1496.
The first larger text is Meshari i Gjon Buzukut 'The Missal of Gjon Buzuku', written in 1555 (see Lesson 5). Like the earlier attestations of Albanian, Buzuku's 'Missal' is written in Geg. Most of the early documentation of Albanian is in Geg, as that area was more difficult for the Ottomans to subdue (and, consequently, it was harder for them to discourage the use of Albanian there). The earliest attestation of Tosk Albanian is the E mbsuame e krështerë 'Christian doctrine' of Lekë Matrënga from 1592, written in Hora e Arbëreshëvet, an Arbëresh settlement in northwestern Sicily.
Albanian dialects are traditionally divided into two groups: Geg dialects in the north, and Tosk dialects in the south. The dividing line is traditionally considered to be the Shkumbin river, which runs east-west through central Albania (at approximately the 41st parallel north). Dialects spoken in Kosovo and Macedonia are Geg dialects, while those spoken in northwestern Greece are Tosk dialects. While they are technically Tosk dialects, Arvanitika (spoken in Greece, historically in Attica and Boeotia) and Arbëresh (spoken in southern Italy and Sicily) are also often considered major Albanian dialects; these dialects were brought to these areas after the Ottoman conquest of the western Balkans in the late 15th century, and they are maintained to this day.
Phonological variation:
Morphosyntactic variation:
Nearly all of the historical centers of Albanian culture (Durrës, Tiranë, Shkodër, Prishtinë, Tetovë, etc.) are located squarely in Geg-speaking territory. However, Standard Albanian is predominantly based on Tosk. The promotion of a Tosk-based variety as a standard is actually quite recent, and likely has much to do with the fact that Enver Hoxha, Albania's dictator from the 1940s until the 1980s, was from Gjirokastër (in southern Albania), and thus was a native speaker of a Tosk variety. Even though they are predominantly located in Geg-speaking areas, the standard variety used in Kosovo and Macedonia is the same one used in Albania (i.e., it is based on Tosk).
Standard Albanian, while predominantly based on Tosk, does also have some Geg features. For example, the Standard Albanian 1st person singular present verb ending -j is a Geg feature; most Tosk dialects, on the other hand, have the ending -nj.
As with the other languages of the Balkans, the development of Albanian has been drastically affected by contact with speakers of other languages.
While reports of over 90 percent of Albanian's lexicon being composed of foreign words are definitely overstated, lexical borrowing has had an enormous effect on Albanian. There are several strata of lexical borrowings.
As part of the Balkan Sprachbund, Albanian shares a number of features with the other languages of the Balkans (e.g., Greek, Bulgarian, Macedonian, Romanian, Turkish, Romani, etc.). The following are some of Albanian's more notable Balkan features:
The earliest texts were written in various forms of the Latin alphabet, with additional characters borrowed from the Greek alphabet (as well as some additional characters of other origins). Up until the late 19th century, the script used to write Albanian appears to have been dependent on the religion of the scribe: Latin for Catholics, Greek for Orthodox Christians, and Perso-Arabic script for Muslims. In the late 19th century there were various attempts to create a standardized alphabet for Albanian; in 1908, the modern Albanian alphabet was codified at the Congress of Manastir.
The modern Albanian alphabet consists of 36 letters, several of which are digraphs.
| A,a | B,b | C,c | Ç,ç | D,d | Dh,dh | E,e | Ë,ë | F,f | G,g | Gj,gj | H,h |
| I,i | J,j | K,k | L,l | Ll,ll | M,m | N,n | Nj,nj | O,o | P,p | Q,q | R,r |
| Rr,rr | S,s | Sh,sh | T,t | Th,th | U,u | V,v | X,x | Xh,xh | Y,y | Z,z | Zh,zh |
As briefly discussed above, Geg has nasalized vowels. The normal convention is to write these vowels with a circumflex accent. All other issues with the alphabet are discussed in the relevant lessons.
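Because several letters of the alphabet are digraphs, counting or sorting by letter is not the same as counting or sorting by character. The following Python sketch illustrates one way to handle this. The names `ALPHABET`, `split_letters`, and `sort_key` are hypothetical, and the greedy rule (always read a two-character sequence as a digraph when it matches one) is a simplifying assumption that may misread the occasional word in which two adjacent single letters happen to spell a digraph.

```python
# The 36 letters of the modern Albanian alphabet, in alphabetical order.
ALPHABET = [
    "a", "b", "c", "ç", "d", "dh", "e", "ë", "f", "g", "gj", "h",
    "i", "j", "k", "l", "ll", "m", "n", "nj", "o", "p", "q", "r",
    "rr", "s", "sh", "t", "th", "u", "v", "x", "xh", "y", "z", "zh",
]
DIGRAPHS = {letter for letter in ALPHABET if len(letter) == 2}
ORDER = {letter: rank for rank, letter in enumerate(ALPHABET)}


def split_letters(word):
    """Split a word into Albanian letters (assumption: greedy digraph matching)."""
    word = word.lower()
    letters = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            letters.append(pair)
            i += 2
        else:
            letters.append(word[i])
            i += 1
    return letters


def sort_key(word):
    """Sort key following alphabet order; unknown characters sort last."""
    return [ORDER.get(letter, len(ALPHABET)) for letter in split_letters(word)]


print(split_letters("shqip"))                         # ['sh', 'q', 'i', 'p']
print(sorted(["dhomë", "dy", "det"], key=sort_key))   # ['det', 'dy', 'dhomë']
```

A plain character-by-character sort would place dhomë before dy, since it compares h with y; treating dh as a single letter that follows d gives the order shown.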
Standard Albanian, as well as most Tosk dialects, has a seven-vowel system:
| letter | pronunciation |
|---|---|
| i | similar to the vowel in Eng. meat |
| e | similar to the vowel in Eng. met |
| a | similar to the vowel in Eng. hot |
| o | similar to the vowel in Eng. boat, but not diphthongal. More akin to the vowel in Spanish no. |
| u | similar to the vowel in Eng. boot |
| y | a high, front, rounded vowel; absent in English; similar to the vowel in French tu |
| ë | similar to the final vowel in Eng. sofa |
In Standard Albanian (as well as in most Geg dialects), the vowel ë is typically not pronounced in final position (e.g., nëntë 'nine' is pronounced nënt), except in monosyllabic words (e.g., një 'one', që 'that', etc.). This sound is also commonly elided in other unstressed syllables. In some (mainly Tosk) dialects, this vowel is fully pronounced.
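The final-position rule can be approximated in a few lines of Python. This is only a rough sketch under stated assumptions: the function name `drop_final_schwa` is hypothetical, the syllable count is estimated by counting vowel letters, and the input is assumed to use the precomposed character ë. It does not attempt the elision in other unstressed syllables.

```python
# The seven vowel letters of Standard Albanian.
VOWELS = set("aeiouyë")


def drop_final_schwa(word):
    """Drop a word-final ë unless the word has only one vowel letter.

    Assumption: one vowel letter is treated as one syllable, which is
    only an approximation of real syllable structure.
    """
    lowered = word.lower()
    vowel_count = sum(1 for ch in lowered if ch in VOWELS)
    if lowered.endswith("ë") and vowel_count > 1:
        return word[:-1]
    return word


print(drop_final_schwa("nëntë"))  # nënt
print(drop_final_schwa("një"))    # një
print(drop_final_schwa("që"))     # që
```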
While Standard Albanian has a relatively simple seven-vowel system, most Geg varieties have a much more complex set of vowels. Any of the vowels above, with the exception of ë, can be nasalized. In addition, Geg has distinctive vowel length, so any of the vowels (except, again, ë) can be long or short. Camaj (1984) also claims that some Geg varieties have a distinction between short nasal vowels and long nasal vowels.
As for consonants, though most of the letter-sound correspondences will be familiar, there are some exceptions:
| letter | description | sounds like... |
|---|---|---|
| c | voiceless dental affricate | ts in English cats, z in Italian zio, c in Russian cvet |
| ç | voiceless postalveolar affricate | ch in English choose, c in Italian cento |
| dh | voiced dental fricative | th in English the |
| gj | voiced palatal stop | similar to g in English gear |
| ll | voiced velarized lateral | similar to ll in English ball; in Albanian, unlike in English, this sound can occur in any position in the word. |
| nj | palatal nasal | gn in French agneau, similar to ni in Eng. onion |
| q | voiceless palatal stop | similar to k in Eng. key |
| rr | alveolar trill | rr in Spanish sierra |
| th | voiceless dental fricative | th in English thing |
| x | voiced dental affricate | ds in English needs, z in Italian zero |
| xh | voiced postalveolar affricate | j in English judge, g in Italian giro |
| zh | voiced postalveolar fricative | s in English pleasure, j in French jour |
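For quick lookup while reading, the consonant table above can be kept as a small Python mapping. The descriptions are copied from the table; the IPA symbols are not part of the table and are added here as an assumption, chosen to match each description. The names `CONSONANTS` and `describe` are hypothetical.

```python
# letter -> (description from the table, IPA symbol added as an assumption)
CONSONANTS = {
    "c": ("voiceless dental affricate", "ts"),
    "ç": ("voiceless postalveolar affricate", "tʃ"),
    "dh": ("voiced dental fricative", "ð"),
    "gj": ("voiced palatal stop", "ɟ"),
    "ll": ("voiced velarized lateral", "ɫ"),
    "nj": ("palatal nasal", "ɲ"),
    "q": ("voiceless palatal stop", "c"),
    "rr": ("alveolar trill", "r"),
    "th": ("voiceless dental fricative", "θ"),
    "x": ("voiced dental affricate", "dz"),
    "xh": ("voiced postalveolar affricate", "dʒ"),
    "zh": ("voiced postalveolar fricative", "ʒ"),
}


def describe(letter):
    """Return a one-line description of a consonant letter from the table."""
    description, ipa = CONSONANTS[letter.lower()]
    return f"{letter}: {description} [{ipa}]"


print(describe("xh"))  # xh: voiced postalveolar affricate [dʒ]
```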
Leonard Newmark's Albanian-English Dictionary (Oxford Univ. Press, 1998) is the most comprehensive Albanian-English dictionary available. In addition to covering the standard language, it also lists Geg and other dialectal forms. Stuart Mann published both An English-Albanian Dictionary (Cambridge Univ. Press, 1957) and A Historical Albanian-English Dictionary (Longmans, Green, 1948).
A comprehensive textbook-style grammar of Standard Albanian is that of Newmark, Hubbard, & Prifti, titled Standard Albanian: A Reference Grammar for Students (Stanford Univ. Press, 1982). This work is mainly focused on Albanian morphology, but gives some coverage to other aspects of Albanian grammar. Martin Camaj's Albanian Grammar with Exercises, Chrestomathy and Glossaries gives a description of both Geg and Tosk. Chris Hughes's Gegnishtja e Sotme: A Course in Modern Geg Albanian is a textbook-style introduction to Modern Geg.