Zapotec languages

Native names: Diidxazá, Dizhsa
Ethnicity: Zapotecs
Region: Oaxaca, Veracruz, Guerrero, Puebla. Small populations in California and New Jersey, United States.
Native speakers: 490,000 in Mexico (2020 census)[1]
Linguistic classification: Oto-Manguean
Proto-language: Ancient Zapotec
Subdivisions:
  • Central (Isthmus and Valley)
  • Mazaltepec
  • Sierra Norte
  • Sierra Sur
  • Western
ISO 639-2 / 5: zap
ISO 639-3: zap
Glottolog: zapo1437
The Zapotec languages as classified by Glottolog

Zapotec speaking areas of Oaxaca (as of 2015)

The Zapotec /ˈzæpətɛk/[2] languages are a group of around 50 closely related indigenous Mesoamerican languages that constitute a main branch of the Oto-Manguean language family and are spoken by the Zapotec people of the southwestern-central highlands of Mexico. The 2020 census reports nearly half a million speakers,[1] the majority of whom live in the state of Oaxaca. Zapotec-speaking communities are also found in the neighboring states of Puebla, Veracruz, and Guerrero. Labor migration has also brought a number of native Zapotec speakers to the United States, particularly to California and New Jersey. Most Zapotec-speaking communities are highly bilingual in Spanish.


The name of the language in Zapotec itself varies according to the geographical variety. For example, in Juchitán (Isthmus) it is Diidxazá [didʒaˈza],[3] in Mitla it is Didxsaj [didʒˈsaʰ],[4] in Zoogocho it is Diža'xon [diʒaʔˈʐon],[5] in Coatec Zapotec it is Di'zhke' [diʔʒˈkeʔ],[6] in Miahuatec Zapotec it is Dí'zdéh [diʔzdæ], and in Santa Catarina Quioquitani it is Tiits Së [tiˀts sæ].[7] The first part of these expressions means 'word' (perhaps slightly reduced, as is appropriate for part of a compound).



Zapotec and the related Chatino languages together form the Zapotecan subgroup of the Oto-Manguean language family. Zapotec languages (along with all Oto-Manguean languages) form part of the Mesoamerican Linguistic Area, an area of linguistic convergence that developed over millennia of interaction between the peoples of Mesoamerica. As a result, languages of the area have acquired characteristics from genetically unrelated languages spoken there.


Although commonly described as a language, Zapotec is a fairly extensive, if close-knit, language family. The time depth is comparable to that of the Romance languages.[8] Dialectal divergence between Zapotec-speaking communities is extensive and complicated.[9] Many varieties of Zapotec are mutually unintelligible. There are some radical jumps in intelligibility between geographically close communities, so the varieties do not form a dialect continuum in a strict sense, though neither are there clear-cut divisions between groups of varieties.[10] As a result, the Mexican government officially recognizes sixty Zapotec languages.[11]

Zapotec languages fall into four broad geographic divisions: Zapoteco de la Sierra Norte (Northern Zapotec), Valley Zapotec, Zapoteco de la Sierra Sur (Southern Zapotec), and Isthmus Zapotec. Northern Zapotec languages are spoken in the mountainous region of Oaxaca, in the Northern Sierra Madre mountain ranges; Southern Zapotec languages are spoken in the mountainous region of Oaxaca, in the Southern Sierra Madre mountain ranges; Valley Zapotec languages are spoken in the Valley of Oaxaca; and Isthmus Zapotec languages are spoken in the Isthmus of Tehuantepec. However, Valley Zapotec and Isthmus Zapotec group together (as Central Zapotec), and this four-way division ignores the Papabuco and Western Zapotec varieties.

Certain characteristics serve to classify Zapotec varieties in ways that cross-cut the geographical divisions. One of these is the distinction between disyllabic roots and monosyllabic roots. It is clear that proto-Zapotec had disyllabic roots; the vowel of the second syllable could be any one of the inventory of vowels. One innovation shared by many varieties of Zapotec is the loss (or partial loss) of the vowel of the second syllable. The word for 'water' illustrates this fact. In conservative varieties, the vowel of the second syllable is retained: /nisa/ in Isthmus Zapotec and /inda/ in Sierra de Juárez Zapotec, for example.

In innovative varieties, the vowel of the second syllable was lost: /nis/ in Amatlán Zapotec and Mitla Zapotec, for example. The loss of the vowel /i/ often resulted in palatalized consonants, and the loss of /u/ often resulted in labialized consonants. Compare the words for 'dog' in conservative varieties (Isthmus /beʔkuʔ/, Sierra de Juárez /bekuʔ/) and innovative varieties (Amatlán /mbak/ and Mitla /bæʔkʷ/). In this particular word Amatlán does not have a labialized consonant at the end, and the otherwise innovative variety Yatzachi keeps the final vowel: /bekoʔ/.
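As a rough illustration of the sound change described above, the following Python sketch drops the final vowel of a disyllabic root and marks the preceding consonant as palatalized or labialized when the lost vowel was /i/ or /u/. It is a deliberately simplified toy, not a reconstruction. The function name `lose_final_vowel`, the five-vowel set, and the schematic inputs `taki` and `taku` are assumptions made for illustration and are not attested forms. Real varieties show further changes, as the 'dog' examples indicate.

```python
# Toy model of second-syllable vowel loss (a sketch under simplifying assumptions).
VOWELS = set("aeiou")              # assumed, simplified vowel inventory
SECONDARY = {"i": "ʲ", "u": "ʷ"}   # palatalization / labialization marks


def lose_final_vowel(root: str) -> str:
    """Drop a root-final vowel that follows a consonant.

    If the lost vowel is /i/ or /u/, mark the preceding consonant as
    palatalized or labialized; otherwise just delete the vowel.
    """
    if len(root) < 2 or root[-1] not in VOWELS or root[-2] in VOWELS:
        return root  # no final vowel after a consonant: nothing to drop
    return root[:-1] + SECONDARY.get(root[-1], "")


print(lose_final_vowel("nisa"))  # nis  (compare Isthmus /nisa/ with Mitla /nis/)
print(lose_final_vowel("taku"))  # hypothetical input, not an attested word
print(lose_final_vowel("taki"))  # hypothetical input, not an attested word
```

The sketch handles only the regular case. It would not produce Amatlán /mbak/, which lacks the expected labialization, and it would wrongly apply to Yatzachi /bekoʔ/, which keeps its final vowel.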

Another characteristic that classifies Zapotec varieties is the existence or not of a contrast between alveopalatal fricatives and retroflex fricatives. Innovative varieties have introduced the contrast while conservative varieties have not.[12]

The most influential classification of Zapotec languages is due to Thomas Smith Stark, who proposed the following overall classification of Zapotec languages.[13]

The branch of the family that contains the most languages is Central Zapotec, which includes most of the Zapotec languages of the Valley of Oaxaca and the Isthmus of Tehuantepec. The following figure shows the classification suggested by Smith Stark (2007).[14]

The Northern branch is shown in more detail below, again following Smith Stark (2007).

Based on intelligibility studies, previous classifications, and the needs of literacy development, Merrill (2008) classifies the varieties as follows; several varieties (in brackets), often small and moribund, were not included in the principal list:[15]

  • Central Zapotec
  • Mazaltepec Zapotec
  • ? [Tejalapan]
  • Sierra Norte
  • Sierra Sur
  • Western Zapotec

Two of the moribund varieties, Asunción Mixtepec and San Bartolo Yautepec (ISO "Yautepec"), are apparently divergent.

Santa Catarina Albarradas Zapotec was not listed and is presumably subsumed under Albarradas Zapotec, but intelligibility is one-way.

Based on forms of the personal pronouns, Operstein (2003) groups the languages as follows:[16]

  • Proto-Zapotec
    • Southern Zapotec
    • Papabuco
    • (unnamed)
      • Solteco
      • Northern Zapotec
      • Central Zapotec
        • Valley Zapotec
        • Isthmus Zapotec

Based on the development of Proto-Zapotec *tty/*ty and *ttz/*tz, Operstein (2012) groups the Zapotec languages as follows.[17]

  • Proto-Zapotec
    • Western
    • Papabuco
    • Coatec
    • Core Zapotec
      • Southern
      • Central
      • Northern

Phonetics and phonology

Fortis and lenis

In Zapotec languages, fortis typically corresponds to voicelessness and extra length in obstruents and extra length in sonorants. Lenis corresponds to voicing and less length in obstruents and less length in sonorants. In addition, stressed vowels before lenis consonants may be longer than those before fortis consonants.[18]

Retroflex consonants

Some varieties of Zapotec have a contrast between alveopalatal fricatives and retroflex fricatives. In other varieties this distinction has been lost in favor of only one or the other.


Tone

Zapotec languages are tonal, as are Oto-Manguean languages generally. Materials on Zapotec languages vary widely in the quality of their tonal description and analysis.

Many Northern Zapotec languages, such as Sierra Juárez Zapotec (Nellis and Nellis 1983, Bickmore and Broadwell 1998, Tejada 2010), show a system of three level tones (L, M, H) plus two contours. Potential aspect and 1st person singular both involve floating high tones. Texmelucan Zapotec has four contrasting tones: three contour tones and one level tone, as shown in the figure. These tones are frequently used for "word play".

A typical system for a Central Zapotec language has two level tones plus contours, but there are complex interactions between tone, stress and phonation type, as in San Lucas Quiaviní Zapotec (Chávez Peón 2010).


Phonation

Zapotec languages all display contrastive phonation type differences in vowels. Minimally they have simple vowels vs. some kind of laryngealization or creakiness; see Quioquitani Zapotec, for example.[19] Others have a contrast between simple, laryngealized and "checked" vowels (which sound like they end in a glottal stop); see Isthmus Zapotec, for example.[20] Others have a contrast between those types and also breathy vowels. The latter varieties include Mitla Zapotec[21] and San Lucas Quiaviní Zapotec.[22]


Stress

Varieties that are described as having stress, including Isthmus Zapotec,[20] have it on the penultimate syllable of the root. Prefixes and clitics do not affect it. Many varieties overwhelmingly have monosyllabic roots, and in these, stress falls on that single syllable.


Grammar

Zapotec languages vary considerably in their grammar. Some characteristics common to the language family (though not necessarily present in all members) are: an extensive 3rd person pronoun system based on noun classes such as divinity, babies, animals, objects (inanimate), etc.; a distinction in the first person plural ("we") between inclusive (including the hearer[s]) and exclusive (not including the hearer[s]); and frequent underspecification of singular/plural distinctions.

Word order

Zapotec languages are VSO, as in the following example from San Dionisio Ocotepec Zapotec (Broadwell 2001):

Though the most basic order has the verb at the beginning of the sentence, all Zapotec languages have a number of preverbal positions for topical, focal, negative, and/or interrogative elements. The following example from Quiegolani Zapotec (Black 2001) shows a focused element and an adverb before the verb:

The preverbal position for interrogatives is shown in the following example, from San Dionisio Ocotepec Zapotec (Broadwell 2001). This is an example of wh-movement:

The possessed noun precedes the possessor in Zapotec languages, as appropriate for head-initial languages:

The noun also precedes a modifying phrase that is another way to indicate possessor with nouns that are not inherently possessed.

The preceding example also illustrates that Zapotec languages have prepositional phrases, as expected for head-initial languages. Quantifiers, including numbers and the word for 'one' used as an indefinite article, precede the noun.

Demonstratives, including one that means 'aforementioned' (in some varieties) and is sometimes translated as a definite article, occur phrase-finally (although they are sometimes written as if they were suffixes).

Descriptive adjectives follow the noun. When they occur they also typically receive the primary stress of the phrase, causing the noun to lose some phonation features. Note the loss of the breathy feature on the word /beʰnː/ in the following example.

Zapotec languages also show the phenomenon known as pied-piping with inversion, which may change the head-initial order of phrases such as NP, PP, and QP.

Verbal morphology

A few varieties of Zapotec have passive morphology, shown by a prefix on the verb. Compare Texmelucan Zapotec root /o/ 'eat' and its passive stem /dug-o/ 'be eaten', with the prefix /dug-/.[27] In many other cases, the transitive-intransitive verb pairs are appropriately described as causative vs. noncausative verb pairs and not as transitive-passive pairs.

Most if not all varieties of Zapotec languages have intransitive-transitive verb pairs which may be analyzed as noncausative vs. causative. The derivation may be obvious or not depending on the kinds of sounds that are involved. In the simplest cases, causative is transparently seen to be a prefix, cognate with /s-/ or with /k-/, but it may also require the use of a thematic vowel /u/, as in the following examples from Mitla Zapotec:[28]

Base verb root        Causative verb stem
/juʔ/ ‘enter’         /u-s-juʔ/ ‘put in’ (i.e., ‘cause to enter’)
/ja/ ‘be clean’       /u-s-ja/ ‘clean’ (i.e., ‘cause to be clean’)

Setting aside possible abstract analyses of these facts (which posit an underlying prefix /k-/ that causes the changes seen superficially), we can illustrate the kinds of non-causative vs. causative pairs with the following examples. (Basic intransitive verbs are more common than basic transitive verbs, as in many languages.) The presence of the theme vowel /u-/ should be noted in the causative verbs, and in some cases is the only difference between the two verbs.[29] One example of a double causative is also included here; these are not possible in all varieties.

Base verb root              Causative verb stem         Double causative verb stem
/ʒiˀ/ ‘be squeezed’         /u-ʃiˀ/ ‘squeeze’
/deʰb/ ‘be wrapped’         /u-teʰb/ ‘wrap’
/niʰt/ ‘be lost’            /u-nniʰt/ ‘lose’
/liˀb/ ‘be tied’            /u-lliˀb/ ‘tie’
/dzukaʰ/ ‘be taken away’    /u-tsukaʰ/ ‘take away’
/kaˈduˀ/ ‘be tied’          /u-k-waˈduˀ/ ‘tie’
/uʔtʃ/ ‘be mixed’           /u-g-uʔtʃ/ ‘mix’            /u-s-g-uʔtʃ/ ‘stir’

Verbs in Zapotec languages inflect with prefixes to show grammatical aspect. The three aspects that are found in all varieties are habitual, potential and completive. San Lucas Quiaviní Zapotec[30] has seven aspects: habitual, perfective (viz., completive), irrealis (viz., potential), progressive, definite, subjunctive, and neutral.

The shape of the root affects the way in which verbs conjugate. Consonant-initial roots conjugate differently than vowel-initial roots, for example, and causative verbs conjugate differently than simple verbs. Prefix vowels may be lost or merged with the root vowel, epenthetic vowels and consonants may be found, and root vowels may be affected. The following example shows the aspectual inflection of three verbs in Mitla Zapotec.[31]

habitual    unreal      continuative  potential    definite future  completive  gloss
/ɾ-baʰnː/   /ni-baʰnː/  /ka-baʰnː/    /gi-baʰnː/   /si-baʰnː/       /bi-baʰnː/  'wake up'
/ɾ-aʰdʒ/    /nj-aʰdʒ/   /kaj-aʰdʒ/    /g-adʒ/[32]  /s-aʰdʒ/         /guʰdʒ/     'get wet'
/ɾ-uʰn/     /nj-uʰn/    /kaj-uʰn/     /g-uʰn/      /s-uʰn/          /b-eʰn/     'do, make'
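The Mitla Zapotec paradigm above can also be held in a small lookup table, which makes it easy to compare prefixes across verbs. The Python sketch below is only an illustration. The forms are copied from the paradigm as given, with the hyphen marking the prefix boundary, but the data layout and the names `ASPECTS`, `PARADIGM`, `form`, and `split_prefix` are assumptions made here and are not part of any published analysis.

```python
# Mitla Zapotec aspect paradigm, copied from the paradigm given in the text.
# The column order follows the text; the dictionary layout is an assumption.
ASPECTS = ["habitual", "unreal", "continuative",
           "potential", "definite future", "completive"]

PARADIGM = {
    "wake up":  ["ɾ-baʰnː", "ni-baʰnː", "ka-baʰnː", "gi-baʰnː", "si-baʰnː", "bi-baʰnː"],
    "get wet":  ["ɾ-aʰdʒ", "nj-aʰdʒ", "kaj-aʰdʒ", "g-adʒ", "s-aʰdʒ", "guʰdʒ"],
    "do, make": ["ɾ-uʰn", "nj-uʰn", "kaj-uʰn", "g-uʰn", "s-uʰn", "b-eʰn"],
}


def form(gloss: str, aspect: str) -> str:
    """Return the inflected form of a verb (by gloss) in the given aspect."""
    return PARADIGM[gloss][ASPECTS.index(aspect)]


def split_prefix(word: str) -> tuple[str, str]:
    """Split a form at its first hyphen into (prefix, stem).

    Forms written without a hyphen in the source are returned with an
    empty prefix, since the source does not segment them.
    """
    prefix, sep, stem = word.partition("-")
    return (prefix, stem) if sep else ("", word)


# Print the prefix each verb takes in each aspect.
for gloss, forms in PARADIGM.items():
    print(gloss, [split_prefix(f)[0] for f in forms])
```

Comparing the rows shows the pattern the text describes: the consonant-initial root takes /ni-/, /ka-/ and /gi-/, while the vowel-initial roots take /nj-/, /kaj-/ and /g-/. The completive of 'get wet' is written without a prefix boundary in the source, so `split_prefix` leaves it unsegmented.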

Noun morphology

There is virtually no true morphology in the Zapotec noun. There is no case marking. Plurality is indicated (if at all) in the noun phrase, either by a number or a general quantifier that may be simply translated as "plural". Possessors are also indicated in the noun phrase either by a nominal or a pronominal element. (In both of these cases, since the plural morpheme and the pronouns may be enclitics, they are often written as if they were prefixes and suffixes, respectively, although they arguably are not true affixes.)

The only clear morphology in most varieties of Zapotec is the derivational prefix /ʂ-/ (or its cognate) that derives an inherently possessed noun from a noun that does not take a possessor.[33] Compare Mitla Zapotec /koʰb/ 'dough', /ʃ-koʰb/ 'dough of'. The derived noun is used when the possessor is indicated, as in /ʃkoʰb ni/ 'his/her dough'.[34]

Variable terminology in the description of Zapotec languages

Linguists working on Zapotec languages often use different terminology for what appear to be related or similar phenomena, such as grammatical aspect markers. This is due in part to the different audiences for which the descriptions have been prepared (professional linguists vs. Zapotec speakers of the language communities, for example). The difference in terminology is particularly evident in descriptions of the aspectual systems of the Valley Zapotec languages. The following table shows some correspondences:

Typical allomorphs: ru-, ri-, r-, rr-
Typical use: ongoing or habitual present tense events
Terms used: habitual (Mitla Zapotec, Stubblefield and Stubblefield 1991; San Dionisio Ocotepec Zapotec, Broadwell 2001; San Lucas Quiaviní Zapotec (SLQZ)), present (Santa Ana del Valle Zapotec (SAVZ))

Typical allomorphs: bi-, b-, gu-, u-
Typical use: past tense completed events
Terms used: perfective (SLQZ, Munro and Lopez, et al. 1999), completive (Mitla Zapotec, Stubblefield and Stubblefield 1991; San Dionisio Ocotepec Zapotec, Broadwell 2001)

Typical allomorphs: gi-, i-, fortis consonant
Typical use: future events
Terms used: irrealis (SLQZ), futuro (SAVZ, Rojas Torres), indefinite future (Mitla Zapotec, Stubblefield and Stubblefield 1991), potential (San Dionisio Ocotepec Zapotec, Broadwell 2001)

Typical allomorphs: na-, n-
Typical use: used with stative verbs for a current state
Terms used: neutral (SLQZ), estativo (SAVZ)

Typical allomorphs: si-, s-
Typical use: future events (where the speaker is strongly committed to the truth of the statement)
Terms used: definite future (Mitla Zapotec, Stubblefield and Stubblefield 1991), definite (SLQZ)

Typical allomorphs: ni-
Typical use: used in the complement of a verb of negation
Terms used: negative (San Dionisio Ocotepec Zapotec, Broadwell 2001), irrealis (Mitla Zapotec, Stubblefield and Stubblefield 1991), subjunctive (SLQZ)

Typical allomorphs: ka-, kay-
Typical use: ongoing events
Terms used: continuative (Mitla Zapotec, Stubblefield and Stubblefield 1991; San Dionisio Ocotepec Zapotec, Broadwell 2001), progressive (SLQZ)

Documentation and scholarship

Dictionary of the language by Francisco Pimental, date unknown

Franciscan and/or Dominican friars published a vocabulary and grammar of Zapotec (Antequera Zapotec) in the 16th century.[35] In the past century, there have been ongoing efforts to produce Zapotec orthographies and to write in Zapotec. The Isthmus Zapotec alphabet in use today was established in the 1950s, drawing on works going back as far as the 1920s. Until recently the Zapotec languages were only sparsely studied and documented, but in recent years they have begun to receive serious attention from descriptive linguists (see bibliography).


The viability of Zapotec languages also varies tremendously. Loxicha Zapotec, for example, has over 70,000 speakers. San Felipe Tejalapan Zapotec might have ten, all elderly. San Agustín Mixtepec Zapotec reportedly has just one remaining speaker. Historically, government teachers discouraged the use of the language, which has contributed to its decline in many places. In La Ventosa, Oaxaca, a Zapotec mother of three claims that her children are punished in class if they speak Zapotec.[citation needed] Other areas, however, such as the Isthmus, proudly maintain their mother tongue.[36][37]

Zapotec-language programming is available on a number of radio stations. The CDI's radio stations XEGLO, based in Guelatao de Juárez, Oaxaca, and XEQIN-AM, based in San Quintín, Baja California, carry Zapotec-language programming along with programming in other indigenous languages. (Coatecas Altas Zapotec speakers live in the area around San Quintín, Baja California.[38]) In the Isthmus there is one privately owned commercial station, Radio TEKA (1030 AM), and several community-based radio stations, most notably Radio Totopo (102.5 FM) in Juchitán, Oaxaca, and Radio Atempa in San Blas Atempa.

In California, Los Angeles is home to communities of Yalálag Zapotec and Zoogocho Zapotec speakers.[39][40] In 2010, a Zapotec language class was offered at the University of California, San Diego.[41] In 2012, the Natividad Medical Center of Salinas, California, had trained medical interpreters bilingual in Zapotec languages as well as in Spanish;[42] in March 2014, the Natividad Medical Foundation launched Indigenous Interpreting+, "a community and medical interpreting business specializing in indigenous languages from Mexico and Central and South America," including Zapotec languages, Mixtec, Trique, and Chatino.[43]