Climate change, etymology, and speaker population

A quick Google search turns up a number of theories on the etymology of the name Tabelbala, none of which correspond to the one that old men here tell me, which appears to me to be much the most plausible. The oasis' name is Tsawerbets in Kwarandzie, Tabelbalt in local Tamazight, and Belbala in local Arabic; they derive it from a tree called awerbel in Kwarandjie and belbal in local Arabic, that used to be common but (presumably due to the lower water table) no longer grows here. [e=schwa] It turns out that belbal is fairly widespread in North African Arabic, and refers to a type of pine; it's also attested in Taznatit, as abelbal. The normal Berber diminutive gives tabelbalt, and the usual Kwarandjie shift of l>r and t>ts would give tsaberbelts; intervocalic b>w is irregular, but I have heard it in other contexts, and final clusters tend to be simplified, which would give tsawerbets. Berber diminutive morphology is not productive in Kwarandjie, so it's hard to imagine this being a folk etymology. If this is correct, the very name of the oasis, like its many acres of ruins and its hundreds of dried-up foggaras, is a mute testimony to a time not too long ago when it was much greener and wetter.

At the moment, Kwarandjie turns out to have on the order of 3000 speakers, adding up the populations of the three villages as given to me by a local official (himself a speaker) and assuming the minority that doesn't speak it at all is made up for by all the emigrant speakers in Tindouf and Bechar. This represents about half the population of the oasis; the other half is in el-Kartsi (le Quartier), the newer town centre. Despite the endangerment discussed in the previous post, this is larger than it's been at any point since 1908, when Cancel counted barely 500 or so speakers. But even in Cancel's time most of the foggaras were dry, and a few centuries earlier refugees had fled the area for places like Mlouka and Ktaoua; in earlier periods the number of speakers may have been significantly larger, judging by the ruins of their houses, which seem to cover an area rather larger than the present settlements do. That former climate might help explain why the oasis not only kept a language that has remained practically nowhere else in the thousand kilometers between it and Timbuktu, but also kept much more Songhay vocabulary than the other northern Songhay languages - even words like hawi "cow", referring to items currently totally absent from the oasis, or tsyu "read" and genga "pray", referring to concepts strongly associated with Arabic. The historic decline in the oasis's population and prosperity has surely itself had its effect on the language, letting words associated with particular specialties (perhaps silverwork, for example) vanish for lack of customers to sustain them, or ones for species vanish with their referents (as the word asiyed, "ostrich", has nearly finished doing - I've only found one speaker who knew it, although Champault confirms it). But is there any way to prove the existence of such an effect, or measure it?

Is this normal in language shift?

When I first got here, I thought I was seeing a textbook language shift situation. But I gradually realised something that I don't remember encountering mention of in my textbooks: there's a whole generation of fluent speakers here (most speakers under 25, actually) who only learned it in their early teen or preteen years. Most parents since the eighties speak only Arabic to their children, but the language is in wide use in situations like football games and farm work, and the younger ones seem to have picked it up there; in fact, it seems possible that the process is continuing with the even younger kids. Does anyone know of a similar case, or am I right in thinking this is a little unexpected?

Eid Mubarak / Happy Holidays!

Sahha Eidkoum, Eid Mubarak, and `agbwa lgabel to everybody out there! And to the rest of you, hope you're having a great holiday and a well-deserved break. Eid here in Tabelbala was good - plenty of mutton, couscous, and maqq, a dish made with boiled dates and bread which tastes rather good. And as a nice seasonal bonus, ADSL has arrived: it looks rather unreliable, but no more so than the phone system. The language is still getting more interesting every time I look at it, and I've started making some rather extensive recordings; just the day before yesterday Hadj Berrouk gave me a rather detailed explanation of astronomy. Mind you, all the star names are (dialectal) Arabic, but nonetheless interesting (and the two planet names are Kwarandzie, as are terms like "eclipse", "crescent moon", and "falling star".)

My friend Smail, who works at the local school, has just started a new blog (with some help from me): you can go read it at

More from Tabelbala

“əl`əyš ṭazu, əlma iri العيش طازو، الماء إيري
əlləṛḍ gəndza, ssma bini.” الأرض قندا، السما بيني

“Couscous is ṭazu, water iri,
earth is gəndza, sky bini.”

- A locally widely known ditty summarising Kwarandjie. Its antiquity is shown by the second line: across Songhay ganda and beene mean “earth” and “sky”, but in Kwarandjie their cognates have been restricted to “down” and “up”, with “earth” and “sky” normally expressed by dzəw and igərwən respectively - and the latter, while Berber, appears from the absence of an a before the w to have been borrowed not from Middle Atlas Tamazight nor Tabeldit (“ksours sud-oranien”) but from a language similar to Zenaga, which has not been spoken around here since the Reguibat's ancestors reached the area some five hundred plus years ago. Readers who know a little Berber may assume the r is a typo, as I at first did on reading Cancel, but it is not: I take it to be the product of dissimilation (n...n > l...n) plus the common Kwarandjie sound shift l > r. On the other hand, for the rhyme (such as it is) to work, the sound changes –e > -i and –a- > -e- [and thence > i] / _r, at least, must already have happened.

The work continues. I've filled up five notebooks and made another few recordings, some quite interesting; my sketch grammar has reached 30 pages. I've gotten to know quite a large number of faces, something I find far more difficult than memorising words - although the latter is made easier by the habit of many people in this town of testing my knowledge of every noun they can think of on the spur of the moment.

Kwaṛa-n-dyəy, like many non-Arabic languages of the region, has a coded register in which Arabic loanwords or other expressions likely to be comprehensible to an outside listener are replaced with other expressions. This register is quite extensive, and is known to many, though not all, speakers in all three towns. Since all numbers above 3 are Arabic borrowings, and hiding numbers is often particularly useful in trade, it perforce uses a base-5 counting system based on kembi "hand", a situation with parallels in several other Saharan oases which has led some to the probably mistaken idea that proto-Berber was base 5.
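To make the arithmetic behind "counting by hands" concrete, here is a minimal Python sketch that splits a number into whole hands of five plus a remainder. The function names and the English glosses are my own illustration; the actual expressions used in the coded register (including how a remainder of 4 is said) are not given above, so nothing here should be read as the real wording.

```python
def in_hands(n: int) -> tuple[int, int]:
    """Split n into (whole hands of five, leftover units)."""
    if n < 0:
        raise ValueError("expected a non-negative count")
    return divmod(n, 5)


def gloss(n: int) -> str:
    """English gloss of the base-5 decomposition (illustrative only)."""
    hands, rest = in_hands(n)
    parts = []
    if hands:
        parts.append(f"{hands} hand" + ("s" if hands > 1 else ""))
    if rest:
        parts.append(str(rest))
    return " and ".join(parts) if parts else "0"


print(gloss(13))  # 2 hands and 3
```

The point of the decomposition is that only the small numerals and the word for "hand" are needed, so no Arabic numeral ever has to be spoken.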

I have an open request from several interested citizens of Tabelbala for a competent archeologist, geologist, paleontologist, or other specialist in disciplines relevant to understanding and preserving the area's heritage to come and study it. If you know or are such a person, please take note: you will find ample assistance and encouragement, and be welcomed hospitably. (Relevant bibliographical references would also be great.) The ruins of several medieval if not older towns are buried under the sands here, and some people at least would like to see them studied. You would be expected to make whatever information you find available to the town's citizens, and to help lobby for a local museum to house the finds.

Qriqesh just came in, and requests that I put his nickname online for all to see: so here it is. (His real name is Abdallah Yahiaoui.)

Update from Tabelbala

I've gotten a clearer idea of the linguistic situation here. Apart from Kwarandjie, which, as I've said, is the main language of three of the four villages (and used to be the lingua franca of the oasis), and of course Arabic, there are a few families here (in Ifrenyu and el-Karti) speaking Tamazight - specifically, the dialects of the Ayt Khebbach and Ayt Atta tribes of southern Morocco. They seem to have traditionally been nomads in the general vicinity who settled down here in the seventies or so, although much outnumbered by the (Hassaniya Arabic speaking) Rgaybat who constitute what little population there is in the desert surrounding Tabelbala. I've been doing a little fieldwork with them, focusing on vocabulary that might be relevant to Kwarandjie etymologies, and have been struck by how rarely they seem to provide the source for Kwarandjie's Berber vocabulary - even when the word is quite common in Berber (as it often is not), like adra for "mountain", they seem to use a different one (in this case, tawrirt). The speakers I've spoken to have a rather impressively large vocabulary, but often seem quite embarrassed to speak the language at all - one at first quoted me a local proverb, "Esshelha ma hi klam, weddhen ma hu lidam" - "Shelha isn't language like ghee isn't (some sort of highly valued medicinal fat product)."

The list of tense/aspect/mood particles continues to grow - a particularly impressive example I encountered yesterday was `a-s-a`a-m-k-dri (1S-neg-prox.fut.-subj.-yet-go), meaning something like "I've totally stopped going." (ma tlitsh nruh kamel). Actually, -s-a`a-m is a contraction that probably deserves a single lexical entry, but never mind. Note the `ayns in historic Songhay vocabulary here, deriving from original gh.

The phonological issues I mentioned last time turn out to derive historically from deletion of an emphatic r, not from any significant difference in the consonants themselves. Not sure yet how to deal with them synchronically, though...

Brief update

Today I'm in Bechar, with a somewhat more effective net connection; I apologise for the poor appearance of the previous one, which I sent by email. I am currently sitting with Omar Yahiaoui (who asked me to mention this); I've been hanging out mainly with the Yahiaoui extended family of Kwara, one of the three villages that speak the language, although I'll have to balance this soon with an extended stay in Ifrenyu, the other main village with which there is a certain amount of mostly but not totally friendly rivalry. The phonology keeps getting bigger and richer - never mind all the emphatics and labiovelarised consonants and affricates, there are a few contrasts involving h and gw that have clear effects on surrounding vowels but that I simply can't seem to hear. I've made a few more recordings, and done a bit more sightseeing, going into the erg a little - sand dunes and not much else from Tabelbala all the way to Ougarta.

PS: should have written Shelha, not "Shelhiyya", in my previous Bechar post.

Fieldwork post I

Hello everybody! I am alive and well in Tabelbala, speaking "Korandje" ([kwɑ́:ṛɑ-n-dʒji], probably /kwaṛa-n-dzyəy/) on a regular basis and writing down vast numbers of obscure words enthusiastically volunteered by the more fluent older generation of speakers, while trying to figure out grammatical issues from natural speech overheard or addressed to me and from occasional elicitation when I can find someone who will put up with it. I've filled a notebook of more than two hundred pages with notes already, but I've done only a very small amount of recording, which has so far proven harder to negotiate.

The people here are incredibly hospitable, and the area remarkable for its beauty - an oasis of gardens (ləmbyu) and irrigation canals (tsirgyanən) between the mountain (aḍṛa) and the erg (amrər). However, it is remarkably isolated, connected to the outside world (and the nearest towns are a very long way away) by only a single road and a single telephone line, which has not been conducive to job creation; there is talk of a second road to Adrar, which might help. Its inherent touristic potential, which some here are keen on expanding, is difficult to realise in the absence of any hotels.

The language is clearly endangered. People from about 30 and up speak it routinely (though all speakers appear to speak dialectal Arabic to a native standard), but most younger speakers seem to have a primarily passive knowledge of the language, always answering in Arabic or struggling to find even basic vocabulary, though this is more true of some families than others. Most people I've spent time talking with have been keen on the idea of reviving its fortunes, or even teaching it in school "like Kabyle", but some have been rather more negative, dismissing it as not a proper language and of no use.

There's some very interesting stuff going on in the language, including what I take to be a sound shift in progress of affricated [kç] (the sound that Cancel wrote as <χ>) to affricated [ts] (of which speakers are well aware.) Cancel's <th>, incidentally, is itself [ts]. The tense/aspect/mood system has been reworked much more radically than existing materials indicated, with a past copula (also used for what I so far interpret as a past progressive) ga showing up before personal agreement rather than, like aspect and mood markers, after. The phonology is complex: tone and most vowel contrasts have definitely been lost, but a lot of emphatics have been gained, including such unusual sounds as affricated [ṭṣ]. Vowels reduced to schwa, and lost coda r's, reappear in verbs when you add a 3rd person direct object pronoun "clitic" (but not when you add a 3rd person indirect object one.) The language has a specialised focus marker, which interacts interestingly with subject person/number markers. The vocabulary is of major interest in its own right for what it has to say about the history of this part of the Sahara. It defies any simple effort to pin down the immediate source of the agricultural technologies that have allowed the Belbalis to survive and flourish here: "palm" is Songhay kungu, but "date" Berber tsini; a foggara is Songhay bəng-bini as long as it stays underground, but Berber tsargya once it emerges.

I am currently in Bechar, the local capital, and plan to take the four-hour coach trip to Tabelbala tomorrow inshallah. After corresponding for some time with a computing student here whose family is from Tabelbala, I've finally met him in person; he seems a nice guy. Interestingly, when speaking Arabic, he calls Korandje "shelHiyya" - the name usually applied to the Berber dialects of the region and of southern Morocco. This suggests to me that this word may have become a generic term for non-Arabic local languages, in which case all statements about a given oasis around here speaking "Tachelhit" or "Shelha" need to be checked carefully.

Naturally, I've combed the local bookstores (there aren't too many, but there is a university here after all); I only found one book relating to the linguistics of this rough area, a work by Mohamed Bouali (2004) on the attitudes of people in the Berber-speaking oasis of Boussemghoun in western Algeria to a number of issues, including their own and other Algerian languages. Not very surprisingly, these seem closely aligned with moderate conservative opinion in Algeria generally, rather than showing any particularly strong similarity to the spectrum of attitudes common in Kabylie; his interviewees displayed pride in their language, but also identified fairly strongly with Arabic, and were more often than not hostile to the idea of teaching Berber ("Tachelhit") in school. The author reports that, unlike in some nearby oases, the Semghounis have consistently retained Berber and show no signs of shifting to Arabic as a home language. In Bechar itself, all talk I've heard has been in Arabic; the local accent is distinguished by a lot of affricated t's (ts) and frequent use of "wah" for "yes", but is overall even closer to my own dialect than I was expecting.

Fieldwork begins

First of all: a belated Eid Mubarak everybody!

Second: Tomorrow I'm flying to Algeria, and heading to Tabelbala soon after to document Korandje (Kwarandji), a northern Songhay language spoken only there. I don't yet know how accessible the Internet is in Tabelbala, so blogging may be even more irregular than so far, or even impossible. If it is reasonably accessible, I plan to recount my experiences doing research out there - so stay tuned...

Language learning link

I recently found out just how much downloadable language material - mainly Peace Corps manuals, grammars, and dictionaries - there was on a site called ERIC, including but not limited to Hassaniya, Fulfulde, Kyrgyz, Kazakh, Malagasy, Tashelhiyt ("Tashelheet"), Hausa, Ilocano, Sinhala... It seems only fair to spread the word.

Child triglossia: an anecdote

While I was in Algeria, I was watching a cousin's toddler play with one of those toy computers that play a word when you press on a letter. The words, in this case, were in Arabic - Fusha (Classical), of course. Several times she repeated after the machine and then, with a very emphatic tone, added the Darja (Algerian dialect) translation - for example:

Machine: qird (monkey)
Toddler: qird! šadi!

Then she got to "bird" (Arabic ṭā'ir) and came out with the memorable line:

ṭā'ir! u b-əṛ-ṛumiyaa nqulu-lu ḷa ṭa'ir.
(ṭā'ir! And in French we call it ḷa ṭa'ir.)

Gets you wondering, really... how do kids acquire di/triglossia? It's certainly not just a matter of what they learn in school, as this case illustrates.

Berber Qur'an translations

Thanks to the efforts of various press agencies, there has been a story floating around the Internet this year about the "first Tamazight Quran". In reality, it's more like the last first Tamazight Quran. I'll try to describe the situation to date as best I can; if any readers know of relevant material I have omitted, please tell me!

You will find occasional reports online that the medieval Berghouata kingdom put together a Berber Qur'an translation; these are misunderstandings. If you look at what al-Bakri (the oldest source I can think of offhand for this) actually says about the Berghouata, he says their second king Salih ibn Tarif claimed to have received a revelation in Berber in 80 chapters which he called a Qur'an, but whose contents (some of which al-Bakri gives translated into Arabic) had nothing to do with the Qur'an. In fact, a later Berghouata king massacred thousands of Muslims in his kingdom for refusing to convert from Islam to the Berghouata religion. It would not surprise me at all to learn of a medieval Berber translation of the Qur'an; I know of such works for Turkish, Spanish, Persian, and Kanuri. However, discounting occasional ill-sourced reports of a no longer extant Almohad one, the earliest reference to such translation that I have come across is a fatwa by the Moroccan shaykh Al-Ḥasan bin Mas`ūd al-Yūsī in 1102 AH (1691 AD) judging translation of the Qur'ān into Tamazight to be permissible, mentioned in Jouhadi Hocine's translation's foreword; such a fatwa implies sporadic translation, but, as far as I am aware, no full written translation from the period has turned up.

Oral translations may be another matter. In Mali, there is reportedly a longstanding tradition of oral translation of the Qur'an into Tamasheq, the Berber language of the Tuareg; this was recorded in a series of 44 cassettes in 1989 by the Ahmed Baba Historical Documentation and Research Centre. Similar cases may well have existed elsewhere.

Serious published efforts at Qur'an translation seem to begin in the 1990s. The earliest partial translation to be printed seems to be Kamal Nait Zerrad's 1998 Lexique religieux berbère et néologie : un essai de traduction partielle du Coran. This work is primarily an effort to design a "purist" Berber religious vocabulary, one drawing on native lexical resources rather than Arabic borrowings, with a translation of a selection of suras added essentially as a proof of feasibility (the book's author, a well-known Berber linguist, does not in fact appear to be particularly strongly committed to Islam.) While the translation is basically into the author's native Kabyle, neologisms and words from other Berber varieties are so frequent as to make the translation rather difficult for native speakers of Kabyle to follow. This work uses the Latin orthography that has become more or less standard in Kabyle usage.

In 2003, with the Moroccan government's decision to raise the position of Tamazight and bring it into the school system, the first complete Berber Qur'an translation (strictly speaking, translation of the meanings of the Qur'an), Jouhadi Lhocine Baamrani's Tarjamat ma`ānī lqur'ān billuġati l'amāzīġiyyah: nūrun `alā nūr / tifawt f tifawt, many years in the making, finally appeared. This complete Moroccan translation (described years earlier, along with the political controversy surrounding it, by The Economist) has priorities more in accordance with one's expectations of such a work: the author's preface concerns itself primarily with reassuring the reader of the work's interpretative accuracy (the author uses the Warsh reading, and, in cases of difficulty, relies on examination of relevant hadith and well-known commentaries), and of the work's religious justification. However, conservative readers have expressed unease at his relative lack of religious training. The work is written in the Tashelhiyt of southern Morocco, a considerably less Arabic-influenced dialect than Kabyle; nonetheless, like Nait-Zerrad although not to the same extent, the author often chooses to use pure Berber vocabulary even when obscure in preference to Arabic loanwords, explicitly drawing an analogy to Fusha Arabic. "Some may say: I do not understand much of the Tamazight in which he has written, and I am Amazigh! I reply that not everyone who speaks Arabic, for example, understands the Qur'an which came down in faultless Arabic. Do not forget, dear reader, that a child spends much effort in gradually learning his native language, so why should you expect to know literary/pure (faṣīħ) Tamazight in a single go?" 
Apart from some Tifinagh on the cover, the author uses Arabic characters, regularly used by Tashelhiyt authors to write in their native language since the sixteenth century, although he substitutes a variant of Chafik's new orthography (writing all vowels as long instead of short, and using zay with three dots for the emphatic ẓ) which has grown in popularity. He has also published a translation of an-Nawawi's Forty Hadith, as well as some poetry.

Also in 2003, correlating to the Algerian government's gradual expansion of the role of Berber in efforts to conciliate opposition in Kabylie, a Kabyle translation of six hizbs, by Si Muḥend Muḥend Ṭayeb of the Ministry for Religious Affairs (with help from Said Bouziri, Djafar Oulefki, and Mohamed Tahar Ait Aldjet), was published by the King Fahd Complex for the Printing of the Holy Qur'an. This translation uses Arabic characters, but not in the systematic way of Jouhadi Lhocine's translation; rather than establishing a fixed phonemic orthography, it gives the impression of trying to fit Kabyle into Arabic characters in much the way that many people try to fit it into French ones, without any consideration for the phonemic rules of the language. For example, strictly phonetic assimilations across word boundaries, like n+r > rr, are written with shadda, and phonetically short a and ə are both written in the same way, with fatha. It was criticised by activists for its extensive use of Arabic vocabulary - although I rather suspect this makes it more readable to the average Kabyle speaker than the strict purism of other editions. A complete translation by the same people is to appear shortly; it is this which has been carelessly reported as "the first Tamazight Qur'an".

However, when it does appear, it's not even going to be the first complete Kabyle translation. In late 2006, the poet and chemist Remḍan At Menṣur beat the Ministry to it; I saw copies of his complete translation in shop windows in Algiers and Paris, but have not yet got one. This work uses the Latin and Neo-Tifinagh orthographies on facing pages, and comes with an audio CD. The more extreme anti-Islamic wings of the Kabyle autonomy movement criticised the very fact of his translating this as promoting "Arabisation and Islamisation" (huh, who would have thought that translating the Qur'an might be construed as promoting Islam?) A more conservative reader, while praising the work, suggested that it would have been better off using Arabic script, and that the difficult task of translating with an eye to the correct interpretation required the efforts of a whole committee rather than a single man.

More translations are no doubt to be expected, and their quality - both interpretative and linguistic - will hopefully improve. But this cannot take place in isolation; the form Berber translations of the Qur'an end up taking will inevitably be heavily influenced by the form of the language that ends up being taught in the schools and used in other publications, and politics will continue to affect both whether and how the text is translated. It will be interesting to see how the situation develops.

(Non-)universal quantifiers

Many readers will recall Everett's argument that Pirahã had no universal quantifier because statements featuring what he had originally translated as "all" would generally be considered true even if a small part of the original had to be excepted. I'm not sure the conclusion follows (universal quantification could still be its prototypical meaning, for example), but if it does, then it could equally well be argued to be true of Darja; a lot of statements about "all" that I hear made here are ones which the speaker is perfectly aware (and accepting) of the existence of exceptions to, and it took me a while to go against my mathematical training and realise that when they said "all", they didn't mean it in the logicians' sense. Actually, I suspect the same is true of many idiolects of English. This was brought to mind by a little example I heard yesterday: hađa ybi` kŭll ħaja, bəṣṣəħħ əlfṭayəs ma ybi`š هذا يبيع كُلّ حاجة، بصّح الفطايس ما يبيعش "This guy sells everything, but hammers he doesn't sell."


I went to Tizi-Ouzou today, where I bought a few Kabyle-related books. The smallest, a tiny little handbook entitled Cahier d'écriture de l'alphabet tifinagh, or Attafttar, from Editions Baghdadi, Algiers (no date of publication or author given), provided a bit of a surprise. I thought I had seen every variation of Neo-Tifinagh there was to see, but I was wrong; this illustrated children's book presents yet another one. It's essentially Chaker's Neo-Neo-Tifinagh, but with one or two forms from the Academie Berbere alphabet (b, s) plus at least one sign, Arabic ع with the curves straightened out into right angles, that I've never seen anywhere else. You know, I'm not enormously in favour of Neo-Tifinagh to begin with, but the proliferation of variant forms that you find is just ridiculous; in a sufficiently Algerian mood, I could easily believe many of them are put together by anonymous opponents of Tifinagh seeking to weaken it by spreading confusion.

Impersonal vs. personal "you"

In English, "you" is equally used in a literal sense (referring to the addressee) or in an impersonal sense (referring to an arbitrary imagined experiencer.) In Darja, at first sight, it looks the same way - and for speakers of any one gender, this is true. However, looking at speakers of both genders allows you to realise that the distinction is grammaticalised. Addressee "you" agrees in gender with the addressee; impersonal "you" does not agree in gender with the addressee, but with the speaker. Thus a woman speaking to a man will say tṛuħ "you go" when "you" refers to the man addressed, but tṛuħi when it refers to an arbitrary person, like "When you go by bus, it takes a while."
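The agreement rule just described can be written down as a tiny Python sketch. The two verb forms are the ones quoted above; the function name, the "m"/"f" labels, and the boolean flag are my own scaffolding, and the sketch covers only this one verb in the second person singular.

```python
# 2nd person singular "you go", masculine and feminine forms from the post.
FORMS = {"m": "tṛuħ", "f": "tṛuħi"}


def you_go(speaker: str, addressee: str, impersonal: bool) -> str:
    """Pick the form of "you go" according to who controls gender agreement.

    Addressee "you" agrees with the addressee;
    impersonal "you" agrees with the speaker.
    """
    controller = speaker if impersonal else addressee
    return FORMS[controller]


# A woman speaking to a man:
print(you_go("f", "m", impersonal=False))  # tṛuħ  (the man addressed)
print(you_go("f", "m", impersonal=True))   # tṛuħi (an arbitrary person)
```

When speaker and addressee have the same gender the two readings come out identical, which is why the distinction is invisible until you compare speakers of both genders.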

Bits of Darja morphology

I heard a great word today: tməhbəl تمهبل "behave like a crazy person", a verb derived by consonant extraction from the noun/adjective məhbul مهبول "crazy", itself a passive participle (of a form familiar from Fusha, with a prefix m- plus an infix) from the verb hbəl هبل "go crazy". A similar process, whose intermediate stages do not however survive in Darja, brought about tməsxər تمسخر "kid, joke"; cf. Fusha saxira سخر "mock, laugh at".

Regular diminutives are formed with an infixed i, optionally with a feminine ending added. In Dellys, nouns normally form them either by simple infixation or (for three-letter roots) infixation with an added y after the infix, but adjectives add an extra consonant after the i - either a copy of the second root consonant (eg kħiħəl كحيحل "a little black", from kħəl كحل "black"), or a w (ṣġiwəṛ صغيور "tiny" from ṣġiṛ صغير "small"). "One", unlike other numbers, agrees in gender with its referent, a property of adjectives; thus it is perhaps unsurprising to learn that it can take the adjective-style diminutive wħiħda "a little one."
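As a rough sketch of the adjective pattern, the Python function below builds a diminutive from three root consonants: infix i after the second consonant, then either a copy of that consonant or w, then schwa and the final consonant. The function and its parameter names are hypothetical, and it has only been checked against the two adjectives cited above; which adjectives take the copy and which take w is not stated, so the caller has to choose, and forms with a feminine ending such as wħiħda are not handled.

```python
from typing import Optional


def adjective_diminutive(c1: str, c2: str, c3: str,
                         extra: Optional[str] = None) -> str:
    """Adjective-style diminutive: C1 C2 i X ə C3.

    X is a copy of the second root consonant unless `extra`
    (in the examples above, "w") is supplied.
    """
    filler = c2 if extra is None else extra
    return f"{c1}{c2}i{filler}ə{c3}"


print(adjective_diminutive("k", "ħ", "l"))             # kħiħəl
print(adjective_diminutive("ṣ", "ġ", "ṛ", extra="w"))  # ṣġiwəṛ
```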

The English "substrate" in my Darja shows through most clearly, I suspect, in noun agreement. Gender for inanimate objects has always given me trouble; the gender of a noun is almost always obvious from its form, yet if I don't concentrate I still tend to revert to some kind of default gender when the noun in question is unmentioned or is in a different sentence. Number is a lot easier; but even there there are a few points I need to focus on. For one thing, I tend to give words like səṛwal سروال "trousers", which take plural agreement in English, plural agreement when they should be singular. (I am not alone in this kind of error; talking to a friend born and raised in a largely Kabyle area of the Casbah in Algiers, I heard that in his neighbourhood they consistently say things like əlma bardin الما باردين "The water is cold (pl.)", because "water", aman, happens to take plural agreement in Kabyle.) But there is a large class of nouns in Arabic in general where the unmarked form is essentially a mass noun but is used in most contexts where English speakers use plurals, and the singular (or rather the singulative) is formed from it by adding a feminine marker. For example, if you want to say "I bought some figs today", or "I made some fig jelly", or "I see a fig tree", for any of these you would use the unmarked bəxsis بخسيس "figs"; for "I ate a fig" or "This fig is tasty", you would add the feminine ending to get bəxsisa بخسيسة; if you said something like "There are three figs on the table", where the figs are individuated but there's more than one of them, you would pluralise the singulative (feminine) form and say bəxsisat بخسيسات. So in most contexts, the English plural "figs" gets rendered by the mass noun bəxsis. The difficulty that arises is that, as a result, I tend to think of words like bəxsis as plural, and give them plural agreement, when in fact I should be giving them singular agreement like any other mass noun.
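The three-way system for "fig" can be summarised in a short Python sketch. The suffixes are taken from the forms quoted above; treating them as plain string concatenation is my simplification and is only checked against bəxsis, and the dictionary keys and agreement labels are my own names for the categories described in the paragraph.

```python
def noun_forms(collective: str) -> dict[str, tuple[str, str]]:
    """Map each category to (form, number agreement it should take)."""
    singulative = collective + "a"       # feminine marker: one individual item
    individuated = singulative + "t"     # plural of the singulative
    return {
        "collective": (collective, "singular"),        # mass noun, "figs" in general
        "singulative": (singulative, "singular"),      # "a fig", "this fig"
        "individuated_plural": (individuated, "plural"),  # "three figs"
    }


for category, (form, agreement) in noun_forms("bəxsis").items():
    print(f"{category}: {form} ({agreement} agreement)")
```

The error described above amounts to reading the first row as plural because its usual English translation is "figs".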

Dellys news

Under the circumstances, a little non-linguistic posting seems to be called for. Some criminal exploded a bomb here in Dellys yesterday, down by the port just east of the centre. Neither I nor anybody I know was hurt, but the hospital was kept very busy. I haven't heard much clear news yet, but I understand there have been some 30 deaths, mostly young Army conscripts along with a couple of port workers standing nearby. Everyone here is shocked (of course) and angry - an attack on such a scale around here is unprecedented - but continuing with business as usual (apart from the port of course, which is closed last I checked.)

That story you can follow in the news easily, and, here on the scene though I am, I am probably worse informed about it than someone reading all the press releases; I haven't been down to the port since I got here. So I'll talk a bit about the environmental situation instead. On the part of the beach just east of sid-əl-məjni سيدي المجني, things get a bit depressing. After a couple of years of being free of sewage (zigu, from French (le)s égouts), the nearby stream (wad-əl-gəṭṭaṛ) is once again flowing a nice greyish-black colour. Only now, since the 2003 earthquake (əz-zənzla), the sea it used to flow into is a good deal lower (technically, the land is higher), so the black stuff just accumulates along the shallows to its west, and the stream's delta, coated with nitrate-loving vegetation, is poking out a good ten metres into what used to be the sea. The ecological effects are interesting; little river trees and bushes are appearing all along the shoreline of the black spot in what used to be sand, and swifts (xŭṭṭayəf) are swooping all over the area eating up the bugs it nourishes. At this rate, the whole bit between the stream and Sid-el-Medjni will probably not be a beach at all for much longer; it'll turn into some kind of swamp. I'm told the effects are visible further out to sea as well; off that part you only see one species of seaweed, and practically no sea urchins (lŭggi). A recent week-long heat wave got forest fires breaking out all over the country, to the point that the sea water was coloured with ashes. And don't get me started on the problem of sand theft (sand is needed for building, and nonstop, if slow, building is happening everywhere all the time), which has turned what used to be the widest sand beaches in the area, at Takdempt and Sahel, into thin strips disappearing into the sea, and led to the collapse of at least one house that I'm told had once been a good hundred metres from the beach, at whose windows seawater now laps.
People say environmental issues aren't relevant in Algeria; personally, I find they tend to be a lot more conspicuous here than in the UK.

Speech errors and phonology; borrowing basic vocabulary; and a correction

A child's speech error I heard reported the other day provides a great illustration of the psychological reality both of allophones and of the skeletal tier in phonology: /twəħħəš-t توحّشْت / [twæħ'ħɛʃt] "I missed" (in the emotional sense), became /twəššəħ-t / [twɪʃ'ʃæħt]. The ħ and š are permuted without affecting the (in this case grammatically determined) position of the length; and the phonetic realisations of the schwas between them change to match their new neighbours. (ħ makes an adjacent schwa more a-like; neither š nor w affects the value of neighbouring schwas, and in the absence of any external influence, the default phonetic value of a schwa is roughly [ɪ].)
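
For readers who like to see this spelled out, here is a minimal Python sketch of the error. It keeps the segments (the melody) and their lengths (the skeleton) in separate lists, swaps ħ and š on the melody only, and then spells out the schwas from their neighbours. The function names and the exact schwa rule (æ before ħ, ɛ after ħ, ɪ otherwise) are my own assumptions, inferred from just the two forms above; this is not a general account of the dialect's vowel allophony.

```python
# Hypothetical two-tier representation: melody = segments, slots = length.
IPA = {"š": "ʃ"}  # phonemic symbol -> phonetic symbol used in the transcription

def swap(melody, a, b):
    """Exchange two segments on the melodic tier; the skeleton is untouched."""
    table = {a: b, b: a}
    return [table.get(seg, seg) for seg in melody]

def realise(melody, slots):
    """Spell out a phonetic string from the melody and its timing slots."""
    out = []
    for i, (seg, n) in enumerate(zip(melody, slots)):
        if seg == "ə":
            prev = melody[i - 1] if i > 0 else ""
            nxt = melody[i + 1] if i + 1 < len(melody) else ""
            # Assumed rule, inferred from the two attested forms only.
            if nxt == "ħ":
                seg = "æ"
            elif prev == "ħ":
                seg = "ɛ"
            else:
                seg = "ɪ"
        out.append(IPA.get(seg, seg) * n)
    return "".join(out)

melody = ["t", "w", "ə", "ħ", "ə", "š", "t"]
slots = [1, 1, 1, 2, 1, 1, 1]  # the long consonant is the fourth position

target = realise(melody, slots)                 # the adult form
error = realise(swap(melody, "ħ", "š"), slots)  # the child's form
print(target, error)
```

Because the length lives in `slots` rather than on the consonant itself, whichever consonant lands in the fourth position comes out long, which is the point of the example.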

While counterexamples to the naive idea that basic family terms like "father" and "mother" are unborrowable are easy to come by, Algeria presents a particularly striking case; so many of the people who addressed their own fathers as baba are raising their own children to address them by the Fusha term 'abi, or even the French papa. (And baba itself may be a Berber borrowing, though the evidence is far from compelling.)

Contrary to what I wrote in my MA thesis, and to the intuitions of the native speakers I asked, negated nouns can certainly occur without ħətta, "any"; I heard a good four or five examples today, including waħda ma yxəlli "he won't leave a single one", ma ysərbi lħaja đ̣ukka "it serves no purpose now", xəlq ma kayən "there's not a soul". Something for me to investigate when I get back to that.

Tuesday, September 04, 2007

The Straw Road

Every time I come to Dellys, in between all the visiting and swimming, I find myself marvelling at the full linguistic resources of colloquial Algerian Arabic, and learning new words and constructions; this time is no exception. I've been typing some of these up, and plan to post them on my sporadic Internet visits.

One of my aunts taught me a new Darja term the other night - triq əttbən طريق التبن, the Milky Way, literally the Road of Straw. In the old days before there were clocks, people used to tell a prayer time by it: when it first became visible in the sky, they would go pray `Isha. That test may not work so well in modern city environments, but Dellys is still dark enough at night that you would probably get it about right. She also mentioned nəjmət əl`iša نجمة العشاء and nəjmət əlməġrib نجمة المغرب (the `Isha and Maghrib stars), which would presumably be Venus and Mercury respectively. I'll have to look into the rest of the astronomical terminology sometime, if I can find anyone who knows it.

Leiden conference on African languages and linguistics

I'm just back from a conference at Leiden, and heading off to take a holiday in Algeria very soon; here's my interim report to tide my readers (to whom I apologise for the interruption in service:) over.

Leiden turns out to be a very nice little town, clean, quiet, full of canals, and practically empty. I imagine all that changes when the students get there! The conference was good - I got to talk to several other people working on Berber and Songhay, and heard some interesting talks. To name a few, Jeffrey Heath discussed the remarkable ways in which syntax affects tone in Jamsay Dogon; Maarten Kossmann argued (and I am inclined to agree) that the Mande influence discernible in southern but not northern Songhay, and especially strong in "Inner", or Eastern, Songhay, is particularly to be linked to Soninke, and is not a feature of proto-Songhay; Alain Bassene presented a paper on topicalisation and focus in a Jola variety where both proved to behave in a manner almost completely identical to their behaviour in Algerian Arabic; and Mary Pearce presented in impressive detail what turned out to be a clear ongoing sound change (a shift from phonemic tone to phonemic voicing) in the Chadic language Kera. My own paper was perhaps a little too esoteric even for a conference like this - I'm not sure that more than two or three people in the audience actually cared about sound shifts in Songhay - but I heard corroborating evidence for one of my statements immediately afterwards, which was satisfying.

I also picked up a pleasing number of free language/linguistics books, including review copies (look for them on Afrikanistik sometime in the indefinite future) of a new dialectological atlas of the Moroccan Rif and of a book by Pichler on the history of Tifinagh which (I'm not sure whether to be amused or annoyed) briefly quotes me verbatim regarding Neo-Tifinagh without attribution or even quotation marks.

Phrasebook fiction

The bewilderingly odd and sometimes strangely evocative phrases that some phrasebook compilers apparently expect to be useful have caught the attention of many people besides me, although I do think the Andamanese one I found a year or two back takes the cake. However, until a few days ago, I had not come across phrasebook-based fiction. I can now report that there is at least one example of such a genre: Gene Wolfe's "Useful Phrases" (a short story in Strange Travelers):
Even so, many of the phrases thus translated struck me as peculiar. Who would wish to say "You no longer recognize her," "Mine is a similar address," or "I will tell the trees to be quiet"? I studied all these phrases diligently, however, so much so that I sometimes found myself murmuring in my bath, Pava pacch, tîsh ùtra. Neéve sort dufji. "How like a ghost are the fountain's waters! The flood carries away my riches." The paper is marvelously thin, and yet completely opaque; the print sharp-edged even when viewed through my best magnifying glass...

I addressed to him the phrase I had so often rehearsed: Semphonississima techsodeliphindera lafiondalindu tuk yiscav kriishhalôné! "How delightful to discover in the shrinking sea a crystal blossom of home!"

He dropped my advertisement and ran from the shop.
There would be no point in summarising the story - it's not about plot so much as mood. If it has a moral, it must be that you should keep phrase books of unknown origin for unidentifiable languages only if you want your life to become more exciting and dangerous.

A coming reanalysis in Arabic and Berber

In historical linguistics, when a word or string of words is reinterpreted as consisting of a different set of words (for example, when "an ewte", which is what people used to say in Middle English, becomes "a newt"), this is called reanalysis. Here are two somewhat parallel examples.

In classical Arabic, one word for "he came" is jā'a. "With" is bi-. "He came with X" is jā'a bi-X, and can usually be translated as "he brought X". In some parts of the paradigm, the two words remain more or less adjacent* - eg ya-jī'u bi- "he comes with"; in others, they are separated by an agreement morpheme - eg jā'-at bi- "she came with", ji'-nā bi- "we came with". In all modern dialects, the glottal stop is lost, and so are the final short vowels, which would regularly yield jā b(i)-, yijī b(i)-, jā-t b(i)-, etc. But in fact, this common construction was reanalysed as a single word, so you get forms along the lines of jāb, yijīb, jāb-it, jib-nā...

In Proto-Berber, as across most Berber languages, the word for "come" was something like as (perfect form y-usa, habitual yə-ttas, etc.) However, Proto-Berber also had a very productive system of "extensions", particles near the verb marking the direction in which the verb's action took place: towards (d) or away from (n) the speaker. Naturally, "come" normally featured the d extension. In many common forms, it was adjacent to the stem (eg y-usa d "he came", nə-ttas d "we come", etc.); in others, it was not (eg ad-d as-əγ "I will come", usa-n d "they came", etc.) In at least one variety - the dialect of the Beni Snous near Tlemcen, in western Algeria - this d was reinterpreted as part of the word "come"; so there (with voicing assimilation of s to z when next to d) you get forms like yusəd, nəttasəd, ad azd-əγ, uzd-ən.

* Strictly speaking, even in this one they're separated by a short vowel marking mood.

"The inadequacy of traditional Islamic languages"

A Pakistani physicist weighs in on the state of science in the Islamic world in Physics Today, a magazine I used to subscribe to during the very brief period when I was doing physics at university. The article's quality is variable; he makes some good points (like the alarming publication and patent statistics, and the way that authoritarian attitudes inhibit hypothesis forming), but also some poor ones (his Bourguiba-esque suggestion that fasting and prayer are incompatible with hard work, for example, is laughable.) Anyway, he throws in an observation on language worth discussing:
Second, the inadequacy of traditional Islamic languages—Arabic, Persian, Urdu—is an important contributory reason. About 80% of the world's scientific literature appears first in English, and few traditional languages in the developing world have adequately adapted to new linguistic demands.
In what sense can a language be inadequate for a purpose? What I take him to be referring to is the inadequacy of technical terminology. Specialists in any field have to learn a set of fairly complicated ideas to which they can refer concisely and unambiguously (phoneme, wh-movement, coronal, theta-role; integration, isomorphism, standard deviation...) Such terms often do not refer to anything normally noticed by people, and therefore have no equivalent in any language until one is created or borrowed. Various specialists or committees have undertaken to create such terms (in Arabic, at least, they generally eschew the idea of borrowing them.) But in many cases the result is a chaos of competing terms. For "linguistics" alone, different Arabic dictionaries will suggest اللسانيات، الألسنيات، اللغويات، علم اللغة, and even other terms. I have three dictionaries of linguistic terminology in Arabic sitting on my shelf; randomly looking up "retroflex", for example, I find ارتدادي، التوائي، انقلابي all given as translations.

One might expect that the efforts of specialists to communicate with each other would end this problem, with the community of linguists (say) rapidly converging on a single term and abandoning the rest, just as such synonyms for "retroflex" as "cerebral" or "cacuminal" have largely disappeared in English. But there we have a vicious circle. At present, to be a good specialist in many fields, you need to have studied them in some Western language, and to be following a literature on them that's largely in a Western language, and to be communicating with colleagues who mostly speak that same language. In fact, given how little on average is spent on research in the Islamic world, in many such fields the odds are high that you won't even be able to find employment without going to or staying in the West, further reducing your opportunities to talk about, or teach, the subject in your own language - and if you do stay in your own country, you may find that specialist terminology dictionaries, especially those printed in other countries, are hard to find. So if ambiguities or misunderstandings come up, the easy thing to do is to switch to English or French or the like; the ideological incentive to use your own language is not supplemented by any significant material or practical incentive. And thus the language gets slowly pressured out of another domain. It's not inevitable, but to change it you'd have to create more incentives and more opportunities for people to stay and to teach in their own countries.

Of course, for Arabic in particular but to a lesser extent for Urdu and Persian, there is a second factor to be considered: diglossia, the wide gap between the language spoken in everyday conversation and the one considered suitable for writing or teaching in. This in itself has some negative implications for teaching science, although the obstacles it sets up to participation by the masses are far smaller than those set up by the use of an unrelated foreign language like English or French. But that is another topic for another time.

Miscellaneous linguistics news

I've been keeping busy lately, looking at some rather interesting grammatical facts about the Berber dialect of Ngousa (Ingusa) which I plan to talk about (among other things) at Paris in September. The vocabulary is also interesting; tiḥemẓin "couscous", for example, presumably somehow from timẓin "barley". However, a casual trawl of the news today revealed a surprising number of linguistics-related stories, which I thought I'd share:

Orangutans Play Charades When Misunderstood: For extra points, outline a scenario for the development of a fixed learned vocabulary from sufficiently frequent efforts in a small population to play this sort of charades.

Brain Responses in 4-Month-Old Infants Are Already Language Specific: 4-month-old German and French babies deal better with words stressed in accordance with the laws of their soon-to-be-native language.

Parts, Wholes, and Context in Reading: A Triple Dissociation: "Do fast readers rely most on letter-by-letter decoding (i.e., recognition by parts), whole word shape, or sentence context? We manipulated the text to selectively knock out each source of information while sparing the others. Surprisingly, the effects of the knockouts on reading rate reveal a triple dissociation. Each reading process always contributes the same number of words per minute, regardless of whether the other processes are operating." I wonder whether this applies in other written languages or is a peculiarity of English.

And a little multimedia on an English regional dialect from the BBC: Pitmatic.

Writing codas, from Sylhet to Winnipeg

In Greek-based scripts (like Latin or Cyrillic), unless a consonantal letter is followed by a vowel letter, it is assumed not to be followed by a vowel. This seems natural enough if you're used to it; but if you look at it differently, it's rather wasteful. The commonest sound to follow any given consonant is usually a vowel, not another consonant, so if you allow a single letter to represent a consonant plus a vowel you're saving space and effort.

But if you do that, then how do you represent the fact that a consonant is not followed by a vowel? Different writing systems use different solutions. In alphabets that have stuck more closely to their Canaanite prototype, like Arabic, Hebrew, Syriac, or (traditional) Tifinagh, you normally don't bother: a consonant may be followed by a vowel or may not, and you rely on the reader to figure it out. However, sometimes the reader needs additional cues: maybe the word you're writing is obscure, or two words have the same consonants, or it's very important that the text be read exactly right with no possibility of error. In that case, in Arabic, Hebrew, and Syriac, you mark what follows each consonant with a little sign above or below the letter - one sign for "a", say, another for "i", and another to indicate that nothing follows it. Such a sign is necessary if you're still mainly using the system with no vowel marking, because if you left the letter unmarked it would mean not that the letter had no vowel but that what vowel, if any, followed the consonant should be deduced from context.

Typical Indic scripts, such as Devanagari (the script used for Hindi and Nepali), adopt a rather different solution. A consonant letter on its own is to be read with a default vowel, short a ([ʌ]); a consonant followed by a consonant is written as a single "conjunct" letter, formed in any of several ways, but usually by either putting the second letter underneath the first or taking away a line on the right of the first letter and joining it to the second. On the plus side, this yields much of the compactness of a vowel-optional system without any of the ambiguity, and means that each letter is pronounceable on its own; on the minus side, this means fonts have to include a much larger number of letter forms.
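
As an aside on how this plays out on computers: in Unicode, as far as I know, Devanagari conjuncts are not stored as separate characters. A cluster is encoded as consonant + virama (a sign cancelling the default vowel) + consonant, and it is left to the font to supply the conjunct shape. The short Python sketch below illustrates that encoding; the helper name `cluster` is my own invention, and how the result is actually drawn depends on the font and rendering software.

```python
import unicodedata

VIRAMA = "\u094d"  # DEVANAGARI SIGN VIRAMA: cancels the default vowel

def cluster(*consonants):
    """Join consonant letters so that only the last keeps its default vowel."""
    return VIRAMA.join(consonants)

ka, ta = "\u0915", "\u0924"  # the letters ka and ta

kata = ka + ta         # two plain letters, each read with its default vowel
kta = cluster(ka, ta)  # stored as ka + virama + ta; a font may draw a conjunct

for ch in kta:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

So the "much larger number of letter forms" is a burden on the font rather than on the character set: the stored text for a two-consonant cluster is just three code points.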

Sylheti Nagri is an Indic script formerly (up to the 1950s or so) in use in the district of Sylhet, in eastern Bangladesh. Like Devanagari, it represents consonant-consonant sequences using conjuncts. However, its users were often also familiar with the Arabic script, where letters could be combined into ligatures whether or not they had vowels between them. This may have inspired them to do something rather unusual for an Indic script: develop vowel-consonant conjuncts, such as a+m, a+l, i+n... and consonant-vowel-consonant conjuncts, like pi+r, mo+t... In fact, judging by the examples in the Unicode proposal, it seems that, for at least some historic users, Sylheti did not have a conjunct system at all, just a ligature system.

One very nice solution is that adopted in Canadian Syllabics, the family of writing systems used by a number of Native American tribes in Canada. The name is potentially misleading: I prefer to reserve the term "syllabary" for writing systems like hiragana, where different syllables differ from each other unpredictably. In Canadian Syllabics, for example Cree, the shape of a symbol represents the consonant, while its orientation represents the vowel that follows it, and length or labialisation may be represented by dots. If no vowel follows the consonant, then the base shape is simply written small and superscripted, using the a-orientation, or for labialised consonants the u-orientation.
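
The shape-plus-orientation principle is regular enough to be written as a small lookup table. Here is a minimal Python sketch for three consonant series, using the Unicode Canadian Syllabics code points as I understand them; the function name `write` and the input format (a list of romanised syllables) are assumptions of mine for illustration, and the sample input is an arbitrary string of syllables, not a real Cree word. Length and labialisation dots are left out.

```python
# Each row is one consonant shape; the four columns are its orientations
# for the vowels e, i, o, a (in that order).
VOWELS = "eioa"
SERIES = {
    "p": "\u142f\u1431\u1433\u1438",
    "t": "\u144c\u144e\u1450\u1455",
    "k": "\u146b\u146d\u146f\u1472",
}
# Vowelless consonants: small raised versions of the a-orientation shapes.
FINALS = {"p": "\u1449", "t": "\u1466", "k": "\u1483"}

def write(syllables):
    """Turn romanised syllables such as 'pi' or bare 'k' into syllabics."""
    out = []
    for syl in syllables:
        cons, vowel = syl[0], syl[1:]
        if vowel:
            out.append(SERIES[cons][VOWELS.index(vowel)])
        else:
            out.append(FINALS[cons])
    return "".join(out)

sample = write(["ta", "pi", "k"])  # arbitrary syllables, for illustration only
print(sample)
```

A syllabary in the stricter sense, like hiragana, would need one unrelated entry per syllable; here a table with one row per consonant and a fixed column order is all that is needed.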

Language endangerment in Yorkshire

Several members of the British Parliament took a few minutes out from worrying about issues like Iraq, the housing shortage, and global warming to put together an Early Day Motion expressing their concern about the fate of the Yorkshire dialect:
That this House is concerned at the recently published research indicating that words are disappearing from the Yorkshire dialect because of the influence of the internet, social mobility and globalisation; and furthermore supports the work of the Yorkshire Dialect Society in continuing to promote what is, after all, the best English regional accent in the world.
The amendments proposed are also worth a look, featuring such phrases as "a slow national convergence towards the monochrome mush of effete estuarial English". For what it's worth, I am rather inclined to agree that Yorkshire may have "the best English regional accent in the world" - although two MPs proposed to amend this to "after the Lancashire accent" - and I'm glad to see a bit of appreciation for dialectologists, but I find it difficult to be all that concerned about the loss of a few well-documented local words (supposedly due to people watching national media and getting out more) in a fairly widely spoken dialect of one of the world's most flourishing languages, when whole languages are disappearing virtually undocumented every month due to factors like kidnapping children or beating them when they speak their language.

Harun ar-Rashid and the Golden Apples of the Hesperides

I recently heard a rather good folk tale from my father about the adventures of (a completely mythologised) Hārūn ar-Rashīd during his foreordained seven years of hardship, living as a poor man dressed in goatskin nicknamed Bou-Krisha (بو كريشة). One element of the story fits nicely with the previous two posts' theme of cultural survivals from the classical era. The king gathers his sons-in-law and his would-be son-in-law Bou-Krisha, and tells them that he is terribly ill, and to cure him they must go and bring him:
ət-təffaħ ən-nifuħ التّفّاح النّيفوح
əlli yṛədd əṛ-ṛuħ اللي يردّ الرّوح
m-əs-səb`a jbal مسّبعة جبال

the fragrant apple
that restores the soul
from the Seven Mountains
For the tale's purposes, of course, all that matters about this evocative phrase is that it refers to something that it will take a long and arduous quest to get. But the historically minded listener may be excused for speculating on the phrase's origin.

Etymologically, the phrase is mildly interesting. nifuħ is unexpected, and possibly distorted to fit the rhyme - a more normal term, with obvious Classical Arabic origins, would be nəffaħ; it might have arisen by contamination from əlli yfuħ “which smells” (especially since əlli in Kabyle is ənni.) But it may be possible to look deeper.

Ceuta (Arabic səbta سبتة) is an ancient Moroccan port town at the edge of the Straits of Gibraltar which has been part of Spain since 1668. Its name derives from a longer Latin one - Septem Fratres, the Seven Brothers, said to be a reference to seven hills around the city; it was a wild area, among the last places in North Africa where elephants were found (as noted by Pliny.) And the region around the Straits of Gibraltar is where the gardens of the Hesperides were supposed to be located - where the Golden Apples grew. Is ət-təffaħ ən-nifuħ one of the Golden Apples?

Berberised Afro-Latin speakers in Gafsa

One reader of my last post asked how late Latin (or some descendant thereof) continued to be spoken in North Africa. The answer is, pretty late: the latest attestation I came across on short notice seems to be in the major medieval geographer Al-Idrisi (12th century) who, describing Gafsa in southern Tunisia, notes that:
وأهلها متبربرون وأكثرهم يتكلّم باللسان اللطيني الإفريقي.
Its inhabitants are Berberised, and most of them speak the African Latin tongue.
He even gives one word of their dialect:
ولها في وسطها العين المسماة بالطرميد.
In the middle of the town is a spring called the ṭarmīd (perhaps to be related to Latin thermae).
One interesting thing to note about this statement is that he said that the town was Berberised - in other words, that, in the very century when the Banū Hilāl were rapidly spreading through Tunisia and Libya (a subject he has fairly harsh things to say about), Berber culture was prestigious enough to be adopted by members of other cultures, in particular the remaining Roman or Romanised towns, in the area. Gafsa, of course, speaks Arabic now, but several nearby villages still spoke Berber in the 1800s, and two, Sened and Majoura, well into the 1900s.

Chenanith b'Libya - in the 11th century AD?

Anyone interested in North African languages who doesn't speak Dutch should immediately check out Bulbul's posting on Latino-Punic. The Phoenicians brought their language with them to North Africa when they founded Carthage and other cities. Carthage was destroyed, of course, but many other cities continued to speak Phoenician for longer; however, like Arabic in more recent times, it changed a lot under Berber influence, and this later dialect is usually called Punic. This language was spoken by St. Augustine, who quotes a number of Phoenician words, such as salus (< shalu:sh < shalo:sh < shala:sh < thala:th) "three", in his works. In eastern Libya, as it happens, Punic continued to be written even after the Phoenician alphabet was forgotten; this body of inscriptions, using the Latin alphabet to write Punic, is called (logically enough) Latino-Punic, and a comprehensive database of such inscriptions is available from Leiden. Recently, as Bulbul points out, a thesis was submitted at Leiden on Latino-Punic and its Linguistic Environment; I would love to read it.

The twist in this tale is that Phoenician may have survived into the 11th century AD! Al-Bakri (whom I've mentioned before) enigmatically says of the inhabitants of Sirt in Libya that:
لهم كلام يراطنون به ليس بعربي ولا عجمي ولا بربري ولا قبطي ولا يعرفه غيرهم
They have a speech in which they jabber which is neither Arabic nor Ajami (by which he probably means Latin but might mean Persian) nor Berber nor Coptic, which no one but them knows.
The location (in eastern Tripolitania) is about right for it to be Punic, and if it were Greek you would expect him to know, considering he cites (more or less correctly) the Greek etymology of طرابلس (Tripoli) on the next page. So was Punic still spoken in the 11th century? Your guess is as good as mine, but it looks plausible.

Galileo's sociolinguistics and free software

Just came across an interesting quote from a law professor in the free software movement on Europe's shift away from diglossia:
[W]ith the name Galileo Galilei, we associate two of the most important cultural responses to the quandary of possessed physics.

The first is an insistence upon freedom from censorship, that is "e pur si muove" -- determination to prohibit the ownership of physics by an entity rich enough and powerful enough to define its physics as the only permissible physics, the only available physics, for most ordinary people. And second, the first significant attempt in the history of the West to write scientific literature at the state of the art in a vernacular language, accessible to everyone.

Galileo Galilei's decision to publish in Italian is as important as his decision to risk confrontation with the Church, for what it says about the fundamental pillars of free science in the history of the West. Not merely, in other words, an insistence upon the freedom of ideas to work their will in skilled hands, but a determination that the ideas which motivate the world, which explain its behavior and which render it controllable, should be universally accessible to people regardless of their ability to acquire enough social surplus to have Latin.
I'm not sure whether the details of his account are accurate, but this has always been one of the strongest arguments against diglossia. The availability of universal free education goes a long way to mitigating the problem; but it's still not cost-free, since all the time devoted to learning the high language is time that could have been devoted to learning something else. (Of course, there are also practical issues regarding the quality of teaching provided - but that's another story.)

Is Omotic Afroasiatic?

Omotic, a small group of non-Cushitic, non-Semitic languages spoken in the highlands of Ethiopia, has always been the odd one out in Afroasiatic; by anyone's tree it is the first to have split off, and the noted Chadicist Paul Newman expressed scepticism about its membership in the family. I know little about Omotic, or Cushitic for that matter, but after reading a few sketch grammars in Omotic Language Studies, I found it very difficult to imagine these languages as Afroasiatic; with Berber or Hausa or Beja or Semitic the cognates are instantly visible, but none of the most familiar grammatical morphemes or lexical items seemed to be present. However, a paper I just came across by Rolf Theil is the first I've seen to present an argument against the hypothesis, and a pretty good one at that. There are parts I would question - for example, the suggestion that pronouns are unreliable (they are conspicuously unreliable in regions where extensive politeness systems have developed, like East and Southeast Asia, but I didn't think highland Ethiopia fell in that category) - but the overall argumentation seems good. In particular, the attempt to show that a roughly equal number of similarities can be observed between Omotic and families other than Afroasiatic is on the right track - if Omotic were to have more similarities with Afroasiatic than with any other family, then merely pointing out problems with some of those similarities would be inadequate. I'll be interested to see the reactions of people better acquainted with the family.

On another note, I passed my upgrade presentation yesterday - yay!

Ugaritic inscription

Last weekend I got a chance to indulge my longstanding passion for ancient Semitic languages at the Louvre. The Ugaritic collection was, as you might expect, especially good; I took many photographs, including this particularly clear one here, a ceremonial axe from the 13th or 12th century BC. Since the Ugaritic alphabet only contained some 30 letters, it's easy enough to read the inscription (turn it 90 degrees counterclockwise), although no word dividers are present:

xrṣn rb khnm

which the museum caption translates as "la hache du Grand Prêtre". xrṣn is presumably "axe"; I can't find it in my small dictionary, but it looks like it might be related to xurāṣ "gold" (itself cognate not only to Hebrew ḥarūṣ, but also to Greek chrysos, a Semitic loanword.) rabb- means "great one", identical to Arabic ربّ "lord" and cognate to Hebrew rav "great one; rabbi". kāhin- is "priest", identical to the Arabic كاهن "soothsayer" and cognate to Hebrew kohen "priest" - yes, the same word from which the surname "Cohen" comes. -īm is the oblique plural, identical to Hebrew -īm (which however is no longer inflected for case) and cognate to Arabic -īn. Once you start looking, it's so easy to spot the connections between Semitic languages; no wonder people a thousand years ago noticed.

Brothers in Law

Reading a Language Log post I just noticed a rather laughable argument apparently being used in the Jose Padilla trial (according to AP):
FBI wiretaps played in court for jurors contain frequent references to "brothers," which prosecutors say means mujahedeen fighters looking for a battle. Defense lawyers contend the term is a common expression among male Muslims.
If the AP article has correctly represented the prosecution argument, they must be either absolutely desperate or thoroughly unqualified. As the defense correctly states, "brother" is fairly commonly used between male Muslims, in accordance with a hadith saying that "A Muslim is the brother of a Muslim"; it carries absolutely no implication of being a fighter. Google will turn up numerous examples; for an illustrative sample, consider this slightly frivolous MPAC forum discussion about finding motivational speakers, where other Muslims are called by the terms "brs", "bros", "akhee" (Arabic for "my brother"), and "Brother".

However, in fairness, other reports indicate that the prosecution claims that the defendants used some kind of system of codewords: they claim innocuous words like "picnics", "football", and "marriage" were used with much more sinister intended meanings. If their claim is that "brother" meant "fighter looking for battle" in this alleged code, as opposed to in normal usage, then that might not be completely absurd; I haven't found any transcripts, so I can't attempt to evaluate the plausibility of such a claim.

The usual arguments about the correct translation of "jihad" and "Allah" apparently came up as well - I don't think I'll bother adding to the thousands of web pages discussing that issue.

Popper, Sapir, and international auxiliary languages

I've been reading some of Karl Popper's work lately, and found it quite interesting (and clearly written, which one doesn't always expect of philosophers.) Both his political and his scientific writings are dominated by the same important theme: no one can get closer to the truth without being willing to put their beliefs to the test, and the more different people doing the testing, the less likely they are to overlook a flaw in the idea. Thus dictatorship and censorship - in any power structure, governmental or academic - are not just bad, but intrinsically prone to get worse results. I noticed that he took this view to have implications for language policy too:
The adoption of rationalism implies, moreover, that there is a common medium of communication, a common language of reason; it establishes something like a moral obligation towards that language, the obligation to keep up standards of clarity and to use it in such a way that it can retain its function as a vehicle of argument. That is to say, to use it plainly; to use it as an instrument of rational communication, of significant information, rather than as a means of 'self-expression', as the vicious romantic jargon of most of our educationists has it. (It is characteristic of the modern romantic hysteria that it combines Hegelian collectivism concerning 'reason' with an excessive individualism concerning 'emotions': thus the emphasis on language as a means of self-expression instead of a means of communication.) (Karl Popper, The Open Society and its Enemies vol. II: Hegel and Marx, Routledge 1945/2003, p. 264)
While this quote is mainly about how you should use a given language, rather than which language to use, it clearly suggests the desirability of some international language, and Sapir's idea of how such a language should be built happens to be rather Popperian in spirit:
It [the international auxiliary language] must, ideally, be as superior to any accepted language as the mathematical method of expressing quantities and relations between quantities is to the more lumbering means of expressing these quantities and relations in verbal form. This is, undoubtedly, an ideal which can never be reached, but ideals are not meant to be reached; they simply indicate the direction of movement. (p. 51)... National languages are all huge systems of vested interests which sullenly resist critical enquiry... (p.60) Intelligent men should not allow themselves to become international language doctrinaires. They should do all they can to keep the problem experimental, welcoming criticism at every point and trusting to the gradual emergence of an international language that is a fit medium for the modern spirit. (p. 64, Edward Sapir, "International Auxiliary Language" in Culture, Language, and Personality, Berkeley: University of California)
So it is all the more ironic to find that Popper's paragraph continues with this:
And it implies the recognition that mankind is united by the fact that our different mother tongues, in so far as they are rational, can be translated into one another. It recognizes the unity of human reason.
This seems to imply that linguistic diversity is worthless: if something is rational, it can be explained in any language, and if it can't be explained to me in my language, it must be irrational. I rather suspect that the Sapir-Whorf hypothesis was partly intended as a rebuttal to this sort of argument. If your language tends to blind you to the differences in logical form between sentences with superficially identical structures (his example in the very essay quoted above is the perfective/imperfective distinction in English) and makes it easy to spot the differences between ones with different structures, then the ideal auxiliary language should allow you to express logical form as unambiguously as possible; and to be able to make a language free of your own linguistic biases and blind spots, you will have to carefully study many languages of as many different types as possible.

Of course, that raises the question: is there such a thing as an overall better language, or is that whole approach misconceived? Algebraic notation is unquestionably superior to English for describing physical laws, but it's not a very effective way to make a grocery list. In practice, people use different languages, and different technical vocabularies embedded in the same language, for different purposes.

Talk at SOAS: The typology of number borrowing in Berber

Just a quick note for London readers: I'm going to be giving a talk on Wednesday in room B111 at SOAS, on "The typology of number borrowing in Berber" - basically the same talk I gave in Cambridge, expanded a bit (for example, I've added a section on Northern Songhay.)

Why people say silly things about historical linguistics

I recently realised that a lot of popular misconceptions about language evolution derive from uncritical use of the "family" metaphor. In families, a person has kids and then stays around, alongside the kids, for many years... they may live to see their great-grandchildren. The parent and the child may show a family resemblance, but will certainly be separate individuals. If you're told that languages come in "families", and "descend" from past languages, then it seems perfectly reasonable to imagine those ancestor languages lingering on alongside their descendants, and to imagine that the minor changes occurring daily within the language you speak are completely different from the sharp discontinuities that would have to occur for a new language to emerge.

But languages don't work that way at all: a language's "descendants" are (with rare exceptions) simply the various results of its own changes in the mouths of various communities. It's usually meaningless to talk about one living language being the "ancestor" of another one; in such cases, both are descendants of the same ancestor, even if (as infrequently happens) one has changed significantly less than the other. (Revived languages, like Sanskrit, are arguably an exception.) The same mistake is frequently made in popular understandings of biology, for the same reason; people imagine that chimpanzees (say) are humans' ancestors, when in reality the very fact that chimpanzees exist alongside humans proves that, while both species share a common ancestor, that ancestor was neither of them (or, looking at it another way, has equal right to be described as either of them.)

Friday, May 25, 2007

Songhay materials

Songhay is a close-knit family of languages in West Africa, spread by the medieval Songhay Empire, that happens to be rather relevant to my PhD. It has no close relatives; the best guess is that it's Nilo-Saharan, but if it were spoken in the Americas, it would undoubtedly be classed as having no relatives whatsoever, and the resemblance to other languages is not strong. It has some rather interesting syntactic patterns. Throughout the family, NPs are organised as follows: possessor - head - adjective - determiner - plural marker, eg Kwarandzie adra kedda gh yu (mountain small this pl) "these small mountains", Sidi L`arbi n iz yu n targa (Sidi Larbi 's child pl 's canal) "Sidi Larbi's children's canal". While at least one other West African family, Mande, has this NP order, the only case I am aware of offhand outside Africa is Ulwa in Nicaragua. Even rarer worldwide is a feature found in a number of centrally located Songhay varieties: having two distinct classes of verb, one - the vast majority - requiring SOV word order (ie preverbal objects), and the other (including such verbs as "follow", "marry", "want", "see", "fear", "bring", at least in Gao) requiring SVO order (ie postverbal objects.)
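To make the NP template concrete, here is a minimal sketch in Python that linearises a noun phrase in the order possessor - head - adjective - determiner - plural marker. The function name build_np and its parameters are hypothetical, invented purely for illustration; the only linguistic data it uses are the linker n and the plural marker yu from the two Kwarandzie examples above, and it makes no claim to cover anything beyond those two phrases.

```python
# Illustrative sketch only: a toy lineariser for the Songhay NP template
# possessor - head - adjective - determiner - plural marker.
# GENITIVE and PLURAL are taken from the two examples quoted in the text;
# everything else (names, parameters) is a hypothetical modelling choice.

GENITIVE = "n"   # possessive linker, as in "Sidi L`arbi n iz"
PLURAL = "yu"    # plural marker, as in "adra kedda gh yu"


def build_np(head, possessor=None, adjective=None, determiner=None, plural=False):
    """Return the words of a noun phrase in the template order."""
    parts = []
    if possessor:
        # The possessor (itself possibly a full NP) comes first, then the linker.
        parts += [possessor, GENITIVE]
    parts.append(head)
    if adjective:
        parts.append(adjective)
    if determiner:
        parts.append(determiner)
    if plural:
        parts.append(PLURAL)
    return " ".join(parts)


# "these small mountains"
small_mountains = build_np("adra", adjective="kedda", determiner="gh", plural=True)

# "Sidi Larbi's children's canal": the possessor is itself a plural NP
children = build_np("iz", possessor="Sidi L`arbi", plural=True)
canal = build_np("targa", possessor=children)

print(small_mountains)
print(canal)
```

Because the possessor slot accepts a whole NP, the plural marker of the inner phrase ends up before the second linker, which is exactly what the second example shows.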

Anyway, for me this has been a great week for finding materials on Songhay. Jeffrey Heath has updated his webpage, adding work in progress on the nearly undocumented dialect of Humburi as well as several others (I will find the Tadaksahak wordlist especially useful; aside from Songhay, readers may also want to check out his Dogon materials.) On a missionary website (though I strongly disapprove of such work, it does have useful byproducts), I found a good hour of fairly comprehensible audio in Tadaksahak, an inadequately documented Northern Songhay language important for my purposes; and SOAS library just informed me that the copy of Ousseina Alidou's unpublished dissertation on Tasawaq, an even more important language for comparisons to Kwarandzie, has at long last arrived from Hamburg. Above all, I got an email from a kind contact from Tabelbala, with some more Kwarandzie audio files.

For other Songhay materials, try Relative Clauses in Tadaksahak, Some Verb Morphology Features of Tadaksahak, Northern Songhay Languages in Mali and Niger, Southern Songhay Speech Varieties in Niger, The Zarma Website, Zarma Dictionary, Notions élémentaires pour apprendre le Zarma, La dénomination en Zarma, Lexique kaado-français...

Prenominal adjective borrowed into Arabic from Persian?

A major interest of mine lately is the way in which lexical borrowings can affect syntax, dragging bits of the source language's word order with them. I came across what looks like a nice example of this in a book on Gulf Arabic. In Kuwaiti dialect, as in all dialects of Arabic, adjectives normally follow the noun. However:
The (Persian) adjective kooš precedes the noun it qualifies. It does not occur in association with defined nouns. It is not inflected for gender or number. Thus:
    kooš walad, bint    a good boy, girl
(T. M. Johnstone, Eastern Arabian Dialect Studies, London: Oxford University Press 1967, p. 147.)
Only trouble is, my Persian grammar doesn't say anything about the Persian adjective in question being pre-nominal, and virtually all adjectives in Persian are post-nominal. Does anyone know more about this?

Learn Oneida!

Came across a great new site, the Oneida Language Revitalisation Program, consisting mainly of an extensive audio phrasebook of Oneida, the Iroquoian language native to upstate New York. There's also a teaching grammar and dictionary at Oneida Language Tools, and some video at Tracy Williams' site. It's great to see this much material online for a language with fewer than two hundred speakers; this should make it a lot easier for would-be speakers to make a good start at learning it.

Translation and propaganda

Horrifying news from Palestine - a Hamas Mickey Mouse is telling Palestinian kids to "annihilate Jews"! Or not. In fact - after running on a wide range of media, few of which I suspect will bother to correct their story - this story was quickly and independently exposed by several sources, such as Angry Arab, Ali Alarabi, and Brian Whitaker; MEMRI (the Israeli secret services-linked outlet that provided it) made the mistake of providing a video allowing any Arabic-speaker to confirm their mistranslations. With just a bit of spin, the kids' show in question was turned from merely propagandistic to verging on Bond-villain-esque:
* nqāwim, "we will resist", is rendered as "we will fight";
* biṭuxxūnā l-yahūd "the Jews shoot us", is rendered as "we will kill the Jews" (!);
* 'astašhid "I will be a martyr" as "I will commit martyrdom" (I don't think that's even an English expression, but never mind);
* 'ustāđiyyat al-`ālam, literally "professorship of the world" (in context, they clearly mean being at the intellectual forefront of the world), is rendered as "masters of the world".

When challenged on the translation of "biṭuxxūnā l-yahūd", the ex-colonel in Israeli military intelligence who runs MEMRI, Yigal Carmon, apparently resorted to insisting that because "yahūd" (Jews) comes at the end, it must somehow be the object! ("Even someone who doesn't know Arabic would listen to the tape and would hear the word 'Jews' is at the end, and also it means it is something to be done to the Jews, not by the Jews.") It is rather difficult to imagine someone running an organisation dedicated to translating Arabic being unaware that subjects in Arabic commonly follow the verb (especially when a pronominal object suffix (-nā "us") is present, as here).

The moral in all this for English-language media is clear: when some helpful organisation sends you a free translation of some foreign-language article or program, do look a gift horse in the mouth, and check the translation with an independent source first. As for readers/viewers of the media in any language - caveat lector! But you no doubt already knew that.

I'm a bit busy getting my core chapter ready to hand in, so just a quick post on an English word I spotted lately: agflation. The term confirms the ever-increasing productivity of "-flation" as a suffix; the phenomenon is rather alarming.

Who has more than 40 words for camels?

Geoffrey Pullum is annoyed to hear a reporter state that "Arabic famously has over 40 terms for different types of camel" - not so much for whether it's true or not as because "they are presented as if profound and significant and clearly supportive of exoticizing claims about far-away nomadic peoples like Arabs and Eskimos, when in fact even if they were true they would be utterly unsurprising." I suppose I should point out that it is true - unsurprisingly. I don't know much more than three (Classical) Arabic words for "camel" ('ibil camels in general, jamal male camel, nāqah female camel); but people I can only describe as camel geeks have taken the trouble to post lists of terms for camels of various ages, sexes, colours, and breeds - and there appear to be 38 more terms for female camels classified by their breeding status alone, and another 14 for different Saudi camel breeds (and I'm ignoring at least another 6 lists of specialised terms for camels just on that site.) If I were a professional camel breeder or something (perish the thought!) I would no doubt know all these terms; but otherwise, who needs them?

But "Arabic famously has over 40 terms for different types of camel" is nonetheless misleading. People have a habit of thinking of technical vocabularies as aspects of a language - English has n terms for types of dog, Japanese has x terms for types of seaweed, etc. But that doesn't really work. It's not English speakers that have more than thirty terms for places of articulation; it's linguists working in a certain tradition. If they publish in a different language but studied in the same place, they'll just calque or borrow the words; if they cut their teeth on Panini or Sibawayh - in the original or in English translation - they will use a differently organised vocabulary even if they're writing in English. Likewise, an English camel breeder (if such a thing exists) will most likely just borrow the terminology of whichever region he got his camels from wholesale, as sure as an English sushi restaurant will borrow Japanese sushi terminology. Some Fulani tribes have shifted to Songhay - primarily a language of town-dwellers, with few native words for livestock types - but kept their cattle-herding lifestyle; unsurprisingly, they've also kept Fulani's enormous set of words for different types of cow, and not suddenly forgotten how to tell one cow from another. If practically every speaker of a language knows a given technical terminology, then it might make sense to view it as a property of the language; but that certainly isn't the case here.

I'm writing my core chapter at the moment on Kwarandzie (Korandje), the Northern Songhay language of Tabelbala. (Ethnologue and basic historical common sense notwithstanding, it is specifically Northern Songhay in ancestry, sharing common innovations with the language of places like In-Gall rather than the city it used to trade extensively with, Timbuktu.) It is very heavily influenced by Berber, like other Northern Songhay languages, and I found a great example the other day: the word for "old woman" is tamghazinut. Amghar is a Berber word meaning "old man"; zinu is a Songhay word meaning "old"; and ta-...-t is a Berber circumfix forming the feminine, which, even though Kwarandzie doesn't have gender agreement of any kind, seems (judging by this remarkable case) to be marginally productive as a derivational affix. (Postvocalic r is regularly lost in Korandje.)

On the map below (which I put together for my thesis using Google Earth and GIMP), you can see something of the geographic improbability of the situation:

A query on LINGUIST List the other day asked for examples of other languages which, like English, have a verb "exist" distinct from the general-purpose existential "there is". In Algerian Arabic, such a verb has emerged in recent years through borrowing from French - and has enjoyed the rare distinction of being publicly condemned by the president:
"Ma tinsistish", "ma texistish", the President of the Republic repeated, exclaiming: "What is this language?! It's not French, nor Arabic, nor Tamazight." Looking irritated, he added "I've heard some say that this is a matter of Algerian specificities. If so, I refuse as a citizen to be a part of these specificities."
(L'Expression 9 Mar 2006. The quote can't be found on the official record, which just has a general condemnation of the "repulsive jargon we use in our daily dealings, in which it's sometimes hard to find our national language or even our original unadulterated colloquial dialect.")
Silly as it may sound, this borrowing does have advantages. kayen is the usual way of expressing "there is" in Algerian Arabic, but there are contexts in which it simply won't work - you could not reasonably render "I exist" as *kayen ana, or "Homer existed" as *kan kayen Homer (any more than "there's me" or "there used to be Homer" really mean the same thing.) You have to have recourse to loanwords for that, whether you use a Classical Arabic word (mawjuud, say) or a French verb.

Anyway, I did a quick web search for examples of this, not expecting much - but it seems that the online corpus of colloquial Algerian Arabic is bigger than you might have thought, and as full of code-switching as you might expect given who is most likely to have web access. Presidential proscription or not, a number of examples come up:
* hahahaha mazal yexisti had nou3 taa les femmes? (lol does this kind of women still exist?)
* Antik yerhem waldik, can u send me the link of derja dikssiounaire blenglizia ila yexisti bien sour (Antik please can u send me the link of Darja Dictionary in English if it exists of course)
* Hiphop ma zal yexisti (Hiphop still exists)
* en deux mots : ma yexistich en un mot makachou. (In two words: it doesn't exist. In one word: there isn't any.)
* c un ideal li ma yexistich (It's an ideal that doesn't exist)

Note the -i in this verb. In Algerian Arabic, Classical final-y verbs have mostly merged to end in -a in the past 3rd person and -i everywhere else: bka "he cried", yebki "he cries", bkit "I cried", ebki "cry!"; wella "he returned", ywelli "he returns", wellit "I returned", welli "return!". The rest of the stem remains constant throughout the conjugation; only the final vowel changes in such cases. Some of the commonest forms of French verbs happen to end in [e]: j'existais, il existait, exister, existez... So by an interesting compromise, throughout North Africa most French verbs are borrowed as final-y forms: tilifuna "he called", ytilifuni "he calls", tilifunit "I called", tilifuni "call!" (< téléphoner). exister is no exception.
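The pattern is regular enough to write down as a toy rule. The Python sketch below is only an illustration: the function name conjugate_final_y is made up, and the rule for inserting e before a stem that begins with two consonants is my own guess, chosen simply because it reproduces the contrast between yebki/ebki and ywelli/welli. It covers only the four forms cited above, not the full paradigm.

```python
# Toy model of the final-y pattern described above: -a in the past 3rd person,
# -i elsewhere, with the rest of the stem held constant.
# ASSUMPTION (not from any grammar): an "e" is inserted after the prefix slot
# when the stem begins with two consonants, which happens to fit the examples
# bka/yebki/ebki versus wella/ywelli/welli.

VOWELS = set("aeiou")


def conjugate_final_y(stem):
    """Return the four forms cited in the text for a given stem."""
    starts_with_cluster = (
        len(stem) >= 2 and stem[0] not in VOWELS and stem[1] not in VOWELS
    )
    e = "e" if starts_with_cluster else ""
    return {
        "past_3sg": stem + "a",             # "he X-ed"
        "present_3sg": "y" + e + stem + "i",  # "he X-es"
        "past_1sg": stem + "it",            # "I X-ed"
        "imperative": e + stem + "i",       # "X!"
    }


for stem in ("bk", "well", "tilifun", "exist"):
    print(stem, conjugate_final_y(stem))
```

For the stem exist, this yields yexisti for the present, matching the web examples above; the other forms it generates for that stem are simply what the rule predicts, not attested data.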

I wonder if other dialects have adopted this word too? I found one example from Tunisia, but that scarcely counts as a different dialect...

The Piraha debate heats up

A recent Language Log post alluded below the fold to two very interesting papers continuing the Piraha debate. Piraha Exceptionality: a Reassessment (Nevins, Pesetsky, and Rodrigues) reexamines Everett 2005's claims in light of Everett 1986, pointing out substantial and inadequately explained discrepancies between the two, and concluding with the rather hard-hitting statement that:
CA asserts, for example, that the embedded clauses amply documented and described in the earlier work are not actually embedded clauses, but offers no account or even acknowledgment of the numerous facts that argue in favor of the old view over the new. Similarly, CA offers as an argument for the new view the absence of long-distance wh-movement, but offers no new account of the data that in earlier work motivated the claim that Pirahã has no overt wh-movement of any kind. Likewise, as we have seen, CA asserts that Pirahã lacks quantifiers, but offers no coherent evidence against the proposal that the words described as quantifiers in the earlier work were described wrongly. In section 5, we have suggested that the situation is little better with respect to CA's discussion of Pirahã culture. CA simply asserts that Pirahã grammar has properties that, if true, would place it outside the pale of grammar and culture as we know it and would demand a special explanation for Pirahã's seeming uniqueness.
Everett replies in "Cultural Constraints on Grammar in PIRAHÃ: A Reply to Nevins, Pesetsky, and Rodrigues (2007)". He protests their efforts to provide comprehensible glosses for his 2005 sentences, objecting that considerations like what "the best free translation, the least exotic translation" is are irrelevant to the final analysis, which should rely solely on the truth conditions for the word's use, and that in any event "armchair linguists who wouldn't be able to pronounce a single Pirahã word" are in no position to give such glosses. Glosses like "cloth arm" are superior to glosses like "hammock" (which is what the compound in question means), because they help inform the reader about the complexity of Pirahã morphology. He also offers some interesting evidence on why he now analyses what he had previously termed a "nominaliser" (and had glossed as such in 2005) as a marker of old information. His core objection seems to be that such efforts as Nevins et al's are bound to fail because not all languages "translate fairly well into one another", and in particular, Piraha cannot be translated well into English; the comprehensible translations they propose don't have the same truth conditions, and the "literal" "translations" (yes, I think that was worth two pairs of scare quotes) that he sometimes gives (eg Everett 2005:624: “Smallness of cans remaining associated was in the gut of the canoe”; what would the truth conditions for something being "in the gut of the canoe" be, I wonder?) don't exoticise the language so much as attempt to render its genuine exoticism into English.

The debate looks like an argument about where the burden of proof lies: for example, does Everett need to provide more than two examples of how the truth conditions of Piraha "ba´aiso" differ from those of English "whole" (supposedly; the anaconda skin example works just fine for me in English, presumably implying that my word "whole" does not in fact mean the same as Everett's word "whole"), or do his critics need to go learn Piraha before they can question his claims about the meaning of "ba´aiso"? Are his critics justified in assuming that, in the absence of contrary published evidence, a given Piraha structure will have a familiar counterpart? Extraordinary claims require extraordinary evidence.

Incidentally, Everett's response provides another interesting example of differing truth conditions for a sentence in English. In his idiolect, apparently, the fact that, when outsiders come,
"They say hello and the Pirahãs say hello back. They ask if there are any fish and the Pirahãs say that there are fish or are not fish. Many Pirahãs can communicate at a rudimentary level in Portuguese. But they lose the gist of conversations very easily and often after someone has left they ask me to interpret... The best speakers of Portuguese among the Pirahãs speak it about as well as I do French. I can say a few things and find a bathroom, but I am not ready for any conversation of any depth at all."
is so perfectly compatible with all Piraha being "monolingual" that he can actually offer it as evidence for the claim.