Wednesday, December 07, 2005

Istanbul, bishops, Rohingya, and Tamezret

For this week, I thought I'd share two curiously parallel reanalyses I've come across recently:
  • Istanbul, apparently, derives from the Greek phrase eis ten polin, "in the city";
  • 'usquuf, "bishop" in Arabic, which apparently derives from a Coptic reinterpretation of Greek episkopos "bishop" as e-pi-skopos "to the skopos", due to which skopos was reanalyzed as meaning "bishop".

And a couple of interesting language sites I've come across: is a speaker's effort to promote the Rohingya language. The Rohingya are a Muslim minority group of the western coast of Burma. Like virtually all Burma's inhabitants, they have been seriously mistreated by the government. Apparently, their language is most closely related to (a dialect of?) Chittagongian Bengali. If anyone figures out what the acute accent is meant to indicate, do tell me... (in French) is all about the endangered Berber language of Tamezret in southern Tunisia, written by a descendant of speakers. Though he's not a trained linguist, this qualifies as quite an important documentation effort in its own right; as far as I know, the only other thing ever published on the Tamezret dialect was Märchen der Berbern von Tamzratt im Süd-Tünisien in 1900, more than a century ago.

Wednesday, November 30, 2005

Andamanese Phrasebook

Some time ago, I found a copy of perhaps the only five-language phrasebook for the Andaman Islands. The Andaman Islands are a remote island group south of Burma belonging to India. Up to the 19th century, they were inhabited by a number of tribes with Stone Age technology, no significant contact with the outside world, and languages extremely different from any spoken anywhere else. Some of their descendants remain there today, but are a tiny minority except in a couple of areas in the south, and most of their languages are extinct. This phrasebook was written by one M. V. Portman in 1887, when the islands had been turned into a British penal colony, for the use of government officials. Some of the entries paint an interesting picture of life in the colony; for economy's sake, I have given only the Aka-Bea equivalents.

"That woman is wearing his skull." Kát apáil lá ót chetta ngāūrók-ké.

"Some convicts have escaped, you must search for them." Jó chāōga lá kájré, áb átaká.

"This village is very dirty." Ká báraij lót láda-da.

"You will be bitten by sandflies and mosquitoes." Nyípá, ól bédig téil ngáb chá-pinga.

"Don't sing, or there will be a storm." Ngódá rámitóyo-ngayábada, élér-wulké.

"He is a boy, and may not eat turtle." Kát áká kádekada, óda yádi mék-nga yábada.

"Do you eat grubs alive?" Án wai ngó butu ligátí mék?

"You must bring them in by force." Ngó ítár pórawa.

And finally:

"Is there anyone here who understands his language?" Tén kárin míjólá áká teggí gádí-áté?

They just don't make phrasebooks like this any more...

Friday, November 25, 2005

Oldest African dictionaries

Some time ago, I came across a web page characterizing a dictionary of Kenzi Nubian dating from 1635 as "the oldest dictionary of an African language". Much as I appreciate their work in getting this very interesting material online, that claim is out by at least 500 years, if not 1000.

The oldest arguable dictionary of an African language that I am aware of so far is the Greek-Coptic Glossary of Dioscorus of Aphrodito, which apparently* dates back to the 6th century. Ibn al-'Assal's Arabic-Coptic sullam muqaffa, written in the 1200s, can quite unhesitatingly be described as a dictionary; following a then-current Arabic tradition, it was arranged alphabetically from the last letter of the word backwards (so, for instance, "apple" would be close to "people" but far from "apricot".) This arrangement was meant to aid in the composition of rhymed prose and verse. Other examples, many arranged semantically, are given by the Encyclopedia of Islam article Sullam (literally "ladder"). In Ethiopia, traditional Geez-Amharic lexicons are titled Sawasew, or "ladders"; I thus assume they are of Coptic inspiration, though I haven't been able to find any detail about when they started to be written.
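For anyone who finds the rhyme-ordering principle easier to see in action, here is a minimal sketch of it in Python. It only illustrates the idea of sorting from the last letter backwards, using English words in Latin script; the function name and the word list are my own invention for the example, and the actual sullam of course followed Arabic alphabetical order, not this one.

```python
def rhyme_order(words):
    """Sort words by their spelling read from the last letter backwards."""
    # Reversing each word makes an ordinary alphabetical sort compare
    # final letters first, then penultimate letters, and so on.
    return sorted(words, key=lambda w: w[::-1])


# Hypothetical sample, chosen to echo the "apple"/"people"/"apricot" example.
words = ["apricot", "people", "apple", "steeple", "carrot"]
ordered = rhyme_order(words)
print(ordered)
```

In the result, "apple" lands right next to "people" (both end in -ple), while "apricot" ends up beside "carrot" instead, which is exactly what a poet hunting for rhymes would want.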

After Coptic, the next oldest is an Arabic-Berber lexicon written in 1145, containing some two thousand words. Its writer, Abu Abdallah Muhammad ibn Ja'far al-Qaysi, better known as Ibn Tunart, was born in Qalaat Bani Hammad (modern Algeria) and wrote the work in Fez (Morocco) - a second little-known Algerian medieval linguist to add to my list after Ibn Quraysh! The book contains some paradigms and verbs, but consists principally of a list of Arabic nouns with Berber equivalents or glosses, arranged by semantic field; it inspired several later Moroccan lexica. Nico van den Boogert is working on republishing it.

What other African dictionaries predate Carradori's 1635 Nubian one? I don't know, but I can hazard some guesses - Geez, Swahili, Kanuri, and Nubian itself would certainly be worth checking.

* According to Adel Y. Sidarus, “Coptic Lexicography in the Middle Age, The Coptic Arabic Scalae,” in The Future of Coptic Studies ed. R. McL. Wilson (Leiden: E.J. Brill, 1978), 123

Thursday, November 10, 2005

A comparative linguist of the 10th century

Yehudah ibn Quraysh was a rabbi of the late ninth/early tenth century from Tahert (modern Tiaret, in Algeria.) Shocked to hear that the Jews of Fez in Morocco were neglecting the study of the Targum (an Aramaic translation of the Bible), he wrote a letter to them intended to establish that they could not and should not get by on the Hebrew alone - because other languages, especially Aramaic and Arabic, are essential in elucidating the Hebrew. In the process, he casually noted most of the correct sound correspondences between Hebrew and Arabic, and ended up writing what amounts to an extensive comparative dictionary of the three languages, even throwing in 9 Berber comparisons and 5 Latin ones at the end. He definitely hedges his bets on the cause of this obvious similarity between the three languages, but seems to come surprisingly close to the correct explanation - common descent - at times... something to bear in mind next time you read about Sir William Jones having founded comparative linguistics in 1786.

Here is what he had to say about it, as far as I can translate it:

I then resolved to put together this book for people with understanding, so that they should know that Syriac [Aramaic] expressions are scattered throughout the whole of the Holy Tongue in the Bible, and Arabic is mixed with it, and occasionally bits of Ajami [Latin] and Berber - and principally Arabic in particular, for in it we have found many of its strangest expressions to be pure Hebrew, to the point that there is no difference between the Hebrew and the Arabic except the interchange of ṣād and ḍād, and gīmel and jīm, and ṭet and đ̣ā', and `ay(i)n and ghayn, and ḥā' and khā', and zāy and dhāl. The reason for this similarity and the cause of this intermixture was their close neighboring in the land and their genealogical closeness, since Terah the father of Abraham was Syrian, and Laban was Syrian. Ishmael and Kedar were Arabized from the Time of Division, the time of the confounding [of tongues] at Babel, and Abraham and Isaac and Jacob (peace be upon them) retained the Holy Tongue from the original Adam. The language became similar through intermixture*, just as in every land adjoining a land of a different language we see intermixture of certain expressions between them and the spread of language from one to another; and this is the cause of the similarities we have found between Hebrew and Arabic...

The original was written in classical Arabic using the Hebrew script; I retranscribe it into Arabic script here:
فرأيت عند ذلك أن أؤلِّف هذا الكتاب لأهل الفطن وذوي الألباب، فيعلمو أن جميع לשון קדש (لغة القداسة: العبرانية) الحاصل في المقرأ (الكتب المقدسة) قد انتثرت فيه ألفاظ سريانية واختلطت به لغة عربية وتشذذت فيه حروف عجمية وبربرية ولا سيما العربية خاصة فإن فيها كثير من غريب ألفاظها وجدناه عبرانيا محضا، حتى لا يكون بين العبراني والعربي في ذلك من الاختلاف إلا ما بين ابتدال الصاد والضاد، والجيمل (حرف عبراني: ڱ) والجيم، والطِت (حرف عبراني: ط) والظاء، والعين والغين، والحاء والخاء، والزاي والذال. وإنما كانت العلة في هذا التشابه والسبب في هذا الامتزاج قرب المجاورة في البلاد والمقاربة في النسب لأن תֶרח (تِرَحْ) أبو אברהם (ابراهيم) كان سريانيا وלבן (لابان: حمو يعقوب) سريانيا. وكان ישמעאל (اسماعيل) وקדָר (قيدار) مستعرب من דוֹר הפלגה (زمان الاختلاف)، زمان البلبلة في בבל (بابل)، وאברהם (ابراهيم) وיצחק (إسحاق) وיעקב (يعقوب) عليهم السلام متمسكين بـלשון קדש (لغة القداسة: العبرانية) من אדם הראשון (آدم الأول). فتشابهت اللغة من قبل الممازجة، كما نشاهد في كل بلد مجاور لبلد مخالف للغته من امتزاج بعض الألفاظ بينهم واستعارة اللسان بعضهن من بعض، فهذا سبب ما وجدناه من تشابه العبراني بالعربي...

(Source: D. Becker, Ha-Risala shel Yehudah ben Quraysh, Tel Aviv University Press, Tel Aviv 1984.)

* I previously mistranslated this, having misread qibal as qabl.

Tuesday, November 08, 2005

Curiosities of Semitic articles

As David Boxenhorn noted in his comment to the previous post, the definite articles of Hebrew and Arabic display two odd-seeming properties:
  • agreement: if a noun has a given article, so does any adjective modifying it. Thus "a short boy" in Arabic is walad-u-n qaSiir-u-n (where -n marks indefiniteness, and -u marks the nominative case), whereas "the short boy" is al-walad-u l-qaSiir-u. (The vowel of al- elides when preceded by another vowel.)
  • In direct compounds of two nouns (possessed-possessor, or more generally modified-modifier), the first noun cannot take any article. Thus in Arabic you can say yad-u l-walad-i "the boy's hand" or yad-u walad-i-n "a boy's hand" (-i marks the genitive case) but not *al-yad-u l-walad (intended to be "the hand of the boy") or *yadun al-walad (intended to be "a hand of the boy").

The second property isn't actually all that "exotic" - English does the same thing! You can say the man's hat, but never *the man's the hat or *the man's a hat; just as in Arabic or Hebrew, to make the full range of possible definiteness distinctions you have to resort to prepositions.

Definiteness agreement between nouns and adjectives is more unusual, but at least one Indo-European language has it: Norwegian. No question of substratum influence there, certainly... Anyone have another example?

However, in determining whether or not the article represents a shared innovation, the question is whether other relatives have it. I recall that Biblical/Imperial Aramaic did (later varieties lost it), but I'm not sure of the detailed behavior of its definite article. The Berber obligatory noun prefixes probably derive from an original article (see my post Beja and Beyond) but, though distinctly similar to the Beja definite article, don't seem directly comparable to the Arabic and Hebrew ones.

Monday, November 07, 2005

Demonstratives in Semitic and beyond

Rishon Rishon just posted a table comparing the words in my previous post to Hebrew. Most of them are correct; mo`ed and g'vul are not cognate, and SaH and loa` I'm not sure about. However, one is particularly interesting: ha- = 'al- "the". You often find this seeming cognate cited in works on Semitic: after all, ha- induces gemination of a subsequent non-guttural consonant - suggesting a lost consonant in the prefix assimilating to the subsequent letter - and Hebrew h- occasionally seems to soften to '- in Arabic (the causative measure hiph`iil corresponds to 'af`ala, for example.) Trouble is, the only letter in Hebrew that regularly assimilates to a following consonant is n (although l does admittedly assimilate in the verb laqaH), and the Safaitic inscriptions seem to reveal an early pre-Islamic northern dialect of Arabic which did have a definite article h- (hn- before gutturals, thus hn'lt for Al-Lat.) So are there any other possibilities?

I think so. ha- corresponds pretty well to Arabic haa "here is", which is also the obligatory prefix to the demonstrative "this" (haadhaa, haadhihi, haa'ulaa'). Compare also Syriac haanaa "this (m. sg.)" - which appears to reveal an added n which could explain the Hebrew doubling (and the Safaitic form) nicely. Conversely, 'al- corresponds well to Hebrew 'elleh, Arabic haa-'ulaa', Syriac haaleyn "these". The vowel doesn't correspond exactly, but then it doesn't in the certain cognate 'elleh = haa-'ulaa' either. This idea has probably already been put forward (or indeed knocked down) somewhere in the literature, for all I know, but there you go.

In either case, both definite articles would derive originally from unstressed demonstratives - a process so common it's barely worth commenting on. For example, all the Romance languages derive their definite articles from Latin demonstratives - usually illum/illa "that" (before the noun except in Romanian), but istum "that" in Sardinian. Likewise, the Coptic definite article pe- (m.)/te- (f.)/ne- (pl.) derives from the ancient Egyptian demonstrative pn (m.), tn (f.), nn (pl.) "this". Indeed, English "the" derives from the same Old English word as "that". Come to think of it, I don't know of any definite articles offhand that don't derive from demonstratives; can you think of any?

Thursday, November 03, 2005

Eid Mubarak!

Eid Mubarak عيد مبارك, or, as they say in Algeria, Sahha Eidek صحّا عيدك to everybody!

Today is Eid al-Fitr, the day on which the Ramadan fast ends and the subsequent feasting begins. The Arabic term (`iid al-fiTr عيد الفطر) means "Festival of Fast-breaking"; the original meaning of the root fTr seems to be "cleave, cut open", from which it acquired the senses of "form, make" on the one hand and (in a metaphor somewhat similar to the English one) "break fast" on the other. In North Africa, it is more generally known as El Eid Es Sghir (l`id SSghiR العيد الصغير), "the small festival" (as opposed to Eid al-Adha, the "big festival").

PS: luggi turns out to come from a widely attested Berber word ileggwi, meaning a spiny plant (variously broom or needle-furze) - as Salem Chaker was the first to suggest.

Monday, October 17, 2005


A tantalizingly brief note of 1931 in the Gold Coast Review describes an ethnic group called the Mpre, found only in the village of Butie in central Ghana (8° 52' N, 1° 15' W) near the confluence of the White and Black Voltas, apart from a few emigrants in Debre. According to the author's description, the Mpre people, once more widespread, were reduced to a single village in the course of comparatively recent wars with the Asante. Noting that their language was “different to that of the surrounding tribes”, he lists 106 words of Mpre. This short vocabulary appears to be the only existing record of the language, which is believed to be extinct. The gap is all the more unfortunate because Mpre turns out to be of some taxonomic significance. It is not closely related to any of its neighbors, and Heine and Nurse (2000) treat it as unclassified. A friend of mine's paper dealing partly with this will be appearing sometime soonish, but I won't spoil the surprise...

You might think, given all this, that it was impossible to retrieve any information on its grammar. However, you would be wrong! Fellow language geeks may find it an interesting exercise to try their hand at extracting grammar information from the wordlist, which Blench gives a copy of, before reading on...

The wordlist strongly suggests a noun class prefix system still at least partially productive. The highly lopsided initial letter statistics would alone suggest this: 31 entries begin with e-, 21 with a-, and 12 with n-, together accounting for the majority of the wordlist. This speculation is confirmed by distributional analysis for the e- which appears in the numbers 1-5, but disappears in 11-13 and 20-30; it is presumably to be identified with the Ga prefix é- observable in the same numbers. (The change of ekpe “one” to mpe in “11” is noteworthy, if it is not a typo.) Likewise, comparison of kelafa “100” with lefanyo “200” reveals a prefix ke- - with precise analogues in Ch./Kr. kʌ́-, Na. gʌ́-, and Go. ká- in the same numbers. Of the 21 entries with a- (corresponding to 19, or possibly 18, distinct words), five are glossed as plural in English, while another four are glossed as collective nouns; no entries not beginning with a- are glossed as plural. I therefore conclude that a- is a marker of plurality - suggesting that ado (the formative element in “20”, “30”, ...) is the plural of edu “ten”. This jibes nicely with other languages of the area: a plural prefix a- is found in Gonja, Twi, Lejana, Akpafu, and Avatime, for example.
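The counting step behind that argument is easy to reproduce. The sketch below is only an illustration: it tallies initial letters over the few Mpre nouns quoted in this post (not the full 106-word list, which I am not reproducing here), and separately checks the arithmetic on the figures given above. The variable and function names are my own.

```python
from collections import Counter

# Only the handful of Mpre forms quoted in this post, NOT the full wordlist.
SAMPLE = [
    "ekpe", "enyo", "edu", "eputo",
    "agbem", "ataza", "atazai",
    "nzui", "nkemnzui", "nogha",
    "kelafa", "lefanyo", "sunko", "zingelza",
]

# Initial-letter figures as reported above for the full list (taken as given).
REPORTED = {"e": 31, "a": 21, "n": 12}
WORDLIST_SIZE = 106


def initial_counts(words):
    """Count how many words begin with each letter."""
    return Counter(w[0] for w in words if w)


def share(counts, total):
    """Fraction of the wordlist covered by the counted initials."""
    return sum(counts.values()) / total


sample_counts = initial_counts(SAMPLE)
reported_share = share(REPORTED, WORDLIST_SIZE)

print(sample_counts)
print(f"{sum(REPORTED.values())} of {WORDLIST_SIZE} entries = {reported_share:.0%}")
```

A lopsided tally like this is only a hint, of course; it is the alternations (ekpe/mpe, kelafa/lefanyo, edu/ado) that turn a suspicious initial letter into a plausible prefix.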

Identifiable compounds include zingilzi-nogha “bush cow” (cf. zingelza “bush”, nogha “cow”), sunko kawuseggi “earth owner or tindana” (cf. sunko “earth”), nkemnzui “son” (cf. nzui “child”), lefanyo “200” (cf. enyo “2”, kelafa “100”), eputo nasi “foot” (cf. eputo “leg”) ; all suggest a word order type Modifier-Modified. “Lion” (jikpajikpakoseggi) must surely be a compound, in which I would identify the final koseggi with kawuseggi “owner (?)” above? Also, ataza “finger” and atazai “toe” are clearly related, but it is unclear whether one is a compound form or whether both are simply different transcriptions of the same word.

One short sentence is given - agbem aba “it rains” (cf. agbem “God”). Assuming that this is of the form SV, this could be taken to suggest verb agreement in gender (or at least number) with the subject; however, this is by no means certain.

Monday, October 03, 2005

SOAS, epiglottal trills, Sergei Starostin

Today I attended my first lecture as an MA student here at SOAS - on phonology. Nothing much to report yet; the highlight had to be our lecturer's demonstration of an epiglottal trill (which, believe it or not, actually occurs in some Caucasian languages; I think she named Aghul.) It's a remarkable sound - impossible to confuse with any sort of pharyngeal.

In other news, I was sorry to hear that Sergei Starostin has died. I met him briefly at the Santa Fe Institute, and am one of many to have benefited from his online comparative databases. He will be missed, particularly at the EHL project.

Monday, September 12, 2005

Poetic grammars

Grammars come in many flavors nowadays - Chomskyan, functionalist, structuralist... However, grammars in verse are something you don't see too often any more, so I was recently pleased to come across the Alfiyyat Ibn Mâlik, a 1002-line poem describing Arabic grammar; as it says in line 3:
وأستعين الله في ألفية * مقاصد النحو بها محوية
Wa-'asta`înu llâha fî 'alfiyyah * maqâsidu nnahwi bihâ mahwiyyah
And I seek God's help in a thousand-line
Poem in which grammar's basics are outlined

It was written in the 13th century by one Muhammad Ibn Mâlik, a native of Jaen in Spain who emigrated to Syria. The poem was memorized in order to aid the student in recalling the more obscure details of Arabic grammar (strictly prescriptive, of course...) Unfortunately, the poem proved somewhat obscure to prospective students, prompting the writing of commentaries on it, such as Sharh Ibn `Aqîl, in which each verse or group of verses was explained in greater detail. As a sample of the style, I present verse 229:
ويرفع الفاعلَ فعلٌ أُضمرا * كمثل "زيدٌ" في جواب "من قرا"؟
Wa-yarfa`u lfâ`ila fi`lun 'udmirâ * kamithli "zaydun" fî jawâbi "man qarâ?"
And an implicit verb makes its subject nominative
Like "Zayd-NOM" in answer to "Who read?"

(Ie, the subject of a verb implied by context but not actually present in the sentence at hand takes the nominative.) I wonder what parallels exist in other grammatical traditions.

Incidentally, I'm back from Algeria now, and plan to report on more linguistic tidbits - as well as more luggi, on- or off-topic - shortly; I'm also starting at SOAS soon.

Saturday, August 27, 2005

Hi from Algeria

In case I have any regular readers, I thought I should explain that I'm currently enjoying a holiday in Algeria and nearly incommunicado as far as the Internet is concerned. The local dialectological situation - a conservative qaf-dialect of Arabic in Dellys itself, a "Bedouin-type" gaf-dialect in the villages immediately around it, and Kabyle beyond it to the east and south - is quite historically suggestive, and I'm still looking into its origins; however, to be honest, I'm spending rather more time on the beach, which brings me to a question: does anyone have any idea what the etymology of "luggi", the local Arabic word for sea urchin, might be? Or even know of another area where this term is used? My guess would be Berber, but I haven't found a convincing answer yet. Many local seafood names seem to be found in Corriente's Dictionary of Andalusi Arabic (often with Vulgar Latin etymologies), but not this one.

Tuesday, July 19, 2005

Shakespeare was a hobbit...

or, anyway, that was my reaction to the reconstructed pronunciation of a Shakespearean accent provided by the BBC. Apparently, the Globe is planning to stage Troilus and Cressida in its original pronunciation soon - thus bringing a little life back into Shakespeare's dreadful puns! I suggest that for their next challenge they should try reconstructing Fluellen's pronunciation - Elizabethan English with a thick Elizabethan Welsh accent, presumably.

Monday, July 11, 2005

The American Language

I've been reading Mencken's The American Language (Supplement I, 1945), and find it tremendously entertaining in small doses:
Since the earliest days the two Houses have devoted immense amounts of time and wind to pursuing such wicked men and things as Bourbons, slavocrats, embargoroons, gold-bugs, plutocrats, nullifiers, war-hawks, embalmed beef, ..., economic royalists, princes of pelf, land-grabbers, land-sharks, mossbacks, the open shop, the closed shop, and labor and other racketeers. Even Washington made a contribution to the menagerie with his foreign entanglements; as for Jefferson, he produced two of the best bugaboos of all time in his war-hawks and monocrats. From 1875 onward until the late 80s waving the bloody shirt was the chief industry of Republican congressmen, and from the early 90s onward the crime of '73 engaged the Democrats.

It's also genuinely informative at times, providing, for instance, an extensive list of words of Algonquian origins, and revealing that the term African-American (whose modern popularity, of course, came long after the book was written) is not a pure neologism, but has roots in a term that was popular around 1835, Africo-American, and one from 1880, Afro-American. (In a footnote to that section, he quotes a Liberian diplomat as noting that "Liberians consider the term Americo-Liberian opprobrious as reflecting upon their [ancestors'] condition of servitude in the United States. Hence they prefer to be called civilized or Monrovian Liberians to distinguish them from the natives of the hinterland..." Diplomatic speech does change!)

In his discussion of social attitudes towards the emerging American dialect, he gives an 1820 quote from a British reviewer, Sydney Smith, that, apparently, "rankled in American bosoms for many years":
In the four quarters of the globe, who reads an American book? or goes to an American play? or looks at an American picture or statue? What does the world yet owe to American physicians or surgeons? What new substances have their chemists discovered? or what old ones have they advanced? What new constellations have been discovered by the telescopes of Americans? Who drinks out of American glasses? or eats from American plates? or wears American coats or gowns? or sleeps in American blankets? Finally, under which of the old tyrannical governments of Europe is every sixth man a slave, whom his fellow-creatures may buy and sell and torture?

What an embarrassment for the poor fellow - to have been significant enough to give such offence, yet to be remembered two hundred years later principally for the shortsighted arrogance of his sneering asides! Even his last sentence, a justified blow at the time, would soon be made obsolete by a titanic effort. Let the Kilroy-Silks of our own day take note.

Saturday, July 09, 2005

Claim of responsibility for the London murders

Language Log has recently posted twice on the bizarre name of the organization claiming to have carried out the attack. An apparently accurate screenshot of the claim can be found on Wikipedia.

The first interesting thing about this statement is the bizarre phrasing of its opening: والصلاة والسلام على الضحوك القتال سيدنا محمد صلى الله عليه وسلم. The Guardian renders this as "may peace be upon the cheerful one and undaunted fighter, Prophet Muhammad, God's peace be upon him." The doubling of "peace be upon him" (a formula added to the prophet's name as a matter of course) is unusual [because of its redundancy] and stylistically flawed, suggesting an imperfect command of Arabic literary style. The phrase الضحوك القتال (ad-Ḍaḥûk al-Qattâl), rendered by the Guardian as "the cheerful one and undaunted fighter", is composed of two words in apposition which Hans Wehr's dictionary renders as "frequently, or constantly, laughing; laugher" and "murderous, deadly, lethal". This extremely unusual epithet is so weird that at first sight I assumed it must be some kind of prank; it may potentially provide some clues to the identity of the killers.

Such an opening has been used at least once before in Europe: the assassin of Theo van Gogh left a note on the body opening, after the standard invocation of God's name, with Vrede en zegeningen op de Emir van de Mujahideen, de lachende doder Mohammed Rasoeloe Allah (Sala Allaho alaihie wa Sallam), ie "Peace and Blessings on the Amir of the Mujahidin, the laughing killer Mohammed the Prophet of God (God's peace be upon him)", which is almost identical, right down to the doubled "peace be upon him". A similar but less repetitive formula was used by Zarqawi in a purported claim of responsibility for the killing of the governor of Nineveh last year on CNN, and a Google search suggests that (again without the repetition) it occurs in other Iraqi insurgent notices. The term itself is probably copied from the 14th-century Hanbali writer Ibn Taymiyya's as-Siyasa ash-Shar'iyya, whose author, living at the height of the Mongol threat, spent much of his time urging people to fight; it does not seem to occur in any of the accepted hadith books.

The third really weird thing about the message is the phrase ابشرى با أمة الاسلام ابشرى يا امة العروبة : "Rejoice oh community of Islam, rejoice oh community of Arabdom". This collocation itself appears to be well-established, if rare - the phrase "community of Arabdom" (ummat al-`Urûbah) gets only 37 google hits, but many are collocations of one sort or another with "community of Islam", and come from speeches or interviews by well-known politicians. However, it does not seem to form any part of the standard rhetoric of so-called "jihadists".

Finally, it's worth noting that the Qur'anic quote at the end (47:7) contains a typo, if an easy one to make: it has لله lillâh "to God" for الله Allâh "God", omitting an alif. (I looked again, and the alif is there; it's just thinner than the adjacent letters, so my eye processed it as part of the subsequent lam. Oops!)

PS: Juan Cole explores, among other things, the implications of the "Arabdom" phrase.

PPS: Shibli Zaman also examines the linguistics of the issue; his summary of the "urubah" issue is more detailed than mine.

Tuesday, July 05, 2005

Negative convergence

In my view, one of the many shared characteristics that make the Maghreb proper (Algeria, Tunisia, and Morocco north of the Atlas Mountains) a linguistic area (albeit a somewhat trivial one, given that only two fully separate languages are involved) is double negation: like French, most languages of the area have a negative particle both before and after the verb.

In Algerian Arabic, this is ma ... sh(i), which derives transparently from Arabic ma:, "not (past)" and shay', "thing" (as in constructions like ma: ra'aytu shay'an, "I didn't see a thing".) In Kabyle, the corresponding construction is ur ... ara, which is purely Berber but exactly parallel; ur or ul meaning "not" is found throughout the family, and ara comes from a root meaning "thing" or, as in Tuareg, "child". By contrast, Tuareg and Tachelhit, south of the Maghreb proper, both use negations based on cognates of ul alone, without any postverbal element. So when I came across the Chenoua negative - u ... sh - I naturally assumed this must be a rather interesting Arabic-Berber hybrid, with the Berber preposed negative and the Arabic postposed one. The Tamezret negative is similar, ul ... sh, and seems to fit the idea nicely.

However, it turns out that, despite appearances, this may not be the best explanation. Tarifit uses war ... sha, and Middle Atlas Tamazight optionally uses ur ... (sha). At first sight these seem to work, but the vowel seems odd if they derive from Maghreb Arabic shi. However, kra happens to be a well-attested Berber word meaning "thing", found also in Kabyle, and its expected form in Zenati dialects like Tarifit, Tamezret, and Chenoua would be *shra (this may be a real form, though I haven't come across it.) And what more natural environment to simplify a consonant cluster than in an unstressed grammaticalized particle?

The issue is examined from a rather different, syntactic, perspective in a paper by Ouali online, which ironically reveals an alternative construction in Tarifit which does seem to be half-borrowed from Arabic: ur ... shi. However, its data seem somewhat at odds with those I've found in other sources; there is substantial dialectal diversity within Tarifit, which may explain this.

Tuesday, June 28, 2005

Tasmanian reborn (or not...)

Interesting story on Tasmanian today... The last speaker of a Tasmanian language died in 1905 (Wikipedia), and little material survives, so I'm not entirely confident in the historical reliability of the newly announced reconstruction, especially since:

"There were thought to be a dozen or more Aboriginal languages in Tasmania and even more dialects. The language program has produced an amalgam of the languages."

Hmm. Do you speak European?

I'm not convinced that "many within the Aboriginal community could speak palawa kani fluently" either. Still, it's worth a try. An interesting case of conlanging and language revitalization combining.

Thursday, June 23, 2005

Malay pronouns

And as long as I'm comparing pronouns, it seems only fair to note that not all languages have nice stable uncomplicated pronouns like Beja. Japanese is an obvious counterexample, but Prentiss Riddle and Macvaysia point out an egregious case in Malay... (noticed thanks to Language Hat.) I was especially struck by the "distinct set reserved just for addressing ethnic Chinese" - if I'm not mistaken, those are clear borrowings from Hokkien Chinese (where "I" is goá, "you" is .)

Beja and beyond

Some interesting news this week from the Beja, an ethnic group of the Red Sea coast of Sudan and Egypt. It's unclear whether this rebellion is representative of the Beja's general feelings or just a figleaf for Eritrean intervention (or both), but it's a story to watch - and an excuse to bring up a cool language.

Beja is Afro-Asiatic* - either part of Cushitic or a separate branch, depending on who you ask - and happens to be among the most obviously similar languages to Semitic and to Berber. The noun morphology is already fairly suggestive:

                                Beja definite article   Arabic noun endings   Kabyle obligatory prefix
Masculine nominative singular   u:-                     -u                    w-
Masculine accusative singular   o-                      -a                    a-
Feminine nominative singular    tu:-                    -atu                  t-
Feminine accusative singular    to-                     -ata                  ta-

And the pronominal object suffixes add credence:

            Beja      Arabic   Kabyle
me          -i, -o    -ni:     -iyi
you (pl.)   -okn      -kum     -kən

(Beja, apparently, has no third person suffixes.) However, what really clinches it is the verbal system. Beja has two principal classes of verbs: one that often takes prefixes, and one that usually just takes suffixes. In Semitic, the prefixes are used for the imperfect, and the suffixes developed from a stative (still to be seen in Akkadian) into a perfect; Berber mostly retains the prefixes, whereas only minor traces of the suffixes remain. The prefixes are especially telling:

Person | Beja | Arabic | Kabyle
you (m.) | ti- -a | ta- | t- -ḍ
you (f.) | ti- -i | ta- -i: | t- -ḍ
you (pl.) | ti- -na | ta- -u:na | t- -m
they | i- -na | ya- -u:na | -n

while the suffixes are best exemplified in Beja in the conditional mood:

Person | Beja | Arabic | Dahalo general non-past (Cushitic)
you (m.) | -tia | -ta | -to
you (f.) | -tii | -ti | -to
you (pl.) | -tina | -tum | -ten
they | -ina | -u: | -en, -ammi

Just for good measure, in the prefix verbs you also have a feature found in Akkadian (among other Semitic languages) and Berber but lost in Arabic: a present tense formed by doubling the middle radical (in Berber and Akkadian) or adding n before the middle radical (in Beja). Compare:

  • Beja aktim ("I arrived") > akanti:m ("I arrive")
  • Akkadian almad ("I learned") > alammad ("I am learning")*
  • Tamasheq əlmədǎγ ("I learn", irrealis) > lammǎdǎγ ("I am learning", realis)

It's really remarkable, considering all this, that Afro-Asiatic research isn't more advanced. There are two etymological dictionaries out there, admittedly - Ehret's and Orel and Stolbova's - but, though valuable, they frequently disagree with each other, and neither has attained general acceptance.

* Some people think Afro-Asiatic is not proved. I can't think why. Omotic's membership is not entirely clear, but all the rest is just plain obvious.

* Previously misquoted forms corrected, thanks to Matthew Loran.

Tuesday, June 21, 2005

Writing Wolof (or rather وَلَفْ)

With apologies for the long hiatus in my postings, I would like to present another topic in West African writing: the surprisingly formalized tradition of writing Wolof, the main language of Senegal, in Arabic script. Wolof is also written in Latin script, I should note; you can see copious examples in the pedagogical materials on this Gambian Peace Corps site. The Arabic script, however, is much more widely known, especially in rural areas, although French is far more widely used for writing than Wolof in any script.

Myself, I only went to Dakar, so books in Wolof of any sort were relatively hard to come by. However, Arabic bookstalls, while rarer than the French ones, weren't hard to find (they had a predominantly religious focus, but also stocked a number of literary, scientific, and historical works), and, while most of their works were in Arabic, they had a couple of Wolof religious texts in a rather nice Arabic script, of which I enclose a scan. I was going to retype some, but even a cursory effort revealed serious issues. For instance, there is a Unicode character for the common West African vowel sign that indicates short e (a dot under the letter, smaller than the dots that form part of the letter) - the charts say it's 065C - but I can't find a font that will display it; and another common letter which seems to indicate ny or nj, jiim (ج) with three extra dots on top, isn't in Unicode at all. Apart from those, the main differences from standard Arabic seem to be:

  • Short e is as described previously; long e is indicated by adding an alif maqsura ى with a small alif on top (another character I can't seem to find fonts for, despite its commonness in the Qur'an; it should be 0670.)
  • p is a ba ب with three dots on top (and actually is in Unicode - 0751.)
  • A dal with three dots above (ڎ) occurs in some places; I don't know how it's pronounced.
  • gaaf is a kaaf with three dots (ڭ), following longstanding Maghrebi tradition.
  • Again in the Maghrebi tradition, faa has its dot below, and qaaf has a single dot above.
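The code points mentioned in this post can be checked against the Unicode character database. The following is only a minimal sketch in Python using the standard `unicodedata` module; the helper name `describe` is made up for illustration, the small alif is looked up as U+0670 (an assumption about which character is meant), and U+068E and U+06AD are assumed to be the code points of the dal and kaaf with three dots shown in the list.

```python
import unicodedata

# Code points discussed in the post. Those marked "assumed" are not given
# as numbers in the text and are guesses at the intended characters.
CODE_POINTS = [
    0x065C,  # vowel sign for short e (number quoted in the post)
    0x0670,  # small alif written above a letter (assumed)
    0x0751,  # ba with three dots on top, used for p (number quoted in the post)
    0x068E,  # dal with three dots above (assumed)
    0x06AD,  # kaaf with three dots, used for g (assumed)
]


def describe(cp: int) -> str:
    """Return 'U+XXXX NAME' for a code point, or a note if it has no name."""
    name = unicodedata.name(chr(cp), "<no name in this Unicode database>")
    return f"U+{cp:04X} {name}"


if __name__ == "__main__":
    for cp in CODE_POINTS:
        print(describe(cp))
```

Whether a given font can actually draw these characters is a separate question; the database lookup only confirms that a code point is assigned and what it is officially called.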

PS: You can find a font that will display some of these letters at PakType; however, their selection is more adapted for Sindhi (which has the largest Arabic-based alphabet I know of) than for West Africa.

PPS: Apparently, the latest version of PakType can display all these after all; see comments...

Wednesday, June 08, 2005


Most English speakers are familiar with the phenomenon of r-dropping; it divides the English-speaking world into Ireland, Scotland, and most of America (where r is kept throughout), and England, Wales, Australia, New Zealand, and the American South and New England (where r is lost after vowels.) Despite this broad distribution, r-dropping somehow seems emblematic of British English, so I was naively surprised to observe it in other languages; yet the same sound-change happens to be observable in Tarifit, the Berber dialect of northeastern Morocco, and seemingly in Korandje, a Songhai language brought from Timbuktu to a northern Saharan oasis, Tabelbala, along the trade route to Sijilmasa in Morocco. Some examples:

* Tarifit ddaa, live! = Kabyle dder (here e=schwa=ə)
* Tarifit thamoath, earth = Kabyle thamurth (th=θ)
* Tarifit adhvea, pigeon = Kabyle ithvir (dh=ð, v=β)
* Korandje bia, big = Timbuktu beer(i)
* Korandje lekhba, news = Arabic al-akhbaar (kh=x)
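
The three Tarifit pairs above can be restated as toy rewrite rules. This is only a sketch in Python, assuming the correspondences er > aa, ur > oa, ir > ea read off those three examples; the names `RULES` and `drop_r` are made up for illustration, and real Tarifit phonology is certainly conditioned by more than this (note that adhvea differs from ithvir in its first syllable too, which the rules do not touch).

```python
# Toy rules read off the three Tarifit/Kabyle pairs in the list above.
# Hypothetical generalization: vowel + r becomes a vowel sequence without r.
RULES = [("er", "aa"), ("ur", "oa"), ("ir", "ea")]


def drop_r(word: str) -> str:
    """Apply each vowel+r rewrite in turn to a Kabyle-style transcription."""
    for old, new in RULES:
        word = word.replace(old, new)
    return word


if __name__ == "__main__":
    for kabyle in ["dder", "thamurth", "ithvir"]:
        print(kabyle, ">", drop_r(kabyle))
```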

I wonder what other cases are out there?

PS: Thank you, Language Hat, for your kind welcome!

Wednesday, June 01, 2005


Leafing through a book on the early days of Rastafarianism, I came across the word "blin'ty" in an interview with someone speaking Jamaican patois. Blin'ty? Is that some kind of Russian pastry? Apparently not.

It seems that Rastas have developed some rather interesting ways of reflecting their beliefs in their speech. Usages like "I an' I" are well-known, but less widespread are avoidance terms where the opposite of a word is substituted. The people of the city, from a Rasta perspective, are "Babylon"; they don't see the truth, so why should the word "city" contain the sound of "see"? (In a Jamaican pronunciation, anyway...) Rather, they substitute the more appropriate syllable "blind"...

Thursday, May 26, 2005


I've recently gotten back from travelling in Mali and Senegal. At the conference I was attending in Bamako, I met several delegates from the N'Ko movement. N'Ko is an old Manding term, meaning "I say" in each of the mutually comprehensible Manding languages (principally Bambara, Maninka, Mandinka, and Dyula) and hence traditionally used as a general term to cover Manding. In 1949, a Guinean Maninka-speaking shaykh, Solomana Kanté, stung by a Lebanese claiming in the newspaper that African languages could not be written and were thus worthless, decided to start writing his language. He experimented with Arabic and Latin scripts, but found them inadequate to Maninka's tone and vowel systems; so he devised a new alphabet, N'Ko. He went on to write nearly two hundred books in the new script, including a translation of the Qur'an, textbooks of physics and history, descriptions of traditional medicine, and books of poetry; his disciples carried on the task after his death, and the script has spread surprisingly widely, mainly in Guinea and Côte d'Ivoire.

After the conference, I wandered around Bamako a bit, and randomly ran across a market stall with N'Ko writing all over it. Naturally, I went over and asked about it; it turned out to be a traditional medicine shop. All the remedies were labelled in N'Ko, and the shop's accounts (I happened to notice) were kept in N'Ko; apparently, the stallholder used Solomana Kanté's works on traditional Manding medicine... In the next stall, where they were setting up a bookstore, was an N'Ko teacher. I ended up having quite a long discussion with him; the topic was interesting enough that I didn't even notice that he had 12 fingers until an hour later, although that did make the meeting more memorable.

He showed me some books in N'Ko (textbooks of maths, physics, and geography, a grammar, a philosophical work, a newspaper, and the Qur'an translation) and spoke eloquently about what a difference it made to have access to knowledge in your own language for once. He had studied algebra and geometry through high school, in French, without understanding them; yet when he read about them in his own language, the concepts became easy. Studying in French, he argued, you became alienated from yourself and your culture as the price for your knowledge; studying in N'Ko, your knowledge fit naturally into your own identity. Despite the funding of literacy organizations, the inadequate, atonal Latin orthography for Bambara was still virtually unused, while N'Ko (he claimed, implausibly) was being studied by most of Bamako. He also explained something I hadn't realized: the N'Ko movement uses a common standardized language, a literary Manding "purified" of Arabic and French borrowings, utilising the most conservative dialects, and full of agglutinatively coined neologisms for modern technical terms, thus creating a dialect that they felt could compete with French in all usages rather than being restricted to low registers and simple topics.

I was impressed. It looks to me like N'Ko enjoys one massive advantage over Latin: not the tones, nor even the books (though they help!), but the evangelical dedication it inspires in its devotees, without which a literacy program in an unwritten language is unlikely to overcome the obstacles it faces.