Thursday, December 08, 2016

How Tunisia ruined its PISA performance

PISA 2015 is an OECD-run survey intended to evaluate education systems worldwide by giving the same test to (almost) all students of the same grade across a large number of countries and comparing the results. This year's results have gotten a lot of coverage, notably for the dismal performance of all the Arabic-speaking countries participating. The UAE did least badly in terms of combined scores, managing 48th place out of 70; it was trailed by Qatar (59th), Jordan (61st), Lebanon (65th), Tunisia (66th), and, most ignominiously, Algeria at 69th place, barely beating the Dominican Republic.

Laudably, PISA have made their science tests publicly available online in many languages, including four Arabic versions labelled Israel, Qatar, Tunisia, and the UAE - don't ask me what happened to Algeria, Jordan, and Lebanon. Browsing through these, one immediately notices that the Tunisian translation (unlike the Gulf ones) has a remarkable number of grammatical errors, typos, and phrasings so awkward as to be barely comprehensible. For instance:

  • Bird Migration 1: "يستعملون العدّ الذي يقوم به المتطوّعين" - wrong case: should be المتطوّعون
  • Bird Migration 1: extremely awkward phrasing: "هجرة الطيور هي حركة موسمية كبيرة، يتنقل أثناءها الطيور نحو أماكن تكاثرها أو هي تعود منها." ("Bird migration is a great seasonal movement, during which birds move to the places of their reproduction or they come back from them.") Contrast the clearer phrasing in the Qatar version: "هجرة الطيور الموسمية هي انتقال واسع النطاق للطيور من وإلى مناطق تكاثرها. وفي كل عام يتولى متطوعون إحصاء عدد الطيور المهاجرة في مواقع محددة."
  • Bird Migration 3: the bird's name is "الزقزوق الذهبي" in the text, but in the question it turns into "الزقزاق الذهبي".
  • Running in Hot Weather 1: Garden path title: anyone looking at "العدو في الطقس الحار" is going to read it as "the enemy in hot weather", at least until the context is established. Contrast the Qatari translation "الجري في الجو الحار", using a better known, graphically unambiguous term for "running".
  • Running in Hot Weather 1: Grammatical error in "يدل على ذلك {كمية العرق | ضياع الماء | درجة حرارة الجسم} العداء بعد ساعة من السباق": for the sentence to make sense (even in dialectal Arabic!), none of the alternatives should contain the definite article, since they form part of an idafa genitive. Contrast the Qatari version, which avoids the problem by putting "للعداء".
  • Running in Hot Weather 2: Garden path sentence: "شرب الماء خلال السباق يمكن أن يكون له تأثير على حصول تجفّف وضربة حرارة بالنسبة إلى العداء. أيّهما؟ " Anyone reading this will start by reading the first word as šariba "he drank", giving "he drank water during the race, it can have an effect..." and only after the fifth word will they be in a position to read it, as intended, as "Drinking water during the race can have an effect on the occurrence of dehydration and heatstroke for the runner. Which of the two?" Having gotten that far, they'll still be given pause by the need to decide the intended referents of "Which of the two?" Contrast, yet again, the much easier to read Qatari version: " ماهو تأثير شرب المياه خلال الجري على تعرض العداء للجفاف وضربة الشمس ؟ " ("What is the effect of drinking water during the race on the runner's exposure to dehydration and heatstroke?")

I could keep going, and no doubt more fluent Arabic speakers can find problems I haven't even noticed, but the pattern is clear: Compared to Qatari students, to say nothing of Western ones, Tunisian students were systematically disadvantaged in the PISA 2015 science tests by bad translation.

Whose fault is this? Clearly there was a failure at the level of PISA's international verification, which should have eliminated such problems. But the translations themselves are carried out at the national level (PISA 2012 Technical Report, Ch. 5). In other words, this mess was produced by Tunisian translators under the direction of the Tunisian government.

How is that possible? Simple: in Tunisia, appallingly enough, science is taught in French from the start of secondary school onwards. Science teachers have little need to keep up their Standard Arabic proficiency. Which raises the question of why this test, targeted at 15-year-olds, was administered in Arabic there to begin with.

Wednesday, November 30, 2016

Siwi vocabulary for addressing animals

Probably every language has a certain number of forms used especially for addressing animals, especially domestic animals. In response to a recent query by Mark Dingemanse, I gathered together all the ones I happened to have recorded for Siwi - the list below is definitely not exhaustive, but should at least be suggestive. Note the sounds used - clicks do not usually form part of Siwi phonology!

To chicks:
didididididi: eat!

To cats:
ərrrr: come!
ǀǀǀǀǀ: come!
pss: move!

To dogs:
ʘʘʘʘʘʘʘ: follow me!

To goats:
əšš: go!
ħəww: go!
xətt: go!
kškškškškš: eat!

To donkeys:
ǁǁǁǁ: giddy-ap! (?)

The interesting question here is: to what extent are these arbitrary, reflecting an emergent cross-species convention just as most human lexemes do, versus to what extent do they reflect innate properties of animal perception and communication? How do they compare to those you've encountered, if any?

Tuesday, November 08, 2016

Some Dellys etymologies via Andalus

Looking through Corriente's etymological dictionary of Andalusi Arabic, I keep coming across explanations for obscure Dellys words whose origins had been a mystery to me. Corriente's etymologies are not always to be trusted - I've found several errors, most egregiously the attribution of kurānah كُرانة "frog" to Romance rather than to Berber - but the work remains very valuable. Here are a few etymologies that struck me.

  • l-ənjbaṛ لنجبار "maize" was originally anjibār أنجبار "snake-weed" (Persicaria bistorta), whose flowers look vaguely similar. This in turn comes from Persian angbār انگبار, which Corriente seems to derive from rang-bār رنگبار "many-coloured".
  • skənjbir سكنجبير "ginger" derives from some sort of popular confusion between two Arabic words: zanjabīl زنجبيل "ginger" and sakanjabīn سكنجبين "oxymel" (a mixture of honey and vinegar used medicinally). I assume the connection is that both are good for colds, but a quick search didn't turn up any actual evidence that oxymel was used for that purpose. Sakanjabīn is apparently from Persian سرکه انگبین serke angabin (Corriente gives the form sik angubēn) "vinegar honey", while zanjabīl is apparently, again via Persian, from Sanskrit शृङ्गवेर ‎śṛṅgavera.
  • fərnəħ فرنح "smile, laugh (of a baby)": cp. Andalusi farnas فرنس, Moroccan fərnəs فرنس; possibly, Corriente suggests, from Greek euphrosynē εὐφροσύνη "joy".
  • bu-mnir بومنير "seal" was very hard to elicit, since they've been locally extinct for decades (they've nearly disappeared from the entire Mediterranean, in fact). However, it turns out to be correct after all: cf. Andalusi bul marīn بل مرين "sea lion", Maltese bumerin "seal". Corriente seems to take this as Romance *pollo marino "sea-chicken", but the first part of that at least is clearly implausible in light of the comparative evidence as well as of common sense; the second might be tenable, but I'm not sure.

On a not entirely unrelated note: for anyone who wants to explore the maritime terminology of Dellys in greater depth than I've ever been able to elicit, is a wonderful and unexpected resource.

Friday, November 04, 2016

Lingua Franca and Sabir in "Four Months in Algeria" (1859)

I recently finished reading Four Months in Algeria, a travel diary by the English Rev. J. W. Blakesley published in 1859. It's mostly rather superficial - he couldn't speak Arabic, and spent most of his time with French soldiers and German settlers - but enlivened by occasional insights. It contains little content of linguistic interest, but it does contain two brief passages in the pidgin still used for communication between North Africans and Europeans when neither spoke the other's language - call it Lingua Franca, or Sabir. Since it would take a brave creolist to plough through the whole thing just in the slender hopes of finding such material, I reproduce them here.

The first passage (p. 340) comes from the author's description of his journey from El Aria to a place called Embadis, both in the east of Algeria, during the month of Ramadan; it shows a curious combination of French, Arabic, and "classic" Lingua Franca:

The poor muleteers had not tasted food during the whole day ; and as soon as ever the sun dipped, they produced one or two flat cakes, and ate them with avidity, not however without first offering me a share. I of course declined to diminish their scanty store, and reminded them that I had breakfasted at El Aria. "Toi makasch tiene carême ; toujours mangiaria," said one of the poor fellows, in the polyglot dialect which is growing up out of the intercourse between the natives and the illiterate European settlers of the interior.*
* There are a few Arabic words which the European children habitually make use of at Guelma, even when playing with each other. Makasch, no, shuiya, gently, I found invariably took the place of the corresponding French terms. On the other hand the Arabs constantly use the words ora, hour, and buono or bueno, good, to one another. Iauh, yes, a Kabyle word, pronounced exactly like the German affirmation, is also very common among the lower orders of Europeans.

In this passage, "toi" (you), "carême" (fast), and "toujours" (still) are French, while "tiene" (have) is Spanish, and "mangiaria" (eat, or perhaps food?) is Lingua Franca (from Italian), and "makasch", being used as a simple negator, is Algerian Arabic makaš ماكاش "there is no" (I discuss the latter's history here). Despite the diversity of the lexical sources drawn on, however, the grammar - simple SVO with no subject-verb agreement - matches better with Lingua Franca than with any of the lexifiers.

The second (p. 419), from a country as yet unconquered by the French, shows no such admixture, corresponding perfectly to earlier descriptions of Lingua Franca in which it often appears as little more than Italian minus the morphology:

More than once have I found in Algeria the conventional civility of the Arab to an European change into an unmistakeable expression of goodwill, when it appeared that I was an Englishman ; and in Tunis a notification of the fact at once drew forth a "Buono Inglese ; non buono Francese," from the mouth of a native.

Tuesday, September 27, 2016

Two funny adjectives (?) in Algerian Arabic

In Algerian Arabic, as in any other Arabic variety, adjectives follow the noun. However, there is one exception to this rule: invariant quja قوجا or qŭjna قُجنا, "a huge". Thus we say ṛajəl kbir راجل كبير "a big man", but quja ṛajəl قوجا راجل "a great big man". Not only does this "adjective" precede the noun it modifies, it requires it to be made indefinite: you can say šrit quja ktab شريت قوجا كتاب "I bought a huge book", but if you want to say "I bought the huge book", there's nothing you can do but use a different adjective. *šrit quja l-ktab or *šrit əl-quja ktab or *šrit əl-quja l-ktab are all impossible. You can make quja قوجا follow the noun, but you have to use a different construction, equally unique to this "adjective": ṛajəl quja mən huwwa راجل قوجا من هو "a great big man", daṛ quja mən hiyya دار قوجا من هي "a huge house". The origin of quja قوجا is clear: it comes from Turkish koca "large; husband", which in turn is apparently an early adaptation of Persian xɑje خواجه "master, gentleman". In Turkish, all adjectives are prenominal, so one could take that to explain its position in Algerian Arabic; but a quick search suggests that Turkish koca has no problem combining with the definite (one finds phrases like bu koca dünya "this huge world"), so Turkish cannot explain the indefiniteness requirement. However, it looks like Algerian quja has followed a trajectory very similar to Iraqi and Khaliji xôš خوش. It is not obvious to me why obligatorily indefinite prenominal adjectives should even be possible in a language that otherwise strictly requires adjectives to be postposed, much less why they should have to be indefinite in order to stay prenominal - but that's what it looks like....

The word məskin مسكين "poor (pitiable)" is not so unusual, lexically speaking; it's just about pan-Arabic. It combines just fine with definite nouns, and takes normal agreement (f. məskina مسكينة, pl. msakən مساكن). However, it has almost the opposite idiosyncrasy: it doesn't take the definite article, which would be obligatory with any normal adjective whose head is definite (and, if it comes to that, with a noun in apposition to a definite phrase as well). Thus we say bwəʕlam məskin maqdərš yji بوعلام مسكين ماقدرش يجي "poor Boualem couldn't come", even though we would say bwəʕlam əṭ-ṭwil بوعلام الطويل for "tall Boualem" (Boualem the-tall). Why? No idea. Suggestions are welcome!

Monday, August 15, 2016

Microvariation in Dellys Arabic

There are plenty of factors that one naturally expects to condition linguistic variation: age, sex, location, class, ethnicity, religion - in short, any variable such that people are more likely to talk with those who match their value for it than with those who don't. Dellys offers clear examples of several of these:
  • Age: There's an obvious gap between the generation born before Independence and those born since then, the latter having had much greater freedom of movement and access to media as well as education. Within my extended family, my father's generation all negate verbs indifferently with ma... ši ما...شي or ma... š ما...ش, whereas their children and grandchildren uniformly use only the latter. Similarly, the older generation use mazəlt مازلْت for "I am still...", conjugating it as a verb, while the younger ones consistently use mazalni مازالني; many of the older generation use -ayən ـاين for the dual (eg يوماين yumayən "two days"), while the younger generation all use -in ـين.
  • Sex: Only women use the exclamation a məħħənti أ محّنتي "oh my goodness!"; only men, as far as I've noticed, use the quasi-expletive jədd جدّ "grandfather" (eg nəħħi jəddu نحّي جدّهُ, approximately "remove the damn thing"). In less integrated French loans, women of my generation or younger use a uvular R, whereas almost all men (and older women) substitute a trill; this sex differentiation is acquired well before the age of ten.
  • Location: The most salient distinction at a local level is classic in Maghreb dialectology: urban (more or less pre-Hilalian) vs. rural (Hilalian). People from Dellys proper say qal قال "he said" and ṣab "he found"; people from the villages and small towns around it instead say gal and lga.
Such variation is easily understood. But a lot of variation I'm noticing seems to show no such patterning. Out of three brothers, fairly close together in age and all currently working in the same family business:
  • Two have baš باش for "so that"; the third - unlike anyone else I know - uses li baš لي باش.
  • All use lukan لوكان for "if (hypothetical)", but one also uses lakun لاكون and another yakun ياكون.
Maybe this is somehow explained by their earlier backgrounds - the one who uses li baš لي باش and yakun ياكون had more education; perhaps he picked it up where he went to school, or where he used to work when he was younger? But there are many other variables like this. I similarly don't see any pattern to the choice between bəṛk برْك and kan كان for "only", or yəsħaq and yəsħaj يسحاج for "he needs", or yʊɣləq يُغلق and yəʕləq يعلق for "he closes", or (at least for older speakers) yəqdər يقدر and yənjəm ينجم for "he can". People of the same age and gender, living all their lives less than a kilometer from each other and sometimes even in the same household, consistently use one or the other. Presumably something must explain the difference, but it looks like it would require a pretty intensive social network analysis to find out...

This is actually fairly similar to what Nancy Dorian found for the Scots Gaelic of East Sutherland fisherfolk: "Surprises in Sutherland: Linguistic Variability amidst Social Uniformity". She observes that this kind of variation usually tends to be ignored: "Oftedal, my immediate predecessor in Gaelic dialect studies, noted that the Gaelic of his single source and that of the man’s wife differed in a number of respects, despite the fact that the two had grown up as next-door neighbors; but after noting the existence of such differences in an early footnote, he never referred to the wife’s Gaelic again." While Algerian Arabic is far from endangered, the two situations are not as different as you might think: in both cases, small towns were substantially expanded over the 19th century by rural refugees fleeing land confiscations and wider upheavals, and left to sort out the resulting mess of dialect variation among themselves without that much pressure towards standardization. Perhaps such variables would have correlated more clearly with speakers' background a century ago, and have been left today as relics too scattered by later changes to be assigned a social meaning any longer.

Do these examples of variation seem familiar to you? What kind of individual-level variation have you noticed between friends and family?

Friday, August 12, 2016

Berber feminine nouns in Dellys Arabic: an update

In Dellys, Berber nouns borrowed into Arabic are not very common, and ones that preserve the Berber nominal affixes are even rarer, so I'm always on the lookout for them. A few days ago, listening to my eldest aunt, I heard one that was completely new to me, in an old idiom:
xəlləṭ tazalt u bəḷḷuṭ
خلّط تازالْت وبلّوط
mix up tazalt and oak/acorns (ie mix good with bad)
Tazalt was described as a vine with white flowers; probably the reference is to Cistus (rockrose), whose Kabyle name is tuzzalt, "little iron". Why that would be particularly easy to confuse with an oak tree is beyond me. There are a few other plant and animal names retaining the Berber feminine circumfix t(a)-...-(t), including tirẓəẓt تيرززت (a kind of small wasp), tubrint توبرينْت (a kind of seaweed), taɣanim تاغانيم (a variety of fig, from Berber taɣanimt "small reed"), and originally plural timəlwin تيملْوين (another variety of fig). Otherwise, this circumfix seems to be almost exclusively reserved for abstract nouns referring to negatively judged character traits (see previous posts): eg taɣənnant تاغنّانْت "stubbornness", taklufit تاكلوفيت "meddling", tayhudit تايهوديت "malice", tastutit تاستوتيت "malicious trickiness". An amusing variant on this theme came up recently: taṭnuhist تاطنوهيست "open-mouthed stupidity", presumably a blend of unrecorded *taṭnuhit تاطنوهيت and French -iste. (This in turn derives from ṭnəh "mooring-post", as in "dumb as a post".)