Wednesday, July 25, 2007

Writing codas, from Sylhet to Winnipeg

In Greek-based scripts (like Latin or Cyrillic), unless a consonantal letter is followed by a vowel letter, it is assumed not to be followed by a vowel. This seems natural enough if you're used to it; but if you look at it differently, it's rather wasteful. The commonest sound to follow any given consonant is usually a vowel, not another consonant, so if you allow a single letter to represent a consonant plus a vowel you're saving space and effort.

But if you do that, then how do you represent the fact that a consonant is not followed by a vowel? Different writing systems use different solutions. In alphabets that have stuck more closely to their Canaanite prototype, like Arabic, Hebrew, Syriac, or (traditional) Tifinagh, you normally don't bother: a consonant may be followed by a vowel or may not, and you rely on the reader to figure it out. However, sometimes the reader needs additional cues: maybe the word you're writing is obscure, or two words have the same consonants, or it's very important that the text be read exactly right with no possibility of error. In that case, in Arabic, Hebrew, and Syriac, you mark what follows each consonant with a little sign above or below the letter - one sign for "a", say, another for "i", and another to indicate that nothing follows it. That last sign is necessary as long as writing without vowel marks remains the norm: an unmarked letter would not mean that the consonant has no vowel, only that the reader must deduce from context which vowel, if any, follows it.
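For readers who think in code, here is a minimal Python sketch of that point, using Arabic as it happens to be encoded in Unicode. The three words are examples I picked for illustration, and the `skeleton` helper is a made-up name; it simply drops every combining mark, which is only a rough stand-in for "writing without vowel signs".

```python
import unicodedata

# Arabic vowel signs as encoded in Unicode (all are combining marks).
FATHA = "\u064E"   # short "a"
DAMMA = "\u064F"   # short "u"
SUKUN = "\u0652"   # "no vowel follows"

# Consonant letters used in the illustrative words below.
KAF, TEH, BEH, MEEM = "\u0643", "\u062A", "\u0628", "\u0645"


def skeleton(word):
    """Drop every combining mark, leaving only the base letters."""
    return "".join(ch for ch in word if unicodedata.combining(ch) == 0)


# Fully marked spellings (examples chosen for this sketch).
kataba = KAF + FATHA + TEH + FATHA + BEH + FATHA          # "he wrote"
kutub = KAF + DAMMA + TEH + DAMMA + BEH                   # "books"
maktab = MEEM + FATHA + KAF + SUKUN + TEH + FATHA + BEH   # "office"

# Without the marks, two different words collapse into one spelling.
print(skeleton(kataba) == skeleton(kutub))

# The sukun is the only thing saying that k is followed directly by t;
# once the marks are gone, that information has to come from context.
print(SUKUN in maktab, SUKUN in skeleton(maktab))
```

The unmarked spelling is shorter, but it is the marked one that distinguishes "a follows", "u follows", and "nothing follows".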

Typical Indic scripts, such as Devanagari (the script used for Hindi and Nepali), adopt a rather different solution. A consonant letter on its own is to be read with a default vowel, short a ([ʌ]); a consonant followed by a consonant is written as a single "conjunct" letter, formed in any of several ways, but usually by either putting the second letter underneath the first or taking away a line on the right of the first letter and joining it to the second. On the plus side, this yields much of the compactness of a vowel-optional system without any of the ambiguity, and means that each letter is pronounceable on its own; on the minus side, this means fonts have to include a much larger number of letter forms.
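As a side note on the font problem: Unicode does not give each Devanagari conjunct its own code point. A conjunct is stored as consonant + virama (a "kill the default vowel" sign) + consonant, and the font decides which joined shape to draw. The sketch below only illustrates that encoding; the `conjunct` helper is a name I made up, and how the result looks on screen depends entirely on the font and renderer.

```python
import unicodedata

# DEVANAGARI SIGN VIRAMA: suppresses the default vowel of the letter before it.
VIRAMA = "\u094D"


def conjunct(*consonants):
    """Join consonant letters with virama so a renderer can draw one conjunct."""
    return VIRAMA.join(consonants)


ka = "\u0915"    # DEVANAGARI LETTER KA, read "ka" on its own
ssa = "\u0937"   # DEVANAGARI LETTER SSA, read "ṣa" on its own

ksha = conjunct(ka, ssa)   # stored as three code points, usually drawn as one letter

for ch in ksha:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

So the large inventory of letter forms lives in the font's substitution tables rather than in the text itself.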

Sylheti Nagri is an Indic script formerly (up to the 1950s or so) in use in the district of Sylhet, in eastern Bangladesh. Like Devanagari, it represents consonant-consonant sequences using conjuncts. However, its users were often also familiar with the Arabic script, where letters can be combined into ligatures whether or not they have vowels between them. This may have inspired them to do something rather unusual for an Indic script: develop vowel-consonant conjuncts, such as a+m, a+l, i+n... and consonant-vowel-consonant conjuncts, like pi+r, mo+t... In fact, judging by the examples in the Unicode proposal, it seems that, for at least some historic users, Sylheti Nagri did not have a conjunct system at all, just a ligature system.

One very nice solution is that adopted in Canadian Syllabics, the family of writing systems used by a number of Native American tribes in Canada. The name is potentially misleading: I prefer to reserve the term "syllabary" for writing systems like hiragana, where different syllables differ from each other unpredictably. In Canadian Syllabics, for example Cree, the shape of a symbol represents the consonant, while its orientation represents the vowel that follows it, and length or labialisation may be represented by dots. If no vowel follows the consonant, then the base shape is simply written small and superscripted, using the a-orientation, or for labialised consonants the u-orientation.
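The regularity is easy to see in the way Unicode names these characters. The Python sketch below looks up the four orientations of the p-shape and the small raised form by name; the `syllabic` helper is a made-up name for this illustration. It says nothing about which conventions a particular Cree community actually uses, only what the character names in the standard look like.

```python
import unicodedata


def syllabic(consonant, vowel=""):
    """Look up a character by its Unicode name, e.g. ("P", "A") gives
    CANADIAN SYLLABICS PA and ("P", "") gives the bare final P."""
    return unicodedata.lookup(f"CANADIAN SYLLABICS {consonant}{vowel}")


# One base shape, four orientations: one per following vowel.
p_series = {vowel: syllabic("P", vowel) for vowel in ("E", "I", "O", "A")}

# No vowel following: the small raised form, encoded as plain "P".
final_p = syllabic("P")

for vowel, ch in p_series.items():
    print(f"p + {vowel.lower()}: {ch}  U+{ord(ch):04X}")
print(f"p alone: {final_p}  U+{ord(final_p):04X}")
```

In a font that covers the block, the last character should appear as a small raised copy of the pa shape.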

Monday, July 16, 2007

Language endangerment in Yorkshire

Several members of the British Parliament took a few minutes out from worrying about issues like Iraq, the housing shortage, and global warming to put together an Early Day Motion expressing their concern about the fate of the Yorkshire dialect:
That this House is concerned at the recently published research indicating that words are disappearing from the Yorkshire dialect because of the influence of the internet, social mobility and globalisation; and furthermore supports the work of the Yorkshire Dialect Society in continuing to promote what is, after all, the best English regional accent in the world.
The amendments proposed are also worth a look, featuring such phrases as "a slow national convergence towards the monochrome mush of effete estuarial English". For what it's worth, I am rather inclined to agree that Yorkshire may have "the best English regional accent in the world" - although two MPs proposed to amend this to "after the Lancashire accent" - and I'm glad to see a bit of appreciation for dialectologists. But I find it difficult to be all that concerned about the loss of a few well-documented local words (supposedly due to people watching national media and getting out more) in a fairly widely spoken dialect of one of the world's most flourishing languages, when whole languages are disappearing virtually undocumented every month due to factors like children being kidnapped, or beaten when they speak their language.

Thursday, July 12, 2007

Harun ar-Rashid and the Golden Apples of the Hesperides

I recently heard a rather good folk tale from my father about the adventures of (a completely mythologised) Hārūn ar-Rashīd during his foreordained seven years of hardship, living as a poor man dressed in goatskin and nicknamed Bou-Krisha (بو كريشة). One element of the story fits nicely with the previous two posts' theme of cultural survivals from the classical era. The king gathers his sons-in-law and his would-be son-in-law Bou-Krisha, and tells them that he is terribly ill, and that to cure him they must go and bring him:
ət-təffaħ ən-nifuħ التّفّاح النّيفوح
əlli yṛədd əṛ-ṛuħ اللي يردّ الرّوح
m-əs-səb`a jbal مسّبعة جبال

the fragrant apple
that restores the soul
from the Seven Mountains
For the tale's purposes, of course, all that matters about this evocative phrase is that it refers to something that it will take a long and arduous quest to get. But the historically minded listener may be excused for speculating on the phrase's origin.

Etymologically, the phrase is mildly interesting. nifuħ is unexpected, and possibly distorted to fit the rhyme - a more normal term, with obvious Classical Arabic origins, would be nəffaħ; it might have arisen by contamination from əlli yfuħ “which smells” (especially since əlli in Kabyle is ənni). But it may be possible to look deeper.

Ceuta (Arabic səbta سبتة) is an ancient Moroccan port town at the edge of the Straits of Gibraltar which has been part of Spain since 1668. Its name derives from a longer Latin one - Septem Fratres, the Seven Brothers, said to be a reference to seven hills around the city; it was a wild area, among the last places in North Africa where elephants were found (as noted by Pliny). And the region around the Straits of Gibraltar is where the gardens of the Hesperides were supposed to be located - where the Golden Apples grew. Is ət-təffaħ ən-nifuħ one of the Golden Apples?

Friday, July 06, 2007

Berberised Afro-Latin speakers in Gafsa

One reader of my last post asked how late Latin (or some descendant thereof) continued to be spoken in North Africa. The answer is, pretty late: the latest attestation I came across on short notice seems to be in the major medieval geographer Al-Idrisi (12th century), who, describing Gafsa in southern Tunisia, notes that:
وأهلها متبربرون وأكثرهم يتكلّم باللسان اللطيني الإفريقي.
Its inhabitants are Berberised, and most of them speak the African Latin tongue.
He even gives one word of their dialect:
ولها في وسطها العين المسماة بالطرميد.
In the middle of the town is a spring called the ṭarmīd (perhaps to be related to Latin thermae).
One interesting thing to note about this statement is that he describes the town as Berberised - in other words, in the very century when the Banū Hilāl were rapidly spreading through Tunisia and Libya (a subject he has fairly harsh things to say about), Berber culture was prestigious enough to be adopted by members of other cultures in the area, in particular the remaining Roman or Romanised towns. Gafsa, of course, speaks Arabic now, but several nearby villages still spoke Berber in the 1800s, and two, Sened and Majoura, well into the 1900s.

Wednesday, July 04, 2007

Chenanith b'Libya - in the 11th century AD?

Anyone interested in North African languages who doesn't speak Dutch should immediately check out Bulbul's posting on Latino-Punic. The Phoenicians brought their language with them to North Africa when they founded Carthage and other cities. Carthage was destroyed, of course, but many other cities continued to speak Phoenician for longer; however, like Arabic in more recent times, it changed a lot under Berber influence, and this later dialect is usually called Punic. This language was spoken by St. Augustine, who quotes a number of Phoenician words, such as salus (< shalu:sh < shalo:sh < shala:sh < thala:th) "three", in his works. In western Libya (Tripolitania), as it happens, Punic continued to be written even after the Phoenician alphabet was forgotten; this body of inscriptions, using the Latin alphabet to write Punic, is called (logically enough) Latino-Punic, and a comprehensive database of such inscriptions is available from Leiden. Recently, as Bulbul points out, a thesis was submitted at Leiden on Latino-Punic and its Linguistic Environment; I would love to read it.

The twist in this tale is that Phoenician may have survived into the 11th century AD! Al-Bakri (whom I've mentioned before) enigmatically says of the inhabitants of Sirt in Libya that:
لهم كلام يراطنون به ليس بعربي ولا عجمي ولا بربري ولا قبطي ولا يعرفه غيرهم
They have a speech in which they jabber which is neither Arabic nor Ajami (by which he probably means Latin but might mean Persian) nor Berber nor Coptic, which no one but them knows.
The location (in eastern Tripolitania) is about right for it to be Punic, and if it were Greek you would expect him to know, considering he cites (more or less correctly) the Greek etymology of طرابلس (Tripoli) on the next page. So was Punic still spoken in the 11th century? Your guess is as good as mine, but it looks plausible.

Tuesday, July 03, 2007

Galileo's sociolinguistics and free software

Just came across an interesting quote from a law professor in the free software movement on Europe's shift away from diglossia:
[W]ith the name Galileo Galilei, we associate two of the most important cultural responses to the quandary of possessed physics.

The first is an insistence upon freedom from censorship, that is "e pur si muove" -- determination to prohibit the ownership of physics by an entity rich enough and powerful enough to define its physics as the only permissible physics, the only available physics, for most ordinary people. And second, the first significant attempt in the history of the West to write scientific literature at the state of the art in a vernacular language, accessible to everyone.

Galileo Galilei's decision to publish in Italian is as important as his decision to risk confrontation with the Church, for what it says about the fundamental pillars of free science in the history of the West. Not merely, in other words, an insistence upon the freedom of ideas to work their will in skilled hands, but a determination that the ideas which motivate the world, which explain its behavior and which render it controllable, should be universally accessible to people regardless of their ability to acquire enough social surplus to have Latin.
I'm not sure whether the details of his account are accurate, but this has always been one of the strongest arguments against diglossia. The availability of universal free education goes a long way toward mitigating the problem; but it's still not cost-free, since all the time devoted to learning the high language is time that could have been devoted to learning something else. (Of course, there are also practical issues regarding the quality of teaching provided - but that's another story.)