Sunday, February 19, 2017

A real-life subjacency problem sentence

There are some kinds of questions and relative clauses that you just can't form without resorting to a resumptive pronoun, even in languages - like English - that otherwise don't allow resumptive pronouns to begin with. Ever since Ross (1967) came up with a typology of "island constraints", syntacticians have hotly debated both which ones these are and how to account for them.

Unfortunately, real-life examples of people trying to say such things are very thin on the ground. As a result, discussion of this phenomenon tends to be dominated by artificial examples. Much of the literature on subjacency inadvertently demonstrates how unsatisfactory the result can be (as discussed here: 1, 2). Every once in a long while, however, you find a completely spontaneous case of someone running up against such constraints - and here's today's, courtesy of some person on Reddit:

Step zero: find a couple million complete and utter morons, who it's a miracle they can breathe in and out without f***ing it up, to support you.

Normally, a relative clause starting in "who" would have no overt subject within the clause itself apart from "who", as in:

Step zero: find a couple million complete and utter morons, who in all honesty Ø can barely breathe in and out without f***ing it up, to support you.

But that's impossible here: note the ungrammaticality of:

*Step zero: find a couple million complete and utter morons, who it's a miracle Ø can breathe in and out without f***ing it up, to support you.

Instead, you end up having to fill the subject position to which "who" refers with a resumptive pronoun "they".

Thursday, February 09, 2017

Romance languages in 17th century North Africa

In 1609, 117 years after conquering Granada, the Spanish state decreed the expulsion of all "Moriscos" - that is, everyone descended from Muslims forcibly converted to Christianity, numbering in the hundreds of thousands. In the 1720s, over a century later, two separate travellers - Jean-André Peyssonel and Francisco Ximenez - found that a number of towns in Tunisia, including Testour, Bizerte, and Tebourba, were Spanish-speaking, inhabited by the descendants of these refugees (as I was surprised to learn from Vincent 2004). According to Peyssonel, for example, "the inhabitants of Tebourba practically all speak Spanish there, a language which they have conserved from father to son"; referring to the same town, Ximenez adds "immediately after their arrival from Spain, they had schools in our language. They were insultingly told they were not real Moors, and the Bey took away their books and their schools; after that, they little by little forgot Spanish and learnt Arabic." All in all, the reports seem compatible with a three-generation pattern of language shift: the people they met still spoke Spanish, but were mostly unlikely to pass it on to their children, as they became more closely integrated into the wider society of their new home.

In 1627, a couple of decades after the expulsion of the Moriscos, a corsair ship from Algiers raided Iceland, capturing a couple of hundred unfortunate villagers, one of whom left a description of his experiences. While the distance travelled in this raid was unusual, the practice itself was less so: the capitals of the Barbary states were full of European slaves captured by state-sponsored pirates, waiting for ransoms that might never come. Likewise, many North Africans were captured and held as slaves in Europe (see eg Wettinger 2002 on Malta): describing Algiers in 1612, Diego de Haedo comments that "there are many Muslims who have been captives in Spain, Italy and France" and hence speak those countries' languages (Vincent 2004:107). To further complicate matters, not all immigration from Europe was involuntary: Haedo adds that "There are also an infinite number of renegades [converts to Islam] from these countries and a large number of Jews who have been there, who speak polished Spanish, French, or Italian. The same holds for all the children of renegades who, having learned their national language from their parents, speak it as well as those born in Spain or in Italy."

In brief, 17th-century North Africa contained plenty of European immigrants - some refugees, some captives, and even some voluntary migrants - learning the language spoken around them while maintaining, for a while, the language they had arrived with. What impact did this have on Maghrebi Arabic and Berber? Unfortunately, it's not easy to date Romance loans into either, but we can safely assume that some of the precolonial loans arrived in this period. A good dialect map, in combination with historical data on where these groups ended up, might help identify such loans more precisely - but no such map really exists yet, except to some extent for Morocco (Heath 2002).


Vincent, Bernard. 2004. In Jocelyne Dakhlia (ed.), Trames de langues. Usages et métissages linguistiques dans l’histoire du Maghreb. Tunis-Paris: IRMC, Maisonneuve & Larose. 561 p.

Saturday, February 04, 2017

Why the sun really does rise

In response to someone comparing "alternative facts" to science fiction, the eminent science fiction writer Ursula Le Guin recently wrote:
The test of a fact is that it simply is so - it has no "alternative." The sun rises in the east. To pretend the sun can rise in the west is a fiction, to claim that it does so as fact (or "alternative fact") is a lie.
The comments (never read the comments!) include several people trying to be smart by pointing out that, actually, "the truth of the matter is that the sun does not rise, but rather that the Earth turns". This apparent conflict is worth unpacking from a descriptive linguistic perspective.

All fluent speakers of English use phrases like "The sun rises in the east". They also use phrases like "Hot air rises." The commenter quoted previously seems to be applying something like the following reasoning:

  • When something (eg hot air) rises, it moves upwards away from the earth.
  • When the sun "rises", it's not moving upwards away from the earth - rather, the earth is turning relative to it.
  • Therefore, the sun does not actually rise.
A lexicographer will immediately see at least one ironclad way to vitiate such an argument: identify two distinct senses for "rise". Rise1 means "to move upward away from the ground", while rise2 means "for a celestial body's apparent position to come closer to the zenith" (or something along those lines). The sun rises2, but it doesn't rise1.

But not so fast! It's perfectly plausible that someone could believe the earth is stationary and the sun physically moves upwards when it rises. For someone holding that belief (or even just using that mental model without necessarily believing it), "rise" could easily have a single sense, not two different ones. Is there any language-internal evidence that "rise" has two senses?

As it happens, there is: look at antonyms. We say "The sun sets in the west", but "Hot air sinks" (and "Empires fall", but that's another story); you can't say "*Hot air sets". "Set" is the antonym of rise2, but not of rise1. That seems like a pretty good reason to assume that, even for flat-earther speakers of English, the two senses are lexically distinct. So it looks like Ursula Le Guin wins this one, as you might expect.

Wednesday, January 25, 2017

Tigre between ejectives and pharyngealization

There is some debate over the original pronunciation of the "emphatic" consonants (Arabic ط ض ظ ص ق) in Semitic and more generally in Afroasiatic: were they ejective as in Amharic, or pharyngealized/uvular as in Arabic? For a number of reasons, such as that in proto-Semitic they did not show a voicing contrast, the general opinion is that they were glottalized. Yet pharyngealized consonants show up not just in Arabic and neo-Aramaic but even in Berber, which would on the face of it suggest that the feature predates proto-Semitic. Either we have to suppose independent parallel development, or we must assume that Berber ejectives turned into pharyngealized consonants under the influence of Arabic. The latter seems more probable, but only if we can show that it is indeed plausible for a language to make such a change as a result of widespread bilingualism in Arabic.

It turns out that Tigre, the main language of northern Eritrea, offers a concrete example of just that. The inland plateau dialect of the Mansa`, commonly considered as standard, is described by Raz (1983) as having four ejectives k' (usually [ʔ]), t', s', and č̣, and no pharyngealized or uvular consonants. You can hear an example of standard Tigre here, which seems consistent with his description. The coastal Hirgigo dialect spoken around Massawa, however, as heard in these Learn Tigre YouTube videos, shows a rather different situation. ḳ is simply [q] (as in "elbow", "neck", "thigh"), ṭ is [tˤ] (as in "goat"), ṣ is [sˤ] (as in "white", "black", "back"); only for č̣ can you occasionally hear a slightly ejective realization [tʃ] ~ [tʃ'] (as in "fingers" or "fingernails"). The result is a good deal easier for an Arabic speaker to pronounce! This should not be too surprising: the port of Massawa has had extensive contact with Arabic speakers for many centuries. In fact, it's said to be the place where some of the first Muslims, seeking refuge from the persecution they were suffering in Mecca, landed on their way to the Abyssinian court. Such a diversity of emphatic consonant realizations within a single language confirms in turn that it is plausible for the habit of pharyngealizing emphatic consonants to be transferred from a language to its neighbors.

Saturday, January 21, 2017

Semitic languages in two Arabic novels

I've been reading two novels in Arabic lately. Frankenstein in Baghdad, by Ahmad Saadawi, reimagines Baghdad's descent into chaos in the mid-2000s, blending gritty realism with semi-allegorical horror. Samraweet, by Hajji Jaber, is an altogether gentler but still cutting narrative of the Eritrean diaspora, interleaving scenes from the narrator's life in Jeddah with ones from his first visit to Asmara as he gradually realizes the difficulty of being part of either place. Both turned out to share a feature I hadn't been expecting to find: dialogue in other Semitic languages.

In Frankenstein in Baghdad, one of the main characters is an elderly Assyrian woman, Elishawa "Umm Daniel". All her relatives have long since moved abroad, and keep begging her to come live with them where it's safe, so there are few occasions for her to speak anything but Arabic. However, the Assyrians of northern Iraq traditionally speak a variety of Neo-Aramaic, and when she meets her grandson from Melbourne, they have the following fairly elementary conversation (pp. 276-277), which I hope I've transcribed correctly:

"داخي إيوَت؟" (Dāx īwat?) "How are you?"
"سباي إيْوَن باسيما" (Spāy īwan basīmā) "I am fine, thanks."

The author of the book seems to be from southern Iraq, so I found it remarkable that he took the trouble to get some Neo-Aramaic dialogue - especially since the copula is appropriately put in the feminine form both times (in Assyrian Neo-Aramaic, even the 1st person singular copula agrees in gender). Probably he felt it would enhance her symbolic status as a reminder of what Iraq once was. Unfortunately, while Aramaic has been spoken in Iraq for almost three millennia, its prospects there are dim: after all these years of war and frequent persecution, most speakers live in Western cities, and unless they're exceptionally good at remaining a distinct, cohesive immigrant group, their descendants seem more likely to speak English or Swedish than Aramaic.

In Eritrea, unlike Iraq, most people have as their first language a Semitic language other than Arabic: Tigre in the north, Tigrinya in the south. So it was less surprising to find an Asmaran waiter on the first page saying "سنّي ما سيام" (?Senni mā syām), which I assume from context means something like "Good afternoon!" However, the occasional glimpses provided into Eritrean sociolinguistics were more eye-opening. The narrator and most of his friends are from a Tigre-speaking background and know how to speak it, but Tigre per se seems to play little part in their linguistic identity. They grew up not only speaking Arabic in the street, but feeling that Arabic is an Eritrean national language, and resenting the government's treatment of it as less central than Tigrinya. When an Eritrean in Jeddah speaks Tigre with him, the narrator assumes it's because he only arrived recently until he finds out, to his surprise, that this person simply "enjoys speaking it, even in Jeddah" (p. 76). It would be interesting to see how this compares to the attitudes of Tigre speakers living in Eritrea: between the prestige of Arabic and the status of Tigrinya, what are the long-term prospects for Tigre?

Saturday, January 07, 2017

Of words and pens

In Algerian Arabic, this is a stilu ستيلو - a word instantly recognizable as a borrowing from French stylo:

In Standard Arabic, on the other hand, as any Algerian learns in primary school, it's a qalam قَلَمٌ. This, as it happens, may also be a borrowing, though a much older one; compare ancient Greek kálamos κάλαμος "reed, reed-pen", which apparently has an Indo-European etymology. Clearly, either pre-modern Algerians were so sunk in illiteracy as to have forgotten the word for a pen altogether, or they replaced a pre-existing word for pen with a French borrowing - right?

Well, no. In the Middle Ages, there weren't too many fountain pens or biros around. Classical Arabic qalam referred to something more like these:

Any Algerian who went to Qur'anic school up to the 1960s or so will remember this - a simple reed pen anyone can make using nothing more complicated than a sharp knife. (The Algerian version was a bit different than those in the picture, as it happens - usually people would use a quarter-circumference of a large reed, not the whole circumference of a small one.) More than that, they will remember what it's called: qləm قلم. There are probably people in Algeria who still use these, and very likely they still call them that.

But no one calls a modern industrial pen qləm. When industrial pens were introduced, sometime in the 19th century, ordinary Algerians ended up classing them as a new object, quite distinct from the reed pen despite its similar function, and deserving of an unrelated name. The guardians of Standard Arabic, on the other hand, decided to extend the reference of qalam to cover both. It may be no coincidence that French distinguishes calame from stylo, like Algerian Arabic, whereas English, like Standard Arabic, treats both as different types of pen.

Historical linguists regularly use lexical reconstruction to shed light on technological history, an approach called "Wörter und Sachen". This approach has been very fruitful in many cases. But, as this case illustrates, there are some pitfalls to watch out for: whether something counts as the same object or as a new one is a rather culture-bound question, and if investigators impose their own ideas about this on the situation they are investigating, they will get the wrong answer.

Tuesday, December 27, 2016

Too strong to get out

At four, my nephew speaks English (his dominant language) very well. He still shows some interesting divergences from the standard of those around him, though. Some are influenced by German (a close second): he uses "mine" as a determiner in English (like German "mein") rather than "my", saying things like "mine house". Others seem to result from language-internal overgeneralization, as when he said:
  • If I push the Lego box then the carpet will destroy. [intended meaning: be destroyed]
Presumably, he's interpreted "destroy" as a labile verb, like "open" or "burn".

At first blush, I thought the following sentence was another example of overgeneralization:

  • I'm too strong to get out, so you can't. [intended meaning: I'm too strong for anyone to get me out]
However, reflection suggests that this ought to be perfectly grammatical in English, since "get out" is already labile. "This stump is too heavy to pull out" works fine, so why not "I'm too strong to get out"? Yet, for me at least, the clause immediately receives a pragmatically absurd interpretation with "I" as the subject of "get out", and the obviously intended interpretation is barely accessible even when I've consciously concluded that it should be grammatically acceptable.

In terms of the classic Chomskyan analysis of control, the two interpretations correspond to different unpronounced pronouns PRO:

  1. I_i'm too strong [PRO_i to get out]
  2. I_i'm too strong [PRO_arb to get PRO_i out]
A lot of linguists really dislike the idea of an unpronounced pronoun. Whatever its psychological merits, though, this analysis has the advantage of suggesting why the first interpretation comes more easily than the second here: it only involves one empty pronoun, whereas the desired interpretation needs two. So if anything is going wrong in this sentence, it's not so much the syntax as the pragmatics: an adult speaker might be more aware that listeners could have trouble processing a clause of this form, and avoid it in favour of something less ambiguous. That would need empirical checking, though.