Sunday, May 31, 2015

Support your local linguistic olympiad

Dear linguists of the world, 

My name is Hedvig Skirgård and I’m a member of the board of the International Linguistics Olympiad (IOL) and a PhD student in linguistics at ANU, Canberra. I’d like to tell you about linguistic olympiads and invite you all, as linguists, to help create interesting linguistic puzzles for our participants and help spread linguistics to the youth of the world. You can contact us through this form.

Linguistic Olympiads are arranged all over the world and started back in 1965 in Moscow. The goal is to make secondary school students interested in linguistics and languages by solving linguistic problems and getting training in linguistics. The contests look quite different in the different countries, but every year we come together for the international contest. The problems our participants solve in the contests are based on real phenomena in the world of languages and linguistics, and come from all areas of our field of study. The participants are secondary school students, so we assume no prior explicit knowledge of linguistics, but the problems are not entirely based on logic either. We are particularly interested in puzzles based on lesser-known languages. I highly encourage you to look at some examples at

The YouTube channel NativLang very sweetly made a short video introducing the contest; you can watch it here:

The contest is very popular and growing all the time, and it attracts very passionate young people who are interested in languages and linguistics. They broaden their horizons not only by meeting each other, but also by facing interesting phenomena in actual languages from all over the world.

What many of these contests need are more ideas for problems, further help creating the problems and sometimes also help in linguistics training for the participants. This is where you as linguists can be helpful, without necessarily involving yourselves in the entire organisation of the contests. If you have any suggestions or would be able to assist as an advisor to a local contest, please contact the IOL via this form and we will relay correspondence.

There are stable organisations for linguistic olympiads in several countries, but we also have some that are just starting up or that have just begun correspondence with us about entering. These are the countries that have an IOL-accredited linguistic olympiad: Australia, Brazil, Bulgaria, Canada, China, Czech Republic, Estonia, Hungary, India, Ireland, Isle of Man, Latvia, Netherlands, Pakistan, Poland, Romania, Russia, Slovenia, South Korea, Spain, Sweden, UK, Ukraine and the USA.

These are the countries that have had contact with us, but have not yet been accredited fully: Azerbaijan, Bangladesh, Denmark, Egypt, Finland, France, Germany, Greece, Guatemala, Iran, Israel, Italy, Japan, Kazakhstan, Kenya, Lithuania, Mexico, Nigeria, Norway, Serbia, Singapore, Turkey, Turkmenistan, Uganda, United Arab Emirates, Uzbekistan and Vietnam.

Countries that share a language can collaborate. The English-language contests currently do, and this is of course also possible for the countries of the Francophonie and other sets of countries that share a language or have closely related languages. This is entirely up to the local contests.

You are of course also most welcome to contact us regarding starting up a contest in a country that has not yet been in contact with us.

Feel free to spread this message wherever you think is useful.

All the best, 

p.s. Interesting information about the linguistic diversity of the contest itself: in the national contests participants typically only compete in one language, however in the international contest participants compete in different languages (i.e. receive the problem set in different languages). This year there will be 16 (Bulgarian, Czech, Dutch, English, Estonian, French, Hungarian, Japanese, Latvian, Mandarin, Polish, Portuguese, Romanian, Russian, Slovenian, Spanish, Swedish and Ukrainian). Participants are not required to speak English, however we need at least one of the accompanying adult team leaders to know English or the major language of the country where the IOL is currently being held. If you’d like to know more about this, please visit this blog post.

Thursday, May 28, 2015

Excellent educational video about language history/evolution

The American Museum of Natural History has created a video series about their collections called Shelf Life. In the most recent episode, the 7th, they deal with language history: the trees that languages form and new methods for investigating them that are borrowed from computational biology. They focus particularly on the Uto-Aztecan language family of North America and, using very pedagogical illustrations, explain among other things edit-distance in trees. This is great; if you're interested in historical/evolutionary linguistics you should tots watch it.

Here below is the actual video, you can also get more material related to the episode here.

I personally am a big fan of museums making their collection of knowledge more accessible to the public, and as a fan of the Brain Scoop and the Field Museum, I was of course particularly delighted that the American Museum of Natural History has decided to create this series. It seems great and I hope you'll like it too!

If you feel like you haven't had enough of nicely illustrated videos on language history/evolution, might I suggest this one from TEDed?

Of course, both of these videos leave important things out (how to deal with contact in a tree-model, what justifies labelling some things as stable, what kind of primary data is actually used, etc), but as introductions I think they do an unusually excellent job!

Tuesday, May 26, 2015

Best Linguist Meme Ever (if I may say so myself)

I must admit, I'm inappropriately satisfied with this meme from the previous post, so I'll repost it in its own post so it can be spread more easily.

When I have to deal with irrealis

Irrealis is a grammatical term that covers a range of functions that are in the non-realised, imagined sphere of events, states and actions, such as future, negation, subjunctive, optative etc. It's most often described as a mood or modality, but not always. (This is related to whether future is a tense or a mood.) Typically, if the marker in question only marks future it is not called "irrealis" but "future" or "non-past" if applicable, but it might be called "irrealis" because neighbouring languages use a cognate to encode other irrealisy functions and such is the convention in that descriptive tradition (which does make sense). And of course, irrealis, future and related functions tend to grammaticalise into each other over time, and create nice fuzzy stages that we linguists can tear our hair out over (finally admitting to ourselves that while we love natural languages, they also drive us crazy).

Basically, there are a lot of different apples that go into this basket, and when you're reading a description, or part of one, it is often hard to know exactly what the author(s) intend with the term. I've put two articles below that I HIGHLY recommend reading if you ever want to work with the term "irrealis".

It's not a totally useless term, there is something funky and interesting going on in the imagined, not realised sphere, as de Haan (2012) also argues. However, under-defined terms are not contributing to our understanding of this, especially when they're applied cross-linguistically without care. Conflicting and confusing terms are actually preferable to under-defined terms. I prefer a diversity of lots of contradicting terms any day over not knowing what the authors meant, obviously.

We need to recognise the diversity that also exists in linguistic descriptive traditions and conventions - understand, accept and make it explicit. More bottom-up descriptions and more explicit definitions. Cross-linguistic standardisation of terms used directly on specific languages without thorough critique and investigation is not the answer - it will only make matters worse.

The issue is not with languages themselves; they're perfectly natural and awesome as they are (by definition). The issue is with our frames of understanding and categories - and when they're rigid and/or under-defined. We don't need a standardised set of terms in order to do description and comparison - we need explicitly motivated and argued points with enough supporting evidence.

Sorry for being ranty, I hope you can sympathise or have patience. 

I think it's time to update this old meme, here's a new improved version:

Like I said, I highly recommend reading these two articles:

  • Exter, Mats (2012) ‘Realis’ and ‘irrealis’ in Wogeo: A valid category? in Nicholas Evans and Marian Klamer (eds) Melanesian Languages on the Edge of Asia: Challenges for the 21st Century. Language Documentation & Conservation Special Publication No. 5, pp. 174–190 (free PDF here)
  • Haan, Ferdinand de. 2012. Irrealis: Fact or fiction? Language Sciences 34. 107–130. (free PDF here)

Over and out, for realz.

Monday, May 25, 2015

Help us make our game on linguistic diversity more linguistically diverse (it's easy!)

Dear HWRG-readers.

I'm involved in creating a game about linguistic diversity within the Language in Interaction consortium. We'd like to make the game itself available in as many languages as possible. If you'd like, you can help us do that by translating a few phrases into a language that you master. You can do so here.

The game is called "LingQuest". You listen to several recordings of people talking and have to match the ones who are speaking the same language. It’s quite similar to the Great Language Game by Lars Yencken, but we’re also using many lesser-known languages from the DOBES archive.

Here’s a screenshot of the App in development:

We'd like for as many people as possible to be able to play the game, irrespective of whether they know English or not. A lot of information today is primarily available in English; it's the working language of most conferences, publications and blogs of linguistics - also those on linguistic diversity. This is understandable since we'd like to make the scientific debate open to as many as possible and English is the most widely spoken language in all parts of Academia (with French, German and a few other languages dominating certain areas). (If you're interested in how many languages linguists speak, see this post here.)

However, it is not necessary that our game and other experiments in linguistics are restricted in this way (nor the linguistic olympiad). This is why we're working on being able to provide our game in many different languages, and for that we'd like to ask for your help! Could you help us translate a few phrases into a language that you know so that the game can be played by people who know that language? We've tried to cut down on the number of phrases needed for the game to work, so hopefully this shouldn't take up much time. We'd be very thankful for your help. Of course we'd prefer it if you translate to languages you are very fluent in, preferably your native language(s). You can indicate your skill level in the form.

Again: thanks,

p.s. We will not at this time be able to include non-spoken languages. Perhaps in future we can figure out a way to accommodate all modalities of language, but right now we regretfully cannot do this.

Monday, May 18, 2015

When someone lumps together all linguists as belonging to the same theoretical school

You know when someone who's not in linguistics thinks of all linguists as belonging to the same theoretical school of thought? Isn't it frustrating? It's just bad and annoying, as are all needless and incorrect generalisations.

It's also surprisingly common - in my experience especially in fields that overlap with linguistics like psychology and neurology. A good friend, Petter, sent me a link to a fun blog about rap. (I really liked it, go read it - it's good stuff). However, it also contained an example of this phenomenon:

In closed-off, inaccessible academic circles, linguist brohs openly hate on computer brohs for creating models of language that are based on “probability” and that don’t take into account the “underlying structure of language”.

This quote does not have to mean "all linguist bros", but it kinda seems like it. Not all linguist bros hate on probabilities. That is a major misunderstanding. Many of us are quite big fans of 'em even.

Hornstein might believe that 2/3 of linguists are generative, but well, I ain't got no way of checking that number but it would surprise me if it were true.

As a person who grew up in a department dominated by functional typology and fieldwork (Linguistics, Stockholm University), went to another similar department (Linguistics, University of Manitoba), later moved to the MPI in Nijmegen, which is also of similar persuasions, and now find myself at the Australian National University, which is also not generative-dominated but again rather more focused on typology and fieldwork... well, as such a person it becomes kind of odd when people assume we're all of the generative type.

Perhaps I grew up in a "shielded bubble", but if that's true then it's a pretty large bubble. I even had the opposite problem at times: I heard more that generative theory was bad than I heard about the actual content of the theories and what was good and useful, so I had to learn on my own when in my fourth year I finally had minimalism. (That does not mean I didn't know anything about syntax or grammatical theories before then; I knew plenty, I just hadn't done a generative model thoroughly. There is syntax without generativism.)

Again, I got no numbers on this and I get the feeling Hornstein might be thinking of the USA community only, but well. It's still annoying. Why would all linguists belong to the same school?


Thursday, May 7, 2015

The Closing Conference of the MPIEVA Linguistics Department - Diversity Linguistics: Retrospects and Prospects

This week was the closing conference of the linguistics department of the Max Planck Institute for Evolutionary Anthropology in Leipzig - Diversity Linguistics: Retrospects and Prospects. (Written about before with much excitement by Hedvig here and here.)

The talks were wonderful, presenting results of work done over many years.  This is a partial and idiosyncratic summary of my experience there, which has to leave out talks that I couldn't get to due to schedule clashes in the program.  They are listed here roughly by chronology, or by theme. You can see the list of all talks and access their abstracts here.  A set of photos of the conference is here.

Bernard Comrie summarized the achievements of the department over the past seventeen years, probably the main one being the World Atlas of Language Structures first proposed and built by Martin Haspelmath and David Gil.  This database set the precedent for linguistic databases that followed, including the ones that Russell Gray has now announced building at the new institute in Jena.

Comrie next to a slide of the new MPI in Jena

Work on language and genetics was summarized by Brigitte Pakendorf on male and female gene flow in Burkina Faso, Khoisan populations and northeastern Siberia.  One key result was that Y chromosome DNA was often not shared between unrelated populations, whereas mitochondrial DNA would be.  This suggests language contact in these cases has been due to female migration between groups.  The results support a wider generalization, that language families have often been brought by men (invading armies or farmers), and hence Y chromosome DNA often is more sharply distinguished between language families (Forster and Renfrew 2011).  One unusual result was from language contact in Siberia, where male migration had happened between unrelated populations, and a verbal paradigm had been borrowed across languages.  I wondered whether this could be a signal of relatedness between those languages rather than contact (it is sometimes possible for just morphology to survive in cases of language shift, rather than any vocabulary).

Paul Sidwell presented his family tree of Austro-Asiatic using many years of vocabulary data collection (all freely available to people who email him) and phylogenetic analysis, supported by Russell Gray and Simon Greenhill.  

Søren Wichmann used ASJP data to show language families and track likely migration routes, including comparing rates of migration through different ecological environments.

Søren later composing at the piano

      Michael Cysouw presented his database of over 1700 Bible translations, and the innovative comparative work which can be done using them, such as how apparently simple words such as ‘husband’ or ‘wood’ can vary across Germanic languages in the way that they are used.  The Bible translations are a rich resource, but much of the work needed to exploit them still needs to be done, such as developing algorithms for finding word forms in morphologically complex languages.

Amina Mettouchi presented her corpus of annotated one-hour conversations in thirteen Afro-Asiatic languages, a more sophisticated means of language comparison that is also being pushed by the MPI in Nijmegen.

 Patience Epps presented her data on vocabulary in South America, showing striking patterns of how populations have interacted.  Terms for drugs and alcohol had travelled the furthest, and then terms for bird species, which are often onomatopoeic.

Bernd Kortmann talked on the World Atlas of Varieties of English, showing grammatical differences and the influence of other languages.

Balthasar Bickel used grammatical properties from the AUTOTYP database to show large areas of the world that have historical connections.  The main innovation was the new control for language relatedness using phylogenetic methods (although a simplified version which does not take the family structure into account); and the main result was a striking map which showed the whole of Eurasia as a different colour from Africa and the Pacific, potentially showing deep-time connections between families across the entire continent.  I think some of this appearance of homogeneity might be an artifact of the way that he assigned colours to entire families (e.g. the whole of Indo-European received one colour, if I followed correctly), but the approach is interesting.

 Harald Hammarström presented his exhaustive analysis of basic word order: literally every language in the world has now been checked for its basic word order (around 5200 data points, 82 languages with no access to information, and the remaining known to have no data on word order).  He also quantified the roles of inheritance, universal tendencies and language contact, with intriguing results although perhaps underestimating the role that language contact played, as well as language relatedness beyond known language families.

      A more purely linguistic question was the problematic notion of ‘affix’, discussed first by Matthew Dryer using a new survey of a few hundred grammars, and by Susanne Maria Michaelis who showed the inconsistency of this term when describing creole languages in the APiCS database.

A couple of talks I went to had brilliant hypotheses even if the approach was flawed.  David Gil presented the craziest set of experiments that I’ve ever seen, with some photos of him carrying them out on his laptop with people from the street in Naples to hunter gatherer communities in Papua New Guinea and South Africa.  His claim was that some languages allow flexibility in the interpretation of sentences, such as allowing the (in English) nonsensical sentence

‘The clown is drinking the book’

to be interpreted as meaning

‘The clown is drinking while reading the book’.  

In English this interpretation is not possible, but he claimed that many languages do allow this flexibility of interpretation – in his terms, less strict compositional semantics.  He would show a participant two pictures (a clown drinking while reading a book, and an unrelated picture), and ask the participant to choose which picture matched the sentence, or whether neither or both of them did.  An English speaker would choose the option ‘neither’.  Speakers of other languages would often choose the picture of the clown drinking while reading a book.  Gil tested speakers of Khoe languages, Italian dialects, Japanese, Cantonese, and various languages in Indonesia.  

The results were striking and intriguing (in Paul Heggarty’s words) – languages tended to be more flexible in interpretation when they were non-national languages, such as Khoe languages; or a little less flexible if they were a non-standard variety of the national language (such as Neapolitan Italian); and even less flexible if they were a national language, such as Italian.  Gil proposed that languages have been culturally evolving to be more strict in their rules for compositionality, and that this is how languages evolved at an even greater time-depth from the first human proto-languages.  Paul Heggarty pointed out in the questions that the results might mean that languages vary in how prescriptive they are, rather than how semantically flexible they are.

Others pointed out that the experiments were badly designed in that they did not reflect actual usage by speakers.  As he said in response, his hypothesis is in fact testable using conversational corpora, such as the collection of conversations in Nijmegen of languages from Papua New Guinea, South Africa, Mexico and South America; if he is right, the smaller the language community, the more likely there is to be flexibility in the interpretation of sentence meaning.

A similarly intriguing but controversial talk was by Johanna Nichols, presenting work arguing for ancient migrations of a population in eastern Eurasia (the ‘Pacific Rim’) westwards, influencing every other language family in Eurasia.  She presented striking correlations between various typological properties and longitude - the further east you go in Eurasia, the more likely that you have a particular structure.  However, these correlations may arise in some cases from non-independence of related languages: you can get a correlation between Indo-European properties and longitude, for example, if you are not controlling for language relatedness (or perhaps even if you are, given that Indo-European properties can travel by local instances of language contact such as with Basque, Dravidian and Finno-Ugric).  As with David Gil’s talk, I see her ideas as intriguing hypotheses that need to be worked on and tested more rigorously.
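The worry about non-independence can be made concrete with a toy calculation. The sketch below uses invented numbers (the family labels, longitudes and feature values are all hypothetical, not data from the talk) and compares a correlation computed over individual languages with one computed over a single averaged data point per family. Averaging per family is only one crude way of controlling for relatedness; it is an assumption of this sketch, not the method used in any of the talks.

```python
from collections import defaultdict
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: (family, longitude in degrees east, feature present? 1/0).
# Families A and B are large and internally uniform; C and D have one member.
LANGUAGES = [
    ("A", 5, 0), ("A", 10, 0), ("A", 15, 0), ("A", 20, 0),
    ("B", 100, 1), ("B", 105, 1), ("B", 110, 1), ("B", 115, 1),
    ("C", 60, 0),
    ("D", 70, 1),
]


def family_means(languages):
    """Collapse each family to a single (mean longitude, mean feature) point."""
    grouped = defaultdict(list)
    for family, lon, feature in languages:
        grouped[family].append((lon, feature))
    return {
        family: (mean(lon for lon, _ in rows), mean(f for _, f in rows))
        for family, rows in grouped.items()
    }


# Treating every language as an independent data point.
r_languages = pearson([lon for _, lon, _ in LANGUAGES],
                      [f for _, _, f in LANGUAGES])

# Treating every family as one data point.
collapsed = family_means(LANGUAGES)
r_families = pearson([lon for lon, _ in collapsed.values()],
                     [f for _, f in collapsed.values()])

print(f"languages: n={len(LANGUAGES)}, r={r_languages:.2f}")
print(f"families:  n={len(collapsed)}, r={r_families:.2f}")
```

In this toy set, the ten language-level points come from only four families, and two large families supply most of the apparent signal. A correlation resting on four independent points is much weaker evidence than the language-level figure suggests, which is the concern raised about the longitude correlations.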

Russell Gray gave the closing talk, ‘Think Big!  The Bright Future of Linguistics.’  In it he described methods of dating the spread of Indo-European, after the renewed discussion in two papers published in Nature and Language in January, and new possible explanations for patterns of language diversity in Australia.  He announced plans at his new institute in Jena to build large databases of vocabulary and language structures (the plan is to code 3000 languages for 200 structural properties), under the general name ‘GlottoBank’.  He challenged the audience to think of the ‘Hilbert problems’ of linguistics (after the mathematician David Hilbert, and the non-related linguist Martin Hilpert who presented at the SLE on this topic, covered by Hedvig here); and covered other exciting projects such as fieldwork in Vanuatu and incipient work on the question of why some places in the world are more linguistically diverse than others.  He also announced the completion of two new databases, one on Pacific religions, and one on cultural traits in 1300 populations worldwide, which will be online later this month.

Three days ago there happened to be an opinion piece in the New York Times that gives a depressing picture of what academic conferences in the humanities are often like.  This conference was the exact opposite of everything that it lists.  Rooms were routinely packed, with people standing outside the door or sitting on steps, every seat filled.  Talks were clear and well presented (talks by Brigitte Pakendorf, Östen Dahl and David Gil were entertaining in particular).  Many people said that the atmosphere was like a reunion or a family gathering, and the discussions outside of talks were stimulating as well.  This may partly be because of the unusual context of the conference, but it is also because comparative linguistics has grown into an exciting field, with much of the current trend towards collaboration and building large databases inspired by the Linguistics department in Leipzig.  Farewell and thank you to them!

Friday, May 1, 2015

Goodies from grammar reading returns - also: wallabies!

I've noticed that there are way too few posts with Goodies from Grammar reading on this blog nowadays. When we posted only on Tumblr, that tag was quite common, but since we've moved to Blogger, we've only got one post tagged (goodiefromgrammar-tag). Let's do something about that; after all, we are humans who read grammars!

For newcomers: we're interested in systematic cross-linguistic comparison (aka linguistic typology). To get at some of the information needed for that kind of research, one needs to consult already existing descriptions of language, preferably grammars. Of the 7000+ languages in the world, more than 4000 have at least a grammar sketch or even a longer grammar. There's tons to read.

We're humans who read grammars because one could also have computers reading grammars or get similar information from comparing parallel texts. We're less interested in large languages like Swahili, Hindi, Spanish, Punjabi, Arabic and Russian and more interested in a diverse sample of smaller languages.  We're not (necessarily) reading grammars to learn the language ourselves, but to understand and study it appropriately.  There are special grammars called "reference grammars" that are written in special linguist-lingo for us to communicate clearly with each other. We sometimes use those, but for many languages without reference grammars, we have to use whatever there is.

So, this time I bring you two goodies from a conversational course (not a grammar per se, but still) on Gupapuyngu [guf, gupa1247]: one about a coping mechanism of a lot of adults and a second one about a desire to improve one's life. Gupapuyngu is a Pama-Nyungan language spoken in the Northern Territory of Australia. Ethnologue counts 300 native speakers and 950 second language speakers. The language is spoken by Yolŋu people, who are happy to talk to you here on the internet at this website.

Just like many other languages of Australia, this language is no stranger to nasals, [r]s and [l]s ^^! Underscored l and t mean that they're retroflex. I won't go through the other orthographic conventions; that's the main one for these two examples.

(1) Gupapuyngu (Lowe 1975: lesson 94, page 1)
Original: moŋalana ŋarra ŋunhiyinydja ŋunhi barpuru ŋali ga ŋarrtjunmirri
Translation: I have forgotten what (lit. that which) we were arguing (about) yesterday

(2) Gupapuyngu (Lowe 1975: lesson 99, page 2)
Original: ŋarra yurru marrtji Bälmalili märr (ga) wetimirrilili
Translation: I'm going/I'll go to Bälma because there are wallabies there (the adjective wetimirri 'having wallabies' describes the place Bälma and has the same suffix -lili. The idea seems to be "I'm going to Bälma because I want to go to a wallaby-inhabited place")

Below is a picture of a wallaby in the Gurian National Park in Arnhem Land in Australia's Northern Territory. For the curious, Bälma is a place close to the Koolatong river in Arnhem Land. You can see its exact position here.

Lowe, Beulah M. 1975. Gupapuyngu conversational course. 102 lessons. Galiwinku, N.T.: Adult Education Centre.