Creating infrastructure for linguistic theory and endangered languages

University of Surrey

Unit of Assessment

English Language and Literature

Summary Impact Type


Research Subject Area(s)

Language, Communication and Culture: Language Studies, Linguistics

Summary of the impact

Global linguistic diversity is under threat; the theoretical and descriptive work of the Surrey Morphology Group (SMG) raises public awareness of linguistic diversity and produces traditional and digital resources used and valued by communities where endangered languages are spoken.

There is growing recognition of the many reasons, scientific and social, why the loss of linguistic diversity matters. Here we report on our impact on different communities, focussing in particular on Archi, an endangered language of the highlands of Daghestan (Russia). Our Dictionary of Archi, with pictures of cultural artefacts, has changed perceptions of the cultural and social value of this small language, both for the speakers of Archi and for those of surrounding larger languages. In its digital version, our dictionary has brought Archi into global awareness.

Underpinning research

The Surrey Morphology Group (SMG), based at the University of Surrey, combines the investigation of grammatical categories in a broad sample of languages with the use of explicit formal and statistical frameworks for expressing typological and theoretical generalizations.

A key area of our theoretical work is 'possible words' and how morphology relates to syntax. Particularly challenging areas include 'agreement' and 'syncretism' (1). Our investigations of these two phenomena led to cross-linguistic databases and to maps for the World Atlas of Language Structures (2), both also available online. Other projects fed into this broader typological work. Languages were chosen primarily for their theoretical interest, but we did not miss opportunities for beneficial impact on the speaker communities.

In particular, Surrey's research on the language of the Archi people and the language of the Saanich community has not only extended the SMG's research base but also created real impact in the communities that use these languages.

Archi is a Daghestanian language of the Lezgic group spoken by about 1200 people in Daghestan. The language is characterised by a remarkable morphological system, with extremely large paradigms and irregularities at all levels.

In 2007 the Surrey Morphology Group completed a three-year project which resulted in an electronic Archi-Russian-English dictionary that has been released in different formats. There is a 410-page printed book version, mainly for use by the Archi community, with entries in Archi, Russian and English, and a web version with metadata in English, digital pictures of cultural objects and .mp3 sound files for those interested in the languages and cultures of the Caucasus. In addition, there are two types of DVD containing sound files for every word form of the lexeme (in .mp3 and .wav format), digital pictures of culturally significant objects, idioms and example sentences with interlinear glossing. Both DVD versions provide morphological information sufficient to produce the whole paradigm of the lexeme (4), also available online.

SENĆOŦEN is the language of the Saanich First Nations community, from the Saanich Peninsula of Vancouver Island, Canada, and neighbouring Gulf and San Juan Islands. Along with five closely related Northern Straits dialects, it belongs to the Salish language family (Central Salish branch). The Northern Straits language is one of 32 indigenous languages of British Columbia. SENĆOŦEN is a highly endangered variety of Straits Salish.

From 2008 onwards Surrey researchers undertook linguistic fieldwork with Saanich elders and shared examples and selected recordings from this fieldwork over the web. The examples are organised in a 3000-sentence database with orthographic and phonemic transcriptions, interlinear glossing and translations into English. Texts and associated recordings are available for use by linguists as well as members of the Saanich community.

The work of the SMG on Northwest Solomonic further illustrates the relationship between theoretical research and community impact. Key aspects of the grammars of a representative sample of Northwest Solomonic languages were identified and made available online, along with associated audio materials and dictionaries.

References to the research

1. Baerman, Matthew, Dunstan Brown and Greville G. Corbett. 2005. The syntax-morphology interface: a study of syncretism. Cambridge: Cambridge University Press, xix + 281pp.

2. Baerman, Matthew, Dunstan Brown and Greville G. Corbett. 2005. Five chapters in total in: Martin Haspelmath, Matthew Dryer, David Gil and Bernard Comrie (eds) World atlas of language structures. Oxford: Oxford University Press.

3. Corbett, Greville G. 2006. Agreement. Cambridge: Cambridge University Press.

4. Chumakina, Marina, Dunstan Brown, Greville G. Corbett and Harley Quilliam. 2008. Archi: A dictionary of the language of the Archi People, Daghestan, Caucasus, with sounds and pictures.

5. Corbett, Greville G. 2012. Features. Cambridge: Cambridge University Press.

6. Chumakina, Marina and Greville G. Corbett (eds). 2013. Periphrasis: The role of syntax and morphology in paradigms. Oxford: British Academy and Oxford University Press.


Details of the impact

Surrey's Archi dictionary and lexical database of SENĆOŦEN were designed with specific community outcomes in mind. The positive effect these projects have had on the language communities contributes to cultural enrichment through the preservation of cultural heritage.

Researchers at Surrey recognised that Archi community members would benefit from a Cyrillic-based orthography rather than the standard IPA-based orthography, and therefore produced a dictionary that would better suit their needs. This has had a major impact on the Archi community: for the first time they have Archi written in a familiar orthography, since the languages of instruction at school are Avar and Russian, both of which use Cyrillic. They now have a dictionary in which words for culturally salient artefacts are accompanied by pictures. The dictionary records irregular word forms which are being ousted by regular forms, and provides all the essential morphological information for each word, such as grammatical gender, a feature that young urban speakers of Archi have difficulty mastering (their other main languages, Avar and Russian, have three genders, whereas Archi has four).

The result is the first trilingual digital dictionary of a Daghestanian language, raising awareness of Archi both within Daghestan and throughout the world (c, d, e). At a local level, the orthography created has been used by some speakers to write stories in Archi for the first time.

Similarly, the project on SENĆOŦEN makes recordings available to a wider audience. We provide transcriptions of recorded dialogues and a 3000-sentence database with orthographic and phonemic transcriptions, interlinear glossing and translations into English. The database also contains lists of verbs and verb roots linked to the sentences, and is tagged for grammatical features using standard linguistic terminology. The recordings and database give members of the SENĆOŦEN speech community the chance to read and listen to their ancestral language as spoken by fluent speakers, and give linguists the opportunity to explore this typologically interesting language.

For speaker communities we can identify two specific kinds of impact: community skills development and cultural enrichment. For the broader public, our work has provided cultural enrichment through the production of materials for schools and through wider dissemination to non-specialist audiences.

Community skills development

  • The community uses the new practical orthography developed for Archi by the SMG and its collaborators. This was achieved as part of a community event in June 2007, when the school and key members of the community were presented with copies of the dictionary.
  • Influential members of the Archi community have learnt how to use both basic software and the specialized Sound Forge software for editing audio files.
  • The lexical database for SENĆOŦEN resulted from a project which ran during 2011. It was presented to community members on Vancouver Island in November 2011 and was well received with positive feedback in relation to language revitalization (a).

Community cultural enrichment

  • Our work on Archi has raised the esteem felt by the speakers, reinforcing their enthusiasm for their language (c).
  • Work on Northwest Solomonic was particularly welcomed by the local community, as demonstrated by the letter from the Rorovana Joint Council of Chiefs (b).

Public cultural enrichment

Complementing this work abroad, we have promoted the understanding of typology in the UK education system by producing materials on the topic for schools.

Several years of theoretical work provided the essential base for our field research on endangered languages, including Archi, Northwest Solomonic and SENĆOŦEN, complemented by a sustained relationship with the communities.

Speaker communities have benefited through community skills development and cultural enrichment. Complementing this community work, the University of Surrey has promoted the understanding of typology in the UK education system by producing materials on the topic for schools.

Sources to corroborate the impact

By its nature, the impact of this work is bottom-up. Minority communities rarely appear in the global media. Nevertheless, the evidence of impact is apparent in a range of sources, from individual letters and the views of a particular teacher to newspaper articles.

At the community level:

a) letter from a SENĆOŦEN community representative, 23 July 2012 (Provided statement)

b) letter from the Rorovana Joint Council of Chiefs, 6 February 2007 (Provided statement)

c) Teacher in the Archi village school (Contact details provided)

Evidence of impact on the broader public, in the region of investigation and within the UK and internationally, is seen in the following publications:

d) Article in the Russian magazine Russkij reporter (with a print run of 90,000 copies per week) which talks about Archi: Ольга Андреева, Сколько языков знает Россия: “Русский репортер” №9 (39), 2008. [Olga Andreeva, How many languages Russia speaks]

e) Article in the Avar newspaper Charada: АхIмад Шабанов, Англиядаса ГIалимчужу Арчий, ЧIарада, 19 июль 2008 [A. Shabanov, A researcher from England comes to Archi]

f) Reviews of WALS in newspapers, e.g. the Guardian (2 Aug 2005) and the Frankfurter Allgemeine Zeitung