UCREL (the University Research Centre for Computer Corpus Research on Language) has been pioneering advances in corpus linguistics for over 40 years, providing users with corpora (collections of written or spoken material) and the software to exploit them. Drawing together 8 researchers from the Department of Linguistics and English Language and 1 from the School of Computing and Communications at Lancaster University, it has enabled the UK English Language Teaching (ELT) industry to produce innovative materials which have helped the profitability and competitiveness of that industry, and assisted other, principally commercial, users to innovate in product design and development.
Worldwide impact on language learners and others has been generated by the development at Lancaster of a ground-breaking natural language processing tool (CLAWS4), and an associated unique collection of natural language data (the British National Corpus, or BNC). Some highlights selected from the primary impacts are as follows:
The pathways to impact have been primarily via consultancy and via licensing of software IP. The impact itself is largely on the language learners, i.e. users of products such as the above. There is a secondary economic impact on a UK SME which has licensed our software.
We used the research consolidated in the British Component of the International Corpus of English (ICE-GB) to build the Internet Grammar of English (IGE), a web-based introductory English grammar; and an app for smartphones and tablets, called the interactive Grammar of English (iGE). The app is based on the IGE website, but was fully updated with new materials and exercises. Both resources have had educational and commercial impact as tools for English language teaching, reaching over 1.2 million users in 2008-2013 through the website and over 34,500 through the app.
Based in the School of English, the Research and Development Unit for English Studies (RDUES) conducts research in the field of corpus linguistics and develops innovative software tools to allow a wide range of external audiences to locate, annotate and use electronic data more effectively. This case study details work carried out by the RDUES team (Matt Gee, Andrew Kehoe, Antoinette Renouf) in building large-scale corpora of web texts, from which examples of language use have been extracted, analysed, and presented in a form suitable for teaching and research across and beyond HE, including collaboration with commercial partners.
The Electronic Text Corpus of Sumerian Literature (ETCSL) is quoted and used in both schools and colleges across the world and read by people without any direct academic connection to the subject: widening access to, interest in, and understanding of Sumerian literature. Sumerian literature is widely known as one of the oldest literatures in the world, inspiring countless studies of world literature and history of religion. The ETCSL has made the bulk of canonical Sumerian literature (c. 400 compositions) available in prose translations and the original Sumerian to both specialists and informal learners for more than a decade.
University of Huddersfield research into corpus stylistics has led to the development of Language Unlocked, a consultancy service that uses linguistic methodologies and interpretative procedures to help public, private, third-sector and non-governmental organisations. Language Unlocked has informed clients' strategic decision-making, communicated their organisational strategies and assisted them in realising long-term goals. Beneficiaries have included Britain's unions, which have reassessed their communications policies; the Green Party, which has revised its policies, manifestos and communications; and a major chemical company, which increased its visibility as a result of carefully worded advertising.
Researchers at the University of Glasgow have created the first freely accessible online database of written and spoken texts in Scottish English and Scots. Together, the Scottish Corpus of Text and Speech (SCOTS) and the Corpus of Modern Scottish Writing (CMSW), both developed at Glasgow, provide over 10 million words of text from a range of sources, complemented by audio and video recordings and digitised manuscripts and documents. They have succeeded in raising interest in and awareness of Scottish English and Scots among the general public: 40% of SCOTS's resources were contributed by the public, and the website achieved 165,000 page views per month at launch. The database is also widely used by commercial lexicographers and professionals in secondary education. It is an 'essential data source' for Scottish Language Dictionaries, is 'in day-to-day use' by the Oxford English Dictionary, and from 2006 to 2013 was deployed by school examination boards across the UK (Highers, A-Levels, Cambridge International, and Oxford, Cambridge and RSA exams).
Professor Geoffrey Khan has worked closely with the communities of Assyrian Christians of the Middle East, carrying out research on their spoken language, which exists in numerous dialects, many of them highly endangered. He has established initiatives to preserve knowledge of these dialects for future generations and raised awareness within the communities of the endangered state of their language, stimulating them to preserve their linguistic heritage and empowering them to become directly involved with the process of documentation of the dialects. Training native non-academic speakers to undertake linguistic fieldwork to gather large quantities of grammatical and lexical data, as well as recordings of descriptions of traditional life and various types of oral literature, has also been key to this initiative.
The 100-million-word British National Corpus of UK English texts and speech is used regularly and extensively as a reference resource on the contemporary English language. Its users include dictionary makers, school teachers in many countries, teachers of English as a second language, the OCR school examinations board, and many individual writers on the internet as a reference source about questions of contemporary English usage. Its use has led to improved English dictionaries that more accurately reflect actual usage: for Longman, Chambers and OUP dictionaries, use of the BNC provides a unique selling point over their competitors and enhanced educational value to readers. For students and English language teachers world-wide, the BNC provides more realistic examples of the usage of words and phrases in context, and in different registers, free of charge via various online search portals, and thus improved education in English.
Geiriadur Prifysgol Cymru (GPC) is a historical dictionary similar to the Oxford English Dictionary, and is the acknowledged authority on the spelling, derivation and meaning of Welsh words. Apart from its scholarly uses, it is used in all areas of the Welsh public sphere, providing the lexical information necessary to produce terminology for bilingual documentation in fields such as government, education, health, law and business. GPC has always had a network of voluntary readers and informants, and uses both old and new media to seek examples of contemporary usage and to promote public interest in the language. A concise version of the dictionary has been freely available online since 2003, and a full version will be launched in 2014.