UCREL (the University Research Centre for Computer Corpus Research on Language) has been pioneering advances in corpus linguistics for over 40 years, providing users with corpora (collections of written or spoken material) and the software to exploit them. Drawing together 8 researchers from the Department of Linguistics and English Language and 1 from the School of Computing and Communications at Lancaster University, it has enabled the UK English Language Teaching (ELT) industry to produce innovative materials which have helped the profitability and competitiveness of that industry, and assisted other, principally commercial, users to innovate in product design and development.
Worldwide impact on language learners and others has been generated by the development at Lancaster of a ground-breaking natural language processing tool (CLAWS4), and an associated unique collection of natural language data (the British National Corpus, or BNC). Some highlights selected from the primary impacts are as follows:
The pathways to impact have been primarily via consultancy and via licensing of software IP. The impact itself is largely on the language learners, i.e. users of products such as the above. There is a secondary economic impact on a UK SME which has licensed our software.
Researchers at the University of Glasgow have created the first freely accessible online database of written and spoken texts in Scottish English and Scots. Together, the Scottish Corpus of Text and Speech (SCOTS) and the Corpus of Modern Scottish Writing (CMSW), both developed at Glasgow, provide over 10 million words of text from a range of sources, complemented by audio and video recordings and digitised manuscripts and documents. They have succeeded in raising interest in and awareness of Scottish English and Scots among the general public: 40% of SCOTS's resources were contributed by the public, and the website achieved 165,000 page views per month at launch. The database is also widely used by commercial lexicographers and professionals in secondary education. It is an 'essential data source' for Scottish Language Dictionaries, is 'in day-to-day use' by the Oxford English Dictionary, and from 2006 to 2013 was deployed by school examination boards across the UK (Highers, A-Levels, Cambridge International, and Oxford, Cambridge and RSA exams).
University of Huddersfield research into corpus stylistics has led to the development of Language Unlocked, a consultancy service that uses linguistic methodologies and interpretative procedures to help public, private, third-sector and non-governmental organisations. Language Unlocked has informed clients' strategic decision-making, communicated their organisational strategies and assisted them in realising long-term goals. Beneficiaries have included Britain's unions, which have reassessed their communications policies; the Green Party, which has revised its policies, manifestos and communications; and a major chemical company, which increased its visibility as a result of carefully worded advertising.
The EPP Project identifies criterial features for second language acquisition. It has engaged stakeholders in the teaching and testing of language learners, facilitated by the EPP network and website. The project has enabled Cambridge Assessment to define the English language constructs underlying Cambridge examinations at different proficiency levels more explicitly. The work has not only improved the tests themselves but also allowed Cambridge Assessment to better communicate the qualities of their tests for accreditation and recognition. Stakeholders are more actively engaged through provision of resources for teachers, testers, ministries of education etc., on the website, and in seminars. The project has led to further research with an international language school, which has made teachers and parents of the school's pupils more aware of what successful second language acquisition requires.
Talk of the Toon is an online resource that preserves the cultural heritage of North Eastern English dialects, giving users unprecedented access to multimedia material spanning five decades. Researchers collaborated with regional museums in this initiative during the Diachronic Electronic Corpus of Tyneside English (DECTE) project (2010-2012), thereby providing them with new avenues for the public to benefit from their collections. The pedagogical resources generated have also significantly benefitted primary and secondary education. Building on regional engagement initiatives through targeted national/international workshops, the impact has also reached beyond the HEI and region to a wider range of educators and students worldwide.
This case study describes a unique collaboration between Professor Clive Upton and researchers at the University of Leeds, the BBC and the British Library (BL), examining language variation. As a result of a programme compiling and researching the largest recorded archive of dialects and speech patterns assembled in the UK, two major interlinked forms of impact were generated:
i. Informing public understanding of dialect and English language use, thereby validating diverse regional and national identities.
ii. Contributing to the professional practice and goals of the BBC and the BL through policy enhancement, training, and developing broadcast and exhibition content.
Joan Beal's research on dialect and identity has had far-reaching educational impact. Her publications are widely used in other HEIs (both in the UK and abroad) and in secondary school teaching, with economic benefits for publishers. She has also influenced curriculum reform through her consultancy for AQA, the largest provider of academic qualifications for 14-19 year olds in the UK. Beyond education, her role as a media commentator and as a consultant for the British Library Sociolinguistics & Education department has led to greater public understanding of the significance, and persistence, of dialect as a means of constructing and expressing identity.
The Electronic Text Corpus of Sumerian Literature (ETCSL) is quoted and used in schools and colleges across the world and read by people without any direct academic connection to the subject, widening access to, interest in, and understanding of Sumerian literature. Sumerian literature is widely known as one of the oldest literatures in the world, inspiring countless studies of world literature and history of religion. The ETCSL has made the bulk of canonical Sumerian literature (c. 400 compositions) available in prose translations and the original Sumerian to both specialists and informal learners for more than a decade.
Based in the School of English, the Research and Development Unit for English Studies (RDUES) conducts research in the field of corpus linguistics and develops innovative software tools to allow a wide range of external audiences to locate, annotate and use electronic data more effectively. This case study details work carried out by the RDUES team (Matt Gee, Andrew Kehoe, Antoinette Renouf) in building large-scale corpora of web texts, from which examples of language use have been extracted, analysed, and presented in a form suitable for teaching and research across and beyond HE, including collaboration with commercial partners.