UCREL (the University Research Centre for Computer Corpus Research on Language) has been pioneering advances in corpus linguistics for over 40 years, providing users with corpora (collections of written or spoken material) and the software to exploit them. Drawing together 8 researchers from the Department of Linguistics and English Language and 1 from the School of Computing and Communications at Lancaster University, it has enabled the UK English Language Teaching (ELT) industry to produce innovative materials which have helped the profitability and competitiveness of that industry, and assisted other, principally commercial, users to innovate in product design and development.
Research carried out at Sussex into the automatic grammatical analysis of English text has enabled and enhanced a range of commercial text-processing applications and services. These include an automatic SMS question-answering service and a computer system that grades essays written by learners of English as a second language. Over the REF period there has been substantial economic impact on a spin-out company, whose viability has been established through revenue of around £500k from licensing, development and maintenance contracts for these applications.
University of Huddersfield research into corpus stylistics has led to the development of Language Unlocked, a consultancy service that uses linguistic methodologies and interpretative procedures to help public, private, third-sector and non-governmental organisations. Language Unlocked has informed clients' strategic decision-making, communicated their organisational strategies and assisted them in realising long-term goals. Beneficiaries have included Britain's unions, which have reassessed their communications policies; the Green Party, which has revised its policies, manifestos and communications; and a major chemical company, which increased its visibility as a result of carefully worded advertising.
The 100-million-word British National Corpus (BNC) of UK English texts and speech is used regularly and extensively as a reference resource on the contemporary English language. Its users include dictionary makers, school teachers in many countries, teachers of English as a second language, the OCR school examinations board, and many individual writers on the internet, who consult it as a reference source on questions of contemporary English usage. Its use has led to improved English dictionaries that more accurately reflect actual usage: for Longman, Chambers and OUP dictionaries, use of the BNC provides a unique selling point over their competitors and enhanced educational value to readers. For students and English language teachers world-wide, the BNC provides more realistic examples of the usage of words and phrases in context, and in different registers, free of charge via various online search portals, and has thus improved education in English.
The EPP Project identifies criterial features for second language acquisition. It has engaged stakeholders in the teaching and testing of language learners. This is facilitated by the EPP network and website. The project has enabled Cambridge Assessment to define the English language constructs underlying Cambridge examinations at different proficiency levels more explicitly. The work has not only improved the tests themselves but also allowed Cambridge Assessment to better communicate the qualities of their tests for accreditation and recognition. Stakeholders are more actively engaged through provision of resources for teachers, testers, ministries of education etc., on the website, and in seminars. The project has led to further research with an international language school, which has made teachers and parents of the school's pupils more aware of what successful second language acquisition requires.
Based in the School of English, the Research and Development Unit for English Studies (RDUES) conducts research in the field of corpus linguistics and develops innovative software tools to allow a wide range of external audiences to locate, annotate and use electronic data more effectively. This case study details work carried out by the RDUES team (Matt Gee, Andrew Kehoe, Antoinette Renouf) in building large-scale corpora of web texts, from which examples of language use have been extracted, analysed, and presented in a form suitable for teaching and research across and beyond HE, including collaboration with commercial partners.
We used the research consolidated in the British Component of the International Corpus of English (ICE-GB) to build two resources: the Internet Grammar of English (IGE), a web-based introductory English grammar, and the interactive Grammar of English (iGE), an app for smartphones and tablets. The app is based on the IGE website, but was fully updated with new materials and exercises. Both resources have had educational and commercial impact as tools for English language teaching, reaching over 1.2 million users in 2008-2013 through the website and over 34,500 through the app.
The University of Brighton (UoB) has developed a new corpus-evidence-based approach to lexicography along with supporting tools and training resources. This approach has resulted in the development of a computational lexicography tool, the Sketch Engine, commercialised by Lexical Computing Ltd. The Sketch Engine has been adopted by four of the UK's five major dictionary publishers, national language institutes in nine European countries and over 100 universities, to support commercial dictionary production and language technology products and to enable language teaching. It has also been used to substantiate arguments in a pervasive debate about language use in the art world.
Researchers at the University of Glasgow have created the first freely accessible online database of written and spoken texts in Scottish English and Scots. Together, the Scottish Corpus of Text and Speech (SCOTS) and the Corpus of Modern Scottish Writing (CMSW), both developed at Glasgow, provide over 10 million words of text from a range of sources, complemented by audio and video recordings and digitised manuscripts and documents. They have succeeded in raising interest in and awareness of Scottish English and Scots among the general public: 40% of SCOTS's resources were contributed by the public, and the website achieved 165,000 page views per month at launch. The database is also widely used by commercial lexicographers and professionals in secondary education. It is an 'essential data source' for Scottish Language Dictionaries, 'in day-to-day use' by the Oxford English Dictionary, and from 2006 to 2013 was deployed by school examination boards across the UK (Highers, A-Levels, Cambridge International, and Oxford, Cambridge and RSA exams).
Biak (West Papua, Indonesia) is an endangered language with no previously established orthography. Dalrymple and Mofu's ESRC-supported project created the first online database of digital audio and video Biak texts with linguistically analysed transcriptions and translations (one of the first ever for an endangered language), making these materials available for future generations and aiding the sustainability of the language. Biak schoolchildren can now use educational materials, including dictionaries, based on project resources. The project also trained local researchers in best practice in language documentation, enabling others to replicate these methods and empowering local communities to save their own endangered languages.