The Natural Language Toolkit (NLTK) is a widely adopted Python library for natural language processing. NLTK is run as an open source project. Three project leaders, Steven Bird (Melbourne University), Edward Loper (BBN, Boston) and Ewan Klein (University of Edinburgh), provide the strategic direction of the NLTK project.
NLTK has been widely used in academia, commercial / non-profit organisations and public bodies, including Stanford University and the Educational Testing Service (ETS), which administers widely-recognised tests across more than 180 countries. NLTK has played an important role in making core natural language processing techniques easy to grasp, easy to integrate with other software tools, and easy to deploy.
The Software Systems Engineering Group at UCL developed and patented xlinkit, an approach that supports the validation of XML documents in general and over-the-counter (OTC) derivative transactions expressed in the Financial Products Markup Language (FpML) in particular. The widespread adoption of FpML (95% of financial market participants now use it for OTC transactions) has brought about a substantial reduction in market and credit risk for financial institutions, by reducing the time required to confirm derivative transactions from up to 10 days to at most one day. In the year to June 2012, about $440 trillion of OTC transactions were executed worldwide. [text removed for publication]. Message Automation, which markets a product including tools based on that patent, received £3 million in revenue in the same period.
In 1997 Professor David MacKay of the University of Cambridge Department of Physics developed Dasher, a software accessibility tool for entering text by zooming through letters displayed on a screen. Dasher has since transformed computing for tens of thousands of individuals unable to use a normal keyboard, and is recommended by many charities involved in assistive technologies, such as the European Platform for Rehabilitation network. Since 2008, Dasher has been downloaded over 75,000 times and has been ported to smart phones, making use of input devices such as tilt sensors and joysticks. Linking Dasher's information-efficient text generation from gestures or gaze direction to text-to-speech or real-time-text output channels has made Dasher an ideal component of augmentative and alternative communication (AAC) systems which address digital exclusion.
The spin-out company CSM Ltd. was set up in 1991 to commercially develop Durham research on program transformation. Up until 1999, this company (which in the mid-1990s became Durham Software Engineering Ltd. and subsequently Software Migrations Ltd.) and researchers at Durham University developed the FermaT Workbench: an industrial-strength assembler re-engineering workbench for program comprehension, migration and re-engineering. In 1999, Software Migrations Ltd. relocated to St. Albans and now has an extensive list of national and international clients. All its products (software and services) are built on the FermaT Workbench and have generated considerable revenue, which is expected to rise steeply in the near future.
The research on machine translation carried out at the University of Edinburgh has led to the development of Moses, the dominant open source toolkit for building machine translation (MT) systems. The toolkit has found wide adoption in academic research worldwide: the Moses paper was the most cited paper in all of the Association for Computational Linguistics conferences in 2011. Moses has also been widely used by commercial concerns such as Adobe, Symantec and Sybase, and agencies such as the European Commission and the World Trade Organisation. The research contribution of the School of Informatics in the University of Edinburgh has significantly increased the commercial viability and availability of machine translation.
The toolkit has been one of the main drivers in lowering the barrier to entry to machine translation, making MT available to small and medium-size companies and opening up new markets and opportunities.
Today, Moses is one of the most widely adopted MT systems in the translation industry, dominating the open-source space for MT. Its maturity and quality, as well as its liberal open-source license, mean that it is often preferred over proprietary systems.
As a writer of popular (linguistic) science, and as the subject of a documentary film on his life and work, Professor Dan Everett has widely influenced popular understanding and debate about the relations between language, mind and culture through his research on Amazonian languages such as Pirahã. The spectacular, and sometimes controversial, conclusions of his fieldwork and of his theoretical and popular writings challenge the claim that all human beings are endowed with an innate language faculty and challenge the ways in which cultural values are constructed.
The global workplace means that the staff at your local hospital, the pilot of your aeroplane or your teacher may be operating in a foreign language. Establishing their foreign language proficiency is crucial to ensuring effective communication. In addition, establishing what a learner knows and does not know enables appropriately targeted teaching. We have enabled institutions and individuals throughout Europe to better understand the nature of foreign language proficiency and have provided the means of measuring it. Our research led to the production of an on-line language assessment system, DIALANG, made publicly available from 2001 in 14 European languages.
Biak (West Papua, Indonesia) is an endangered language with no previously established orthography. Dalrymple and Mofu's ESRC-supported project created the first on-line database of digital audio and video Biak texts with linguistically analysed transcriptions and translations (one of the first ever for an endangered language), making these materials available for future generations and aiding the sustainability of the language. Biak school-children can now use educational materials, including dictionaries, based on project resources. The project also trained local researchers in best practice in language documentation, enabling others to replicate these methods and empowering local communities to save their own endangered languages.
Data-to-text utilises Natural Language Generation (NLG) technology that allows computer systems to generate narrative summaries of complex data sets. These can be used by experts, professionals and managers to understand better, and more quickly, the information contained within large and complex data sets. The technology has been developed since 2000 by Prof Reiter and Dr Sripada at the University of Aberdeen, supported by several EPSRC grants. The impact from the research has two dimensions.
As economic impact, a spinout company, Data2Text (www.data2text.com), was created in late 2009 to commercialise the research. As of May 2013, Data2Text had 14 employees. Much of Data2Text's work is collaborative with another UK company, Arria NLG (www.arria.com), which as of May 2013 had about 25 employees, most of whom were involved in collaborative projects with Data2Text.
As impact on practitioners and professional services, case studies have been developed in the oil & gas sector, in weather forecasting, and in healthcare, where NLG provides tools to rapidly develop narrative reports to facilitate planning and decision making, introducing benefits in terms of improved access to information and resultant cost and/or time savings. In addition the research led to the creation of simplenlg (http://simplenlg.googlecode.com/), an open-source software package which performs some basic natural language generation tasks. The simplenlg package is used by several companies, including Agfa, Nuance and Siemens as well as Data2Text and Arria NLG.
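As a rough illustration of the data-to-text idea described above, the following Python sketch turns a small series of readings into a short narrative summary. It uses a hand-written template and is not the method used by Data2Text, Arria NLG or simplenlg; the function names `describe_trend` and `summarise`, the 0.5 tolerance and the sample readings are all assumptions made for this example.

```python
from statistics import mean


def describe_trend(values, tolerance=0.5):
    """Compare the first and last values and name the overall direction.

    The tolerance is an arbitrary, illustrative threshold.
    """
    change = values[-1] - values[0]
    if change > tolerance:
        return "rose"
    if change < -tolerance:
        return "fell"
    return "stayed steady"


def summarise(label, unit, readings):
    """Turn a list of (time, value) pairs into a short narrative summary."""
    if not readings:
        raise ValueError("readings must not be empty")
    values = [value for _, value in readings]
    peak_time, peak = max(readings, key=lambda r: r[1])
    low_time, low = min(readings, key=lambda r: r[1])
    trend = describe_trend(values)
    return (
        f"{label} {trend} over the period, averaging {mean(values):.1f} {unit}. "
        f"It peaked at {peak:.1f} {unit} at {peak_time} "
        f"and was lowest at {low:.1f} {unit} at {low_time}."
    )


if __name__ == "__main__":
    # Hypothetical sample data, invented for this sketch.
    sample = [("06:00", 8.0), ("12:00", 14.5), ("18:00", 11.0)]
    print(summarise("Temperature", "°C", sample))
```

Even at this scale the two steps of a data-to-text pipeline are visible: an analysis step that picks out what is worth saying (trend, peak, low), and a realisation step that words it. Production systems replace the fixed template with far richer content selection and linguistic realisation.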
Bilingualism Matters (BM) was set up as a proactive public engagement programme by Prof. Antonella Sorace in order to make the results of her research, showing the benefits of bilingualism, accessible and useful to the general public. BM offers advice and information particularly on early bilingualism; it combats misconceptions about bilingualism, especially regarding cognitive development in children. It has made current research accessible, practically usable and of benefit to different sections of society, including children, parents, educationalists, health professionals, businesses and policy makers. In consequence, it has changed public attitudes, and helped shape education policy both in the UK and elsewhere in Europe.