Editing Literary-Historical Manuscripts

Summary of the impact

Contributing to the preservation of literary materials through innovative use of technology, DMU's Centre for Technology and the Arts (CTA), subsequently renamed the Centre for Textual Studies (CTS), pioneered new digital techniques for analysing, editing and presenting literary-historical manuscripts of international significance. These techniques revolutionized the scholarly task of capturing data about manuscripts, permitting new kinds of analysis, editing and dissemination that are now widely practised to facilitate public access and cultural enrichment. In particular, the CTA/CTS invented a manuscript description standard that has been taken up by major libraries across the world, by the International Organization for Standardization (ISO) via the Text Encoding Initiative (TEI), and by commercial publishers.

Underpinning research

DMU's CTA was established in 1998 to develop research into electronic editorial practice under the leadership of Peter Robinson (1993-2004), working with Norman Blake (1998-2004). The CTA produced editions of Chaucer's Canterbury Tales based on fresh analysis of all the early manuscripts (HRB funded £100,000 1996-99, AHRB funded £394,000 1999-2004, Leverhulme Trust funded £60,000 1995-98). The prevailing digital document encoding standard at the time, the Text Encoding Initiative (TEI) P3 standard, could not cope with the problems of encoding 400- to 600-year-old English manuscripts. By examining hundreds of relevant manuscripts, Robinson derived a new taxonomy of contextualizing information that could be extended by repetition within elements to produce manuscript descriptions of unprecedented length and sophistication. The new standard for manuscript description was formally defined in the newly created Extensible Markup Language (XML), and was amongst the first to exploit XML's advanced features. In 1999 Robinson's edition of The Wife of Bath's Prologue on CD-ROM won the Beatrix White Prize for the best contribution to Medieval and Renaissance scholarship.

Robinson recognized that, were his standard to be adopted by libraries worldwide, it would enable the creation of an online union catalogue of all their manuscript holdings. In 1999, he initiated the "Manuscript Access through Standards for Electronic Records" (MASTER) project, funded by the EU (376,000 euro 2001-03) and the European Science Foundation (100,000 French francs 1999). MASTER, with DMU as lead partner of six collaborating European universities, promulgated Robinson's standard and developed and distributed the software needed for its adoption. Details of MASTER's achievements are archived at http://master.dmu.ac.uk (accessed 13/08/13).

The CTA developed other research projects deploying new approaches to scholarly editing enabled by Robinson's work, anticipating the revolution in digital publishing and making the core documents of our literary heritage accessible to all. In 2005, a collaboration with the British Library, the University of Wales Bangor and Keio University, Japan, was established to work on manuscripts of Malory's Le Morte Darthur, http://malory.dmu.ac.uk (accessed 13/08/13). In 2004, the CTA was renamed the Centre for Textual Studies by its new Director Peter Shillingsburg (2002-10) and launched further projects, including Briggs (1994-2008)/Shillingsburg's Woolf 'Time Passes' project (Leverhulme, £52,780 2006) and partnership projects with UCL on Dante's Commedia (AHRB 2002-04), with Princeton on the Monarchia (AHRB 2003-04), and with the University of Liverpool (AHRB 2003-04).

The CTA/CTS developed a wholly new kind of study adapted from the methods of evolutionary biology, called Phylogenetic Analysis using Parsimony (PAUP). Treating random word alterations that occur during scribal transcription as analogous to random genetic alterations in biological processes, PAUP produces cladograms: 'family trees' showing the manuscripts' linguistic relatedness. This research led to a 2008 landmark article in the world's most-cited and influential interdisciplinary science journal, Nature (circulation 53,000, readership 424,000).
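
The parsimony idea described above can be illustrated with a minimal sketch. It is not the PAUP program itself, nor the procedure used in the CTA/CTS research. It applies Fitch's small-parsimony count to candidate family trees. The manuscript sigla (A to D), the variant readings and the three candidate trees are all invented for illustration. The tree that needs the fewest scribal changes to explain the observed readings is preferred.

```python
# Illustrative sketch only: sigla, readings and trees are hypothetical.
# A tree is either a leaf (a manuscript siglum) or a pair of subtrees.

def fitch(tree, readings):
    """Return (candidate readings at this node, changes needed below it)."""
    if isinstance(tree, str):
        # A leaf carries the reading actually found in that manuscript.
        return {readings[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, readings)
    right_set, right_cost = fitch(right, readings)
    shared = left_set & right_set
    if shared:
        # Both branches can agree on a reading: no extra change needed.
        return shared, left_cost + right_cost
    # The branches disagree: one scribal change must have occurred.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, variants):
    """Total number of changes the tree needs across all variant locations."""
    return sum(fitch(tree, site)[1] for site in variants)


# Three places where four hypothetical manuscripts disagree.
variants = [
    {"A": "swich", "B": "swich", "C": "such", "D": "such"},
    {"A": "eek", "B": "eek", "C": "also", "D": "also"},
    {"A": "hir", "B": "her", "C": "hir", "D": "her"},
]

candidates = {
    "AB|CD": (("A", "B"), ("C", "D")),
    "AC|BD": (("A", "C"), ("B", "D")),
    "AD|BC": (("A", "D"), ("B", "C")),
}

scores = {name: parsimony_score(tree, variants) for name, tree in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "most parsimonious grouping:", best)
```

With real data the number of possible trees grows very quickly with the number of manuscripts, so dedicated phylogenetic software searches the space of trees heuristically rather than scoring every candidate as this sketch does.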

The CTA/CTS developed a new publication technology, the XML-based Anastasia system (2000), enabling literary-historical scholars to combine complex manuscripts and transcriptions in a single digital package. This technology was first delivered on CD-ROM and later over the Internet; see http://www.sd-editions.com/anastasia (accessed 13/08/13).

Details of the impact

Over the census period, the reach and significance of the CTA/CTS research programme have been global. The importance of the Canterbury Tales project to cultural heritage preservation is acknowledged by the National Library of Wales, the repository of the Hengwrt Chaucer, the earliest extant Chaucer manuscript. A digitised version of the manuscript was prepared with DMU, using the techniques developed in the CTA/CTS. It enabled the Library to better understand and preserve the fragile parchment of one of its greatest treasures, providing the widest textual archaeology of the Hengwrt manuscript and its scribe explicitly for 'the benefit of the public'. Visitor numbers now total approximately 100,000 annually, with over two million remote visitors to its site. Building on the CTA/CTS research, the Library is now established as one of Europe's leading centres for digitisation, and its three-year strategy for agile electronic access 2011-14 is recognized as a model for facilitating access to recorded cultures.

In the census period, the CTA/CTS work on a standardized system called MASTER for manuscript description achieved catalogue integration across the holdings of the Royal Library in the Hague, the Arnamagnaean Institute in Copenhagen, L'Institut de recherche et d'histoire des textes in Paris, the National Library of the Czech Republic, Prague, the Bodleian Library, Oxford, the British Library, the Vatican Library and the Biblioteca Ambrosiana, Milan. This amounts to 61,000 manuscripts, the core of Europe's rare manuscript holdings, being combined into a single searchable catalogue using the CTA/CTS standard.

The latest manifestation of this is the EU-funded 5.57M euro ENRICH project (2007-09), which explicitly builds on the CTA/CTS achievement in this field. ENRICH created a base for a digital library of European cultural heritage, which made available more than five million digitized pages for a wide range of non-academic users, including 'libraries, museums and archives, policy makers and general interest users'. Examples include the National Library of the Czech Republic, Prague, Centro per comunicazione e l'integrazione dei media, Florence, SYSTRAN S.A., Paris, and Biblioteca Nacional de España, Madrid, Spain.

The XML-based Anastasia system enabled scholars working on manuscript-based historical and literary projects to publish their editions without additional technical support, and it has been adopted by those working on ancient Biblical manuscripts of major theological significance. These include Digital Nestle-Aland, an electronic version of the standard scholarly edition of the Greek New Testament, published in Stuttgart in 2012 (ISBN 978-3-438-05140-0), and the online edition of the Codex Sinaiticus, one of the most important books in the world, produced in 2009 in an international partnership between the British Library, the National Library of Russia, St Catherine's Monastery and Leipzig University Library.

From 2007, this CTA/CTS standard for manuscript description officially became the worldwide standard. The Text Encoding Initiative (TEI) is an affiliated organization of the International Organization for Standardization (ISO), itself a body with consultative status with the United Nations Economic and Social Council, and it has adopted the CTA/CTS standard. In 2002, the P4 version of the TEI guidelines (an interim release) adopted the CTA/CTS approach to produce its first systematic instructions on encoding manuscript descriptions, and the latest version, P5, published in 2007, fully incorporates the CTA/CTS work. The CTA/CTS-initiated MASTER project thus became the model for other projects attempting to build inter-library union catalogues.
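
As a rough illustration of what a manuscript description in the TEI P5 style looks like, the sketch below builds a small record with Python's standard XML library. The element names (msDesc, msIdentifier, settlement, repository, idno, msContents, msItem, locus, title) are taken from the TEI P5 manuscript description module. The helper function, the place and library names, the shelfmark and the item titles are all hypothetical placeholders. The TEI namespace and most optional elements are omitted, so this is a simplified fragment rather than a valid TEI document.

```python
import xml.etree.ElementTree as ET


def build_ms_desc(settlement, repository, idno, items):
    """Build a simplified TEI-style <msDesc> element (hypothetical helper)."""
    root = ET.Element("msDesc")

    # Where the manuscript is held and how it is cited.
    identifier = ET.SubElement(root, "msIdentifier")
    ET.SubElement(identifier, "settlement").text = settlement
    ET.SubElement(identifier, "repository").text = repository
    ET.SubElement(identifier, "idno").text = idno

    # The contents are described by repeating <msItem>, one per text,
    # so a description can grow as long as the manuscript requires.
    contents = ET.SubElement(root, "msContents")
    for locus, title in items:
        item = ET.SubElement(contents, "msItem")
        ET.SubElement(item, "locus").text = locus
        ET.SubElement(item, "title").text = title
    return root


# Placeholder values, not a description of any real manuscript.
record = build_ms_desc(
    settlement="Example City",
    repository="Example Library",
    idno="MS Example 1",
    items=[("ff. 1r-10v", "First text"), ("ff. 11r-24v", "Second text")],
)
print(ET.tostring(record, encoding="unicode"))
```

Because every library's record shares the same element names, records produced in this way can be merged and searched together, which is the property a union catalogue relies on.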

Because virtually all commercial publication of scholarly works uses the TEI guidelines, the CTA/CTS standard has become ubiquitous in commercial publications based on manuscript sources. Virtually every publisher uses TEI encoding when handling complex manuscript materials. Typical examples include Cambridge University Press, which used the CTA/CTS standard to publish two volumes of the letters of Samuel Beckett in 2009-11 (ISBNs 978-0521867931, 978-0521867948). In 2009, Leo Jansen, Hans Luijten and Nienke Bakker edited Vincent van Gogh's letters using the CTA/CTS standard for the Van Gogh Museum in Amsterdam (ISBN 978-0500238653). Two thousand pages of Isaac Newton's manuscripts have been digitized using the CTA/CTS standard as part of Indiana University Press's 'The Chymistry of Isaac Newton' project. Between 2009 and 2011, AMS Press of New York published three volumes of the work of James Fenimore Cooper using the CTA/CTS standard (ISBNs 978-0404644673, 978-0404644666 and 978-0404644796). A full list of projects using the TEI standard is given in section (5) below; all those involving manuscript work use the CTA/CTS standard developed at De Montfort.

