Manuscript transcription projects

This report was compiled in December of 2013 as a prelude to the Early Modern Manuscripts Online (EMMO) initiative. It details a preliminary collection of projects which have aspects that may somehow relate to EMMO. It is in no way a full or comprehensive account of related projects, and it is not updated on any regular basis. -Danielle Rosvally

Fully Reported Projects

Bess of Hardwick’s Letters

http://www.bessofhardwick.org

Project Scope: “This project aims to create a fully searchable, online edition of the letters of Elizabeth Talbot, Countess of Shrewsbury (also known as Bess of Hardwick).”

“The project will provide online transcripts of the all letters, presented according to modern editorial standards, in searchable, downloadable, and print-friendly versions, accompanied by scholarly notes and commentaries on manuscript features and presentation. Alongside the creation and development of the edition, the letters will be analysed for the way they textualise relationships, draw on created versions of voice and personae, and use visual and material features to communicate meaning. The findings of these analyses will be published as a major study. Together, the edition and study, for the first time, will allow us to hear Elizabeth Talbot speak for herself. The letters will be edited and analysed by the project team in the English Language Department, University of Glasgow. The edition will be hosted by the Centre for Editing Lives and Letters, Queen Mary, University of London. The texts will be added to the Corpus of Early English Correspondence, University of Helsinki, which will extend the possibilities for future analysis by another set of users – historical sociolinguists and corpus linguists. Six podcasts will provide routes into the collection for a wider audience, beyond the academy.”

The project lasted from November 1, 2008 – January 31, 2012.

Similarities to EMMO: Very similar. The interface provides robust search functionality as well as downloadable content. Each letter is offered as a diplomatic version (with spelling intact), a normalized version (updated spelling/spacing), a downloadable PDF (which also lists letters related to the current selection by persons and events mentioned), downloadable XML, and images of the letter (leaf by leaf). There is also a transcription function, which provides original leaf images above a transcription box where users can submit their own transcription for review by the project team.

The site also offers details on secretary hand and resources for a user to learn to read and transcribe it (http://www.bessofhardwick.org/background.jsp?id=231).

Management Approach: Though there is a place for users to create and submit their own transcriptions, the site does not say what is done with them. This appears to be a mostly centrally managed operation with some crowd-sourcing, and the crowd-sourced contributions seem to be fairly heavily edited.

Resources: The images are hosted by the Folger Digital Image Collection, and the project is funded by the Arts and Humanities Research Council.

Sponsoring Institution: University of Sheffield; University of Glasgow; Funded by: Arts and Humanities Research Council

Project Team Members:

  • Dr. Alison Wiggins (PI – English Language Department, University of Glasgow)
  • Dr. Daniel Starza Smith (Research Associate – University of Glasgow; Oct. 2011 – Dec. 2012)
  • Dr. Anke Timmermann (Research Associate – University of Glasgow; Jan. 2010 – June 2011)
  • Dr. Graham Williams (Research Associate – University of Glasgow; Oct. 2011 – April 2012)
  • Dr. Alan Bryson (Research Associate – University of Glasgow; Oct. 2008 – Sep. 2009)
  • Katherine Rogers (Digital Humanities Developer – Humanities Research Institute)

Colonial Despatches: The Colonial Despatches of Vancouver Island and British Columbia 1846-1871

http://bcgenesis.uvic.ca

Project Scope: “This project aims to digitize and publish online a complete archive of the correspondence covering the period from 1846 leading to the founding of Vancouver Island in 1849, the founding of British Columbia in 1858, the annexation of Vancouver Island by British Columbia in 1866, and up to the incorporation of B.C. into the Canadian Federation in 1871.”

“All the material on this site originates in the work of Dr. James Hendrickson and his team of collaborators at the University of Victoria, which resulted in the publication of 28 print volumes of correspondence several years ago.”

“This digital archive contains transcriptions of virtually the complete correspondence between the British colonial authorities and the successive governors of the nascent Vancouver Island and British Columbia colonies, along with a great deal of associated writing, generated within the colonial office, and between public offices, which relates to the colonies.”

“In the long term, we plan to check and proof the whole collection, then to expand and enhance it by adding more transcriptions (of attachments, enclosures etc.), and images of all of the original documents. See Development for more details of our progress.”

Similarities to EMMO: Transcriptions are available side-by-side with an image of the scanned document (though the scanned image is not full size, just a thumbnail that you need to click into to open a separate page in order to view the full document). Mouse-over and click-in notes are available, as is XML source code.

Management Approach: Central; no crowd-sourcing whatsoever.

Resources: “Waterloo Script is long obsolete, and the days of 28-volume print publications are likely coming to an end; but now we have a much more universal and flexible publishing platform, in the form of the World Wide Web. Our team at the University of Victoria Humanities Computing and Media Centre has converted those original files from Waterloo Script into TEI P5 XML, an XML standard developed and maintained by the Text Encoding Initiative, and we have built a Web application to make them readable and searchable."

“All of the original documents have been converted to XML, and now reside in an eXist XML database. In honour of the 150th anniversary of the founding of British Columbia—a story which itself plays out in intriguing detail in these documents—we have worked hard to make the 1858 documents ready for the general reader, by adding and expanding footnotes and biographical sketches prepared by Dr. Hendrickson, along with many manuscript images. As a result, we can now provide access to the 1858 documents. However, all of the documents in the collection, including those from 1858, require detailed proofing. Please see our disclaimer page if you intend to make use of the data for serious research or legal purposes.”

Sponsoring Institutions: University of Victoria Humanities Computing and Media Centre; University of Victoria Libraries; University of Victoria Law Faculty; The Canadian Council of Archives; Canadian Heritage; Ike Barber B.C. History Digitization Project; The National Archives (UK)

Project Team Members: For a full list of project credits, see http://bcgenesis.uvic.ca/credits.htm:

  • Petria Arienzale: Research, writing and editing
  • Theo Biggs: Research assistant
  • Caitlin Croteau: Research assistant
  • Merna Forster: Project management
  • Vincent Gornall: Research and writing
  • Dr. James Hendrickson: Content expertise and research. Dr. Hendrickson is the original begetter of the project.
  • Martin Holmes (UVic HCMC): Project management and programming (primary project contact)
  • Frank Leonard: Research and biographies
  • Dr. John Lutz (UVic History Dept): Academic director
  • Quinn MacDonald: Research, writing and editing
  • Rosemary MacKenzie: Research assistant
  • Shaun Macpherson: Research, writing and editing
  • Alison Malis: Research, writing and editing
  • Sean Manning: Research assistant
  • Marion Massey: Document transcription
  • Matthew McBride: Research, writing and editing
  • Ryan Munroe: Research, writing and editing
  • Chris Petter (UVic Library): Consulting, fundraising and research
  • Loring Rochacewich: Research assistant
  • Lindsey Schultz: Research, writing and editing
  • Kim Shortreed-Webb: Research and markup, project management, writing and editing
  • Heather Stirling: Research, writing and editing
  • Terrance Stone: Research assistant
  • Patrick Szpak: Design, research and markup
  • Josh White: Research, writing and editing
  • Leanna Wong: Research assistant

Special thanks to Susan Doyle and the UVic English Department's Professional Writing program, for their contributions through their Directed Reading students from English 492: Directed Reading: Advanced Topics In Professional Writing.

Diary of Harry Watkins Project

http://www.harrywatkinsdiary.org

Project Scope: “To produce a critical edition of Harry Watkins’ Diary in both codex and digital form. The digital form will provide access to digital facsimiles of the diary manuscript, a fully searchable digital text, and annotations.”

Similarities to EMMO: Extremely similar in that it’s a transcription effort of a period document that strives to provide free online access. Since the project is extremely nascent at this point (though a few university presses are interested, there isn’t even a publisher lined up yet), the team has yet to determine factors such as what the relationship between the digital and hard editions will be, where the project will be more permanently housed, etc.

While the original pages were scanned by Harvard (and are thus hosted in HOLLIS, the Harvard digital catalogue), the project team does have its own copies of the material. Since permissions have yet to be arranged with Harvard, it is so far unclear how closely they will be able to display the facsimile and transcription. The manuscript itself is extremely tricky textually (crazy handwriting, corrections, wacky spelling), and thus OCR efforts would be very difficult and time-consuming, and would require a great deal of hand-correction and XML coding.

Management approach: Centrally managed with no plans in the works for crowd sourcing (there’s no indication that it would be useful since the audience base for this project is rather limited), though it has been noted that this might be a neat additional feature if it could be supported with nominal effort.

Resources: “Currently, we have half a dozen people working on transcribing the diary – the two project directors, and our undergraduate and graduate students funded variously by CUNY-internal grant programs and federal work-study.”

“Drupal’s (drupal.org) Workbench module provides infrastructure for attaching workflow state to each page, changing that state (different project roles have different state-changing privileges), and viewing the state of the project based on workflow states. We are currently integrating the oXygen XML editor into our process for faster transcription with fewer XML errors.”
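The workflow-state idea in the quotation above can be pictured with a small sketch. This is not Drupal or Workbench code: the state names, role names, `ALLOWED` table, and `Page` class are all hypothetical, chosen only to illustrate how different project roles might hold different state-changing privileges and how project status could be read off the states.

```python
from dataclasses import dataclass, field

# Hypothetical workflow: which roles may move a page between two states.
ALLOWED = {
    ("untranscribed", "transcribed"): {"transcriber", "director"},
    ("transcribed", "reviewed"): {"director"},
    ("reviewed", "published"): {"director"},
}


@dataclass
class Page:
    number: int
    state: str = "untranscribed"
    history: list = field(default_factory=list)

    def advance(self, new_state: str, role: str) -> None:
        """Move the page to new_state if the role is permitted to do so."""
        roles = ALLOWED.get((self.state, new_state))
        if roles is None:
            raise ValueError(f"no transition {self.state} -> {new_state}")
        if role not in roles:
            raise PermissionError(f"{role} may not move a page to {new_state}")
        self.history.append((self.state, new_state, role))
        self.state = new_state


def summary(pages):
    """Count pages per workflow state, i.e. a view of overall progress."""
    counts = {}
    for page in pages:
        counts[page.state] = counts.get(page.state, 0) + 1
    return counts
```

In this sketch a student transcriber could mark a page as transcribed, but only a director could mark it reviewed or published.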

Sponsoring institution: This project is a free-floating child of the CUNY system without any solid CUNY-official backing. They receive a small bit of funding from CUNY-internal competitive grants (most of which goes to paying student transcribers) and applications for NEH grants are in the works. Most of the faculty working on the project are volunteering their time.

Project Team Members: Scott D. Dexter (Brooklyn College, CUNY), Amy E. Hughes (Brooklyn College, CUNY), Naomi J. Stubbs (Brooklyn College, CUNY)

Diderot Encyclopedia Collaborative Translation Project in Association with the ARTFL Encyclopédie

http://quod.lib.umich.edu/d/did/

Project Scope: To translate into English the entirety of the Encyclopedia of Diderot and d’Alembert and make this translation freely available online.

ARTFL hosts the original plate images while the collaborative translation project hosts the plain-text transcriptions and translations.

About ARTFL: “Founded in 1982 as a result of a collaboration between the French government and the University of Chicago, the ARTFL Project is a consortium-based service that provides its members with access to North America's largest collection of digitized French resources”

“Undertaking an electronic edition of the Encyclopédie represented a daunting task. Its structure is very complex; the typographical conventions used for textual elements - from article headwords to classifications and cross-references - varied to a significant degree from volume to volume; the relationship between articles and the plate images is in no way clear or systematic. All this notwithstanding, the computer offered a host of new possibilities both for making the work accessible to the scholarly community and for navigating within the work itself. In addition, the digital medium allowed us to think in terms of a "living edition" that could be corrected, developed and improved over time. Our initial choice was to make the work accessible as quickly as possible and progressively to correct it. In order to compensate for the errors introduced during the original data capture process, we chose to make page images of the volumes available for comparison and verification. As we undertook to correct the text, we also strove to improve the search and retrieval capacities. All too often our users limit themselves to simple word and phrase searches, yet these do not always yield the most fruitful results. Using our new search and reporting features can significantly improve the user's ability to move through what Diderot himself described as the "tortuous labyrinth" that is the Encyclopédie. Looking at frequency of occurrence by article or collocation tables, for example, can provide more useful paths into the Encyclopédie than simple word searches alone.”

Similarities to EMMO: While this is a scan-and-transcribe text effort, the transcription and the page images are not available side-by-side (you have to leave the transcription/translation database to view the ARTFL-hosted plates). Additionally, the crowd sourcing is highly administrated; rather than live wiki-style annotations, contributors send their pieces to editors who peruse and post. Search is available (and is more robust in the French than in the English version), though the user interface is clunky.

Management approach: CTP is a crowd-sourced operation; participants from around the world volunteer to translate specific articles in accordance with their own interests and expertise. Becoming a translator allows access to various translation resources (including the listserv, which is often queried about odd or archaic French word usage, quirks of the document, etc.).

ARTFL is largely a centralized effort, though it does include a crowd-sourced editing feature (users can “report error” at the top of any page).

Sponsoring institution: The translations and the translation project are hosted by Michigan Publishing, a division of the University of Michigan Library.

The thumbnails and images of plates linked from the translation are hosted by ARTFL (a collaboration between the French government and the University of Chicago).

Project team members: The translation project is at least in part spearheaded by Dena Goodman (University of Michigan) and Jennifer Popiel (Saint Louis University).

ARTFL:

  • General Editor: Robert Morrissey;
  • Associate Editor: Glenn Roe;
  • Technical Development: Mark Olsen – Primary developer, Leonid Andreev, Russell Horton, Orion Montoya, Robert Voyer
  • Editorial Development: Stéphane Douard, Jack Iverson, Glenn Roe

Resources: Monetary resources are not readily known, but a good deal is known about the software behind these projects:

Translation project: “The Encyclopédie database uses a modified version of the ARTFL Project's full-text search and retrieval engine, PhiloLogic. With this new version comes several new search and reporting features such as collocation tables, frequency by headword reports, and a sortable keyword in context (KWIC) function.”
ARTFL: “In November of 2009 we began the process of converting the text of the Encyclopédie into standard Unicode (UTF-8) using a light TEI-XML encoding scheme. This move is significant in two ways: First, we can coherently represent and associate an article’s metadata (author, classifications, part of speech, etc.) with the article itself, i.e., in a TEI-XML header for each article entry, rather than storing them in external databases as we have done in the past. This will additionally allow us to manipulate the metadata in the future, adding machine classifications, similar article lists, a notes section, or any other relevant information on an article-specific basis. Secondly, the move to the Unicode standard has finally made correction of the Greek passages in the Encyclopédie possible”
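To make the point about article-level metadata concrete, here is a minimal sketch of reading and extending a per-article header with Python's standard library. The sample document is invented for illustration and is not ARTFL's actual encoding: it follows the general shape of a TEI header but omits the TEI namespace for brevity, and the function names `read_article` and `add_note` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented sample article; element layout is a simplification, not ARTFL's schema.
SAMPLE = """<TEI>
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>EXEMPLE</title>
        <author>Unknown</author>
      </titleStmt>
    </fileDesc>
    <profileDesc>
      <textClass>
        <keywords><term>Grammaire</term></keywords>
      </textClass>
    </profileDesc>
  </teiHeader>
  <text><body><p>Texte de l'article.</p></body></text>
</TEI>"""


def read_article(xml_string):
    """Return the header metadata and body text of one article as a dict."""
    root = ET.fromstring(xml_string)
    header = root.find("teiHeader")
    return {
        "headword": header.findtext("fileDesc/titleStmt/title"),
        "author": header.findtext("fileDesc/titleStmt/author"),
        "classifications": [term.text for term in header.iter("term")],
        "text": " ".join(p.text for p in root.iter("p")),
    }


def add_note(xml_string, note):
    """Attach an article-specific note inside the header and return new XML."""
    root = ET.fromstring(xml_string)
    file_desc = root.find("teiHeader/fileDesc")
    notes = ET.SubElement(file_desc, "notesStmt")
    ET.SubElement(notes, "note").text = note
    return ET.tostring(root, encoding="unicode")
```

Because the metadata travels with the article, adding a note or a machine classification is an edit to that one file rather than an update to a separate database.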

DIY History/Transcribe

http://diyhistory.lib.uiowa.edu/transcribe/

Project Scope: This is a crowd-sourced transcription effort which strives to create a transcribed database of Civil War Diaries and Letters. The project was expanded to include items from outside the University of Iowa Civil War Collections in October 2012.

Similarities to EMMO: This is crowd sourcing at its purest. Each page is digitized then made freely available to the internet at large with an invitation for anyone to come transcribe it. Users are able to search whatever has been completed and view a side-by-side image of the source/transcription. The website, it should be noted, is a bit clunky and takes a great deal of click-through to understand its internal logic.

Management Approach: Completely crowd sourced (part of the project’s touchstone philosophy). Here is a snippet from the “about the project” page: “DIY History lets you do it yourself to help make historic documents easier to use. Our digital library holds thousands of pages of handwritten diaries, letters, and other texts -- much more than library staff could ever transcribe alone, so we're appealing to the public to help out. Through "crowdsourcing," or engaging volunteers to contribute effort toward large-scale goals, these mass quantities of digitized artifacts become searchable, allowing researchers to quickly seek out specific information, and general users to browse and enjoy the materials more easily. Please join us in preserving our past by keeping the historic record accessible -- one page at a time.”

Resources: “Digitized artifacts are migrated from the Iowa Digital Library, which is managed by CONTENTdm software. The transcription pages use Omeka for content management, the Scripto plugin for transcribing, and Twitter Bootstrap for the frontend framework.”

Sponsoring Institution: University of Iowa Library; the digitized selections are from Iowa Libraries’ Special Collections, University Archives, and Iowa Women’s Archives.

Project Team Members: Mostly kept behind the crowd-sourcing wall, but Greg Prickman and Kristi Bontrager seem to be the project leads.

Hamburg Dramaturgy Translation

http://mcpress.media-commons.org/hamburg/

Project Scope: “This site hosts the peer-to-peer review of the first complete, annotated English translation of G. E. Lessing’s Hamburg Dramaturgy, translated by Wendy Arons and Sara Figal, and edited by Natalya Baldyga. The project is currently under contract with Routledge Press, which has allowed us to prepublish our work here for open review. The draft manuscript with comments will remain live here even after the translation has been published. The published book will incorporate comments and suggestions made here into the final version of the annotated translation, and it will be enhanced by the addition of critical introductions contributed by Wendy Arons, Natalya Baldyga, and Michael Chemers.”

Similarities to EMMO: Some of the functionality this project offers seems similar to the EMMO flavor. The roll-over notes and crowd-sourced annotation feel like something EMMO would provide. Currently, there are no plans for this project to host a scan of the original text, or even any version of the text in German (it is, however, freely available online via Project Gutenberg among other places).

Management: A hybrid of the two approaches. The translation itself is centrally managed (and comments require approval before they go live), while the annotations are crowd-sourced.

Resources used: They are basically translating into Microsoft Word documents, then transcribing that to the internet. Wikicommons hosts the wiki functionality which offers their crowd-sourcing options. The original Hamburg text which they are using is the Deutsche Klassiker Verlag held in the Lessing library, transcribed into an online form (not via OCR but old-fashioned transcription).
The project received a $289,697 grant from the National Endowment for the Humanities (NEH) Scholarly Editions & Translations Program with a three-year grant term.

Sponsoring Institution: MediaCommons Press hosts the digital edition; Routledge will be publishing the finished print volume.

Project Team Members: Wendy Arons (Carnegie Mellon University), Sara Figal (Independent Scholar), Natalya Baldyga (Tufts University), and Michael Chemers (University of California at Santa Barbara)

Manuscripts Online – Written Culture from 1000 to 1500

http://www.manuscriptsonline.org

Project Scope: “Manuscripts Online enables users to search an enormous body of online primary resources relating to written and early printed culture in Britain during the period 1000 to 1500.

“A single search engine enables users to undertake sophisticated full-text searching of literary manuscripts, historical documents and early printed books which are located on websites owned by libraries, archives, universities and publishers. Users are able to search the resources by keyword, but also by specific keyword types, such as person and place name, date and language (eg. Middle English, Latin and Anglo-Norman), thanks to techniques which we are using called automated entity recognition. Additionally, users are able to plot results on a map of Britain and create their own annotations to the data for public consumption, thereby building a knowledge base around this critical mass of primary source data.

“Automated entity recognition is a Natural Language Processing technique within information science whereby algorithms are able to intelligently identify the occurrences of specific types of words, such as names, concepts and terminology, using three methods: dictionaries (such as a historical gazetteer of place names), lexical pattern matching and syntactic context.”
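The three methods named in the quotation (dictionaries, lexical pattern matching, and syntactic context) can be illustrated with a toy sketch. This is not the Manuscripts Online implementation; the tiny gazetteer, the list of titles, and the function name `tag_entities` are all assumptions made for illustration, and real entity recognition over medieval texts would need to cope with variant spellings and several languages.

```python
import re

# Dictionary method: a hypothetical, tiny gazetteer of place names.
GAZETTEER = {"york", "lincoln", "norwich"}

# Lexical pattern method: four-digit years in the range 1000-1500.
YEAR = re.compile(r"\b1[0-4]\d\d\b|\b1500\b")

# Crude stand-in for syntactic context: a capitalised word directly
# after a title is treated as a personal name.
PERSON_CONTEXT = re.compile(r"\b(?:Sir|Dame|Brother)\s+([A-Z][a-z]+)")


def tag_entities(text):
    """Return a list of (entity_type, matched_text) pairs found in text."""
    entities = []
    for match in re.finditer(r"[A-Za-z]+", text):
        if match.group().lower() in GAZETTEER:
            entities.append(("place", match.group()))
    for match in YEAR.finditer(text):
        entities.append(("date", match.group()))
    for match in PERSON_CONTEXT.finditer(text):
        entities.append(("person", match.group(1)))
    return entities


print(tag_entities("Sir Thomas rode to York in 1450."))
# [('place', 'York'), ('date', '1450'), ('person', 'Thomas')]
```

Tagging entities by type is what allows a search to be restricted to, say, place names only, rather than matching the keyword anywhere in the text.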

Project Duration: November 2011 – January 2013

Similarities to EMMO: On the surface this is extremely similar to the EMMO effort, but in practice it’s not actually very close at all. The search functionality brings you to stubs of the items, which are held in other databases that have partnered with this one. Nothing is actually hosted here; it’s just a robust search function.

One neat feature is the ability to comment on a resource (the comments are stored on the Manuscripts Online server) and geo-tag your comment. Since the comments are connected to the search stub, though, and not the document per se, this can’t really be considered a crowd-sourced annotation.

Management Approach: Mostly centrally managed with options for interaction: general users can comment and geo-tag; content providers can opt to have their resources included within the search index; and developers can use a publicly available Web API to connect their websites or mobile apps to the search index.

Resources: Funded by JISC; there is a long list of resources on the site’s home page, which are presumably institutions that contributed manuscripts either in hard or digital form.

Sponsoring Institution: Humanities Research Institute; University of Sheffield, Queen’s University Belfast, University of Birmingham, University of Glasgow, University of Leicester, University of York. Funding: JISC

Project Team Members:

  • Dr. Orietta Da Rold (Co-Investigator, University of Leicester)
  • Professor Wendy Scase (University of Birmingham)
  • Professor Jeremy Smith (University of Glasgow)
  • Professor Linne Mooney (University of York)
  • Professor John Thompson (Queen’s University Belfast)
  • Dr. Estelle Stubbs (Research Associate – Humanities Research Institute)
  • Dr. Sharon Howard (Project Manager – Humanities Research Institute)
  • Katherine Rogers (Digital Humanities Developer – Humanities Research Institute)
  • Matthew Groves (Digital Humanities Developer – Humanities Research Institute)
  • Michael Pidd (Principal Investigator – Humanities Research Institute)

The Papers of Abraham Lincoln

http://www.papersofabrahamlincoln.org

Project Scope: “The Papers of Abraham Lincoln is a long-term project dedicated to identifying, imaging, transcribing, annotating, and publishing all documents written by or to Abraham Lincoln during his entire lifetime (1809-1865).”

“For the past decade, the staff of the Papers of Abraham Lincoln has been collecting images of documents written by or to Abraham Lincoln from repositories and private collections around the world. The project has scanned more than 90,000 documents from more than 400 repositories and 180 private collections in 47 states and 5 foreign countries thus far. The archive will likely top 150,000 documents when complete.”

Similarities to EMMO: Functionally, this seems to be simply a collection of PDFs. There are no annotation functions readily available (though you can download the PDFs), no transcripts readily available, and nominal search capabilities (you can search the titles of the documents, but that’s about it).

Management Approach: Centrally managed; almost no crowd sourcing (except in acquisitions).

Resources: “From 2006 to 2013, the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign housed the growing archive of master image files. The retirement of their Mass Storage System has forced the project to look for a new storage solution for its 35 terabytes of files. (Thirty-five terabytes is roughly equivalent to a digital music file that would play non-stop for 68 years, or to 10.8 million photographs.)”
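As a quick sanity check on the comparison in the quotation, the arithmetic can be worked through directly. The sketch below assumes decimal terabytes (10^12 bytes) and 365.25-day years, neither of which the source specifies; under those assumptions the implied figures come out at about 130 kilobits per second for the music file and a little over 3 megabytes per photograph.

```python
TERABYTE = 10**12  # decimal terabyte; an assumption, the source does not say

archive_bytes = 35 * TERABYTE


def implied_bitrate_kbps(total_bytes, years):
    """Bitrate (kilobits per second) of one file that plays for `years`."""
    seconds = years * 365.25 * 24 * 3600
    return total_bytes * 8 / seconds / 1000


def implied_photo_megabytes(total_bytes, photo_count):
    """Average size in megabytes if the archive held `photo_count` photos."""
    return total_bytes / photo_count / 10**6


bitrate = implied_bitrate_kbps(archive_bytes, 68)
photo_size = implied_photo_megabytes(archive_bytes, 10_800_000)
print(f"{bitrate:.1f} kbit/s, {photo_size:.2f} MB per photo")
```

Both implied values are plausible for ordinary compressed audio and consumer photographs, so the two equivalences in the quotation are at least internally consistent.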

On September 3, 2013, the project was awarded the AWS in Education Grant of $24,000 by Amazon Web Services to store more than 35 terabytes of master image files in a secure environment.

Sponsoring Institution: Illinois Historic Preservation Agency and the Abraham Lincoln Presidential Library and Museum.
The project is co-sponsored by the Center for State Policy and Leadership at the University of Illinois Springfield and the Abraham Lincoln Association. It has also received funding from the NEH and the National Historical Publications and Records Commission.

Project Team Members: http://www.papersofabrahamlincoln.org/about-us/staff-descriptions currently lists twelve names and position titles ranging from “Graduate Assistant” to “Director and Editor” (Daniel W. Stowell).

Interns: http://www.papersofabrahamlincoln.org/about-us/our-interns
Editorial and Advisory Board: http://www.papersofabrahamlincoln.org/about-us/editorial-and-advisory-board

TCP initiatives: EEBO-TCP (Early English Books Online); Evans Early American Imprint Collection- TCP; and ECCO-TCP (Eighteenth Century Collections Online)

http://quod.lib.umich.edu/e/eebogroup/; http://quod.lib.umich.edu/e/evans/ ; http://quod.lib.umich.edu/e/ecco/

Project Scope: Designed to bring “Early English Books”, Early American Imprints, and Eighteenth Century Manuscripts to a searchable interface for a wide audience.
“Simply put, EEBO is a commercial product published by ProQuest LLC, and available to libraries for purchase or license. EEBO-TCP is a project based at the University of Michigan and Oxford, and supported by more than 150 libraries around the world. EEBO consists of the complete digitized page images and bibliographic metadata (catalog records) for more than 125,000 early English books listed in Pollard & Redgrave’s Short-Title Catalogue (1475-1640) and Wing’s Short-Title Catalogue (1641-1700) and their revised editions, as well as the Thomason Tracts (1640-1661) collection and the Early English Books Tract Supplement. With EEBO alone, you can search for a book based on the information in the catalog record and you can flip through or download page images in TIFF or PDF format. With EEBO alone, it is not possible to search the full text of a book or to read a modern-type transcription of the text.
“EEBO-TCP captures the full text of each unique work in EEBO. This is done by manually keying the full text of each work and adding markup to indicate the structure of the text (chapter divisions, tables, lists, etc.). The result is an accurate transcription of each work, which can be fully searched, or used as the basis of a new project. To date, EEBO-TCP has produced more than 40,000 texts. The EEBO-TCP text files are delivered back to ProQuest and indexed in EEBO, so users at partner libraries can seamlessly perform full text searches and view transcriptions right within the EEBO platform, although the texts can also be accessed in other ways. EEBO-TCP is administered by the University of Michigan Library, with teams of editors at Michigan and Oxford.”

Similarities to EMMO: Reasonably similar in that it provides search functionalities to resources which are then available to view. There is no crowdsourcing and there are no annotations; this is just a search-and-find interface.

Management Approach: Completely centrally managed.

Resources: All three projects are in partnership with TCP.

Sponsoring Institution: University of Michigan and Oxford; since EEBO is a subscription service it is supported by the subscription fees (each membership library pays $60,000 to become a partner).

Project Team Members: Not readily known.

Transcribe Bentham

http://blogs.ucl.ac.uk/transcribe-bentham/

Project scope: Through crowd-sourcing, this project looks to digitize Jeremy Bentham’s unpublished manuscripts and make the digital images available.

Similarities to EMMO: Transcribe Bentham is similar to EMMO in that it provides an open-source information hub with manuscripts, crowd-sourced transcription efforts, and some search functionality. The TB search function, however, is not very robust.

Management approach: Crowd-sourced; from the project’s website FAQ: “[anyone can take part in this project]; You do not need any specialist knowledge or training, technical expertise, prior approval from us, nor do you need any historical or philosophical background. All that is required is some enthusiasm (and, perhaps, a little patience!).”

Resources: Transcribe Bentham is run using MediaWiki, a free, open-source wiki software package. Since the effort is crowd-sourced, it is difficult to say how many active participants are working on these manuscripts.

Sponsoring institution: The Bentham manuscripts are the property of University College London’s archive, and the project was begun under its auspices. As of October 1, 2012, the project is supported by the Andrew W. Mellon Foundation.

Project team members:

  • Professor Philip Schofield (Project Director)
  • Dr. Tim Causer (Research Associate)
  • Professor Melissa Terras (Reader in Electronic Communication, UCL Department of Information Studies, and Co-Director, UCL Centre for Digital Humanities)
  • Mr. Richard M. Davis (Development Manager, ULCC Digital Archives)
  • Dr. Arnold Hunt (Curator of Modern Historical Manuscripts, British Library)
  • Mr. José Martin (Digital Repositories Specialist, University of London Computer Centre)
  • Mr. Martin Moyle (Digital Curation Manager, UCL Library Services)
  • Ms. Lesley Pitman (Librarian and Director of Information Services, UCL School of Slavonic and East European Studies Library)
  • Ms. Anna-Maria Sichani (Transcription Assistant)
  • Mr. Tony Slade (Head of UCL Creative Media Services)
  • Dr. Justin Tonra (Research Associate)
  • Dr. Valerie Wallace (Research Associate)

Full bios for project team members available here: http://blogs.ucl.ac.uk/transcribe-bentham/people/

Wittgenstein Source: Wittgenstein Archives at the University of Bergen

http://129.177.5.31/documentation/en/home.html

Project scope: A searchable and filterable online archive of the primary sources used by Wittgenstein; as advertised on the project’s home page: “Browse scholarly editions of Wittgenstein's works and Nachlass. Use a set of tools to retrieve and filter content. Work with essays about Wittgenstein. Submit your own contributions for peer-reviewed publication.”
One exemplary feature is the ability to customize viewing settings through filters toggled by the researcher. Remarks, section marks, etc. can be hidden or shown (toggled individually by section or comment mark type), certain portions of writing (dedication, motto, preface, etc.) can be highlighted or not, and the document can be viewed in diplomatic or normalized page layout. Each of these options is a single toggle, so a researcher can essentially customize the view of the transcription.
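
The diplomatic/normalized toggle described above can be illustrated with a minimal sketch. It assumes a TEI-style encoding in which each variant spelling is wrapped in a choice element holding an orig (original) and a reg (regularized) reading; the sample sentence and the render function are hypothetical, and nothing here is known to reflect how Wittgenstein Source itself is implemented.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a sentence encoded with TEI <choice>/<orig>/<reg>.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">I haue '
    '<choice><orig>receaued</orig><reg>received</reg></choice>'
    ' your <choice><orig>lettre</orig><reg>letter</reg></choice>.</p>'
)


def local_name(element):
    """Return the tag name without its XML namespace prefix."""
    return element.tag.split("}")[-1]


def render(xml_text, view="diplomatic"):
    """Flatten the XML to plain text, keeping one reading per <choice>.

    view="diplomatic" keeps <orig>; any other value keeps <reg>.
    """
    keep = "orig" if view == "diplomatic" else "reg"

    def walk(element):
        parts = [element.text or ""]
        for child in element:
            if local_name(child) == "choice":
                # Emit only the reading that matches the selected view.
                for reading in child:
                    if local_name(reading) == keep:
                        parts.append(walk(reading))
            else:
                parts.append(walk(child))
            # Text that follows the child inside the parent element.
            parts.append(child.tail or "")
        return "".join(parts)

    return walk(ET.fromstring(xml_text))


print(render(SAMPLE, "diplomatic"))
print(render(SAMPLE, "normalized"))
```

Keeping both readings in a single encoded source, and choosing between them only at display time, is what lets one transcription serve both views.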

Similarities to EMMO: This project is still in its infancy, so it is unclear how similar it will be to EMMO once it is fully up and running. It resembles EMMO in that it provides an online source for manuscripts on a particular theme, and in that it offers a digital interface with a great many viewing options.

Management approach: Somewhat crowd-sourced, though all contributions are peer reviewed before they are published on the website.

Resources: Very unclear at this time; the project is still in its infancy and the website even more so.

Sponsoring institution: The “Institutions and Sponsors” page lists the following sponsors:

  • eContent+ and the DISCOVERY consortium, Luxembourg
  • COST Action A32, Brussels
  • Uni Digital (earlier "Unifob Aksis"), a department of Uni Research (earlier "Unifob"), Bergen
  • University of Bergen (UiB), Bergen
  • L. Meltzers Høyskolefond, Bergen
  • Trinity College Cambridge (TCC), Wren Library, Cambridge
  • Bertrand Russell Archives (BRA), Ontario
  • Oxford University Press (OUP), Oxford
  • InteLex Corporation, Charlottesville

The “Research Groups” page further indicates that: “Wittgenstein Source is produced and maintained by the Wittgenstein Archives at the University of Bergen (WAB). WAB is part of the Uni Research (Bergen) department Uni Digital.”

Project team members: General Editor: Alois Pichler; other team members are not yet made known to the public (the “Editorial Board” page of the archive is under construction).

Chart

Other Resources

Association for Documentary Editing

http://www.documentaryediting.org/wordpress/

Recognized by the MLA as an allied organization; explanation of the project from their website: “the Association for Documentary Editing was created in 1978 to promote documentary editing through the cooperation and exchange of ideas among the community of editors.”

HRI: Humanities Research Institute (especially HRI digital)

https://www.shef.ac.uk/hri/technology and http://hridigital.shef.ac.uk

Explanation from their website: “The Humanities Research Institute is one of the UK's leading centres for digital humanities, providing research and development services for the arts, humanities and heritage domains.”

HRI provides assistance with project conception, proposal development, training staff, digital output, facilitating knowledge exchange, data development standards, online publishing services, etc. Essentially, HRI looks to facilitate the implementation of digital humanities projects.

Kiosque

http://hridigital.shef.ac.uk/kiosque

Exhibition software developed by the University of Sheffield and the Knowledge Transfer Partnership which allows museum visitors to interactively explore manuscripts via a public exhibition. Ideally used in conjunction with the Virtual Vellum viewing environment.

Media Commons Press

http://mcpress.media-commons.org

An academic press devoted to hosting online editions of publications. Media Commons provides software, host space, and support for digital projects that don’t have the time/know-how to create their own infrastructure.

TCP (Text Creation Partnership)

http://www.textcreationpartnership.org

“The primary goal of the Text Creation Partnership is to create standardized, accurate XML/SGML encoded electronic text editions of early printed books. We transcribe and encode the page images of books from ProQuest’s Early English Books Online, Gale Cengage’s Eighteenth Century Collections Online, and Readex’s Evans Early American Imprints.
“This work, and the resulting text files, are jointly funded and owned by more than 150 libraries worldwide. Ultimately, all of the TCP’s work will be placed into the public domain for anyone to use.
“The texts can be searched through web interfaces provided by the libraries at the University of Michigan and University of Oxford. In addition, partner libraries and their users are welcome to locally store, host, manipulate, analyze and otherwise work with the encoded text files, just as if they had been created locally.”

TEI: Text Encoding Initiative

http://www.tei-c.org/index.xml

Explanation of the project from their website: “a consortium which collectively develops and maintains a standard for the representation of texts in digital form. Its chief deliverable is a set of Guidelines which specify encoding methods for machine-readable texts, chiefly in the humanities, social sciences and linguistics.”

TEI provides tools for standardizing the encoding of text documents, including schemas to maintain tagging integrity, XSL style sheets, and OxGarage (which can convert documents between a variety of formats).

Other Projects

Cause Papers

http://www.hrionline.ac.uk/causepapers/

A searchable database of over 14,000 cause papers relating to cases heard between 1300 and 1858 in the Church Courts of the diocese of York. Users can view images of the original papers as well as transcriptions.

Database of Mid-Victorian Illustration

http://hridigital.shef.ac.uk/dmvi/index.php

A searchable database of 868 literary illustrations published in or around 1862, with bibliographic and iconographic details. Lightbox functionality allows a user to select specific images to view in a customized table at any point during her search.

Electronic Beowulf

http://ebeowulf.uky.edu

An electronic edition of Beowulf with included line-by-line translation. Also available are search functionalities, transcripts of various editions, and overviews of the history of these transcriptions.

Flora Tristan Project

http://hridigital.shef.ac.uk/flora-tristan

The Arts and Humanities Research Board, in conjunction with the University of Sheffield, sponsored this project, an effort to transcribe the corpus of letters that Tristan wrote over her life. The effort produced a CD-ROM containing the transcriptions (which are tagged with XML and use XSL style sheets and a Java search applet).

Galdós Editions Project

http://www.hrionline.ac.uk/galdos

A project sponsored by the Arts and Humanities Research Board through HRI to create a new critical edition of the Torquemada novels of Benito Pérez Galdós. This edition is available both in hard copy and online.

Hartlib Papers

http://www.hrionline.ac.uk/hartlib

A complete electronic edition of the 25,000 manuscripts of seventeenth-century man of science Samuel Hartlib. It is freely available online with full-text transcription and facsimile images.

In Mozart’s Words

http://letters.mozartways.com

A searchable online edition of Mozart’s letters in four languages. The database also includes access to background materials that bolster the letters’ content (e.g. newspapers, reviews, objects, paintings, and documents).

Late Medieval English Scribes

http://www.medievalscribes.com

An online catalogue of all scribal hands, identified or unidentified, which appear in the manuscripts of Geoffrey Chaucer, John Gower, John Trevisa, William Langland, and Thomas Hoccleve. Includes a searchable database of the documents that leads to bibliographic entries rather than scanned pages.

Norman Blake Editions of the Canterbury Tales

http://hridigital.shef.ac.uk/norman-blake-editions

A transcription effort which strives to produce full diplomatic transcriptions of Chaucer’s The Canterbury Tales. The editions are to be published through HRI Online and are, as of yet, unavailable.

Old Bailey Proceedings Online

http://www.oldbaileyonline.org

A fully searchable online edition of the proceedings of the Old Bailey, 1674-1913. Text is available both in transcription and as scans of the original documents.

Olive Schreiner Letters Online

http://www.oliveschreiner.org

Complete transcriptions of approximately 7,000 letters of nineteenth-century feminist Olive Schreiner. The letters are freely available to search, access, read, and print, with hyperlinked keywords within the transcriptions.

The Online Froissart

http://www.hrionline.ac.uk/onlinefroissart

A searchable online edition of Jean Froissart’s Chronicles of the Hundred Years’ War. Available here are various transcriptions, facsimiles, and commentaries (which may be compared side-by-side).

Origins of Early Modern Literature

http://www.hrionline.ac.uk/origins

A searchable online catalogue of literary works dated 1519-1579 intended to be the primary research spot for students and scholars whose focus is the Tudor period.

Partonopeus de Blois

http://www.hrionline.ac.uk/partonopeus/

An electronic edition of the anonymous 12th-century French romance Partonopeus de Blois. Includes a robust search function, though no original scans of the document are available via this edition; it exists in transcription only.

Renaissance Cultural Crossroads

http://www.hrionline.ac.uk/rcc/

A searchable, analytical and annotated list of all translations out of and into all languages printed in England, Scotland, and Ireland before 1641. It also includes all translations out of all languages into English printed abroad before 1641. Because this searches for translations of documents, the resulting pages give information about documents rather than the documents themselves.

Richard Brome Online

http://www.hrionline.ac.uk/brome/

An online edition of the collected works of Richard Brome. Available in side-by-side comparison between modern and quarto texts, this edition is also searchable.

Stuart London Project

http://www.hrionline.ac.uk/strype

The aim of this project was to create a full-text electronic edition of seventeenth-century historian John Strype’s two-volume Survey of London. The edition is searchable and available page-by-page with separate links to included maps and illustrations. Notes to the text are included in the margins.

Smithsonian Digital Volunteers: Transcription Center

http://transcription.si.edu

The Smithsonian is crowd-sourcing transcription efforts to make its collection much more freely available via the internet. Transcriptions are available side-by-side with original document view in a searchable interface.

Whites Writing Whiteness

http://www.whiteswritingwhiteness.ed.ac.uk

Hosted by the University of Edinburgh, the WWW campaign is dedicated to explicating the theme of whiteness in South Africa. The team is doing this by transcribing and analyzing letters contained in approximately fifty South African family-based archive collections, then using a Virtual Research Environment (VRE) to analyze the metadata tagged to each of these letters. The project is still in progress and the transcription database is not available online.