History of Early English Books Online

Early English Books Online (EEBO) is a ProQuest/Chadwyck-Healey subscription database of over 125,000 mostly English works printed between 1473 and 1700. The works are represented in digital images (in PDF and TIFF formats) and through bibliographical descriptions drawn from the English Short-Title Catalogue, the Wing Catalogue, the Thomason Tracts, and the Early English Books Tract Supplement.

This article on the history of EEBO grew from discussions in the Folger Institute’s 2012 Workshop on Teaching Book History and the summer 2013 Early Modern Digital Agendas seminar (EMDA2013). The latter was an NEH-ODH Institute in Advanced Topics. At EMDA2013, visiting faculty member Ian Gadd led participants through a two-day investigation of EEBO. Members of EMDA argue that understanding the development, current use, and limitations of EEBO allows scholars to fully consider the influence of the digital tool on our understanding of early modern texts. This essay, primarily authored by Erica Zimmer and edited by Meaghan Brown, covers the history of EEBO. For current use and limitations, see Using Early English Books Online.

Origins: the short-title catalogues

Over the course of the EEBO project, short-title catalogues served as the means of selecting texts for imaging and as sources for the bibliographical descriptions that accompany the images. These short-title catalogues were products of nineteenth-century nationalism; defined by national and linguistic boundaries, they focused on books printed in England, Scotland, Ireland, and Wales, as well as English language books printed outside the British Isles. The short-title catalogues delineated both the scope of EEBO and the way its contents are presented and made searchable.

A short-title catalogue is a list of printed works designed to identify editions, typically including a variety of bibliographical information: the shortened version of the title, publication information (“imprint”), subject headings, genre terms, pagination and format, and references to other catalogues. Entries in a catalogue may vary in completeness. Historically, the abbreviation of titles and other information has been required by the limited space of printed catalogues. Digital catalogues may include full titles and far more detail.

EEBO’s history is closely related to that of the English Short Title Catalogue, or ESTC, a digitized short-title catalogue built from three earlier resources. A. W. Pollard and G. R. Redgrave’s A short-title catalogue of books printed in England, Scotland, & Ireland and of English books printed abroad, 1475–1640 (STC) and Donald Wing’s Short-title catalogue of books printed in England, Scotland, Ireland, Wales, and British America, and of English books printed in other countries, 1641–1700 (Wing) were both employed to select content for EEBO. The works described in a third short-title catalogue, the Eighteenth-Century STC (also abbreviated ESTC), fall outside the time period covered by EEBO, but the Eighteenth-Century STC played a role in developing the English Short Title Catalogue and the metadata standards it shares with EEBO.

The STC: Pollard and Redgrave’s Short-Title Catalogue

Perhaps best known among these catalogues is A. W. Pollard and G. R. Redgrave’s A Short-title Catalogue of Books Printed in England, Scotland, and Ireland, and of English Books Printed Abroad, 1475–1640 (better known as the Short-Title Catalogue, or STC). Understanding the STC’s intended purpose can help clarify the nature of the data it contains. Neither the STC nor its revisions sought to present a complete picture of early modern English print culture from 1475–1640. Nor was each entry meant to be a researcher’s final destination for the history of a given work. Initially, the STC was intended to serve as a handlist, or shortened finding aid, for titles held within the British Museum (now the British Library) in London, where Pollard was employed as Keeper of Printed Books.[1] Pollard’s 1926 “Preface” describes the STC as “a catalogue of the books of which its compilers have been able to locate copies, not a bibliography of books known or believed to have been produced.”[2] To this end, a major feature of STC entries is the list of libraries in which an STC title might be found. Since the STC’s list later served as a central point of reference for selecting copies used in the project now known as EEBO, chronological and categorical parameters of the STC’s coverage will also circumscribe that portion of EEBO records guided by STC entries.

First published in January 1927 (1926, on the title page) and revised over the years from the late 1940s until 1991, the STC includes printed works published within specific temporal and national guidelines. As Ian Gadd summarized in his explanation to EMDA2013, the work:

  • must have been printed before 1641
  • at least part of it must have been printed using printing type (rather than any other method of printing such as engraving)
  • it must have been either printed in the British Isles or its colonies or, if printed elsewhere in the world, to have been printed in English or any other British language.[3]

Discussing the implications of this final requirement, Gadd notes the exclusion from the STC of “very large numbers of foreign-printed Latin books imported into England from the fifteenth centuries onwards.” Hence, he emphasizes, one “cannot read STC’s contents as a full representation of Britain’s print culture prior to 1641.”[4]
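The three criteria summarized by Gadd can be read as a simple filter. What follows is a minimal sketch in Python, offered only as an illustration and not as anything the STC's compilers used. The class and field names are hypothetical, and the set of "British languages" is an assumption made for the example, since the criteria as quoted do not enumerate them.

```python
from dataclasses import dataclass

# Illustrative assumption: the criteria say only "English or any other
# British language" and do not list the languages.
BRITISH_LANGUAGES = {"English", "Welsh", "Scots", "Scottish Gaelic", "Irish"}


@dataclass(frozen=True)
class PrintedWork:
    """Hypothetical record holding just the facts the criteria need."""
    year: int
    uses_printing_type: bool        # at least partly printed from type
    printed_in_british_isles: bool  # British Isles or its colonies
    language: str


def meets_stc_criteria(work: PrintedWork) -> bool:
    """Apply the three criteria in the order Gadd lists them."""
    if work.year >= 1641:
        return False
    if not work.uses_printing_type:
        return False
    # Printed at home in any language, or abroad in a British language.
    return work.printed_in_british_isles or work.language in BRITISH_LANGUAGES


latin_import = PrintedWork(1550, True, False, "Latin")
london_latin = PrintedWork(1550, True, True, "Latin")
english_abroad = PrintedWork(1550, True, False, "English")

print(meets_stc_criteria(latin_import))    # False
print(meets_stc_criteria(london_latin))    # True
print(meets_stc_criteria(english_abroad))  # True
```

The first example is the case Gadd highlights: a Latin book printed abroad and imported into England fails the third criterion, however widely it may have been read there.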

The standard fields of an STC entry include:

  • author
  • title
  • STC number (STC is not italicized when referring to the entry number for citation purposes)
  • further editions (if any)
  • imprint: place of publication, printer, publisher,[5] or bookseller names, and dates
  • format[6]
  • entries in the Stationers' Register[7]
  • libraries in which the title is found
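For readers who work with catalogue data computationally, the fields listed above can be modelled as a simple record. This is a minimal sketch in Python under stated assumptions: the class name, field names, and types are hypothetical, the sample values are placeholders rather than a real STC entry, and neither the STC nor EEBO is known to structure its data this way.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class STCEntry:
    """Hypothetical model of the standard fields of an STC entry."""
    author: str
    title: str                 # shortened title
    stc_number: str            # kept as text so any suffix survives
    imprint: str               # place, printer/publisher/bookseller, date
    book_format: str           # e.g. folio, quarto, octavo
    further_editions: List[str] = field(default_factory=list)
    stationers_register: List[str] = field(default_factory=list)
    holding_libraries: List[str] = field(default_factory=list)

    def citation(self) -> str:
        """Return the entry number in the usual 'STC <number>' form."""
        return f"STC {self.stc_number}"


# Placeholder values only; this is not a real catalogue entry.
entry = STCEntry(
    author="Author, Example",
    title="A shortened example title",
    stc_number="00000",
    imprint="London: printed by A. B. for C. D., 1600",
    book_format="quarto",
    holding_libraries=["Library A", "Library B"],
)

print(entry.citation())                # STC 00000
print(len(entry.holding_libraries))    # 2
```

Storing the holding libraries as a list keeps the record close to the printed entry, where the list of locations is a starting point for finding copies and not a census of them.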

Some STC entries include additional information in the “Notes” field: readings such as catchwords, line endings, and spelling variations that allow users to distinguish between issues and editions; notes on fonts; an indication of the last page number found in the book (no attempt is made at verifying whether this number is correct); and collations (formulae indicating the physical makeup of the copy).[8] (For a visually supported explanation of “collation” in both bibliographic and material contexts, see Sarah Werner’s multifaceted orientation to the term.) Much of this information, including the textual variations and collations, is useful for suggesting which edition of a given title one has in hand. As Pollard observes in the “Preface,” any sense of a title as a singular and coherent text must be complicated by the multiple variant editions and issues one finds when examining individual copies. The book in hand, however, was always meant to be trusted more than the catalogue, and when making arguments about the physical make-up of early modern texts, examinations of specific exemplars should always be privileged over information taken from the STC.

Conversely, the STC was always meant to help scholars locate copies of specific books to perform such examinations. The listing of libraries at the end of each entry is meant to provide the user with a starting place for locating each listed item. Like other aspects of the STC, these lists of holding institutions are not complete: up to ten repositories are listed in Britain, followed by up to ten in North America.[9] Nor, more broadly, was the STC ever meant as a complete record of early modern British book production or its surviving products. Warning against any sense of completeness the STC itself might create, Pollard cautions “all users of this book” to note “that from the mixed character of its sources it is a dangerous work for anyone to handle lazily, that is, without verification.”[10]

STC2: the Revised Short Title Catalogue

Revisions to the STC began with the interleaved annotations of William A. Jackson, first Librarian of the Houghton Library at Harvard, and extended through Jackson’s collaboration with F. S. Ferguson, Managing Director of Bernard Quaritch, Ltd., in England.[11] As discussions of the project progressed, Pollard’s own annotated copy of STC was also provided by Frank C. Francis, director of the British Museum and editor of The Library. Katharine F. Pantzer, Librarian of the Houghton Library at Harvard and culminating editor of the project, describes the STC revisions as a complex process of aggregation, confirmation, and reconsideration lasting over 40 years. Pantzer’s April 1985 “Acknowledgements” to volume 1 (pub. 1986) underscore the revised STC’s continued observance of the initial edition’s principles, even as she urges users to remain critically aware of the work’s limitations.[12] No further revisions are being implemented to the print STC.[13]


The STC project was extended chronologically by the work of Yale librarian Donald Wing. Like the earlier STC, Wing’s Short-Title Catalogue of Books Printed in England, Scotland, Ireland, Wales and British America and of English Books Printed in Other Countries, 1641–1700 presents a short-title list of known copies delineated by both geographic and linguistic restrictions. Commonly referred to as “Wing” due to the similarity in titles, this extension was initially published in three volumes during the years 1945–51. The work was revised from 1972–98 with attention to systemic inaccuracies. When revision of the initial three-volume set was completed, it was republished in its entirety (Wing2), with a fourth volume added in 1998. As with the STC, no further revisions are planned for Wing.

As Gadd observes, while Wing covers only 60 years, or less than half of the STC’s 165-year span, it contains a far larger number of entries, given the proliferation of English and English-language printing in the period addressed.[14] For this reason, constraints on Wing’s contents are stricter: unlike the STC, Wing does not include periodicals or other ephemeral materials, and its entries’ bibliographic descriptions are not as detailed.[15] Both differences further circumscribe the picture of the print culture Wing presents.[16]

The Eighteenth-Century Short Title Catalogue

The use of the Eighteenth-Century Short Title Catalogue (ESTC) as the foundation of the English Short Title Catalogue (also abbreviated ESTC, so the two acronyms overlap) has informed the structure of EEBO’s metadata, even though its coverage begins where Wing's ends. Over the period 1977–87, this catalogue of eighteenth-century metadata and page images was released by the British Library both on microfiche and electronically via CD-ROM. Subsequently, its page images have been made available online as Eighteenth Century Collections Online, or ECCO.

The Eighteenth-Century STC differs from both the STC and Wing in several important ways, including its use of computerized cataloging.[17] Since its modes of compilation and publication were electronic from the outset, it was not compelled to grapple in the same way with space constraints limiting its print predecessors. Further, the Eighteenth-Century STC was conceived as a comprehensive union catalogue—that is, a list of the known copies of works falling within its stated parameters across multiple institutions.[18] Yet the substantial increase in volume of works published during the eighteenth century itself created the need to exclude several major categories of material entirely from its listings. Specifically excluded were “serials, bookplates, trade cards, playbills and blank forms,” as well as “engraved prints, music, [and] maps.”[19] Due to these exclusions, it does not represent a complete list of works printed in the period covered.

Creating the Current Database: EEBO Remediations


In 1938, Eugene Power began photographing early printed books on microfilm, initiating the imaging project that would eventually become EEBO.[20] Early production of these microfilms increased sharply at the beginning of World War II, when global conflict threatened many of the world's libraries and the early modern resources they contained. Scholarly concern for the security of collections intersected with the early commercial interests of University Microfilms International (UMI), Power’s “fledgling” microfilm business, which at the time had sixteen institutional customers and aspired to a publishing model resembling the modern notion of “print on demand.”[21] Power sought to increase the scope of research possible in American libraries by photographing the pages of early modern books held abroad, then converting these images to microfilm—a relatively new storage technology at the time—for printing when the need to read arose. As Bonnie Mak explains, works listed in the STC were considered desirable as “demand for them would be certain: American libraries, having been established relatively recently, were generally lacking in STC titles.”[22] Using this new storage technology would allow images of the works to be kept on-site and consulted as needed.

As war spread across Europe, travel to European libraries was judged extremely dangerous, and this factor limited Americans’ movements to and through the area. Yet the scholarly community and Power’s enterprise supported action being taken to “preserve the irreplaceable volumes in England, at least.”[23] With a Rockefeller grant funding his work’s preservation aspect, Power traveled to libraries in the United Kingdom as part of an undertaking managed in the United States by Margaret Harwick.[24] The technology was also deployed during wartime to create images of German correspondence and other materials, since the microfilming techniques could be used on many forms of printed matter. After the conflict concluded, Power’s enterprise, known as “Early English Books” (EEB), was allowed to retain the high-quality cameras used in military contexts. The corresponding increase in both capacity and photographic quality further advanced UMI’s work with early modern texts.[25]

Navigating the numbering system associated with these microfilm rolls requires an awareness of their material history, as well as their connection to STC cataloguing. Initially, UMI microfilms were to contain one book per microfilm roll, or “reel,” a plan that, if followed, would have created a system in which the reels’ numbering followed the copy-specific numerical entries of the STC. Since each microfilm reel proved able to store twenty to thirty books’ worth of page images, however, a more complex numbering system—one indicating a book’s position within a reel—was created to indicate the microfilms' relation to STC records.[26] As presently maintained, the EEBO search interface bears traces of this material heritage: on the “Advanced” search screen, users may search for works by “Reel Position.” In addition, many institutions still retain the physical microfilm reels.
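A reel-position reference of this kind combines a reel number with a book's position on that reel. The sketch below shows how such a reference might be split into its parts in Python. The notation "series / reel:position" (for example "STC / 223:07") is assumed here purely for illustration and should be checked against the records one actually has; the function and class names are hypothetical.

```python
import re
from typing import NamedTuple


class ReelPosition(NamedTuple):
    series: str    # microfilm series label, e.g. "STC" or "Wing"
    reel: int      # reel number within the series
    position: int  # position of the book on that reel


# Assumed notation: "<series> / <reel>:<position>", e.g. "STC / 223:07".
_REEL_PATTERN = re.compile(
    r"^\s*(?P<series>[A-Za-z]+)\s*/\s*(?P<reel>\d+)\s*:\s*(?P<position>\d+)\s*$"
)


def parse_reel_position(text: str) -> ReelPosition:
    """Split a reel-position string into series, reel, and position."""
    match = _REEL_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not in the assumed reel-position notation: {text!r}")
    return ReelPosition(
        match["series"], int(match["reel"]), int(match["position"])
    )


print(parse_reel_position("STC / 223:07"))
# ReelPosition(series='STC', reel=223, position=7)
```

Keeping the reel and the position as separate integers makes it easy to sort references in the order the books appear on the film, which a plain string sort would not do reliably.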

Digitizing for CD-ROM

In the 1990s, the decision was made to create digital facsimile images by scanning the microfilms themselves.[27] Groups of digitized images were made available beginning in 1998, and while the means of distribution have moved online, the process of microfilming, then scanning, continues to this day. Initially, the scans were captured in black and white; in 2012, the digitization moved to greyscale, thereby rendering in greater detail images converted from the microfilm.

Over the years, access to the digitized microfilms has grown. Subscribers are now able to download page image files in multiple formats, including TIFF and PDF. Critics have noted that the remediation involved (that is, the transfer of material from one medium to another) can elide material traces that enable users to register the valences of the artifacts they study, as well as identify potential gaps within the information provided.[28] Manuscript annotation present on the original books is also rarely legible in the final EEBO digitized image. Yet others have noted that these challenges are counterbalanced by the “opportunity” EEBO scans present to develop bibliographical knowledge. Images of EEB microfilms, when digitized as components of EEBO, bear marks of their inscription, as they have not, in Gadd's words, been “sanitized” in this translation of medium. As he observes, “Openings are retained rather than broken into single pages; images are not cropped; rulers, place-holders, and descriptive notes are left in place; blank leaves are not removed.”[29] (Sample instances of such “EEBO Oddities” have been gathered by Whitney Trettien and are available for viewing by EEBO subscribers.)

EEBO’s Metadata: the Contributions of the ESTC

Each EEBO record contains a range of information about the work in question. See Using Early English Books Online for a full explanation of the parts of these records and their significance to searching functions.

This metadata is drawn ultimately, but not directly, from the STC and Wing short-title catalogues. The growth of the online English Short Title Catalogue (ESTC) stemmed from a 1987 decision to “expand backwards” the database containing Eighteenth-Century STC records “by incorporating material from RSTC [STC2] and Wing2.”[30] Adding these earlier lists to the database created a data set relevant to EEBO’s growing digitization work. During the years 1989–1997, EEBO was given access to ESTC metadata, allowing EEBO to build bibliographic entries that would describe the imaged STC texts.[31] While EEBO entries adapted from the English Short Title Catalogue are “heavily edited,” both resources share reference to the STC and Wing reference lists. In this way, ESTC fields retained by EEBO cataloguers continue to inform the shape of EEBO as users encounter it.

EEBO and the Text Creation Partnership (EEBO-TCP)

EEBO’s presentation of the ESTC metadata in database format made it possible to rapidly search citations for particular words or phrases and then access images of the texts indicated. Scholars soon sought to perform similar searches on the full texts of works in this corpus. The non-profit Text Creation Partnership, or TCP, supports this latter objective. According to TCP documentation, the Text Creation Partnership will, when complete, contain “standardized, digitally-encoded electronic text editions” of “around 70,000” printed documents whose images are found in EEBO. To date, these full-text transcriptions have been created by hand.[32] When one subscribes to both EEBO and the TCP, the two resources are linked—prompting use of the joint acronym “EEBO-TCP,” although separate subscriptions to each database are also available.

Although the TCP is non-profit, it is subscription-based, and different tiers of access are available. Researchers using EEBO-TCP may find it valuable to determine the level of TCP access provided by their home institutions. Any claim to the “representativeness” of a research corpus should be accompanied by a description of the researcher’s level of access, or taken with a healthy grain of salt, as the access level can skew quantitative results. Some EEBO subscribers do not subscribe to TCP at all: in these cases, searches for discrete terms via the EEBO interface will return the lowest possible number of hits one might obtain, as the existing bibliographic metadata would be the only corpus searched. Available tiers of subscription-based TCP access are organized by EEBO-TCP transcription “phases,” and a list of TCP partners and their access levels is available.

On 1 January 2015, EEBO-TCP will release Phase I of its hand-completed, full-text transcriptions to the public. This data set includes over 25,000 full-text, hand-transcribed files marked up for machine readability using Text Encoding Initiative, or TEI, guidelines. This publication event will influence the scope and the nature of insights obtainable through large-scale, computationally assisted research on early modern texts.
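To give a sense of what "marked up for machine readability" makes possible, here is a minimal sketch in Python that pulls the paragraph text out of a TEI document using only the standard library. The XML fragment is invented for illustration and is far simpler than a real transcription; the namespace shown is the standard TEI P5 namespace, and whether a given EEBO-TCP file uses it is an assumption to verify against the file itself. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

# Standard TEI P5 namespace; assumed, not confirmed, for any given TCP file.
TEI = "http://www.tei-c.org/ns/1.0"

# Invented fragment for illustration only; not taken from EEBO-TCP.
SAMPLE = f"""<TEI xmlns="{TEI}">
  <teiHeader>
    <fileDesc><titleStmt><title>A sample title</title></titleStmt></fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="chapter">
        <head>The first chapter</head>
        <p>Of the <hi>arte</hi> of
           printing.</p>
      </div>
    </body>
  </text>
</TEI>"""


def paragraph_texts(tei_xml: str) -> List[str]:
    """Return the plain text of each <p> inside the TEI <text> element."""
    root = ET.fromstring(tei_xml)
    text_element = root.find(f"{{{TEI}}}text")
    if text_element is None:
        return []
    paragraphs = []
    for p in text_element.iter(f"{{{TEI}}}p"):
        # Join text across nested tags, then collapse runs of whitespace.
        paragraphs.append(" ".join("".join(p.itertext()).split()))
    return paragraphs


print(paragraph_texts(SAMPLE))  # ['Of the arte of printing.']
```

Because the header and the transcribed text are distinct elements, a script can count or search words in the transcription alone, without the catalogue metadata in the header inflating the results.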

Critical Implications of this Genealogy

To this day, EEBO records bear traces of these predecessor catalogues. On the interface’s “Advanced” search screen, the option to search by “Bibliographic Number” allows users to search for a copy’s digital page images using its STC or Wing reference number. Likewise, the advanced search allows users to search by EEB reel number, if the microfilm reference is known. The contents of EEBO’s underlying database (specifically, the copies chosen and the represented image sets) are also heavily influenced by STC and Wing listings.[33]

Yet users of later resources depending on these works are not always aware of the limits of these sources. Both the STC and Wing emphasize their intent to serve as reference lists to known copies in particular locations, as well as starting-points for later, more “full-dress” explorations.[34] Since EEBO, via the ESTC, relies upon the listings of both earlier catalogues, its range of resources should be understood as similarly circumscribed. For many, however, the “illusion of comprehensiveness” persists.[35] In part, this sense may result from EEBO’s existing documentation, which at present does not explicitly address the underlying catalogues’ limits. As Mak observes, users’ desires for completeness can prompt them to equate authority with exhaustiveness. The polished façade of the digital can exacerbate this issue.

Mak has recently argued in favor of an “archaeological” awareness of EEBO’s history and development, invoking the metaphor of the palimpsest to convey the rich yet obscured material histories such multilayered resources present.[36] As both she and Gadd point out, working to recover this history results in more expert, nuanced use and understanding of resources such as EEBO, which draw directly upon earlier structures of data.[37] Developing such perspective also supports scholars in making more precise, as well as accurate, claims about the work they conduct.

