This is a list of digital humanities tools for textual analysis. Most were initially compiled by Brett Hirsch, Heather Froehlich, and other participants in Early Modern Digital Agendas (2013), the Folger Institute's institute for advanced topics in digital humanities. For more resources, please see the Bibliography of textual analysis readings and the Glossary of digital humanities terms. Additional links and resources are welcome.

For additional text analysis tools, see the Bamboo DiRT list.

AntConc (Desktop, cross-platform: Mac/Win/Unix)

  • AntConc is a concordance program for digital text analysis with built-in statistical measures. It helps identify keywords, collocates, and clusters, and many supporting resources are available on the software page.

DocuScope (Desktop; Java, cross-platform)

  • DocuScope is a text analysis environment with a suite of interactive visualization tools for corpus-based rhetorical analysis.

Gephi (Desktop, cross-platform)

  • Gephi is an open-source application for network analysis, visualization, and data manipulation.

Intelligent Archive (Desktop; Java, cross-platform)

  • Intelligent Archive is an interface to an archive of texts that incorporates a range of counting functions to support statistical analysis and computational stylistics.

Juxta (Desktop; Mac/Win/Unix)

  • Juxta is an open-source cross-platform tool for comparing and collating multiple witnesses to a single textual work. The software allows users to set any of the witnesses as the base text, to add or remove witness texts, to switch the base text at will, and to annotate Juxta-revealed comparisons and save the results.

MALLET (Desktop; Java, cross-platform)

  • MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.
  • Graphical User Interface version of MALLET (Desktop; Java, cross-platform)
    • This is a graphical user interface (GUI) for MALLET's Latent Dirichlet Allocation implementation.

MorphAdorner (Desktop; Java, cross-platform)

  • MorphAdorner is an XML lemmatizer, text segmenter and natural language processing parser for Early Modern text (especially EEBO-TCP texts).

Python (Desktop; Mac/Win/Unix)

  • Python is a free, open-source programming language with a clean, flexible, and readable syntax. Python packages such as the Natural Language Toolkit, BeautifulSoup, and Whoosh are particularly useful for digital text analysis.
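
As a rough illustration of the kind of work Python supports, the sketch below counts word frequencies and builds a simple keyword-in-context (KWIC) concordance using only the standard library. The function names `tokenize` and `kwic`, the regular expression, and the sample sentence are assumptions made for this example; they are not taken from any of the packages named above, which offer far more robust tokenization and concordancing.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def kwic(tokens, keyword, width=3):
    """Return one line per hit, with up to `width` tokens of context per side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Hypothetical sample text, chosen only to keep the example short.
sample = "To be, or not to be, that is the question."
tokens = tokenize(sample)
freq = Counter(tokens)          # word -> number of occurrences
hits = kwic(tokens, "be", width=2)

for line in hits:
    print(line)
```

For real corpora, a dedicated tokenizer (for example, one from the Natural Language Toolkit) would likely handle punctuation, hyphenation, and early modern spelling variation better than this single regular expression.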

R (Desktop; Mac/Win/Unix)

  • R is a free software environment for statistical computing and graphics.

WordHoard (Desktop; Java, cross-platform)

  • WordHoard is an application for the close reading and scholarly analysis of deeply tagged texts.

WordSmith Tools (Desktop; Windows)

  • WordSmith Tools is a concordance program for digital text analysis with built-in statistical measures, including keywords and collocation. Many supporting resources are available on the software page.

Versioning Machine (Web)

  • Versioning Machine is a framework and an interface for displaying multiple versions of text encoded according to the Text Encoding Initiative (TEI) Guidelines.

Voyant Tools (Web)

  • Voyant Tools is a web-based reading and analysis environment for digital texts.