Textual Geographies uses named entity recognition and geolocation to extract place names from printed volumes in English, German, Spanish, and Chinese held by the HathiTrust digital library, and to associate those names with detailed geographic information. The project corpus currently includes about 10 million volumes published between 1700 and the present day.

The project is under active development and has received generous funding from the National Endowment for the Humanities Office of Digital Humanities, the American Council of Learned Societies, and the University of Notre Dame.

For more information concerning the corpus and research methods, please contact the project director, Matthew Wilkens.
