Reading the First Books: Multilingual, Early-Modern OCR for Primeros Libros

Categories

  • Scholarship

Tags

  • Book History
  • Digital Humanities

Abstract

Reading the First Books: Multilingual, Early-Modern OCR for Primeros Libros is a two-year, multi-university effort to develop tools for the automatic transcription of early modern printed books. It is a collaboration among students, faculty, and staff at the University of Texas at Austin and Texas A&M University.

The Reading the First Books project will:

  • Develop tools for the automatic transcription of books printed in multiple languages, using variable orthographies, during the first centuries of the printing press.
  • Make those tools accessible to institutions and individuals by incorporating them into the Early Modern OCR Project (eMOP), an open-source OCR workflow at Texas A&M University.
  • Produce automatic transcriptions of the Primeros Libros de las Américas collection, which comprises books printed in the Americas before 1601 and written in Spanish, Latin, Nahuatl, Huastec, Mixtec, Otomi, Tarascan, and Zapotec.

View the project website in the Wayback Machine

Read the White Paper on the NEH website