Reading the First Books: Multilingual, Early-Modern OCR for Primeros Libros is a two-year, multi-university effort to develop tools for the automatic transcription of early modern printed books. It is a collaboration between students, faculty, and staff at the University of Texas at Austin and Texas A&M University.
The Reading the First Books project will:
- Develop tools for the automatic transcription of books printed in multiple languages, using variable orthographies, during the first centuries of the printing press.
- Make those tools accessible to institutions and individuals by incorporating them into the Early Modern OCR Project (eMOP), an open-source OCR workflow at Texas A&M University.
- Produce automatic transcriptions of the Primeros Libros de las Américas collection, which comprises books printed in the Americas before 1601 and written in Spanish, Latin, Nahuatl, Huastec, Mixtec, Otomi, Tarascan, and Zapotec.