Using Optical Character Recognition (OCR) to Build a DH Corpus
Students will learn how to use common OCR software, including Tesseract and ABBYY FineReader, to build the text corpora they need for common DH methods such as text mining, topic modeling, bibliographic visualizations, and text-as-data analyses.
Skill Level: Beginner
Prerequisites: None
Equipment Requirements: None