About

Contact Method

email

Office Hours

Wed. 12:30–2:30

Office

GUG 418-D

My primary research interests are in accessibility for Natural Language Processing (NLP), including representation of minority languages in NLP and encouraging fairness in NLP through attention to diversity in language data and techniques. My dissertation research focused on writing the INterlinear Text ENrichment Toolkit (INTENT), a tool designed to bootstrap linguistic knowledge for basic NLP tasks using extracts from linguistics papers as a proxy for expert annotation. Via the Online Database for INterlinear text (ODIN), INTENT supports upwards of 1,500 languages.

As part of the RiPLes project, I have worked on a variety of tools for extracting data from multilingual PDF documents, including language identification, passage and document classification, and even PDF-to-text analysis for academic linguistics papers.