I work across both linguistic theory and natural language processing. My research models the emergence of grammatical structure within individuals and its diffusion across global populations.
Before joining the University of Canterbury, I held positions in computer science at the Illinois Institute of Technology and received a PhD in linguistics from Purdue University under Victor Raskin. I have also been an Intelligence Community Research Fellow under the US Office of the Director of National Intelligence and a Visiting Scientist at the US National Geospatial-Intelligence Agency.
I am currently working on global-scale computational dialectology, which combines grammar induction with geospatial text classification. The goal is to model regional syntactic variation accurately enough that dialect models can predict an individual’s region of origin. This work depends on large geo-referenced corpora that reflect the demographics of the underlying populations. For example, here’s a recent paper showing that the COVID-19 pandemic allows us to remove non-local populations from digital corpora.
If you’re interested in learning more about computational linguistics, check out my two courses on edX: Introduction to Text Analytics with Python and Visualizing Text Analytics with Python. Taken together, these free courses provide a basic introduction to natural language processing.