Home

I am a computational linguist in the Linguistics Department at the University of Canterbury in Christchurch, New Zealand. My research uses data science to model language learning and language change in large multilingual corpora. My recent work has also focused on the impact of linguistic variation on language technology as well as on low-resource languages.

Before joining the University of Canterbury, I held positions in computer science at the Illinois Institute of Technology and received a PhD specializing in computational linguistics from Purdue University under Victor Raskin. I have published over 30 papers in top venues, and my first book, Natural Language Processing for Corpus Linguistics, is now available from Cambridge University Press. My extensive teaching experience includes a MOOC that has taught over 11,000 students about natural language processing.

On a practical level, my work provides solutions to difficult problems in Language Identification, Dialect Identification, Construction Grammar, Language Mapping, and Corpus Similarity.

If you’re interested in learning more about computational linguistics, check out my recent book or my two courses on edX: Text Analytics 1: Introducing Natural Language Processing and Text Analytics 2: Visualizing Natural Language Processing. Taken together, these free courses provide a basic introduction to natural language processing. You can also use my introductory Python package on its own: Text Analytics.