Dunn, J. (2022). “Cognitive Linguistics Meets Computational Linguistics: Construction Grammar, Dialectology, and Linguistic Diversity.” In Tay, D. & Xie Pan, M. (eds.), Data Analytics in Cognitive Linguistics: Methods and Insights. 273-308. Berlin: De Gruyter.
Abstract. Computational linguistics and cognitive linguistics come together when we use data-driven methods to conduct linguistic experiments on corpora. This chapter uses usage-based construction grammar to model geographic variation in language. The basic challenge is to show how grammatical structure emerges given exposure to usage and then how grammatical structures change given exposure to different subsets of usage. We first show how computational methods can be used to experiment with language learning by training a usage-based model of construction grammar. We then show how computational methods can be used to experiment with language variation by training a construction-based model of dialectology. To make these two experiments possible, we must also consider the validity of the corpora that we use for the experiments and how well they represent specific populations. Taken together, the work described here constitutes a computational theory of usage-based grammar that covers seven languages (English, French, German, Spanish, Portuguese, Russian, and Arabic) and 79 distinct national dialects of these languages. Each part of the theory is an implemented computational model that can be evaluated using its predictions on held-out testing data.