A Multi-Dialectal, Longitudinal Corpus of Human-AI Hybrid Language Production

Gan, Q.; Dunn, J.; Nini, A.; & Adams, B. (2026). “A Multi-Dialectal, Longitudinal Corpus of Human-AI Hybrid Language Production.” In Proceedings of the International Conference on Language Resources and Evaluation. ELRA. Abstract. This paper describes a multi-dialectal, longitudinal corpus of human-AI hybrid language production, which includes purely human-produced samples, purely LLM-generated samples, and hybrid samples …

How register and region shape the language network

Morin, C.; Coats, S.; & Dunn, J. (2026). “How register and region shape the language network: Evidence from Computational Construction Grammar.” Constructions, 18(1). Abstract. While Construction Grammar has proven effective at modelling regional and register variation separately, it has seldom been used to explore the interaction between the two. The present paper fills this gap …

Diffusion Across the Grammar: Complexity in Areal Interactions Between Dialects of English

Dunn, J. (2025). “Diffusion Across the Grammar: Complexity in Areal Interactions Between Dialects of English.” In Enrique-Arias, Andrés, Carlota de Benito Moreno and Florencio del Barrio de la Rosa (eds.). The spatial diffusion of linguistic changes: new methods and theoretical perspectives. Berlin: De Gruyter. Studies in Language Change 26. Abstract. This paper experiments with the …

Language Contact and Population Contact as Sources of Dialect Similarity

Dunn, J. and Wong, S. (2025). “Language Contact and Population Contact as Sources of Dialect Similarity.” Languages, 10(8), 188. https://doi.org/10.3390/languages10080188 Abstract. This paper creates a global similarity network between city-level dialects of English in order to determine whether external factors like the amount of population contact or language contact influence dialect similarity. While previous computational …

Pre-Trained Language Models Represent Some Geographic Populations Better Than Others

Dunn, J.; Adams, B.; and Tayyar Madabushi, H. (2024). “Pre-Trained Language Models Represent Some Geographic Populations Better Than Others.” In Proceedings of the Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC/COLING 2024). 12966–12976. Abstract. This paper measures the skew in how well two families of LLMs represent diverse geographic populations. A spatial …

Syntactic Variation Across the Grammar: Modelling a complex adaptive system

Dunn, J. (2023). “Syntactic variation across the grammar: modelling a complex adaptive system.” Frontiers in Complex Systems. DOI: 10.3389/fcpxs.2023.1273741 Abstract. While language is a complex adaptive system, most work on syntactic variation observes a few individual constructions in isolation from the rest of the grammar. This means that the grammar, a network which connects thousands …

Variation and Instability in Dialect-Based Embedding Spaces

Dunn, J. (2023). “Variation and Instability in Dialect-Based Embedding Spaces.” In Proceedings of the Workshop on NLP for Similar Languages, Varieties and Dialects. Association for Computational Linguistics. Abstract. This paper measures variation in embedding spaces which have been trained on different regional varieties of English while controlling for instability in the embeddings. While previous work …

Register Variation Remains Stable Across 60 Languages

Li, H.; Dunn, J.; and Nini, A. (In Press). “Register Variation Remains Stable Across 60 Languages.” Corpus Linguistics and Linguistic Theory. Abstract. This paper measures the stability of cross-linguistic register variation. A register is a variety of a language that is associated with extra-linguistic context. The relationship between a register and its context is functional: …

Stability of Syntactic Dialect Classification Over Space and Time

Dunn, J. and Wong, S. (2022). “Stability of Syntactic Dialect Classification Over Space and Time.” In Proceedings of the International Conference on Computational Linguistics (COLING 2022). 26–36. Abstract. This paper analyses the degree to which dialect classifiers based on syntactic representations remain stable over space and time. While previous work has shown that the combination of …

Predicting Embedding Reliability in Low-Resource Settings

Dunn, J.; Li, H.; & Sastre, D. (2022). “Predicting Embedding Reliability in Low-Resource Settings Using Corpus Similarity Measures.” In Proceedings of the 13th International Conference on Language Resources and Evaluation. European Language Resources Association. 6461–6470. Abstract. This paper simulates a low-resource setting across 17 languages in order to evaluate embedding similarity, stability, and reliability under …