Representations of Language Varieties Are Reliable

Dunn, J. (2021). “Representations of Language Varieties Are Reliable Given Corpus Similarity Measures.” In Proceedings of the Eighth Workshop on NLP for Similar Languages, Varieties, and Dialects. Association for Computational Linguistics. 28-38. Abstract. This paper measures similarity both within and between 84 language varieties across nine languages. These corpora are drawn from digital sources (the …

Modeling Global Syntactic Variation in English

Dunn, J. (2019). “Modeling Global Syntactic Variation in English Using Dialect Classification.” In Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects (NAACL 19). Association for Computational Linguistics. 42-53. Abstract. This paper evaluates global-scale dialect identification for 14 national varieties of English as a means for studying syntactic variation. The paper …

Profile-Based Authorship Analysis

Dunn, J.; Argamon, S.; Rasooli, A.; & Kumar, G. (2016). “Profile-Based Authorship Analysis.” Digital Scholarship in the Humanities, 31(4): 689-710. Abstract. This article presents a profile-based authorship analysis method which first categorizes texts according to social and conceptual characteristics of their author (e.g. Sex and Political Ideology) and then combines these profiles for two authorship …

Finding Variants for Construction-Based Dialectometry

Dunn, J. (2018). “Finding Variants for Construction-Based Dialectometry: A Corpus-Based Approach to Regional CxGs.” Cognitive Linguistics, 29(2): 275-311. Abstract. This paper develops a construction-based dialectometry capable of identifying previously unknown constructions and measuring the degree to which a given construction is subject to regional variation. The central idea is to learn a grammar of constructions …