Dr. Jonathan Dunn

University of Canterbury, Linguistics

Corpora

CGLU v4: The Corpus of Global Language Use (401 billion words)

http://www.earthlings.io/corpus_download.html

GeoWAC v1: Geographically-balanced Gigaword Corpora (45 billion words)

http://www.earthlings.io/corpus_download.html

CGLU v3: The Corpus of Global Language Use (16 billion words)

https://labbcat.canterbury.ac.nz/download/?jonathandunn/CGLU_v3

Contact

Locke 206
University of Canterbury
Christchurch, NZL
jonathan.dunn@canterbury.ac.nz
www.jdunn.name
github.com/jonathandunn