Abstract Accepted for Presentation at "Beyond Language 2022"
Robert's abstract for the Beyond Language 2022 conference has been accepted for presentation. He will discuss his work wrangling text from the Kashubian Wikipedia into a workable source for studying linguistic variation and present a small case study based on these data. The abstract:
Sourcing data from Wikipedia for the study of language contact: the csbwiki
In multilingual contexts, the languages involved are often said to exert influence on each other’s grammatical patterns. The results of language contact – contact-induced language changes – are rarely balanced; these affect the respective languages involved at different rates and in different areas of grammar. Differential rates of cross-linguistic influence are particularly evident in contexts involving historically minoritized languages, where the social context is not conducive to equitable intergroup relations. There is a need to utilize a variety of empirical data types to more thoroughly understand the social and cognitive processes at work in contexts of language contact that lead to language change. Paradoxically, empirical data on minoritized languages are relatively scarce and expensive to generate. But in the digital age we have the ability to look beyond the traditional data types used in language studies, like spoken data gathered under fieldwork conditions, literature, etc.
With the prevalence of internet access and various social media platforms, online user-created content has become an enormous source of data that are readily utilizable for scientific studies. "Wikis" constitute convenient collections of such data, which are “collaboratively edited and managed by [their] own audience directly.”1 Wikipedia is a well-known wiki-based site that hosts wikis in 316 languages,2 many of which are understudied and/or historically minoritized. The csbwiki, the Kashubian-language Wikipedia, is one such collection, with 5,426 articles, more than 185,000 edits, and more than 15,000 users.3 In this paper, I will first present the theoretical underpinnings and methodology of the project New Speakers of Minority Languages: Proficiency, Variation and Change,4 which aims to empirically assess the relationship between second language acquisition and contact-induced language change. I will then explore the potential utility of user-created wiki data in our specific study of language contact and in studying language contact more generally.