Interdisciplinary Approaches to Data Collection, Annotation and Computational Processing of Code-Switched Languages around the World
A special issue of Languages (ISSN 2226-471X).
Deadline for manuscript submissions: closed (1 May 2022)
Special Issue Editors
Interests: computational sociolinguistics; multilingualism; social media; spoken data; computational social sciences; language contact, language variation and change; low-resource languages, linguistic data curation and analyses; code-switching; usage-based methods to language; multi-word expressions; natural language processing; constructions
Special Issue Information
Dear Colleagues,
Code-switching (C-S) in multilingual settings has been extensively studied from a linguistic and a computational linguistic point of view across language pairs/tuples. Although there are valuable theoretical and data-based studies on C-S in linguistics (e.g., Bullock & Toribio, 2009; Fernández Fuertes et al., 2019), they usually focus on collecting and analyzing (relatively) small-scale data sets, and their results are presented or published in academic venues targeting fellow linguists (e.g., journals and books, or workshops and conferences in linguistics, bilingualism, and multilingualism). Therefore, research on C-S from a linguistic point of view is often less visible to computational linguists who also conduct research on C-S.
Recent developments in computational areas of research make it possible to analyze large-scale and multilingual data through automated methods of analysis. Some of these methods can also be applied to the analysis of C-S (e.g., Rijhwani et al., 2017; Vilares et al., 2016). So far, computational research has mostly focused on developing algorithms for processing code-switched languages. However, systematic analyses of the types and frequencies of errors that occur as a result of computational processing of code-switched languages are rare. Similarly, standards for the annotation and evaluation of C-S for computational processing are lacking. Although there is a growing interest in analyzing code-switched languages in computational areas of research, there is also an imbalance between well-studied and low-resource language pairs (e.g., C-S across African, Southeast Asian, and Indigenous languages) around the world. Finally, most computational research is published in venues attended by fellow computational linguists and/or speech technologists (e.g., Bali et al., 2020; Solorio et al., 2021; Sitaram et al., 2019). Therefore, it is less visible to the linguistic audience.
As described above, there is a gap and lack of collaboration between linguistics and computational areas of research about C-S. In addition, the research output is not always visible across these domains. The first goal of this Special Issue is to bridge the gap and increase collaboration across disciplines. More specifically, we aim to familiarize the linguistic audience with the available large-scale data sets, methods, and techniques about C-S in computational areas of research (e.g., Natural Language Processing (NLP), Automatic Speech Processing). Secondly, we aim to make computational researchers aware of the rich linguistic research on C-S and multilingualism in general. We also hope to increase awareness among researchers about the linguistic and social factors that lead to different types of C-S across diverse language pairs and multilingual contexts (e.g., Doğruöz et al., 2021).
We invite submissions authored by computational linguists, speech technologists, and linguists describing novel or existing research on computational processing of C-S and/or proposing interdisciplinary solutions for the existing challenges (e.g., how linguistic research in C-S could be useful for the computational processing of C-S). We also welcome papers describing, analyzing, or providing alternatives for the creation, curation, and annotation of C-S datasets across languages around the world. Submissions may include, but need not be limited to:
- Computational techniques for processing code-switched languages (including less documented and/or low-resource languages) around the world
- Collection and annotation of code-switched spoken data for Automatic Speech Processing
- Collection and annotation of code-switched textual data for NLP
- Development and utilization of multilingual computational models for processing spoken and textual C-S data sets
- Self-supervised models for processing spoken and textual C-S data sets
- Evaluation benchmarks for code-switched NLP methods and speech systems
The submission guidelines to the journal are included here.
Tentative completion schedule:
- Abstract Submission Deadline: 1 February 2022
- Notification of Acceptance: 1 March 2022
- Full Manuscript Deadline: 1 May 2022
References:
- Bali, K., et al. (2020). Proceedings of the First Workshop on Speech Technologies for Code-Switching in Multilingual Communities (WSTCSMC 2020).
- Bullock, B. E., & Toribio, A. J. (Eds.). (2009). The Cambridge Handbook of Linguistic Code-switching. Cambridge University Press.
- Doğruöz, A. S., Sitaram, S., Bullock, B. E., & Toribio, A. J. (2021). A Survey of Code-switching: Linguistic and Social Perspectives for Language Technologies. Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021). Association for Computational Linguistics (ACL).
- Fernández Fuertes, R., Gómez Carrero, T., & Martinez, A. (2019). Where the Eye Takes You: The Processing of Gender in Codeswitching.
- Rijhwani, S., Sequiera, R., Choudhury, M., Bali, K., & Maddila, C. S. (2017). Estimating code-switching on Twitter with a novel generalized word-level language detection technique. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 1971-1982).
- Sitaram, S., Chandu, K. R., Rallabandi, S. K., & Black, A. W. (2019). A survey of code-switched speech and language processing. arXiv preprint arXiv:1904.00784.
- Solorio, T., Chen, S., Black, A. W., Diab, M., Sitaram, S., Soto, V., & Yilmaz, E. (2021). Proceedings of the Fifth Workshop on Computational Approaches to Linguistic Code-Switching (CALCS 2021).
- Vilares, D., Alonso, M. A., & Gómez-Rodríguez, C. (2016). En-es-cs: An English-Spanish code-switching Twitter corpus for multilingual sentiment analysis. Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16) (pp. 4149-4153).
Dr. A. Seza Doğruöz
Dr. Sunayana Sitaram
Guest Editors
Manuscript Submission Information
Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.
Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a double-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Languages is an international peer-reviewed open access monthly journal published by MDPI.
Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.
Keywords
- code-switching
- multilingualism
- data annotation
- computational approaches
- Natural Language Processing
- Automatic Speech Processing
- linguistic approaches
- low resource languages
Benefits of Publishing in a Special Issue
- Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
- Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
- Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
- External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
- e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.
Further information on MDPI's Special Issue policies can be found here.