Kardeş-NLU: Transfer to Low-Resource Languages with Big Brother's Help -- A Benchmark and Evaluation for Turkic Languages

Lütfi Kerem Senel, Benedikt Ebing, Konul Baghirova, Hinrich Schuetze, Goran Glavaš

Main: Multilinguality and Language Diversity 1 Oral Paper

Session 7: Multilinguality and Language Diversity 1 (Oral)
Conference Room: Marie Louise 1
Conference Time: March 19, 14:00-15:30 (CET) (Europe/Malta)
Abstract: Cross-lingual transfer (XLT) driven by massively multilingual language models (mmLMs) has been shown to be largely ineffective for low-resource (LR) target languages with little (or no) representation in the mmLM's pretraining, especially if they are linguistically distant from the high-resource (HR) source language. Much of the recent focus in XLT research has been dedicated to LR language families, i.e., families without any HR languages (e.g., families of African languages or indigenous languages of the Americas). In this work, in contrast, we investigate a configuration that is arguably of practical relevance for more of the world's languages: XLT to LR languages that do have a close HR relative. To explore the extent to which an HR language can facilitate transfer to its LR relatives, we (1) introduce Kardeş-NLU, an evaluation benchmark with language understanding datasets in five LR Turkic languages: Azerbaijani, Kazakh, Kyrgyz, Uzbek, and Uyghur; and (2) investigate (a) intermediate training and (b) fine-tuning strategies that leverage Turkish in XLT to these target languages. Our experimental results show that both strategies, integrating Turkish in intermediate training and in downstream fine-tuning, yield substantial improvements in XLT to LR Turkic languages. Finally, we benchmark cutting-edge instruction-tuned large language models on Kardeş-NLU, showing that their performance is highly task- and language-dependent.
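To make the fine-tuning side of the setup concrete, below is a minimal, illustrative sketch of one plausible instantiation: fine-tuning a multilingual encoder on a mixture of English and Turkish NLI data before zero-shot evaluation on the LR Turkic targets. The model name (xlm-roberta-base), the XNLI data, the data sizes, and all hyperparameters are assumptions for illustration only, not the paper's exact configuration or datasets.

```python
# Sketch: joint English + Turkish fine-tuning for XLT to LR Turkic languages.
# All names and hyperparameters below are illustrative assumptions.
from datasets import load_dataset, concatenate_datasets
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "xlm-roberta-base"  # assumed mmLM backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# XNLI offers English and Turkish NLI training data; small slices keep the sketch cheap.
en = load_dataset("xnli", "en", split="train[:2000]")
tr = load_dataset("xnli", "tr", split="train[:2000]")
train = concatenate_datasets([en, tr]).shuffle(seed=42)

def encode(batch):
    # Encode premise-hypothesis pairs for sequence-pair classification.
    return tokenizer(batch["premise"], batch["hypothesis"],
                     truncation=True, max_length=128)

train = train.map(encode, batched=True)

args = TrainingArguments(output_dir="xlt_en_tr",
                         per_device_train_batch_size=32,
                         num_train_epochs=3,
                         learning_rate=2e-5)
Trainer(model=model, args=args, train_dataset=train,
        tokenizer=tokenizer).train()

# The resulting model would then be evaluated zero-shot on the LR Turkic
# targets (Azerbaijani, Kazakh, Kyrgyz, Uzbek, Uyghur), e.g., on Kardeş-NLU.
```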