"Poro’s advanced capabilities with European languages like Finnish descend from how it addresses the core challenge for low-resource languages: training LLMs requires enormous amounts of data, but for low-resource languages like Finnish, sufficient data is simply not available. In general, Poro addresses this by cross-training low-resource languages with high-resource languages. This takes advantage of a cross-lingual signal that allows the model to achieve higher performance for the low-resource language than training a monolingual model, and has the further advantage of teaching the model basic translation capability."
I wonder how this cross-training affects the "overall quality" of the LLM for the low-resource languages. Are there any scientific papers that pin this down?