For decades, machine translation between natural languages fundamentally relied on human-translated documents known as parallel texts, which provide direct correspondences between source and target sentences. The notion that translation systems could be trained on non-parallel texts,
independently written in different languages, was long considered unrealistic. Fast forward to the era of large language models (LLMs), and we now know that, given sufficient computational resources, LLMs exploit incidental parallelism in their vast training data, i.e., they identify parallel
messages across languages and learn to translate without explicit supervision. LLMs have since demonstrated the ability to perform translation tasks with impressive quality, rivaling systems specifically trained for translation. This monograph explores the fascinating journey that led to this point,
focusing on the development of unsupervised machine translation. Long before the rise of LLMs, researchers were exploring the idea that translation could be achieved without parallel data. Their efforts centered on encouraging models to discover cross-lingual correspondences through techniques such as the mapping of word embedding spaces, back-translation, and parallel sentence mining. Although much of the research described in this monograph predates the mainstream adoption of LLMs, the insights gained remain highly relevant. They offer a foundation for understanding how and
why LLMs are able to translate.
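As a concrete flavor of the first of these ideas, the sketch below illustrates one common way to align two independently trained word embedding spaces via orthogonal Procrustes and translate by nearest-neighbour search. It is a minimal, self-contained illustration: the embeddings, dimensions, and seed dictionary are synthetic stand-ins, not data or methods drawn from any specific system described in this monograph.

```python
# Minimal sketch: mapping one word embedding space onto another with
# orthogonal Procrustes alignment. All arrays below are synthetic and
# purely illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical monolingual embeddings: 1000 words per language,
# each represented by a 50-dimensional vector.
X = rng.standard_normal((1000, 50))   # source-language embeddings
Y = rng.standard_normal((1000, 50))   # target-language embeddings

# Assume the first 200 rows of X and Y are known translation pairs
# (a seed dictionary); in fully unsupervised settings this seed is
# induced automatically rather than given.
X_seed, Y_seed = X[:200], Y[:200]

# Solve min_W ||X_seed W - Y_seed||_F with W orthogonal:
# the closed-form solution is W = U V^T from the SVD of X_seed^T Y_seed.
U, _, Vt = np.linalg.svd(X_seed.T @ Y_seed)
W = U @ Vt

# Map all source embeddings into the target space, then "translate" a
# source word by cosine-similarity nearest neighbour among target words.
mapped = X @ W
query = mapped[0]
scores = (Y @ query) / (np.linalg.norm(Y, axis=1) * np.linalg.norm(query))
print("nearest target word index:", int(np.argmax(scores)))
```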