The desire of humans to understand each other across language barriers has always driven the development of tools that could assist them in doing so. With the advent of computers in the 20th century, using software to automatically translate text from one language into another became one of the earliest goals in computer science, founding the subfield of machine translation. A variety of approaches have appeared with increasing levels of automation and abstraction; for example, while rule-based machine translation requires the curation of dictionaries and explicit grammar rules, statistical machine translation generates its output from statistical models that are inferred automatically from large bilingual text corpora.
In the 2010s, artificial neural networks were increasingly applied to a wide variety of computational tasks. Loosely inspired by the information processing of biological brains, which consist of neurons connected via adaptive synapses, they can in principle approximate an arbitrary mathematical function of their inputs by adapting a set of internal parameters (the parameters are called weights, and the adaptation process is called training). Although neural networks had already been formulated theoretically in the middle of the preceding century, the increasing availability of hardware allowing for massively parallel processing (e.g., GPUs) rapidly boosted their use in many real-world applications, such as object recognition in images. Furthermore, networks could be made more powerful by making them larger and increasing their depth (the number of layers in which the computation is organized), giving rise to the buzzword "deep learning".
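To make the notions of weights and training concrete, here is a minimal sketch in Python. It is not a neural network in any practical sense: it assumes a single "neuron" with one weight `w` and one bias `b`, a hand-picked toy data set following y = 2x + 1, and plain gradient descent on the mean squared error. The function name `train` and all parameter values are chosen for illustration only.

```python
def train(samples, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b to the samples by gradient descent."""
    w, b = 0.0, 0.0  # the adaptable parameters ("weights")
    n = len(samples)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in samples:
            error = (w * x + b) - y        # prediction minus target
            grad_w += 2 * error * x / n    # d(mean squared error)/dw
            grad_b += 2 * error / n        # d(mean squared error)/db
        # "Training": nudge the parameters against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (an assumption for this example).
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train(samples)
print(round(w, 3), round(b, 3))
```

Real networks follow the same principle, but with millions or billions of weights arranged in layers and with gradients computed automatically.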
A special class of neural networks, so-called recurrent neural networks, allows for the processing of sequential information, most prominently text. Neural machine translation (NMT) refers to the use of such recurrent sequence-to-sequence networks to translate text in a source language into text in a target language. NMT networks are trained on bilingual text corpora in an end-to-end fashion, without the need to define rules or statistical principles for the computation of a translation; rather, these are extracted automatically from those corpora during training. NMT networks typically consist of an encoder part, which produces an abstract representation of the source text, and a decoder part, which uses this representation to generate the output text one token at a time. This process is reminiscent of a human interpreter, who first tries to understand the meaning of a sentence before formulating the corresponding sentence in another language.
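The control flow of such an encoder-decoder system can be sketched in a few lines of Python. Everything here is a stand-in chosen for illustration: `encode` and `decode_step` are hypothetical functions that replace the trained networks, and `TOY_LEXICON` is a made-up word table, so the sketch only shows the loop that emits one token at a time until an end-of-sentence marker appears. It is not how any particular NMT system is implemented.

```python
EOS = "<eos>"  # end-of-sentence marker
UNK = "<unk>"  # placeholder for unknown words

# Hypothetical two-word lexicon standing in for what a network would learn.
TOY_LEXICON = {"guten": "good", "morgen": "morning"}


def encode(source_tokens):
    """Stand-in encoder: turn the source tokens into a 'representation'.

    A trained encoder would return vectors; here it is just a tuple.
    """
    return tuple(source_tokens)


def decode_step(representation, generated):
    """Stand-in decoder step: choose the next target token.

    It sees the source representation and the tokens generated so far,
    like a real decoder, but simply looks words up position by position.
    """
    position = len(generated)
    if position >= len(representation):
        return EOS
    return TOY_LEXICON.get(representation[position], UNK)


def translate(source_tokens, max_length=50):
    """Encode once, then generate the output one token at a time."""
    representation = encode(source_tokens)
    generated = []
    while len(generated) < max_length:
        token = decode_step(representation, generated)
        if token == EOS:
            break
        generated.append(token)
    return generated


print(translate(["guten", "morgen"]))
```

In a trained system, the decoder step would score every token in the vocabulary and the loop would pick the most likely one (or keep several candidates), which is what lets the output differ in length and word order from the input.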
The success of NMT, where systems that outperform non-neural approaches can be obtained with comparatively low effort, has put it at the center of research and commercial applications. Various flavors of NMT networks have been developed (e.g., Transformer networks), and the technology of most machine translation providers is nowadays based on NMT. The translation services of iTranslate are also powered by neural machine translation.
We currently support server-side translation between 56 languages; the most common languages are also available offline and run entirely on a mobile device without the need for an internet connection. Our goal is to remove language as a barrier to human communication, and NMT is one of the powerful tools that help us in this endeavor.