Issue #30 - Reducing loss of meaning in Neural MT

Raj Patel 28 Mar 2019

Introduction

An important, and perhaps obvious, feature of high-quality machine translation systems is that they preserve the meaning of the source in the translation. That is, if we have two source sentences with slightly different meanings, their translations should also differ accordingly. However, this nuance can be a challenge even for state-of-the-art systems, particularly where the source and target languages partition the “meaning space” in different ways. In this post, we will try to understand which language characteristics cause this meaning loss, and discuss a few methods for reducing it in Neural MT.

Many-to-One Translation

Translation systems are typically many-to-one functions from the source language to the target language, which in many cases results in important distinctions being lost in translation. For example, take the following sentences:
  1. By mistake, I cut off my finger with a knife.
  2. By mistake, I cut my finger with a knife.
Using Google Translate from English to Hindi, both are translated in the same way (as below), which is ambiguous in Hindi:
  • गलती से मैंने चाकू से अपनी उंगली काट ली। (galatee se mainne chaakoo se apanee ungalee kaat lee.)
The correct translations should actually differ, with the distinction carried by the verb phrase (काटकर अलग कर दी, “cut off”, vs. काट ली, “cut”):
  1. गलती से मैंने चाकू से अपनी उंगली काटकर अलग कर दी। (galatee se mainne chaakoo se apanee ungalee kaatakar alag kar dee.)
  2. गलती से मैंने चाकू से अपनी उंगली काट ली। (galatee se mainne chaakoo se apanee ungalee kaat lee.)

Languages differ in the ways they explicitly mark meaning distinctions, which makes translation even harder, as systems have to learn these markings from data. In general, it is difficult to model all of the small variations that are, in many languages, crucial for conveying the correct distinction in the translated text. To evaluate the nature of the issue, Cohn-Gordon and Goodman (2019) explored a corpus of 500 sentence pairs in which distinct English sentences map to a single German translation, and identified a number of common causes of many-to-one translations. Figure 1 shows the distinctions most commonly lost in the German translations of these 500 segments.

Figure 1:  The most common distinctions lost in the German translation.
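
As an aside, mining such many-to-one cases from a parallel corpus is straightforward: group the source sentences by their translation and keep the groups with more than one distinct source. A minimal sketch in Python (where pairs is a hypothetical list of (source, translation) tuples):

    from collections import defaultdict

    def find_many_to_one(pairs):
        # Map each translation to the set of distinct sources that produce it.
        by_translation = defaultdict(set)
        for src, tgt in pairs:
            by_translation[tgt].add(src)
        # Keep only translations reached from more than one source sentence,
        # i.e. the cases where a source distinction may have been lost.
        return {tgt: srcs for tgt, srcs in by_translation.items() if len(srcs) > 1}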

Meaning Preservation

While languages differ in which distinctions they are required to express, virtually all of them can express any given distinction when needed. Cohn-Gordon and Goodman (2019) proposed a method to reduce meaning loss by applying the Rational Speech Acts (RSA) model to translation. RSA has been used successfully to model language pragmatics (Goodman and Frank, 2016), and recent work has shown its applicability to image captioning, another sequence-generation NLP task.
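
At its core, RSA replaces the base translation model (the “literal speaker” S0) with a “pragmatic speaker” S1 that prefers translations from which a listener could recover the intended source. In a simplified formulation (where s ranges over the true source plus a small set of distractor sources, P(s) is a prior over them, and α is a rationality parameter):

    L0(s | t) ∝ S0(t | s) · P(s)          (literal listener: which source produced t?)
    S1(t | s) ∝ S0(t | s) · L0(s | t)^α   (pragmatic speaker: favour recoverable translations)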

In simple terms, the RSA model is used to reduce many-to-one mappings, which in turn reduces meaning loss in the translation system. Cohn-Gordon and Goodman (2019) introduced a new inference algorithm to incorporate the RSA model into the translation process, and showed a gain (+4%) in meaning preservation. Moreover, they obtained improvements in translation quality (+1 BLEU), demonstrating that the goal of meaning preservation directly yields better translations.
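
To make the idea concrete, here is a minimal sentence-level sketch of pragmatic reranking in Python. It assumes a hypothetical scoring interface log_prob(src, tgt) that returns the base model's log S0(tgt | src); note that the paper's actual inference algorithm applies the pragmatic step incrementally during decoding, rather than reranking complete candidate translations as done here:

    import math

    def rsa_rerank(source, distractors, candidates, log_prob, alpha=1.0):
        # Rerank candidate translations with a one-step RSA pragmatic speaker.
        # 'distractors' are alternative sources the true source could be confused
        # with, e.g. sentence 1 above when translating sentence 2.
        sources = [source] + distractors
        scored = []
        for t in candidates:
            # Literal listener: L0(s | t) ∝ S0(t | s), with a uniform prior over sources.
            base = [log_prob(s, t) for s in sources]
            log_norm = math.log(sum(math.exp(b) for b in base))
            log_l0 = base[0] - log_norm  # log L0(true source | t)
            # Pragmatic speaker: S1(t | s) ∝ S0(t | s) · L0(s | t)^alpha
            scored.append((log_prob(source, t) + alpha * log_l0, t))
        return [t for _, t in sorted(scored, key=lambda pair: pair[0], reverse=True)]

A candidate translation that is also likely for a distractor source gets penalised, because the listener term spreads its probability mass across several sources; this is the pressure that pushes the two English sentences above towards distinct Hindi outputs.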

In summary

Integrating RSA into the translation process not only reduces meaning loss, but also improves translation quality. We've seen an approach that was successful on two fairly similar languages. With more divergent language pairs, we are likely to see more instances of this type of issue, which suggests the approach could have an even bigger impact, although that remains to be seen!
Raj Patel
Machine Translation Scientist