Lost in translation: A new approach to AI navigates world literature

AMHERST, Mass. – English readers of digital foreign-language novels have long despaired over the poor quality of translation, especially when the original was written in a non-Romance language and with a high-literary sensibility. But this may soon change, thanks to an $822,365 grant from Open Philanthropy awarded to Mohit Iyyer, professor of computer and information science at the University of Massachusetts Amherst.

Traditionally, novels have been translated by experts who are not only fluent in the denotative meaning of words in two or more languages, but also sensitive to the fine nuances and connotations that set literature apart from more technical writing. It might take such a translator years to arrive at a faithful rendition that preserves the play of language and image of the original—if such a translator can even be found. Since linguists estimate that there are more than 7,000 languages spoken on earth today, much of what gets written in one language will only get translated poorly into another, if it gets translated at all.

While the rise of AI-based translation software has helped to ease the bottleneck, it is far from perfect. “French to English translates comparatively well,” says Iyyer, “but Japanese to English is notoriously bad, and anything with a literary sensibility is hopeless.” To illustrate the point, Iyyer points to two translations of Japanese novelist Haruki Murakami’s Norwegian Wood. The first, by a professional human translator, reads:

A chill November rain darkens the land, turning the scene into a gloomy Flemish painting. The airport workers in their rain gear, the flags atop the faceless airport buildings, the BMW billboards, everything. Just great, I’m thinking, Germany again.

Compare that to the same Japanese source text run through Google Translate:

The frosty rain of November darkened the earth, and the mechanics wearing rain feathers, the flag standing on the flat airport building, the BMW billboard and everything like that were a gloomy picture of the Flemish school. It looked like the background of. I wondered if it was Germany again.

“The status-quo AI translators are often far too literal,” says Iyyer, “because they are trained on news articles and parliamentary proceedings.”

Iyyer’s solution is to bring humans back into the equation. Over the next two years, he and his team will build an online platform hosting a wide range of previously untranslated novels, rendered into English by an AI model the team will develop. The translations will be interactive: readers will be able to highlight passages they think are incorrect and propose alternatives that read more smoothly. A second AI model, a post-editing model, will collect these reader corrections and use them to update the translation model. It’s a way for the AI translation model to “learn.”
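For readers curious how such a feedback loop might be organized, the following is a minimal sketch in Python. It is not the team's design, which has not been published; every name here (`Correction`, `CorrectionLog`, `training_pairs`, `min_votes`) is hypothetical, and the sketch assumes that a suggestion is only passed along once at least two readers agree on it. It covers only the collection step, not the retraining of any model.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    """One reader suggestion: a highlighted span and a proposed replacement."""
    passage_id: str
    highlighted: str
    proposed: str


class CorrectionLog:
    """Hypothetical store that tallies reader agreement per highlighted span."""

    def __init__(self):
        # (passage_id, highlighted text) -> count of each proposed alternative
        self._votes = defaultdict(Counter)

    def submit(self, correction):
        key = (correction.passage_id, correction.highlighted)
        self._votes[key][correction.proposed] += 1

    def training_pairs(self, min_votes=2):
        """Return (machine_text, preferred_text) pairs with enough agreement."""
        pairs = []
        for (_, highlighted), counts in self._votes.items():
            proposed, votes = counts.most_common(1)[0]
            if votes >= min_votes:  # assumed threshold, not from the source
                pairs.append((highlighted, proposed))
        return pairs


# Example using phrases from the two Norwegian Wood renderings quoted above.
log = CorrectionLog()
log.submit(Correction("nw-1", "rain feathers", "rain gear"))
log.submit(Correction("nw-1", "rain feathers", "rain gear"))
log.submit(Correction("nw-1", "rain feathers", "raincoats"))
log.submit(Correction("nw-1", "flat airport building", "faceless airport buildings"))

pairs = log.training_pairs()
print(pairs)
```

In this sketch, only the "rain feathers" fix clears the two-reader threshold, so a single outlier suggestion would not on its own change what the translation model sees.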

Iyyer is quick to point out that this process can’t replace the expertise of a dedicated human translator. “But,” he says, “it’s my hope that we can give those expert translators a head start, and in the meantime we can help spread readable versions of the world’s greatest literature.”

Contacts: Mohit Iyyer, [email protected]

                 Daegan Miller, [email protected]

