Translation in the age of generative AI
ChatGPT was released in November of last year, and one year has now passed. It seems a fitting time to ask: will ChatGPT and similar generative AI tools make human translators obsolete?
Yes and no. These models already translate astonishingly well. We used to be surprised when earlier MT systems produced an entire sentence requiring no post-editing; now we’re surprised when an LLM produces a sentence that does require tweaking. In many cases, the translations these tools produce are arguably superior to what most human translators would come up with.
We were recently consulting for a Japanese company on its English website. The Japanese contained the phrase 創立以来/sōritsu irai. A clueless local language company they had hired to do the website translated this as “after our establishment”. On its first try, ChatGPT produced the perfectly serviceable “since our inception”. When prodded for an alternative, it came up with the lovely “throughout our journey”. The machine also translated 追求/tsuikyū (“pursue”) variously as “dedicated to exploring”, “dedicated to uncovering”, “unwavering in our pursuit”, “unwavering in our commitment”, and “an unwavering quest”. Try finding a human translator at six cents a word who could do that.
This highlights another strength of the LLM approach: you can give it instructions for the style you want, ask it for second opinions, and essentially have a dialog with the machine until you converge on the perfect output. When I asked it to provide more of a marketing spin, it came up with “from our very beginnings”; when I asked it to produce a more readable version, it suggested “from day one”.
This points to a new role for the “translator” in the age of generative AI: guiding the machine on style and other parameters, supplying any additional information relevant to the translation, and then iterating, tweaking, and fine-tuning. We translators become the “machine whisperers”.
There may well come a point when even this is no longer necessary, because the models are advancing by leaps and bounds, and at an ever-increasing pace. They can be customized for particular domains and sources of information, and can already carry out some of that customization themselves. Upcoming versions such as GPT-5 will presumably take advantage of further increases in the massive computing power that has been a major factor in the development of these new capabilities.
But what about the really hairy stuff?
But surely the models can’t be good at everything, right? What about equity research reports, on which I spent a meaningful portion of my career as a translator (and for which I was compensated handsomely, thank you very much)? Every single thing about these translations must be spot on. For example:
会社側では06年3月期も高成長を見込み、売上高144億円(前期比36.2%増)、経常利益17.7億円(同48.5%増)を計画している。
I translated this nearly two decades ago as
For FY05, the company looks for continuing strong growth, with forecasts including sales of ¥14.4bn (up 36.2% YoY) and recurring profit of ¥1.77bn (up 48.5%).
ChatGPT, with some good prompting, came up with
For FY2005, the company is forecasting continued high growth, with projected sales of 14.4 billion yen (up 36.2% YoY) and ordinary profits of 1.77 billion yen (up 48.5% YoY).
It’s hard to argue with this.
The instructions I gave the machine to come up with this, including the definition of fiscal years, and a preference for using YoY, could easily be incorporated into a standard prompt used for all translations, or even into a custom bot augmented with additional concepts.
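To make the idea of a “standard prompt” concrete, here is a minimal sketch of how those conventions might be packaged for reuse. The rule wording and the helper name `build_prompt` are purely illustrative, not any particular product’s API:

```python
# Hypothetical reusable "standard prompt" for financial translations,
# encoding the conventions discussed above: fiscal-year naming and the
# YoY abbreviation. All names and wording here are illustrative.

STYLE_RULES = [
    "A fiscal year ending March 2006 is referred to as FY2005.",
    "Abbreviate year-on-year comparisons as 'YoY'.",
    "Render 経常利益 as 'ordinary profits'.",
]

def build_prompt(source_text: str) -> str:
    """Combine the fixed house-style rules with the text to translate."""
    rules = "\n".join(f"- {rule}" for rule in STYLE_RULES)
    return (
        "Translate the following Japanese into English, "
        "observing these rules:\n"
        f"{rules}\n\nText:\n{source_text}"
    )
```

The same rule list could just as easily be loaded into a custom bot’s system instructions, so every translation request inherits the house style automatically.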
But don’t these LLMs hallucinate? They do, although future iterations of the technology will presumably do so less and less. Hallucinations are also less of an issue in translation, where the machine is essentially performing a one-to-one mapping of the input. That said, any LLM output still needs human review, although it is intriguing to consider asking the LLM itself to take another pass, comparing the original and translated versions to identify any anomalies. Such workflows involving multiple LLM passes are already well established in the industry.
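A two-pass workflow of this kind can be sketched as follows. Here `call_llm` is a stand-in for whatever model API one actually uses; it is stubbed out with canned responses so the control flow is runnable, and every name is an assumption rather than a real library call:

```python
# Sketch of a two-pass workflow: one LLM call translates, a second
# call reviews the source/translation pair for anomalies.

def call_llm(prompt: str) -> str:
    """Stub standing in for a real LLM API call."""
    # A real implementation would send the prompt to a model here.
    if prompt.startswith("Translate"):
        return "For FY2005, the company is forecasting continued high growth."
    return "No anomalies found."

def translate_with_review(source: str) -> tuple[str, str]:
    """First pass translates; second pass checks the pair."""
    translation = call_llm(f"Translate into English:\n{source}")
    report = call_llm(
        "Compare the original and the translation; flag any omissions, "
        f"additions, or mistranslations.\nOriginal:\n{source}\n"
        f"Translation:\n{translation}"
    )
    return translation, report
```

The design point is simply that the reviewing pass sees both texts side by side, which is exactly the comparison a human checker performs.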
Then there is “literature”
But of course we have to draw a line when it comes to Literature, right? For example, no machine could ever do a good translation of a Nobel Prize-winning novel, right? Well, let’s take this blog’s favorite novel of all time, Kawabata’s Snow Country. We spent an unreasonable amount of time analyzing the very first sentence about coming out of the long tunnel into snow country. ChatGPT did not hesitate:
Emerging from the long border tunnel, it was a land of snow.
I sort of like the “land of snow” bit. I tried additional excerpts from Yukiguni with similarly good results. Move over, Ed.
Here’s the part about Komako’s nose:
"Her slender, high nose had a slightly forlorn aspect, but below it, her small, puckered lips…"
The machine did a good job with “forlorn”. And here it was the human (Seidensticker) doing the hallucinating, imagining that つぼんだ/tsubonda had something to do with “buds”, while the machine got it right with “puckered”. The machine also did not take it upon itself to randomly omit parts of the original like “below it”, as Seidensticker inexplicably did.
What about Saigyō’s poem about dying under the cherry tree?
願はくは花の下にて春死なむ
How about this?
Beneath the flowers, my desire: to pass in spring's embrace.
Then there is Zen stuff
But could the machine translate Dōgen, our friendly 13th-century Zen scholar-monk? Does it understand the mind-body duality? Stay tuned.