Translation in the Age of the Machine

May 26, 2024

It’s a new world. Machines can now translate almost anything ridiculously well, in most cases better than humans could. (They also make fewer mistakes.)

And LLMs provide many other advantages. You can give them details about the context, or the desired style or vocabulary, which they actually follow, unlike many human translators. They can iteratively tweak their translations.

What lies in store, then, for us human translators and our craft?

If you’re like me, you mainly translate because you love it. You love caressing ideas into compelling representations of themselves in other languages. You gain pleasure from the constant learning process about language and a broad range of topics. You are rewarded by the unique patterns of mental activity that translation evokes. You find satisfaction in taking part in cross-language communication and in helping the people it serves.

Hopefully these psychological benefits are enough, because the benefit of getting paid for your work is rapidly disappearing. One study says that 76.5% of translation tasks have “high exposure” to impact from LLMs. That’s almost certainly an underestimate.

As Go became for its players after DeepMind’s AlphaGo, translation will shift towards being something we do as a personal challenge, a pastime, and a way to express ourselves.

The role of the translator will need to evolve if it is to continue to be economically relevant. One role will be to review the output of the machines—do we still call it post-editing? Most likely, this process will not involve going in and actually editing the document like we do now, but rather telling the AI where we see room for improvement in its work, so it can redo it.

Then there’s the aspect of telling the machines the parameters of the desired translation. For years, the work of us human translators has involved resources like style guides, glossaries, and past document samples. But in many cases these were not kept up to date and not actually followed by many translators. Generative AI now gives us the chance to formalize these types of inputs and directions as “translation kits” and provide them to someone (something?) that will actually listen to them. Organizing and maintaining these resources, or prompts, will remain to some extent a human task (although we should not underestimate the ability of the machines to do a good job here as well).
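To make the “translation kit” idea a little more concrete, here is a minimal sketch in Python of how a style guide, a glossary, and past samples might be bundled and turned into a prompt. Everything in it (the `TranslationKit` class, its field names, the `render_prompt` function, the layout of the prompt) is my own assumption for illustration, not an established format, and it stops short of actually calling any model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TranslationKit:
    """Hypothetical bundle of the resources a translator would maintain."""
    source_lang: str
    target_lang: str
    style_notes: list[str] = field(default_factory=list)
    glossary: dict[str, str] = field(default_factory=dict)
    # Each sample is a (source text, approved translation) pair.
    samples: list[tuple[str, str]] = field(default_factory=list)


def render_prompt(kit: TranslationKit, text: str) -> str:
    """Assemble a plain-text prompt from the kit plus the text to translate."""
    lines = [f"Translate the following from {kit.source_lang} into {kit.target_lang}."]
    if kit.style_notes:
        lines.append("Style guide:")
        lines += [f"- {note}" for note in kit.style_notes]
    if kit.glossary:
        lines.append("Glossary (use these renderings):")
        lines += [f"- {src} => {tgt}" for src, tgt in kit.glossary.items()]
    if kit.samples:
        lines.append("Past samples (source => approved translation):")
        lines += [f"- {src} => {tgt}" for src, tgt in kit.samples]
    lines.append("Text:")
    lines.append(text)
    return "\n".join(lines)


if __name__ == "__main__":
    kit = TranslationKit(
        source_lang="Japanese",
        target_lang="English",
        style_notes=["Stay close to the original structure and phrasing."],
        glossary={"信号所": "signal station"},
    )
    print(render_prompt(kit, "信号所に汽車が止まった。"))
```

The point of keeping the kit as data rather than as one long hand-written prompt is that the glossary and style notes can be reviewed and updated on their own, which is exactly the maintenance job that used to get neglected.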

It’s not surprising that Big Language remains in a state of serious denial about these new capabilities of machines. They will continue to insist that only they with their decades of experience can organize the new kinds of language resources required and design the new kinds of workflows needed. Another Big Language talking point is certain to be how only they with their super-duper language technology expertise could possibly operate these new-fangled LLMs in their role as translators. Wrong!

This is a technological revolution the likes of which no one in the language industry has seen in their entire careers. It dwarfs the impact of the last major innovations, translation memory and previous generations of MT. It would be extraordinarily surprising if the entire shape of the industry were not redrawn over the next five to ten years, with huge swaths of the tasks it currently performs being taken over by machines, in many cases operated not by language companies but by their former clients, squeezing the incumbents out completely. The only thing on the side of the lumbering language-industrial complex is pure inertia: clients too lazy or clueless to figure out how to redesign their business processes.

Of course, literary translators won’t be impacted by any of this, right? Actually, it’s very unlikely that they won’t. Let’s take some Japanese novel. Oh, how about “Snow Country”, maybe the first couple of sentences? We have the opportunity to delineate exactly how we want the machine to translate this:

"Snow Country" is a Japanese novel written in the 1930s, set in Niigata. We're translating this into English in 1956. The translator is an American who knows quite a bit of Japanese but could not really be considered a scholar, or a literary expert. He believes that we don't need to follow the original Japanese structure or phrasing that closely, since it could seem unfamiliar or even off-putting to the Western reader, even if that means adding some things or removing some things. He does, however, since the book is likely to be nominated for the Nobel Prize in literature, the committee for which of course reads the book only in English, want the book to sound "novelistic", perhaps a bit "Japonesque". Here's the first three sentences. They describe the arrival of the protagonist on a train from Tokyo. Please translate them into English. By the way, we're not at all interested in you copying existing translations which I'm sure you have in your database. Please translate "from scratch".
国境の長いトンネルを抜けると雪国であった。夜の底が白くなった。信号所に汽車が止まった。

ChatGPT gives us:

After emerging from the long tunnel at the border, the train entered the snow country. The depths of the night were turning white. The train came to a stop at the signal station.

ChatGPT is following Seidensticker in moving the first appearance of the “train” into the first sentence, something we objected to in our analysis of his translation, but that’s because we told it that was OK.

For comparison, Seidensticker’s opening sentence: “The train came out of the long tunnel into the snow country.”

Hmm, I’d be happy to present the machine’s translation to the Nobel Committee. Kawabata still probably would have won.

But the beauty of LLMs is how easy it is to try different approaches by giving them different instructions, such as these:

"Snow Country" is a Japanese novel written in the 1930s, set in Niigata. We're translating this into English in 2024. The translator believes that we should follow the Japanese structure and phrasing as closely as possible, potentially even at some minor sacrifice of readability, since those aspects are part of what defines the author and his work. The main target audience for this translation will be millennials, including ones familiar with Japanese pop culture and anime. Here are the first three sentences. They describe the arrival of the protagonist on a train from Tokyo. Please translate them into English. By the way, we're not at all interested in you copying existing translations which I'm sure you have in your database. Please translate "from scratch". By the way, 国境のトンネル is referring to a tunnel which crosses a boundary somewhere along its length.

国境の長いトンネルを抜けると雪国であった。夜の底が白くなった。信号所に汽車が止まった。

ChatGPT comes up with this:

Passing through the long tunnel that crossed the boundary, it was the snow country.
The bottom of the night had turned white.
The train stopped at the signal station.

I’m going to assume that choosing “it was the snow country” rather than the “into the snow country” in the first go-around was the result of asking it to hew more closely to the original—which personally, of course, I agree with.

We could continue this process ad infinitum, but for our purposes here the point is that there is an entire skillset relating specifically to how to prompt GPTs for translation, a task which translators could be pretty good at and which might even provide a living for some.
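Part of that skillset is treating the instructions as a handful of knobs you can turn one at a time. As a rough sketch, assuming a hypothetical `Brief` record and `build_prompt` helper of my own invention (neither comes from any real tool), the two experiments above differ only in the year, the approach, and the audience:

```python
from dataclasses import dataclass, replace

SOURCE = "国境の長いトンネルを抜けると雪国であった。夜の底が白くなった。信号所に汽車が止まった。"


@dataclass(frozen=True)
class Brief:
    """Hypothetical set of knobs for one translation attempt."""
    year: int
    approach: str
    audience: str


def build_prompt(brief: Brief, source: str) -> str:
    """Turn a brief into the instructions sent along with the source text."""
    return (
        f"We're translating this into English in {brief.year}. "
        f"Approach: {brief.approach}. "
        f"Target audience: {brief.audience}. "
        'Please translate "from scratch".\n'
        f"{source}"
    )


# First experiment: a free, "novelistic" rendering for 1950s Western readers.
free_1956 = Brief(
    year=1956,
    approach="no need to follow the original structure or phrasing closely",
    audience="Western readers",
)

# Second experiment: derived from the first by changing the knobs.
close_2024 = replace(
    free_1956,
    year=2024,
    approach="follow the Japanese structure and phrasing as closely as possible",
    audience="millennials familiar with Japanese pop culture and anime",
)

for name, brief in {"free_1956": free_1956, "close_2024": close_2024}.items():
    print(f"--- {name} ---")
    print(build_prompt(brief, SOURCE))
```

Each prompt would then go to whichever model you use, and the outputs can be compared side by side. Deriving one brief from another with `replace` keeps a record of exactly what changed between attempts.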

But is it possible that generative AI will have an even more radical impact on the entire concept of the packages of information—books, reports, articles, documents—that our society has come to take for granted, and the entire enterprise of statically translating those packages for the benefit of speakers of other languages? We will take up this topic in the near future.

Richard Sprague

Personal Science

Jul 2

You might be interested in this detailed counter-argument and how the most demanding clients insist on humans in the loop. https://geekway.substack.com/p/the-robots-wont-cause-massive-unemployment?r=qpge&utm_campaign=post&utm_medium=email&triedRedirect=true

May 27

Human translators are toast (to use a metaphor that LLMs today can easily handle). But what happens to second language learning in general? How would you explain to a 20-year-old why they should bother with all that pesky memorization when good (and quickly improving) apps can handle most situations that would take years to master the old-fashioned way?

Japanese One Word At A Time
