Editing Machine-Generated Texts

Machine gibberish? No, thanks!

Machine translation systems have certainly improved in recent years. But being able to imitate human language is not the same as possessing human-like intelligence. Machines neither feel nor think like us. They don’t know empathy, they don’t giggle about silly dad jokes, and they don’t wince when you yell at them. This lack of emotions and lived experiences often makes their texts sound lifeless and dull.

In some scenarios, this isn’t a big deal. A simple user manual or a city council’s latest ordinance isn’t expected to win comedy awards. However, if a machine-generated translation elicits unintended laughter and makes your company go viral for all the wrong reasons, this is a problem.

To make sure this won’t happen to you, I offer a range of so-called MTPE services. These are in line with the relevant ISO standard and follow the principles outlined in my AI policy.

Overview

What is “MTPE”?

MTPE is short for Machine Translation Post-Editing and refers to the improvement and polishing of machine-generated translations. You may also find the abbreviations PEMT, short for Post-Editing of Machine Translation, and MÜ, which stands for “maschinelle Übersetzung” and is the German equivalent of MT. And to make things even more confusing, marketing teams of translation agencies and “AI” vendors have now introduced the new term Artificial Intelligence Post-Editing (AIPE). But that’s just another instance of too much hype. At the end of the day, “AI” systems are still machines – there’s no need to muddle the terminology further. Thus, I’m sticking to the established abbreviations, MT and MTPE.

These services come in two types: full MTPE and light MTPE. The former means doing a complete revision of the machine-generated translation with the goal of creating a text that is “accurate, comprehensible and stylistically adequate [and] indistinguishable from human translation output” while trying to “use as much of the MT output as possible.” This definition is found in ISO 18587: “Translation services – Post-editing of machine translation output – Requirements.”

In contrast, the much simpler light MTPE approach only aims to facilitate information gisting. A typical example: In an international company, all employees should be able to read and understand internal policies regardless of their native language. In such a scenario, the translations don’t need to be stylistically adequate, but they must be comprehensible and accurately convey the main points of the source material.

IMPORTANT: If your text is meant for publication, light MTPE is the wrong approach!

With the usual tools: 80% translation + 20% polishing

With the MTPE method: 20% translation + 80% polishing

A common misconception is that MT content will only need “a few quick fixes.” Sure, when you have a very simple text or only need a rough translation of internal files for informational purposes, light MTPE might suffice. But the word “rough” is key here: Poorly edited MT suffers from incorrect terminology, loses the source text’s subtleties, and often fails to reflect the original author’s style and tone of voice. Your readers and customers deserve better!

This is why “AI” rarely saves time

For years, language professionals have been using tools that speed up the translation process and lower costs for end clients without compromising quality. This workflow is known as Computer-Assisted Translation (CAT). Typical CAT software comes with termbases (for consistent terminology), translation memories (for the efficient re-use of existing translations), and various little helpers such as an auto-suggest feature (for faster typing). In short: CAT software can be quite a time saver!

Now, when clients pre-translate texts with an “AI” system – whether it’s ChatGPT, Amazon Translate, DeepL or a similar product – and hope to reduce translation costs to a third or even a tenth of the regular rates, it often ends with disappointment. Because the translator will then have to compare the MT with the source, move or replace words, fix inconsistently translated terminology, and get rid of silly mistakes a professional human would never have made in the first place. (For some examples, see “Human vs. machine” or scroll down to the FAQ section on this page.)

The bottom line is that a translation done by a skilled pro using CAT software, keyboard shortcuts, and other tools doesn’t take much longer than improving the flawed output of a machine. Plus, in cases where it makes sense, translators could integrate an “AI” translation engine into a CAT workflow themselves. So please manage your expectations and don’t demand unrealistic discounts!

Suitable Content

Certain types of texts can indeed benefit from machine pre-translation and be handled faster. For example, when I started as a translator, I often worked on product datasheets with highly repetitive content. This is just the sort of monotonous typing that machines are welcome to take off my hands. Other typical examples include:

User manuals
Recipes and similar instructions
Common descriptions for hotels, flights, rental cars, restaurants, etc.
Generic T&C documents
Simple course/e-learning content
Questionnaires and user surveys with standard wording

Automatically generated translations must be reviewed properly to make sure both content and form are fit for purpose. And once again, please note: A professional translator’s toolbox already includes suitable tools for these types of texts. Current “AI” systems will rarely yield significant gains in productivity.

When MTPE causes additional costs …

Problems typically arise when machine translations aren’t used purposefully but merely serve as a shortcut to save money. Ironically, a poorly planned MTPE strategy can even increase the total costs!

Consider the following example: An e-learning company hired me for an MTPE project where English course content was to be machine-translated and iteratively improved by multiple language service providers. The translations would be made accessible to paying (!) users right away. Despite the German version being labeled as a beta feature, this was a really bad idea: If you make people pay for content, don’t serve them unpolished translations!

Equally problematic are vague requirements. In the first round of this MTPE project, translators were told to fix “only major errors” that cause frustration among learners. But this is subjective. Some people get annoyed by nested sentences, others are bugged by inconsistent terminology or switching between formal and colloquial style. In addition, this client was hoping for a throughput of up to 2,000 words per hour – this doesn’t leave time for diligence.

The second round then required another translator to polish the initial translator’s work. Double-checking is a good idea, but the second translator hadn’t seen the content before, and there would still be various mistakes the first translator hadn’t fixed because of time pressure and unclear instructions. Hence, this second person would have to start from scratch but with an even tighter allocation of hours.

Finally, a third round focused on clicking through the published German version on the e-learning platform to make sure everything was properly translated, formatted, and correct in context. However, unclear context is something that should have been clarified immediately in the first round!

Long story short: This e-learning company ended up paying repeatedly for the same service and wasting a large chunk of money.

Let’s do the math for illustration. Assume a typical course on the platform has 10,000 words, and the translators all charge $50 per hour. Considering the average throughputs desired by this client, we get the following hours:

– 1st round: 5 hours (2,000 words/hr)
– 2nd round: 3.3 hours (3,000 words/hr)
– 3rd round: 2.5 hours (4,000 words/hr)

This results in: 5*$50 + 3.3*$50 + 2.5*$50 = $540.
If the client had opted for one proper round of full MTPE instead, with a feasible throughput of about 1,200 words/hr, they would’ve paid just 8.3*$50 = $415! As you can see, cutting corners and using MTPE without a good strategy will hurt your bottom line.
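
For readers who want to plug in their own numbers, the comparison above can be reproduced with a minimal Python sketch. The function names are hypothetical and chosen only for illustration; the word count, hourly rate, and throughputs are the ones from the example. Note that the script does not round the hours to one decimal, so its totals come out slightly higher than the rounded figures above (roughly $542 and $417), while the gap between the two scenarios stays the same.

```python
# Cost comparison: three rushed MTPE rounds vs. one proper round of full MTPE.
# All figures are taken from the example above; the names are illustrative.

WORDS = 10_000       # words in a typical course
HOURLY_RATE = 50.0   # USD per hour, same for every translator


def hours_needed(words: int, words_per_hour: float) -> float:
    """Hours required to process `words` at the given throughput."""
    return words / words_per_hour


def cost(words: int, words_per_hour: float, hourly_rate: float) -> float:
    """Cost of one round at the given throughput and hourly rate."""
    return hours_needed(words, words_per_hour) * hourly_rate


# Throughputs (words per hour) the client hoped for in rounds 1 to 3.
rushed_rounds = [2_000, 3_000, 4_000]
rushed_total = sum(cost(WORDS, wph, HOURLY_RATE) for wph in rushed_rounds)

# One round of full MTPE at a feasible throughput.
single_total = cost(WORDS, 1_200, HOURLY_RATE)

print(f"Three rushed rounds: ${rushed_total:.2f}")
print(f"One full MTPE round: ${single_total:.2f}")
print(f"Difference:          ${rushed_total - single_total:.2f}")
```

Changing `WORDS`, `HOURLY_RATE`, or the throughput values shows how quickly repeated rounds add up, even when each individual round looks cheap.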

Payment and Terms

For MTPE services I usually apply my standard rate of €50/hour. When MTPE is used in a meaningful way, with suitable texts, about 1,000–1,500 words per hour can be considered a realistic throughput (full MTPE). I’d be happy to look at your specific project to assess the potential effort and benefit of MTPE.

Payments can be made via SEPA credit transfer or, if you’re outside of Europe, via PayPal. My standard terms are 14 days, and my Terms & Conditions apply.

Decades of research – still not done

The history of machine translation did not begin with Google or OpenAI. The first practical considerations regarding the automatic transfer of texts from one language into another go back almost a whole century! In the 1930s, an engineer and a scholar, Georges Artsrouni and Peter P. Troyanskii, each independently filed a patent application for a mechanical translation machine (book rec: Early Years in Machine Translation).

However, it soon became clear that this was a rather lofty goal and would require at least some foundational research first. The 1940s and 1950s saw a good amount of enthusiasm that brought about various scientific papers, conferences, and new academic chairs, but the mood quickly changed. For a simple reason: Human language is complex, multi-faceted, and often ambiguous. It cannot be described and translated using only a basic set of deterministic rules.

And so this research field evolved and went from purely rule-based approaches to a combination of statistics and language models to predict word sequences, until it eventually embraced machine learning, neural networks, and big data for training more powerful models. Of course, all those decades saw significant progress. Long gone are the days of jumbled Google Translate output riddled with unidiomatic phrases and grammar errors. And yet, after so many years, machines still need help from qualified humans to create precise and consistent translations tailored to the target audience.

Frequent Questions

Your question isn’t answered here? Please also check my FAQ page or drop me a line via my contact form. Thank you!

Haven’t machine translations become good enough by now?

If by “good enough” you mean “it looks like normal German and that’s all that matters,” then yes, they are good enough. But your customers, readers, or fans will hate you.

Exhibit #1: Microsoft’s official German support website:

The German versions of Windows and Office include many poor translations that can make it difficult to understand what a button or feature actually does. But there’s at least some human oversight to prevent the localization from going off the rails. Unfortunately, that’s not true for Microsoft’s online documentation. Many pages are filled with the kind of machine-translated gibberish shown in the screenshot above.

To give you an idea of how bad the German is, this is what an equivalent English version of the first sentence under Option 1 would sound like:

Activate the option New Outlook toggle in your current try the Outlook app.

Does this sound “good enough” to you? Or does it make you want to switch to a better MS Office alternative?

So, if you think current machine translations will do, you either overestimate their quality by a lot or you don’t care much about your target audience. In the latter case, we wouldn’t be a good match anyway. Thanks for stopping by and have a nice day! 😉

Are machine translations suitable for subtitling projects?

Subtitles and captions follow established conventions so that viewers can scan the text in seconds while still being able to follow the visuals on screen. For example, there are guidelines regarding the optimal length of subtitles and the proper placement of line breaks. Furthermore, subtitles and captions should be synced to the visual content and to cuts between scenes.
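
To make these conventions a bit more tangible, here is a minimal Python sketch of the kind of mechanical check a tool can run on a single subtitle. The limits used (two lines, 42 characters per line, 17 characters per second) are assumptions based on commonly cited style guides, not values from this page, and the class and function names are hypothetical. Real projects follow the client’s own specifications.

```python
from dataclasses import dataclass

# Assumed limits for illustration only; actual style guides vary.
MAX_LINES = 2
MAX_CHARS_PER_LINE = 42
MAX_CHARS_PER_SECOND = 17.0


@dataclass
class Subtitle:
    start: float  # display start time in seconds
    end: float    # display end time in seconds
    text: str     # subtitle text, lines separated by "\n"


def check_subtitle(sub: Subtitle) -> list[str]:
    """Return a list of formal problems found in one subtitle."""
    problems = []
    lines = sub.text.split("\n")

    if len(lines) > MAX_LINES:
        problems.append(f"too many lines: {len(lines)}")

    for number, line in enumerate(lines, start=1):
        if len(line) > MAX_CHARS_PER_LINE:
            problems.append(f"line {number} too long: {len(line)} characters")

    duration = sub.end - sub.start
    total_chars = sum(len(line) for line in lines)
    if duration <= 0:
        problems.append("subtitle has no positive duration")
    elif total_chars / duration > MAX_CHARS_PER_SECOND:
        problems.append("reading speed too high for the display time")

    return problems


ok = Subtitle(start=0.0, end=2.0, text="Hello there.\nHow are you?")
too_dense = Subtitle(start=0.0, end=1.0, text="x" * 50)

print(check_subtitle(ok))
print(check_subtitle(too_dense))
```

A check like this only covers the countable part of the conventions. Whether a line break falls at a natural pause, or whether a subtitle lines up with a cut between scenes, still takes someone who understands the content.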

Machines can count characters and words, but they do not understand the content and therefore struggle with the translation and segmentation of subtitles. Maybe you’ve already seen those Instagram or YouTube reels with colorful machine-generated captions that consist of just 2–3 words per frame? People who rely on captions will constantly have to focus on the text and won’t be able to enjoy the visual content. This makes those captions pretty useless!

To answer your question: For subtitling projects, automated machine translations are just a stopgap and not a true solution. They require a lot of fixing, which means the potential savings in time and money compared to working with a professional human are rather low.

...

…