I was asked the following question regarding the Id-Extras DeepL Translate add-on:
We have run a number of major translations and there is a noticeable difference between this and the online version of DeepL. Is the latest API implemented in the plug-in? Can you tell us when there will be an update?
Bens followed up privately with a sample file (see screenshot below, right) and the following comments (the quotes and screenshots are used here with permission):
I have collected a few extracts from other texts, which also vary between Translate and DeepL online. In a larger document of 160 pages, there are hundreds of such examples. Some have greater differences than others, but what they have in common is that almost all of them are better in the online version (copy-paste process). I’ve also attached an idml-file where you can do some testing yourself.
In this blog post, I’d like to look at some of the causes behind these differences and suggest ways to remedy them, which usually involves a little preparatory work on the source file itself.
Problem #1: Undesirable paragraph breaks
This source text was provided as an example: Tick the games you’ve played!
The correct Norwegian translation, and the one produced by DeepL’s online translation service is: Kryss av for spillene du har spilt!
The poorer InDesign result is: Kryss av på for spillene du har spilt!
However, note that in InDesign, there is a paragraph break dividing the original sentence:
If I remove that paragraph break, the translation is correct, also with the Translate script:
This result makes sense, I think. With the paragraph break, what we have semantically is two separate sentences, and that gives DeepL less information to work with. Joining the two paragraphs to make one complete sentence gives it the context it needs.
It’s true that DeepL online translates this correctly even with the paragraph break. But I suspect that the online version simply removes all paragraph breaks before sending the text to the translation engine. This is clearly not something that Translate for InDesign should do, as it would destroy the structure of long text.
Suggested Solution: Prepare the text for translation in the InDesign file by removing semantically redundant paragraph breaks.
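As a rough aid for this preparation step, here is a minimal Python sketch that joins paragraphs which stop mid-sentence. It is not part of Translate. It assumes the story text has already been exported as a list of paragraph strings, and the function names, as well as the position of the break in the sample, are hypothetical. The heuristic is simply that a paragraph not ending in sentence punctuation probably continues in the next one.

```python
SENTENCE_END = ".!?:;…"      # punctuation that normally closes a sentence
CLOSERS = "\"'”’)]"          # quotes/brackets that may follow that punctuation


def ends_sentence(text):
    """True when the text ends with sentence punctuation (ignoring closers)."""
    return text.rstrip(CLOSERS).endswith(tuple(SENTENCE_END))


def merge_broken_paragraphs(paragraphs):
    """Join each paragraph onto the next while it does not end a sentence."""
    merged = []
    carry = ""
    for para in paragraphs:
        text = para.strip()
        if not text:
            continue  # skip empty paragraphs
        carry = f"{carry} {text}" if carry else text
        if ends_sentence(carry):
            merged.append(carry)
            carry = ""
    if carry:
        merged.append(carry)  # keep a trailing fragment rather than drop it
    return merged


# Hypothetical break position, for illustration only.
print(merge_broken_paragraphs(["Tick the games", "you’ve played!"]))
# → ['Tick the games you’ve played!']
```

Headings and captions that legitimately lack end punctuation would be merged too, so the output is best treated as a list of candidates to review by hand rather than something to apply blindly.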
Problem #2: Text on a Path
The text Unlock the secrets of this epic intergalactic game! has not been translated at all.
In this case, the text is on a path in InDesign. For some reason, when translating an entire file, Translate currently ignores text on a path and only translates text in text frames.
However, it is possible to select the text-on-a-path individually and run Translate. It will then be translated properly.
Suggested Solution: This one is our fault! I hope to fix the issue in a forthcoming update. For now, the workaround is to select each text on a path and run Translate on that selection individually.
Problem #3: All Capitals
DeepL works best if the text supplied has normal capitalization. DeepL has specifically advised me to “avoid text input in upper case, if possible”.
Some fonts display all text as capitals. Behind the scenes, however, it makes a difference to DeepL whether the text has been typed in all-caps or in regular upper- and lower-case.
For instance, the following text uses a font that only has capital letters.
If the text is typed in as “LET’S HEAR IT FOR THE HEROES” (i.e. all capitals), the result from DeepL into Norwegian is as follows. Note that the phrase “the heroes” has not been properly translated.
But if the same text is typed in as “Let’s hear it for the heroes” (i.e. upper- and lower-case), the result is correct:
It is still displayed in all-caps, because that is how the font is designed, but because the source text is provided in upper- and lower-case (even though it looks the same in InDesign), the result is much better.
Suggested Solution: Make sure that text intended for translation is not in ALL CAPITALS. It needs to be typed regularly, in upper- and lower-case. (If you have done this, it is fine to use a font that displays it as all-caps, or to use InDesign’s all-caps formatting.)
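To find such text before translating, a check along the following lines may help. This is a sketch only, assuming the paragraphs have been exported as plain strings; the function name and the `min_letters` threshold are hypothetical. It relies on Python's built-in `str.isupper()`, which is true when the string has at least one cased letter and none of them is lower-case.

```python
def find_all_caps(paragraphs, min_letters=4):
    """Return (index, text) for paragraphs typed entirely in capitals.

    min_letters is an arbitrary cut-off so that short tokens such as
    "OK" or "A" are not reported.
    """
    flagged = []
    for index, text in enumerate(paragraphs):
        letter_count = sum(ch.isalpha() for ch in text)
        if letter_count >= min_letters and text.isupper():
            flagged.append((index, text))
    return flagged


samples = [
    "LET’S HEAR IT FOR THE HEROES",   # typed in capitals: flagged
    "Let’s hear it for the heroes",   # typed normally: fine
    "OK",                             # too short to matter
]
print(find_all_caps(samples))
# → [(0, 'LET’S HEAR IT FOR THE HEROES')]
```

Retyping the flagged text by hand is the safer fix. Something like `text.capitalize()` gives a quick first draft, but it also lower-cases proper nouns and acronyms, which then need restoring.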
Problem #4: Hidden Inter-word Formatting Changes
This is perhaps the most problematic issue.
The text Ways to be a boss! has not been translated at all. The text is not on a path, but in a slightly rotated and sheared text frame.
On closer examination, a lot of custom kerning went into these few words (kudos to the publisher for their attention to detail!). The problem is that each instance of custom kerning creates a separate particle of text, and so what Translate sends to DeepL in this case is this:
<Content id='0'/>W<Content id='1'/>A<Content id='2'/>Y<Content id='3'/>S <Content id='4'/>T<Content id='5'/>O be a boss!<split/>
It does this to maintain the original formatting, but despite telling DeepL that the “Content” tag should be regarded as non-splitting, DeepL seems to stumble over a word where every single letter has a different formatting tag. It is unable to understand the sentence and aborts any attempt to translate it. (This is my guess of what is going on behind the scenes.)
When this sentence is pasted into the DeepL website online, though, the translation is fine, because only the text is pasted, and all the special InDesign formatting is, of course, stripped out.
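For reference, a request that marks the Content tag as non-splitting might look roughly like the sketch below. This is a simplified illustration, not Translate's actual code: `build_translate_payload` is a hypothetical helper, and the only DeepL-specific pieces are the `tag_handling` and `non_splitting_tags` parameters from DeepL's public API documentation, plus `NB` as the target code for Norwegian (Bokmål). Nothing is sent over the network here; the sketch only builds the JSON body.

```python
import json


def build_translate_payload(segments, target_lang="NB"):
    """Build a JSON-style request body for DeepL's translate endpoint.

    Hypothetical helper: tells DeepL to parse the input as XML and to
    treat <Content> as a tag that must not split a sentence.
    """
    return {
        "text": list(segments),
        "target_lang": target_lang,
        "tag_handling": "xml",
        "non_splitting_tags": ["Content"],
    }


# The tagged string from the example above.
sample = (
    "<Content id='0'/>W<Content id='1'/>A<Content id='2'/>Y"
    "<Content id='3'/>S <Content id='4'/>T<Content id='5'/>O be a boss!<split/>"
)

body = json.dumps(build_translate_payload([sample]), ensure_ascii=False)
print(body)
```

Even with these settings, the sample still contains a tag between almost every letter of the first two words, which is the part DeepL appears to stumble over.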
To fix the issue, at least on our end, make sure that there is a paragraph style applied to this text, and click on “Clear all overrides” to remove all paragraph and character overrides. This tends to give better results.
It must be said, however, that sometimes this is not sufficient to fix the problem. For reasons unclear, DeepL still doesn’t translate the text properly via the API. I’ve filed a bug report, so hopefully this is something that they will fix going forward!
Suggested Solution: Avoid formatting changes within a single word (e.g. kerning changes, color changes, baseline shift) for better results. But sometimes DeepL struggles in any case.
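To illustrate the difference between the tagged text and what the DeepL website receives, the following Python sketch strips the tags from the example string and counts how many Content tags fall inside a word. It assumes the tag format shown in the example above; the function names are hypothetical and this is not how Translate itself is implemented.

```python
import re

# Any formatting tag of the kind shown in the example.
TAG = re.compile(r"<Content id='\d+'/>|<split/>")
# A Content tag with a word character directly on both sides
# (not whitespace and not another tag).
INTRA_WORD_TAG = re.compile(r"(?<=[^\s>])<Content id='\d+'/>(?=[^\s<])")


def strip_tags(tagged):
    """Return the plain text, roughly what a copy-paste would contain."""
    return TAG.sub("", tagged)


def count_intra_word_tags(tagged):
    """Count formatting changes that happen in the middle of a word."""
    return len(INTRA_WORD_TAG.findall(tagged))


sample = (
    "<Content id='0'/>W<Content id='1'/>A<Content id='2'/>Y"
    "<Content id='3'/>S <Content id='4'/>T<Content id='5'/>O be a boss!<split/>"
)

print(strip_tags(sample))             # → WAYS TO be a boss!
print(count_intra_word_tags(sample))  # → 4
```

A non-zero count is a hint that the text has formatting overrides inside words and is a candidate for “Clear all overrides” before translation.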