DITA-OT Translate Plug-in is a DITA-OT Plug-in to create, auto-translate and re-merge XLIFF files, generating translated documentation in a targeted foreign language. It can create and consume files using either XLIFF 1.2 or XLIFF 2.1 format.
This plug-in consists of three DITA-OT transforms:
- The `xliff-create` transform creates XLIFF and skeleton files from the `*.dita` files.
- The `xliff-translate` transform populates the `<target>` texts using an automatic translation service.
- The `xliff-dita` transform recreates the DITA project using the translated texts.
The DITA-OT Translate Plug-in has been tested against DITA-OT 4.x. It is recommended that you upgrade to the latest version.
The DITA-OT Translate Plug-in is a plug-in for the DITA Open Toolkit.
- Full installation instructions for downloading DITA-OT can be found on the DITA-OT project website.
- Download the `dita-ot-4.2.zip` package from the project website at dita-ot.org/download
- Extract the contents of the package to the directory where you want to install DITA-OT.
- Optional: Add the absolute path for the `bin` directory to the PATH system variable. This defines the necessary environment variable to run the `dita` command from the command line.
curl -LO https://github.com/dita-ot/dita-ot/releases/download/4.2/dita-ot-4.2.zip
unzip -q dita-ot-4.2.zip
rm dita-ot-4.2.zip
- Run the plug-in installation commands:
dita install https://github.com/doctales/org.doctales.xmltask/archive/master.zip
The `dita` command line tool requires no additional configuration.
Several publicly available automatic translation cloud services can be used. They typically offer a try-before-you-buy option with sample access to the service at no cost. Upgrading to a paid version will be necessary when transforming larger documents.
The IBM Language Translator allows you to translate text programmatically from one language into another.
Introduction: Getting Started
Create an instance of the service:
- Go to the Language Translator page in the IBM Cloud Catalog.
- Sign up for a free IBM Cloud account or log in.
- Click Create.
Copy the credentials to authenticate to your service instance:
- From the IBM Cloud dashboard, click on your Language Translator service instance to go to the Language Translator service dashboard page.
- On the Manage page, click Show to view your credentials.
- Copy the `API Key` and `URL` values.
- Within the plug-in alter the file `cfg/configuration.properties` to hold your `API Key` and `URL`.
By default the Frankfurt translation service URL is used: `https://gateway-fra.watsonplatform.net/language-translator/api/v3/translate`. Amend this when using a different regional instance.
Microsoft Translator provides multi-language support for translation, transliteration, language detection, and dictionaries.
Introduction: Overview
Create an instance of the service:
- Go to Try Cognitive Services
- Select the Translator Text APIs tab.
- Under Translator Text, select the Get API Key button.
- Agree to the terms and select your locale from the drop-down menu.
- Sign in by using your Microsoft, Facebook, LinkedIn, or GitHub account.
You can sign up for a free Microsoft account at the Microsoft account portal. To get started, click Sign in with Microsoft and then, when asked to sign in, click Create one. Follow the steps to create and verify your new Microsoft account.
After you sign in to Try Cognitive Services, your free trial begins. The displayed webpage lists all the Azure Cognitive Services for which you currently have trial subscriptions. Two subscription keys are listed beside Speech Services. You can use either key in your applications.
Copy the credentials to authenticate to your service instance:
- Copy each of the `API Key` and `Endpoint` values.
- Within the plug-in alter the file `cfg/configuration.properties` to hold your `API Key` and `URL`.
By default the global translation service URL is used: `https://api.cognitive.microsofttranslator.com/translate`. Amend this when using a regional instance.
The API provides access to the Yandex online machine translation service. It supports more than 90 languages and can translate separate words or complete texts.
Introduction: Overview
To sign up for the service:
- Review the user agreement and rules for formatting translation results.
- Get a free API key.
- Read the documentation, where you will find instructions on enabling the API and detailed descriptions of its features.
After you sign in to your account, select API Keys and create a new key as necessary. The latest endpoint can be found in the documentation: `https://translate.yandex.net/api/v1.5/tr/translate`
Copy the credentials to authenticate to your service instance:
- Copy each of the `API Key` and `Endpoint` values.
- Within the plug-in alter the file `cfg/configuration.properties` to hold your `API Key` and `URL`.
The DeepL API is accessible with a DeepL Pro subscription (DeepL API plan) only. The API is an interface that allows other computer programs to send texts to the DeepL servers and receive high-quality translations.
Introduction: Overview
To sign up for the service:
- Open a DeepL API developers account. Note that not all accounts offer access to the DeepL API. It is essential that the account type includes REST API access.
- Fill out the application details and add a credit card. No payments are required for the first 30 days. You can cancel the card and still maintain free access for the trial period.
- Read the documentation, where you will find instructions on enabling the API and detailed descriptions of its features.
After you sign in to your account, select API Keys and create a new key as necessary. The latest endpoint can be found in the documentation: `https://api.deepl.com/v2/translate`
Copy the credentials to authenticate to your service instance:
- Copy each of the `API Key` and `Endpoint` values.
- Within the plug-in alter the file `cfg/configuration.properties` to hold your `API Key` and `URL`.
- To create an XLIFF 1.2 file and associated skeletons, run:
PATH-TO-DITA-OT/bin/dita -f xliff-create -i document.ditamap -o out --xliff.version=1
A `translate.xlf` file will appear in the `out` directory along with a series of skeleton files.
<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<file datatype="xml" original="/document.ditamap" source-language="en" target-language="es">
<header xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:dita="http://www.dita-ot.org">
<skl>
<external-file href="./skl/document.ditamap.skl" />
</skl>
</header>
<body xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:dita="http://www.dita-ot.org">
<trans-unit xmlns="" xmlns:dita="dita-ot.org" approved="no" id="42094" xml:space="preserve">
<source xml:lang="en">
Loves or pursues or desires to obtain pain of itself, because it
is pain, but occasionally circumstances occur in which toil and
pain can procure him some great pleasure. To take a trivial
example, <x ctype="x-dita-b" id="d3e14">which of us ever undertakes
laborious physical exercise,</x> except to obtain some advantage from it?
But who has any right to find fault with a man who chooses to enjoy a pleasure
that has no annoying consequences, or one who avoids a pain that produces no
resultant pleasure?
</source>
<target xml:lang="la"/>
</trans-unit>
... etc
</body>
</file>
...etc
Note: if the `translate.cachefile` parameter is used, unchanged text with previously approved translations will be copied over to the `<target>` elements.
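The cache behaviour described in this note can be pictured with a small conceptual sketch. It is not the plug-in's own code: the `Unit` class and the `apply_cache` function are hypothetical names, and the rule that a cached translation is reused only when the `id` matches, the source text is unchanged and the cached entry was approved is an assumption based on the description here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Unit:
    """Hypothetical stand-in for one translatable unit of an XLIFF file."""
    id: str
    source: str
    target: str = ""
    approved: bool = False


def apply_cache(units: List[Unit], cache: Dict[str, Unit]) -> List[Unit]:
    """Copy approved cached translations onto units whose text is unchanged.

    Assumption: a cache hit requires the same id, the same source text
    and an approved cached entry. Units without a hit are left untouched
    so that they can still be sent to a translation service.
    """
    for unit in units:
        cached = cache.get(unit.id)
        if cached and cached.approved and cached.source == unit.source:
            unit.target = cached.target
            unit.approved = True
    return units
```

Under these assumptions, a unit whose source text has been edited since the cached translation was approved falls through and remains a candidate for auto-translation.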
- To populate an existing XLIFF 1.2 file with auto-translated text, run:
PATH-TO-DITA-OT/bin/dita -f xliff-translate \
-i translate.xlf --translate.service=[bing|deepl|watson|yandex] \
--translate.apikey=<api-key> \
--xliff.version=1
The XLIFF 1.2 File is auto-translated in place, with translated text as shown:
Note: only `<trans-unit>` elements which are `approved="no"` will be auto-translated.
<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<file datatype="xml" original="/document.ditamap" source-language="en" target-language="es">
<header xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:dita="http://www.dita-ot.org">
<skl>
<external-file href="./skl/document.ditamap.skl" />
</skl>
</header>
<body xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:dita="http://www.dita-ot.org">
<trans-unit xmlns="" xmlns:dita="dita-ot.org" approved="no" id="42094" xml:space="preserve">
<source xml:lang="en">
Loves or pursues or desires to obtain pain of itself, because it
is pain, but occasionally circumstances occur in which toil and
pain can procure him some great pleasure. To take a trivial
example, <x ctype="x-dita-b" id="d3e14">which of us ever undertakes
laborious physical exercise,</x> except to obtain some advantage from it?
But who has any right to find fault with a man who chooses to enjoy a pleasure
that has no annoying consequences, or one who avoids a pain that produces no
resultant pleasure?
</source>
<target xml:lang="la">
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
enim ad minim veniam, <x ctype="x-dita-b" id="d3e14">quis nostrud exercitation
ullamco laboris,</x> nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor
in reprehenderit in voluptate velit esse cillum dolore eu fugiat
nulla pariatur. Excepteur sint occaecat cupidatat non proident,
sunt in culpa qui officia deserunt mollit anim id est laborum.
</target>
</trans-unit>
...etc
</body>
</file>
...etc
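Before re-running the transform it can help to see which units would still be eligible for auto-translation. The following is a minimal sketch using only the Python standard library; `unapproved_ids` and `local_name` are hypothetical helpers, not part of the plug-in. Elements are matched by local name because the sample above mixes namespaced and un-namespaced elements, and treating a missing `approved` attribute as `no` is an assumption.

```python
import xml.etree.ElementTree as ET
from typing import List


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]


def unapproved_ids(root: ET.Element) -> List[str]:
    """Return the ids of all <trans-unit> elements not marked approved="yes".

    Assumption: a unit without an 'approved' attribute counts as "no".
    """
    return [
        element.get("id", "")
        for element in root.iter()
        if local_name(element.tag) == "trans-unit"
        and element.get("approved", "no") != "yes"
    ]
```

For a file on disk, the root element can be obtained with `ET.parse("out/translate.xlf").getroot()`, assuming the XLIFF file was written to the `out` directory as in the commands above.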
- To create an XLIFF 2.1 file and associated skeletons, run:
PATH-TO-DITA-OT/bin/dita -f xliff-create -i document.ditamap -o out --xliff.version=2
A `translate.xlf` file will appear in the `out` directory along with a series of skeleton files.
<?xml version="1.0" encoding="UTF-8"?>
<xliff srcLang="en" trgLang="la" version="2.0" xmlns="urn:oasis:names:tc:xliff:document:2.0">
<file id="2" original="/topic.dita">
<skeleton href="./skl/topic.dita.skl"></skeleton>
<unit fs:fs="p" id="9962" xmlns:fs="urn:oasis:names:tc:xliff:fs:2.0">
<originalData>
<data id="sd4e14">&lt;b&gt;</data>
<data id="ed4e14">&lt;/b&gt;</data>
</originalData>
<segment state="initial">
<source xml:lang="en" xml:space="preserve">Loves or pursues or desires to obtain pain of
itself, because it is pain, but occasionally circumstances occur in which toil and pain
can procure him some great pleasure. To take a trivial example, <pc dataRefEnd="ed4e14"
dataRefStart="sd4e14" fs:fs="b" id="d4e14">which of us ever undertakes laborious physical
exercise,</pc>except to obtain some advantage from it? But who has any right to find fault
with a man who chooses to enjoy a pleasure that has no annoying consequences, or one who avoids
a pain that produces no resultant pleasure?
</source>
<target xml:lang="la"></target>
</segment>
</unit>
...etc
</file>
...etc
- To populate an existing XLIFF 2.1 file with auto-translated text, run:
PATH-TO-DITA-OT/bin/dita -f xliff-translate \
-i translate.xlf --translate.service=[bing|deepl|watson|yandex] \
--translate.apikey=<api-key> \
--xliff.version=2
The XLIFF 2.1 File is auto-translated in place, with translated text as shown:
Note: any `<segment>` elements which are `state="final"` will not be re-translated.
<?xml version="1.0" encoding="UTF-8"?>
<xliff srcLang="en" trgLang="la" version="2.0" xmlns="urn:oasis:names:tc:xliff:document:2.0">
<file id="2" original="/topic.dita">
<skeleton href="./skl/topic.dita.skl"></skeleton>
<unit fs:fs="p" id="9962" xmlns:fs="urn:oasis:names:tc:xliff:fs:2.0">
<originalData>
<data id="sd4e14">&lt;b&gt;</data>
<data id="ed4e14">&lt;/b&gt;</data>
</originalData>
<segment state="translated">
<source xml:lang="en" xml:space="preserve">Loves or pursues or desires to obtain pain of
itself, because it is pain, but occasionally circumstances occur in which toil and pain
can procure him some great pleasure. To take a trivial example, <pc dataRefEnd="ed4e14"
dataRefStart="sd4e14" fs:fs="b" id="d4e14">which of us ever undertakes laborious physical
exercise</pc>except to obtain some advantage from it? But who has any right to find fault with
a man who chooses to enjoy a pleasure that has no annoying consequences, or one who avoids a pain
that produces no resultant pleasure?
</source>
<target xml:lang="la">
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
enim ad minim veniam, <pc dataRefEnd="ed4e14" dataRefStart="sd4e14" fs:fs="b" id="d4e14">
quis nostrud exercitation ullamco laboris,</pc> nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat
nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia
deserunt mollit anim id est laborum.
</target>
</segment>
</unit>
...etc
</file>
...etc
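To see how far an XLIFF 2.x file has progressed through translation and review, the segment states can be tallied. This is a minimal standard-library sketch, not part of the plug-in; `segment_states` is a hypothetical helper. It relies on the `urn:oasis:names:tc:xliff:document:2.0` namespace shown in the sample above and assumes that a segment without a `state` attribute is in the `initial` state.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XLIFF2_NS = "urn:oasis:names:tc:xliff:document:2.0"


def segment_states(root: ET.Element) -> Counter:
    """Count <segment> elements per state value.

    Assumption: a missing 'state' attribute is counted as "initial".
    """
    return Counter(
        segment.get("state", "initial")
        for segment in root.iter(f"{{{XLIFF2_NS}}}segment")
    )
```

Segments counted under `final` are the ones the note above says will not be re-translated; everything else remains open to the `xliff-translate` transform.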
- To recreate `*.dita` files from an XLIFF file and its associated skeletons, run:
PATH-TO-DITA-OT/bin/dita -f xliff-dita -i translate.xlf -o out --xliff.version=[1|2]
The translated `*.dita` files are generated into the `out` directory.
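The three transforms are normally run one after another, so the command lines above can be assembled in one place. The sketch below only builds the argument lists and does not call DITA-OT itself. `pipeline_commands` is a hypothetical helper, and it assumes that the XLIFF file is `translate.xlf` inside the output directory and that the same output directory is reused for the final step, as in the examples above.

```python
from typing import List


def pipeline_commands(
    dita: str,
    ditamap: str,
    out_dir: str,
    service: str,
    api_key: str,
    xliff_version: int,
) -> List[List[str]]:
    """Build the create, translate and re-merge command lines in order."""
    if xliff_version not in (1, 2):
        raise ValueError("xliff_version must be 1 (XLIFF 1.2) or 2 (XLIFF 2.1)")

    version = f"--xliff.version={xliff_version}"
    # Assumption: xliff-create writes translate.xlf into the output directory.
    xliff = f"{out_dir}/translate.xlf"

    return [
        [dita, "-f", "xliff-create", "-i", ditamap, "-o", out_dir, version],
        [
            dita, "-f", "xliff-translate", "-i", xliff,
            f"--translate.service={service}",
            f"--translate.apikey={api_key}",
            version,
        ],
        [dita, "-f", "xliff-dita", "-i", xliff, "-o", out_dir, version],
    ]
```

Each returned list can be passed to `subprocess.run(command, check=True)` so that a failing step stops the sequence. Passing the arguments as a list also avoids shell quoting problems with the API key.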
Note
Any machine translation is by definition imperfect. A typical translation workflow would send the generated XLIFF files to a translation agency (also known as a "localisation service provider") and receive back verified translated content integrated into the XLIFF. For XLIFF 1.2, each `<trans-unit>` should be marked `approved="yes"` when the `<target>` element has been verified. Similarly, for XLIFF 2.1, each `<segment>` should be marked as `state="final"`.
- `translate.from` - Source language to use. Defaults to the value in `configuration.properties`
- `translate.to` - Target language. Defaults to the value in `configuration.properties`
- `translate.cachefile` - Specifies the (absolute) location of a previously translated XLIFF file to be used. If the `id` matches a previously translated text snippet in the cache file, the text will be copied over and the snippet marked as `approved`.
- `translate.service` - Decides which translation service to use:
  - `bing` - Connects to the Microsoft Azure Translation service
  - `custom` - Sends the text to translate to an arbitrary URL using POST. Use this to connect to proxies for Google Cloud Translate
  - `deepl` - Connects to the DeepL API Translation service
  - `dummy` - Avoids accessing a translation service and copies the source text to the target language directly without amendment
  - `watson` - Connects to the IBM Cloud Translation service
  - `yandex` - Connects to the Yandex Translation service
- `translate.authentication.url` - URL for creating an OAuth token if needed for a service. Defaults to the value in `configuration.properties`
- `translate.apikey` - API Key for the Translation service. Defaults to the value in `configuration.properties`
- `translate.region` - Subscription region for a Microsoft multi-service text API subscription
- `translate.url` - URL for a Translation service. Defaults to the value in `configuration.properties`
- `xliff.version` - Decides which XLIFF format to use. Defaults to the value in `configuration.properties`:
  - `1` - XLIFF 1.2 format
  - `2` - XLIFF 2.1 format
Apache 2.0 © 2019 - 2024 Jason Fox
The Program includes the following additional software components which were obtained under license:
- json-simple-1.1.1.jar - https://github.com/fangyidong/json-simple - Apache 2.0 license
- xmltask.jar - https://github.com/antlibs/ant-xmltask - Apache 2.0 license