While the GNU gettext tools deal mainly with POT and PO files, they can also manipulate a couple of other data formats.
Here is a list of other data formats which can be internationalized using GNU gettext.
RPMs: gettext
Ubuntu packages: gettext
File extension: pot, po
Extractor: xgettext
RST is the format of resource string table files of the Free Pascal compiler versions older than 3.0.0. RSJ is the new format of resource string table files, created by the Free Pascal compiler version 3.0.0 or newer.
RPMs: fpk
Ubuntu packages: fp-compiler
File extension: rst, rsj
Extractor: xgettext, rstconv
RPMs: glade, libglade, glade2, libglade2, intltool
Ubuntu packages: glade, libglade2-dev, intltool
File extension: glade, glade2, ui
Extractor: xgettext, libglade-xgettext, xml-i18n-extract, intltool-extract
RPMs: glib2
Ubuntu packages: libglib2.0-dev
File extension: gschema.xml
Extractor: xgettext, intltool-extract
This file format is specified in https://www.freedesktop.org/software/appstream/docs/.
RPMs: appdata-tools, appstream, libappstream-glib, libappstream-glib-builder
Ubuntu packages: appdata-tools, appstream, libappstream-glib-dev
File extension: appdata.xml, metainfo.xml
Extractor: xgettext, intltool-extract, itstool
Marking translatable strings in an XML file is done through a separate
"rule" file, making use of the Internationalization Tag Set standard
(ITS, https://www.w3.org/TR/its20/).  The currently supported ITS
data categories are: ‘Translate’, ‘Localization Note’,
‘Elements Within Text’, and ‘Preserve Space’.  In addition to
them, xgettext also recognizes the following extended data
categories:
The ‘Context’ data category associates a msgctxt with the extracted text.  In
the global rule, the contextRule element contains the following:
- A selector attribute.  It contains an absolute selector
  that selects the nodes to which this rule applies.
- A contextPointer attribute that contains a relative
  selector pointing to a node that holds the msgctxt value.
- A textPointer attribute that contains a relative
  selector pointing to a node that holds the msgid value.
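As an illustration, a context rule of this kind might look like the following minimal sketch.  It assumes a document whose message/p elements carry a hypothetical context attribute; the element names, the attribute name, and the choice of version="2.0" are assumptions made for this example only.  The gt prefix is bound to the gettext extensions namespace.

```xml
<?xml version="1.0"?>
<its:rules xmlns:its="http://www.w3.org/2005/11/its"
           xmlns:gt="https://www.gnu.org/s/gettext/ns/its/extensions/1.0"
           version="2.0">
  <its:translateRule selector="//message/p" translate="yes"/>
  <!-- Hypothetical: the msgctxt is taken from the 'context' attribute
       of each selected 'p' element. -->
  <gt:contextRule selector="//message/p" contextPointer="@context"/>
</its:rules>
```

With such a rule, an element like <p context="menu">Open</p> would be expected to yield a message with msgctxt "menu" and msgid "Open".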
The ‘Extended Preserve Space’ data category extends the standard
‘Preserve Space’ data category with the additional values ‘trim’ and
‘paragraph’.
‘trim’ means to remove the leading and trailing whitespace of the
content, but not to normalize whitespace in the middle.
‘paragraph’ means to normalize the content but keep the paragraph
boundaries.  In the global
rule, the preserveSpaceRule element contains the following:
- A selector attribute.  It contains an absolute selector
  that selects the nodes to which this rule applies.
- A space attribute with the value default,
  preserve, trim, or paragraph.
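A rule of this kind could be written as in the following sketch.  The description element is a hypothetical name chosen for the example, and version="2.0" is likewise an assumption.

```xml
<?xml version="1.0"?>
<its:rules xmlns:its="http://www.w3.org/2005/11/its"
           xmlns:gt="https://www.gnu.org/s/gettext/ns/its/extensions/1.0"
           version="2.0">
  <!-- Hypothetical: normalize whitespace inside 'description' elements
       while keeping the paragraph boundaries. -->
  <gt:preserveSpaceRule selector="//description" space="paragraph"/>
</its:rules>
```

Replacing space="paragraph" with space="trim" would instead only strip the leading and trailing whitespace of the selected content.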
The ‘Escape Special Characters’ data category indicates whether the
special XML characters (<, >, &, ") are escaped with entity
references.  In the global rule, the escapeRule element contains
the following:
- A selector attribute.  It contains an absolute selector
  that selects the nodes to which this rule applies.
- An escape attribute with the value yes or no.
- An unescape-if attribute with the value
  xml, xhtml, html, or no.
The default values, escape="no" and unescape-if="no",
should be good for most XML file types.
A rule with escape="no",
that was necessary with GNU gettext versions before 0.23,
is now redundant.
The unescape-if attribute is useful for XML file types
which present messages with embedded XML elements to the translator.
Such file types are for example DocBook or XHTML.
If unescape-if="xml" is specified and the translation
of a message looks like valid XML, the usual escaping of <,
>, and character references is omitted.
The resulting XML document then is likely what the translator intended.
However, if the translator did not merely copy the XML markup from the
message to the translation, but added or removed markup,
the resulting XML document may be invalid.
It is therefore useful if, after invoking msgfmt, you check
the resulting XML document against the appropriate XML schema or DTD.
Similarly, if unescape-if="xhtml" is specified and the translation
looks like valid XHTML, the usual escaping is omitted.
And likewise for unescape-if="html".
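For a file type that presents embedded markup to the translator, an escape rule might be sketched as follows.  The para element name is a hypothetical stand-in for whatever element holds the messages, and version="2.0" is an assumption.

```xml
<?xml version="1.0"?>
<its:rules xmlns:its="http://www.w3.org/2005/11/its"
           xmlns:gt="https://www.gnu.org/s/gettext/ns/its/extensions/1.0"
           version="2.0">
  <!-- Hypothetical: translations of 'para' elements that look like
       valid XML are written out without the usual escaping. -->
  <gt:escapeRule selector="//para" escape="no" unescape-if="xml"/>
</its:rules>
```

As noted above, the merged document is worth validating afterwards, since a translation that adds or removes markup can make it invalid.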
All those extended data categories can only be expressed with global
rules, and the rule elements have to have the
https://www.gnu.org/s/gettext/ns/its/extensions/1.0 namespace.
Given the following XML document in a file ‘messages.xml’:
| <?xml version="1.0"?>
<messages>
  <message>
    <p>A translatable string</p>
  </message>
  <message>
    <p translatable="no">A non-translatable string</p>
  </message>
</messages>
 | 
To extract the first text content ("A translatable string"), but not the second ("A non-translatable string"), the following ITS rules can be used:
| <?xml version="1.0"?>
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
  <its:translateRule selector="/messages" translate="no"/>
  <its:translateRule selector="//message/p" translate="yes"/>
  <!-- If 'p' has an attribute 'translatable' with the value 'no', then
       the content is not translatable.  -->
  <its:translateRule selector="//message/p[@translatable = 'no']"
    translate="no"/>
</its:rules>
 | 
ITS rules files must have the ‘.its’ file extension and obey
the XML schema version 1.0 encoded by its.xsd10 or
the XML schema version 1.1 encoded by its.xsd11
and its auxiliary schema its-extensions.xsd.
‘xgettext’ needs another file called "locating rules" to associate an ITS rule with an XML file. If the above ITS file is saved as ‘messages.its’, the locating rules file would look like:
| <?xml version="1.0"?>
<locatingRules>
  <locatingRule name="Messages" pattern="*.xml">
    <documentRule localName="messages" target="messages.its"/>
  </locatingRule>
  <locatingRule name="Messages" pattern="*.msg" target="messages.its"/>
</locatingRules>
 | 
The locatingRule element must have a pattern attribute,
which denotes either a literal file name or a wildcard pattern of the
XML file.  The locatingRule element can have a child
documentRule element, which adds checks on the content of the XML
file.
The first rule matches any file with the ‘.xml’ file extension, but it only applies to XML files whose root element is ‘<messages>’.
The second rule indicates that the same ITS rules file is also
applicable to any file with the ‘.msg’ file extension.  The
optional name attribute of locatingRule allows choosing
rules by name, typically with xgettext's -L option.
The associated ITS rules file is indicated by the target attribute
of locatingRule or documentRule.  If it is specified in a
documentRule element, the parent locatingRule shouldn't
have the target attribute.
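The following sketch shows one way these two placements of the target attribute might be combined in a single locating rules file.  All file names, rule names, and root element names here are hypothetical, and using several documentRule children under one locatingRule is an assumption of this example.

```xml
<?xml version="1.0"?>
<locatingRules>
  <!-- The target is given per documentRule, so the parent locatingRule
       has no target attribute. -->
  <locatingRule name="Catalog" pattern="*.xml">
    <documentRule localName="catalog" target="catalog.its"/>
    <documentRule localName="catalog-v2" target="catalog2.its"/>
  </locatingRule>
  <!-- No content check is needed here, so the target sits on the
       locatingRule itself. -->
  <locatingRule name="Catalog" pattern="*.cat" target="catalog.its"/>
</locatingRules>
```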
Locating rules files must have the ‘.loc’ file extension and obey
the XML schema version 1.0 encoded by locating-rules.xsd10 or
the XML schema version 1.1 encoded by locating-rules.xsd11.
Both ITS rules files and locating rules files must be installed in the
‘$prefix/share/gettext/its’ directory.  Once those files are
properly installed, xgettext can extract translatable strings
from the matching XML files.
After strings have been extracted from an XML file to a POT file
through xgettext
and the translator has produced a PO file with translations,
it can be used in two ways:
One of them is to merge the translations back into the XML file,
using the msgfmt program with the option --xml.
See section Invoking the msgfmt Program, for more details about how one calls
the ‘msgfmt’ program.
During this merge from a PO file into an XML file, it may happen that
more escaping of special characters for XML is needed
than what msgfmt does by default.
In this case, you can enforce more escaping
either through an <escapeRule> ITS rule,
or through an attribute gt:escape="yes" on the particular XML element.
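The per-element variant might look like the following sketch.  It assumes that the gt prefix in the XML document is bound to the same gettext extensions namespace as in ITS rules files; the document structure is only an example.

```xml
<?xml version="1.0"?>
<messages xmlns:gt="https://www.gnu.org/s/gettext/ns/its/extensions/1.0">
  <message>
    <!-- Request more escaping for this element only. -->
    <p gt:escape="yes">A translatable string</p>
  </message>
</messages>
```

The rule-based variant would instead use an escapeRule element with escape="yes" and a selector that matches the affected elements.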
Here is a list of file formats that contain localized data and that the GNU gettext tools can manipulate.
These file formats can be used with all of the msg* tools and with
the xgettext program.
If you just want to convert among these formats, you can use the
msgcat program (with the appropriate option) or the xgettext
program.
File extension: po
File extension: properties
File extension: strings
These file formats can be created through msgfmt and converted back
to PO format through msgunfmt.
File extension: mo
See section The Format of GNU MO Files for details.
File extension: class
For more information, see the section Java and the examples
hello-java, hello-java-awt, hello-java-swing.
File extension: dll
For more information, see the section C#.
File extension: resources
For more information, see the section C#.
File extension: msg
For more information, see the section Tcl - Tk's scripting language and the examples
hello-tcl, hello-tcl-tk.
File extension: qm
For more information, see the examples hello-c++-qt and
hello-c++-kde.
The programmer produces a desktop entry file template with only the
English strings.  These strings get included in the POT file, by way of
xgettext (usually by listing the template in po/POTFILES.in).
The translators produce PO files, one for each language.  Finally, an
msgfmt --desktop invocation collects all the translations in the
desktop entry file.
For more information, see the example hello-c-gnome3.
Icons are generally locale dependent, for the following reasons:
However, icons are not covered by GNU gettext localization, because
Desktop Entry files may contain an ‘Icon’ property, and this property is localizable. If a translator wishes to localize an icon, she should bypass the normal workflow with PO files and instead ask for a line of the following form to be added
| Icon[locale]=icon_file_name | 
to the template file.
This line remains in place when this template file is merged with the
translators' PO files, through msgfmt.
See the section Preparing Rules for XML Internationalization and
Invoking the msgfmt Program, subsection “XML mode operations”.
  This document was generated by Bruno Haible on July, 2 2025 using texi2html 1.78a.