Coding for translation


==Introduction==

Gramps has always been internationalized (see https://gramps-project.org/introduction-WP/2006/04/looking-back-over-5-years/ ). Therefore, all strings meant for the user should always be flagged for translation.

In order to be considered for inclusion in the official Gramps release, any piece of code must support internationalization. What this means is that the Python module must support [[Translating Gramps|translations]] into different languages. Gramps provides support to make this as easy as possible for the developer. To enable a language, its language code must be set in the ''[[Template:Gramps_translations#ALL_LINGUAS |ALL_LINGUAS]]'' section.

==How to allow translations==

Example 1:

print ("Hello world!")

In this example, the string will always be printed as specified.

ngettext("George Smith and Annie Jones have %(num)d child", "George Smith and Annie Jones have %(num)d children", n) % {'num' : n}

In other cases, it's necessary to provide a hint to translators, <abbr title="exempli gratia - Latin phrase meaning 'for example'">e.g.</abbr>,

_("Remaining names | rest")
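As a rough illustration of what marking a string with <tt>_()</tt> buys you, the sketch below uses only the standard library's <tt>gettext</tt> module. The <tt>DictTranslations</tt> class and its one-entry catalog are hypothetical stand-ins for a real compiled message catalog; they are not part of Gramps, which obtains <tt>_</tt> from its own locale object.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical translator backed by a plain dict (illustration only)."""

    def __init__(self, catalog):
        super().__init__()
        self._catalog_dict = catalog

    def gettext(self, message):
        # Return the translation if one exists, else the original string.
        return self._catalog_dict.get(message, message)


french = DictTranslations({"Hello world!": "Bonjour le monde !"})
_ = french.gettext

untranslated = "Hello world!"    # a bare literal is never looked up
translated = _("Hello world!")   # a marked literal goes through the catalog
missing = _("No catalog entry")  # no entry: the original string comes back

print(untranslated)
print(translated)
print(missing)
```

Only the strings wrapped in <tt>_()</tt> can ever change with the user's language; the bare literal is printed as written in every locale.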

=== Encoding ===

String handling can be a bit tricky in a localized environment, so it's important that developers understand Unicode string handling in both versions of the language.

This is mostly a problem for Microsoft Windows™: Mac OSX and Linux use UTF8 for just about everything if the locale is set up correctly (and we try to do that when Gramps starts up), so one can get away with a lot of encoding mistakes on those platforms. Windows™ on the other hand uses a slightly modified version of UTF16 for file names and retains the old DOS [http://en.wikipedia.org/wiki/Code_page code page] system for encoding output to cmd.exe. The take-away is that if you need to mess with input or output encoding, be sure to test on both Linux and Windows before deciding that you're done. If you're not set up for multiple-platform testing, arrange with someone who can test for you on the platform you don't have.

====Python 2====
Python 2.7 has two text classes, <tt>[https://docs.python.org/2.7/library/functions.html#str str]</tt> and <tt>[https://docs.python.org/2.7/library/functions.html#unicode unicode]</tt>. <tt>Unicode</tt> objects are encoded in UTF16 internally on most platforms, and most Python output functions will do the right thing with them. One caveat here: passing both <tt>unicodes</tt> and <tt>strs</tt> to <tt>os.path.join()</tt> will return a <tt>str</tt>, so either make sure when constructing a path that all arguments are <tt>unicodes</tt> or convert the result.

The bsddb module that ships with Python2 is stupid about paths and requires that they be encoded in the file system encoding. This is handled in gramps/gen/db/write.py with _encode() and independently in a few other places.

Strings from the operating system, including environment variables, are a problem on Windows™. For input, the os module uses the [http://msdn.microsoft.com/en-us/library/windows/desktop/dd317766%28v=vs.85%29.aspx ANSI API] of the Windows SDK, which interprets the value of the environment variable according to the active code page and produces a <tt>str</tt>, converting any codepoints > 0xff to ? and often misinterpreting those between 0x0f and 0xff if the encoding of the input happens to be something other than the active system codepage. Once this is done it is quite difficult to get non-ASCII pathnames back into a usable form, so gramps/gen/constfunc.py provides a get_env_var() function that uses the [http://msdn.microsoft.com/en-us/library/windows/desktop/dd374081(v=vs.85).aspx Unicode API] instead. Always use that function to read environment variables which might include non-ASCII characters, and avoid using os-module functions for reading paths.

By default string constants in Python 2 are <tt>str</tt>.

====Python 3====
Python 3 also provides two text classes, <tt>[https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str str]</tt> and <tt>[https://docs.python.org/3/library/stdtypes.html#binary-sequence-types-bytes-bytearray-memoryview bytes]</tt>. In Python 3, <tt>str</tt> is the unicode type and <tt>bytes</tt> is text encoded some other way. Everything pretty much "just works".

====Portability Functions and constants====
We've provided a couple of functions in gramps/gen/constfunc.py to ease conversion of <tt>strs</tt> to <tt>unicodes</tt>; these include the necessary tests to portably do the right thing regardless of what's passed to them and according to which version of Python is in use:
* <tt>cuni</tt> is an alias for [https://docs.python.org/2.7/library/functions.html#unicode unicode] in Python 2 and for [https://docs.python.org/3/library/functions.html#func-str str] in Python 3. This has no protective checks so use it with care.
* <tt>conv_to_unicode(string, encoding='utf8')</tt>: This ensures that its return value is a Unicode string, converting a non-Unicode string from the given <tt>encoding</tt>, which defaults to UTF8 for ease of use with the GUI.
* <tt>get_env_var(string, default=None)</tt>: On Windows™ in Python2, uses the <tt>ctypes</tt> module to invoke the Microsoft Unicode API to read the value of an environment variable and return a Unicode string; otherwise returns the value from the <tt>os.environ</tt> array.
There are also two constants:
* <tt>STRTYPE</tt> is an alias for [https://docs.python.org/2.7/library/functions.html#basestring basestring] in Python 2 and for [https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str str] in Python 3. It can be used to test whether an object is a text-type.
* <tt>UNITYPE</tt> is an alias for [https://docs.python.org/2.7/library/functions.html#unicode unicode] in Python 2 and for [https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str str] in Python 3. It can be used to test whether an object is already encoded in Unicode.

====For portable string handling on all platforms and for all locales====
* Localized strings returned from gettext, ngettext, <abbr title="et cetera - Latin phrase meaning 'and so on'">etc.</abbr> are always unicode.
* Text files should always be encoded in UTF8. The easy and portable way to do this is to:
*: <pre>import io</pre>
*: <pre>fh = io.open(filepath, mode, encoding='utf8')</pre>
*: where ''mode'' is one of r, w, r+, or w+. ''Don't open these files in binary mode!'' Pass unicode-type strings to fh.write() and expect the same from fh.read().
* Always read environment variables with <tt>constfunc.get_env_var()</tt> if there's any chance that the value will contain a non-ASCII character.
* Use <tt>from __future__ import unicode_literals</tt> in any source file which might present strings to the user or to the operating system.
*:When creating string literals, '''don't do this:'''
*:<pre>print _(u"Eg, valid values are 12.0154, 50° 52′ 21.92″N")</pre>
*:Because the <tt>u</tt> prefix was removed for Python 3.0-3.2. (It was restored in 3.3 for compatibility with 2.7, but it's not necessary.)
*:Instead, put in the first line of the module
*:<pre># -*- coding: utf-8 -*-</pre>
*:then in the imports section
*:<pre>from __future__ import unicode_literals</pre>
*:which makes all of the literals unicode. '''Make sure that your editor is set up to save utf-8!'''
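The text-file advice above can be tried out with a small round trip. This is a minimal sketch under Python 3 using only the standard library; the scratch file name is just an example, and the binary read at the end is there only to show what the text-mode calls are hiding.

```python
import io
import os
import tempfile

text = "Eg, valid values are 12.0154, 50° 52′ 21.92″N"

# Scratch location for the demonstration; any writable path would do.
filepath = os.path.join(tempfile.mkdtemp(), "example.txt")

# Text mode with an explicit encoding: write() takes a unicode string.
with io.open(filepath, "w", encoding="utf8") as fh:
    fh.write(text)

# Reading back in text mode yields a unicode string again.
with io.open(filepath, "r", encoding="utf8") as fh:
    roundtrip = fh.read()

# For contrast only: binary mode exposes the raw UTF-8 bytes on disk.
with io.open(filepath, "rb") as fh:
    raw = fh.read()

print(roundtrip == text)
print(raw == text.encode("utf8"))
```

Passing the encoding explicitly means the result does not depend on the platform's default code page, which is the point of the recommendation.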

===Glade files===

External addons often need to provide their own message catalogs. To pull one in, use this instead of the usual:

from gramps.gen.const import GRAMPS_LOCALE as glocale

_ = glocale.get_addon_translator(__file__).gettext

or if you need more than one retrieval function:

translatable strings and for retrieving the translated values are the same as for internal modules.

See [[Addons_development#Localization|Addons development]] for more details.

==How it works==

* [[Translation_environment20|2.0.x]]

* [[Translation_environment22|2.2.x to Gramps 3.4.x]]

* [[Translation_environment4|Gramps 4.0.x to 5.1.x]]

There are two stages to getting a translation to work.

* [[Translation_environment22#Files_and_directory|2.2.x to Gramps 3.4.x]]

* [[Translation_environment4#Files_and_directory|Gramps 4.0.x to 5.1.x]]

==Tips for writing a translatable Python module==

===Use complete sentences===

Don't build up a sentence from phrases. Because a sentence is ordered in a particular way in your language does not mean that it is ordered the same way in another. Providing the entire sentence as a single unit allows the translator to make a meaningful translation. Do not concatenate phrases or terms as they will then show up as separate phrases or terms to be translated and the complete sentence may then show up incorrectly, especially in right-to-left languages (Arabic, Hebrew, <abbr title="et cetera - Latin phrase meaning 'and so on'">etc.</abbr>).

===Use named %s/%d values===

Python provides a powerful mechanism that allows the reordering of %s values in a string. A translator may need to rearrange the structure of a sentence, and it may not match the order you chose. For example:
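A minimal sketch of the idea, in plain Python with no catalog involved: the second template stands in for a hypothetical translation that needs the values in a different order, and the sentences themselves are made up for illustration.

```python
# Named placeholders let a translation reorder the values freely.
values = {"child": "Annie Jones", "father": "George Smith"}

english = "%(child)s is the daughter of %(father)s"
# Hypothetical translated template that mentions the father first.
reordered = "%(father)s has a daughter named %(child)s"

print(english % values)
print(reordered % values)

# With positional placeholders the order is fixed by the calling code,
# so the same reordered template silently swaps the two names.
positional = "%s has a daughter named %s"
swapped = positional % ("Annie Jones", "George Smith")
print(swapped)
```

Because the dictionary keys travel with the template, the translator can move <tt>%(father)s</tt> anywhere in the sentence without any change to the calling code.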

===Provide support for plural forms.===

Plurals are handled differently in various languages. Whilst English and German have a singular and a plural form, other languages like Turkish don't distinguish between plural and singular, and there are languages which use different plurals for different numbers, <abbr title="exempli gratia - Latin phrase meaning 'for example'">e.g.</abbr>, Polish.

Gramps provides [[Translating_Gramps#Plural_forms|plural forms]] support, useful for locales with multiple plurals according to a number (''often Slavic-based languages'') or for Asian family languages (''singular = plural'').
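The calling pattern can be sketched with the standard library's <tt>gettext.NullTranslations</tt>, which has no catalog and therefore simply applies the English rule (singular only when the number is exactly 1). The <tt>children_line</tt> helper is a hypothetical name used for illustration; with a real catalog loaded, the same call would pick among however many plural forms the target language defines.

```python
import gettext

# No catalog loaded: ngettext falls back to the English singular/plural rule.
translator = gettext.NullTranslations()
ngettext = translator.ngettext


def children_line(n):
    """Return a complete sentence whose plural form depends on n."""
    template = ngettext(
        "George Smith and Annie Jones have %(num)d child",
        "George Smith and Annie Jones have %(num)d children",
        n,
    )
    return template % {"num": n}


for n in (0, 1, 2):
    print(children_line(n))
```

Both forms are given as complete sentences, so a translator never has to guess how a lone plural suffix fits into the surrounding text.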

===Object classes===

Gramps often displays names of primary objects (''Person, Family, Event, <abbr title="et cetera - Latin phrase meaning 'and so on'">etc.</abbr>''). To keep the displayed strings consistent (in English too!), the TransUtils module provides a ''trans_objclass(objclass_str)'' function.

So, when we need to display the primary object name in lower case within a sentence, we can use this function.

will display:

See ''the person'' details # or See ''the family, the event, <abbr title="et cetera - Latin phrase meaning 'and so on'">etc.</abbr> ...'' details

Make ''the person'' active

Instead of "free form" text that talks about

<abbr title="exempli gratia - Latin phrase meaning 'for example'">e.g. </abbr>,

son '''of %s'''

better would be for example some tabulated format like this:

The use of commas, semicolons and spacing can differ from English.

Remember, simple is better, so try to limit punctuation marks.

====definition====

<pre>
$ python3
>>> import string
>>> print(string.punctuation)
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
</pre>

====locale case====

In French, a space is required before or after some punctuation marks and symbols, like

* GtkBuilder (editors, configuration dialogs) can append a default colon to the string without any spacing, so some translations need extra testing and customization, <abbr title="exempli gratia - Latin phrase meaning 'for example'">e.g.</abbr>, in French:

<pre>
#: ../gramps/gen/plug/report/stdoptions.py:257 ../gramps/gui/configure.py:1222
msgid "Date format"
msgstr "Format des dates "</pre>

<pre>
# comté (Canada)
#: ../gramps/gui/configure.py:617
#: ../gramps/gui/editors/displaytabs/addrembedlist.py:75
#: ../gramps/plugins/view/repoview.py:92
msgid "State/County"
msgstr "Province/Comté "</pre>

<pre>
# L'espace final est pour précéder le « : » codé en dur.
#: ../gramps/gui/configure.py:1332
msgid "Status bar"
msgstr "Barre d'état "
</pre>

===Deferred key on lists===

In most coding situations, strings are translated where they are coded. Occasionally however, you need to mark strings for translation, but defer actual translation until later. A classic example is:

<pre>
animals = ['mollusk',
           'albatross',
           'rat',
           'penguin',
           'python', ]

for a in animals:
    print(a)
</pre>

Here, you want to mark the strings in the animals list as being translatable, but you don’t actually want to translate them until they are printed.

Here is one way you can handle this situation:

<pre>
def _(message): return message

animals = [_('mollusk'),
           _('albatross'),
           _('rat'),
           _('penguin'),
           _('python'), ]

del _

for a in animals:
    print(_(a))
</pre>

This works because the dummy definition of _() simply returns the string unchanged. This dummy definition temporarily overrides any definition of _() in the built-in namespace (until the del command). Take care, though, if you have a previous definition of _() in the local namespace.

Note that the second use of _() will not identify “a” as being translatable to the gettext program, because the parameter is not a string literal.

Another way to handle this is with the following example:

<pre>
def _T_(message): return message

animals = [_T_('mollusk'),
           _T_('albatross'),
           _T_('rat'),
           _T_('penguin'),
           _T_('python'), ]

for a in animals:
    print(_(a))
</pre>

In this case, you are marking translatable strings with the function _T_(), which won’t conflict with any definition of _().

See [https://docs.python.org/dev/library/gettext.html#deferred-translations deferred translations]

The current custom key in Gramps code is '''_T_'''. It is set as an xgettext flag in the [https://github.com/gramps-project/gramps/blob/master/po/genpot.sh#L6 shell script] and the [https://github.com/gramps-project/gramps/blob/master/po/update_po.py#L716 python interface] that generate the translation strings template.

==Changing translated text message in the source code==

To make it easier to port your changes across multiple branches, it's a good idea to separate the changes in the source tree from the po/ ones. This way, you'll be able to quickly re-apply the source changes using normal cross-branch porting workflow (such as `git cherry-pick'), and then adjust and re-run the search-and-replace in the po/ on the new branch, because, most likely, it won't reapply due to the differences in the .po layout.

{{man note| Note |To stress it again, only do this for a text change that doesn't change how the text is going to be translated. If you'd like your change to be somehow reflected in the translations, let the translators do the work instead.}}

==Textual reports==

Starting with Gramps-3.2 we are able to select the language for textual reports, see feature {{bug|2371}}.

At first this was only available in the Ancestor report (3.2.x) and the detailed reports (3.3.x). The capability for translated output was added to more (Gramps core) reports in the gramps40 branch, before Gramps 4.0.0 was released, so more than the "three original reports" now have this [https://gramps-project.org/bugs/view.php?id=2371#c33601 feature request implemented].

For providing this option:

self.__get_date(event.get_date_object())

self.__get_type(event.get_type())

==See also==

* [[Coding_for_translation|Coding for Translation]]

* [[Coding_for_translation_using_weblate|Coding for Translation using Weblate]]

* [[Translating_Gramps_using_Weblate|Translating Gramps using Weblate]]

* [[Translating_the_Gramps_User_manual|Translating the Gramps User manual]]

* [[Portal:Translators|Portal:Translators]]

* [[Committing_policies|Committing policies]]

* [[What_to_do_for_a_release|What to do for a release]]

[[Category:Translators/Categories]]

[[Category:Developers/Tutorials]]

[[Category:Developers/General]]
