Changes

Coding for translation

10,744 bytes added, 00:07, 19 July 2015

→‎Textual reports

Coding guidelines to enable easy and correct translation of strings on the User Interface.


==Introduction==

Gramps has always been internationalized (see http://gramps-project.org/2006/04/looking-back-over-5-years). Therefore, all strings meant for the user should always be flagged for translation.

In order to be considered for inclusion in the official Gramps release, any piece of code must support internationalization. What this means is that the Python module must support [[Translating Gramps|translations]] into different languages. Gramps provides support to make this as easy as possible for the developer. To enable a language, its language code must be set in the ''[[Template:Gramps_translations#ALL_LINGUAS|ALL_LINGUAS]]'' section.

==How to allow translations==

Gramps is a fully-internationalized application with translations in many languages. All code which presents text to users must provide for that text to be translated. Fortunately, Gramps provides an extension of [http://docs.python.org/3/library/gettext.html gettext] which makes this fairly painless. First, alias the gettext function from the single localization instance:

 from gramps.gen.const import GRAMPS_LOCALE as glocale
 _ = glocale.translation.gettext

This statement imports the <code>gettext</code> function and aliases it as <code>_</code>. The translation tools treat strings wrapped in <code>_()</code> as translatable and assemble them into catalogs for the translators to work with; by aliasing it to <code>gettext()</code>, we also enable Python to retrieve the translation appropriate for the user's locale.

Example 1:

print _("Hello world!")

In this example, Gramps will attempt to translate the string. If a translation exists, the call to the function will return the translation. If a translation does not exist, the original string is returned.

=== More complicated translations ===

In addition to <tt>gettext</tt>, GrampsTranslation offers two more specialized retrieval functions, <tt>ngettext</tt> and <tt>sgettext</tt>.

In some strings, it's necessary to specify different translations depending upon the number of an argument. For example,

 George Smith and Annie Jones have 1 child
 George Smith and Annie Jones have 3 children

We'd code that in python as follows:

 _ = glocale.translation.ngettext
 _("George Smith and Annie Jones have %(num)d child", "George Smith and Annie Jones have %(num)d children", n) % {'num' : n}

In other cases, it's necessary to provide a hint to translators, e.g.

 _("Remaining names | rest")

We're making sure that the translators know that this message id means "what's left" rather than "take a nap". When the file is translated, this is no problem, because the translation doesn't include the hint -- but if the user is working in English, we don't want them to see the hint, so we need to alias <tt>_</tt> to <tt>sgettext</tt>:

 _ = glocale.translation.sgettext

Often you need to combine them. While <tt>ngettext</tt> and <tt>sgettext</tt> can each handle plain strings, neither can handle the other's strings. Fortunately the <tt>intltool</tt> message extractor is pretty stupid, so any function name that ends in either <tt>_</tt> or <tt>gettext</tt> will work. This will work pretty well:

 _ = glocale.translation.gettext
 N_ = glocale.translation.ngettext
 S_ = glocale.translation.sgettext

Obviously you would pass the translatable string to the right function.

=== Encoding ===

String handling can be a bit tricky in a localized environment so it's important that developers understand Unicode string handling in both versions of the language.
This is mostly a problem for Microsoft Windows™: Mac OSX and Linux use UTF8 for just about everything if the locale is set up correctly (and we try to do that when Gramps starts up), so one can get away with a lot of encoding mistakes on those platforms. Windows™ on the other hand uses a slightly modified version of UTF16 for file names and retains the old DOS [http://en.wikipedia.org/wiki/Code_page code page] system for encoding output to cmd.exe. The take-away is that if you need to mess with input or output encoding, be sure to test on both Linux and Windows before deciding that you're done. If you're not set up for multiple-platform testing, arrange with someone who can test for you on the platform you don't have.

====Python 2====

Python 2.7 has two text classes, <tt>[https://docs.python.org/2.7/library/functions.html#str str]</tt> and <tt>[https://docs.python.org/2.7/library/functions.html#unicode unicode]</tt>. <tt>unicode</tt> objects are stored internally as UTF16 or UCS4, depending on how Python was built, and most python '''output''' functions will do the right thing with them. One caveat here: passing both <tt>unicodes</tt> and <tt>strs</tt> to <tt>os.path.join()</tt> will return a <tt>str</tt>, so either make sure when constructing a path that all arguments are <tt>unicodes</tt> or convert the result.

The bsddb module that ships with Python 2 is stupid about paths and requires that they be encoded in the file system encoding. This is handled in gramps/gen/db/write.py with <tt>_encode()</tt> and independently in a few other places.

Strings from the operating system, including environment variables, are a problem on Windows™. For input, the os module uses the [http://msdn.microsoft.com/en-us/library/windows/desktop/dd317766%28v=vs.85%29.aspx ANSI API] to the Windows SDK, which interprets the value of the environment variable according to the active code page and produces a <tt>str</tt>, converting any codepoints > 0xff to ? and often misinterpreting those between 0x7f and 0xff if the encoding of the input happens to be something other than the active system codepage. Once this is done it is quite difficult to get non-ASCII pathnames back into a useable form, so gramps/gen/constfunc.py provides a <tt>get_env_var()</tt> function that uses the [http://msdn.microsoft.com/en-us/library/windows/desktop/dd374081(v=vs.85).aspx Unicode API] instead. Always use that function to read environment variables which might include non-ASCII characters and avoid using os-module functions for reading paths.

By default string constants in Python 2 are <tt>str</tt>.

====Python 3====

Python 3 also provides two text classes, <tt>[https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str str]</tt> and <tt>[https://docs.python.org/3/library/stdtypes.html#binary-sequence-types-bytes-bytearray-memoryview bytes]</tt>. In Python 3, <tt>str</tt> is the unicode type and <tt>bytes</tt> is text encoded some other way. Everything pretty much "just works".

====Portability Functions and constants====

We've provided a few functions in gramps/gen/constfunc.py to ease conversion of <tt>strs</tt> to <tt>unicodes</tt>; these include the necessary tests to portably do the right thing regardless of what's passed to them and according to which version of Python is in use:
* <tt>cuni</tt> is an alias for [https://docs.python.org/2.7/library/functions.html#unicode unicode] in Python 2 and for [https://docs.python.org/3/library/functions.html#func-str str] in Python 3. This has no protective checks so use it with care.
* <tt>conv_to_unicode(string, encoding='utf8')</tt>: This ensures that its return value is a Unicode string which has been converted from a non-Unicode string in the given <tt>encoding</tt>, which defaults to UTF8 for ease of use with the GUI.
* <tt>get_env_var(string, default=None)</tt>: On Windows™ in Python 2, uses the <tt>ctypes</tt> module to invoke the Microsoft Unicode API to read the value of an environment variable and return a Unicode string; otherwise returns the value from the <tt>os.environ</tt> array.
There are also two constants:
* <tt>STRTYPE</tt> is an alias for [https://docs.python.org/2.7/library/functions.html#basestring basestring] in Python 2 and for [https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str str] in Python 3. It can be used to test whether an object is a text-type.
* <tt>UNITYPE</tt> is an alias for [https://docs.python.org/2.7/library/functions.html#unicode unicode] in Python 2 and for [https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str str] in Python 3. It can be used to test whether an object is already encoded in Unicode.

====For portable string handling on all platforms and for all locales====

* Localized strings returned from gettext, ngettext, etc. are always unicode.
* Text files should always be encoded in UTF8. The easy and portable way to do this is to:
*: <pre>import io</pre>
*: <pre>fh = io.open(filepath, mode, encoding='utf8')</pre>
*: where ''mode'' is one of r, w, r+, or w+. ''Don't open these files in binary mode!'' Pass unicode-type strings to fh.write() and expect the same from fh.read().
* Always read environment variables with <tt>constfunc.get_env_var()</tt> if there's any chance that the value will contain a non-ASCII character.
* Use <tt>from __future__ import unicode_literals</tt> in any source file which might present strings to the user or to the operating system.
*:When creating string literals, '''don't do this:'''
*:<pre>print _(u"Eg, valid values are 12.0154, 50° 52′ 21.92″N")</pre>
*:because the <tt>u</tt> prefix was removed for Python 3.0-3.2. (It was restored in 3.3 for compatibility with 2.7, but it's not necessary.)
*:Instead, put in the first line of the module
*:<pre># -*- coding: utf-8 -*-</pre>
*:then in the imports section
*:<pre>from __future__ import unicode_literals</pre>
*:which makes all of the literals unicode. '''Make sure that your editor is set up to save utf-8!'''

===Glade files===

Just enable the translatable attribute on an XML element.

====Non ASCII characters====

If you plan to use non-ASCII characters in a string that is to be translated, do not use escape sequences:

 Eg, valid values are 12.0154, 50<code>&</code>#xB0; 52' 21.92"N

use instead:

 Eg, valid values are 12.0154, 50° 52′ 21.92″N

In this case note the unicode characters for deg, min, sec. '''Ensure that your editor is set up to encode the characters in UTF-8!'''
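The same characters can be checked from Python. The following is a minimal sketch, assuming Python 3 and a hypothetical scratch file created with <tt>tempfile</tt>; it writes the example string through <tt>io.open</tt> with an explicit UTF-8 encoding and reads it back. The code points are spelled with <tt>\u</tt> escapes here only so that the sketch does not depend on how this page was saved.

```python
import io
import os
import tempfile

# Degree (U+00B0), prime (U+2032) and double prime (U+2033) as real characters.
text = "Eg, valid values are 12.0154, 50\u00b0 52\u2032 21.92\u2033N"

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Write and read in text mode with an explicit encoding.
with io.open(path, "w", encoding="utf8") as fh:
    fh.write(text)
with io.open(path, "r", encoding="utf8") as fh:
    round_trip = fh.read()

# Binary mode is used here only to inspect the bytes that ended up on disk.
with io.open(path, "rb") as fh:
    raw = fh.read()
```

The text read back is identical to what was written, and on disk the degree sign occupies the two bytes <tt>0xC2 0xB0</tt>, which is what a UTF-8 aware editor should also produce.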

====Accessibility====

* Custom widgets

In addition to [http://developer.gnome.org/gtk/2.24/GtkWidget.html#id1294298 accelerators], ''[http://developer.gnome.org/gtk/2.24/GtkWidget.html GtkWidget]'' also supports a custom <accessible> element, which supports actions and relations. Properties on the accessible implementation of an object can be set by accessing the internal child "accessible" of a ''[http://developer.gnome.org/gtk/2.24/GtkWidget.html GtkWidget]''. See [http://developer.gnome.org/gtk/2.24/GtkWidget.html#GtkWidget-BUILDER-UI GtkBuilder UI].

* Gtk label

A ''[http://developer.gnome.org/gtk/2.24/GtkLabel.html GtkLabel]'' '''with mnemonic support''' will automatically generate accessibility keys on linked ''[http://developer.gnome.org/gtk/2.24/GtkEntry.html GtkEntry]'' and ''UndoableEntry'' fields. Remember that Gramps also uses custom widgets like ''StyledTextEditor'' and ''ValidatableMaskedEntry'', which are not always related to a ''GtkLabel''.

* Toggle buttons and Icons on toolbar

Gramps often uses ''[http://developer.gnome.org/gtk/2.24/GtkToggleButton.html GtkToggleButtons]'' and a ''[http://developer.gnome.org/gtk/2.24/GtkImage.html GtkImage]'' on its own (an image without a label); this excludes blind people and makes for a poor interface in terms of accessibility.

See [[Accessibility]].

===Addons===

External addons often need to provide their own message catalogs. To pull one in, use this instead of the usual import:

 from gramps.gen.const import GRAMPS_LOCALE as glocale
 _ = glocale.get_addon_translator(__file__).gettext

or if you need more than one retrieval function:

 _translation = glocale.get_addon_translator(__file__)
 _ = _translation.gettext
 S_ = _translation.sgettext

The addon translator is another instance of GrampsTranslation, so the rules for creating translatable strings and for retrieving the translated values are the same as for internal modules.

See [[Addons_Development#Localization|Addons development]] for more details.

==How it works==

We need at least [http://www.gnu.org/software/gettext/manual/gettext.html GNU gettext]; [http://www.gnu.org/software/autoconf/manual/gettext/msginit-Invocation.html msginit] will then generate a standard gettext header.

Depending on the version, Gramps has used different environments for retrieving the strings to translate:

* [[Translation_environment20|2.0.x]]
* [[Translation_environment22|2.2.x to Gramps 3.4.x]]
* [[Translation_environment4|Gramps 4.0.x to master (trunk)]]

There are two stages to getting a translation to work.

===Files and directory===

Translations are stored in a <code>.po</code> file that contains the mappings between the original strings and the translated strings, see [[Translating Gramps]].

Translators use a generic file <code>gramps.pot</code> to generate their <code>.po</code> file.

Gramps uses a utility that extracts the strings from the source code to build the <code>.po</code> file. This utility examines the source files for strings that have been marked as translatable. In the python source, these are the strings enclosed in the <code>_()</code> function calls.

Note that because strings are extracted by a script from the source file, string constants and not variables must be enclosed in the <code>_()</code> call. In the following example, the extraction script will not extract the string.

At run time, the <code>_()</code> calls will translate the string by looking it up in the translation database (created from the <code>.po</code> files) and returning the translated string.
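The difference between a string the extractor can see and one it cannot can be sketched as follows. This is a minimal illustration, not Gramps code: <code>_</code> is bound to the standard library's <code>gettext.NullTranslations</code> as a stand-in for <code>glocale.translation.gettext</code> (an assumption), and the names <code>greeting</code>, <code>text</code> and <code>label</code> are hypothetical.

```python
import gettext

# Stand-in for glocale.translation.gettext (assumption): with no catalog
# loaded, NullTranslations.gettext() returns the message id unchanged.
_ = gettext.NullTranslations().gettext

# Extractable: the string constant appears literally inside _().
greeting = _("Hello world!")

# Not extractable: the extraction script only sees a variable name inside
# _(), so "Family tree" never reaches the translation template.
text = "Family tree"
label = _(text)
```

Both calls work at run time, but only the first message can ever have a translation, because only it is visible to the extraction script.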

===Add the reference to the file===

We also need to add a reference to this file for generating the translation template.

* [[Translation_environment22#Files_and_directory|2.2.x to Gramps 3.4.x]]
* [[Translation_environment4#Files_and_directory|Gramps 4.0.x, master (trunk)]]

==Tips for writing a translatable Python module==

===Use complete sentences===

Don't build up a sentence from phrases. Because a sentence is ordered in a particular way in your language does not mean that it is ordered the same way in another. Providing the entire sentence as a single unit allows the translator to make a meaningful translation. Do not concatenate phrases or terms as they will then show up as separate phrases or terms to be translated and the complete sentence may then show up incorrectly, especially in right-to-left languages (Arabic, Hebrew, etc.).
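As a hedged illustration of the point above, the sketch below contrasts a concatenated sentence with a complete one. <code>_</code> is again a stand-in built on the standard library's <code>gettext.NullTranslations</code> (an assumption, so nothing is actually translated), and the variable names are hypothetical.

```python
import gettext

# Stand-in for the Gramps translation function (assumption).
_ = gettext.NullTranslations().gettext

name, place = "Joe", "Toronto"

# Avoid: the translator only ever sees the fragment " was born in " and
# cannot move the name or the place to another position in the sentence.
fragmented = name + _(" was born in ") + place

# Prefer: the translator sees the whole sentence and may reorder it freely.
complete = _("%(name)s was born in %(place)s") % {"name": name, "place": place}
```

In English both variants produce the same text; the difference only shows up once a translation needs a different word order.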

===Use named %s /%d values===

Python provides a powerful mechanism that allows the reordering of %s values in a string. A translator may need to rearrange the structure of a sentence, and it may not match the order you chose. For example:

print "%s was born in %s" % ('Joe','Toronto')

'city' : 'Toronto', 'male_name' : 'Joe'}

In this case, the order of the %s formatters is not important, since the values will be looked up in the dictionary at run time to resolve the value. The translator can reorder the %s formatters, or even remove them without causing any problems.

Note that Python also allows a variation which some people find easier to read:

print "%(male_name)s was born in %(city)s" % dict(

city = 'Toronto', male_name = 'Joe')

Some languages use right-to-left text direction. It is important to use named arguments when there is more than one %s/%d value in a translation string.
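Why named values matter can be sketched with a toy catalog. Everything here is hypothetical: the one-entry dictionary stands in for a real <code>.po</code> file, the "translation" is just English with the placeholders in the opposite order, and <code>_</code> is a local helper rather than the Gramps function.

```python
# Hypothetical one-entry catalog standing in for a real .po file.
catalog = {
    "%(male_name)s was born in %(city)s":
        "In %(city)s, %(male_name)s was born",
}


def _(message):
    """Return the catalog entry for message, or message itself if missing."""
    return catalog.get(message, message)


values = {"male_name": "Joe", "city": "Toronto"}

# Named values: each placeholder is looked up by name, so the reordered
# translation is still filled in correctly.
sentence = _("%(male_name)s was born in %(city)s") % values

# Positional values: the same reordering silently swaps the arguments.
positional = "In %s, %s was born" % ("Joe", "Toronto")
```

With named placeholders the reordered sentence still reads correctly, whereas the positional variant ends up with the name and the city exchanged.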

===Provide separate strings for masculine and feminine.===

Plurals are handled differently in various languages. Whilst English or German have a singular and a plural form, other languages like Turkish don't distinguish between plural or singular and there are languages which use different plurals for different numbers, e.g. Polish.

Gramps provides [[Translating_Gramps#Plural_forms|plural forms]] support, useful for locales with multiple plural forms depending on a number (''often Slavic languages'') or for Asian languages (''singular = plural'').

Note that some locales need the singular form with [http://en.wikipedia.org/wiki/Plural#Zero zero], whereas the plural form might also be used in this case.
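A plural-aware call can be sketched as follows. This is only an illustration under stated assumptions: <code>N_</code> is bound to the standard library's <code>gettext.NullTranslations().ngettext</code>, standing in for <code>glocale.translation.ngettext</code>, and <code>children_line</code> is a hypothetical helper. Without a catalog the standard library falls back to the English rule (singular only when the number is 1); a real catalog supplies the rule for its own language.

```python
import gettext

# Stand-in for glocale.translation.ngettext (assumption). With no catalog,
# ngettext() returns the first string when n == 1 and the second otherwise.
N_ = gettext.NullTranslations().ngettext


def children_line(n):
    """Build the sentence for n children, letting ngettext pick the form."""
    return N_("George Smith and Annie Jones have %(num)d child",
              "George Smith and Annie Jones have %(num)d children",
              n) % {"num": n}
```

Calling <code>children_line(1)</code> yields the singular sentence and <code>children_line(3)</code> the plural one; zero takes the plural form under the English fallback rule.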

We need to import from the module:

 from gen.ggettext import sgettext as _

or

 from gen.ggettext import sngettext as _  # if you use ngettext; not implemented

The translation string will carry a context, but the context is hidden in the user interface.

See ''the person'' details # or See ''the family, the event, etc...'' details

Make ''the person'' active
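How a context hint can stay out of the user interface is sketched below. This is not the Gramps implementation: <code>sgettext_sketch</code> is a hypothetical helper, the <code>|</code> separator is assumed to split the context from the message, and the standard library's <code>gettext.NullTranslations</code> stands in for a catalog in which nothing is translated.

```python
import gettext

# Stand-in lookup with no catalog (assumption): returns the message id as is.
_plain = gettext.NullTranslations().gettext


def sgettext_sketch(msgid, sep="|"):
    """Look msgid up; if untranslated, return only the text after the hint."""
    translated = _plain(msgid)
    if translated == msgid:
        # No translation found: drop the context before the last separator
        # so that the hint is never shown to the user.
        return msgid.rsplit(sep, 1)[-1].strip()
    return translated
```

With this sketch, <code>sgettext_sketch("Remaining names | rest")</code> gives <code>rest</code>, while a string without a separator is returned unchanged.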

===Genitive form===

Genitive (and some other) forms need to modify the name itself in some locales, like Finnish or Swedish.

Instead of "free form" text that talks about

e.g.

son '''of %s'''

better would be for example some tabulated format like this:

son: %s

daughter: %s

which doesn't require genitive.

===Punctuation===

The use of commas, semicolons and spacing can differ from English.

''todo''

==Changing translated text message in the source code==

One of the severities in our bug tracker is "text", which ranks up as easier than "tweak" and "minor", but more difficult than "trivial". If a bug is concerned with readability or correctness of a text that Gramps outputs, whether in GUI, in a console error message, or in a produced report, then "text" is the severity to use. So why is it more than "trivial"?

As described above, any translated text in the source code gets reflected into tens of *.po files, maintained by the translators. So every time you just change it in the source, ALL the translators need to do the translation again. Normally, the translation environment will give a prudent suggestion, but there is still a manual approval step. If you check in the change, the string will not be translated until the translators pick it up.

This is why, if what you change is just a couple of spelling mistakes, a missing comma in the middle, or maybe an extra space somewhere in the message, it's a good idea to save the translators' work, by doing a global search and replace of your source message text in the *.po files, and committing these along with your change.

For short enough messages that don't span multiple lines in the *.po files, you can do it by executing

perl -pi -e 's/YOUR MESSAGE BEFORE CORRECTION/your message after correction/g;' *.po *.pot

in the po/ directory. Make sure you do a "git diff" and observe the results make sense. (You'll probably have to escape some characters in the regular expression, such as | or .).
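Where Perl is not at hand, the same idea can be sketched in Python. This is a hedged alternative, not part of the Gramps tooling: <code>replace_in_catalogs</code> is a hypothetical helper, the files are assumed to be UTF-8 encoded, and the replacement is a plain literal one, so nothing needs escaping. Like the one-liner, it only suits messages that sit on a single line.

```python
import glob
import os


def replace_in_catalogs(po_dir, before, after):
    """Literally replace `before` with `after` in every .po/.pot file.

    Returns the sorted list of files that were actually modified.
    """
    paths = (glob.glob(os.path.join(po_dir, "*.po"))
             + glob.glob(os.path.join(po_dir, "*.pot")))
    changed = []
    for path in sorted(paths):
        # newline="" keeps the existing line endings untouched.
        with open(path, encoding="utf-8", newline="") as handle:
            text = handle.read()
        updated = text.replace(before, after)
        if updated != text:
            with open(path, "w", encoding="utf-8", newline="") as handle:
                handle.write(updated)
            changed.append(path)
    return changed
```

The returned list makes it easy to compare what was touched against the output of "git diff" before committing.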

To make it easier to port your changes across multiple branches, it's a good idea to separate the changes in the source tree from the po/ ones. This way, you'll be able to quickly re-apply the source changes using normal cross-branch porting workflow (such as `git cherry-pick'), and then adjust and re-run the search-and-replace in the po/ on the new branch, because, most likely, it won't reapply due to the differences in the .po layout.

{{man note| Note |To stress it again, only do it for text change that didn't change how it is going to be translated. If you'd like your change to be somehow reflected in the translations, let the translators do the work instead.}}

==Textual reports==

Since Gramps 3.2 we are able to select the language for textual reports, see feature {{bug|2371}}.

In Gramps it was originally available only in the Ancestor report (3.2.x) and the detailed reports (3.3.x). The capability for translated output was added to some more (gramps core) reports in the gramps40 branch, before Gramps 4.0.0 was released. So more than the "three original reports" now have had this [https://gramps-project.org/bugs/view.php?id=2371#c33601 feature request implemented].

For providing this option:

self.__get_date(event.get_date_object())

self.__get_type(event.get_type())

[[Category:Translators/Categories]]

[[Category:Developers/Tutorials]]

[[Category:Developers/General]]

[[Category:Addons]]

Sam888

