Internationalization


Internationalization Architecture

'''This article applies to Gramps 4.x. Gramps 3.x and earlier do not have a central localization facility.'''

Internationalization and Localization

Applications like Gramps are used by people in many countries who read, write, and speak a multitude of languages. They also have different conventions for writing numbers and dates, and use different currencies. Languages are written in different alphabets or, in some cases, in ideographs, and they have different ways of expressing plurals and possession. The art of making an application usable by more than one culture is called internationalization (sometimes shortened to i18n), and the settings and data, including translations, needed for a particular culture are called localization (or l10n). This article explains the central internationalization facility in Gramps to help developers make their parts of the program and its addons usable around the world.

GrampsLocale: A Central Internationalization Facility

All internationalization is done through the GrampsLocale class in gramps/gen/utils/grampslocale.py. The first instance of GrampsLocale is constructed by gramps/gen/const.py early in program startup and configures itself from the environment. GrampsLocale uses an adapted Singleton pattern: any call to the GrampsLocale constructor with no parameters, or with parameters which match the environment, will return the first instance. Calls with parameters which don't match the environment will create a new GrampsLocale instance. For convenience, gramps/gen/const.py also retains a reference to that first instance named GRAMPS_LOCALE, and it's easier to import that than to import the GrampsLocale class and retrieve the first instance with the constructor. To illustrate:

 from gramps.gen.const import GRAMPS_LOCALE as glocale

vs.

 from gramps.gen.utils.grampslocale import GrampsLocale
 glocale = GrampsLocale()

Translators

The most common use for the GrampsLocale object is to retrieve the translator so that one can obtain localized messages. GrampsLocale creates an extension of Python's gettext.GNUTranslations class with an additional function, sgettext(). The translation functions are described in Coding for translation. When writing a new module one will usually alias one or more of them, with the most commonly used one named _:

 _ = glocale.translation.gettext

and you identify a translatable string literal with:

 print(_("Father's Name: %s") % p_name)

Further details may be found in Coding for translation.

Additional Message Catalogs

All of the strings in Gramps's core are contained in its own message catalog, but addons often add their own strings which need translation. It's up to the addon author to get these strings translated and to package the supplementary message catalogs with the addon. To get a translation object that will look for a supplementary message catalog, one uses get_addon_translator:

 _ = glocale.get_addon_translator(__file__).gettext

This will look for a message catalog called 'addon.mo' in the locale directory for glocale's language packaged with the addon. (__file__ is the Python variable holding the path to the current code file; get_addon_translator() looks for a locale subdirectory in the same directory as __file__.)

Again, more details are in Coding for translation.

Formatters

There's more to localization than just getting the right translation, of course. Americans would abbreviate the constant pi as 3.14, but most continental Europeans would write it 3,14. Dates are more complicated: not only are there a variety of ways to write them, but when writing an all-numbers date Americans perversely put the month first, so 5/1/2012 is 1 May in the USA and 5 January almost everywhere else. Gramps provides formatters and parsers for both dates and numbers. The date parser and displayer (formatter) are properties which can be retrieved with

 ddisp = glocale.date_displayer
 dparse = glocale.date_parser

There's also a convenience function, glocale.get_date(date), which returns a localized string for a date.

Numbers can be formatted using glocale.format() or glocale.format_string(), which simply wrap the corresponding functions in Python's locale module.
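To show the kind of function being wrapped, the sketch below calls the standard library's locale.format_string() directly. The helper format_number is hypothetical and not part of Gramps; it temporarily switches LC_NUMERIC and restores it afterwards. Only the "C" locale is guaranteed to exist, so any other locale name you pass is an assumption about your system.

```python
import locale


def format_number(fmt, value, locale_name="C"):
    """Format value under locale_name, then restore the previous setting.

    Hypothetical helper for illustration; locale.setlocale() raises
    locale.Error if locale_name is not installed on the system.
    """
    saved = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
        return locale.format_string(fmt, value, grouping=True)
    finally:
        locale.setlocale(locale.LC_NUMERIC, saved)


pi_text = format_number("%.2f", 3.14159)  # "C" locale uses a decimal point
```

On a system with a German locale installed, passing that locale's name instead of "C" would be expected to produce a decimal comma.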

Collation (Sort) Order

Everyone learned how to sort things in "alphabetical order" in school, right? Well, Chinese kids didn't, because they don't use an alphabet, but they do have a way of sorting strings. Most languages have their own ways of sorting "alphabetically" even if they use the same Roman alphabet that English speakers do. Python provides a sorted function, but unless you give it special localized keys, it will sort strings by Unicode code point, which matches English expectations only for unaccented letters of a single case. Python also provides locale.strxfrm to generate those keys, but it doesn't work very well, especially on Microsoft Windows. IBM provides a library, ICU, which does a much better job. You don't have to worry about that, though, because GrampsLocale hides the details from you. All you need to do is call

 sorted(string_list, key=glocale.sort_key)

string_list needs to be translated first, of course. If you need a strcmp-like comparison, use

 glocale.strcoll(_("string1"), _("string2"))
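To see why a sort key matters, the sketch below compares Python's default ordering with a deliberately crude key built from the standard library. crude_sort_key is a hypothetical stand-in: it is not ICU and not what glocale.sort_key does, it simply strips accents and ignores case, which is enough to show the effect of supplying a key.

```python
import unicodedata


def crude_sort_key(text):
    """Hypothetical key: drop combining accents and ignore case.

    Real collation rules are language specific; this is only a demo.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The original text is a tie-breaker so the ordering stays deterministic.
    return (base.casefold(), text)


words = ["zebra", "Äpfel", "apple"]
by_code_point = sorted(words)                    # "Ä" sorts after "z"
by_crude_key = sorted(words, key=crude_sort_key)
```

With the default ordering "Äpfel" lands after "zebra" because its first character has a higher code point; with the key it sorts among the other words beginning with "a".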

Alternate Locales

Reports can offer the option of an alternate locale to translate the Gramps-provided strings and to present dates and numbers appropriately. This is accomplished by obtaining a new GrampsLocale object with a different value of lang:

 new_locale = GrampsLocale(lang='pt_BR')

and then setting _ and any needed formatters to the new_locale functions instead of the GRAMPS_LOCALE ones. One can populate a menu with a localized list of available locales using GrampsLocale.get_language_dict().

Convenience functions stdoptions.add_localization_option() in gramps.gen.plug.reports and Report.set_locale() make it quite simple and eliminate the need for a dependency on GrampsLocale in your report code. gramps/plugins/textreports/ancestorreport.py provides a usage example. Note that in the case of these text reports the Narrator class does the actual work.

Implementation

That Singleton Thing

A singleton isn't used much in Python. Normally if you want a class with only one instance you don't even define a class, you just write a module that does what you want. In less flexible languages like C++ or Java, class is the overarching paradigm and a singleton pattern is the way to make sure that there's only one. It works like this:

 class Singleton(object):
     _instance = None

     def __new__(cls):
         # Create the single shared object on the first call only.
         if cls._instance is None:
             cls._instance = super(Singleton, cls).__new__(cls)
         return cls._instance

The first time Singleton() is called, _instance is None, so it sets _instance to a new object by calling super().__new__() and returns the result. After that, _instance is no longer None, so it just returns the same object.

GrampsLocale allows the option to create more than one object, but it wants there to be only one default, so it uses the singleton pattern, but checks the parameters supplied to the constructor. If they're not the same as the ones belonging to _instance (actually called __first_instance since there can be more than one instance) then it goes ahead and creates a new object.

There's another twist and another variable to control it: Python uses two-step construction. __new__() makes the object in memory and __init__() sets up the object's instance variables. Setting up a GrampsLocale from the environment is expensive and has side effects, so we want to do it only once. So __new__() sets a special instance variable, initialized, which __init__() checks; if it exists and has already been set, __init__() returns without doing anything. Otherwise, only if self is the __first_instance does __init__() run the appropriate (for the platform) first instance initializer.
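The two mechanisms just described, parameter-checked reuse in __new__() and a guard flag in __init__(), can be sketched together as follows. SketchLocale, ENV_LANG, expensive_setups and _init_from_environment are all hypothetical names used for illustration; the real GrampsLocale takes more parameters and does real environment probing.

```python
class SketchLocale:
    """Hypothetical stand-in that models only the construction logic."""

    _first_instance = None
    ENV_LANG = "en_US"      # pretend this came from the environment
    expensive_setups = 0    # counts how often the costly setup runs

    def __new__(cls, lang=None):
        first = cls._first_instance
        if first is None:
            # Very first construction: remember the object as the default.
            first = super().__new__(cls)
            first.initialized = False
            cls._first_instance = first
            return first
        if lang is None or lang == first.lang:
            return first                # default or matching request
        return super().__new__(cls)     # different request: new object

    def __init__(self, lang=None):
        # Python calls __init__ after every __new__, even when __new__
        # returned an existing object, so guard against re-running setup.
        if getattr(self, "initialized", False):
            return
        if self is type(self)._first_instance:
            self._init_from_environment(lang)
        else:
            self.lang = lang
        self.initialized = True

    def _init_from_environment(self, lang):
        type(self).expensive_setups += 1
        self.lang = lang or self.ENV_LANG


default = SketchLocale()
again = SketchLocale()
matching = SketchLocale(lang="en_US")
other = SketchLocale(lang="pt_BR")
```

Here default, again and matching all refer to the same object, other is a separate one, and the expensive setup runs exactly once.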

First Instance

There are two constraints driving the first instance model:

  • The Gtk toolkit, which underpins the GUI, sets its localization at startup.
  • Obtaining the localization parameters from the operating system involves expensive operations like file access, directory searches, and probes of structures like the Windows Registry and Mac OS X System Preferences.

Consequently, the first time (and only the first time) a GrampsLocale is initialized without arguments, that is, when it is called as foo = GrampsLocale(), it runs the first instance initializer mentioned above, __init_first_instance(), which queries the environment and sets up the default GrampsLocale. That is normally done in gramps/gen/const.py, imported very early in grampsapp.py. Subsequent calls to GrampsLocale() without arguments return the same instance, as do calls to GrampsLocale() with arguments that match those set in the environment. Calls to GrampsLocale with different arguments will return a different object, which is useful when one wants to localize something like a report differently from the UI.

Translation Objects

grampslocale.py also provides two subclasses of the Python gettext translation classes, simply because the standard classes don't provide sgettext. GrampsTranslations extends gettext.GNUTranslations and actually retrieves translated strings from the message catalogs, while GrampsNullTranslations subclasses gettext.NullTranslations and just returns the applicable part of the original string (the msgid).
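A minimal sketch of such a subclass is shown below. It assumes that sgettext() takes a msgid of the form "context|message" and that, when no translation is found, the text after the last separator is returned; the class name SketchNullTranslations and the sep parameter are hypothetical, and the real Gramps classes may differ in detail.

```python
import gettext


class SketchNullTranslations(gettext.NullTranslations):
    """Hypothetical null translator with an added sgettext()."""

    def sgettext(self, msgid, sep="|"):
        msgval = self.gettext(msgid)
        if msgval == msgid:
            # Untranslated: strip the context prefix, if there is one.
            # rfind() returns -1 when sep is absent, so the slice then
            # starts at 0 and keeps the whole string.
            msgval = msgid[msgid.rfind(sep) + 1:]
        return msgval


_ = SketchNullTranslations().sgettext
with_context = _("book|Title")
without_context = _("Title")
```

The context prefix lets translators distinguish identical English strings that need different translations, while users of the untranslated program never see it.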

NB: Both always return Unicode strings in both Python 2 and Python 3. That's transparent in Python 3, where all strings are Unicode, but often requires encoding for output in Python 2.