
Coding for translation

5,845 bytes added, 23:13, 6 April 2014
For portable string handling on all platforms and for all locales
Coding guidelines to enable easy and correct translation of strings on the User Interface.
[[Category:Translators/Categories]][[Category:Developers/Tutorials]][[Category:Developers/General]]
==Introduction==
Gramps has always been internationalized (see
=== Encoding ===
String handling can be a bit tricky in a localized environment, so it's important that developers understand Unicode string handling in both Python 2 and Python 3. This is mostly a problem for Microsoft Windows™: Mac OSX and Linux use UTF8 for just about everything if the locale is set up correctly (and we try to do that when Gramps starts up), so one can get away with a lot of encoding mistakes on those platforms. Windows™, on the other hand, uses a slightly modified version of UTF16 for file names and retains the old DOS [http://en.wikipedia.org/wiki/Code_page code page] system for encoding output to cmd.exe. The take-away is that if you need to mess with input or output encoding, be sure to test on both Linux and Windows before deciding that you're done. If you're not set up for multiple-platform testing, arrange with someone who can test for you on the platform you don't have.

====Python 2====
Python 2.7 has two text classes, <tt>[https://docs.python.org/2.7/library/functions.html#str str]</tt> and <tt>[https://docs.python.org/2.7/library/functions.html#unicode unicode]</tt>. <tt>unicode</tt> objects are encoded in UTF16 internally on most platforms, and most Python output functions will do the right thing with them. One caveat here: passing both <tt>unicodes</tt> and <tt>strs</tt> to <tt>os.path.join()</tt> will return a <tt>str</tt>, so either make sure when constructing a path that all arguments are <tt>unicodes</tt> or convert the result.

The bsddb module that ships with Python 2 is stupid about paths and requires that they be strings encoded in the file system encoding. This is handled in gramps/gen/db/write.py with _encode() and independently in a few other places.
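The advice about path construction can be sketched as follows. This is a minimal illustration, not Gramps code: the helper names <tt>to_text</tt> and <tt>join_as_text</tt> are hypothetical, and the sketch assumes that byte-string path components are encoded in the file system encoding.

```python
from __future__ import unicode_literals

import os
import sys


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return value as a Unicode string.

    Byte strings (str in Python 2, bytes in Python 3) are decoded;
    anything else is returned unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def join_as_text(*parts):
    """Hypothetical helper: join path components after making each one Unicode."""
    # Assumption: byte-string components use the file system encoding.
    fs_encoding = sys.getfilesystemencoding() or "utf-8"
    return os.path.join(*[to_text(part, fs_encoding) for part in parts])


# A mix of Unicode and byte-string components yields a Unicode path.
path = join_as_text("data", b"trees", "Caf\u00e9")
```

Converting every component before the join avoids relying on whatever type <tt>os.path.join()</tt> happens to return for mixed arguments.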
Strings from the operating system, including environment variables, are a problem on Windows™. For input, the os module uses the [http://msdn.microsoft.com/en-us/library/windows/desktop/dd317766%28v=vs.85%29.aspx ANSI API] of the Windows SDK, which interprets the value of the environment variable according to the active code page and produces a <tt>str</tt>, converting any codepoints > 0xff to ? and often misinterpreting those between 0x7f and 0xff if the encoding of the input happens to be something other than the active system codepage. Once this is done it is quite difficult to get non-ASCII pathnames back into a usable form, so gramps/gen/constfunc.py provides a get_env_var() function that uses the [http://msdn.microsoft.com/en-us/library/windows/desktop/dd374081(v=vs.85).aspx Unicode API] instead. Always use that function to read environment variables which might include non-ASCII characters, and avoid using os-module functions for reading paths.
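For illustration, a function of this kind could look roughly like the sketch below. It is not the Gramps implementation: the name <tt>read_env_var</tt> is hypothetical, and the Windows branch is an untested assumption built on the Win32 call <tt>GetEnvironmentVariableW</tt> reached through <tt>ctypes</tt>.

```python
import ctypes
import os
import sys


def read_env_var(name, default=None):
    """Hypothetical sketch: read an environment variable as Unicode text.

    On Windows under Python 2 the wide-character API is called directly;
    everywhere else the value comes from os.environ.
    """
    if sys.platform != "win32" or sys.version_info[0] >= 3:
        return os.environ.get(name, default)

    # Python 2 on Windows only: the W API needs a Unicode variable name.
    if isinstance(name, bytes):
        name = name.decode("ascii")
    kernel32 = ctypes.windll.kernel32
    # With a zero-sized buffer the call returns the required size in
    # characters (including the terminator), or 0 if the variable is unset.
    size = kernel32.GetEnvironmentVariableW(name, None, 0)
    if size == 0:
        return default
    buf = ctypes.create_unicode_buffer(size)
    kernel32.GetEnvironmentVariableW(name, buf, size)
    return buf.value
```

On Linux and Mac OSX the function is just a thin wrapper around <tt>os.environ.get()</tt>, which is why callers can use it unconditionally.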
By default string constants in Python 2 are <tt>str</tt>.

====Python 3====
Python 3 also provides two text classes, <tt>[https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str str]</tt> and <tt>[https://docs.python.org/3/library/stdtypes.html#binary-sequence-types-bytes-bytearray-memoryview bytes]</tt>. In Python 3, <tt>str</tt> is the unicode type and <tt>bytes</tt> is text encoded some other way. Everything pretty much "just works".
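A small sketch of the Python 3 distinction between <tt>str</tt> and <tt>bytes</tt>; the sample word is arbitrary and only chosen because it contains one non-ASCII character.

```python
# Python 3: str holds Unicode code points, bytes holds encoded data.
text = "Caf\u00e9"            # four code points
data = text.encode("utf-8")   # five bytes: the accented letter needs two

# Decoding with the same encoding restores the original text.
round_trip = data.decode("utf-8")
```

The two types do not mix implicitly in Python 3, so the conversion always has to be spelled out with <tt>encode()</tt> or <tt>decode()</tt>.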
====Portability Functions and constants====
We've provided a couple of functions in gramps/gen/constfunc.py to ease conversion of <tt>strs</tt> to <tt>unicodes</tt>; these include the necessary tests to portably do the right thing regardless of what's passed to them and according to which version of Python is in use:
* <tt>cuni</tt> is an alias for [https://docs.python.org/2.7/library/functions.html#unicode unicode] in Python 2 and for [https://docs.python.org/3/library/functions.html#func-str str] in Python 3. This has no protective checks, so use it with care.
* <tt>conv_to_unicode(string, encoding='utf8')</tt>: This ensures that its return value is a Unicode string, converting a non-Unicode string from the given <tt>encoding</tt>, which defaults to UTF8 for ease of use with the GUI.
* <tt>get_env_var(string, default=None)</tt>: On Windows™ in Python 2, uses the <tt>ctypes</tt> module to invoke the Microsoft Unicode API to read the value of an environment variable and return a Unicode string; otherwise returns the value from the <tt>os.environ</tt> array.
There are also two constants:
* <tt>STRTYPE</tt> is an alias for [https://docs.python.org/2.7/library/functions.html#basestring basestring] in Python 2 and for [https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str str] in Python 3. It can be used to test whether an object is a text type.
* <tt>UNITYPE</tt> is an alias for [https://docs.python.org/2.7/library/functions.html#unicode unicode] in Python 2 and for [https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str str] in Python 3. It can be used to test whether an object is already a Unicode string.

====For portable string handling on all platforms and for all locales====
* Localized strings returned from gettext, ngettext, etc. are always unicode.
* Text files should always be encoded in UTF8. The easy and portable way to do this is to:
*: <pre>import io</pre>
*: <pre>fh = io.open(filepath, mode, encoding='utf8')</pre>
*: where ''mode'' is one of r, w, r+, or w+. ''Don't open these files in binary mode!'' Pass unicode-type strings to fh.write() and expect the same from fh.read().
* Always read environment variables with <tt>constfunc.get_env_var()</tt> if there's any chance that the value will contain a non-ASCII character.
* Use <tt>from __future__ import unicode_literals</tt> in any source file which might present strings to the user or to the operating system.
*:When creating string literals, '''don't do this:'''
*:<pre>print _(u"Eg, valid values are 12.0154, 50° 52′ 21.92″N")</pre>
*:because the <tt>u</tt> prefix was removed in Python 3.0-3.2. (It was restored in 3.3 for compatibility with 2.7, but it's not necessary.)
*:Instead, put in the first line of the module
*:<pre># -*- coding: utf-8 -*-</pre>
*:then in the imports section
*:<pre>from __future__ import unicode_literals</pre>
*:which makes all of the literals unicode. '''Make sure that your editor is set up to save utf-8!'''
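The text-file rule can be sketched as a small round trip that behaves the same under Python 2.7 and Python 3. The file name and the temporary directory are illustrative assumptions only.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

import io
import os
import tempfile

# A Unicode literal with non-ASCII characters (degree, prime, double prime).
text = "50\u00b0 52\u2032 21.92\u2033N\n"

# Illustrative location only: a scratch file in a fresh temporary directory.
tmpdir = tempfile.mkdtemp()
filepath = os.path.join(tmpdir, "example.txt")

# Text mode with an explicit encoding: write Unicode in, read Unicode back.
with io.open(filepath, "w", encoding="utf8") as fh:
    fh.write(text)

with io.open(filepath, "r", encoding="utf8") as fh:
    read_back = fh.read()

# Clean up the scratch file and directory.
os.remove(filepath)
os.rmdir(tmpdir)
```

Because the encoding is named explicitly, the result does not depend on the locale or on the active Windows code page.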
===Glade files===
* [[Translation_environment20|2.0.x]]
* [[Translation_environment22|2.2.x to Gramps 3.4.x]]
* [[Translation_environment4|Gramps 4.0.x to master (trunk)]]
There are two stages to getting a translation to work.
* [[Translation_environment22#Files_and_directory|2.2.x to Gramps 3.4.x]]
* [[Translation_environment4#Files_and_directory|Gramps 4.0.x, master (trunk)]]
==Tips for writing a translatable Python module==
For messages short enough that they don't span multiple lines in the *.po files, you can do it by executing
 perl -pi -e 's/YOUR MESSAGE BEFORE CORRECTION/your message after correction/g;' *.po *.pot
in the po/ directory. Make sure you do a "git diff" and check that the results make sense. (You'll probably have to escape some characters in the regular expression, such as | or .) To make it easier to port your changes across multiple branches, it's a good idea to separate the changes in the source tree from the po/ ones. This way, you'll be able to quickly re-apply the source changes using the normal cross-branch porting workflow (such as <tt>git cherry-pick</tt>), and then adjust and re-run the search-and-replace in po/ on the new branch, because the po/ commit most likely won't reapply cleanly due to differences in the .po layout.
'''NOTE: '''to stress it again, only do this for a text change that doesn't change how the message is going to be translated. If you'd like your change to be somehow reflected in the translations, let the translators do the work instead.
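If perl is not at hand, the same search-and-replace can be sketched in Python. This is only an illustration: the function name <tt>replace_in_po_files</tt> is hypothetical, the files are assumed to be UTF-8 encoded, and the match is a plain literal string rather than a regular expression, so no escaping is needed.

```python
import glob
import io
import os


def replace_in_po_files(po_dir, old, new):
    """Hypothetical helper: replace a literal message in every .po/.pot file.

    Returns the sorted list of files that were actually rewritten.
    """
    paths = glob.glob(os.path.join(po_dir, "*.po"))
    paths += glob.glob(os.path.join(po_dir, "*.pot"))
    changed = []
    for path in sorted(paths):
        # newline="" keeps the existing line endings untouched.
        with io.open(path, "r", encoding="utf8", newline="") as fh:
            content = fh.read()
        if old not in content:
            continue
        with io.open(path, "w", encoding="utf8", newline="") as fh:
            fh.write(content.replace(old, new))
        changed.append(path)
    return changed
```

As with the perl one-liner, review the result with "git diff" before committing.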
self.__get_date(event.get_date_object())
self.__get_type(event.get_type())
 
