Localization

Over time, a number of people have come to me and asked
about the possibility of localizing RT into a number of languages.
For the most part, I’ve asked them to hold off for 2.0. Now
we’re in the middle of the development cycle for RT 2.0 and I’ve
actually started looking at localization. Here’s the “80% solution”
I’m going to propose. I’d like to hear from you if you don’t think this
will meet your needs. Note that I am not promising to be all things
to all people. I just want to make sure that I’m not nothing to all people
;)

So, here's what I propose:

* Allowing the mail templates to be localized to whatever
  language you want, exactly as they are now.
* Using Locale::PGetText to localize the core, CLI, and
  web UIs. Note that this would mean that _each user_ could
  choose to use RT in whatever language they are most comfortable
  working in.
* Not dealing with localized sorting and comparisons. It's just
  more work than I really want to undertake right now, and I think
  it would lead to a lot of complexity that I don't want to deal with
  yet.
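
To make the per-user idea in the second item concrete, here is a minimal sketch of a message lookup keyed by each user's language. It does not use the Locale::PGetText API; the %lexicon hash, the loc() helper, and the sample user records are all hypothetical names invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical lexicons, keyed by language code, then by the English text.
my %lexicon = (
    de => {
        'Owner'   => 'Besitzer',
        'Subject' => 'Betreff',
    },
);

# Translate one message for one user's language. Unknown languages and
# untranslated messages fall back to the original English text.
sub loc {
    my ($lang, $msgid) = @_;
    my $messages = $lexicon{$lang} || {};
    return exists $messages->{$msgid} ? $messages->{$msgid} : $msgid;
}

# Each (hypothetical) user record carries its own language preference.
my @users = (
    { name => 'jesse', lang => 'en' },
    { name => 'anna',  lang => 'de' },
);
for my $user (@users) {
    my @labels = map { loc($user->{lang}, $_) } ('Owner', 'Subject', 'Queue');
    print "$user->{name} sees: @labels\n";
}
```

Falling back to the English text means a partial translation still yields a usable interface, which matters while the strings are in flux.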

The most controversial part of that, and the part that I'm really just not sure
about, is the web UI. I’d like to do everything in my power to not have to
build multiple parallel localized versions of the web UI. That would
just get way too cumbersome really quickly. So, the ever-important question:

Do we have any l10n experts in our midst? Anyone care to impart some advice?
Jesse

jesse reed vincent – jrvincent@wesleyan.edu – jesse@fsck.com
pgp keyprint: 50 41 9C 03 D0 BC BC C8 2C B9 77 26 6F E1 EB 91
And I’m told we do share some common rituals. Our “flame war” is apparently
held in person in their land and called “project meeting”.
-Alan Cox [on “Suits”]

Over time, a number of people have come to me and asked
about the possibility of localizing RT into a number of languages.
For the most part, I’ve asked them to hold off for 2.0. Now
we’re in the middle of the development cycle for RT 2.0 and I’ve
actually started looking at localization. Here’s the “80% solution”
I’m going to propose. I’d like to hear from you if you don’t think this
will meet your needs. Note that I am not promising to be all things
to all people. I just want to make sure that I’m not nothing to all people
;)

So, here’s what I propose:

  • Allowing the mail templates to be localized to whatever
    language you want, exactly as they are now.

  • Using Locale::PGetText to localize the core, CLI, and
    web UIs. Note that this would mean that each user could
    choose to use RT in whatever language they are most comfortable
    working in.

  • Not dealing with localized sorting and comparisons. It’s just
    more work than I really want to undertake right now, and I think
    it would lead to a lot of complexity that I don’t want to deal with
    yet.

That sounds great. I think sorting isn’t so important, but comparisons
will be (in database or knowledge base searches). Free-form texts and requests
from users come in different MIME encodings, and their browsers may use
yet other ones, so correct keyword searching across encodings needs to be
supported for this to work well.
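
One way to read the point about comparisons: decode both the stored text and the search keyword from whatever charset they arrived in, and only then compare. The sketch below assumes the Encode module (core in Perl 5.8 and later); the helper names normalize_text() and keyword_matches() are hypothetical, as is the Czech sample text.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Decode raw bytes from a declared charset into a Perl character string,
# then lowercase it so the comparison ignores case.
sub normalize_text {
    my ($bytes, $charset) = @_;
    return lc(decode($charset, $bytes));
}

# True if the keyword occurs in the text, whatever encoding each arrived in.
sub keyword_matches {
    my ($text_bytes, $text_charset, $kw_bytes, $kw_charset) = @_;
    my $text    = normalize_text($text_bytes, $text_charset);
    my $keyword = normalize_text($kw_bytes,   $kw_charset);
    return index($text, $keyword) >= 0 ? 1 : 0;
}

# A mail body in ISO-8859-2 starting with a capitalized Czech word ...
my $mail_body = "\xAE\xE1dost o pomoc";
# ... and the same word in lowercase, sent as UTF-8 by a browser form.
my $query = "\xC5\xBE\xC3\xA1dost";

print keyword_matches($mail_body, 'iso-8859-2', $query, 'UTF-8')
    ? "match\n" : "no match\n";
```

Comparing the raw bytes directly would fail here, because the same characters are represented by different byte sequences in the two encodings.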

The most controversial part of that, and the part that I'm really just not sure
about, is the web UI. I’d like to do everything in my power to not have to
build multiple parallel localized versions of the web UI. That would
just get way too cumbersome really quickly. So, the ever-important question:

I’m not an l10n expert, but I hope there will be some ‘resource’ files with
localized texts for a given version of the UI. I think that some
versioning will be needed to make room for new patches from RT
developers around the world (these couldn’t be localized just in time by
them ;-).

Do we have any l10n experts in our midst? Anyone care to impart some advice?
Jesse

Rt-devel mailing list
Rt-devel@lists.fsck.com
http://lists.fsck.com/mailman/listinfo/rt-devel

Jan Okrouhly
-----------------------------------------+---–okrouhly@civ.zcu.cz—
Laboratory for Computer Science | phone: (420 19) 7491588
University of West Bohemia | location: Univerzitni 22
Americka 42, 306 14 Pilsen, Czech Republic | room: UI404
------------------------------------------73!-de-OK1INC@OK0PPL.#BOH.CZE.EU-

Below are my comments, for what it's worth.

As quoted from Jan Okrouhly:

The most controversial part of that, and the part that I'm really just not sure
about, is the web UI. I’d like to do everything in my power to not have to
build multiple parallel localized versions of the web UI. That would
just get way too cumbersome really quickly. So, the ever-important question:

I’m not an l10n expert, but I hope there will be some ‘resource’ files with
localized texts for a given version of the UI. I think that some
versioning will be needed to make room for new patches from RT
developers around the world (these couldn’t be localized just in time by
them ;-).

Resource files for things like error messages will be very useful.

Do we have any l10n experts in our midst? Anyone care to impart some advice?

I don’t know whether you are planning to do so already, but consider
using templates for the Web UI as well. That way, people can localize
what they want, and even change the Web UI to their taste, without
a need to change the logic.

You will need a template processor to fill in the templates with
dynamic information (variables, query results). At a minimum, the template
processor needs to replace variables. In the examples below I use a
variable representation of ‘$xxx.yyy$’, but any will do.

There are at least two ways to approach this:

  • Create a template per HTML page.
    This will require some special processing instructions within the
    template to indicate database queries, and most likely, conditional
    statements. E.g. (syntax is arbitrary, choose anything that can
    be easily parsed in perl)

    some HTML text …
    #if $error.submit$
    You’ve submitted something illegal.
    #endif

    ....
    nr  name
    #query name=overview
    $overview.nr$  $overview.name$
    #endquery

    Within the logic you define the exact implementation (e.g. a SELECT
    statement on the database) for the query with name “overview”.

    There are many template processors around that do this kind of
    thing. Some require database queries to be specified in the
    template, but I consider this bad practice. (Business) logic and design
    should be separated. The design should only need design-logic
    and only if really necessary.

  • Create a template per HTML snippet.
    This will prevent the need for a #query type processing instruction,
    but you may get more template files to maintain, and maintaining
    the overview may be more difficult.
    An advantage may be that HTML snippets may be reused, e.g.
    a common header and/or footer.
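
As a rough illustration of the minimum such a processor has to do, here is a sketch that handles the ‘$xxx.yyy$’ variables and a non-nested #if/#endif block from the example above. The function name render_template() and the nested-hash layout for the variables are assumptions made for this sketch; #query is left out, and the "syntax is arbitrary" caveat applies here too.

```perl
use strict;
use warnings;

# Render a template line by line: "#if $a.b$" ... "#endif" blocks are kept
# or dropped, then "$a.b$" variables are filled in from a nested hash.
# Nested #if blocks are not supported in this sketch.
sub render_template {
    my ($template, $vars) = @_;

    my $lookup = sub {
        my ($group, $key) = @_;
        my $value = $vars->{$group}{$key};
        return defined $value ? $value : '';
    };

    my @out;
    my $skipping = 0;
    for my $line (split /\n/, $template) {
        if ($line =~ /^\s*#if\s+\$(\w+)\.(\w+)\$\s*$/) {
            $skipping = $lookup->($1, $2) ? 0 : 1;
            next;
        }
        if ($line =~ /^\s*#endif\s*$/) {
            $skipping = 0;
            next;
        }
        next if $skipping;
        $line =~ s/\$(\w+)\.(\w+)\$/$lookup->($1, $2)/ge;
        push @out, $line;
    }
    return join("\n", @out) . "\n";
}

my $template = join "\n",
    'Hello $user.name$,',
    '#if $error.submit$',
    'You have submitted something illegal.',
    '#endif',
    'Ticket $ticket.nr$: $ticket.name$';

print render_template($template, {
    user   => { name => 'Marco' },
    error  => { submit => 0 },
    ticket => { nr => 42, name => 'Printer on fire' },
});
```

Because the template only names variables and conditions, the logic that fills the hash can change (or be translated) without touching the page design, which is the separation argued for above.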

Kind regards,
– Marco Nijdam, marco@west.nl
– West Consulting bv, Bagijnhof 80, 2611 AR Delft, The Netherlands
– P.O. Box 3318, 2601 DH Delft
– Tel: +31 15 219 1600, Fax: +31 15 214 7889

  • Allowing the mail templates to be localized to whatever
    language you want. exactly as they are now.

We should also have a “Language” field in the database.

This is what I find most important when it comes to l10n, because it is
the “external requestors”, and not our support workers, who need
translations. At least in our case, the email traffic is all the external
requestors will see, at least for now.

  • Using Locale::PGetText to localize the core, cli AND
    web uis. Note that this would mean that each user could
    choose to use RT in whatever language they are most comfortable
    working in.

Well, I haven’t tried this, but according to the freeciv-devel
mailing list, there are certain problems with gettext. You never get all
those grammatical cases and endings right. For example, in a context like
“4 tickets shown” and “1 ticket shown”, it’s common to separate singular and
plural, but that’s not enough. From what I’ve heard, languages like
Polish also have a dual, etc., meaning the ending would be different for
1, 2, 4 and 46.

In other examples, some languages need completely different adjustments
that the original author didn’t think of. In examples like “This request is
owned by you” vs. “This request is owned by Jesse”, there might be changes
in the rest of the sentence in different languages.
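
A small sketch of why a singular/plural split is not enough: each language needs its own rule for picking a form from a count. The Polish rule below is the three-form rule commonly given for Polish in gettext plural-form headers (it distinguishes three forms rather than a true dual); the category names 'one', 'few', 'many', and 'other' and the helper category_for() are labels chosen for this illustration.

```perl
use strict;
use warnings;

# Per-language rules mapping a count to a plural category.
my %plural_category = (
    en => sub {
        my ($n) = @_;
        return $n == 1 ? 'one' : 'other';
    },
    pl => sub {
        my ($n) = @_;
        return 'one' if $n == 1;
        my ($last, $last_two) = ($n % 10, $n % 100);
        return 'few'
            if $last >= 2 && $last <= 4 && !($last_two >= 12 && $last_two <= 14);
        return 'many';
    },
);

# English message forms; a Polish lexicon would supply three forms
# keyed 'one', 'few', and 'many' instead.
my %tickets_shown = (
    one   => '%d ticket shown',
    other => '%d tickets shown',
);

sub category_for {
    my ($lang, $n) = @_;
    return $plural_category{$lang}->($n);
}

for my $n (1, 2, 4, 46) {
    printf "%-16s (pl category: %s)\n",
        sprintf($tickets_shown{ category_for('en', $n) }, $n),
        category_for('pl', $n);
}
```

This only addresses counts. Sentences such as “owned by you” versus “owned by Jesse” need whole-sentence variants per language, which a rule on a number cannot provide.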

Tobias Brox (alias TobiX) - +4722925871 - urgent emails to
sms@tobiasb.funcom.com. Check our upcoming MMORPG at
http://www.anarchy-online.com/ (Qt) and play multiplayer Spades,
Backgammon, Poker etc for free at http://www.funcom.com/ (Java)

I don’t know whether you are planning to do so already, but consider
using templates for the Web UI as well.

We are using HTML::Mason in the devel version.

Tobias Brox (alias TobiX) - +4722925871 - urgent emails to
sms@tobiasb.funcom.com. Check our upcoming MMORPG at
http://www.anarchy-online.com/ (Qt) and play multiplayer Spades,
Backgammon, Poker etc for free at http://www.funcom.com/ (Java)

So we’re rapidly approaching the time when I’d like to get RT translated into
a whole bunch of languages. We’ve currently got Traditional Chinese and German
as test languages.

I’d very much like to add Russian or another Slavic
language to the set, to make sure we can deal with Russian grammar properly with
the current framework. Ideally, I’d like something that uses a Cyrillic
character set for this first stab.

I don’t want to have a lot of translations come in right now, as things are
still somewhat in flux, but if you’re interested in translating RT into your
native language, please drop me personal mail at jesse@bestpractical.com,
with a subject of “Translation: ”. That way, I can coordinate
and make sure that we don’t get six translations into Klingon ;)

Thanks,

Jesse

http://www.bestpractical.com/products/rt – Trouble Ticketing. Free.

So we’re rapidly approaching the time when I’d like to get RT translated into
a whole bunch of languages. We’ve currently got Traditional Chinese and German
as test languages.

Will RT 2.2 use content negotiation to determine the language? In an
international environment like I’m working in, one fixed language per
installation would not be very useful.

Fabian

Is the database already configured to use UTF-16/Unicode in its tables?

Thanks,
Christian

Christian Gilmore
Technology Leader
GeT WW Global Applications Development
IBM Software Group

The next release of RT (which may get a number a bit higher than 2.2, due
to the major changes behind the scenes) uses browser-based content
negotiation to set the language. I can currently run a local instance
with Mozilla speaking German, IE speaking Chinese, and OmniWeb speaking
English.
-j

On Thu, Jun 06, 2002 at 03:44:00PM +0300, Fabian Ritzmann wrote:

On Thu, 2002-06-06 at 08:03, Jesse Vincent wrote:

So we’re rapidly approaching the time when I’d like to get RT translated into
a whole bunch of languages. We’ve currently got Traditional Chinese and German
as test languages.

Will RT 2.2 use content negotiation to determine the language? In an
international environment like I’m working in, one fixed language per
installation would not be very useful.

Fabian
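
For readers wondering what browser-based negotiation involves, here is a rough sketch of picking a language from an HTTP Accept-Language header. It is not RT's actual code; negotiate_language(), the supported-language list, and the default are assumptions made for illustration.

```perl
use strict;
use warnings;

# Pick the best supported language for an Accept-Language header value.
# Falls back from a regional tag such as "de-AT" to "de", then to $default.
sub negotiate_language {
    my ($header, $supported, $default) = @_;
    $header = '' unless defined $header;
    $header =~ s/^\s+|\s+$//g;

    my %have = map { (lc($_) => $_) } @$supported;

    my @wanted;
    my $order = 0;
    for my $item (split /\s*,\s*/, $header) {
        my ($tag, @params) = split /\s*;\s*/, $item;
        next unless defined $tag && length $tag;
        my $q = 1;    # a missing q parameter means full preference
        for my $param (@params) {
            $q = $1 if $param =~ /^q=(\d(?:\.\d+)?)$/i;
        }
        push @wanted, { tag => lc($tag), q => $q, order => $order++ };
    }

    # Highest q first; ties keep the order the browser sent them in.
    my @by_preference = sort {
        $b->{q} <=> $a->{q} || $a->{order} <=> $b->{order}
    } @wanted;

    for my $want (@by_preference) {
        next if $want->{q} <= 0;
        my $tag = $want->{tag};
        return $have{$tag} if exists $have{$tag};
        (my $primary = $tag) =~ s/-.*//;
        return $have{$primary} if exists $have{$primary};
    }
    return $default;
}

my @supported = ('en', 'de', 'zh-tw');
for my $header ('de-AT, en;q=0.5', 'zh-TW;q=0.9, en;q=0.8', 'fr') {
    printf "%-24s => %s\n", $header,
        negotiate_language($header, \@supported, 'en');
}
```

Since the header arrives with every request, each browser can get its own language from a single installation, which is the scenario Fabian asks about.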


Postgres can be coaxed into it at install time. Sadly, we’re still waiting
on MySQL, though I believe that the current state of things works properly
for storage and retrieval and just falls down on sorting.
It looks like support for two-byte charsets may end up requiring
Perl 5.8 for proper Unicode support. We’ve ended up going with
UTF-8 for things for now, as it dovetails with non-utf-supporting pieces
somewhat more nicely.

On Thu, Jun 06, 2002 at 09:15:48AM -0500, Christian Gilmore wrote:

Is the database already configured to use UTF-16/Unicode in its tables?

Thanks,
Christian

What in perl-5.6.1 isn’t able to support unicode? We’ve not yet found
issues, but I’d very much like to hear if there are.

Thanks,
Christian

What in perl-5.6.1 isn’t able to support unicode? We’ve not yet found
issues, but I’d very much like to hear if there are.

There’s no problem per se in 5.6.1 if the MTAs, lexicon files, and
databases all support utf8 natively. However, Encode.pm is probably
needed for transparently sending and receiving data in legacy
charsets while maintaining Unicode internally.

For western languages that need relatively little esoteric Unicode conversion
and processing, 5.6.1 + Text::Iconv could handle it pretty well (although
5.6.0 is seriously broken beyond belief). I think that we should avoid
burdening western-language users with a 5.8 prereq, but in the long term,
core-supported PerlIO and Encode are probably the way to go.

Thanks,
/Autrijus/

I’m sorry to come into this conversation midway, but what is Encode.pm and
where can I get a copy of it? Is it available on CPAN?

Yes. But currently, you’ll need perl v5.7.3 or later (5.8-RC is preferred)
to use it. I am aware of no existing efforts to backport it to 5.6.x.

Encode.pm is behind most of the PerlIO-layer features, which are one of the main
advantages of 5.8 over previous perls. For example:

open my $fh, '<:encoding(euc-jp)', $jpn_file
    or die "Cannot open $jpn_file: $!";
print <$fh>; # lines arrive already decoded into utf8 strings

I’ve got an interesting problem that might be solved for now by a library that
can take a byte string encoded in an odd encoding, say EUC_JP, and stuff it
into the proper Perl UTF-8 string.

Text::Iconv does that. But Encode.pm takes care of some more esoteric encodings,
as well as providing tighter perl-level integration. For example, length() and
regexen are all capable of processing utf8 strings.
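
A short sketch of what that integration looks like in practice, assuming Perl 5.8 or later with the core Encode module. The sample is a three-character Japanese word given as raw EUC-JP bytes; the variable names are arbitrary.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Three characters (U+65E5 U+672C U+8A9E) encoded as six EUC-JP bytes.
my $bytes = "\xC6\xFC\xCB\xDC\xB8\xEC";
my $chars = decode('euc-jp', $bytes);

printf "bytes:      length %d\n", length $bytes;
printf "characters: length %d\n", length $chars;

# A regex dot matches one whole character in the decoded string,
# not one byte of it.
my $middle_is_one_char = $chars =~ /^\x{65E5}.\x{8A9E}$/ ? 1 : 0;
print "middle matched as one character: $middle_is_one_char\n";
```

Before decoding, length() counts bytes; after decoding, it counts characters, and the same holds for what a regex treats as a single unit.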

HTH,
/Autrijus/

What in perl-5.6.1 isn’t able to support unicode? We’ve not yet found
issues, but I’d very much like to hear if there are.

Actually, with my Encode::compat (and Text::Iconv) Perl 5.6.1 seems
to be happy with RT’s unicode support, as far as I know. I’d also
like to hear if there’s anything still broken or missing.

Oh, by the way, 5.8-using people may encounter a random ‘invalid
header: Content-C0c0’ error on web-submitted tickets or comments
that contain unicode characters. This is the result of a utf8
bug detailed below:

http://rt.perl.org/rt2/Ticket/Display.html?id=18107

Anyway, the following patch should cure it, and has been submitted
to the MailTools author for inclusion in the next version.

Thanks,
/Autrijus/

--- /usr/home/samba/Common/perl/site/lib/Mail/Header.pm Thu Oct 3 15:50:58 2002
+++ /usr/local/lib/perl5/site_perl/5.8.0/Mail/Header.pm Fri Nov 29 13:35:47 2002
@@ -156,8 +156,8 @@
 
  # Change the case of the tag
  # eq Message-Id
- $tag =~ s/\b([a-z]+)/\L\u$1/gio;
- $tag =~ s/\b([b-df-hj-np-tv-z]+|MIME)\b/\U$1/gio
+ $tag =~ s/\b([a-z+])/\L\u\u$1/gio;
+ $tag =~ s/\b([b-df-hj-np-tv-z]+|MIME)\b/\U\u$1/gio
     if $tag =~ /-/;
 
  $tag;