RT email and the ISO-8859-1 characters

When RT mails us, it does not add headers such as:

Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit

and therefore, composed characters (Latin-1 charset) are garbled.
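
To see why a missing charset garbles accented text, here is a minimal Python sketch. Python is used for illustration only (RT itself is Perl), and the sample string is an assumption, not text from a real ticket. The same bytes are readable when the receiver is told they are ISO-8859-1 and unreadable when it has to assume US-ASCII:

```python
# Illustrative only: the sample text is made up, not taken from an RT ticket.
text = "Réponse reçue"

# What goes on the wire when the body is sent as raw 8-bit Latin-1.
raw = text.encode("iso-8859-1")

# A receiver that sees "charset=iso-8859-1" can decode the bytes correctly.
with_charset = raw.decode("iso-8859-1")

# A receiver with no charset information is assumed to fall back to US-ASCII;
# every byte above 127 then becomes the replacement character U+FFFD.
without_charset = raw.decode("us-ascii", errors="replace")

print(raw)
print(ascii(without_charset))
```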

What is the proper way to add such headers:

  1. read the documentation, it is already explained,

  2. copy the Scrip Action/Notify to Action/Notify-international and
    hack it.

a message of 19 lines which said:

When RT mails us, it does not add headers such as:

Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit

and therefore, composed characters (Latin-1 charset) are garbled.

What is the proper way to add such headers:

I forgot to say that adding the MIME headers to the template does not
work since RT messes with MIME headers (to handle multipart mail, I
presume).

“Stephane” == Stephane Bortzmeyer bortzmeyer@netaktiv.com writes:

Stephane> When RT mails us, it does not add headers such as:
Stephane> Content-Type: text/plain; charset=iso-8859-1
Stephane> Content-Transfer-Encoding: 8bit

Stephane> and therefore, composed characters (Latin-1 charset) are
Stephane> garbled.

You might want to look at my patch for Japanese support in the contrib
directory; you can use something similar to how I add Content-Type and
Content-Transfer-Encoding headers.

Also, you will want to consider properly encoding the body and headers
with Quoted-Printable and QP encoding. The MIME:: modules should do
this easily.
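
As a rough illustration of what "properly encoded" means here, the sketch below builds a Latin-1 message whose body is quoted-printable and whose Subject is an RFC 2047 encoded-word. It uses Python's standard `email` package as a stand-in for the Perl MIME:: modules, so it is an analogy and not the Japanese patch mentioned above; the function name, subject, and body are hypothetical.

```python
from email.header import Header
from email.mime.text import MIMEText


def build_latin1_message(subject: str, body: str) -> MIMEText:
    """Build a text/plain message whose headers and body are 7-bit safe."""
    # For iso-8859-1, Python's email package picks quoted-printable for the
    # body and sets Content-Type and Content-Transfer-Encoding itself.
    msg = MIMEText(body, "plain", "iso-8859-1")
    # Non-ASCII header text is wrapped as an RFC 2047 encoded-word.
    msg["Subject"] = Header(subject, "iso-8859-1")
    return msg


# Hypothetical subject and body, for illustration only.
message = build_latin1_message("Réponse à votre ticket", "Problème résolu.\n")
wire_text = message.as_string()
print(wire_text)
```

The printed message contains only ASCII characters, yet a MIME-aware reader can recover the accented text from the declared charset and transfer encoding.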

Ben

Brought to you by the letters L and B and the number 15.
“He’s like… some sort of… non-giving up… school guy!”
Debian GNU/Linux maintainer of Gimp and Nethack – http://www.debian.org/

a message of 23 lines which said:

You might want to look at my patch for Japanese support in the contrib
directory; you can use something similar to how I add Content-Type and
Content-Transfer-Encoding headers.

Thanks for the tip. Do you think we may see an internationalized (I
don’t ask for a French localization, I could do it myself) version of
RT soon? Or should I tinker in my backyard only?

Also, you will want to consider properly encoding the body and headers
with Quoted-Printable and QP encoding.

Never. I never use QP. Nobody does it in Latin-1-speaking countries.

“Stephane” == Stephane Bortzmeyer bortzmeyer@netaktiv.com writes:

Stephane> Thanks for the tip. Do you think we may see an
Stephane> internationalized (I don't ask for a French
Stephane> localization, I could do it myself) version of RT soon? 
Stephane> Or should I tinker in my backyard only?

The problem is that Mason is not yet internationalized, so we
would have to invent our own framework.

Ben> Also, you will want to consider properly encoding the body
Ben> and headers with Quoted-Printable and QP encoding.

Stephane> Never. I never use QP. Nobody does it in
Stephane> Latin-1-speaking countries.

If you do not send your 8-bit email encoded either in Quoted-Printable
or Base64, then you’re violating RFC 2822 and will run into trouble
with email gateways:

http://www.faqs.org/rfcs/rfc2822.html

2.1. General Description

At the most basic level, a message is a series of characters. A
message that is conformant with this standard is comprised of
characters with values in the range 1 through 127 and interpreted as
US-ASCII characters [ASCII]. For brevity, this document sometimes
refers to this range of characters as simply “US-ASCII characters”.

[snip]
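
The range quoted above can be checked mechanically. This is a minimal Python sketch, assuming the message is available as raw bytes; the function name and both sample bodies are hypothetical:

```python
def is_rfc2822_clean(raw: bytes) -> bool:
    """Return True if every byte lies in the 1..127 range quoted above."""
    return all(1 <= byte <= 127 for byte in raw)


# Hypothetical bodies: one raw 8-bit Latin-1, one quoted-printable.
eight_bit_body = "déjà vu\r\n".encode("iso-8859-1")
qp_body = b"d=E9j=E0 vu\r\n"

print(is_rfc2822_clean(eight_bit_body))
print(is_rfc2822_clean(qp_body))
```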

I highly doubt that “nobody does it”; I receive email transparently
encoded in quoted-printable every day with the proper headers, and my
MUA converts it to the correct 8-bit Latin-1 for me automatically.

What makes you say that “nobody does it”?

Ben

Brought to you by the letters B and C and the number 15.
“I’m with insurance.”
Debian GNU/Linux maintainer of Gimp and Nethack – http://www.debian.org/

a message of 44 lines which said:

If you do not send your 8-bit email encoded either in Quoted-Printable
or Base64, then you’re violating RFC 2822 and will run into trouble
with email gateways:

Almost all SMTP servers are now 8-bit clean. Nobody still runs
sendmail 5 :-)

What makes you say that “nobody does it”?

On a typical French-speaking mailing list:

aragon:~/Mail/debian % grep -i 'content-transfer-encoding: *quoted-printable' french |wc -l
436
aragon:~/Mail/debian % grep -i 'content-transfer-encoding: *8bit' french |wc -l
1276

“Stephane” == Stephane Bortzmeyer bortzmeyer@netaktiv.com writes:

Stephane> Almost all SMTP servers are now 8-bit clean. Nobody
Stephane> still runs sendmail 5 :-)

This is not true. SMTP gateways to non-IP networks often depend
on 7-bit email, and treat 8-bit email in extremely special ways.

>> What makes you say that "nobody does it"?

Stephane> On a typical French-speaking mailing list:

Stephane> aragon:~/Mail/debian % grep -i 'content-transfer-encoding: *quoted-printable' french |wc -l
Stephane> 436
Stephane> aragon:~/Mail/debian % grep -i 'content-transfer-encoding: *8bit' french |wc -l
Stephane> 1276

That’s hardly nobody, and don’t forget that many people will have the
quoted-printable automatically converted to 8-bit by their ISP.

It’s still not at all difficult to use quoted-printable encoding
to safely follow standards.
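
The automatic conversion back to 8-bit is a mechanical step. A minimal Python sketch using the standard `quopri` module, with a made-up sample line, shows the idea:

```python
import quopri

# Hypothetical quoted-printable line as it would travel through a 7-bit gateway.
qp_line = b"R=E9ponse =E0 votre ticket"

# Decoding restores the original 8-bit ISO-8859-1 bytes.
eight_bit = quopri.decodestring(qp_line)
print(eight_bit)

# Interpreting those bytes with the declared charset gives the readable text.
readable = eight_bit.decode("iso-8859-1")
```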

Ben

Brought to you by the letters L and C and the number 6.
“A beldam is an old lady.”
Debian GNU/Linux maintainer of Gimp and Nethack – http://www.debian.org/

Stephane Bortzmeyer wrote:

a message of 44 lines which said:

If you do not send your 8-bit email encoded either in Quoted-Printable
or Base64, then you’re violating RFC 2822 and will run into trouble
with email gateways:

Almost all SMTP servers are now 8-bit clean. Nobody still runs
sendmail 5 :-)

Quite possibly, but the purpose of standards is for people to follow
them! If individuals make their own decisions, that defeats the reason
for having standards. It’s simply wrong to send out mail with
top-bit-set characters.

What makes you say that “nobody does it”?

On a typical French-speaking mailing list:

aragon:~/Mail/debian % grep -i 'content-transfer-encoding: *quoted-printable' french |wc -l
436
aragon:~/Mail/debian % grep -i 'content-transfer-encoding: *8bit' french |wc -l
1276

You can’t prove correctness by counting votes! Anyway, by your numbers
a third as many people are using QP as 8-bit. For what it’s worth,
here are my numbers:

$ find ~/mail -type f -print0 | xargs -0 grep -ih '^Content-Transfer-Encoding:' | sort -f | uniq -ic | sort -nr
1228 Content-transfer-encoding: 7bit
485 Content-transfer-encoding: base64
449 Content-transfer-encoding: quoted-printable
280 Content-transfer-encoding: 8bit
5 Content-Transfer-Encoding: 7bit
2 Content-Transfer-Encoding: x-uuencode
2 Content-Transfer-Encoding: binary
2 Content-Transfer-Encoding:7bit

I’m in the UK, so many mails are fine in Ascii, but need Latin1 for
things like pound signs or foreign-derived words. It looks like I’ve
got over three times as many mails base 64 or QP encoded as 8-bit. So I
don’t think you can generalize about Latin1 countries anyway.
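
The tally and the "over three times" figure can be reproduced with a short Python sketch. The regular expression and function name are hypothetical; the counts are copied from the listing in this message, with the three spellings of 7bit merged:

```python
import re
from collections import Counter

# Match the header name case-insensitively; the space after the colon is optional.
CTE_RE = re.compile(
    r"^content-transfer-encoding:[ \t]*(\S+)", re.IGNORECASE | re.MULTILINE
)


def tally_encodings(text: str) -> Counter:
    """Count Content-Transfer-Encoding values, ignoring case differences."""
    return Counter(match.group(1).lower() for match in CTE_RE.finditer(text))


# Tiny made-up sample showing that spelling variants end up in one bucket.
sample = (
    "Content-transfer-encoding: 7bit\n"
    "Content-Transfer-Encoding:7bit\n"
    "Content-Transfer-Encoding: Quoted-Printable\n"
)
print(tally_encodings(sample))

# Figures copied from the listing above, with the three 7bit spellings merged.
counts = Counter(
    {"7bit": 1228 + 5 + 2, "base64": 485, "quoted-printable": 449, "8bit": 280}
)
encoded_to_8bit = (counts["base64"] + counts["quoted-printable"]) / counts["8bit"]
print(round(encoded_to_8bit, 2))
```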

Smylers
GBdirect

a message of 19 lines which said:

When RT mails us, it does not add headers such as:

Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit

and therefore, composed characters (Latin-1 charset) are garbled.

What is the proper way to add such headers:

I finally tried the following quick, not perfect, but acceptable,
patch, which I suggest to you.

If it has no nasty side-effects (with multipart messages?), I vote :-)
for its inclusion in RT.

--- lib/RT/Template.pm.orig Thu Apr 4 15:10:17 2002
+++ lib/RT/Template.pm Thu Apr 4 15:11:47 2002
@@ -286,7 +286,12 @@
         $body = $content;
     }
 
-    $self->{'MIMEObj'}->attach(Data => $body);
+    my ($type) = "text/plain";
+    if ($RT::MailCharset) {
+        $type .= "; charset=$RT::MailCharset";
+    }
+    $self->{'MIMEObj'}->attach(Type => $type,
+                               Data => $body);
 
     if ($headers) {
         foreach $header (split(/\n/,$headers)) {
--- etc/config.pm.orig Wed Mar 6 13:50:14 2002
+++ etc/config.pm Thu Apr 4 15:09:03 2002
@@ -255,6 +255,9 @@
 
 $UseFriendlyToLine = 1;
 
+# Add a charset which you want to be included in all outgoing mail messages.
+# Keep it undefined if you want the default.
+#$MailCharset = "iso-8859-15";
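
The logic of the patch is small enough to restate outside RT. The following Python sketch is not the patch itself (which is Perl); `content_type_for` and its parameter are hypothetical names standing in for the `$RT::MailCharset` check:

```python
from typing import Optional


def content_type_for(mail_charset: Optional[str]) -> str:
    """Append a charset parameter only when one is configured."""
    content_type = "text/plain"
    if mail_charset:
        content_type += f"; charset={mail_charset}"
    return content_type


print(content_type_for(None))
print(content_type_for("iso-8859-15"))
```

Leaving the setting undefined keeps the previous behaviour, which is why the config line ships commented out.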

At 15:23 +0200 on 4.4.2002, Stephane Bortzmeyer wrote:

I finally tried the following quick, not perfect, but acceptable,
patch, which I suggest to you.

For which version of RT is it?
I tried to apply it to 2.0.13, but it didn’t work.

Sebastian Flothow
sebastian@flothow.de
#include <stddisclaimer.h>

a message of 16 lines which said:

For which version of RT is it?
I tried to apply it to 2.0.13, but it didn’t work.

It was against 2.0.11. The code in lib/RT/Template.pm changed a lot in
2.0.13, I’ll update the patch.

At 12:14 +0200 on 11.4.2002, Stephane Bortzmeyer wrote:

It was against 2.0.11. The code in lib/RT/Template.pm changed a lot in
2.0.13, I’ll update the patch.

Can you give an estimate of when it’s going to be done?

Or, could someone give me a hint where in the code would be the right
place to add “charset=iso-8859-1” to the headers of outgoing mail?
I had a look in some files, but since I don’t know all these CPAN
modules used by RT, I couldn’t find a place where I was confident
enough that hacking something in would do the right thing.

Unfortunately, RT is almost unusable for us without this, because the
server handling all outgoing mail insists on appending
“charset=unknown-8bit” when the charset isn’t specified, which in
turn causes Outlook to replace the message body with a warning that
it contains ‘unsupported’ characters. Outlook provides the original
message as an attachment, but that’s far from comfortable.

Thanks,
Sebastian Flothow
sebastian@flothow.de
#include <stddisclaimer.h>

2.0.12 moved to using a better, more accurate parser for templates. You
can now drop a Content-Type: header straight into your templates.

On Tue, Apr 16, 2002 at 03:01:19PM +0200, Sebastian Flothow wrote:

At 12:14 +0200 on 11.4.2002, Stephane Bortzmeyer wrote:

It was against 2.0.11. The code in lib/RT/Template.pm changed a lot in
2.0.13, I’ll update the patch.

Can you give an estimate of when it’s going to be done?

Or, could someone give me a hint where in the code would be the right
place to add “charset=iso-8859-1” to the headers of outgoing mail?
I had a look in some files, but since I don’t know all these CPAN
modules used by RT, I couldn’t find a place where I was confident
enough that hacking something in would do the right thing.

Unfortunately, RT is almost unusable for us without this, because the
server handling all outgoing mail insists on appending
“charset=unknown-8bit” when the charset isn’t specified, which in
turn causes Outlook to replace the message body with a warning that
it contains ‘unsupported’ characters. Outlook provides the original
message as an attachment, but that’s far from comfortable.

Thanks,

Sebastian Flothow
sebastian@flothow.de
#include <stddisclaimer.h>


rt-users mailing list
rt-users@lists.fsck.com
http://lists.fsck.com/mailman/listinfo/rt-users

http://www.bestpractical.com/products/rt – Trouble Ticketing. Free.