Problems with text/html Mails and Unicode

Hi,

I have a severe problem with text/html mails sent to an RT queue via the mailgate: both in the web interface and in e-mail replies, the UTF-8 characters get double-encoded.

Here is what I sent to RT (copy/pasted from Outlook):

—snip—
öäü
ßßß
—snap—

The ticket in the web interface shows this as:

—snip—
öäü
ßßß
—snap—

(Page is reported as UTF-8 by Opera)

If I download the entire Attachment, I get:

—snip—

öäü
ßßß
[...]
---snap---

(Page is reported as UTF-8 by Opera)

The automatically generated reply from RT quotes it like this:

—snip—
öäüßßß
—snap—

The funny things about all this:

a) If I send the very same mail as text/plain instead of text/html, everything works fine.

b) If I manually change the content type in the database (update attachments set contenttype='text/plain' where id=1;), the display is fine as well.
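
To make the term "double-encoded" concrete, here is a minimal Perl sketch of the general effect. It is only an illustration, not RT code, and it does not show where in RT the second encoding happens; the variable names are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(encode);

# The three umlauts as a Perl character string: ö ä ü
my $chars = "\x{f6}\x{e4}\x{fc}";

# Correct: encode once to UTF-8 bytes (two bytes per character here).
my $bytes = encode('UTF-8', $chars);

# Bug: the UTF-8 bytes are never decoded back to characters, so every
# byte is treated as a Latin-1 character and encoded a second time.
my $double = encode('UTF-8', $bytes);

for my $pair ( [ 'encoded once',  $bytes ], [ 'encoded twice', $double ] ) {
    my ( $label, $data ) = @$pair;
    my $hex = join ' ', map { sprintf '%02X', ord } split //, $data;
    printf "%-14s %2d bytes: %s\n", $label, length($data), $hex;
}
```

The second line of output has twice as many bytes as the first, which is what a browser then renders as two garbage characters per umlaut.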

I am using RT 3.6.1-2, installed from the Debian packages in testing; the full system configuration dump can be found in the attached text file.

I would appreciate any hints on where to go from here, because I have been trying for hours to locate this problem. I even tried switching from MySQL to PostgreSQL, with no effect.

Yours sincerely,
Torben Nehmer

Torben Nehmer
Diplom Informatiker (FH)
Business System Developer

CANCOM Deutschland GmbH
Messerschmittstr. 20
89343 Scheppach
Germany

Phone: +49 (0)8225 - 996-1118
Fax: +49 (0)8225 - 996-41118
torben.nehmer@cancom.de

rt-config.txt (9.88 KB)

Hello everybody,

— I wrote:

I have a severe problem with text/html mails sent to an RT queue via the
mailgate: both in the web interface and in e-mail replies, the UTF-8
characters get double-encoded.

Can anybody give me some hints here? Do you need additional information like DB Dumps or Mail Dumps? I’d be happy even with a read-the-manual link in case I have missed something important.

It is very important for me to get this running, as unfortunately the kind of e-mail I used to test the mailgate (text/html from MS Outlook) is the most widespread case in my user base.

Yours sincerely,
Torben Nehmer

Hello everybody,

— I wrote:

I have a severe problem with text/html mails sent to an RT queue via the
mailgate: both in the web interface and in e-mail replies, the UTF-8
characters get double-encoded.

I have a few more bits of information.

The broken entries all use the quoted-printable content encoding in the database:

—snip—
rtdb=# select * from attachments where id=21;
id | transactionid | parent | messageid | subject | filename | contenttype | contentencoding | content | headers | creator | created
21 | 61 | 20 | | | | text/html | quoted-printable | =0D

=0D =0D =0D =0D =0D =0D
asdf
=0D
=C3=B6=C3=A4=C3=BC
=0D
---snap---

Maybe the QP content isn't handled correctly?
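
One way to check whether the stored data itself is intact is to decode the quoted-printable sample from the dump by hand. The sketch below assumes only the literal string `=C3=B6=C3=A4=C3=BC` from the row above; it is a stand-alone test, not RT code, and the variable names are made up.

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(decode_qp);
use Encode qw(decode);

# The interesting part of the content column, copied from the dump.
my $stored = "=C3=B6=C3=A4=C3=BC";

# Step 1: undo the quoted-printable transfer encoding -> raw bytes.
my $bytes = decode_qp($stored);

# Step 2: interpret those bytes as UTF-8 -> Perl characters.
my $chars = decode('UTF-8', $bytes);

printf "raw bytes:  %d\n", length $bytes;
printf "characters: %d\n", length $chars;
printf "code points: %s\n", join ' ', map { sprintf 'U+%04X', ord } split //, $chars;
```

If this prints six raw bytes and the three code points U+00F6, U+00E4 and U+00FC, the database holds valid UTF-8 for "öäü" exactly once, which would point at the read and display path rather than at the stored data.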

What I don’t understand is that the original mails sent by Outlook are using Base64, not QP encoding.

Any more insights here?

Yours sincerely,
Torben Nehmer

Hello everybody,

— I wrote:

I have a severe problem with text/html mails sent to an RT queue via the
mailgate: both in the web interface and in e-mail replies, the UTF-8
characters get double-encoded.

I have a few more bits of information.

The broken entries all use the quoted-printable content encoding in the database:

While digging further into this, I found the method _DecodeLOB in Record.pm. After changing it to the following, everything suddenly worked fine:

—snip—
sub _DecodeLOB {
    my $self            = shift;
    my $ContentType     = shift;
    my $ContentEncoding = shift;
    my $Content         = shift;

    # Undo the transfer encoding the content was stored with.
    if ( $ContentEncoding eq 'base64' ) {
        $Content = MIME::Base64::decode_base64($Content);
    }
    elsif ( $ContentEncoding eq 'quoted-printable' ) {
        $Content = MIME::QuotedPrint::decode($Content);
    }
    elsif ( $ContentEncoding && $ContentEncoding ne 'none' ) {
        return ( $self->loc( "Unknown ContentEncoding [_1]", $ContentEncoding ) );
    }

    # Changed: decode UTF-8 for text/html as well, not only for text/plain.
    if ( $ContentType eq 'text/plain' || $ContentType eq 'text/html' ) {
        $Content = Encode::decode_utf8($Content) unless Encode::is_utf8($Content);
    }

    return ($Content);
}
—snap—

The only change was enabling UTF-8 decoding for text/html in addition to text/plain.
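
The effect of that one condition can be reproduced outside RT. The following is a minimal sketch under stated assumptions: `decode_lob_sketch` and its `$decode_html` switch are hypothetical names invented for this test, and the function only mirrors the two steps relevant here (quoted-printable decoding, then optional UTF-8 decoding). Whether this is the right place to fix the problem in RT itself is not confirmed.

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(decode_qp);
use Encode ();

# Hypothetical stand-alone helper, NOT the RT method.
# $decode_html switches the UTF-8 decoding for text/html on or off.
sub decode_lob_sketch {
    my ( $content_type, $content_encoding, $content, $decode_html ) = @_;

    $content = decode_qp($content)
        if $content_encoding eq 'quoted-printable';

    my $is_text = $content_type eq 'text/plain'
        || ( $decode_html && $content_type eq 'text/html' );

    # Turn UTF-8 bytes into characters unless that already happened.
    $content = Encode::decode_utf8($content)
        if $is_text && !Encode::is_utf8($content);

    return $content;
}

my $qp = "=C3=B6=C3=A4=C3=BC";    # sample from the database dump

my $before = decode_lob_sketch( 'text/html', 'quoted-printable', $qp, 0 );
my $after  = decode_lob_sketch( 'text/html', 'quoted-printable', $qp, 1 );

printf "without text/html decoding: length %d\n", length $before;    # 6
printf "with text/html decoding:    length %d\n", length $after;     # 3
```

Without the decoding step the function hands back six undecoded bytes, which a later output layer can encode to UTF-8 a second time; with it, the result is three real characters.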

Yours sincerely,
Torben Nehmer
