UTF-8 in email headers is not decoded properly

Using RT 3.8.8 with Perl 5.8.8 and Oracle, I tracked down this problem:

  • Reply to an existing ticket using an email client that sends raw UTF-8
    in the headers
  • Put a non-ASCII character in the subject
  • The email gets recorded with garbage in the subject rather than the
    correctly decoded subject

In theory, you shouldn’t have raw UTF-8 in the subject (non-ASCII header
text is supposed to be RFC 2047 encoded), but Outlook does it, and so does
Gmail. I don’t know about other mail clients.

This only happens with existing tickets. Here’s how.

When receiving the mail, MIME::Parser doesn’t decode the UTF-8, no matter
what. So it’s all raw UTF-8 bytes in the header data structure.

RT adds the “RT-Ticket-ID” field. This string is properly encoded and
flagged as utf8.

When saving into the database, RT calls $Attachment->head->as_string, which
basically does a join of all the headers.

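When joining strings in Perl 5, if any one of them is flagged as UTF-8,
the resulting string is flagged as UTF-8 too, even if the other pieces
contain undecoded bytes. Perl upgrades the unflagged pieces by treating each
byte as a Latin-1 character, so the raw UTF-8 bytes are carried over as
separate characters rather than being decoded.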
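A minimal sketch of that behaviour, outside of RT. The header values and the
ticket number 42 are made up for illustration, and utf8::upgrade is only used
here to imitate a header string that already carries the UTF-8 flag:

```perl
use strict;
use warnings;

# Raw UTF-8 bytes for "é" (U+00E9), as an undecoded Subject header would hold them.
my $raw = "Subject: \xC3\xA9";

# Stand-in for the header RT adds: plain ASCII content, but with the UTF-8 flag on.
# (Hypothetical value; only the flag matters for this demonstration.)
my $flagged = "RT-Ticket-ID: 42";
utf8::upgrade($flagged);

# Roughly what joining all the headers amounts to.
my $joined = join "\n", $raw, $flagged;

printf "raw flagged:     %s\n", utf8::is_utf8($raw)    ? "yes" : "no";
printf "joined flagged:  %s\n", utf8::is_utf8($joined) ? "yes" : "no";

# The two bytes survive as two separate characters (U+00C3, U+00A9),
# so the subject is still undecoded even though the flag is set.
printf "still undecoded: %s\n", $joined =~ /\x{C3}\x{A9}/ ? "yes" : "no";
```

The joined string reports as flagged, yet the subject part still holds the
two original bytes as two characters, which is the state the rest of this
report describes.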

So when RT does this:
utf8::decode( $head ) unless utf8::is_utf8( $head );

$head checks as being utf8, even though it’s not really, so it doesn’t get
decoded.
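To illustrate, here is a small standalone sketch that runs the same guard on
a joined string, next to one possible alternative: decoding each header line
while it is still a byte string, before the join. decode_if_bytes is a
hypothetical helper, not something that exists in RT, and the header values
are made up:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: decode a header line only if it is still a byte string.
sub decode_if_bytes {
    my ($line) = @_;    # works on a copy, the caller's value is left alone
    return utf8::is_utf8($line) ? $line : Encode::decode('UTF-8', $line);
}

my $raw     = "Subject: \xC3\xA9";    # undecoded UTF-8 bytes for "é"
my $flagged = "RT-Ticket-ID: 42";     # stand-in for the header RT adds
utf8::upgrade($flagged);              # imitate an already-flagged string

# Naive path: join first, then apply the guard. The guard sees the flag
# on the joined string and skips the decode.
my $naive = join "\n", $raw, $flagged;
utf8::decode($naive) unless utf8::is_utf8($naive);

# Alternative: decode each piece before joining.
my $fixed = join "\n", map { decode_if_bytes($_) } $raw, $flagged;

printf "naive length: %d\n", length $naive;   # the two undecoded bytes count separately
printf "fixed length: %d\n", length $fixed;   # one character shorter
```

Whether this per-line decoding belongs in Attachment_Overlay or in RT::I18N
is exactly the open question; the sketch only shows that decoding has to
happen before the flagged and unflagged strings are mixed.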

And that goes into the database, which is then used everywhere, and I see
this in my tickets:
test d’à cçênts

Now, where should I fix this?

  • In Attachment_Overlay, force the decoding
  • In RT::I18N, SetMIMEHeadToEncoding.

Thanks

Mathieu Longtin
1-514-803-8977