Patch for RT 3.0.3 attachment conversion problem

Quoting Dirk Pape pape-rt@inf.fu-berlin.de:

Hello,

We also have this problem here. I already sent a (personal) bug report to
Jesse with some attachments demonstrating this. I thought it only affected
these particular attachments.

The following patch seems to correct the problem and works for my setup.
Can you please give it a try?
(I’m sorry to copy it into the message body, but this damn webmail does not handle
attachments very well.)

SNIP ---- 8< ---- 8< ---- SNIP
--- lib/RT/I18N.pm.orig Wed Jun 25 15:40:44 2003
+++ lib/RT/I18N.pm Wed Jun 25 15:35:22 2003
@@ -161,6 +161,16 @@
         SetMIMEEntityToEncoding( $_, $enc ) foreach $entity->parts;
     }
 
+    if ($entity->head->mime_type !~ /^text\/plain$/) {
+        # convert at least MIME word encoded attachment filename
+        foreach my $attr (qw(content-type.name content-disposition.filename)) {
+            if (my $name = $entity->head->mime_attr($attr)) {
+                $entity->head->mime_attr($attr => DecodeMIMEWordsToUTF8($name));
+            }
+        }
+        return;
+    }
     my $charset = _FindOrGuessCharset($entity) or return;
 
     # one and only normalization
     $charset = 'utf-8' if $charset eq 'utf8';
@@ -169,16 +179,6 @@
     SetMIMEHeadToEncoding($entity->head, $charset => $enc);
 
     my $head = $entity->head;
 
-    # convert at least MIME word encoded attachment filename
-    foreach my $attr (qw(content-type.name content-disposition.filename)) {
-        if ( my $name = $head->mime_attr($attr) ) {
-            $head->mime_attr( $attr => DecodeMIMEWordsToUTF8($name) );
-        }
-    }
-    return unless ( $head->mime_type =~ /^text\/plain$/i );
     my $body = $entity->bodyhandle;
 
     if ( $enc ne $charset ) {
SNIP ---- 8< ---- 8< ---- SNIP

Remy Chibois

rt-3.0.3-I18N-Attachments.patch (0 Bytes)

The following patch seems to correct the problem and works for my setup.
Can you please give it a try?

This patch seems flawed in the sense that it breaks I18N.pm’s
capability of correctly handling high bits in non-text/plain
attachments’ file names.

I have instrumented my I18N.pm with debug statements (attached
as I18N.diff) and fed your test case into it.

The log is in the attached rt.log, and I can view the ticket
and bild.pdf there without problems.

I also cannot reproduce the UTF32LE problem here… It is probably
the wrong thing to do if people have somehow put it into @EmailInputEncodings.

Anyway, I’d like you (and others who notice this problem) to
try I18N.diff and see what comes out, so we can determine
whether it’s db-specific, mta-specific, or what.

Thanks,
/Autrijus/

rt.log (2.46 KB)

I18N.diff (1.57 KB)

Quoting Autrijus Tang autrijus@autrijus.org:

The following patch seems to correct the problem and works for my setup.
Can you please give it a try?

This patch seems flawed in the sense that it breaks I18N.pm’s
capability of correctly handling high bits in non-text/plain
attachments’ file names.

OK on that point, but I have never seen attachment filenames that
were not encoded using either “Quoted-Printable” or “Base64”.
Here, text parts are “text/plain”, either Q or B encoded, and
attachments (binaries) are for the most part “application/*” or
“image/*”, with filenames containing high bits encoded using
“Quoted-Printable”.
This is why the patch is flawed.

[…]

I also cannot reproduce the UTF32LE problem here… It is probably
the wrong thing to do if people have somehow put it into
@EmailInputEncodings.

My @EmailInputEncodings contains only “iso-8859-1”, as all
incoming mail uses this encoding.

Anyway, I’d like you (and others who notice this problem) to
try I18N.diff and see what comes out, so we can determine
whether it’s db-specific, mta-specific, or what.

Ok. I’ll post results today.

Thank you.

Remy Chibois

Quoting Autrijus Tang autrijus@autrijus.org:

[…]

Anyway, I’d like you (and others who notice this problem) to
try I18N.diff and see what comes out, so we can determine
whether it’s db-specific, mta-specific, or what.

Here is what I get (debug log first then the message):

========== I18N.pm debug ==========
RT: XXX: We are handed a multipart (MIME::Entity=HASH(0x92f1084)), recursing
into it… (/produits/rt-3.0.3-dev/lib/RT/I18N.pm:161)
RT: XXX: Trying to guess encoding of MIME::Entity=HASH(0x9294758)…
(/produits/rt-3.0.3-dev/lib/RT/I18N.pm:166)
RT: XXX: MIME::Entity=HASH(0x9294758) is encoded in iso-8859-1. Now set Head to
it… (/produits/rt-3.0.3-dev/lib/RT/I18N.pm:172)
RT: XXX: So. Let’s see if text/plain is the type of MIME::Entity=HASH(0x9294758)
(text/plain). (/produits/rt-3.0.3-dev/lib/RT/I18N.pm:186)
RT: XXX: It is. Continuing decoding MIME::Entity=HASH(0x9294758)…
(/produits/rt-3.0.3-dev/lib/RT/I18N.pm:188)
RT: XXX: Trying to guess encoding of MIME::Entity=HASH(0x929a404)…
(/produits/rt-3.0.3-dev/lib/RT/I18N.pm:166)
RT: XXX: MIME::Entity=HASH(0x929a404) is encoded in UTF-32LE. Now set Head to
it… (/produits/rt-3.0.3-dev/lib/RT/I18N.pm:172)
RT: XXX: So. Let’s see if text/plain is the type of MIME::Entity=HASH(0x929a404)
(text/plain). (/produits/rt-3.0.3-dev/lib/RT/I18N.pm:186)
RT: XXX: It is. Continuing decoding MIME::Entity=HASH(0x929a404)…
(/produits/rt-3.0.3-dev/lib/RT/I18N.pm:188)
RT: XXX: …and we’re back from the multipart (MIME::Entity=HASH(0x92f1084))…
(/produits/rt-3.0.3-dev/lib/RT/I18N.pm:163)
RT: XXX: Trying to guess encoding of MIME::Entity=HASH(0x92f1084)…
(/produits/rt-3.0.3-dev/lib/RT/I18N.pm:166)
RT: XXX: Trying to guess encoding of MIME::Entity=HASH(0x9588154)…
(/produits/rt-3.0.3-dev/lib/RT/I18N.pm:166)
RT: XXX: MIME::Entity=HASH(0x9588154) is encoded in utf-8. Now set Head to it…
(/produits/rt-3.0.3-dev/lib/RT/I18N.pm:172)
RT: XXX: So. Let’s see if text/plain is the type of MIME::Entity=HASH(0x9588154)
(text/plain). (/produits/rt-3.0.3-dev/lib/RT/I18N.pm:186)
RT: XXX: It is. Continuing decoding MIME::Entity=HASH(0x9588154)…
(/produits/rt-3.0.3-dev/lib/RT/I18N.pm:188)
RT: XXX: Trying to guess encoding of MIME::Entity=HASH(0x958d0e4)…
(/produits/rt-3.0.3-dev/lib/RT/I18N.pm:166)
RT: XXX: MIME::Entity=HASH(0x958d0e4) is encoded in utf-8. Now set Head to it…
(/produits/rt-3.0.3-dev/lib/RT/I18N.pm:172)
RT: XXX: So. Let’s see if text/plain is the type of MIME::Entity=HASH(0x958d0e4)
(text/plain). (/produits/rt-3.0.3-dev/lib/RT/I18N.pm:186)
RT: XXX: It is. Continuing decoding MIME::Entity=HASH(0x958d0e4)…
(/produits/rt-3.0.3-dev/lib/RT/I18N.pm:188)
RT: Converting ‘utf-8’ to ‘iso-8859-1’
RT: Malformed UTF-8 character (UTF-16 surrogate 0xdfda) in subroutine entry at
/usr/local/perl/lib/5.8.0/i686-linux-thread-multi/Encode.pm line 186.
(/produits/rt-3.0.3-dev/lib/RT.pm:235)
RT: Malformed UTF-8 character (UTF-16 surrogate 0xdfa8) in subroutine entry at
/usr/local/perl/lib/5.8.0/i686-linux-thread-multi/Encode.pm line 186.
(/produits/rt-3.0.3-dev/lib/RT.pm:235)
RT: Malformed UTF-8 character (UTF-16 surrogate 0xdb00) in subroutine entry at
/usr/local/perl/lib/5.8.0/i686-linux-thread-multi/Encode.pm line 186.
(/produits/rt-3.0.3-dev/lib/RT.pm:235)
RT: Malformed UTF-8 character (character 0xffff) in subroutine entry at
/usr/local/perl/lib/5.8.0/i686-linux-thread-multi/Encode.pm line 186.
(/produits/rt-3.0.3-dev/lib/RT.pm:235)
RT: <rt-3.0.3-323-2213.16.1574948203848> No recipients found. Not sending.
(/produits/rt-3.0.3-dev/lib/RT/Action/SendEmail.pm:251)
RT: Ticket 323 created in queue ‘General’ by Remy
(/produits/rt-3.0.3-dev/lib/RT/Ticket_Overlay.pm:608)
========== /I18N.pm debug ==========

Now the original message (e-mail addresses changed, attachment truncated):

========== Message ==========
From: =?iso-8859-1?Q?CHIBOIS_R=E9my?= rchibois@free.fr
To: 'rt3dev' rt3dev@phoenix
Subject: Attachment filename test
Date: Thu, 26 Jun 2003 08:51:23 +0200
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: multipart/mixed;
    boundary="----_=_NextPart_000_01C33BAF.5B86EAB0"
Content-Length: 656313
Lines: 8537

This message is in MIME format. Since your mail reader does not understand
this format, some or all of this message may not be legible.

------_=_NextPart_000_01C33BAF.5B86EAB0
Content-Type: text/plain;
    charset="iso-8859-1"

Attachment filename test

------_=_NextPart_000_01C33BAF.5B86EAB0
Content-Type: application/vnd.ms-powerpoint;
    name="=?iso-8859-1?Q?Pr=E9sentation_Outil_Comm_DSIV_020517=2E?=
    =?iso-8859-1?Q?ppt?="
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
    filename="=?iso-8859-1?Q?Pr=E9sentation_Outil_Comm_DSIV_020?=
    =?iso-8859-1?Q?517=2Eppt?="

0M8R4KGxGuEAAAAAAAAAAAAAAAAAAAAAPgADAP7/CQAGAAAAAAAAAAAAAAAIAAAAoQMAAAAAAAAA
[Content truncated]
AAAAAAAAAAAAAAAAAAAAAAAA////////////////AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAA

------_=_NextPart_000_01C33BAF.5B86EAB0--

========== /Message ==========

Hope this will help.

Remy Chibois
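
For readers following along, here is a minimal sketch of what decoding the MIME-word encoded filename from the message above amounts to. It uses the core Encode module's 'MIME-Header' codec as a stand-in for RT's DecodeMIMEWordsToUTF8 (an assumption made so the snippet runs without RT; it is not RT's own code path), and the folded header value has been joined onto one line by hand.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The "name" parameter from the sample message, unfolded onto one line.
my $raw = '=?iso-8859-1?Q?Pr=E9sentation_Outil_Comm_DSIV_020517=2E?= '
        . '=?iso-8859-1?Q?ppt?=';

# Encode's 'MIME-Header' codec decodes RFC 2047 encoded words into a
# Perl character string. It stands in for DecodeMIMEWordsToUTF8 here.
my $name = decode('MIME-Header', $raw);

binmode STDOUT, ':encoding(UTF-8)';
print "$name\n";
```

The two encoded words are joined and the Q-encoded underscores become spaces, which is the readable filename the patch tries to preserve for non-text/plain parts.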

Anyway, I’d like you (and others who notice this problem) to
try I18N.diff and see what comes out, so we can determine
whether it’s db-specific, mta-specific, or what.

Here is what I get (debug log first then the message):

Your Encode::Guess version, please? Try upgrading to the latest
version on CPAN and see if this persists.

/me curses false-alarm BOM marks...

Maybe what we should do is prepend a single ' ' to the strings
passed to Encode::Guess, to defeat BOM sniffing. Maybe.

Thanks,
/Autrijus/
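
A rough sketch of the space-prepending idea, assuming the stock Encode::Guess from CPAN. The helper guess_name and the sample bytes are hypothetical, chosen only to trigger the BOM check; this does not show that the UTF-32LE guess in the log above comes from the same heuristic.

```perl
use strict;
use warnings;
use Encode::Guess;    # exports guess_encoding

# Hypothetical helper: return the guessed encoding name, or the error text.
sub guess_name {
    my ($octets, @suspects) = @_;
    my $enc = guess_encoding($octets, @suspects);
    # guess_encoding returns an encoding object on success,
    # and a plain error string otherwise.
    return ref $enc ? $enc->name : "failed: $enc";
}

# Made-up Latin-1 bytes that happen to start with FF FE,
# which looks like a UTF-16 byte order mark.
my $octets = "\xFF\xFEabcd";

my $plain  = guess_name($octets,       'iso-8859-1');
my $padded = guess_name(' ' . $octets, 'iso-8859-1');

print "as-is:  $plain\n";
print "padded: $padded\n";
```

With the leading space the first bytes no longer match a BOM, so the guess falls through to the listed suspects. A caller doing this would have to strip the extra space again after decoding.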