RFC 2231 Content-Disposition: attachment; filename problem

Hi,
thanks for your work on Request Tracker!

I have a problem with RT-3.8.8 and Czech diacritics in attachment
filenames. After a few days of investigation I found that, with the
latest MIME Tools 5.501, Request Tracker's post-processing code on the
MIME entity correctly converts all the MIME words in the headers,
including content-disposition.filename, into UTF-8.

Unfortunately, when the RT/Attachment_Overlay.pm code calls
$Attachment->head->recommended_filename, the MIME Tools code in the
recommended_filename method doesn't know that the filename is already
converted and garbles it (escapes the characters). The MIME Tools code
may be OK, because it returns the correct filename when there is no
post-processing (UTF-8 recoding by RT). See the attached show-filename.

Problematic header:

Content-Type: application/msword;
 name="V_abc =?ISO-8859-2?Q?=EC=B9=E8=F8=BE=FD=E1=ED=E9_12345=2Edoc?="
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename*0*=ISO-8859-2''%56%5F%61%62%63%20%EC%B9%E8%F8%BE%FD%E1%ED%E9%20;
 filename*1*=%31%32%33%34%35%2E%64%6F%63

The correctly displayed filename is:

V_abc ěščřžýáíé 12345.doc
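
For reference, the two filename segments in the Content-Disposition header above can be decoded by hand following RFC 2231. This is only a minimal sketch using the core Encode module. decode_rfc2231_segments is a hypothetical helper written for this illustration, not part of MIME Tools or RT, and it assumes the segments are passed in order and are all percent-encoded.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not from MIME Tools or RT): join the RFC 2231
# segments in order, split off charset'language', percent-decode the
# rest to octets and decode those octets to Perl characters.
sub decode_rfc2231_segments {
    my @segments = @_;
    my $joined = join '', @segments;

    # Only the first segment carries the charset'language' prefix.
    my ($charset, $language, $value) = $joined =~ /\A([^']*)'([^']*)'(.*)\z/s
        or die "not an RFC 2231 extended value\n";

    $value =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;    # percent-decode to octets
    return decode($charset, $value);                # octets -> characters
}

my $filename = decode_rfc2231_segments(
    "ISO-8859-2''%56%5F%61%62%63%20%EC%B9%E8%F8%BE%FD%E1%ED%E9%20",
    "%31%32%33%34%35%2E%64%6F%63",
);

binmode STDOUT, ':encoding(UTF-8)';
print "$filename\n";
```

The result is a Perl character string, so it must be encoded exactly once (here by the output layer) before it leaves the program.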

The attached test script show-filename displays this using the
recommended_filename() method. MIME Tools 5.501 is needed; 5.500 does
not work.

Request Tracker 3.8.8 with MIME Tools 5.501, without the modification
below, stores the filename as:
V_abc \xC4\x9B\xC5\xA1\xC4\x8D\xC5\x99\xC5\xBE\xC3\xBD\xC3\xA1\xC3\xAD\xC3\xA9 12345.doc
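
The escaped bytes in the stored name appear to be the UTF-8 octets of the Czech characters, written out as \xNN escapes. A small sketch to check this, assuming only the core Encode module. The escaping regex is my own illustration, not the code MIME Tools actually runs.

```perl
use strict;
use warnings;
use Encode qw(encode);

# The nine Czech characters from the correct filename, as Perl characters.
my $chars = "\x{11B}\x{161}\x{10D}\x{159}\x{17E}\x{FD}\x{E1}\x{ED}\x{E9}";

# Encode to UTF-8 octets, then write every octet outside printable
# ASCII as a literal \xNN escape.
my $octets = encode('UTF-8', $chars);
(my $escaped = $octets) =~ s/([^\x20-\x7E])/sprintf '\\x%02X', ord $1/ge;

print "V_abc $escaped 12345.doc\n";
```

The printed line matches the stored filename above, which suggests that the characters were already UTF-8 when they were escaped.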

With the older MIME Tools 5.428 the filename is garbled even more, and
ticket creation failed because the invalid UTF-8 sequences can't be
inserted into the PostgreSQL database we use.

With the RT/Attachment_Overlay.pm modification the filename is ok:

--- lib/RT/Attachment_Overlay.pm 2010-05-10 15:36:51.000000000 +0200
+++ local/lib/RT/Attachment_Overlay.pm 2011-02-28 15:49:27.000000000 +0100
@@ -131,7 +131,12 @@
     $MessageId =~ s/^<(.*?)>$/$1/o;
 
     #Get the filename
-    my $Filename = $Attachment->head->recommended_filename;
+    my $Filename;
+    foreach my $attr_name ( qw( content-disposition.filename content-type.name ) ) {
+        $Filename = $Attachment->head->mime_attr( $attr_name );
+        last if defined $Filename && $Filename ne '' && $Filename =~ /\S/;
+    }
+    $Filename ||= $Attachment->head->recommended_filename;
     # MIME::Head doesn't support perl strings well and can return
     # octets which later will be double encoded in low-level code

The modification above is simply the code from the recommended_filename()
method without the decoder. Maybe this is not the right way to handle this…

The complete message with attachment is also attached.

Best Regards
Zito

show-filename (372 Bytes)

msg.behounek.dikritika.mbox (36.2 KB)

recommended_filename.diff (723 Bytes)

Hi,

Forwarded to rt-bugs@bestpractical.com. This needs tests to make sure
things stay fixed. Tests should cover Latin-1 and some non-Latin
encoding. The fix would probably go only into the 4.x branch; such
changes tend to have unexpected side effects.

---------- Forwarded message ----------
From: Václav Ovsík vaclav.ovsik@i.cz
Date: 2011/2/28
Subject: [Rt-devel] rfc2231 Content-Disposition: attachment; filename problem
To: rt-devel@lists.bestpractical.com

Best regards, Ruslan.

show-filename (372 Bytes)

msg.behounek.dikritika.mbox (36.1 KB)

recommended_filename.diff (723 Bytes)