Hello.
sub Attachment::OriginalContent {
    …
    if (!$enc || $enc eq '' || $enc eq 'utf8' || $enc eq 'utf-8') {
        # If we somehow fail to do the decode, at least push out the raw bits
        eval {return( Encode::decode_utf8($content))} || return ($content);
    }
    …
}
decode_utf8 returns a character string, not octets.
perldoc Encode:
$string = decode_utf8($octets [, CHECK]);
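
To make the difference concrete, here is a minimal sketch. The sample octets are my own illustration (four Cyrillic characters encoded as UTF-8), not data from RT:

```perl
use strict;
use warnings;
use Encode ();

# Illustrative data only: U+0442 U+0435 U+0441 U+0442 as UTF-8 octets (8 bytes).
my $octets = "\xD1\x82\xD0\xB5\xD1\x81\xD1\x82";

# decode_utf8 turns the octets into a character string ...
my $string = Encode::decode_utf8($octets);

# ... and encode_utf8 turns that string back into octets.
my $roundtrip = Encode::encode_utf8($string);

printf "octets: %d, characters: %d, utf8 flag: %s\n",
    length($octets), length($string),
    ( Encode::is_utf8($string) ? "on" : "off" );
```

So a caller that expects the original bytes gets something different back once decode_utf8 has been applied.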
Best regards. Ruslan.
One thing RT shouldn't do at all is call Encode::from_to without a CHECK
argument. Encode then uses the default, 0.
perldoc Encode:
If CHECK is 0, (en|de)code will put a substitution character in place of
a malformed character.
code example:
perl -e 'my $str = "\x{442}\x{435}\x{441}\x{442}"; require Encode;
Encode::_utf8_off($str); Encode::from_to($str, "utf8"=> "us-ascii");
print $str;'
The output is '????' instead of a croak. RT doesn't like to die, but the
current behaviour hides bugs. Now we have corrupted data in the DB.
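
For comparison, a rough sketch of the same conversion with an explicit CHECK value. It assumes the installed Encode accepts a fourth CHECK argument to from_to (current versions document one), and the variable names are mine, not RT's:

```perl
use strict;
use warnings;
use Encode ();

# The same four Cyrillic characters as in the one-liner, as raw UTF-8 octets.
my $lenient = "\xD1\x82\xD0\xB5\xD1\x81\xD1\x82";
my $strict  = $lenient;

# Default CHECK (0): unmappable characters are silently substituted.
Encode::from_to( $lenient, "utf8" => "us-ascii" );

# CHECK = FB_CROAK: the conversion dies, so eval can catch it and log it.
my $ok = eval {
    Encode::from_to( $strict, "utf8" => "us-ascii", Encode::FB_CROAK() );
    1;
};
my $error = $ok ? '' : $@;

print "lenient result: $lenient\n";
print "strict conversion failed: $error" unless $ok;
```

With FB_CROAK the failure is visible and the input is left as it was, so the bad conversion never reaches the DB.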
Code path in RT that triggers this bug:
sub RT::I18N::SetMIMEEntityToEncoding {
    …
    eval {
        $RT::Logger->debug("Converting '$charset' to '$enc' for ".
            $head->mime_type . " - ". $head->get('subject'));
        # NOTE:: see the comments at the end of the sub.
        Encode::_utf8_off( $lines[$_] ) foreach ( 0 .. $#lines );
        Encode::from_to( $lines[$_], $charset => $enc ) for ( 0 .. $#lines );
    };
    …
}
Ruslan U. Zakirov wrote:
Hello.
More on UTF-8 crap.
RT converts message/rfc822 mails:
sub RT::I18N::SetMIMEEntityToEncoding {
    …
    return unless ($head->mime_type =~ qr{^(text/plain|message/rfc822)$}i);
    …
}
but
sub RT::Attachment::OriginalContent {
    my $self = shift;
    return $self->Content unless $self->ContentType eq 'text/plain';
    …
}
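
A small sketch of the mismatch, with the two checks reduced to stand-alone predicates. The variable and sub names here are made up for illustration and are not RT code:

```perl
use strict;
use warnings;

# The test from SetMIMEEntityToEncoding: these types get converted.
my $set_encoding_re = qr{^(text/plain|message/rfc822)$}i;

# The test from OriginalContent: only this type is handled specially.
sub original_content_handles { return $_[0] eq 'text/plain' }

# Types that are converted by the first check but skipped by the second.
my @mismatched =
    grep { $_ =~ $set_encoding_re && !original_content_handles($_) }
    ( 'text/plain', 'message/rfc822', 'text/html' );

print "converted but not handled by OriginalContent: @mismatched\n";
```

Note also that the first check is case-insensitive while the second is an exact string comparison.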
Best regards. Ruslan.