OriginalContent can return string with utf8 flag on

Hello.
sub Attachment::OriginalContent {

    if (!$enc || $enc eq '' || $enc eq 'utf8' || $enc eq 'utf-8') {
        # If we somehow fail to do the decode, at least push out the raw bits
        eval {return( Encode::decode_utf8($content))} || return ($content);
    }

}

decode_utf8 returns a character string (with the UTF-8 flag on), not octets.

perldoc Encode:
$string = decode_utf8($octets [, CHECK]);
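
To make the difference concrete, here is a minimal sketch (not RT code; the sample bytes are my own choice) that decodes UTF-8 octets and then encodes the result back. It assumes only the standard Encode module.

```perl
use strict;
use warnings;
use Encode ();

# Raw UTF-8 octets for a four-letter Cyrillic word (8 bytes, no UTF-8 flag).
my $octets = "\xD1\x82\xD0\xB5\xD1\x81\xD1\x82";

# decode_utf8 turns octets into a character string with the UTF-8 flag on.
my $string = Encode::decode_utf8($octets);

# encode_utf8 goes the other way and yields octets again.
my $back = Encode::encode_utf8($string);

printf "octets: length=%d flag=%d\n", length($octets), Encode::is_utf8($octets) ? 1 : 0;
printf "string: length=%d flag=%d\n", length($string), Encode::is_utf8($string) ? 1 : 0;
printf "back:   length=%d flag=%d\n", length($back),   Encode::is_utf8($back)   ? 1 : 0;
```

So a caller that expects octets from OriginalContent would have to encode the value again before writing it anywhere byte-oriented.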

				Best regards. Ruslan.

One thing RT shouldn’t do at all is call Encode::from_to the way it does
now, without a CHECK argument. Encode then uses the default, 0.

perldoc Encode:
If CHECK is 0, (en|de)code will put a substitution character in place of
a malformed character.

code example:
perl -e 'my $str = "\x{442}\x{435}\x{441}\x{442}"; require Encode;
Encode::_utf8_off($str); Encode::from_to($str, "utf8" => "us-ascii");
print $str;'
The output would be ‘????’ (one substitution character per input
character) instead of a croak. RT doesn’t like to die, but the current
behaviour hides bugs. Now we have corrupted data in the DB.
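
For comparison, a rough sketch of a conversion that refuses to substitute. The helper name strict_convert is hypothetical and not part of RT; it only assumes that Encode::decode and Encode::encode accept a CHECK argument, with FB_CROAK making them die on bad data.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of RT): convert octets from one encoding to
# another and die on any character that cannot be decoded or represented.
sub strict_convert {
    my ($octets, $from, $to) = @_;
    my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();
    my $chars = Encode::decode($from, $octets, $check);
    return Encode::encode($to, $chars, $check);
}

my $octets = "\xD1\x82\xD0\xB5\xD1\x81\xD1\x82";   # Cyrillic text as UTF-8 octets

# Default CHECK of 0: the data is silently replaced.
my $lossy = $octets;
Encode::from_to($lossy, "utf8", "us-ascii");

# Strict variant: the caller sees the failure and can keep the original.
my $strict = eval { strict_convert($octets, "utf8", "us-ascii") };
my $error  = $@;

print "lossy:  $lossy\n";
print "strict: ", (defined $strict ? $strict : "failed: $error"), "\n";
```

With something like this the caller can log the error and store the untouched octets instead of question marks.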

The code path in RT that triggers this bug:
sub RT::I18N::SetMIMEEntityToEncoding {

    eval {
        $RT::Logger->debug("Converting '$charset' to '$enc' for ".
            $head->mime_type . " - ". $head->get('subject'));

        # NOTE:: see the comments at the end of the sub.
        Encode::_utf8_off( $lines[$_] ) foreach ( 0 .. $#lines );
        Encode::from_to( $lines[$_], $charset => $enc ) for ( 0 .. $#lines );
    };

}

Ruslan U. Zakirov wrote:

Hello.

More on UTF-8 crap.

RT converts message/rfc822 mails:
sub RT::I18N::SetMIMEEntityToEncoding {

return unless ($head->mime_type =~ qr{^(text/plain|message/rfc822)$}i);

}

but OriginalContent only handles text/plain:

sub RT::Attachment::OriginalContent {
my $self = shift;

    return $self->Content unless $self->ContentType eq 'text/plain';

}
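
A small sketch of the mismatch, using a hypothetical predicate (is_reencoded_type is my name, not RT's) that mirrors the regex from SetMIMEEntityToEncoding and compares it with the plain eq test from OriginalContent:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RT): true for the content types that
# SetMIMEEntityToEncoding converts, per the regex quoted above.
sub is_reencoded_type {
    my ($type) = @_;
    return 0 unless defined($type);
    return $type =~ m{^(?:text/plain|message/rfc822)$}i ? 1 : 0;
}

for my $type ('text/plain', 'message/rfc822', 'text/html') {
    my $converted_on_store = is_reencoded_type($type);
    my $restored_on_read   = $type eq 'text/plain' ? 1 : 0;   # OriginalContent's test
    printf "%-15s converted=%d restored=%d\n",
        $type, $converted_on_store, $restored_on_read;
}
```

message/rfc822 is the one row where the two columns disagree: it is converted on the way in but returned as-is on the way out.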

			Best regards. Ruslan.

Ruslan U. Zakirov wrote:

One thing RT shouldn’t do at all is call Encode::from_to the way it does
now, without a CHECK argument. Encode then uses the default, 0.

perldoc Encode:
If CHECK is 0, (en|de)code will put a substitution character in place of
a malformed character.

[snip]

The code path in RT that triggers this bug:
sub RT::I18N::SetMIMEEntityToEncoding {

eval {

        Encode::from_to( $lines[$_], $charset => $enc ) for ( 0 .. $#lines );
};

Ah, and for your information: with the default CHECK of 0, Encode::from_to
does not fail on malformed or unmappable data, so the eval around it
catches nothing here.
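
A quick sketch to show what the eval does and does not catch. It is not RT code, and the encoding name "no-such-encoding" is made up for the demonstration. As far as I can tell, the only thing from_to dies on here is an encoding name it cannot resolve.

```perl
use strict;
use warnings;
use Encode ();

my $line = "\xD1\x82\xD0\xB5\xD1\x81\xD1\x82";   # Cyrillic text as UTF-8 octets

# Unmappable characters: from_to substitutes, nothing is thrown.
my $ok = eval {
    Encode::from_to($line, "utf8", "us-ascii");
    1;
};
print "conversion: eval ", ($ok ? "saw no error" : "caught: $@"),
      ", line is now '$line'\n";

# Unknown encoding name: this is the case where from_to does die.
my $other  = "abc";
my $ok_enc = eval {
    Encode::from_to($other, "utf8", "no-such-encoding");
    1;
};
my $enc_error = $@;
print "bad name:   eval ", ($ok_enc ? "saw no error" : "caught: $enc_error");
```

So the eval in SetMIMEEntityToEncoding protects against a bogus charset name, but not against the data loss itself.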