Creating tickets with non-ascii subject (and/or postgresql)

When I create a ticket in the Web Interface and the Subject contains a
non-ascii character: “a é b” (that is: a + LATIN SMALL LETTER E WITH
ACUTE + b) then RT produces the error:

‘Ticket could not be created due to an internal error’

Looking at the log file:

[Mon Nov 30 17:35:41 2009] [warning]: DBD::Pg::st execute failed:
ERROR: invalid byte sequence for encoding “UTF8”: 0xe92062
HINT: This error can also happen if the byte sequence does not match
the encoding expected by the server, which is controlled by
"client_encoding". at /usr/share/perl5/DBIx/SearchBuilder/Handle.pm
line 532, line 301.
(/usr/share/perl5/DBIx/SearchBuilder/Handle.pm:532)
[Mon Nov 30 17:35:41 2009] [warning]: RT::Handle=HASH(0xb06ee58)
couldn’t execute the query ‘INSERT INTO Attachments (Subject,
ContentType, Headers, Creator, MessageId, Parent, Created, Transaction
Id) VALUES (?, ?, ?, ?, ?, ?, ?, ?)’ at
/usr/share/perl5/DBIx/SearchBuilder/Handle.pm line 545
[…]

What happened:
The PostgreSQL database is set to UTF8, and an insert was attempted
with the byte \xe9 (Latin-1) instead of \xc3\xa9 (UTF-8).
On its own, \xe9 is not valid UTF-8, so the insert produces an error.
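
To make the byte difference concrete, here is a minimal sketch using only core Perl and Encode. The variable names and the hex_bytes helper are made up for illustration; nothing here is RT code.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02x', ord($_) } split //, $bytes;
}

# The subject as a Perl character string: a + U+00E9 + b.
my $subject = "a \x{e9} b";

my $latin1 = encode('ISO-8859-1', $subject);   # one byte for U+00E9
my $utf8   = encode('UTF-8',      $subject);   # two bytes for U+00E9

print hex_bytes($latin1), "\n";   # 61 20 e9 20 62
print hex_bytes($utf8),   "\n";   # 61 20 c3 a9 20 62

# utf8::decode returns false when the octets are not well-formed UTF-8.
my $copy  = $latin1;
my $valid = utf8::decode($copy);
print "Latin-1 bytes are ", ($valid ? "valid" : "not valid"), " UTF-8\n";
```

In UTF-8, 0xe9 announces a three-byte sequence, so the server reads it together with the following space and "b", which matches the 0xe92062 reported in the log.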

What does work: creating the ticket with the subject: “a b” and then
changing it to “a é b”

Digging in the code:

RT::Attachments_Overlay::Create contains:
# MIME::Head doesn’t support perl strings well and can return
# octets which later will be double encoded in low-level code
my $head = $Attachment->head->as_string;
utf8::decode( $head );

Dumping the contents of the variable $head (using Devel::Peek) before
and after the decode:

SV = PV(0xbf57fb0) at 0xa9fd9d8
REFCNT = 1
FLAGS = (PADMY,POK,pPOK,UTF8)
PV = 0xbf6c938 “MIME-Version: 1.0\nX-Mailer: MIME-tools 5.427
(Entity 5.427)\nSubject: a \303\251 b\nContent-Type: text/plain\n”\0
[UTF8 “MIME-Version: 1.0\nX-Mailer: MIME-tools 5.427 (Entity
5.427)\nSubject: a \x{e9} b\nContent-Type: text/plain\n”]
CUR = 101
LEN = 104

SV = PV(0xbf57fb0) at 0xa9fd9d8
REFCNT = 1
FLAGS = (PADMY,POK,pPOK)
PV = 0xbf6c938 “MIME-Version: 1.0\nX-Mailer: MIME-tools 5.427
(Entity 5.427)\nSubject: a \351 b\nContent-Type: text/plain\n”\0
CUR = 100
LEN = 104
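
The effect shown in the two dumps can be reproduced outside RT. This is a small sketch, assuming a header string that is already a Perl character string with the UTF8 flag set; the variable names are illustrative only.

```perl
use strict;
use warnings;

# A header that is already decoded: U+00E9 as a character.
my $head = "Subject: a \x{e9} b\n";
utf8::upgrade($head);    # force the internal UTF-8 representation (flag on)

my $before = utf8::is_utf8($head) ? 1 : 0;

# utf8::decode first downgrades the string to bytes, then checks whether
# those bytes are well-formed UTF-8. "\xe9 b" is not, so it returns false
# and leaves the string downgraded, without the UTF8 flag.
my $ok = utf8::decode($head) ? 1 : 0;

my $after = utf8::is_utf8($head) ? 1 : 0;

print "flag before: $before, decode ok: $ok, flag after: $after\n";
# prints "flag before: 1, decode ok: 0, flag after: 0"
```

The characters in $head are unchanged, but its internal buffer now holds the single byte \351, which is what ends up being sent to the database if a driver passes that buffer through as-is.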

Looking in git to find some history of this:

commit 0e92634d782383f5c64bece63962f1eb361f96fb (2008-08-01)
added the code:
my $head = $Attachment->head->as_string;
utf8::decode( $head );

commit 0d14f4e5a6a36597300559b8efe1716989cae61d (2009-05-01)
modified the code to:
my $head = Encode::decode_utf8($Attachment->head->as_string);

commit 6f1f370a28146902391a5aa0e6aca3e6027d9b9a (2009-08-26)
reverted the change and changed it back to:
my $head = $Attachment->head->as_string;
utf8::decode( $head );

The reason for doing this was apparently:
http://rt3.fsck.com/Ticket/Display.html?id=13278

The result seems to be that tickets containing a non-ASCII
character in the subject cannot be created when PostgreSQL is used
and the database is set to UTF8.

Does anyone remember why all this is necessary?

Best regards,

Bram

When I create a ticket in the Web Interface and the Subject contains a
non-ascii character: “a é b” (that is: a + LATIN SMALL LETTER E WITH
ACUTE + b) then RT produces the error:

‘Ticket could not be created due to an internal error’

I cannot reproduce this with 3.8.6 and postgresql 8.3 here, which
version are you running, which browser ?
Are you sure you set HTTP charset to utf-8 in apache?

Quoting Emmanuel Lacour elacour@easter-eggs.com:

> On Mon, Nov 30, 2009 at 03:04:11PM +0100, Bram wrote:

When I create a ticket in the Web Interface and the Subject contains a
non-ascii character: “a é b” (that is: a + LATIN SMALL LETTER E WITH
ACUTE + b) then RT produces the error:

‘Ticket could not be created due to an internal error’

I cannot reproduce this with 3.8.6 and postgresql 8.3 here, which
version are you running, which browser ?
Are you sure you set HTTP charset to utf-8 in apache?

  • using RT 3.8.2
  • postgresql 8.3.8
  • \l in postgres shows:
    rt3 | postgres | UTF8

I’m certain that the http charset is set to utf-8 (verified with
tcpdump), browser is Opera and I’m certain that the browser sends it
as utf-8 (verified with the output of Devel::Peek)

Best regards,

Bram

Hi,

I’ve encountered the exact same problem with a u umlaut (ü) in the subject
when creating a ticket in the Web UI.
RT 3.8.2, PostgreSQL 8.3.0 encoding UTF8.
I used Bram’s pointer to RT::Attachment::Create code and found that the
Encode::decode_utf8 variation (that existed in RT 3.8.4 but not in 3.8.2 or
3.8.6) fixes the problem. Thanks Bram!
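
For anyone who wants to experiment, a guarded decode can be sketched as below. The helper name decode_header_octets is hypothetical and this is not the code from any RT release; it only assumes that a string with the UTF8 flag set has already been decoded, which is a heuristic rather than a guarantee.

```perl
use strict;
use warnings;

# Hypothetical helper: decode UTF-8 octets into characters, but leave
# strings alone that already carry the UTF8 flag.
sub decode_header_octets {
    my ($head) = @_;
    return $head if utf8::is_utf8($head);   # assumed to be characters already
    my $copy = $head;
    utf8::decode($copy);                    # left unchanged if not valid UTF-8
    return $copy;
}

# Case 1: raw UTF-8 octets, as MIME::Head might return them.
my $from_octets = decode_header_octets("Subject: a \xc3\xa9 b\n");

# Case 2: an already decoded character string.
my $chars = "Subject: a \x{e9} b\n";
utf8::upgrade($chars);
my $from_chars = decode_header_octets($chars);

print "same result\n" if $from_octets eq $from_chars;
```

Both inputs end up as the same character string with the UTF8 flag still set, so the flagged input is not downgraded to Latin-1 bytes.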

I opened a bug for this: http://rt3.fsck.com/Ticket/Display.html?id=14214.

Eynat

From: Emmanuel Lacour [mailto:elacour@easter-eggs.com]
Sent: Monday, 30 November 2009 4:20 PM
To: rt-devel@lists.bestpractical.com
Subject: Re: [Rt-devel] Creating tickets with non-ascii subject (and/or
postgresql)

I updated the ticket http://rt3.fsck.com/Ticket/Display.html?id=14214
with the steps on how to reproduce this. (Also updated the ticket with
the reason why this is happening.)

Best regards,

Bram

Quoting Eynat Nir Mishor eynatnirmishor@gmail.com: