Accent (diacritic) in attachment name -> diacritic in transaction subject broken

Hi,

I have a problem with the encoding of diacritics (accented, non-English
characters) in Czech.

The subject of a message with diacritics is broken when I attach a file
whose name also contains diacritics. In my test case I did the following:

  • attached PŘÍLIŠŽLUŤOUČKÝKŮŇÚPĚLĎÁBELSKÉKÓDY.bin with any content

  • filled subject PŘÍLIŠŽLUŤOUČKÝKŮŇÚPĚLĎÁBELSKÉKÓDY

  • sent it as Correspond or Comment

  • subject in RT frontend is set to:
    PŘÍLIŠŽLUŤOUČKÝKŮŇÚPĚLĎÁBELSKÉKÓDY

  • subject in Correspond e-mail notification is wrong:
    PŘÍLIŠŽLUŤOUČKÝ KŮŇ ÚPĚL ĎÁBELSKÉ KÓDY

or encoded (equal sign is placed for space dividing only):
Subject:
=?UTF-8?B?W1RERyAjNDE1M10gUMOFwpjDg8KNTEnDhSAgw4XCvUxVw4XCpE9Vw4TCjEs=?=
=?UTF-8?B?w4PCnSBLw4XCrsOFwocgw4PCmlDDhMKaTCDDhMKOw4PCgUJFTFNLw4PCiSBL?=
=?UTF-8?B?w4PCk0RZ?=

  • subject in Comment e-mail notification is OK

  • Data on transaction in Transaction table is OK

  • in one transaction I saw the opposite behavior: the subject was OK and
    the attachment name was garbled. I have not been able to reproduce this
    yet.
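
A minimal sketch for inspecting the encoded Subject quoted above, written in
Python rather than RT's Perl. It decodes the three base64 encoded-words and
then undoes one extra layer of encoding, on the assumption (not confirmed in
this message) that the text was encoded to UTF-8 twice, with the UTF-8 bytes
misread as Latin-1 in between. The names ENCODED_WORDS and payload_bytes are
made up for this illustration.

```python
import base64

# Encoded-words copied verbatim from the Subject header quoted above.
ENCODED_WORDS = [
    "=?UTF-8?B?W1RERyAjNDE1M10gUMOFwpjDg8KNTEnDhSAgw4XCvUxVw4XCpE9Vw4TCjEs=?=",
    "=?UTF-8?B?w4PCnSBLw4XCrsOFwocgw4PCmlDDhMKaTCDDhMKOw4PCgUJFTFNLw4PCiSBL?=",
    "=?UTF-8?B?w4PCk0RZ?=",
]


def payload_bytes(word: str) -> bytes:
    """Return the raw bytes carried by one RFC 2047 base64 encoded-word."""
    if not (word.startswith("=?") and word.endswith("?=")):
        raise ValueError("not an encoded-word: %r" % word)
    charset, encoding, text = word[2:-2].split("?", 2)
    if encoding.upper() != "B":
        raise ValueError("only base64 ('B') encoded-words are handled here")
    return base64.b64decode(text)


# Whitespace between adjacent encoded-words is not part of the text,
# so the payloads are simply concatenated.
raw = b"".join(payload_bytes(w) for w in ENCODED_WORDS)

# What a mail client shows: the payload decoded as UTF-8 once.
once = raw.decode("utf-8")

# Assumption: the payload is UTF-8 applied twice. Mapping every character
# back to a single Latin-1 byte and decoding as UTF-8 again undoes one layer.
# errors="replace" keeps going if a byte did not survive the round trip.
repaired = once.encode("latin-1").decode("utf-8", errors="replace")

print(repr(once))
print(repaired)
```

If the second line printed is readable Czech, that would suggest the subject
was converted to UTF-8 one time too many somewhere on the Correspond
notification path. It does not show where in RT that happens.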

RT is installed on CentOS 5.8 with this environment:

  • RT 4.0.5 upgraded from 3.8.7 (+ RT::Extension::CommandByMail)
    • logging is set to notice and is without any anomalies
  • Apache + mod_perl2 (AddDefaultCharset UTF-8)
  • MySQL 5.0.95 (DB and tables collation set to utf8_general_ci)
  • Perl 5.8.8 (distribution package + additional required modules
    installed through CPAN)

I tried updating every Perl module RT depends on (according to the
output of make testdeps) and clearing the Mason object cache, without
success.

I tried to reproduce this issue in the test queue of
issues.bestpractical.com without success (ticket 19692), so I suppose
this is either a bug or a mistake in my CentOS environment.

The legwork suggested in the RT wiki and on the mailing list did not help
me find a solution.

I would appreciate any suggestions.

Thanks in advance.

Regards,
Pavel Sidlo
e-mail: pavel.sidlo@topdigital.cz

Hi Stephen,

thank you for your answer.

Diacritics in the subject are OK for every message (= transaction) except
those that have an attachment with diacritics in its filename. For example,
all messages with diacritics in the ticket display correctly; the subject
is broken only in messages that have diacritics both in the subject and in
the attachment name.

Nearly all communication in our RT tickets is in Czech, so it usually
contains diacritics. Therefore I am almost sure this is not a problem
between the web server and the web browser.

The RT frontend is displayed in Czech with diacritics and shows all
characters correctly. I also checked the browser's encoding: it selects
UTF-8 correctly, both with charset autodetection and when forced to UTF-8.

Regards,

Pavel Sidlo
e-mail: pavel.sidlo@topdigital.cz

On 1.4.2012 22:12, Stephen J Alexander wrote:

I believe this bug was recently fixed on the branch
4.0/unicode-transaction-subjects
https://github.com/bestpractical/rt/compare/4.0-trunk...4.0/unicode-transaction-subjects.

The branch hasn’t been merged yet, pending some discussion of a larger
fix. However, you can try applying the following patch and see if it
fixes your problem:

curl https://github.com/bestpractical/rt/commit/57ea0c02.patch \
   | patch -p1 -d /opt/rt4

You’ll need to restart RT after the patch succeeds.
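
One way to confirm the result after the restart is to look at the raw
Subject header of a fresh Correspond notification. The sketch below is a
heuristic check built on Python's standard email.header module; it is not
something shipped with RT, and the function names are hypothetical. It flags
text that is still valid UTF-8 after being pushed back through Latin-1 and
that changes in the process, which is typical of text encoded to UTF-8
twice.

```python
from email.header import decode_header, make_header


def decoded_subject(raw_header: str) -> str:
    """Decode an RFC 2047 Subject value into a plain string."""
    return str(make_header(decode_header(raw_header)))


def looks_double_encoded(text: str) -> bool:
    """Heuristic: True if text reads like UTF-8 that was encoded twice."""
    try:
        repaired = text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Correct Czech text ends up here: letters such as U+016E do not
        # fit in Latin-1, so it cannot be the product of double encoding.
        return False
    # Plain ASCII round-trips unchanged and is not flagged.
    return repaired != text


if __name__ == "__main__":
    # Last encoded-word of the broken Subject reported in this thread.
    broken = decoded_subject("=?UTF-8?B?w4PCk0RZ?=")
    print(repr(broken), looks_double_encoded(broken))
```

This is only a quick sanity check: legitimate text that happens to consist
of Latin-1 characters forming valid UTF-8 sequences would be flagged as
well.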

Thomas

Hi Thomas,

Applying the patch solved the problem.

Thanks a lot for your help.

Best regards,

Pavel Sidlo
e-mail: pavel.sidlo@topdigital.cz

On 3.4.2012 16:51, Thomas Sibley wrote:
> On 04/01/2012 02:46 PM, Pavel Šidlo wrote:
