RT 4.4 LDAPImport character encoding mismatch


I just wanted to share a fix for an issue that has been reported in a few places, but for which no obvious fix was available.

We use LDAPImport in an AD environment to import/update user accounts whose names sometimes contain non-ASCII accented characters. Ever since we started using the LDAPImport extension back in 4.0, the string comparison between the database value and the LDAP value has always produced a mismatch for those accounts. The LDAP data would appear as double-encoded UTF-8, and that would trigger a superfluous transaction to update the user field in the database on every import. The result over the years has been millions upon millions of transaction rows with identical NewValue and OldValue.
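To illustrate what "double-encoded" means here, this is a minimal standalone sketch using only the core Encode module. It does not use RT or Net::LDAP, and the sample name and variable names are made up for the example. I have not traced where exactly RT ends up encoding a second time, so treat this as an illustration of the symptom rather than of RT's internals.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical sample name, as a Perl character string (e-acute is U+00E9).
my $name = "Ren\x{e9}e";

# The UTF-8 bytes a directory server would send for that name.
my $ldap_bytes = encode('UTF-8', $name);

# If those bytes are mistaken for characters and encoded again, every
# non-ASCII byte turns into two bytes: that is double-encoded UTF-8.
my $double = encode('UTF-8', $ldap_bytes);

# Decoding the raw bytes once gives back the original character string,
# so a comparison against the stored value would match.
my $decoded = decode('UTF-8', $ldap_bytes);

printf "single-encoded length: %d\n", length $ldap_bytes;
printf "double-encoded length: %d\n", length $double;
printf "decoded value matches: %s\n", ($decoded eq $name ? 'yes' : 'no');
printf "double value matches:  %s\n", ($double eq $ldap_bytes ? 'yes' : 'no');
```

The two byte strings differ even though they represent the same name, which is the kind of mismatch that makes an importer believe the field has changed.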

Well, after experimenting with different LDAP options in the RT configuration, I found the following fix:

Set($LDAPOptions, [ raw => qr/(\;binary)/ ]);

The explanation is in the Net::LDAP documentation for the raw option of the constructor:

When this option is given, Net::LDAP converts all values of attributes not matching this REGEX into Perl UTF-8 strings so that the regular Perl operators (pattern matching, …) can operate as one expects even on strings with international characters.
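For context, here is a sketch of how the setting might sit in RT_SiteConfig.pm. The host name is a placeholder, and the broader pattern in the comment is only an assumption about what a site with photo attributes might prefer. As far as I understand it, LDAPImport passes the contents of $LDAPOptions through to Net::LDAP->new, which is why a constructor option like raw can be set this way.

```perl
# RT_SiteConfig.pm (fragment; host name is a placeholder)
Set($LDAPHost, 'ldap.example.com');

# Extra options handed to Net::LDAP->new. With raw set, only attributes
# whose names match the regex stay as raw bytes; all other values are
# converted to Perl UTF-8 strings.
Set($LDAPOptions, [ raw => qr/(\;binary)/ ]);

# Possible alternative if jpegPhoto should also stay binary (assumption):
# Set($LDAPOptions, [ raw => qr/(?i:^jpegPhoto|;binary)/ ]);
```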

Perhaps the RT developers would consider making this the default setting, to make the product a little friendlier for non-English environments.