Full text indexing failure (invalid byte sequence for encoding "UTF8")

We’re currently running RT 4.0.5-3~bpo60+1 (from Debian backports) with
Postgresql 8.4.12-0squeeze1.

Recently I tried to enable full text search following the instructions
here:

http://blog.bestpractical.com/2011/06/full-text-searching.html

…but ran into this error an hour into the initial “rt-fulltext-indexer
--all”:

[crit]: error: ERROR:  invalid byte sequence for encoding "UTF8": 0xfc
HINT:  This error can also happen if the byte sequence does not
  match the encoding expected by the server, which is controlled by
  "client_encoding". at /usr/sbin/rt-fulltext-indexer-4 line 375.
  (/usr/share/request-tracker4/lib/RT.pm:351)

Subsequent runs of the same command end with the same error.

The encoding for the rt4 db has been set to utf8 for as long as I can
recall. I assume this relates to some data inserted into the db ages
ago when client_encoding was something other than utf8, or in a previous
version of postgresql which might have been less stringent about input.
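For illustration, here is a minimal Python sketch (not taken from RT or PostgreSQL) of why the byte 0xfc named in the error gets rejected: on its own it is not valid UTF-8, whereas in Latin-1 it is simply "ü". The helper name `find_invalid_utf8` and the sample string are hypothetical.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, byte value) of the first byte that breaks UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start]
    return None


# "Gruesse" with u-umlaut and sharp s, written with escapes to stay ASCII-safe.
text = "Gr\u00fc\u00dfe"

# Encoded as Latin-1, the u-umlaut becomes the single byte 0xfc.
latin1_bytes = text.encode("latin-1")
# Encoded as UTF-8, the same character becomes the two bytes 0xc3 0xbc.
utf8_bytes = text.encode("utf-8")

print(find_invalid_utf8(latin1_bytes))  # offset and value of the offending byte
print(find_invalid_utf8(utf8_bytes))    # None: well-formed UTF-8
```

Data of the first kind, stored in a database that is declared as UTF-8, would be consistent with the hypothesis above.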

There is a FAQ about ‘invalid byte sequence for encoding’ but I’m not
sure that this is the same issue.

Anyone else been through this sort of issue? Would it be better to take
the question to a postgresql list?

Ben

pub 4096R/318B6A97 2009-05-11 Ben Poliakoff benp@reed.edu
Primary key fingerprint: 3F23 EBC8 B73E 92B7 0A67 705A 8219 DCF0 318B 6A97

This is fixed in RT 4.0.9 and above, which resolve the issue by skipping
the attachment with bad data. RT 4.0.7 and above are also better about not
trusting emails which claim to be “utf-8”, which prevents the bad data
from getting in in the first place. That is the likely cause here, and
something older Pg versions allowed.

  • Alex
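A rough Python sketch of the two behaviours described above, assuming they amount to (a) validating a body that claims to be "utf-8" before storing it and (b) skipping records that fail to decode while indexing. This is not RT's actual code; `decode_declared`, `index_all`, and the Latin-1 fallback are hypothetical choices made only for illustration.

```python
def decode_declared(body: bytes, declared: str = "utf-8", fallback: str = "latin-1") -> str:
    """Decode a message body without blindly trusting the declared charset."""
    try:
        return body.decode(declared)
    except (UnicodeDecodeError, LookupError):
        # The declared charset was wrong or unknown. latin-1 maps every byte,
        # so this fallback never raises (though the text may be mis-rendered).
        return body.decode(fallback)


def index_all(attachments):
    """Index attachments that are valid UTF-8; collect the ids of those skipped."""
    indexed, skipped = {}, []
    for att_id, raw in attachments:
        try:
            indexed[att_id] = raw.decode("utf-8")
        except UnicodeDecodeError:
            # Skip the bad row instead of aborting the whole run.
            skipped.append(att_id)
    return indexed, skipped


attachments = [
    (1, b"plain ascii"),
    (2, b"Gr\xfc\xdfe"),                      # Latin-1 bytes mislabelled as UTF-8
    (3, "Gr\u00fc\u00dfe".encode("utf-8")),   # genuine UTF-8
]
indexed, skipped = index_all(attachments)
print(sorted(indexed), skipped)
print(decode_declared(b"Gr\xfc\xdfe"))
```

The point of the second function is only that one undecodable attachment no longer stops the whole indexing pass.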

The good news is that Debian backports now has 4.0.7 (I’ve just
uploaded 4.0.7-4~bpo60+1 which has a few extra fixes compared to
4.0.7-2~bpo60+1). The bad news is that since Debian is in freeze,
4.0.9 or above won’t be hitting Debian any time soon (except possibly
experimental, if someone asks nicely). :)

However, I do encourage people who are using the Debian packages to
report bugs that affect them to the BTS even if they are fixed in newer
upstream releases; if they seem serious enough, it’s still possible
to fix important bugs in Debian before the release.

Dominic Hargreaves, Systems Development and Support Section
IT Services, University of Oxford

On Fri, 2013-02-01 at 17:03 -0800, Ben Poliakoff wrote:

Thanks for the replies, Alex and Dominic. I’ll plan on updating to
4.0.7-4 soon, looking forward to 4.0.9!

Ben

And 4.0.10 is out now. :)

On Thu, Mar 21, 2013 at 01:32:56PM +0100, linux@muellers.ms wrote:

Hello Dominic,

Thanks a lot for your work!

I am running 4.0.7-4 fine on Debian 7.0 wheezy.

But what I am missing is the possibility to have multiple instances (prod/test)
and multiple versions, e.g. I am now very much interested in 4.0.10, but of course I first want to test it.

Is there perhaps an unofficial 4.0.10 deb repository somewhere?
Is it complicated to build a new deb if there is a new upstream version?

About the multiple versions and instances, the Open Monitoring Distribution
http://omdistro.org/ solved this very nicely for nagios and friends;
I wish something like that existed for RT as well.

jo

I am planning to upload 4.0.10 to experimental in the next week or
so. As for multiple concurrent versions, this isn’t something the
Debian packages are suited to. If you have a need for multiple concurrent
versions, you will need to look at other options. On the other hand,
as long as you aren’t using mod_perl in the same apache instance, you
can quite happily run multiple instances of the same version;
just make sure the configurations point to different databases
and cache locations.

Dominic Hargreaves, Systems Development and Support Section
IT Services, University of Oxford

On 03/23/2013 09:16 PM, Dominic Hargreaves wrote:

As for multiple concurrent versions, this isn’t something the
Debian packages are suited to.

Agreed; it can work, but it’s a pain. You can see this in the PostgreSQL
packages for example - they handle concurrent versions reasonably well
via the alternatives system and versioned config and data directories,
but it does cause some confusion for users and admins. In the case of
PostgreSQL it’s absolutely necessary to enable low-pain upgrades so it’s
worth the price in complexity, but this probably isn’t true for RT.

Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Yes, between major releases, but that doesn’t let you run any arbitrary
combinations of versions. We’re doing much the same in Debian
(request-tracker3.6, request-tracker3.8, request-tracker4), which I
should have mentioned, although I think we’re hoping that the 4.x
series will be maintained all in one package namespace to make things
easier for users.

Cheers,
Dominic.

Dominic Hargreaves, Systems Development and Support Section
IT Services, University of Oxford

On 03/25/2013 12:41 AM, Dominic Hargreaves wrote:

Yes, between major releases, but that doesn’t let you run any arbitrary
combinations of versions.

Ah, good point. I misunderstood. Yes, the only good way to allow
arbitrary combinations of versions is with a second install in a VM.
Since that handily prevents all sorts of other nasty problems with
combining test and live systems, that’s the approach I favour anyway.

Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

4.0.10 packages have been uploaded to Debian, but because they include
a new binary package (for the HTML documentation - BPS, thanks for the
handy devel script for generating this!) they are sitting in a queue for
manual processing.

If you’re impatient and happy with building your own package, have a look
at the debian/4.0.10-1 tag in the repository at

http://anonscm.debian.org/gitweb/?p=pkg-request-tracker/request-tracker4.git

Cheers,
Dominic.

Dominic Hargreaves, Systems Development and Support Section
IT Services, University of Oxford

A side note: the HTML doc is customized for being displayed as part of a
larger HTML context (http://docs.bestpractical.com), so it’s just HTML
fragments and no styling. It’s somewhat readable on its own though.

++ on 4.0.10 in Debian. :)