Full text indexer issue with img html tag

Hello all,
we are experiencing an issue with the full text indexer. It seems to be
related to an HTML email containing an <img> tag (28398 characters long),
which seems to cause rt-fulltext-indexer to fail to update the full text
index.

We see the following debug messages:
Found attachment #3085
[834] [Wed Mar 30 12:33:32 2016] [warning]: NOTICE: word is too long to be indexed
DETAIL: Words longer than 2047 characters are ignored. (/opt/rt4/sbin/rt-fulltext-indexer:322)
[834] [Wed Mar 30 12:33:32 2016] [warning]: NOTICE: word is too long to be indexed
DETAIL: Words longer than 2047 characters are ignored. (/opt/rt4/sbin/rt-fulltext-indexer:322)
Processed attachment #3085

Does anyone have a better solution than removing the affected ticket?

Peter Viskup

we are experiencing an issue with the full text indexer

What version of RT? That’s a symptom of RT 4.0.9 or earlier; 4.0.10
contains a fix that makes RT simply skip that attachment.

  • Alex

What version of RT? That’s a symptom of RT 4.0.9 or earlier; 4.0.10
contains a fix that makes RT simply skip that attachment.

Actually, my mistake – it was fixed in 4.0.3:

If you’re running pre-4.0.3, you certainly want to upgrade.

  • Alex

Fortunately our version is 4.2.10.
I just discovered that we hadn’t run rt-fulltext-indexer enough times.
It processed only 200 attachments per run. Is that a limitation of
PgSQL or of rt-fulltext-indexer? I didn’t find it in the
documentation.

I ran it in a loop until it no longer reported any newly indexed
attachments. This solved our issue with full text search.
Thank you.
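
For anyone who wants to do the same, the loop can be sketched roughly as below. This is only an illustration, not the exact command we used: the path and the "Processed attachment" line come from the debug output quoted earlier, while the helper names (run_indexer, index_until_done), the max_runs safety cap, and the assumption that the indexer prints those lines in the mode you run it in are all hypothetical. Pass whatever arguments your installation needs through extra_args.

```python
import subprocess
from typing import Callable, Sequence

# Path as it appears in the log output quoted in this thread.
INDEXER = "/opt/rt4/sbin/rt-fulltext-indexer"
# Debug line quoted in this thread; assumed to be printed once per attachment.
MARKER = "Processed attachment"


def run_indexer(extra_args: Sequence[str] = ()) -> str:
    """Run the indexer once and return its combined stdout and stderr."""
    result = subprocess.run(
        [INDEXER, *extra_args],
        capture_output=True,
        text=True,
        check=True,  # raise if the indexer exits with a non-zero status
    )
    return result.stdout + result.stderr


def index_until_done(
    run_once: Callable[[], str] = run_indexer,
    max_runs: int = 1000,  # hypothetical safety cap to avoid looping forever
) -> int:
    """Call run_once repeatedly until a run reports no processed attachments.

    Returns the total number of attachments reported as processed.
    """
    total = 0
    for _ in range(max_runs):
        output = run_once()
        processed = sum(
            1 for line in output.splitlines() if line.strip().startswith(MARKER)
        )
        if processed == 0:
            return total
        total += processed
    raise RuntimeError(f"indexer still reporting work after {max_runs} runs")
```

Calling index_until_done() with the defaults runs the real indexer. Passing a different run_once callable makes it possible to try the loop logic without touching RT.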

Peter Viskup

On Thu, Mar 31, 2016 at 6:49 AM, Alex Vandiver alex@chmrr.net wrote:

Actually, my mistake – it was fixed in 4.0.3:
In Postgres, simply skip attachments whose tsvectors are too large (bestpractical/rt@692b5bc on GitHub)

If you’re running pre-4.0.3, you certainly want to upgrade.

  • Alex

I just discovered that we hadn’t run rt-fulltext-indexer enough times.
It processed only 200 attachments per run. Is that a limitation of
PgSQL or of rt-fulltext-indexer? I didn’t find it in the
documentation.

See
https://docs.bestpractical.com/rt/4.4.0/full_text_indexing.html#Updating-the-index

You can pass --all to index all attachments, which does not appear to be
documented.

  • Alex