I am working on an upgrade from 3.8.x to 4.4.x and I noticed the
following comment in the UPGRADING-4.2 file:
The full-text indexing defaults for PostgreSQL have changed; GiST is now
the suggested index, as well as storing data in a separate
AttachmentsIndex table. Both changes improve lookup speed. For
improved search performance, you may wish to drop existing C
and C indexes on C, and re-generate the index using
I am curious about the benchmarks you used to make this determination?
In particular, the GIN fastscan option can dramatically improve search
performance in versions 9.4 and above. Here is a nice discussion:
In addition, GIN indexes are much smaller in PostgreSQL 9.4 and above.
So I am curious about the data that motivated the change to GiST.
Here is a correspondence between myself and a BP engineer you may find helpful:
I’ve enabled full-text searching on our 4.2 instance.
How often are folks running the sbin/rt-fulltext-indexer to keep the
Since the pg updates are incremental, they are safe to run rather
frequently. Depending on ticket volume, they’ve been run anywhere from
every minute to every hour. Unlike the Sphinx indexes for MySQL,
there’s no need to hold back and run them daily. Indeed, we’ve
considered updating the Pg index in realtime as attachments are created
– it’s sufficiently fast.
We are using Pg 9.1 on the backend with GIN indexing. We get ~100
tickets created per day and I am not sure how many transactions are
created per day.
Any best practices or anecdotal data would be helpful. Also, how
"strong" of a default recommendation is the GiST indexing? Am I being
foolish for choosing GIN?
The GiST recommendation is very much a soft one. We’re not aware of any
particular cases where it has been explicitly required, nor have we done
comprehensive comparisons between the two on RT’s data at scale.
—end of email—
There is also some good documentation on the Pg site regarding the