Fulltext Indexing in PostgreSQL 9.6/10


#1

Hi,

I have been looking at RT fulltext indexing. We have had reports that too many results were returned for a specific search. The problem was due to the fact that the function that is used to generate the database query for PostgreSQL is plainto_tsquery(‘string’). This function converts the string into individual tokens and then produces a query requesting any item that contains all of the listed tokens at any location. This means that searching for ‘bad word’ would also return results with ‘word bad’. In PostgreSQL 9.6 and 10, we can do better. They introduce the function phraseto_tsquery(‘string’). Like plainto_tsquery() it generates a list of tokens from the string however is also constrains the locations of the tokens such that token1 is before token2 which is before token3, … Would it be possible to use it instead if the PostgreSQL version is 9.6 or higher?

It looks like the only location that needs to be changed is:

— src/rt-4.4.2/lib/RT/Tickets.pm 2017-07-12 09:23:39.000000000 -0500
+++ lib/RT/Tickets.pm 2017-09-28 10:05:25.163603802 -0500
@@ -977,7 +977,7 @@
ALIAS => $alias,
FIELD => $index,
OPERATOR => ‘@@’,

  •            VALUE       => 'plainto_tsquery('. $dbh->quote($value) .')',
    
  •            VALUE       => 'phraseto_tsquery('. $dbh->quote($value) .')',
               QUOTEVALUE  => 0,
           );
       }
    

Regards,
Ken


#2

Hi again,

Since PostgreSQL 11 has been released, it would be nice to perhaps move to using the function websearch_to_tsquery() instead since it provides these nice options for refining your searches:

  • unquoted text: text not inside quote marks will be converted to terms separated by & operators, as if processed by plainto_tsquery.

  • “quoted text”: text inside quote marks will be converted to terms separated by <-> operators, as if processed by phraseto_tsquery.

  • OR: logical or will be converted to the | operator.

  • -: the logical not operator, converted to the the ! operator.

Regards,
Ken


#3

I agree, thumbs up. Maybe BP support notices this thread :slight_smile: