SearchBuild and case insensitive matching on postgres

Hi,

I’ve modified SearchBuilder/Handle/Pg.pm according to the enclosed diff,
which makes use of lower() instead of ilike. This will make a difference:
postgresql will use indices created with “on (lower(field))” when lower is
used explicitally in the query. ILIKE is not the same thing as lower() =
xx, though, since many strange things occur with locales. Hence ILIKE never
uses indices, it just cannot logically do this. (A postgresql developer had
this to say: “It’s a big leap to assume that ILIKE is equivalent to LIKE on
lower(). Think about Turkish i/I, German esstet (did I spell that right?),
ch in various languages, etc etc.” )

Just for reference:–On tisdag, mars 30, 2004 14.46.11 -0500 Vivek Khera vivek@khera.org wrote:

On Mar 30, 2004, at 2:38 PM, Jesse Vincent wrote:

What is the right way to do fast case-insensitive comparisons in
postgres that take database indexes into account?

make your index functional on lower(column) and then compare
lower(column) = lower(‘string’). don’t just index on the column directly.

Pg.diff (786 Bytes)

Hi,

I’ve modified SearchBuilder/Handle/Pg.pm according to the enclosed diff,
which makes use of lower() instead of ilike. This will make a difference:
postgresql will use indices created with “on (lower(field))” when lower is
used explicitally in the query. ILIKE is not the same thing as lower() =
xx, though, since many strange things occur with locales. Hence ILIKE never
uses indices, it just cannot logically do this. (A postgresql developer had
this to say: “It’s a big leap to assume that ILIKE is equivalent to LIKE on
lower(). Think about Turkish i/I, German esstet (did I spell that right?),
ch in various languages, etc etc.” )

You want to look at the latest development version of
DBIx::SearchBuilder::Handle::Pg, as it has a slightly more elegant
version of what you’re trying to do here.

Hi,

I’ve modified SearchBuilder/Handle/Pg.pm according to the enclosed diff,
which makes use of lower() instead of ilike. This will make a difference:
postgresql will use indices created with “on (lower(field))” when lower is
used explicitally in the query. ILIKE is not the same thing as lower() =
xx, though, since many strange things occur with locales. Hence ILIKE never
uses indices, it just cannot logically do this. (A postgresql developer had
this to say: “It’s a big leap to assume that ILIKE is equivalent to LIKE on
lower(). Think about Turkish i/I, German esstet (did I spell that right?),
ch in various languages, etc etc.” )

You want to look at the latest development version of
DBIx::SearchBuilder::Handle::Pg, as it has a slightly more elegant
version of what you’re trying to do here.

So, to be more exact, RT 3.0 doesn’t have the infrastructure needed to
let us do LOWER(‘query’) = LOWER(‘colname’) so we fake it with lc. When
talking to newer RT branches, we do the right thing. I’ve just released
this version of SearchBuilder as 1.00_01. I’d be thrilled if you could
test it out and tell us how you do. If there are other new indexes that
you find to help, I’ll add them to the core.

Best,
Jesse

Hi,

I’ve modified SearchBuilder/Handle/Pg.pm according to the enclosed
diff, which makes use of lower() instead of ilike. This will make a
difference: postgresql will use indices created with “on
(lower(field))” when lower is used explicitally in the query. ILIKE
is not the same thing as lower() = xx, though, since many strange
things occur with locales. Hence ILIKE never uses indices, it just
cannot logically do this. (A postgresql developer had this to say:
“It’s a big leap to assume that ILIKE is equivalent to LIKE on
lower(). Think about Turkish i/I, German esstet (did I spell that
right?), ch in various languages, etc etc.” )

You want to look at the latest development version of
DBIx::SearchBuilder::Handle::Pg, as it has a slightly more elegant
version of what you’re trying to do here.

So, to be more exact, RT 3.0 doesn’t have the infrastructure needed to
let us do LOWER(‘query’) = LOWER(‘colname’) so we fake it with lc. When
talking to newer RT branches, we do the right thing. I’ve just released
this version of SearchBuilder as 1.00_01. I’d be thrilled if you could
test it out and tell us how you do. If there are other new indexes that
you find to help, I’ll add them to the core.

My main concern is not search criterias using LIKE, but joins on primary
keys (integers) using LIKE. There should be a simple equality, no LIKEs. I
can’t find where and how RT decides that they are search criterias and not
joins.

I will try DBIx-SearchBuilder-1.00_1, I’ll get back about this.

/Palle

Hi,

I’ve modified SearchBuilder/Handle/Pg.pm according to the enclosed

You want to look at the latest development version of
DBIx::SearchBuilder::Handle::Pg, as it has a slightly more elegant
version of what you’re trying to do here.

So, to be more exact, RT 3.0 doesn’t have the infrastructure needed to
let us do LOWER(‘query’) = LOWER(‘colname’) so we fake it with lc. When
talking to newer RT branches, we do the right thing. I’ve just released
this version of SearchBuilder as 1.00_01. I’d be thrilled if you could
test it out and tell us how you do. If there are other new indexes that
you find to help, I’ll add them to the core.

My main concern is not search criterias using LIKE, but joins on primary
keys (integers) using LIKE. There should be a simple equality, no LIKEs.
I can’t find where and how RT decides that they are search criterias and
not joins.

I will try DBIx-SearchBuilder-1.00_1, I’ll get back about this.

This version seems to do a better job using lower where it should, at least
it seems that way.

But, it still also uses lower on integers and id:s sometimes - this is bad.
I’ve poked around in the code, and it seems to me there is logic in RT
(lib/RT/Record.pm:207) that tries to prevent the usage of lower where the
name of the column is ‘id’, but this code does not always help.
SearchBuilder.pm:801 will run MakeClauseCaseInsensitive even if I try to
match on ‘^(.*.)?id$’ in Record.pm. I see no easy way to get around this,
without breaking SearchBuilder for other projects (it is supposed to be
somewhat independant of RT, I assume?). This said, the following patch
makes RT quite usable for me, in fact it’s suddenly pretty fast. It should
do the right thing for RT, but is an ugly hack and most probably breaks
SearchBuilder for use with most other projects:

— SearchBuilder/Handle/Pg.pm.orig Fri May 14 23:32:14 2004
+++ SearchBuilder/Handle/Pg.pm Fri May 14 23:34:47 2004
@@ -279,8 +279,12 @@
# on the value. they only expect field, operator, value.
else {

  •           return ( "$field", $operator, lc($value));
    
  •       if ($field =~ /id$/i) {
    
  •           return  ( "$field", $operator, lc($value));
    
  •       }
    
  •       else {
    
  •           return ( "LOWER($field)", $operator, lc($value));
    
  •       }
      }
    
    }
    else {

The correct way to do this would be not to run MakeCaseInsensitive for
id:s, but since It is done in SearchBuilder, I can’t come up with an easy
way to fix this.

Hopefully, someone can fix this in a proper way?

Best regards,
Palle

But, it still also uses lower on integers and id:s sometimes - this is bad.
I’ve poked around in the code, and it seems to me there is logic in RT
(lib/RT/Record.pm:207) that tries to prevent the usage of lower where the
name of the column is ‘id’, but this code does not always help.
SearchBuilder.pm:801 will run MakeClauseCaseInsensitive even if I try to
match on ‘^(.*.)?id$’ in Record.pm. I see no easy way to get around this,
without breaking SearchBuilder for other projects (it is supposed to be
somewhat independant of RT, I assume?). This said, the following patch
makes RT quite usable for me, in fact it’s suddenly pretty fast. It should
do the right thing for RT, but is an ugly hack and most probably breaks
SearchBuilder for use with most other projects:

So, DBIx::SearchBuilder::_ClassAccessible has type information
available. You should be able to just ask it if a given row is an int
type.