Extremely slow query in rt 3.0.10 due to ILIKE instead of =

Palle_Girgensohn · March 30, 2004, 7:29pm

Hi,

With RT & the latest (0.99) DBIx-SearchBuilder, clicking “Home” takes
approximately 20 second or more. Most of the time is spent running this
query:

rt3=# explain analyze
rt3-# SELECT DISTINCT main.* FROM Tickets main , Groups Groups_1,
CachedGroupMembers CachedGroupMembers_2, Users
rt3-# Users_3 WHERE ((CachedGroupMembers_2.MemberId ILIKE Users_3.id))
AND ((Groups_1.id ILIKE CachedGroupMembers_2.GroupId)) AND
rt3-# ((main.EffectiveId = main.id)) AND ((main.Status != ‘deleted’)) AND
((main.Type = ‘ticket’)) AND ((main.id ILIKE
rt3(# Groups_1.Instance)) AND ( ( ( (Users_3.EmailAddress ILIKE
‘girgen@pingpong.net’)AND(Groups_1.Domain =
rt3(# ‘RT::Ticket-Role’)AND(Groups_1.Type = ‘Requestor’) ) ) AND (
(main.Status = ‘new’)OR(main.Status = ‘open’) ) ) ORDER BY
rt3-# main.Priority DESC LIMIT 10;

QUERY PLAN

Limit (cost=729.00…729.19 rows=1 width=193) (actual
time=20850.989…20851.960 rows=10 loops=1)
→ Unique (cost=729.00…729.19 rows=1 width=193) (actual
time=20850.975…20851.874 rows=10 loops=1)
→ Sort (cost=729.00…729.01 rows=3 width=193) (actual
time=20850.961…20851.033 rows=19 loops=1)
Sort Key: main.priority, main.id, main.effectiveid,
main.queue, main.“type”, main.issuestatement, main.resolution,
main.“owner”, main.subject, main.initialpriority, main.finalpriority,
main.timeestimated, main.timeworked, main.status, main.timeleft, main.told,
main.starts, main.started, main.due, main.resolved, main.lastupdatedby,
main.lastupdated, main.creator, main.created, main.disabled
→ Nested Loop (cost=256.11…728.97 rows=3 width=193)
(actual time=19122.777…20847.490 rows=58 loops=1)
Join Filter: ((“inner”.id)::text ~~*
(“outer”.groupid)::text)
→ Nested Loop (cost=0.00…459.97 rows=67 width=4)
(actual time=0.308…291.242 rows=533 loops=1)
Join Filter: ((“inner”.memberid)::text ~~*
(“outer”.id)::text)
→ Seq Scan on users users_3 (cost=0.00…2.69
rows=1 width=4) (actual time=0.104…0.347 rows=1 loops=1)
Filter: ((emailaddress)::text ~~*
‘girgen@pingpong.net’::text)
→ Seq Scan on cachedgroupmembers
cachedgroupmembers_2 (cost=0.00…224.83 rows=13283 width=8) (actual
time=0.015…104.128 rows=13283 loops=1)
→ Materialize (cost=256.11…256.18 rows=7
width=197) (actual time=0.023…36.582 rows=161 loops=533)
→ Nested Loop (cost=0.00…256.11 rows=7
width=197) (actual time=9.800…19068.908 rows=161 loops=1)
Join Filter: ((“outer”.id)::text ~~*
(“inner”.instance)::text)
→ Seq Scan on tickets main
(cost=0.00…77.16 rows=1 width=193) (actual time=0.083…29.923 rows=161
loops=1)
Filter: ((effectiveid = id) AND
((status)::text <> ‘deleted’::text) AND ((“type”)::text = ‘ticket’::text)
AND (((status)::text = ‘new’::text) OR ((status)::text = ‘open’::text)))
→ Seq Scan on groups groups_1
(cost=0.00…155.50 rows=1340 width=8) (actual time=0.845…91.976 rows=1603
loops=161)
Filter: (((“domain”)::text =
‘RT::Ticket-Role’::text) AND ((“type”)::text = ‘Requestor’::text))
Total runtime: 20853.432 ms
(19 rows)

~21 seconds… this is because of the ILIKE:s. I have postgresql intialized
with LC_ALL=sv_SE.ISO8859-1. Indices are not used in postgresql for “LIKE”
when the database collation order is not of locale ‘C’. I usually want
this, since I need sorting to be localized.

Looking at the query above, I can’t see that ILIKE cannot be replaced with
`=’ in all cases. I’d say the way it is done now is a bug. I cannot find my
way around the RT code, so I’d appreciate some help doing this.

Here’s the same query without the ILIKE:s:

rt3=# explain analyze
rt3-# SELECT DISTINCT main.* FROM Tickets main , Groups Groups_1,
CachedGroupMembers CachedGroupMembers_2, Users
rt3-# Users_3 WHERE ((CachedGroupMembers_2.MemberId = Users_3.id)) AND
((Groups_1.id = CachedGroupMembers_2.GroupId)) AND
rt3-# ((main.EffectiveId = main.id)) AND ((main.Status != ‘deleted’)) AND
((main.Type = ‘ticket’)) AND ((main.id =
rt3(# Groups_1.Instance)) AND ( ( ( (Users_3.EmailAddress =
‘girgen@pingpong.net’)AND(Groups_1.Domain =
rt3(# ‘RT::Ticket-Role’)AND(Groups_1.Type = ‘Requestor’) ) ) AND (
(main.Status = ‘new’)OR(main.Status = ‘open’) ) ) ORDER BY
rt3-# main.Priority DESC LIMIT 10;

QUERY PLAN

Limit (cost=40.25…40.32 rows=1 width=193) (actual time=100.423…101.392
rows=10 loops=1)
→ Unique (cost=40.25…40.32 rows=1 width=193) (actual
time=100.410…101.308 rows=10 loops=1)
→ Sort (cost=40.25…40.26 rows=1 width=193) (actual
time=100.395…100.468 rows=19 loops=1)
Sort Key: main.priority, main.id, main.effectiveid,
main.queue, main.“type”, main.issuestatement, main.resolution,
main.“owner”, main.subject, main.initialpriority, main.finalpriority,
main.timeestimated, main.timeworked, main.status, main.timeleft, main.told,
main.starts, main.started, main.due, main.resolved, main.lastupdatedby,
main.lastupdated, main.creator, main.created, main.disabled
→ Nested Loop (cost=0.00…40.24 rows=1 width=193) (actual
time=1.164…95.580 rows=58 loops=1)
→ Nested Loop (cost=0.00…32.28 rows=2 width=4)
(actual time=0.840…64.753 rows=316 loops=1)
→ Nested Loop (cost=0.00…13.89 rows=6
width=4) (actual time=0.351…17.903 rows=533 loops=1)
→ Seq Scan on users users_3
(cost=0.00…2.69 rows=1 width=4) (actual time=0.117…0.356 rows=1 loops=1)
Filter: ((emailaddress)::text =
‘girgen@pingpong.net’::text)
→ Index Scan using cachedgroupmembers2
on cachedgroupmembers cachedgroupmembers_2 (cost=0.00…11.13 rows=5
width=8) (actual time=0.198…10.070 rows=533 loops=1)
Index Cond:
(cachedgroupmembers_2.memberid = “outer”.id)
→ Index Scan using groups_pkey on groups
groups_1 (cost=0.00…3.05 rows=1 width=8) (actual time=0.060…0.065 rows=1
loops=533)
Index Cond: (groups_1.id = “outer”.groupid)
Filter: (((“domain”)::text =
‘RT::Ticket-Role’::text) AND ((“type”)::text = ‘Requestor’::text))
→ Index Scan using tickets_pkey on tickets main
(cost=0.00…3.97 rows=1 width=193) (actual time=0.069…0.071 rows=0
loops=316)
Index Cond: (main.id = “outer”.instance)
Filter: ((effectiveid = id) AND ((status)::text
<> ‘deleted’::text) AND ((“type”)::text = ‘ticket’::text) AND
(((status)::text = ‘new’::text) OR ((status)::text = ‘open’::text)))
Total runtime: 103.182 ms

Postgresql-7.4.2
FreeBSD 4.9 stable
DBIx-SearchBuilder 0.99
Perl 5.8.3

0,1 s, pretty darn much better…

So, in short, I’d like to exterminate unnecessesary ILIKEs. In this
example, all are.

Any ideas how to pursue this?

/Palle

Jesse_Vincent · March 30, 2004, 7:38pm

Hi,

With RT & the latest (0.99) DBIx-SearchBuilder, clicking “Home” takes
approximately 20 second or more. Most of the time is spent running
this query:

~21 seconds… this is because of the ILIKE:s. I have postgresql
intialized with LC_ALL=sv_SE.ISO8859-1. Indices are not used in
postgresql for “LIKE” when the database collation order is not of
locale ‘C’. I usually want this, since I need sorting to be localized.

Looking at the query above, I can’t see that ILIKE cannot be replaced
with `=’ in all cases. I’d say the way it is done now is a bug. I
cannot find my way around the RT code, so I’d appreciate some help
doing this.

Here’s the same query without the ILIKE:s:

Postgresql-7.4.2
FreeBSD 4.9 stable
DBIx-SearchBuilder 0.99
Perl 5.8.3

0,1 s, pretty darn much better…

So, in short, I’d like to exterminate unnecessesary ILIKEs. In this
example, all are.

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGH!

So, we need to do case-insensitive comparisons. Every time a postgres
hacker looks at this issue, they tell me something different.

What is the right way to do fast case-insensitive comparisons in
postgres that take database indexes into account?

Vivek_Khera1 · March 30, 2004, 7:46pm

What is the right way to do fast case-insensitive comparisons in
postgres that take database indexes into account?

make your index functional on lower(column) and then compare
lower(column) = lower(‘string’). don’t just index on the column
directly.

Palle_Girgensohn · March 30, 2004, 8:34pm

Hi,

With RT & the latest (0.99) DBIx-SearchBuilder, clicking “Home” takes
approximately 20 second or more. Most of the time is spent running
this query:

~21 seconds… this is because of the ILIKE:s. I have postgresql
intialized with LC_ALL=sv_SE.ISO8859-1. Indices are not used in
postgresql for “LIKE” when the database collation order is not of
locale ‘C’. I usually want this, since I need sorting to be localized.

Looking at the query above, I can’t see that ILIKE cannot be replaced
with `=’ in all cases. I’d say the way it is done now is a bug. I
cannot find my way around the RT code, so I’d appreciate some help
doing this.

Here’s the same query without the ILIKE:s:

Postgresql-7.4.2
FreeBSD 4.9 stable
DBIx-SearchBuilder 0.99
Perl 5.8.3

0,1 s, pretty darn much better…

So, in short, I’d like to exterminate unnecessesary ILIKEs. In this
example, all are.

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGH!

Ooops…

So, we need to do case-insensitive comparisons. Every time a postgres
hacker looks at this issue, they tell me something different.

What is the right way to do fast case-insensitive comparisons in postgres
that take database indexes into account?

First, I just set up a postgresql cluster initdb:ed with locale=C. The
query is down to 14 secs, so it is not that big a difference. I does not
use indices differently, so I guess I was wrong, or they changed something.

In this specific example, it seems like all that is compared are ID:s
(except the email). Are you really sure they need to be compared
case-insensitive?

I remade the query with = except for Users_3.EmailAddress ILIKE
‘girgen@p…’, and it is really fast, 0,1 s, and it returns all the same
rows. I also remade the query using lower() = lower(), and it is OK, 0,5 s.
This is really the postgresql planner making non-optimal assumptions, I
guess. ILIKE between two db fields is really not a LIKE, it is a lower() =
lower()… Postgresql should ideally understand this.

I’ll ask the postgresql developers about this. We’ll see what they say. I
also found the lines in SearchBuilder that handle this. Changed between
0.98_1 and 0.99… So, I can downgrade for now, I guess… Still, do you
really need to match ID:s case-insensitive?

/Palle

SELECT DISTINCT main.* FROM Tickets main , Groups Groups_1,
CachedGroupMembers CachedGroupMembers_2, Users Users_3 WHERE
((CachedGroupMembers_2.MemberId = Users_3.id)) AND ((Groups_1.id
= CachedGroupMembers_2.GroupId)) AND ((main.EffectiveId =
main.id)) AND ((main.Status != ‘deleted’)) AND ((main.Type =
‘ticket’)) AND ((main.id = Groups_1.Instance)) AND ( ( (
(Users_3.EmailAddress ILIKE ‘girgen@pingpong.net’)AND(Groups_1.Domain
= ‘RT::Ticket-Role’)AND(Groups_1.Type = ‘Requestor’) ) ) AND (
(main.Status = ‘new’)OR(main.Status = ‘open’) ) ) ORDER BY
main.Priority DESC LIMIT 10

Jesse_Vincent · March 30, 2004, 8:47pm

I’ll ask the postgresql developers about this. We’ll see what they
say. I also found the lines in SearchBuilder that handle this. Changed
between 0.98_1 and 0.99… So, I can downgrade for now, I guess…
Still, do you really need to match ID:s case-insensitive?

It’s how searchbuilder-based apps work on every other database
platform. Changing that behavior or making it db-specific worries me,
especially since a lot of the data is stuff that humans search on.

PGP.sig (186 Bytes)

Palle_Girgensohn · March 30, 2004, 9:33pm

I’ll ask the postgresql developers about this. We’ll see what they
say. I also found the lines in SearchBuilder that handle this. Changed
between 0.98_1 and 0.99… So, I can downgrade for now, I guess…
Still, do you really need to match ID:s case-insensitive?

It’s how searchbuilder-based apps work on every other database platform.
Changing that behavior or making it db-specific worries me, especially
since a lot of the data is stuff that humans search on.

hmm, ok, but these are not search criterias, they are join conditions. They
are ID:s, internal integers that are not shown to anyone or used in the
GUI, right? Integers should not need to be case-insensitive? And they
compare two db fields, not field with value.

These are the fields that are ILIKEd in this query. They’re all integers:

Users.id
CachedGroupMembers.MemberId
CachedGroupMembers.GroupId
Tickets.id
Groups.Instance

SearchBuilder seem to separate join conditions from search restrictions. I
tried returning “lower($field)” , … , “lower($value)” in
Handle/Pg.pm:_MakeClauseCaseInsensitive(), but the single quotes are added
later down the code, so it became ‘lower(value)’ instead of lower(‘value’).
Where are the single quotes added to the query string?

/Palle

Palle_Girgensohn · March 31, 2004, 12:30am

What is the right way to do fast case-insensitive comparisons in
postgres that take database indexes into account?

make your index functional on lower(column) and then compare
lower(column) = lower(‘string’). don’t just index on the column directly.

True. Note you should preferably compare with ‘=’, unless you really need
LIKE. ILIKE will still not use indices. lower(field)=‘value’ will.
lower(field) like ‘value’ will use index only if the database uses C
locale for sorting (i.e. locale was set to `C’ or unset when initdb was
run). For RT, I guess this would be recommended, although for me it is a
pain, since my other databases on this server need proper localized
sorting. I guess I need to set up a separate database service for RT?

It is very strange that ILIKE does not use an index when lower() LIKE
does…

girgen=# create index person_foo on person (lower(last_name));
girgen=# vacuum analyze person;
girgen=# explain select * from person where lower(last_name) =
‘girgensohn’;
QUERY PLAN

Index Scan using person_foo on person (cost=0.00…137.58 rows=78 width=96)
Index Cond: (lower(last_name) = ‘girgensohn’::text)
(2 rows)

girgen=# explain select * from person where last_name = ‘Girgensohn’;
QUERY PLAN
Seq Scan on person (cost=0.00…441.35 rows=4 width=96)
Filter: (last_name = ‘Girgensohn’::text)
(2 rows)

girgen=# explain select * from person where lower(last_name) like
‘girgen%’;
QUERY PLAN

Index Scan using person_foo on person (cost=0.00…137.58 rows=78 width=96)
Index Cond: ((lower(last_name) >= ‘girgen’::text) AND (lower(last_name)
< ‘girgeo’::text))
Filter: (lower(last_name) ~~ ‘girgen%’::text)
(3 rows)

girgen=# explain select * from person where last_name ilike ‘girgen%’;
QUERY PLAN
Seq Scan on person (cost=0.00…441.35 rows=5 width=96)
Filter: (last_name ~~* ‘girgen%’::text)
(2 rows)

postgresql 7.4.2, freebsd 4.9 stable.

/Palle

Palle_Girgensohn · May 14, 2004, 1:07am

Hi,

I despererately need this fixed, or I will be forced to throw away rt,
something I really don’t want to. Any ideas where in the code I can find
the joins of IDs using LIKE?

Thanks,
Palle–On tisdag, mars 30, 2004 23.33.58 +0200 Palle Girgensohn girgen@pingpong.net wrote:

–On tisdag, mars 30, 2004 15.47.56 -0500 Jesse Vincent jesse@bestpractical.com wrote:

I’ll ask the postgresql developers about this. We’ll see what they
say. I also found the lines in SearchBuilder that handle this. Changed
between 0.98_1 and 0.99… So, I can downgrade for now, I guess…
Still, do you really need to match ID:s case-insensitive?

It’s how searchbuilder-based apps work on every other database platform.
Changing that behavior or making it db-specific worries me, especially
since a lot of the data is stuff that humans search on.

hmm, ok, but these are not search criterias, they are join conditions.
They are ID:s, internal integers that are not shown to anyone or used in
the GUI, right? Integers should not need to be case-insensitive? And they
compare two db fields, not field with value.

These are the fields that are ILIKEd in this query. They’re all integers:

Users.id
CachedGroupMembers.MemberId
CachedGroupMembers.GroupId
Tickets.id
Groups.Instance

SearchBuilder seem to separate join conditions from search restrictions.
I tried returning “lower($field)” , … , “lower($value)” in
Handle/Pg.pm:_MakeClauseCaseInsensitive(), but the single quotes are
added later down the code, so it became ‘lower(value)’ instead of
lower(‘value’). Where are the single quotes added to the query string?

/Palle

rt-devel mailing list
rt-devel@lists.bestpractical.com
The rt-devel Archives

Jesse_Vincent · May 14, 2004, 1:36am

Hi,

I despererately need this fixed, or I will be forced to throw away rt,
something I really don’t want to. Any ideas where in the code I can find
the joins of IDs using LIKE?

You can grab the latest DBIx-SearchBuilder from svn. It should handle
this better.