Extremely slow query in rt 3.0.10 due to ILIKE instead of =

Hi,

With RT & the latest (0.99) DBIx-SearchBuilder, clicking “Home” takes
approximately 20 second or more. Most of the time is spent running this
query:

rt3=# explain analyze
rt3-# SELECT DISTINCT main.* FROM Tickets main , Groups Groups_1,
CachedGroupMembers CachedGroupMembers_2, Users
rt3-# Users_3 WHERE ((CachedGroupMembers_2.MemberId ILIKE Users_3.id))
AND ((Groups_1.id ILIKE CachedGroupMembers_2.GroupId)) AND
rt3-# ((main.EffectiveId = main.id)) AND ((main.Status != ‘deleted’)) AND
((main.Type = ‘ticket’)) AND ((main.id ILIKE
rt3(# Groups_1.Instance)) AND ( ( ( (Users_3.EmailAddress ILIKE
‘girgen@pingpong.net’)AND(Groups_1.Domain =
rt3(# ‘RT::Ticket-Role’)AND(Groups_1.Type = ‘Requestor’) ) ) AND (
(main.Status = ‘new’)OR(main.Status = ‘open’) ) ) ORDER BY
rt3-# main.Priority DESC LIMIT 10;

QUERY PLAN

Limit (cost=729.00…729.19 rows=1 width=193) (actual
time=20850.989…20851.960 rows=10 loops=1)
→ Unique (cost=729.00…729.19 rows=1 width=193) (actual
time=20850.975…20851.874 rows=10 loops=1)
→ Sort (cost=729.00…729.01 rows=3 width=193) (actual
time=20850.961…20851.033 rows=19 loops=1)
Sort Key: main.priority, main.id, main.effectiveid,
main.queue, main.“type”, main.issuestatement, main.resolution,
main.“owner”, main.subject, main.initialpriority, main.finalpriority,
main.timeestimated, main.timeworked, main.status, main.timeleft, main.told,
main.starts, main.started, main.due, main.resolved, main.lastupdatedby,
main.lastupdated, main.creator, main.created, main.disabled
→ Nested Loop (cost=256.11…728.97 rows=3 width=193)
(actual time=19122.777…20847.490 rows=58 loops=1)
Join Filter: ((“inner”.id)::text ~~*
(“outer”.groupid)::text)
→ Nested Loop (cost=0.00…459.97 rows=67 width=4)
(actual time=0.308…291.242 rows=533 loops=1)
Join Filter: ((“inner”.memberid)::text ~~*
(“outer”.id)::text)
→ Seq Scan on users users_3 (cost=0.00…2.69
rows=1 width=4) (actual time=0.104…0.347 rows=1 loops=1)
Filter: ((emailaddress)::text ~~*
‘girgen@pingpong.net’::text)
→ Seq Scan on cachedgroupmembers
cachedgroupmembers_2 (cost=0.00…224.83 rows=13283 width=8) (actual
time=0.015…104.128 rows=13283 loops=1)
→ Materialize (cost=256.11…256.18 rows=7
width=197) (actual time=0.023…36.582 rows=161 loops=533)
→ Nested Loop (cost=0.00…256.11 rows=7
width=197) (actual time=9.800…19068.908 rows=161 loops=1)
Join Filter: ((“outer”.id)::text ~~*
(“inner”.instance)::text)
→ Seq Scan on tickets main
(cost=0.00…77.16 rows=1 width=193) (actual time=0.083…29.923 rows=161
loops=1)
Filter: ((effectiveid = id) AND
((status)::text <> ‘deleted’::text) AND ((“type”)::text = ‘ticket’::text)
AND (((status)::text = ‘new’::text) OR ((status)::text = ‘open’::text)))
→ Seq Scan on groups groups_1
(cost=0.00…155.50 rows=1340 width=8) (actual time=0.845…91.976 rows=1603
loops=161)
Filter: (((“domain”)::text =
‘RT::Ticket-Role’::text) AND ((“type”)::text = ‘Requestor’::text))
Total runtime: 20853.432 ms
(19 rows)

~21 seconds… this is because of the ILIKE:s. I have postgresql intialized
with LC_ALL=sv_SE.ISO8859-1. Indices are not used in postgresql for “LIKE”
when the database collation order is not of locale ‘C’. I usually want
this, since I need sorting to be localized.

Looking at the query above, I can’t see that ILIKE cannot be replaced with
`=’ in all cases. I’d say the way it is done now is a bug. I cannot find my
way around the RT code, so I’d appreciate some help doing this.

Here’s the same query without the ILIKE:s:

rt3=# explain analyze
rt3-# SELECT DISTINCT main.* FROM Tickets main , Groups Groups_1,
CachedGroupMembers CachedGroupMembers_2, Users
rt3-# Users_3 WHERE ((CachedGroupMembers_2.MemberId = Users_3.id)) AND
((Groups_1.id = CachedGroupMembers_2.GroupId)) AND
rt3-# ((main.EffectiveId = main.id)) AND ((main.Status != ‘deleted’)) AND
((main.Type = ‘ticket’)) AND ((main.id =
rt3(# Groups_1.Instance)) AND ( ( ( (Users_3.EmailAddress =
‘girgen@pingpong.net’)AND(Groups_1.Domain =
rt3(# ‘RT::Ticket-Role’)AND(Groups_1.Type = ‘Requestor’) ) ) AND (
(main.Status = ‘new’)OR(main.Status = ‘open’) ) ) ORDER BY
rt3-# main.Priority DESC LIMIT 10;

QUERY PLAN

Limit (cost=40.25…40.32 rows=1 width=193) (actual time=100.423…101.392
rows=10 loops=1)
→ Unique (cost=40.25…40.32 rows=1 width=193) (actual
time=100.410…101.308 rows=10 loops=1)
→ Sort (cost=40.25…40.26 rows=1 width=193) (actual
time=100.395…100.468 rows=19 loops=1)
Sort Key: main.priority, main.id, main.effectiveid,
main.queue, main.“type”, main.issuestatement, main.resolution,
main.“owner”, main.subject, main.initialpriority, main.finalpriority,
main.timeestimated, main.timeworked, main.status, main.timeleft, main.told,
main.starts, main.started, main.due, main.resolved, main.lastupdatedby,
main.lastupdated, main.creator, main.created, main.disabled
→ Nested Loop (cost=0.00…40.24 rows=1 width=193) (actual
time=1.164…95.580 rows=58 loops=1)
→ Nested Loop (cost=0.00…32.28 rows=2 width=4)
(actual time=0.840…64.753 rows=316 loops=1)
→ Nested Loop (cost=0.00…13.89 rows=6
width=4) (actual time=0.351…17.903 rows=533 loops=1)
→ Seq Scan on users users_3
(cost=0.00…2.69 rows=1 width=4) (actual time=0.117…0.356 rows=1 loops=1)
Filter: ((emailaddress)::text =
‘girgen@pingpong.net’::text)
→ Index Scan using cachedgroupmembers2
on cachedgroupmembers cachedgroupmembers_2 (cost=0.00…11.13 rows=5
width=8) (actual time=0.198…10.070 rows=533 loops=1)
Index Cond:
(cachedgroupmembers_2.memberid = “outer”.id)
→ Index Scan using groups_pkey on groups
groups_1 (cost=0.00…3.05 rows=1 width=8) (actual time=0.060…0.065 rows=1
loops=533)
Index Cond: (groups_1.id = “outer”.groupid)
Filter: (((“domain”)::text =
‘RT::Ticket-Role’::text) AND ((“type”)::text = ‘Requestor’::text))
→ Index Scan using tickets_pkey on tickets main
(cost=0.00…3.97 rows=1 width=193) (actual time=0.069…0.071 rows=0
loops=316)
Index Cond: (main.id = “outer”.instance)
Filter: ((effectiveid = id) AND ((status)::text
<> ‘deleted’::text) AND ((“type”)::text = ‘ticket’::text) AND
(((status)::text = ‘new’::text) OR ((status)::text = ‘open’::text)))
Total runtime: 103.182 ms

Postgresql-7.4.2
FreeBSD 4.9 stable
DBIx-SearchBuilder 0.99
Perl 5.8.3

0,1 s, pretty darn much better… :wink:

So, in short, I’d like to exterminate unnecessesary ILIKEs. In this
example, all are.

Any ideas how to pursue this?

/Palle

Hi,

With RT & the latest (0.99) DBIx-SearchBuilder, clicking “Home” takes
approximately 20 second or more. Most of the time is spent running
this query:

~21 seconds… this is because of the ILIKE:s. I have postgresql
intialized with LC_ALL=sv_SE.ISO8859-1. Indices are not used in
postgresql for “LIKE” when the database collation order is not of
locale ‘C’. I usually want this, since I need sorting to be localized.

Looking at the query above, I can’t see that ILIKE cannot be replaced
with `=’ in all cases. I’d say the way it is done now is a bug. I
cannot find my way around the RT code, so I’d appreciate some help
doing this.

Here’s the same query without the ILIKE:s:

Postgresql-7.4.2
FreeBSD 4.9 stable
DBIx-SearchBuilder 0.99
Perl 5.8.3

0,1 s, pretty darn much better… :wink:

So, in short, I’d like to exterminate unnecessesary ILIKEs. In this
example, all are.

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGH!

So, we need to do case-insensitive comparisons. Every time a postgres
hacker looks at this issue, they tell me something different.

What is the right way to do fast case-insensitive comparisons in
postgres that take database indexes into account?

What is the right way to do fast case-insensitive comparisons in
postgres that take database indexes into account?

make your index functional on lower(column) and then compare
lower(column) = lower(‘string’). don’t just index on the column
directly.

Hi,

With RT & the latest (0.99) DBIx-SearchBuilder, clicking “Home” takes
approximately 20 second or more. Most of the time is spent running
this query:

~21 seconds… this is because of the ILIKE:s. I have postgresql
intialized with LC_ALL=sv_SE.ISO8859-1. Indices are not used in
postgresql for “LIKE” when the database collation order is not of
locale ‘C’. I usually want this, since I need sorting to be localized.

Looking at the query above, I can’t see that ILIKE cannot be replaced
with `=’ in all cases. I’d say the way it is done now is a bug. I
cannot find my way around the RT code, so I’d appreciate some help
doing this.

Here’s the same query without the ILIKE:s:

Postgresql-7.4.2
FreeBSD 4.9 stable
DBIx-SearchBuilder 0.99
Perl 5.8.3

0,1 s, pretty darn much better… :wink:

So, in short, I’d like to exterminate unnecessesary ILIKEs. In this
example, all are.

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGH!

Ooops…

So, we need to do case-insensitive comparisons. Every time a postgres
hacker looks at this issue, they tell me something different.

What is the right way to do fast case-insensitive comparisons in postgres
that take database indexes into account?

First, I just set up a postgresql cluster initdb:ed with locale=C. The
query is down to 14 secs, so it is not that big a difference. I does not
use indices differently, so I guess I was wrong, or they changed something.

In this specific example, it seems like all that is compared are ID:s
(except the email). Are you really sure they need to be compared
case-insensitive?

I remade the query with = except for Users_3.EmailAddress ILIKE
‘girgen@p…’, and it is really fast, 0,1 s, and it returns all the same
rows. I also remade the query using lower() = lower(), and it is OK, 0,5 s.
This is really the postgresql planner making non-optimal assumptions, I
guess. ILIKE between two db fields is really not a LIKE, it is a lower() =
lower()… Postgresql should ideally understand this.

I’ll ask the postgresql developers about this. We’ll see what they say. I
also found the lines in SearchBuilder that handle this. Changed between
0.98_1 and 0.99… So, I can downgrade for now, I guess… Still, do you
really need to match ID:s case-insensitive?

/Palle

SELECT DISTINCT main.* FROM Tickets main , Groups Groups_1,
CachedGroupMembers CachedGroupMembers_2, Users Users_3 WHERE
((CachedGroupMembers_2.MemberId = Users_3.id)) AND ((Groups_1.id
= CachedGroupMembers_2.GroupId)) AND ((main.EffectiveId =
main.id)) AND ((main.Status != ‘deleted’)) AND ((main.Type =
‘ticket’)) AND ((main.id = Groups_1.Instance)) AND ( ( (
(Users_3.EmailAddress ILIKE ‘girgen@pingpong.net’)AND(Groups_1.Domain
= ‘RT::Ticket-Role’)AND(Groups_1.Type = ‘Requestor’) ) ) AND (
(main.Status = ‘new’)OR(main.Status = ‘open’) ) ) ORDER BY
main.Priority DESC LIMIT 10

I’ll ask the postgresql developers about this. We’ll see what they
say. I also found the lines in SearchBuilder that handle this. Changed
between 0.98_1 and 0.99… So, I can downgrade for now, I guess…
Still, do you really need to match ID:s case-insensitive?

It’s how searchbuilder-based apps work on every other database
platform. Changing that behavior or making it db-specific worries me,
especially since a lot of the data is stuff that humans search on.

PGP.sig (186 Bytes)

I’ll ask the postgresql developers about this. We’ll see what they
say. I also found the lines in SearchBuilder that handle this. Changed
between 0.98_1 and 0.99… So, I can downgrade for now, I guess…
Still, do you really need to match ID:s case-insensitive?

It’s how searchbuilder-based apps work on every other database platform.
Changing that behavior or making it db-specific worries me, especially
since a lot of the data is stuff that humans search on.

hmm, ok, but these are not search criterias, they are join conditions. They
are ID:s, internal integers that are not shown to anyone or used in the
GUI, right? Integers should not need to be case-insensitive? And they
compare two db fields, not field with value.

These are the fields that are ILIKEd in this query. They’re all integers:

Users.id
CachedGroupMembers.MemberId
CachedGroupMembers.GroupId
Tickets.id
Groups.Instance

SearchBuilder seem to separate join conditions from search restrictions. I
tried returning “lower($field)” , … , “lower($value)” in
Handle/Pg.pm:_MakeClauseCaseInsensitive(), but the single quotes are added
later down the code, so it became ‘lower(value)’ instead of lower(‘value’).
Where are the single quotes added to the query string?

/Palle

What is the right way to do fast case-insensitive comparisons in
postgres that take database indexes into account?

make your index functional on lower(column) and then compare
lower(column) = lower(‘string’). don’t just index on the column directly.

True. Note you should preferably compare with ‘=’, unless you really need
LIKE. ILIKE will still not use indices. lower(field)=‘value’ will.
lower(field) like ‘value’ will use index only if the database uses C
locale for sorting (i.e. locale was set to `C’ or unset when initdb was
run). For RT, I guess this would be recommended, although for me it is a
pain, since my other databases on this server need proper localized
sorting. I guess I need to set up a separate database service for RT?

It is very strange that ILIKE does not use an index when lower() LIKE
does…

girgen=# create index person_foo on person (lower(last_name));
girgen=# vacuum analyze person;
girgen=# explain select * from person where lower(last_name) =
‘girgensohn’;
QUERY PLAN

Index Scan using person_foo on person (cost=0.00…137.58 rows=78 width=96)
Index Cond: (lower(last_name) = ‘girgensohn’::text)
(2 rows)

girgen=# explain select * from person where last_name = ‘Girgensohn’;
QUERY PLAN
Seq Scan on person (cost=0.00…441.35 rows=4 width=96)
Filter: (last_name = ‘Girgensohn’::text)
(2 rows)

girgen=# explain select * from person where lower(last_name) like
‘girgen%’;
QUERY PLAN

Index Scan using person_foo on person (cost=0.00…137.58 rows=78 width=96)
Index Cond: ((lower(last_name) >= ‘girgen’::text) AND (lower(last_name)
< ‘girgeo’::text))
Filter: (lower(last_name) ~~ ‘girgen%’::text)
(3 rows)

girgen=# explain select * from person where last_name ilike ‘girgen%’;
QUERY PLAN
Seq Scan on person (cost=0.00…441.35 rows=5 width=96)
Filter: (last_name ~~* ‘girgen%’::text)
(2 rows)

postgresql 7.4.2, freebsd 4.9 stable.

/Palle

Hi,

I despererately need this fixed, or I will be forced to throw away rt,
something I really don’t want to. Any ideas where in the code I can find
the joins of IDs using LIKE?

Thanks,
Palle–On tisdag, mars 30, 2004 23.33.58 +0200 Palle Girgensohn girgen@pingpong.net wrote:

–On tisdag, mars 30, 2004 15.47.56 -0500 Jesse Vincent jesse@bestpractical.com wrote:

I’ll ask the postgresql developers about this. We’ll see what they
say. I also found the lines in SearchBuilder that handle this. Changed
between 0.98_1 and 0.99… So, I can downgrade for now, I guess…
Still, do you really need to match ID:s case-insensitive?

It’s how searchbuilder-based apps work on every other database platform.
Changing that behavior or making it db-specific worries me, especially
since a lot of the data is stuff that humans search on.

hmm, ok, but these are not search criterias, they are join conditions.
They are ID:s, internal integers that are not shown to anyone or used in
the GUI, right? Integers should not need to be case-insensitive? And they
compare two db fields, not field with value.

These are the fields that are ILIKEd in this query. They’re all integers:

Users.id
CachedGroupMembers.MemberId
CachedGroupMembers.GroupId
Tickets.id
Groups.Instance

SearchBuilder seem to separate join conditions from search restrictions.
I tried returning “lower($field)” , … , “lower($value)” in
Handle/Pg.pm:_MakeClauseCaseInsensitive(), but the single quotes are
added later down the code, so it became ‘lower(value)’ instead of
lower(‘value’). Where are the single quotes added to the query string?

/Palle


rt-devel mailing list
rt-devel@lists.bestpractical.com
The rt-devel Archives

Hi,

I despererately need this fixed, or I will be forced to throw away rt,
something I really don’t want to. Any ideas where in the code I can find
the joins of IDs using LIKE?

You can grab the latest DBIx-SearchBuilder from svn. It should handle
this better.