Postgres performance

Aidan_Mountford · April 2, 2003, 10:21am

Hello all,

I am using RT with a pg backend.

I am having some fairly serious issues relating to the performance of some
of the queries built by the query builder.

The worst issues seem to appear from an ACL lookup that occurs when
responding/commenting.

I have captured this query and note that it is not using joins.

To prove the point, i manually optimised this query, and reduced the
execution time from almost a minute down to about 4 seconds.

My question is, before I begin really trying to hack away at the way the
query builder works.

a) Is anyone else doing this - or intend to??
b) Are there any know gotchas that will make this process difficult.

On another note:

I am looking at altering the ticket display/search to allow tickets that
have been updated since you last saw them to be displayed in a different
color.

Once again,
Is anyone else doing this, or does anyone else want it.

Cheers

Aidan Mountford
Technical Director
MindVision Interactive P/L

Jesse_Vincent · April 2, 2003, 3:14pm

Hello all,

I am using RT with a pg backend.

I am having some fairly serious issues relating to the performance of some
of the queries built by the query builder.

The worst issues seem to appear from an ACL lookup that occurs when
responding/commenting.

I have captured this query and note that it is not using joins.

To prove the point, i manually optimised this query, and reduced the
execution time from almost a minute down to about 4 seconds.

Can you show us the unoptimized and optimized queries?

Request Tracker... So much more than a help desk — Best Practical Solutions – Trouble Ticketing. Free.

Vivek_Khera · April 2, 2003, 4:52pm

“AM” == Aidan Mountford aidan@oz.to writes:

AM> I am having some fairly serious issues relating to the performance of some
AM> of the queries built by the query builder.

Do you have all the recommended indexes built?

Have you run “vacuum analyze” on them to generate the index stats
postgres needs to use them effectively?

Have you ever run vacuum on the tables to keep them of manageable
size? If not, run a “vacuum full analyze” first, then daily/weekly
run regular “vacuum analyze”.

Aidan_Mountford · April 3, 2003, 2:11am

Yep - Can do…

The original query looks like this…

SELECT DISTINCT main.*
FROM Users main,
Principals Principals_1,
Groups Groups_2,
Principals Principals_3,
Principals Principals_4,
ACL ACL_5,
CachedGroupMembers CachedGroupMembers_6,
CachedGroupMembers CachedGroupMembers_7
WHERE ((ACL_5.RightName = ‘SuperUser’)
OR (ACL_5.RightName = ‘OwnTicket’))
AND CachedGroupMembers_7.GroupId = ‘4’
AND Principals_1.Disabled = ‘0’
AND Principals_1.PrincipalType = ‘User’
AND Principals_3.PrincipalType = ‘User’
AND (ACL_5.ObjectType = ‘RT::System’
OR (ACL_5.ObjectType = ‘RT::Queue’
AND ACL_5.ObjectId = ‘14’))
AND ((ACL_5.PrincipalId = Principals_4.Id
AND Principals_4.id = Groups_2.Id
AND ACL_5.PrincipalType = ‘Group’
AND (Groups_2.Domain = ‘SystemInternal’
OR Groups_2.Domain = ‘UserDefined’
OR Groups_2.Domain = ‘ACLEquivalence’))
OR ( ( (Groups_2.Domain = ‘RT::Queue-Role’
AND Groups_2.Instance = ‘14’)
OR (Groups_2.Domain = ‘RT::Ticket-Role’
AND Groups_2.Instance = ‘135’))
AND Groups_2.Type = ACL_5.PrincipalType
AND Groups_2.Id = Principals_4.id
AND Principals_4.PrincipalType = ‘Group’))
AND main.id = Principals_1.id
AND Principals_1.id = CachedGroupMembers_7.MemberId
AND main.id = Principals_3.id
AND CachedGroupMembers_6.MemberId = Principals_3.Id
AND CachedGroupMembers_6.GroupId = Principals_4.Id
ORDER BY main.Name ASC

The above took 59 seconds on our production DB (and returned 10 rows)

Without much effort I did this up with…

SELECT DISTINCT main.*
FROM (((((Users main join Principals Principals_1 on main.id =
Principals_1.id)
join CachedGroupMembers CachedGroupMembers_7 on Principals_1.id
= CachedGroupMembers_7.MemberId)
join Principals Principals_3 on Principals_1.id =
Principals_3.id)
join CachedGroupMembers CachedGroupMembers_6 on
CachedGroupMembers_6.MemberId = Principals_3.Id)
join Principals Principals_4 on CachedGroupMembers_6.GroupId =
Principals_4.Id),
Groups Groups_2,
ACL ACL_5
WHERE ( (ACL_5.RightName = ‘SuperUser’)
OR (ACL_5.RightName = ‘OwnTicket’))
AND CachedGroupMembers_7.GroupId = ‘4’
AND Principals_1.Disabled = ‘0’
AND Principals_1.PrincipalType = ‘User’
AND Principals_3.PrincipalType = ‘User’
AND (ACL_5.ObjectType = ‘RT::System’
OR (ACL_5.ObjectType = ‘RT::Queue’ AND ACL_5.ObjectId =
‘14’))
AND ( (ACL_5.PrincipalId = Principals_4.Id
AND Principals_4.id = Groups_2.Id
AND ACL_5.PrincipalType = ‘Group’
AND (Groups_2.Domain = ‘SystemInternal’
OR Groups_2.Domain = ‘UserDefined’
OR Groups_2.Domain = ‘ACLEquivalence’))
OR ( ( (Groups_2.Domain = ‘RT::Queue-Role’
AND Groups_2.Instance = ‘14’)
OR (Groups_2.Domain = ‘RT::Ticket-Role’
AND Groups_2.Instance = ‘135’))
AND Groups_2.Type = ACL_5.PrincipalType
AND Groups_2.Id = Principals_4.id
AND Principals_4.PrincipalType = ‘Group’))
ORDER BY main.Name ASC

This ran in about 6 seconds. (and returned the same 10 rows)

And then I tried a little harder…

SELECT DISTINCT main.*
FROM ((((((Users main join Principals Principals_1 on main.id =
Principals_1.id)
join CachedGroupMembers CachedGroupMembers_7 on
Principals_1.id = CachedGroupMembers_7.MemberId)
join Principals Principals_3 on Principals_1.id =
Principals_3.id)
join CachedGroupMembers CachedGroupMembers_6 on
CachedGroupMembers_6.MemberId = Principals_3.Id)
join Principals Principals_4 on CachedGroupMembers_6.GroupId =
Principals_4.Id)
join Groups Groups_2 on Principals_4.id = Groups_2.Id),
ACL ACL_5
WHERE ( (ACL_5.RightName = ‘SuperUser’)
OR (ACL_5.RightName = ‘OwnTicket’))
AND CachedGroupMembers_7.GroupId = ‘4’
AND Principals_1.Disabled = ‘0’
AND Principals_1.PrincipalType = ‘User’
AND Principals_3.PrincipalType = ‘User’
AND (ACL_5.ObjectType = ‘RT::System’
OR (ACL_5.ObjectType = ‘RT::Queue’ AND ACL_5.ObjectId =
‘14’))

      AND (   (ACL_5.PrincipalId = Principals_4.Id
               AND ACL_5.PrincipalType = 'Group'
               AND (Groups_2.Domain = 'SystemInternal'
                       OR Groups_2.Domain = 'UserDefined'
                       OR Groups_2.Domain = 'ACLEquivalence'))

           OR  (   (   (Groups_2.Domain = 'RT::Queue-Role'
                        AND Groups_2.Instance = '14')
                    OR (Groups_2.Domain = 'RT::Ticket-Role'
                        AND Groups_2.Instance = '135'))
                    AND Groups_2.Type = ACL_5.PrincipalType
                    AND Principals_4.PrincipalType = 'Group'))

ORDER BY main.Name ASC

And this gave the same result in less than one second…

There is more than can be done with that ACL table - but I have had little
time to look at it

I hope this helps you guys…

AFrom: rt-devel-admin@lists.fsck.com
[mailto:rt-devel-admin@lists.fsck.com]On Behalf Of Jesse Vincent
Sent: Thursday, 3 April 2003 12:44 AM
To: Aidan Mountford
Cc: rt-devel@lists.fsck.com
Subject: Re: [rt-devel] Postgres performance.

Hello all,

I am using RT with a pg backend.

I am having some fairly serious issues relating to the performance of some
of the queries built by the query builder.

The worst issues seem to appear from an ACL lookup that occurs when
responding/commenting.

I have captured this query and note that it is not using joins.

To prove the point, i manually optimised this query, and reduced the
execution time from almost a minute down to about 4 seconds.

Can you show us the unoptimized and optimized queries?

Request Tracker... So much more than a help desk — Best Practical Solutions – Trouble Ticketing. Free.
rt-devel mailing list
rt-devel@lists.fsck.com
http://lists.fsck.com/mailman/listinfo/rt-devel

gsstark · April 4, 2003, 10:09pm

“Aidan Mountford” aidan@oz.to writes:

Yep - Can do…

The original query looks like this…

SELECT DISTINCT main.*
FROM Users main,
Principals Principals_1,
Groups Groups_2,
Principals Principals_3,
Principals Principals_4,
ACL ACL_5,
CachedGroupMembers CachedGroupMembers_6,
CachedGroupMembers CachedGroupMembers_7

Can you post (or send me) the results of “explain analyze” on this and the
optimized query?

greg

David_Wheeler · April 7, 2003, 8:37pm

Can you post (or send me) the results of “explain analyze” on this
and the
optimized query?

I can. Attached.

David

David Wheeler AIM: dwTheory
david@kineticode.com ICQ: 15726394
Yahoo!: dew7e
Jabber: Theory@jabber.org
Kineticode. Setting knowledge in motion.[sm]

rt_queries.txt (29.6 KB)

gsstark · April 7, 2003, 9:40pm

David Wheeler david@kineticode.com writes:

Can you post (or send me) the results of “explain analyze” on this and the
optimized query?

I can. Attached.

Ok.

So first of all, do you run vacuum and analyze on your database frequently? If
not you should run “vacuum full” and then start running both frequently. How
frequently depends on the amount of updates your database gets, but as often
as once an hour is not unusual.

The big gun to be sure everything is up-to-date for optimizing like this is:
vacuum full analyze
which cleans up everything and gets accurate statistics.

The only bad guess postgres is making appears to be on a table called
CachedGroupMembers. Is it possible this table has received lots of updates
since the last time the database was analyzed?

That said, I don’t think that’s the main difference between the original query
and your optimized one. I think a big problem is that postgres isn’t using an
index on groups. It seems to be having trouble discerning the the groups_2.Id
= Principals_4.id is on both sides of the OR and can thus be pulled out to do
an index lookup.

I’m confused by why this isn’t happening because some testing shows postgres
is indeed capable of doing this type of rewriting.

Out of curiosity, what happens if you run

explain analyze SELECT DISTINCT main.*
FROM Users main,
Principals Principals_1,
Groups Groups_2,
Principals Principals_3,
Principals Principals_4,
ACL ACL_5,
CachedGroupMembers CachedGroupMembers_6,
CachedGroupMembers CachedGroupMembers_7
WHERE ((ACL_5.RightName = ‘SuperUser’)
OR (ACL_5.RightName = ‘OwnTicket’))
AND CachedGroupMembers_7.GroupId = ‘4’
AND Principals_1.Disabled = ‘0’
AND Principals_1.PrincipalType = ‘User’
AND Principals_3.PrincipalType = ‘User’
AND (ACL_5.ObjectType = ‘RT::System’
OR (ACL_5.ObjectType = ‘RT::Queue’
AND ACL_5.ObjectId = ‘14’))
AND (
(
Principals_4.id = Groups_2.Id
AND ( ACL_5.PrincipalId = Principals_4.Id
AND ACL_5.PrincipalType = ‘Group’
AND (Groups_2.Domain = ‘SystemInternal’ OR Groups_2.Domain = ‘UserDefined’ OR Groups_2.Domain = ‘ACLEquivalence’)
)
)
OR
(
Groups_2.Id = Principals_4.id
AND (
Groups_2.Type = ACL_5.PrincipalType
AND Principals_4.PrincipalType = ‘Group’
AND ( (Groups_2.Domain = ‘RT::Queue-Role’ AND Groups_2.Instance = ‘14’) OR (Groups_2.Domain = ‘RT::Ticket-Role’ AND Groups_2.Instance = ‘135’))
)
)
)
AND main.id = Principals_1.id
AND Principals_1.id = CachedGroupMembers_7.MemberId
AND main.id = Principals_3.id
AND CachedGroupMembers_6.MemberId = Principals_3.Id
AND CachedGroupMembers_6.GroupId = Principals_4.Id
ORDER BY main.Name ASC;

greg

David_Wheeler · April 8, 2003, 6:04pm

So first of all, do you run vacuum and analyze on your database
frequently? If
not you should run “vacuum full” and then start running both
frequently. How
frequently depends on the amount of updates your database gets, but as
often
as once an hour is not unusual.

I don’t have that many tickets in my database (< 50), and what’s in
there was mainly loaded up from a database dump of an RC2 database. I
run vacuum and analyze nightly, but haven’t run vacuum full.

The big gun to be sure everything is up-to-date for optimizing like
this is:
vacuum full analyze
which cleans up everything and gets accurate statistics.

See attached, where I run this and a few other things before re-running
the explains.

The only bad guess postgres is making appears to be on a table called
CachedGroupMembers. Is it possible this table has received lots of
updates
since the last time the database was analyzed?

Nope. As I said, this database doesn’t have much in the way of records
in it. The only difference I can think of is that, because there aren’t
that many records, the statistics won’t yet be accurate.

That said, I don’t think that’s the main difference between the
original query
and your optimized one. I think a big problem is that postgres isn’t
using an
index on groups. It seems to be having trouble discerning the the
groups_2.Id
= Principals_4.id is on both sides of the OR and can thus be pulled
out to do
an index lookup.

Well, the optimizer may decide that it’s not worth it to use an index
when there are so few records and do a table scan, instead. But then I
don’t know why it would be different for Aidan’s version of the query.

I’m confused by why this isn’t happening because some testing shows
postgres
is indeed capable of doing this type of rewriting.

Out of curiosity, what happens if you run

I’ve re-run the queries, including this one. Summary:

Original query: 20704.24 msec
Aidan’s Optimized query: 203.89 msec
Your new query: 18799.20 msec

Something’s rotten in Denmark.

Regards,

David

David Wheeler AIM: dwTheory
david@kineticode.com ICQ: 15726394
Yahoo!: dew7e
Jabber: Theory@jabber.org
Kineticode. Setting knowledge in motion.[sm]

rt-pg-output.txt.gz (3.02 KB)

Aidan_Mountford · April 9, 2003, 12:08pm

If it helps, I also run vacuum etc nightly…

Before doing my testing analyse and vacuum were run before EACH test.

I have only about 500 tickets at this stage…

That said, I am having some success with rebuilding things to perform
structured joins…

The performance of these joins is clearly way ahead of the existing
method.

Unless anyone can see a reason why I should stop this, I will carry on
down this path…

AFrom: rt-devel-admin@lists.fsck.com
[mailto:rt-devel-admin@lists.fsck.com]On Behalf Of David Wheeler
Sent: Wednesday, 9 April 2003 3:34 AM
To: Greg Stark
Cc: rt-devel@lists.fsck.com; Aidan Mountford
Subject: Re: [rt-devel] Postgres performance.

So first of all, do you run vacuum and analyze on your database
frequently? If
not you should run “vacuum full” and then start running both
frequently. How
frequently depends on the amount of updates your database gets, but as
often
as once an hour is not unusual.

I don’t have that many tickets in my database (< 50), and what’s in
there was mainly loaded up from a database dump of an RC2 database. I
run vacuum and analyze nightly, but haven’t run vacuum full.

The big gun to be sure everything is up-to-date for optimizing like
this is:
vacuum full analyze
which cleans up everything and gets accurate statistics.

See attached, where I run this and a few other things before re-running
the explains.

The only bad guess postgres is making appears to be on a table called
CachedGroupMembers. Is it possible this table has received lots of
updates
since the last time the database was analyzed?

Nope. As I said, this database doesn’t have much in the way of records
in it. The only difference I can think of is that, because there aren’t
that many records, the statistics won’t yet be accurate.

That said, I don’t think that’s the main difference between the
original query
and your optimized one. I think a big problem is that postgres isn’t
using an
index on groups. It seems to be having trouble discerning the the
groups_2.Id
= Principals_4.id is on both sides of the OR and can thus be pulled
out to do
an index lookup.

Well, the optimizer may decide that it’s not worth it to use an index
when there are so few records and do a table scan, instead. But then I
don’t know why it would be different for Aidan’s version of the query.

I’m confused by why this isn’t happening because some testing shows
postgres
is indeed capable of doing this type of rewriting.

Out of curiosity, what happens if you run

I’ve re-run the queries, including this one. Summary:

Original query: 20704.24 msec
Aidan’s Optimized query: 203.89 msec
Your new query: 18799.20 msec

Something’s rotten in Denmark.

Regards,

David

David Wheeler AIM: dwTheory
david@kineticode.com ICQ: 15726394
Yahoo!: dew7e
Jabber: Theory@jabber.org
Kineticode. Setting knowledge in motion.[sm]

David_Wheeler · April 10, 2003, 12:47am

Unless anyone can see a reason why I should stop this, I will carry on
down this path…

I, for one, would welcome the increased performance. It’s pretty bad
right now.

David

David Wheeler AIM: dwTheory
david@kineticode.com ICQ: 15726394
Yahoo!: dew7e
Jabber: Theory@jabber.org
Kineticode. Setting knowledge in motion.[sm]

gsstark · April 10, 2003, 2:02am

David Wheeler david@kineticode.com writes:> On Wednesday, April 9, 2003, at 05:08 AM, Aidan Mountford wrote:

Unless anyone can see a reason why I should stop this, I will carry on
down this path…

I, for one, would welcome the increased performance. It’s pretty bad right
now.

Aren’t these queries dynamically generated? How can you rewrite them like this
in a general way?

There is one disadvantage of explicit joins like you’re using: They restrict
postgres’s degrees of freedom which in theory reduces postgres’s ability to
optimize.

That may sound silly given the speed difference you’re seeing, but numbers can
be deceiving. Most of that difference is coming directly from a single tricky
join clause. If you can really play with the query you could try simply moving
the groups_2.id = whatever.id clause outside the OR.

Also, you may want to try the queries with the CVS version of postgres. It
will treat the explicit paths very differently from the way 7.3 does and might
not do exactly what you expect.

That said, I usually prefer explicit joins myself because I find them
infinitely more readable than old style queries. But I would suggest ordering
the tables the way you want them so you don’t need the 10 nested
parentheses…

I’m still curious why postgres is having trouble with that OR clause. I can’t
seem to come up with any simple example queries that trigger that behaviour.
By all means consider sending it on to postgres-general, sending there
generally elicits helpful contributions. I would mention the groups table and
point out the relevant clause of the query specifically.

greg

Jesse_Vincent · April 10, 2003, 2:05am

What aidan is working on is recasting all the joins as explicit joins.
Because of the toolkit we use to do the dynamic generation, it’s a
single code path to change.On Wed, Apr 09, 2003 at 10:02:57PM -0400, Greg Stark wrote:

David Wheeler david@kineticode.com writes:

On Wednesday, April 9, 2003, at 05:08 AM, Aidan Mountford wrote:

Unless anyone can see a reason why I should stop this, I will carry on
down this path…

I, for one, would welcome the increased performance. It’s pretty bad right
now.

Aren’t these queries dynamically generated? How can you rewrite them like this
in a general way?

There is one disadvantage of explicit joins like you’re using: They restrict
postgres’s degrees of freedom which in theory reduces postgres’s ability to
optimize.

That may sound silly given the speed difference you’re seeing, but numbers can
be deceiving. Most of that difference is coming directly from a single tricky
join clause. If you can really play with the query you could try simply moving
the groups_2.id = whatever.id clause outside the OR.

Also, you may want to try the queries with the CVS version of postgres. It
will treat the explicit paths very differently from the way 7.3 does and might
not do exactly what you expect.

That said, I usually prefer explicit joins myself because I find them
infinitely more readable than old style queries. But I would suggest ordering
the tables the way you want them so you don’t need the 10 nested
parentheses…

I’m still curious why postgres is having trouble with that OR clause. I can’t
seem to come up with any simple example queries that trigger that behaviour.
By all means consider sending it on to postgres-general, sending there
generally elicits helpful contributions. I would mention the groups table and
point out the relevant clause of the query specifically.

–
greg

rt-devel mailing list
rt-devel@lists.fsck.com
http://lists.fsck.com/mailman/listinfo/rt-devel

Request Tracker... So much more than a help desk — Best Practical Solutions – Trouble Ticketing. Free.

Jeroen_Ruigrok_asmod · April 14, 2003, 3:56pm

Unless anyone can see a reason why I should stop this, I will carry on
down this path…

I, for one, would welcome the increased performance. It’s pretty bad
right now.

It is.
Thanks for your work guys, been on my list of to do items, but as some
might have noticed, I disappeared in the void of corporate work.

I think, call it an unfounded bias, that the Query Builder is very much
geared towards MySQL and PostgreSQL performance has been ignored in the
past.

Jeroen Ruigrok van der Werven <asmodai(at)wxs.nl> / asmodai / a capoeirista
PGP fingerprint: 2D92 980E 45FE 2C28 9DB7 9D88 97E6 839B 2EAC 625B
http://www.tendra.org/ | http://www.in-nomine.org/~asmodai/diary/
Only the wisest and the stupidest of men never change…

Jeroen_Ruigrok_asmod · April 14, 2003, 3:58pm

What aidan is working on is recasting all the joins as explicit joins.
Because of the toolkit we use to do the dynamic generation, it’s a
single code path to change.

Will we see any of these performance changes alongside the 3.0.1 release?

Jeroen Ruigrok van der Werven <asmodai(at)wxs.nl> / asmodai / a capoeirista
PGP fingerprint: 2D92 980E 45FE 2C28 9DB7 9D88 97E6 839B 2EAC 625B
http://www.tendra.org/ | http://www.in-nomine.org/~asmodai/diary/
Only the wisest and the stupidest of men never change…

Jeroen_Ruigrok_asmod · April 16, 2003, 1:57pm

[snip original query]

The above took 59 seconds on our production DB (and returned 10 rows)

This took me between 320-324 msec on my almost empty database.

[snip optimisation #1]

This ran in about 6 seconds. (and returned the same 10 rows)

This took about 93 msec.

And then I tried a little harder…

And this gave the same result in less than one second…

This took me about 49 msec.

I will fill the database some more, do some testing on the
DBIx::SearchBuilder 0.80.0 and then the 0.80.4 queries.

Jeroen Ruigrok van der Werven <asmodai(at)wxs.nl> / asmodai / a capoeirista
PGP fingerprint: 2D92 980E 45FE 2C28 9DB7 9D88 97E6 839B 2EAC 625B
http://www.tendra.org/ | http://www.in-nomine.org/~asmodai/diary/
Only the wisest and the stupidest of men never change…

Jeroen_Ruigrok_asmod · April 16, 2003, 3:54pm

It seems that every user you add adds about 120-160 msec (on my box)
to the query execution time for pgsql.

With 5 users I had:

950 msec
6 - 1070 msec
7 - 1230 msec

Going with Aidan’s first optimisation reduces the latter result down to
about 250 msec, a whopping fifth of the original query.
While his second query reduces it to 106 msec, about a twelfth.

0.80.4 doesn’t help with the ACL query when commenting on tickets.
It takes as long as the original, or perhaps a very small fraction less,
but still way too long.

Jeroen Ruigrok van der Werven <asmodai(at)wxs.nl> / asmodai / a capoeirista
PGP fingerprint: 2D92 980E 45FE 2C28 9DB7 9D88 97E6 839B 2EAC 625B
http://www.tendra.org/ | http://www.in-nomine.org/~asmodai/diary/
Only the wisest and the stupidest of men never change…