Slow queries building list of privileged users (Postgres)

I’m migrating from an RT 2 install to an RT 3.8 install with around
170 privileged users (and around 90,000 total users).

I’ve done some initial testing with RT 3.8.6 and have observed
that building the list of privileged users (in the “create new
ticket” (ticket owner), “display ticket” (reminder owner) and
“ticket search” (ticket owner) pages) takes around 20-30 seconds to run.

It performs two queries which are logged by my slow query logger:

LOG: duration: 30570.089 ms statement: SELECT DISTINCT main.* FROM Users main CROSS JOIN ACL ACL_4 JOIN Principals Principals_1 ON ( Principals_1.id = main.id ) JOIN CachedGroupMembers CachedGroupMembers_2 ON ( CachedGroupMembers_2.MemberId = Principals_1.id ) JOIN Groups Groups_3 ON ( Groups_3.id = CachedGroupMembers_2.GroupId ) WHERE (Principals_1.Disabled = ‘0’) AND (ACL_4.PrincipalType = Groups_3.Type) AND (Principals_1.id != ‘1’) AND (Principals_1.PrincipalType = ‘User’) AND (ACL_4.RightName = ‘OwnTicket’) AND (Groups_3.Domain = ‘RT::System-Role’) AND ((ACL_4.ObjectType = ‘RT::Queue’ AND ACL_4.ObjectId = 3) OR (ACL_4.ObjectType = ‘RT::System’)) ORDER BY main.Name ASC

LOG: duration: 824.692 ms statement: SELECT DISTINCT main.* FROM Users main CROSS JOIN ACL ACL_2 JOIN Principals Principals_1 ON ( Principals_1.id = main.id ) JOIN CachedGroupMembers CachedGroupMembers_3 ON ( CachedGroupMembers_3.MemberId = Principals_1.id ) WHERE (Principals_1.Disabled = ‘0’) AND (ACL_2.PrincipalId = CachedGroupMembers_3.GroupId) AND (Principals_1.id != ‘1’) AND (ACL_2.PrincipalType = ‘Group’) AND (Principals_1.PrincipalType = ‘User’) AND (ACL_2.RightName = ‘OwnTicket’) AND ((ACL_2.ObjectType = ‘RT::Queue’ AND ACL_2.ObjectId = 3) OR (ACL_2.ObjectType = ‘RT::System’)) ORDER BY main.Name ASC

The first, on our database, returns no entries at all, whilst the
second returns the expected number of entries (the list of privileged
users).

Concentrating on this very slow one and asking the query planner
with EXPLAIN ANALYZE:

Unique (cost=147226.19…147269.59 rows=496 width=3382) (actual time=30888.166…30888.166 rows=0 loops=1)
→ Sort (cost=147226.19…147227.43 rows=496 width=3382) (actual time=30888.164…30888.164 rows=0 loops=1)
Sort Key: main.name, main.id, main.password, main.comments, main.signature, main.emailaddress, main.freeformcontactinfo, main.organization, main.realname, main.nickname, main.lang, main.emailencoding, main.webencoding, main.externalcontactinfoid, main.contactinfosystem, main.externalauthid, main.authsystem, main.gecos, main.homephone, main.workphone, main.mobilephone, main.pagerphone, main.address1, main.address2, main.city, main.state, main.zip, main.country, main.timezone, main.pgpkey, main.creator, main.created, main.lastupdatedby, main.lastupdated
Sort Method: quicksort Memory: 25kB
→ Hash Join (cost=76985.91…147203.98 rows=496 width=3382) (actual time=30888.088…30888.088 rows=0 loops=1)
Hash Cond: ((groups_3.type)::text = (acl_4.principaltype)::text)
→ Hash Join (cost=76933.22…147046.33 rows=3077 width=3388) (actual time=30887.657…30887.657 rows=0 loops=1)
Hash Cond: (cachedgroupmembers_2.groupid = groups_3.id)
→ Hash Join (cost=57544.03…126361.67 rows=126470 width=3386) (actual time=1509.645…29963.426 rows=1278161 loops=1)
Hash Cond: (cachedgroupmembers_2.memberid = main.id)
→ Seq Scan on cachedgroupmembers cachedgroupmembers_2 (cost=0.00…54962.02 rows=3357602 width=8) (actual time=0.026…2268.758 rows=3357455 loops=1)
→ Hash (cost=57500.57…57500.57 rows=3477 width=3386) (actual time=1509.465…1509.465 rows=92315 loops=1)
→ Hash Join (cost=52857.90…57500.57 rows=3477 width=3386) (actual time=993.708…1360.215 rows=92315 loops=1)
Hash Cond: (main.id = principals_1.id)
→ Seq Scan on users main (cost=0.00…3223.16 rows=92316 width=3382) (actual time=0.022…75.080 rows=92316 loops=1)
→ Hash (cost=51835.42…51835.42 rows=81798 width=4) (actual time=993.584…993.584 rows=92316 loops=1)
→ Seq Scan on principals principals_1 (cost=0.00…51835.42 rows=81798 width=4) (actual time=0.036…914.611 rows=92316 loops=1)
Filter: ((id <> 1) AND (disabled = 0) AND ((principaltype)::text = ‘User’::text))
→ Hash (cost=18759.68…18759.68 rows=50361 width=10) (actual time=0.092…0.092 rows=4 loops=1)
→ Bitmap Heap Scan on groups groups_3 (cost=1907.17…18759.68 rows=50361 width=10) (actual time=0.078…0.081 rows=4 loops=1)
Recheck Cond: ((domain)::text = ‘RT::System-Role’::text)
→ Bitmap Index Scan on groups1 (cost=0.00…1894.58 rows=50361 width=0) (actual time=0.059…0.059 rows=4 loops=1)
Index Cond: ((domain)::text = ‘RT::System-Role’::text)
→ Hash (cost=52.39…52.39 rows=24 width=6) (actual time=0.401…0.401 rows=97 loops=1)
→ Bitmap Heap Scan on acl acl_4 (cost=8.77…52.39 rows=24 width=6) (actual time=0.228…0.303 rows=97 loops=1)
Recheck Cond: ((((rightname)::text = ‘OwnTicket’::text) AND ((objecttype)::text = ‘RT::Queue’::text) AND (objectid = 3)) OR (((rightname)::text = ‘OwnTicket’::text) AND ((objecttype)::text = ‘RT::System’::text)))
→ BitmapOr (cost=8.77…8.77 rows=24 width=0) (actual time=0.205…0.205 rows=0 loops=1)
→ Bitmap Index Scan on acl1 (cost=0.00…4.30 rows=4 width=0) (actual time=0.091…0.091 rows=2 loops=1)
Index Cond: (((rightname)::text = ‘OwnTicket’::text) AND ((objecttype)::text = ‘RT::Queue’::text) AND (objectid = 3))
→ Bitmap Index Scan on acl1 (cost=0.00…4.45 rows=20 width=0) (actual time=0.113…0.113 rows=95 loops=1)
Index Cond: (((rightname)::text = ‘OwnTicket’::text) AND ((objecttype)::text = ‘RT::System’::text))
Total runtime: 30892.368 ms

Here most of the time is spent in the hash join for:

Hash Cond: (cachedgroupmembers_2.memberid = main.id)

and this join eliminates all the remaining rows.

Clearly for us, this first join is useless, but can anyone comment on
whether this is usual, and why it’s so slow for us?

We’re running an essentially identical database configuration with
another instance of 3.8.6, which has around 10 privileged users and so
unsurprisingly doesn’t have this problem, and the query planner is
different here:

Unique (cost=638.22…638.31 rows=1 width=1086) (actual time=0.355…0.355 rows=0 loops=1)
→ Sort (cost=638.22…638.22 rows=1 width=1086) (actual time=0.353…0.353 rows=0 loops=1)
Sort Key: main.name, main.id, main.password, main.comments, main.signature, main.emailaddress, main.freeformcontactinfo, main.organization, main.realname, main.nickname, main.lang, main.emailencoding, main.webencoding, main.externalcontactinfoid, main.contactinfosystem, main.externalauthid, main.authsystem, main.gecos, main.homephone, main.workphone, main.mobilephone, main.pagerphone, main.address1, main.address2, main.city, main.state, main.zip, main.country, main.timezone, main.pgpkey, main.creator, main.created, main.lastupdatedby, main.lastupdated
Sort Method: quicksort Memory: 25kB
→ Nested Loop (cost=0.00…638.21 rows=1 width=1086) (actual time=0.300…0.300 rows=0 loops=1)
→ Nested Loop (cost=0.00…637.87 rows=1 width=8) (actual time=0.299…0.299 rows=0 loops=1)
→ Nested Loop (cost=0.00…635.76 rows=6 width=4) (actual time=0.298…0.298 rows=0 loops=1)
→ Nested Loop (cost=0.00…608.38 rows=3 width=4) (actual time=0.296…0.296 rows=0 loops=1)
→ Seq Scan on acl acl_4 (cost=0.00…16.40 rows=2 width=6) (actual time=0.031…0.219 rows=3 loops=1)
Filter: (((rightname)::text = ‘OwnTicket’::text) AND ((((objecttype)::text = ‘RT::Queue’::text) AND (objectid = 3)) OR ((objecttype)::text = ‘RT::System’::text)))
→ Index Scan using groups1 on groups groups_3 (cost=0.00…294.15 rows=147 width=10) (actual time=0.024…0.024 rows=0 loops=3)
Index Cond: (((groups_3.domain)::text = ‘RT::System-Role’::text) AND ((groups_3.type)::text = (acl_4.principaltype)::text))
→ Index Scan using cachedgroupmembers3 on cachedgroupmembers cachedgroupmembers_2 (cost=0.00…9.06 rows=5 width=8) (never executed)
Index Cond: (cachedgroupmembers_2.groupid = groups_3.id)
→ Index Scan using principals_pkey on principals principals_1 (cost=0.00…0.34 rows=1 width=4) (never executed)
Index Cond: (principals_1.id = cachedgroupmembers_2.memberid)
Filter: ((principals_1.id <> 1) AND (principals_1.disabled = 0::smallint) AND ((principals_1.principaltype)::text = ‘User’::text))
→ Index Scan using users_pkey on users main (cost=0.00…0.33 rows=1 width=1086) (never executed)
Index Cond: (main.id = principals_1.id)
Total runtime: 0.602 ms

cachedgroupmembers and users have the same indexes on both systems
(I’m not sure whether that’s relevant).

Both systems are Debian lenny, RT 3.8.6, Postgres 8.3.

If anyone has any other advice about running RT 3.8 on postgres
I’d be interested too - I’ve added a couple of extra indexes:

CREATE INDEX Groups3 ON Groups (LOWER(Domain), LOWER(Type));
CREATE INDEX users5 ON users (LOWER(emailaddress));

which are missing from the default installation which have
cut down some other common slow queries (the former
is already in an RT ticket:

http://rt3.fsck.com/Ticket/Display.html?id=13056

)

Thanks,
Dominic.

Dominic Hargreaves, Systems Development and Support Team
Computing Services, University of Oxford

I’m migrating from an RT 2 install to an RT 3.8 install with around
170 privileged users (and around 90,000 total users).

I’ve done some initial testing with RT 3.8.6 and have observed
that building the list of privileged users (in the “create new
ticket” (ticket owner), “display ticket” (reminder owner) and
“ticket search” (ticket owner) pages) takes around 20-30 seconds to run.

It performs two queries which are logged by my slow query logger:

Sorry, I wrote this a few days ago, and forgot to note that I have
worked around this:

http://github.com/jmdh/rt/commit/83b2c9fe39be7ec9ce32940f82b0d47c1025b482

but I haven’t fully analysed the impact of this, only that it doesn’t
make any difference with our data - it’s clearly not
suitable for general use.

Dominic Hargreaves, Systems Development and Support Team
Computing Services, University of Oxford

signature.asc (197 Bytes)

I’m migrating from an RT 2 install to an RT 3.8 install with around
170 privileged users (and around 90,000 total users).

I’ve done some initial testing with RT 3.8.6 and have observed
that building the list of privileged users (in the “create new
ticket” (ticket owner), “display ticket” (reminder owner) and
“ticket search” (ticket owner) pages) takes around 20-30 seconds to run.

It performs two queries which are logged by my slow query logger:

LOG: duration: 30570.089 ms statement: SELECT DISTINCT main.* FROM Users main CROSS JOIN ACL ACL_4 JOIN Principals Principals_1 ON ( Principals_1.id = main.id ) JOIN CachedGroupMembers CachedGroupMembers_2 ON ( CachedGroupMembers_2.MemberId = Principals_1.id ) JOIN Groups Groups_3 ON ( Groups_3.id = CachedGroupMembers_2.GroupId ) WHERE (Principals_1.Disabled = ‘0’) AND (ACL_4.PrincipalType = Groups_3.Type) AND (Principals_1.id != ‘1’) AND (Principals_1.PrincipalType = ‘User’) AND (ACL_4.RightName = ‘OwnTicket’) AND (Groups_3.Domain = ‘RT::System-Role’) AND ((ACL_4.ObjectType = ‘RT::Queue’ AND ACL_4.ObjectId = 3) OR (ACL_4.ObjectType = ‘RT::System’)) ORDER BY main.Name ASC

LOG: duration: 824.692 ms statement: SELECT DISTINCT main.* FROM Users main CROSS JOIN ACL ACL_2 JOIN Principals Principals_1 ON ( Principals_1.id = main.id ) JOIN CachedGroupMembers CachedGroupMembers_3 ON ( CachedGroupMembers_3.MemberId = Principals_1.id ) WHERE (Principals_1.Disabled = ‘0’) AND (ACL_2.PrincipalId = CachedGroupMembers_3.GroupId) AND (Principals_1.id != ‘1’) AND (ACL_2.PrincipalType = ‘Group’) AND (Principals_1.PrincipalType = ‘User’) AND (ACL_2.RightName = ‘OwnTicket’) AND ((ACL_2.ObjectType = ‘RT::Queue’ AND ACL_2.ObjectId = 3) OR (ACL_2.ObjectType = ‘RT::System’)) ORDER BY main.Name ASC

The first, on our database, returns no entries at all, whilst the
second returns the expected number of entries (the list of privileged
users).

Concentrating on this very slow one and asking the query planner
with EXPLAIN ANALYZE:

Unique (cost=147226.19…147269.59 rows=496 width=3382) (actual time=30888.166…30888.166 rows=0 loops=1)
→ Sort (cost=147226.19…147227.43 rows=496 width=3382) (actual time=30888.164…30888.164 rows=0 loops=1)
Sort Key: main.name, main.id, main.password, main.comments, main.signature, main.emailaddress, main.freeformcontactinfo, main.organization, main.realname, main.nickname, main.lang, main.emailencoding, main.webencoding, main.externalcontactinfoid, main.contactinfosystem, main.externalauthid, main.authsystem, main.gecos, main.homephone, main.workphone, main.mobilephone, main.pagerphone, main.address1, main.address2, main.city, main.state, main.zip, main.country, main.timezone, main.pgpkey, main.creator, main.created, main.lastupdatedby, main.lastupdated
Sort Method: quicksort Memory: 25kB
→ Hash Join (cost=76985.91…147203.98 rows=496 width=3382) (actual time=30888.088…30888.088 rows=0 loops=1)
Hash Cond: ((groups_3.type)::text = (acl_4.principaltype)::text)
→ Hash Join (cost=76933.22…147046.33 rows=3077 width=3388) (actual time=30887.657…30887.657 rows=0 loops=1)
Hash Cond: (cachedgroupmembers_2.groupid = groups_3.id)
→ Hash Join (cost=57544.03…126361.67 rows=126470 width=3386) (actual time=1509.645…29963.426 rows=1278161 loops=1)
Hash Cond: (cachedgroupmembers_2.memberid = main.id)
→ Seq Scan on cachedgroupmembers cachedgroupmembers_2 (cost=0.00…54962.02 rows=3357602 width=8) (actual time=0.026…2268.758 rows=3357455 loops=1)
→ Hash (cost=57500.57…57500.57 rows=3477 width=3386) (actual time=1509.465…1509.465 rows=92315 loops=1)
→ Hash Join (cost=52857.90…57500.57 rows=3477 width=3386) (actual time=993.708…1360.215 rows=92315 loops=1)
Hash Cond: (main.id = principals_1.id)
→ Seq Scan on users main (cost=0.00…3223.16 rows=92316 width=3382) (actual time=0.022…75.080 rows=92316 loops=1)
→ Hash (cost=51835.42…51835.42 rows=81798 width=4) (actual time=993.584…993.584 rows=92316 loops=1)
→ Seq Scan on principals principals_1 (cost=0.00…51835.42 rows=81798 width=4) (actual time=0.036…914.611 rows=92316 loops=1)
Filter: ((id <> 1) AND (disabled = 0) AND ((principaltype)::text = ‘User’::text))
→ Hash (cost=18759.68…18759.68 rows=50361 width=10) (actual time=0.092…0.092 rows=4 loops=1)
→ Bitmap Heap Scan on groups groups_3 (cost=1907.17…18759.68 rows=50361 width=10) (actual time=0.078…0.081 rows=4 loops=1)
Recheck Cond: ((domain)::text = ‘RT::System-Role’::text)
→ Bitmap Index Scan on groups1 (cost=0.00…1894.58 rows=50361 width=0) (actual time=0.059…0.059 rows=4 loops=1)
Index Cond: ((domain)::text = ‘RT::System-Role’::text)
→ Hash (cost=52.39…52.39 rows=24 width=6) (actual time=0.401…0.401 rows=97 loops=1)
→ Bitmap Heap Scan on acl acl_4 (cost=8.77…52.39 rows=24 width=6) (actual time=0.228…0.303 rows=97 loops=1)
Recheck Cond: ((((rightname)::text = ‘OwnTicket’::text) AND ((objecttype)::text = ‘RT::Queue’::text) AND (objectid = 3)) OR (((rightname)::text = ‘OwnTicket’::text) AND ((objecttype)::text = ‘RT::System’::text)))
→ BitmapOr (cost=8.77…8.77 rows=24 width=0) (actual time=0.205…0.205 rows=0 loops=1)
→ Bitmap Index Scan on acl1 (cost=0.00…4.30 rows=4 width=0) (actual time=0.091…0.091 rows=2 loops=1)
Index Cond: (((rightname)::text = ‘OwnTicket’::text) AND ((objecttype)::text = ‘RT::Queue’::text) AND (objectid = 3))
→ Bitmap Index Scan on acl1 (cost=0.00…4.45 rows=20 width=0) (actual time=0.113…0.113 rows=95 loops=1)
Index Cond: (((rightname)::text = ‘OwnTicket’::text) AND ((objecttype)::text = ‘RT::System’::text))
Total runtime: 30892.368 ms

Here most of the time is spent in the hash join for:

Hash Cond: (cachedgroupmembers_2.memberid = main.id)

and this join eliminates all the remaining rows.

Clearly for us, this first join is useless, but can anyone comment on
whether this is usual, and why it’s so slow for us?

We’re running an essentially identical database configuration with
another instance of 3.8.6, which has around 10 privileged users and so
unsurprisingly doesn’t have this problem, and the query planner is
different here:

Unique (cost=638.22…638.31 rows=1 width=1086) (actual time=0.355…0.355 rows=0 loops=1)
→ Sort (cost=638.22…638.22 rows=1 width=1086) (actual time=0.353…0.353 rows=0 loops=1)
Sort Key: main.name, main.id, main.password, main.comments, main.signature, main.emailaddress, main.freeformcontactinfo, main.organization, main.realname, main.nickname, main.lang, main.emailencoding, main.webencoding, main.externalcontactinfoid, main.contactinfosystem, main.externalauthid, main.authsystem, main.gecos, main.homephone, main.workphone, main.mobilephone, main.pagerphone, main.address1, main.address2, main.city, main.state, main.zip, main.country, main.timezone, main.pgpkey, main.creator, main.created, main.lastupdatedby, main.lastupdated
Sort Method: quicksort Memory: 25kB
→ Nested Loop (cost=0.00…638.21 rows=1 width=1086) (actual time=0.300…0.300 rows=0 loops=1)
→ Nested Loop (cost=0.00…637.87 rows=1 width=8) (actual time=0.299…0.299 rows=0 loops=1)
→ Nested Loop (cost=0.00…635.76 rows=6 width=4) (actual time=0.298…0.298 rows=0 loops=1)
→ Nested Loop (cost=0.00…608.38 rows=3 width=4) (actual time=0.296…0.296 rows=0 loops=1)
→ Seq Scan on acl acl_4 (cost=0.00…16.40 rows=2 width=6) (actual time=0.031…0.219 rows=3 loops=1)
Filter: (((rightname)::text = ‘OwnTicket’::text) AND ((((objecttype)::text = ‘RT::Queue’::text) AND (objectid = 3)) OR ((objecttype)::text = ‘RT::System’::text)))
→ Index Scan using groups1 on groups groups_3 (cost=0.00…294.15 rows=147 width=10) (actual time=0.024…0.024 rows=0 loops=3)
Index Cond: (((groups_3.domain)::text = ‘RT::System-Role’::text) AND ((groups_3.type)::text = (acl_4.principaltype)::text))
→ Index Scan using cachedgroupmembers3 on cachedgroupmembers cachedgroupmembers_2 (cost=0.00…9.06 rows=5 width=8) (never executed)
Index Cond: (cachedgroupmembers_2.groupid = groups_3.id)
→ Index Scan using principals_pkey on principals principals_1 (cost=0.00…0.34 rows=1 width=4) (never executed)
Index Cond: (principals_1.id = cachedgroupmembers_2.memberid)
Filter: ((principals_1.id <> 1) AND (principals_1.disabled = 0::smallint) AND ((principals_1.principaltype)::text = ‘User’::text))
→ Index Scan using users_pkey on users main (cost=0.00…0.33 rows=1 width=1086) (never executed)
Index Cond: (main.id = principals_1.id)
Total runtime: 0.602 ms

cachedgroupmembers and users have the same indexes on both systems
(I’m not sure whether that’s relevant).

Both systems are Debian lenny, RT 3.8.6, Postgres 8.3.

If anyone has any other advice about running RT 3.8 on postgres
I’d be interested too - I’ve added a couple of extra indexes:

CREATE INDEX Groups3 ON Groups (LOWER(Domain), LOWER(Type));
CREATE INDEX users5 ON users (LOWER(emailaddress));

which are missing from the default installation which have
cut down some other common slow queries (the former
is already in an RT ticket:

http://rt3.fsck.com/Ticket/Display.html?id=13056

)

Thanks,
Dominic.

Hi Dominic,

First, do you have $UseSQLForACLChecks set? I know that that
is a new option and there may still be performance tuning that
needs to be done to have it work well. We run RT 3.8.5 on a
PostgreSQL 8.4.1 database with 25K users and about 400 privileged
users and we do not see a performance problem. Would you mind
posting your postgres.conf changes from the default values as
well as the indexes you have defined for the tables involved.
Also, what is your statistics target for your tables?

Regards,
Ken

I’m migrating from an RT 2 install to an RT 3.8 install with around
170 privileged users (and around 90,000 total users).

I’ve done some initial testing with RT 3.8.6 and have observed
that building the list of privileged users (in the “create new
ticket” (ticket owner), “display ticket” (reminder owner) and
“ticket search” (ticket owner) pages) takes around 20-30 seconds to run.

It performs two queries which are logged by my slow query logger:

[snip details]

Both systems are Debian lenny, RT 3.8.6, Postgres 8.3.

If anyone has any other advice about running RT 3.8 on postgres
I’d be interested too - I’ve added a couple of extra indexes:

CREATE INDEX Groups3 ON Groups (LOWER(Domain), LOWER(Type));
CREATE INDEX users5 ON users (LOWER(emailaddress));

which are missing from the default installation which have
cut down some other common slow queries (the former
is already in an RT ticket:

http://rt3.fsck.com/Ticket/Display.html?id=13056

)

First, do you have $UseSQLForACLChecks set? I know that that
is a new option and there may still be performance tuning that
needs to be done to have it work well.

No, it’s not set. That in itself we’d certainly like to use, but it
introduced yet another unacceptable slowdown - something to analyse
separately, probably.

We run RT 3.8.5 on a
PostgreSQL 8.4.1 database with 25K users and about 400 privileged
users and we do not see a performance problem. Would you mind
posting your postgres.conf changes from the default values as
well as the indexes you have defined for the tables involved.

I’ve attached our postgresql.conf.

The indexes we have defined are the standard ones from the 3.8.6
schemas, plus one of the two I already posted:

CREATE INDEX Groups3 ON Groups (LOWER(Domain), LOWER(Type));

I’ve just noticed that this one wasn’t created on the particular test
instance I’m talking about, but the query in question doesn’t use
emailaddress, so that’s probably not relevant:

CREATE INDEX users5 ON users (LOWER(emailaddress));

For completeness, the indexes defined on the relevant tables are:

users:
“users_pkey” PRIMARY KEY, btree (id)
“users1” UNIQUE, btree (name)
“users3” btree (id, emailaddress)
“users4” btree (emailaddress)

acl:
“acl_pkey” PRIMARY KEY, btree (id)
“acl1” btree (rightname, objecttype, objectid, principaltype, principalid)

principals:
“principals_pkey” PRIMARY KEY, btree (id)
“principals2” btree (objectid)

cachedgroupmembers:
“cachedgroupmembers_pkey” PRIMARY KEY, btree (id)
“cachedgroupmembers2” btree (memberid)
“cachedgroupmembers3” btree (groupid)
“disgroumem” btree (groupid, memberid, disabled)

groups:
“groups_pkey” PRIMARY KEY, btree (id)
“groups1” UNIQUE, btree (domain, instance, type, id, name)
“groups2” btree (type, instance, domain)
“groups3” btree (lower(domain::text), lower(type::text))

Also, what is your statistics target for your tables?

default_statistics_target = 10

and no per-table changes. I’m not familiar with tuning this; would
you suggest a different value?

Thanks,
Dominic.

Dominic Hargreaves, Systems Development and Support Team
Computing Services, University of Oxford

postgresql.conf (3.64 KB)

signature.asc (197 Bytes)

Here most of the time is spent in the hash join for:

Hash Cond: (cachedgroupmembers_2.memberid = main.id)

What indexes do you have on cachedgroupmembers? (Also, I presume your
Postgres is reasonably well tuned, autovacuumed, etc)

What indexes do you have on cachedgroupmembers?

"cachedgroupmembers_pkey" PRIMARY KEY, btree (id)
"cachedgroupmembers2" btree (memberid)
"cachedgroupmembers3" btree (groupid)
"disgroumem" btree (groupid, memberid, disabled)
"memim" btree (memberid, immediateparentid)

(Also, I presume your
Postgres is reasonably well tuned,

I didn’t prepare the configuration for this partcular Postgres
instance and I’m not a postgres expert, but yes, I believe it’s
reasonably well tuned for the type of server. The postgresql.conf
is attached to another message in this thread.

autovacuumed, etc)

Yup, I believe so. I also did a manual VACUUM ANALYZE before running
the tests.

Dominic Hargreaves, Systems Development and Support Team
Computing Services, University of Oxford

signature.asc (197 Bytes)

I’m migrating from an RT 2 install to an RT 3.8 install with around
170 privileged users (and around 90,000 total users).

I’ve done some initial testing with RT 3.8.6 and have observed
that building the list of privileged users (in the “create new
ticket” (ticket owner), “display ticket” (reminder owner) and
“ticket search” (ticket owner) pages) takes around 20-30 seconds to run.

It performs two queries which are logged by my slow query logger:

[snip details]

Both systems are Debian lenny, RT 3.8.6, Postgres 8.3.

If anyone has any other advice about running RT 3.8 on postgres
I’d be interested too - I’ve added a couple of extra indexes:

CREATE INDEX Groups3 ON Groups (LOWER(Domain), LOWER(Type));
CREATE INDEX users5 ON users (LOWER(emailaddress));

which are missing from the default installation which have
cut down some other common slow queries (the former
is already in an RT ticket:

http://rt3.fsck.com/Ticket/Display.html?id=13056

)

First, do you have $UseSQLForACLChecks set? I know that that
is a new option and there may still be performance tuning that
needs to be done to have it work well.

No, it’s not set. That in itself we’d certainly like to use, but it
introduced yet another unacceptable slowdown - something to analyse
separately, probably.

We run RT 3.8.5 on a
PostgreSQL 8.4.1 database with 25K users and about 400 privileged
users and we do not see a performance problem. Would you mind
posting your postgres.conf changes from the default values as
well as the indexes you have defined for the tables involved.

I’ve attached our postgresql.conf.

The indexes we have defined are the standard ones from the 3.8.6
schemas, plus one of the two I already posted:

CREATE INDEX Groups3 ON Groups (LOWER(Domain), LOWER(Type));

I’ve just noticed that this one wasn’t created on the particular test
instance I’m talking about, but the query in question doesn’t use
emailaddress, so that’s probably not relevant:

CREATE INDEX users5 ON users (LOWER(emailaddress));

For completeness, the indexes defined on the relevant tables are:

users:
“users_pkey” PRIMARY KEY, btree (id)
“users1” UNIQUE, btree (name)
“users3” btree (id, emailaddress)
“users4” btree (emailaddress)

acl:
“acl_pkey” PRIMARY KEY, btree (id)
“acl1” btree (rightname, objecttype, objectid, principaltype, principalid)

principals:
“principals_pkey” PRIMARY KEY, btree (id)
“principals2” btree (objectid)

cachedgroupmembers:
“cachedgroupmembers_pkey” PRIMARY KEY, btree (id)
“cachedgroupmembers2” btree (memberid)
“cachedgroupmembers3” btree (groupid)
“disgroumem” btree (groupid, memberid, disabled)

groups:
“groups_pkey” PRIMARY KEY, btree (id)
“groups1” UNIQUE, btree (domain, instance, type, id, name)
“groups2” btree (type, instance, domain)
“groups3” btree (lower(domain::text), lower(type::text))

Also, what is your statistics target for your tables?

default_statistics_target = 10

and no per-table changes. I’m not familiar with tuning this; would
you suggest a different value?

Thanks,
Dominic.


Dominic Hargreaves, Systems Development and Support Team
Computing Services, University of Oxford

Hi Dominic,

Here are the indexes that we have that differ from your
setup:

“users1” UNIQUE, btree (lower(name::text))

instead of:

“groups3” btree (lower(domain::text), lower(type::text))

we have:

“groups2” btree (lower(type::text), lower(domain::text), instance)

You also should definitely raise the statistics target to
at least 100, which is the new default in 8.4. We also have
the random_page_cost set to 2.0 since we are mainly memory
resident. I know that the index order needs to match the
query to be used, so maybe these index changes would help.

Regards,
Ken

Content-Description: Config file for problematic RT instance database server

The indexes we have defined are the standard ones from the 3.8.6
schemas, plus one of the two I already posted:

CREATE INDEX Groups3 ON Groups (LOWER(Domain), LOWER(Type));

I’ve just noticed that this one wasn’t created on the particular test
instance I’m talking about, but the query in question doesn’t use
emailaddress, so that’s probably not relevant:

CREATE INDEX users5 ON users (LOWER(emailaddress));

For completeness, the indexes defined on the relevant tables are:

users:
“users_pkey” PRIMARY KEY, btree (id)
“users1” UNIQUE, btree (name)
“users3” btree (id, emailaddress)
“users4” btree (emailaddress)

acl:
“acl_pkey” PRIMARY KEY, btree (id)
“acl1” btree (rightname, objecttype, objectid, principaltype, principalid)

principals:
“principals_pkey” PRIMARY KEY, btree (id)
“principals2” btree (objectid)

cachedgroupmembers:
“cachedgroupmembers_pkey” PRIMARY KEY, btree (id)
“cachedgroupmembers2” btree (memberid)
“cachedgroupmembers3” btree (groupid)
“disgroumem” btree (groupid, memberid, disabled)

groups:
“groups_pkey” PRIMARY KEY, btree (id)
“groups1” UNIQUE, btree (domain, instance, type, id, name)
“groups2” btree (type, instance, domain)
“groups3” btree (lower(domain::text), lower(type::text))

Also, what is your statistics target for your tables?

default_statistics_target = 10

and no per-table changes. I’m not familiar with tuning this; would
you suggest a different value?

Here are the indexes that we have that differ from your
setup:

“users1” UNIQUE, btree (lower(name::text))

instead of:

“groups3” btree (lower(domain::text), lower(type::text))

we have:

“groups2” btree (lower(type::text), lower(domain::text), instance)

You also should definitely raise the statistics target to
at least 100, which is the new default in 8.4. We also have
the random_page_cost set to 2.0 since we are mainly memory
resident. I know that the index order needs to match the
query to be used, so maybe these index changes would help.

Thanks for the pointers. I’ll give them a spin on Monday and report
back.

Dominic Hargreaves, Systems Development and Support Team
Computing Services, University of Oxford

signature.asc (197 Bytes)

I’ve attached our postgresql.conf.

The indexes we have defined are the standard ones from the 3.8.6
schemas, plus one of the two I already posted:

CREATE INDEX Groups3 ON Groups (LOWER(Domain), LOWER(Type));

I’ve just noticed that this one wasn’t created on the particular test
instance I’m talking about, but the query in question doesn’t use
emailaddress, so that’s probably not relevant:

CREATE INDEX users5 ON users (LOWER(emailaddress));

For completeness, the indexes defined on the relevant tables are:

users:
“users_pkey” PRIMARY KEY, btree (id)
“users1” UNIQUE, btree (name)
“users3” btree (id, emailaddress)
“users4” btree (emailaddress)

acl:
“acl_pkey” PRIMARY KEY, btree (id)
“acl1” btree (rightname, objecttype, objectid, principaltype, principalid)

principals:
“principals_pkey” PRIMARY KEY, btree (id)
“principals2” btree (objectid)

cachedgroupmembers:
“cachedgroupmembers_pkey” PRIMARY KEY, btree (id)
“cachedgroupmembers2” btree (memberid)
“cachedgroupmembers3” btree (groupid)
“disgroumem” btree (groupid, memberid, disabled)

groups:
“groups_pkey” PRIMARY KEY, btree (id)
“groups1” UNIQUE, btree (domain, instance, type, id, name)
“groups2” btree (type, instance, domain)
“groups3” btree (lower(domain::text), lower(type::text))

Also, what is your statistics target for your tables?

default_statistics_target = 10

and no per-table changes. I’m not familiar with tuning this; would
you suggest a different value?

Here are the indexes that we have that differ from your
setup:

“users1” UNIQUE, btree (lower(name::text))

instead of:

“groups3” btree (lower(domain::text), lower(type::text))

we have:

“groups2” btree (lower(type::text), lower(domain::text), instance)

You also should definitely raise the statistics target to
at least 100, which is the new default in 8.4. We also have
the random_page_cost set to 2.0 since we are mainly memory
resident. I know that the index order needs to match the
query to be used, so maybe these index changes would help.

Thanks. Bizzarely, I can’t reproduce the problematic query now; I
wonder if it skewed severely by the hammering of another database
on the same server (a test run of rt2tort3, as it happens).

I think your suggested new index in group is correct; that’s
eliminated some more slow queries. I think I’ll try and put up a
set of annotated additional postgres indexes on the wiki, in lieu
of future updates to the indexes created by the schemas shipped with
RT.

Dominic Hargreaves, Systems Development and Support Team
Computing Services, University of Oxford

signature.asc (197 Bytes)

I think your suggested new index in group is correct; that’s
eliminated some more slow queries. I think I’ll try and put up a
set of annotated additional postgres indexes on the wiki, in lieu
of future updates to the indexes created by the schemas shipped with
RT.

(And possibly influencing future index updates as shipped by the vendor :wink:

I think your suggested new index in group is correct; that’s
eliminated some more slow queries. I think I’ll try and put up a
set of annotated additional postgres indexes on the wiki, in lieu
of future updates to the indexes created by the schemas shipped with
RT.

(And possibly influencing future index updates as shipped by the vendor :wink:

I’ve created http://wiki.bestpractical.com/view/DatabaseIndexes

I fear that I’ve lost some of the full rationale for the indexes
I’ve put in in the frenzy of preparation for our upgrade, but hopefully
this is still useful.

Dominic Hargreaves, Systems Development and Support Team
Computing Services, University of Oxford

signature.asc (197 Bytes)