3.4.0rc1 slow

David_Kerry · January 7, 2005, 6:08pm

Hello,

I’ve been spending some time with 3.4.0rc1 lately with some of
our existing data and have found some issues related to slow
or excessive database access.

Setup:

Postgres 7.4.6
RT 3.4.0rc1
SearchBuilder 1.16
Apache 2+mod_perl2 (Debian sarge)

Hardware: dual 3Ghz Xeon server

Both rt and database are on the same machine.

I have 2 issues:

The query builder page (when you hit the ‘tickets’ link) is
exceptionally slow to initially come up. About 10 seconds just
to bring up the empty form. Is this normal?

I’ve looked at the postgres logs, and it seems to do many many iterations
of this type of query:

LOG: duration: 1983.993 ms statement:
SELECT DISTINCT main.* FROM (Groups main LEFT JOIN Principals Principals_2 ON ( main.id = Principals_2.id)),
ACL ACL_1 WHERE ((ACL_1.RightName = ‘OwnTicket’)) AND ((Principals_2.Disabled = ‘0’)) AND ( ( ACL_1.PrincipalId = main.id AND
ACL_1.PrincipalType = ‘Group’ AND ( main.Domain = ‘SystemInternal’ OR main.Domain = ‘UserDefined’ OR main.Domain =
‘ACLEquivalence’)) OR ( ( (main.Domain = ‘RT::Queue-Role’ AND main.Instance = 102) ) AND main.Type = ACL_1.PrincipalType) ) AND
(ACL_1.ObjectType = ‘RT::System’ OR (ACL_1.ObjectType = ‘RT::Queue’ AND ACL_1.ObjectId = 102) ) ORDER BY main.Name ASC

At the time, there were about 50k+ users and 20k+ groups in the
database.
The method used in the Postgres version of SearchBuilder/Handle/Pg.pm
for insert() is suboptimal. I’ve turned on postgres logging to dump
any queries taking > .5 seconds and the query in this method comes up
continually:

ie:

postgres[18131]: [6610-1] LOG: duration: 881.154 ms statement: SELECT id FROM Attachments WHERE oid = ‘3609306’

From doing a bit of reading in the postgres docs, the oid should not
be used for this type of thing - they can/will roll, which can lead
to duplicates too.

The same query, selecting on ‘id’ from the Attachments table is
virtually instantaneous.

I would highly recommend taking the approach used in the Oracle version
of this class and select the next id from a sequence and insert using
that instead of depending on the database to auto-generate an id and
then having to figure out what id it generated.

Alternately - use guids everywhere instead of sequences/ids - easy to
generate, avoids duplicate key issues on imports and has many other
benefits.

This becomes a serious issue when you throw database replication into
the mix too.

David Kerry

Jesse_Vincent · January 7, 2005, 10:59pm

I’ve looked at the postgres logs, and it seems to do many many iterations
of this type of query:

LOG: duration: 1983.993 ms statement:
SELECT DISTINCT main.* FROM (Groups main LEFT JOIN Principals Principals_2 ON ( main.id = Principals_2.id)),
ACL ACL_1 WHERE ((ACL_1.RightName = ‘OwnTicket’)) AND ((Principals_2.Disabled = ‘0’)) AND ( ( ACL_1.PrincipalId = main.id AND
ACL_1.PrincipalType = ‘Group’ AND ( main.Domain = ‘SystemInternal’ OR main.Domain = ‘UserDefined’ OR main.Domain =
‘ACLEquivalence’)) OR ( ( (main.Domain = ‘RT::Queue-Role’ AND main.Instance = 102) ) AND main.Type = ACL_1.PrincipalType) ) AND
(ACL_1.ObjectType = ‘RT::System’ OR (ACL_1.ObjectType = ‘RT::Queue’ AND ACL_1.ObjectId = 102) ) ORDER BY main.Name ASC

At the time, there were about 50k+ users and 20k+ groups in the
database.

I’d love it if you could dig a bit more into what’s killing that query.
Pull off the distinct and see if we’re doing something stupid and
generating a huge cartesian porduct?

The method used in the Postgres version of SearchBuilder/Handle/Pg.pm
for insert() is suboptimal. I’ve turned on postgres logging to dump
any queries taking > .5 seconds and the query in this method comes up
continually:

Interesting. At the time we did that (years ago now), that was the
recommended way to do unique ids on postgres. I’d love to see a patch
for improved behaviour.

ie:

postgres[18131]: [6610-1] LOG: duration: 881.154 ms statement: SELECT id FROM Attachments WHERE oid = ‘3609306’

From doing a bit of reading in the postgres docs, the oid should not
be used for this type of thing - they can/will roll, which can lead
to duplicates too.

But not instantly. And the only time we’re doing that is immediately
after the insert.

Andrew_Sullivan · January 7, 2005, 11:13pm

From doing a bit of reading in the postgres docs, the oid should not
be used for this type of thing - they can/will roll, which can lead
to duplicates too.

But not instantly. And the only time we’re doing that is immediately
after the insert.

More importantly, Postgres won’t have oids on most tables by default
in the near future. So it won’t even work unless your DBA knows to
flip the WITH OIDS option.

A

Andrew Sullivan | ajs@crankycanuck.ca
A certain description of men are for getting out of debt, yet are
against all taxes for raising money to pay it off.
–Alexander Hamilton

Jesse_Vincent · January 8, 2005, 4:10am

More importantly, Postgres won’t have oids on most tables by default
in the near future. So it won’t even work unless your DBA knows to
flip the WITH OIDS option.

It was only a year or two that I was seeing people widely recommending
the strategy SB is using to get insert ids. There are going to be lots
of sad webapp developers out there But at least I can fix it easily.

David_Kerry · January 8, 2005, 5:28am

I’ve looked at the postgres logs, and it seems to do many many iterations
of this type of query:

LOG: duration: 1983.993 ms statement:
SELECT DISTINCT main.* FROM (Groups main LEFT JOIN Principals Principals_2 ON ( main.id = Principals_2.id)),
ACL ACL_1 WHERE ((ACL_1.RightName = ‘OwnTicket’)) AND ((Principals_2.Disabled = ‘0’)) AND ( ( ACL_1.PrincipalId = main.id AND
ACL_1.PrincipalType = ‘Group’ AND ( main.Domain = ‘SystemInternal’ OR main.Domain = ‘UserDefined’ OR main.Domain =
‘ACLEquivalence’)) OR ( ( (main.Domain = ‘RT::Queue-Role’ AND main.Instance = 102) ) AND main.Type = ACL_1.PrincipalType) ) AND
(ACL_1.ObjectType = ‘RT::System’ OR (ACL_1.ObjectType = ‘RT::Queue’ AND ACL_1.ObjectId = 102) ) ORDER BY main.Name ASC

At the time, there were about 50k+ users and 20k+ groups in the
database.

I’d love it if you could dig a bit more into what’s killing that query.
Pull off the distinct and see if we’re doing something stupid and
generating a huge cartesian porduct?

Ok - my mistake - I hadn’t done an analyze on the tables/indexes when I ran
that (was taking 30+ seconds without the analyze to bring up the query page).

After the analyze, that same query drops down to ~500ms. Still fairly long
given the hardware I’m running on, however.

I’ve played with the query a bit, and removing the distinct doesn’t really
change things much. The result is 8 rows returned instead of 7. Removing
the distinct and order by results in the same cost by the optimizer
(according to ‘explain’, anyways).

I also did some more investigation and there are other things amiss I
think.

Simply bringing up the tickets->query builder page generates over 129 sql
queries ?! 19 of those queries took between 400 and 600ms apiece. That’s
an awful lot of data trolling to bring up one screen. Is that right, or
is my setup misbehaving/borked somehow? I have saved the entire sql trace
in case anyone is interested in looking at it in detail.

Everything else so far seems to be reasonable snappy except for this one
screen, which is unfortunate, because my users will be running into that
particular one continuously.

The method used in the Postgres version of SearchBuilder/Handle/Pg.pm
for insert() is suboptimal. I’ve turned on postgres logging to dump
any queries taking > .5 seconds and the query in this method comes up
continually:

Interesting. At the time we did that (years ago now), that was the
recommended way to do unique ids on postgres. I’d love to see a patch
for improved behaviour.

I’ve got a patch worked up for this - basically I just copied the oracle
way (which should be many many times faster, but I still have to test it).
This should give improved insert performance (which has been an issue
for us in the past as well under particularly high incoming mail volume).

David Kerry

Jesse_Vincent · January 8, 2005, 6:21am

Simply bringing up the tickets->query builder page generates over 129 sql
queries ?! 19 of those queries took between 400 and 600ms apiece. That’s
an awful lot of data trolling to bring up one screen. Is that right, or
is my setup misbehaving/borked somehow? I have saved the entire sql trace
in case anyone is interested in looking at it in detail.

My first guess is that it’s mostly in the dropdowns for picking ticket
owners by name or group membership. Using MasonX::Profiler, you might
be able to see which components are causing the slowdowns, which will
help to find out how to fix em. Was 3.2 similarly slow?

Todd_Chapman · January 8, 2005, 6:04am

I have found that screen to be particularly slow in 3.2

-ToddOn Sat, Jan 08, 2005 at 01:21:57AM -0500, Jesse Vincent wrote:

Simply bringing up the tickets->query builder page generates over 129 sql
queries ?! 19 of those queries took between 400 and 600ms apiece. That’s
an awful lot of data trolling to bring up one screen. Is that right, or
is my setup misbehaving/borked somehow? I have saved the entire sql trace
in case anyone is interested in looking at it in detail.

My first guess is that it’s mostly in the dropdowns for picking ticket
owners by name or group membership. Using MasonX::Profiler, you might
be able to see which components are causing the slowdowns, which will
help to find out how to fix em. Was 3.2 similarly slow?

Everything else so far seems to be reasonable snappy except for this one
screen, which is unfortunate, because my users will be running into that
particular one continuously.

Rt-devel mailing list
Rt-devel@lists.bestpractical.com
The rt-devel Archives

Andrew_Sullivan · January 8, 2005, 3:54pm

More importantly, Postgres won’t have oids on most tables by default
in the near future. So it won’t even work unless your DBA knows to
flip the WITH OIDS option.

It was only a year or two that I was seeing people widely recommending
the strategy SB is using to get insert ids. There are going to be lots
of sad webapp developers out there But at least I can fix it easily.

Yike. That’s unfortunate, because it was always a bad idea, although
not as potentially bad as it is now (ok, not quite always.
Originally the oids were regarded as a feature, if you harken back to
the days when it really was Postgres rather than PostgreSQL). But I
certainly remember being cautioned by some against using them in
pre-7.0 versions, and that’s several years ago. That doesn’t mean
plenty of people weren’t advocates of them as well.

A

Andrew Sullivan | ajs@crankycanuck.ca
Information security isn’t a technological problem. It’s an economics
problem.
–Bruce Schneier

Andrew_Sullivan · January 8, 2005, 3:57pm

Ok - my mistake - I hadn’t done an analyze on the tables/indexes when I ran
that (was taking 30+ seconds without the analyze to bring up the query page).

I did wonder about this area, because it’s something I know was
recently improved. But this one. . .

Simply bringing up the tickets->query builder page generates over 129 sql
queries ?! 19 of those queries took between 400 and 600ms apiece. That’s

. . . is, according to one of the DBAs in my group at work, primarily
due to the way the ACLs are processed. So, more objects for
permissions entails more queries, and the delay expands linearly
accorfdingly. We are also finding that screen to make the system
extremely hard to use.

A

Andrew Sullivan | ajs@crankycanuck.ca
This work was visionary and imaginative, and goes to show that visionary
and imaginative work need not end up well.
–Dennis Ritchie

Jesse_Vincent · January 8, 2005, 9:01pm

Ok - my mistake - I hadn’t done an analyze on the tables/indexes when I ran
that (was taking 30+ seconds without the analyze to bring up the query page).

I did wonder about this area, because it’s something I know was
recently improved. But this one. . .

Simply bringing up the tickets->query builder page generates over 129 sql
queries ?! 19 of those queries took between 400 and 600ms apiece. That’s

. . . is, according to one of the DBAs in my group at work, primarily
due to the way the ACLs are processed. So, more objects for
permissions entails more queries, and the delay expands linearly
accorfdingly. We are also finding that screen to make the system
extremely hard to use.

I’d love to see suggestions about how we can improve the search UI
without completely breaking it for users of downlevel browsers. (Which
means that pretty javascript hacks and popup windows are pretty much
right out.) Would people be satisfied if the pick lists for users and
groups were transformed into text entry widgets?

Andrew_Sullivan · January 8, 2005, 9:52pm

I’d love to see suggestions about how we can improve the search UI
without completely breaking it for users of downlevel browsers. (Which

Well, the problem here seems more to be the number of round trips.
One of my DB guys is looking at how the queries could be altered to
get more in one go, but he’s convinced that it needs a great deal of
work to solve.

right out.) Would people be satisfied if the pick lists for users and
groups were transformed into text entry widgets?

I think that’d just create a new problem, that you’d end up with
groups and users that are typos.

A

Andrew Sullivan | ajs@crankycanuck.ca
The plural of anecdote is not data.
–Roger Brinner

Ruslan_U_Zakirov1 · January 9, 2005, 1:35pm

Jesse Vincent wrote:

[snip]

Would people be satisfied if the pick lists for users and
groups were transformed into text entry widgets?

No.

			Regards, Ruslan.

David_Kerry · January 9, 2005, 8:43pm

Simply bringing up the tickets->query builder page generates over 129 sql
queries ?! 19 of those queries took between 400 and 600ms apiece. That’s
an awful lot of data trolling to bring up one screen. Is that right, or
is my setup misbehaving/borked somehow? I have saved the entire sql trace
in case anyone is interested in looking at it in detail.

My first guess is that it’s mostly in the dropdowns for picking ticket
owners by name or group membership. Using MasonX::Profiler, you might
be able to see which components are causing the slowdowns, which will
help to find out how to fix em. Was 3.2 similarly slow?

I haven’t used 3.2 yet. I’m considering making the big leap from 2.0.13
direct to 3.4 (due to growing complaints from my users over the slowness
of 2.0).

I’ll play with MasonX::Profile and see what I can find out as well.

David Kerry

Vivek_Khera1 · January 10, 2005, 2:26am

postgres[18131]: [6610-1] LOG: duration: 881.154 ms statement:
SELECT id FROM Attachments WHERE oid = ‘3609306’

From doing a bit of reading in the postgres docs, the oid should
not
be used for this type of thing - they can/will roll, which can lead
to duplicates too.

But not instantly. And the only time we’re doing that is immediately
after the insert.

also, there’s no index on the oid’s so it could result in a long table
scan for a large attachments table.

smime.p7s (2.42 KB)

Tony_Bowden · January 10, 2005, 7:10pm

I’d love to see suggestions about how we can improve the search UI
without completely breaking it for users of downlevel browsers. (Which
means that pretty javascript hacks and popup windows are pretty much
right out.)

Why does it rule out Javascript. What’s wrong with having something
fancier for people who have JS with the fall back to the clumsier
approach for people who don’t.

Surely JS should only be avoided if it’s the only way to achieve
something?

Tony