Speeding up RT3

Hi,

I’m currently in the process of performance testing RT3 and trying to
speed things up where needed. Currently I’m finding that the front page
is taking about 10 seconds to load for a non-SuperUser (about 2-3
seconds for SuperUsers).

Tracking this down, it’s the QuickSearch box that is the major slowdown,
taking about 6 seconds to load; it seems most of that time is taken up
processing the ACLs in Perl.

I haven’t dug too far into the ACL system, but is there any way to
arrange the privileges that will result in faster ACL calls (e.g., only
assigning privileges to groups rather than to individuals)? Or any
other suggestions, short of throwing hardware at the problem?

My setup is:

RT 3.2.2

Oracle 9.2.0.5

About a million tickets

Apache 1.3.27/FastCGI

Regards

Matthew Watson

Netspace Online Systems

I’m currently in the process of performance testing RT3 and trying to
speed things up where needed. Currently I’m finding that the front page
is taking about 10 seconds to load for a non-SuperUser (about 2-3
seconds for SuperUsers).

What DBIx::SearchBuilder are you running? How many queues do
non-superusers have on the front page?

Tracking this down, it’s the QuickSearch box that is the major slowdown,
taking about 6 seconds to load; it seems most of that time is taken up
processing the ACLs in Perl.

Is that a guess or do you have profiling results?

If you don’t have profiling results, I’d strongly recommend you use
standalone_httpd and Devel::DProf to profile and post those results.
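The suggested workflow can be sketched roughly as follows (the install path is hypothetical; adjust to your own RT layout, and shut the server down cleanly so DProf can write a complete profile):

```shell
# A rough sketch of profiling RT's standalone server with Devel::DProf
# (hypothetical path; adjust to your install). Exercise the slow pages,
# then exit cleanly so that tmon.out is written out completely.
cd /opt/rt3/bin
perl -d:DProf ./standalone_httpd   # serve, click around, then shut down
dprofpp -O 20 tmon.out             # report the 20 most expensive routines
```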

Also, what sorts of CPUs are sitting behind that oracle database? Have
your DBAs spent time profiling what RT’s doing on the database side and
looking for index improvements?

-----Original Message-----
From: Jesse Vincent [mailto:jesse@bestpractical.com]
Sent: Friday, October 01, 2004 10:57 PM
To: Matthew Watson
Cc: rt-users@lists.fsck.com
Subject: Re: [rt-users] Speeding up RT3

I’m currently in the process of performance testing RT3 and trying to
speed things up where needed. Currently I’m finding that the front page
is taking about 10 seconds to load for a non-SuperUser (about 2-3
seconds for SuperUsers).

What DBIx::SearchBuilder are you running?

1.01

How many queues do
non-superusers have on the front page?

18 queues, which have the rights “SeeQueue”, “CreateTicket” and
“ReplyToTicket” assigned to “Everyone”.

Tracking this down, it’s the QuickSearch box that is the major slowdown,
taking about 6 seconds to load; it seems most of that time is taken up
processing the ACLs in Perl.

Is that a guess or do you have profiling results?

Profiling using MasonX::Profiler.

If you don’t have profiling results, I’d strongly recommend you use
standalone_httpd and Devel::DProf to profile and post those results.

I’ll have a look into doing that.

Also, what sorts of CPUs are sitting behind that oracle database? Have
your DBAs spent time profiling what RT’s doing on the database side
and
looking for index improvements?

I’m not sure what the CPUs are off-hand; it’s dual-processor Sun hardware
with 4GB of RAM, but I can’t remember exactly what. However, we have done
quite a bit of work making sure the database isn’t the cause of the
problems: all the queries for the front page take less than 1 second to
run (combined).

The web front end is a 3GHz P4 with a gig of RAM. We also tried running the
web interface on an E450, but that was even slower.

Regards,
Matt.



The rt-users Archives

Be sure to check out the RT wiki at http://wiki.bestpractical.com

This email and any files transmitted with it are confidential and intended solely for the
use of the individual or entity to whom they are addressed. Please notify the sender
immediately by email if you have received this email by mistake and delete this email
from your system. Please note that any views or opinions presented in this email are solely
those of the author and do not necessarily represent those of the organisation.
Finally, the recipient should check this email and any attachments for the presence of
viruses. The organisation accepts no liability for any damage caused by any virus
transmitted by this email.

What DBIx::SearchBuilder are you running?

1.01

Go up to the latest. We saw noticeable performance improvements using
the latest DBIx::SearchBuilder.
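One way to do the upgrade (a sketch; assumes the CPAN shell is configured for the perl that runs RT):

```shell
# Upgrade DBIx::SearchBuilder from CPAN for the perl RT uses.
perl -MCPAN -e 'install DBIx::SearchBuilder'
```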

How many queues do
non-superusers have on the front page?

18 queues, which have the rights “SeeQueue”, “CreateTicket” and
“ReplyToTicket” assigned to “Everyone”.

Tracking this down, it’s the QuickSearch box that is the major slowdown,
taking about 6 seconds to load; it seems most of that time is taken up
processing the ACLs in Perl.

Is that a guess or do you have profiling results?

Profiling using MasonX::Profiler.

*nod* That gets us down to the Mason component level, but it won’t tell us
anything about the underlying libraries ;-)

If you don’t have profiling results, I’d strongly recommend you use
standalone_httpd and Devel::DProf to profile and post those results.

I’ll have a look into doing that.

Wonderful.

Also, what sorts of CPUs are sitting behind that oracle database? Have
your DBAs spent time profiling what RT’s doing on the database side
and
looking for index improvements?

I’m not sure what the CPUs are off-hand; it’s dual-processor Sun hardware
with 4GB of RAM, but I can’t remember exactly what. However, we have done
quite a bit of work making sure the database isn’t the cause of the
problems: all the queries for the front page take less than 1 second to
run (combined).

Now that’s interesting. Have you ended up reindexing the Oracle
database? If so, I know some folks who would love to see the work
you’ve done. But that points to library issues, and the SearchBuilder
issues were the ones that were most visible in our profiling runs here
at Best Practical. (These days, much of RT’s time is spent in Mason
rendering the page itself. Amazon.com has been doing a bunch of Mason
performance work that will significantly improve that in the next big
release.)

Jesse

-----Original Message-----
From: Jesse Vincent [mailto:jesse@bestpractical.com]
Sent: Sunday, October 03, 2004 12:25 AM
To: Matthew Watson
Cc: rt-users@lists.fsck.com
Subject: Re: [rt-users] Speeding up RT3

What DBIx::SearchBuilder are you running?

1.01

Go up to the latest. We saw noticeable performance improvements using
the latest DBIx::SearchBuilder.

Ok, I’ve bumped it up to version 1.11. Performance seems to be a lot
better. I’m waiting to hear back from some of our staff who are testing.
If they are still having issues I’ll look further into profiling it with
Devel::DProf.

Now that’s interesting. Have you ended up reindexing the Oracle
database? If so, I know some folks who would love to see the work
you’ve done.

I’ll get a rundown of that from our DBA.

Thanks,

Regards,
Matt.


Now that’s interesting. Have you ended up reindexing the Oracle
database? If so, I know some folks who would love to see the work
you’ve done.

I’ll get a rundown of that from our DBA.

I’d be curious to know if my Postgres index changes were applicable to
Oracle as well. They currently seem to be very MySQL-specific in terms
of the way queries are built.

I posted them to the devel list (or was it this list) about a month or
so ago.

Now that’s interesting. Have you ended up reindexing the Oracle
database? If so, I know some folks who would love to see the work
you’ve done.

I’ll get a rundown of that from our DBA.

I’d be curious to know if my Postgres index changes were applicable to
Oracle as well. They currently seem to be very MySQL specific in
terms of the way queries are built.

I posted them to the devel list (or was it this list) about a month or
so ago.

On a large Pg site that I’ve got ties to, we didn’t see a whole lot of
change with them, actually.

j

All this talk about profiling and indexing has me playing again tonight.
A forewarning: I am not a coder, and I don’t do perl.

I’m trying desperately to use DProf to figure out where all my time is
being burnt up when loading larger tickets – database? Webserver?
Something else? However, I keep getting strange dprofpp errors
and numbers that don’t make sense (total time in the negatives).

This is a relatively clean perl 5.8.5 install (upgraded just yesterday).

Am I missing a module? Am I doing something wrong?

This appears to work on anything but standalone_httpd & webmux.pl. If I
run dprofpp -p rt, I get realistic figures (although, of course, it
doesn’t actually produce any valuable output), which makes me think there
is something specific and/or tricky about standalone_httpd.

I’m currently running RT 3.2.2.

Thanks!

[evelocity03:jdiegmueller] [/usr/local/rt/bin] # dprofpp -p
standalone_httpd
You can connect to your RT server at http://localhost:8080/

[evelocity03:jdiegmueller] [/usr/local/rt/bin] # dprofpp
Exporter::Heavy::heavy_export
Garbled profile, unexpected exit time stamp at /usr/local/bin/dprofpp line 666, line 755.

Faking 12 exit timestamp(s).
Total Elapsed Time = -8.35145 Seconds
User+System Time = 0 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
 0.00   1.030  1.810   74161  0.0000  0.0000 RT::Record::__Value
 0.00   0.918  4.497   14592  0.0001  0.0003 RT::Principal::HasRight
 0.00   0.899  2.198   48964  0.0000  0.0000 DBIx::SearchBuilder::Record::Cachable::new
 0.00   0.859  4.387   49090  0.0000  0.0001 RT::Record::LoadByCols
 0.00   0.790  0.719   49090  0.0000  0.0000 DBIx::SearchBuilder::Record::Cachable::_gen_alternate_RecordCache_key
 0.00   0.750  0.516  155929  0.0000  0.0000 DBIx::SearchBuilder::Record::id
 0.00   0.720  3.709   49090  0.0000  0.0001 DBIx::SearchBuilder::Record::Cachable::LoadByCols
 0.00   0.710  0.857   99034  0.0000  0.0000 Cache::Simple::TimedExpiry::fetch
 0.00   0.700  0.885   74161  0.0000  0.0000 DBIx::SearchBuilder::Record::Cachable::__Value
 0.00   0.630  1.397   49090  0.0000  0.0000 DBIx::SearchBuilder::Record::Cachable::_lookup_primary_RecordCache_key
 0.00   0.590  1.103   49090  0.0000  0.0000 DBIx::SearchBuilder::Record::Cachable::_fetch
 0.00   0.550  1.319   48964  0.0000  0.0000 DBIx::SearchBuilder::Record::new
 0.00   0.540  0.390   99870  0.0000  0.0000 Cache::Simple::TimedExpiry::expire
 0.00   0.530  0.383   98278  0.0000  0.0000 RT::Base::CurrentUser
 0.00   0.520  0.407   74812  0.0000  0.0000 DBIx::SearchBuilder::Record::__Value

Funny you should mention this… I’m having exactly the same problems.

DProf is giving me a lot of these errors:

unstacked calls in outer
Garbled profile is missing some exit time stamps:

The only way I can get it to show results is with dprofpp -F,
and this is what comes out:

Total Elapsed Time = -4.27563 Seconds
User+System Time = 0 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
 0.00   7.492 12.054      18  0.4162  0.6697 Locale::Maketext::Lexicon::Gettext::parse
 0.00   3.903  4.564   34594  0.0001  0.0001 Locale::Maketext::Lexicon::Gettext::transform
 0.00   0.589 13.217       2  0.2946  6.6083 Locale::Maketext::Lexicon::import
 0.00   0.540  0.486   29981  0.0000  0.0000 Encode::decode_utf8
 0.00   0.519  0.547      18  0.0288  0.0304 Locale::Maketext::Lexicon::lexicon_get
 0.00   0.448  1.313      32  0.0140  0.0410 HTML::Mason::Commands::BEGIN
 0.00   0.207  0.292       1  0.2069  0.2924 Convert::ASN1::parser::yyparse
 0.00   0.189  0.891      12  0.0158  0.0743 RT::I18N::BEGIN
 0.00   0.179  0.651      18  0.0099  0.0361 RT::Tickets::BEGIN
 0.00   0.169  0.188      15  0.0113  0.0126 Convert::ASN1::BEGIN
 0.00   0.149  2.173      13  0.0115  0.1672 RT::Record::BEGIN
 0.00   0.148  0.166       6  0.0247  0.0276 DBIx::SearchBuilder::Handle::BEGIN
 0.00   0.139  1.594      14  0.0099  0.1138 RT::User::BEGIN
 0.00   0.129  3.839      13  0.0099  0.2953 RT::Mason::BEGIN
 0.00   0.129  0.719       9  0.0143  0.0799 Net::LDAP::BEGIN

Which doesn’t look quite right. Are these the same issues you are
getting?

Matthew Watson

-----Original Message-----
From: rt-users-bounces@lists.bestpractical.com [mailto:rt-users-
bounces@lists.bestpractical.com] On Behalf Of Jason A. Diegmueller
Sent: Monday, 4 October 2004 2:46 PM
To: rt-users@lists.fsck.com
Subject: Re: [rt-users] Speeding up RT3


Matthew–

Yes, that’s precisely my problem. -F (“faking timestamps”?) will provide
output, but as with your case, the times are all off (total elapsed time is
negative, which is of course impossible).

perl 5.8.5
RT 3.2.2

I tried this on an RT 3.0.12 instance as well, same deal. This appears
only to affect webmux.pl, because I can run it against any other perl
scripts (“rt help”, as an example) and get sane (albeit useless) numbers.

Devel::Profiler appears to return sane numbers when run against webmux.pl,
however I can’t get it to work recursively (i.e., if I run it against
standalone_httpd it doesn’t profile the behavior of webmux.pl).

-jd

On Mon, 4 Oct 2004, Matthew Watson wrote:


Funny you should mention this… I’m having exactly the same problems.

DProf is giving me a lot of these errors

Create a mason page that contains only

% exit;

and call that after your test. When you kill the process, DProf doesn’t
deal with exit conditions normally.

Jesse–

Ok, great. I created a local/html/Elements/Quit with % exit; as you
suggest. Now, when I run dprofpp, I get valid times, but I still require
-F in order to run. Is this okay?

Thanks for your help …

-jd

On Mon, 4 Oct 2004, Jesse Vincent wrote:


Now that’s interesting. Have you ended up reindexing the Oracle
database? If so, I know some folks who would love to see the work
you’ve done.

Find attached our current oracle indexes for RT3.

We also had to set

optimizer_index_cost_adj = 30;

Otherwise the optimizer was passing up some of the indexes, I believe.
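For readers wanting to try this, a sketch of how the parameter can be set (the credentials are hypothetical; lowering optimizer_index_cost_adj makes Oracle's optimizer cost index access paths more cheaply, so it favours them over full scans):

```shell
# Try it per-session first (hypothetical connect string):
sqlplus rt_user/password <<'SQL'
ALTER SESSION SET optimizer_index_cost_adj = 30;
SQL

# Then instance-wide (Oracle 9i with an spfile):
sqlplus / as sysdba <<'SQL'
ALTER SYSTEM SET optimizer_index_cost_adj = 30 SCOPE=BOTH;
SQL
```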

Regards,
Matt.


rt-indexes.txt (7.87 KB)

On a large Pg site that I’ve got ties to, we didn’t see a whole lot of
change with them, actually.

Shocked I am… most of the indexes as defined in the distribution are
just not used at all. They can’t be, since the queries are against
lower(column) and the index is not. If you “explain” or “explain
analyze” the generated queries you’ll see what I mean.

However, some of the queries sped up from, say, 200ms to 85ms, so it may
not be that noticeable to the average user… :-) Pg is just plain
wicked fast anyhow.

I still do see a massive improvement on the “Home” page load time (I
have 8 queues visible to my personal account). Unfortunately the
Debug=1 parameter doesn’t cascade through to the links within the page, so
accurate timing is hard to get – when you reload the page with
?Debug=1 appended, the chances of the data being in the disk buffer cache
are high, so the speed will be way faster.

Try this experiment: turn on the postgres query logger (in
postgresql.conf, set log_min_duration_statement = 0), then hit some pages.
Look at the generated queries and run “explain analyze” against them after
some time, such that the disk buffers may have flushed out.

The old indexes are never used unless you happen to hit upon an integer
index, or some of the handful of queries against some ACLs that are not
lower()'d.
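The functional-index idea above can be sketched like this (the table and column names are illustrative, not RT's exact schema; the point is that the index must be built on the same lower() expression the query filters on):

```shell
# Build an index on the expression the query actually uses, then
# confirm with EXPLAIN ANALYZE that the planner picks it up.
psql rt3 <<'SQL'
CREATE INDEX users_name_lower ON Users ( LOWER(Name) );
EXPLAIN ANALYZE SELECT id FROM Users WHERE LOWER(Name) = 'root';
SQL
```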

Vivek Khera, Ph.D.
+1-301-869-4449 x806


Jesse Vincent wrote:

What DBIx::SearchBuilder are you running?

1.01

Go up to the latest. We saw noticeable performance improvements using the
latest DBIx::SearchBuilder.

This is not an exaggeration!

Excellent work.

Following up on this,

I’ve moved our RT install onto a new dual 3.2GHz Xeon system with 2GB
of RAM.

Not really seeing huge improvements.

Still seeing load times of about 5 seconds for index.html for a user
with access to about 20 queues, and 2-4 seconds for a ticket with a
single transaction.

I’ve tried using Devel::DProf, but the results just don’t look correct
(perhaps I’m using it wrong). For a start, it’s getting lots of unstacked
calls, but that seems to be normal.

But User+System Time seems way off. The following is from running
Devel::DProf with the standalone server, and spending a couple of minutes
clicking around the UI, loading various tickets, etc.

[root@stratus bin]# time perl -d:DProf standalone_httpd

real 1m52.397s
user 0m16.495s
sys 0m1.511s

[root@stratus bin]# dprofpp -O 10 tmon.out
Storable::_freeze has 1 unstacked calls in outer
Exporter::Heavy::heavy_require_version has 1 unstacked calls in outer
CGI::path_info has 1 unstacked calls in outer
CGI::cache has 1 unstacked calls in outer
Net::LDAP::import has -1 unstacked calls in outer
Sys::Syslog::AUTOLOAD has -7 unstacked calls in outer
utf8::AUTOLOAD has -1 unstacked calls in outer
RT::Interface::Web::Handler::new has -1 unstacked calls in outer
Exporter::export_tags has -1 unstacked calls in outer
Net::LDAP::Constant::import has 1 unstacked calls in outer
File::Glob::GLOB_TILDE has 1 unstacked calls in outer
AutoLoader::AUTOLOAD has -3 unstacked calls in outer
Storable::nfreeze has 1 unstacked calls in outer
Exporter::Heavy::heavy_export_ok_tags has 4 unstacked calls in outer
CGI::AUTOLOAD has -6 unstacked calls in outer
utf8::SWASHNEW has 1 unstacked calls in outer
Storable::thaw has 1 unstacked calls in outer
POSIX::AUTOLOAD has -1 unstacked calls in outer
Sys::Syslog::ANON has 7 unstacked calls in outer
Exporter::export_ok_tags has -4 unstacked calls in outer
Exporter::Heavy::heavy_export_to_level has 2 unstacked calls in outer
Exporter::export_to_level has -2 unstacked calls in outer
CGI::request_method has 1 unstacked calls in outer
Exporter::Heavy::heavy_export has 40 unstacked calls in outer
Socket::AUTOLOAD has -10 unstacked calls in outer
CGI::unescapeHTML has 1 unstacked calls in outer
File::Glob::GLOB_BRACE has 1 unstacked calls in outer
Exporter::Heavy::heavy_export_tags has 1 unstacked calls in outer
File::Glob::GLOB_NOMAGIC has 1 unstacked calls in outer
File::Glob::AUTOLOAD has -5 unstacked calls in outer
RT::Interface::Web::Handler::NewCGIHandler has 1 unstacked calls in outer
File::Glob::GLOB_QUOTE has 1 unstacked calls in outer
CGI::delete has 1 unstacked calls in outer
Fcntl::AUTOLOAD has -8 unstacked calls in outer
Socket::ANON has 10 unstacked calls in outer
POSIX::load_imports has 1 unstacked calls in outer
Fcntl::ANON has 8 unstacked calls in outer
File::Glob::GLOB_ALPHASORT has 1 unstacked calls in outer
Exporter::require_version has -1 unstacked calls in outer
Exporter::export has -40 unstacked calls in outer
CGI::header has 1 unstacked calls in outer
Total Elapsed Time = 103.1376 Seconds
User+System Time = 8.597692 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
 5.62   0.483  2.200     892  0.0005  0.0025 RT::Tickets::_parser
 5.20   0.447  0.447    1670  0.0003  0.0003 DBD::Oracle::st::_prepare
 4.37   0.376 10.089    3429  0.0001  0.0029 HTML::Mason::Commands::ANON
 4.13   0.355  1.717   13121  0.0000  0.0001 RT::Record::LoadByCols
 3.84   0.330  0.464    1457  0.0002  0.0003 Sys::Syslog::connect_udp
 2.87   0.247  0.660   13760  0.0000  0.0000 DBIx::SearchBuilder::Record::Cachable::new
 2.45   0.211  0.211  112307  0.0000  0.0000 HTML::Mason::MethodMaker::ANON
 2.12   0.182  0.244    6132  0.0000  0.0000 Regexp::Common::_decache
 2.05   0.176  0.424    5828  0.0000  0.0001 DBIx::SearchBuilder::Limit
 2.01   0.173  0.566   13121  0.0000  0.0000 DBIx::SearchBuilder::Record::Cachable::_lookup_primary_RecordCache_key

Any more suggestions on how we can speed RT3 up?

Regards,
Matt.

-----Original Message-----
From: Jesse Vincent [mailto:jesse@bestpractical.com]
Sent: Sunday, October 03, 2004 12:25 AM
To: Matthew Watson
Cc: rt-users@lists.fsck.com
Subject: Re: [rt-users] Speeding up RT3


Following up on this,

I’ve moved our RT install onto a new dual 3.2GHz Xeon system with 2GB
of RAM.

Not really seeing huge improvements.

Still seeing load times of about 5 seconds for index.html for a user
with access to about 20 queues, and 2-4 seconds for a ticket with a
single transaction.

I’ve tried using Devel::DProf, but the results just don’t look correct
(perhaps I’m using it wrong). For a start, it’s getting lots of unstacked
calls, but that seems to be normal.

  1. Are you exiting standalone_httpd by calling a mason page with % exit
     as its only content?

  2. What version of RT are you running? Is it 3.2.2? Can you test out
     3.2-HEAD from Subversion?

  3. What OS are you running?

  4. What version of DBIx::SearchBuilder are you running?

  5. Are you doing any profiling of how long the backend database is
     spending on each page view?

Total Elapsed Time = 103.1376 Seconds
User+System Time = 8.597692 Seconds

That’s 8.6 seconds of CPU time over 2 minutes. What pages did you view?
How many times?

Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
5.62 0.483 2.200 892 0.0005 0.0025 RT::tickets::_parser

I wouldn’t expect this to be so high (note that it’s not that high to
begin with) with RT 3.2-HEAD.

Looking at this, my best guess is RT is spending most of its time
waiting on the database.

As a follow on, how many queries should be run for index.html?

I’ve set up logging of the queries being run, and I find that for a
user with 25 queues, index.html generates about 250 queries!

Regards,
Matt.

Looking at this, my best guess is RT is spending most of its time
waiting on the database.

As a follow on, how many queries should be run for index.html?

I’ve set up logging of the queries being run, and I find that for a
user with 25 queues, index.html generates about 250 queries!

Are there duplicate queries? What percentage of them are ACLs? Do you
do per-queue ACLs or global ACLs? Want to answer the rest of my
debugging questions from the previous message?
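If the logged statements are in a flat file, one per line, the first two questions can be answered with a quick pipeline. This is only a sketch: `queries.log` and its one-statement-per-line format are assumptions standing in for whatever your logging setup produces, and a tiny inline sample takes its place here so the commands run on their own.

```shell
# Stand-in for the real query log (one SQL statement per line).
cat > queries.log <<'EOF'
SELECT ACL.id FROM ACL, Groups WHERE ACL.RightName = 'ShowTicket'
SELECT ACL.id FROM ACL, Groups WHERE ACL.RightName = 'ShowTicket'
SELECT main.* FROM Tickets main WHERE main.Status = 'new'
EOF

# Most-repeated statements first; any count above 1 is a duplicate.
sort queries.log | uniq -c | sort -rn | head -5

# Rough share of statements that touch the ACL table.
total=$(wc -l < queries.log)
acl=$(grep -c 'FROM ACL' queries.log)
echo "ACL checks: $acl / $total"

rm -f queries.log
```

On a real log, statements that differ only in literal values (ticket ids, queue ids) count as distinct, so stripping those out first gives a truer duplicate count.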

-----Original Message-----
From: Jesse Vincent [mailto:jesse@bestpractical.com]
Sent: Saturday, October 23, 2004 1:00 AM
To: Matthew Watson
Cc: rt-users@lists.fsck.com
Subject: Re: [rt-users] Speeding up RT3

Are there duplicate queries? What percentage of them are ACLs? Do you
do per-queue ACLs or global ACLs? Want to answer the rest of my
debugging questions from the previous message?

Sorry, yes, I wrote out these answers first but then forgot to send
them. :)

Almost all the queries appear to be on the ACL, e.g.:

SELECT * FROM (
  SELECT limitquery.*, rownum limitrownum FROM (
    SELECT ACL.id FROM ACL, Groups, Principals, CachedGroupMembers
    WHERE (ACL.RightName = 'SuperUser' OR ACL.RightName = 'ShowTicket')
      AND Principals.Disabled = 0
      AND CachedGroupMembers.Disabled = 0
      AND Principals.id = Groups.id
      AND Principals.id = CachedGroupMembers.GroupId
      AND CachedGroupMembers.MemberId = '174'
      AND (ACL.ObjectType = 'RT::System'
           OR (ACL.ObjectType = 'RT::Ticket' AND ACL.ObjectId = '1062173')
           OR (ACL.ObjectType = 'RT::Queue' AND ACL.ObjectId = '7'))
      AND ((ACL.PrincipalId = Principals.id
            AND ACL.PrincipalType = 'Group'
            AND (Groups.Domain = 'SystemInternal'
                 OR Groups.Domain = 'UserDefined'
                 OR Groups.Domain = 'ACLEquivalence'
                 OR Groups.Domain = 'Personal')))
  ) limitquery WHERE rownum <= 1
) WHERE limitrownum >= 1

Something I should mention is that I’ve edited the QuickSearch element
to have 4 columns ("new", "open", "stalled" and "all"), which will
affect the query numbers.

Acls are mostly per queue.

From what I can see, though, most of the time is spent doing things
other than waiting on the database.

An example from the index.html page (25 queues, non superuser)

Time to display: 5.843325
Time in Database: 1.678154
Number of Query: 231
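Those figures suggest most of the wall time is spent outside the database. A quick check of the arithmetic on the numbers above:

```shell
# Wall time spent outside the database, from the Time to display /
# Time in Database figures reported above.
awk 'BEGIN { total = 5.843325; db = 1.678154;
             printf "non-DB: %.2fs (%.0f%%)\n", total - db, 100 * (total - db) / total }'
# → non-DB: 4.17s (71%)
```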

  1. Are you exiting standalone_httpd by calling a Mason page with %exit
    as its only content?

Yep.

  2. What version of RT are you running? Is it 3.2.2? Can you test out
    3.2-HEAD from Subversion?

Yeah, it’s 3.2.2.

Where is 3.2-HEAD? (Not very used to Subversion.)

-bash-2.05b$ svn co svn://svn.bestpractical.com/rt/branches/3.2-HEAD
svn: URL 'svn://svn.bestpractical.com/rt/branches/3.2-HEAD' doesn't
exist

  3. What OS are you running?
Linux 2.6.8.1 #20 SMP

  4. What version of DBIx::SearchBuilder are you running?

1.11

  5. Are you doing any profiling of how long the backend database is
    spending on each page view?

Total Elapsed Time = 103.1376 Seconds
User+System Time = 8.597692 Seconds

That’s 8.6 seconds of CPU time over 2 minutes. What pages did you view?
How many times?

Front page about 3-4 times; a few queues, a few tickets.

I can come up with some specifics if it’s more helpful (e.g. just load
index.html and then exit).

Thanks.

Regards
Matt Watson
