Performance with many attachments and transactions

Kevin_Murphy · November 17, 2005, 5:07pm

Hi all,

Just a sanity check:

I’m assuming it is not unreasonable for a ticket with 44 attachments and
75 transactions to take 13-14 seconds to load?

RT 3.4.4 is running on Apache 1.3.33 with mod_perl compiled in, Perl
5.8.6, database is PostgreSQL 8.0.4, DBIx::SearchBuilder is 1.33, Mason
is 1.3101. Machine is Mac OS X 10.4.3 on dual 1.42 GHz PowerMac G4 with
2GB RAM, idle at the time of my test.

There are 433 separate database queries made in the course of loading
this ticket, whose cumulative time (as measured in the postgresql log
file) is just 1.19 seconds, but the entire load takes 13-14 seconds,
both according to the RT footer and by measuring the elapsed time across
all database queries in the postgresql log file.

The ticket’s html source is 113,683 bytes and is sent over a 100Mbit/sec
LAN. The time for my browser (Firefox) to receive and render a static
version of the ticket page from the same web server is well under a
second. So query, transmission, and rendering time is <= 2 seconds,
perl squirrelling is ~ 11-12 seconds. Seems like a lot, somehow.

Thanks,
Kevin Murphy

Jesse_Vincent · November 17, 2005, 5:10pm

The ticket’s html source is 113,683 bytes and is sent over a 100Mbit/sec
LAN. The time for my browser (Firefox) to receive and render a static
version of the ticket page from the same web server is well under a
second. So query, transmission, and rendering time is <= 2 seconds,
perl squirrelling is ~ 11-12 seconds. Seems like a lot, somehow.

I’d put (small amounts of) money on the problem here being the
colorization of the different levels of quoted message.

Kevin_Murphy · November 17, 2005, 5:50pm

Jesse Vincent wrote:

The ticket’s html source is 113,683 bytes and is sent over a 100Mbit/sec
LAN. The time for my browser (Firefox) to receive and render a static
version of the ticket page from the same web server is well under a
second. So query, transmission, and rendering time is <= 2 seconds,
perl squirrelling is ~ 11-12 seconds. Seems like a lot, somehow.

I’d put (small amounts of) money on the problem here being the
colorization of the different levels of quoted message.

That doesn’t seem to be it. I poked around, and in
ShowTransactionAttachments, I commented out the following two lines:

            eval { $content = Text::Quoted::extract($content); };
            if ($@) { 1; }

After restarting, I can see that the quoted passages in comments are no
longer colorized, but the speed is the same. (There are only a dozen
instances of quoting, anyway). There are 15 file attachments on this
ticket, BTW – not sure if that’s unusual.

Is this worth continuing to investigate, or should we just start setting
aside money for a new server ?

Thanks,
Kevin Murphy

Jesse_Vincent · November 17, 2005, 6:25pm

After restarting, I can see that the quoted passages in comments are no
longer colorized, but the speed is the same. (There are only a dozen
instances of quoting, anyway). There are 15 file attachments on this
ticket, BTW – not sure if that’s unusual.

Is this worth continuing to investigate, or should we just start setting
aside money for a new server ?

It’d be worth running the standalone_httpd under perl’s profiler. But
hey, new hardware is always nice

Kevin_Murphy · November 17, 2005, 8:40pm

Jesse Vincent wrote:

It’d be worth running the standalone_httpd under perl’s profiler.

OK, I tried this (after shutting down the main web server):

$ sudo perl -d:DProf /usr/local/rt3/bin/standalone_httpd 80

And got this error message:

want panicked: I’ve been asked to find a null return address.
(Are you trying to call me from inside a tie handler?)
at /Library/Perl/5.8.6/darwin-thread-multi-2level/Want.pm line 100,
line 16.

Yowza. I have no idea how to respond to that. I could try building a
custom perl (I’m currently using the stock Apple perl), re-installing RT
prerequisites and giving that a go. If it behaves better, I’ll re-build
mod_perl and Apache, I guess.

Thanks,
Kevin Murphy

Vivek_Khera1 · November 17, 2005, 9:05pm

There are 433 separate database queries made in the course of
loading this ticket, whose cumulative time (as measured in the
postgresql log file) is just 1.19 seconds, but the entire load
takes 13-14 seconds, both according to the RT footer and by
measuring the elapsed time across all database queries in the
postgresql log file.

Did you also measure the time to transfer the bits from the Pg server
to the web server? That may also be a bottleneck. I can literally
see the delay in painting the page when RT encounters a large
attachment only to report a link with the size of the attachment.

It seems to me RT is fetching the whole large attachment just to find
its size and content type rather than just fetching the size from the
DB. I could be wrong, though.

Jesse_Vincent · November 17, 2005, 9:07pm

There are 433 separate database queries made in the course of
loading this ticket, whose cumulative time (as measured in the
postgresql log file) is just 1.19 seconds, but the entire load
takes 13-14 seconds, both according to the RT footer and by
measuring the elapsed time across all database queries in the
postgresql log file.

Did you also measure the time to transfer the bits from the Pg server
to the web server? That may also be a bottleneck. I can literally
see the delay in painting the page when RT encounters a large
attachment only to report a link with the size of the attachment.

Uh. It shouldn’t be doing that with a modern version of RT. (If it is,
we’re missing a ->columns() somewhere.

Kevin_Murphy · November 28, 2005, 4:53pm

There are 433 separate database queries made in the course of
loading this ticket, whose cumulative time (as measured in the
postgresql log file) is just 1.19 seconds, but the entire load
takes 13-14 seconds

Did you also measure the time to transfer the bits from the Pg server
to the web server? That may also be a bottleneck. I can literally
see the delay in painting the page when RT encounters a large
attachment only to report a link with the size of the attachment.

Uh. It shouldn’t be doing that with a modern version of RT. (If it is,
we’re missing a ->columns() somewhere.

Oh ho: in my case, Vivek is right. The file attachments seem to be
sent to RT from the database even when they are not displayed.

I just ran tcpdump on the PostgreSQL socket held by standalone_httpd,
and the total amount of traffic on that port was 755kb. The sum of the
sizes of the attachments on this ticket is 620kb. Also, I can see text
from some of the non-displayed text attachments in the tcpdump packet
output. Not that it matters, but 154kb of the attachments are from
binary files, the remaining 466kb are from .txt or .html files. The
generated HTML is 116kb. MaxInlineBody is set to 10000, which excludes
all but 18.2kb of the attachments from being displayed. Possibly
relevant: I must confess that I’ve made the following customization to
ShowTransactionAttachments, which prevents any of the file attachments
from being displayed:

if (   $message->ContentType =~ m{^(text|message)}i
    && !$message->Filename   # <== prevent any file attachment from

being displayed in-line
&& $size <= $RT::MaxInlineBody )

I’m using RT 3.4.4, PG 8.0.4, DBIx::SearchBuilder v1.35.

Other than the above customization, I have made only 3 or 4 cosmetic
customizations of stock RT source files.

Assuming that my change above doesn’t inadvertently drag file contents
over … could this somehow be a postgresql-specific problem? Can
somebody else who uses postgresql see if they have this problem also?

I’ll try to debug this, but I’m not familiar with RT internals, so I may
be a bit slow (and unfortunately I can’t get the Perl profiler and
debugger to work with standalone_httpd on my platform: Mac OS X 10.4.3,
Perl 5.8.7 from source).

Thanks,
Kevin Murphy

Kevin_Murphy · November 28, 2005, 6:00pm

Kevin Murphy wrote:

In my case, Vivek is right. The file attachments seem to be sent to
RT from the database even when they are not displayed.

Looking at the database log, I can see that it is only the text/plain or
text attachments whose content is automatically queried by RT. The
content of binary files is not automatically fetched:

SELECT DISTINCT main.id AS id, main.filename, main.Content AS content,
main.ContentType AS contenttype, main.TransactionId AS transactionid,
main.ContentEncoding AS contentencoding
FROM Attachments main, Transactions Transactions_1, Tickets Tickets_2
WHERE ((Tickets_2.EffectiveId = ‘1717’))
AND ((Transactions_1.ObjectId = Tickets_2.id))
AND ((Transactions_1.ObjectType = ‘RT::Ticket’))
AND ((main.ContentType = ‘text/plain’) or (main.ContentType LIKE
‘message/%’) or (main.ContentType = ‘text’))
AND ((main.TransactionId = Transactions_1.id));

For my users, it makes sense never to display the contents of named file
attachments unless the user requests one by accessing the content link
for an attachment. And of course, if attachment content is not
displayed, it should really not be fetched from the database. RT’s
above database behavior seems to reflect its historical preference for
displaying all text/plain and text file attachments in-line (a
preference that I don’t think makes much sense – why should textual
file attachments be treated any differently than binary?)

I guess this is not an issue for most people, because most users of RT
deal mostly in binary file attachments.
However, we are a bioinformatics research lab, and we deal in a lot of
large tab-delimited and other text files.

I try to educate my users to only upload compressed versions of text
files (which is also preferable for other reasons), but I’d rather see
RT modified to not fetch content if it knows that it doesn’t need to
display it.

Thanks,
Kevin Murphy

Jesse_Vincent · November 28, 2005, 6:13pm

For my users, it makes sense never to display the contents of named file
attachments unless the user requests one by accessing the content link
for an attachment. And of course, if attachment content is not
displayed, it should really not be fetched from the database. RT’s
above database behavior seems to reflect its historical preference for
displaying all text/plain and text file attachments in-line (a
preference that I don’t think makes much sense – why should textual
file attachments be treated any differently than binary?)

Seams reasonable enough. I’ve created a ticket for this one, but it’ll
likely get fixed a lot faster if someone steps up with a patch.

Jesse

Stephen_Turner · November 28, 2005, 8:03pm

At Monday 11/28/2005 01:13 PM, Jesse Vincent wrote:

Seams reasonable enough. I’ve created a ticket for this one, but it’ll
likely get fixed a lot faster if someone steps up with a patch.

Jesse

Can you clarify the requirements? It seems like there are all kinds of
conditions on whether to display an attachment or not that would be
difficult to cram into a single database query.

What would be relatively simple though (I think), is to add a “size less
than MaxInlineBody” limit to the query that gets text attachments. Would
that be enough?

Thanks,
Steve

Jesse_Vincent · November 28, 2005, 8:06pm

At Monday 11/28/2005 01:13 PM, Jesse Vincent wrote:

Seams reasonable enough. I’ve created a ticket for this one, but it’ll
likely get fixed a lot faster if someone steps up with a patch.

Jesse

Can you clarify the requirements? It seems like there are all kinds of
conditions on whether to display an attachment or not that would be
difficult to cram into a single database query.

What would be relatively simple though (I think), is to add a “size less
than MaxInlineBody” limit to the query that gets text attachments. Would
that be enough?

That was along the lines of what I was thinking. Though I have a nagging
feeling that soem database or other is going to turn this into an even
slower query, possibly by taking the attachments out of the database,
normalizing them to utf-7, counting the bits in each and then summing
the bits using a stored procedure coded in TCL.

Ruslan_Zakirov · November 29, 2005, 2:11am

At Monday 11/28/2005 01:13 PM, Jesse Vincent wrote:

Seams reasonable enough. I’ve created a ticket for this one, but it’ll
likely get fixed a lot faster if someone steps up with a patch.

Jesse

Can you clarify the requirements? It seems like there are all kinds of
conditions on whether to display an attachment or not that would be
difficult to cram into a single database query.

What would be relatively simple though (I think), is to add a “size less
than MaxInlineBody” limit to the query that gets text attachments. Would
that be enough?
We could build “LENGTH(Content) < $MaxInlineBody” condition, but we
can’t use ContentLength property because it’s buried in the Headers
field. You should look into SB to see how to do such limit.

Thanks,
Steve

The rt-users Archives

Be sure to check out the RT Wiki at http://wiki.bestpractical.com

Download a free sample chapter of RT Essentials from O’Reilly Media at http://rtbook.bestpractical.com

WE’RE COMING TO YOUR TOWN SOON - RT Training in Amsterdam, Boston and
San Francisco - Find out more at http://bestpractical.com/services/training.html

Best regards, Ruslan.

Ben_Goodwin · November 29, 2005, 11:34pm

I hate to chime in with a “me too” especially without hard data but:

We were running 3.5.something on a dual P3 1.4 with 2 gig ram and 15kRPM
SCSI RAID as the webserver and the same configuration for the database on
another box. Loading tickets with any reasonable amount of history to them
was slow, and tickets with a lot of history would take 90 seconds to load.
All the time was spent in Perl (not the database, which was 85% in RAM). So
I decided to upgrade. We now have the webserver and DB server on a single
dual P4 Xeon 3.4GHz (2MB cache 800 MHz bus) with 6 gig of ram on RHEL4. The
DB is about 1.5 gig so the ENTIRE DB is in RAM - DB calls are nice and fast.
Full content searches take < 10 seconds (MySQL). However, ticket displays,
while faster, are not nearly as improved as I’d hoped. The same 90-second
ticket is down to about 60 seconds - again, all perl (FCGI under Apache2),
sitting at 100% CPU until the rendering is done. One of these days I’ll
spend time looking into this but thought I’d at least chime in to indicate
"you’re not alone" as well as indicate that even with very beefy hardware,
it’s still slow.

-=| Ben

Jesse_Vincent · November 29, 2005, 11:37pm

I hate to chime in with a “me too” especially without hard data but:

We were running 3.5.something on a dual P3 1.4 with 2 gig ram and 15kRPM
SCSI RAID as the webserver and the same configuration for the database on
another box. Loading tickets with any reasonable amount of history to them
was slow, and tickets with a lot of history would take 90 seconds to load.
All the time was spent in Perl (not the database, which was 85% in RAM). So
I decided to upgrade. We now have the webserver and DB server on a single
dual P4 Xeon 3.4GHz (2MB cache 800 MHz bus) with 6 gig of ram on RHEL4. The
DB is about 1.5 gig so the ENTIRE DB is in RAM - DB calls are nice and fast.
Full content searches take < 10 seconds (MySQL). However, ticket displays,
while faster, are not nearly as improved as I’d hoped. The same 90-second
ticket is down to about 60 seconds - again, all perl (FCGI under Apache2),
sitting at 100% CPU until the rendering is done. One of these days I’ll
spend time looking into this but thought I’d at least chime in to indicate
“you’re not alone” as well as indicate that even with very beefy hardware,
it’s still slow.

Some actual profiling data would be useful. Either perl-side or
mysql-side.

Nate_Duehr · November 29, 2005, 11:49pm

Ben Goodwin wrote:

I hate to chime in with a “me too” especially without hard data but:

We were running 3.5.something on a dual P3 1.4 with 2 gig ram and 15kRPM
SCSI RAID as the webserver and the same configuration for the database on
another box. Loading tickets with any reasonable amount of history to them
was slow, and tickets with a lot of history would take 90 seconds to load.

This seems really wrong from what I’ve seen. Running Apache v1, and
MySQL - I see huge tickets take less time on an Athlon 2500 with 1G of
RAM.

Are you sure you weren’t struggling with the Linux kernel’s general
problems above 4GB of RAM? If the kernel wasn’t compiled correctly,
there were some serious performance problems above 4GB.

Were you ever able to classify the latency caused by not having the DB
and the webserver be the same machine in this setup above?

All the time was spent in Perl (not the database, which was 85% in RAM). So
I decided to upgrade. We now have the webserver and DB server on a single
dual P4 Xeon 3.4GHz (2MB cache 800 MHz bus) with 6 gig of ram on RHEL4. The
DB is about 1.5 gig so the ENTIRE DB is in RAM - DB calls are nice and fast.

That helps a lot. I found that using even the example MySQL “large”
database configuration stuff in the mysql docs also helped performance
immensely. Even though bottlenecks LOOKED like they were Perl, Perl was
waiting on the DB.

Watching vmstat for page faults, since my server also does “other”
things, I do see page faults when the server’s busy – I should add RAM
– but I tweaked a while on mysql settings to try to find a happy medium
where if three or so users are pulling pages at the same time, the box
isn’t page faulting on the DB.

Later on, I also realized in my little project, that I didn’t need old
tickets so badly that I had to have them online, and I used Ruslan’s
Shredder to dump about three years of DB content not too long ago – the
system sped WAY up on things like huge searches that included tickets
back that old, but it didn’t really change the system speed feel for
single ticket work.

Attachments seem to be the real performance killer. The more
attachments, the slower pages render. Without attachments, or with only
one or two, I have no complaints – especially on such cheap hardware!

Full content searches take < 10 seconds (MySQL). However, ticket displays,
while faster, are not nearly as improved as I’d hoped. The same 90-second
ticket is down to about 60 seconds - again, all perl (FCGI under Apache2),
sitting at 100% CPU until the rendering is done. One of these days I’ll
spend time looking into this but thought I’d at least chime in to indicate
“you’re not alone” as well as indicate that even with very beefy hardware,
it’s still slow.

It sure feels like whenever I use it that Apache 2 is somewhat bloated.
You may (if you can) see if you can benchmark the same pages from an
Apache1 setup in a temporary directory on the same box, if you’re not
averse to using older versions of Apache for some reason.

Nate

Ben_Goodwin · December 8, 2005, 2:22am

-----Original Message-----
From: Jesse Vincent [mailto:jesse@bestpractical.com]
Sent: Tuesday, November 29, 2005 6:38 PM
To: Ben Goodwin
Cc: rt-users@lists.bestpractical.com
Subject: Re: [rt-users] Performance with many attachments
and transactions

I hate to chime in with a “me too” especially without hard
data but:

[snip]

Displaying a particular ticket under RT 3.5.6 3 times produces:

%Time ExclSec CumulS #Calls sec/call Csec/c Name
14.4 4.858 4.858 51699 0.0001 0.0001 Encode::utf8::encode_xs
7.79 2.618 11.798 168354 0.0000 0.0001 RT::Record::__Value
6.40 2.151 3.282 289387 0.0000 0.0000
DBIx::SearchBuilder::Record::_Prim
aryKey
5.37 1.803 4.388 168354 0.0000 0.0000
DBIx::SearchBuilder::Record::Cacha
ble::__Value
5.35 1.798 4.655 250916 0.0000 0.0000 DBIx::SearchBuilder::Record::Id
4.92 1.651 1.651 268566 0.0000 0.0000 RT::Record::_Accessible
4.79 1.610 2.620 169879 0.0000 0.0000
DBIx::SearchBuilder::Record::__Val
ue
3.98 1.337 44.079 7556 0.0002 0.0058 HTML::Mason::Commands::ANON
3.36 1.129 1.129 291293 0.0000 0.0000 RT::Record::_PrimaryKeys
3.24 1.087 3.852 167482 0.0000 0.0000 Encode::decode
2.97 0.999 1.574 114716 0.0000 0.0000
Cache::Simple::TimedExpiry::fetch
2.96 0.995 1.608 167481 0.0000 0.0000 Encode::utf8::decode_xs
2.88 0.966 7.467 51005 0.0000 0.0001 RT::Record::LoadByCols
2.59 0.869 0.869 51005 0.0000 0.0000
DBIx::SearchBuilder::Record::Cacha

ble::_gen_alternate_RecordCache_ke
y
2.54 0.854 6.253 51005 0.0000 0.0001
DBIx::SearchBuilder::Record::Cacha
ble::LoadByCols

Indeed - the total ExclSec (24.725) is right on target with how long it took
to display the ticket.

I can give some hard details on MySQL but I’m fairly convinced it’s not the
database (it’s a modified version of my-huge.cnf).

It seems alarming that the function taking the least time is STILL almost a
second long.

System config attached.

-=| Ben

sysconf.txt (9.61 KB)

Ben_Goodwin · December 8, 2005, 3:10am

[snip]

Indeed - the total ExclSec (24.725) is right on target with
how long it took
to display the ticket.
[snip]
It seems alarming that the function taking the least time is
STILL almost a
second long.

And that’s me just being ignorant. I ran the display 3 times so both the
above statements are incorrect.

This ticket has 105 transactions, all inline text/html. I guess it puzzles
me that there are a lot of functions that are called tens of thousands of
times to display 105 transactions and some header info.

MySQL log attached - one run of displaying the ticket in question.

-=| Ben

log.zip (4.54 KB)