I have just upgraded an RT install to 4.2.0. We are experiencing a problem
where the ticket history is very slow. Everything else seems to load just
fine and the history will load in ~50-60 seconds. Also, if you go to the
same ticket in a short period of time, the history will load quickly
(probably cached somewhere).
Can anyone give me pointers on where to start looking into the problem?
I’ve confirmed that the database has all of the indexes it’s supposed to
have. The load on the web server and database server are next to nothing.
A little history, this RT instance was originally 3.8 and was upgraded to
4.0.13 a few months ago and then upgraded to 4.2.0 last night.
The ticket table is ~30,000 records, the transactions table is ~3,800,000,
and the users table is ~280,000. (we load users from LDAP every night which
is why there are more users than tickets)
I have just upgraded an RT install to 4.2.0. We are experiencing a
problem where the ticket history is very slow. Everything else seems
to load just fine and the history will load in ~50-60 seconds. Also,
if you go to the same ticket in a short period of time, the history
will load quickly (probably cached somewhere).
[snip]
The ticket table is ~30,000 records, the transactions table is
~3,800,000, and the users table is ~280,000.
On Wed, 2013-10-23 at 12:14 -0400, Tod Detre wrote:
I have just upgraded an RT install to 4.2.0. We are experiencing a
problem where the ticket history is very slow. Everything else seems
to load just fine and the history will load in ~50-60 seconds. Also,
if you go to the same ticket in a short period of time, the history
will load quickly (probably cached somewhere).
[snip]
The ticket table is ~30,000 records, the transactions table is
~3,800,000, and the users table is ~280,000.
I’ve had the mysql slow log on, but it does not show any slow queries.
Another interesting point, the mobile site does not seem to have the same
problem.On Wed, Oct 23, 2013 at 12:34 PM, Alex Vandiver alexmv@bestpractical.comwrote:
On Wed, 2013-10-23 at 12:27 -0400, Tod Detre wrote:
I’ve had the mysql slow log on, but it does not show any slow
queries.
That is surprising to me, as the symptoms of “first load is slow,
successive ones are fast” sounds like the MySQL query cache to me, as
I’m not aware of any other caching layers in that area. I’d
double-check your slow query log threshold and relevant settings.
Another interesting point, the mobile site does not seem to have the
same problem.
I’ve had the mysql slow log on, but it does not show any slow
queries.
That is surprising to me, as the symptoms of “first load is slow,
successive ones are fast” sounds like the MySQL query cache to me, as
I’m not aware of any other caching layers in that area. I’d
double-check your slow query log threshold and relevant settings.
Sorry, about that I am seeing slow queries, but not for every time I see
the slow history problem. Here is an example of the slow queries we’re
seeing:
26134
SET timestamp=1382560324;
SELECT DISTINCT main.* FROM Tickets main JOIN Groups Groups_1 ON (
Groups_1.Domain = ‘RT::Ticket-Role’ ) AND ( Groups_1.Instance = main.id )
LEFT JOIN Cache
dGroupMembers CachedGroupMembers_2 ON ( CachedGroupMembers_2.Disabled =
‘0’ ) AND ( CachedGroupMembers_2.MemberId = ‘37’ ) AND (
CachedGroupMembers_2.GroupI
d = Groups_1.id ) WHERE ( ( ( main.Queue = ‘1’ OR main.Queue = ‘1’ OR
main.Queue = ‘3’ OR main.Queue = ‘4’ OR main.Queue = ‘4’ OR main.Queue =
‘5’ OR main.
Queue = ‘5’ OR main.Queue = ‘6’ OR main.Queue = ‘6’ OR main.Queue = ‘7’ OR
main.Queue = ‘8’ OR main.Queue = ‘9’ OR main.Queue = ‘9’ OR main.Queue =
‘10’ OR m
ain.Queue = ‘10’ OR main.Queue = ‘11’ OR main.Queue = ‘12’ OR main.Queue =
‘13’ OR main.Queue = ‘14’ OR main.Queue = ‘15’ OR main.Queue = ‘16’ OR
main.Queue
= ‘17’ OR main.Queue = ‘18’ OR main.Queue = ‘19’ OR main.Queue = ‘20’ OR
main.Queue = ‘21’ OR main.Queue = ‘22’ OR main.Queue = ‘23’ OR main.Queue =
‘24’ OR
main.Queue = ‘25’ OR main.Queue = ‘26’ OR main.Queue = ‘27’ OR main.Queue =
‘28’ OR main.Queue = ‘29’ OR main.Queue = ‘30’ OR main.Queue = ‘31’ OR
main.Queue
= ‘32’ OR main.Queue = ‘33’ OR main.Queue = ‘34’ OR main.Queue = ‘35’ OR
main.Queue = ‘36’ OR main.Queue = ‘37’ OR main.Queue = ‘38’ OR main.Queue =
‘39’ OR
main.Queue = ‘40’ OR main.Queue = ‘41’ OR main.Queue = ‘42’ OR main.Queue
= ‘43’ OR main.Queue = ‘44’ OR main.Queue = ‘45’ OR main.Queue = ‘46’ OR
main.Queu
e = ‘47’ OR main.Queue = ‘48’ OR main.Queue = ‘49’ OR main.Queue = ‘50’ OR
main.Queue = ‘51’ OR main.Queue = ‘52’ OR main.Queue = ‘53’ OR main.Queue =
‘54’ O
R main.Queue = ‘54’ OR main.Queue = ‘56’ OR main.Queue = ‘57’ OR main.Queue
= ‘58’ OR main.Queue = ‘59’ OR main.Queue = ‘60’ OR main.Queue = ‘62’ OR
main.Que
ue = ‘63’ OR main.Queue = ‘64’ OR main.Queue = ‘64’ OR main.Queue = ‘64’ OR
main.Queue = ‘66’ OR main.Queue = ‘68’ OR main.Queue = ‘68’ OR main.Queue =
‘70’
OR main.Queue = ‘71’ OR main.Queue = ‘71’ OR main.Queue = ‘72’ OR
main.Queue = ‘73’ OR main.Queue = ‘75’ OR main.Queue = ‘76’ OR main.Queue =
‘76’ OR main.Qu
eue = ‘77’ OR main.Queue = ‘77’ OR main.Queue = ‘78’ OR main.Queue = ‘78’
OR main.Queue = ‘79’ OR main.Queue = ‘79’ OR main.Queue = ‘80’ OR
main.Queue = ‘80’
OR main.Queue = ‘81’ OR main.Queue = ‘81’ OR main.Queue = ‘82’ OR
main.Queue = ‘83’ ) OR ( CachedGroupMembers_2.MemberId IS NOT NULL AND
Groups_1.Name = ’
Requestor’ ) OR ( main.Owner = ‘37’ ) ) ) AND (main.IsMerged IS NULL)
AND (main.Status != ‘deleted’) AND (main.Type = ‘ticket’) AND
(main.Priority > ‘49’
AND ( CASE WHEN main.Created BETWEEN ‘1969-12-31 11:59:59’ AND ‘1970-01-01
12:00:01’ THEN NULL ELSE main.Created END > ‘2013-09-23 20:32:04’ OR (
main.Stat
us = ‘new’ OR main.Status = ‘open’ OR main.Status = ‘stalled’ ) ) ) ORDER
BY main.Status ASC, main.Created DESC LIMIT 10;
Time: 131023 16:32:05
Another interesting point, the mobile site does not seem to have the
same problem.
Do you have any local customizations?
We do have one customization. However even if I remove it and clear the
mason cache, the problem persists.
I’m learning more and more about mysql and RT trying to track down this
problem. I ran mysqldumpslow to try to find queries that are showing up
a lot and found the queries listed below. However, I’m not convinced it is
on the sql server as I can cause the problem to resurface once it’s gone by
clearing my browser’s cache.
Count: 192 Time=7.50s (1439s) Lock=0.00s (0s) Rows=1.0 (192), rt4[rt4]@ xxx.xxx.net
SELECT count(main.id) FROM ObjectCustomFieldValues main WHERE
(main.Disabled = ‘S’) AND (main.ObjectId = ‘S’) AND (main.ObjectType = ‘S’)
Count: 159 Time=0.17s (27s) Lock=0.00s (0s) Rows=1.0 (159), rt4[rt4]@ xxx.xxx.net
SELECT COUNT(DISTINCT main.id) FROM Tickets main JOIN Groups Groups_1 ON
( Groups_1.Domain = ‘S’ ) AND ( Groups_1.Instance = main.id ) LEFT JOIN
CachedGro
upMembers CachedGroupMembers_2 ON ( CachedGroupMembers_2.Disabled = ‘S’ )
AND ( CachedGroupMembers_2.MemberId = ‘S’ ) AND (
CachedGroupMembers_2.GroupId = G
roups_1.id ) WHERE ( ( ( main.Queue = ‘S’ OR main.Queue = ‘S’ OR
main.Queue = ‘S’ OR main.Queue = ‘S’ ) OR ( CachedGroupMembers_2.MemberId
IS NOT NULL AN
D Groups_1.Name = ‘S’ ) OR ( main.Owner = ‘S’ ) ) ) AND (main.IsMerged
IS NULL) AND (main.Status != ‘S’) AND (main.Type = ‘S’) AND (main.Priority
‘S’ AN
D ( CASE WHEN main.Created BETWEEN ‘S’ AND ‘S’ THEN NULL ELSE main.Created
END > ‘S’ OR ( main.Status = ‘S’ OR main.Status = ‘S’ OR main.Status = ‘S’
) )
)
Count: 159 Time=0.21s (32s) Lock=0.00s (0s) Rows=1.0 (159), rt4[rt4]@ xxx.xxx.net
SELECT COUNT(DISTINCT main.id) FROM Tickets main JOIN Groups Groups_1 ON
( Groups_1.Domain = ‘S’ ) AND ( Groups_1.Instance = main.id ) LEFT JOIN
CachedGro
upMembers CachedGroupMembers_2 ON ( CachedGroupMembers_2.Disabled = ‘S’ )
AND ( CachedGroupMembers_2.MemberId = ‘S’ ) AND (
CachedGroupMembers_2.GroupId = G
roups_1.id ) WHERE ( ( ( main.Queue = ‘S’ OR main.Queue = ‘S’ ) OR (
CachedGroupMembers_2.MemberId IS NOT NULL AND Groups_1.Name = ‘S’ ) OR (
main.Owne
r = ‘S’ ) ) ) AND (main.IsMerged IS NULL) AND (main.Status != ‘S’) AND
(main.Type = ‘S’) AND (main.Priority > ‘S’ AND ( CASE WHEN main.Created
BETWEEN ‘S’
AND ‘S’ THEN NULL ELSE main.Created END > ‘S’ OR ( main.Status = ‘S’ OR
main.Status = ‘S’ OR main.Status = ‘S’ ) ) )On Wed, Oct 23, 2013 at 4:38 PM, Tod Detre tod.detre@maine.edu wrote:
On Wed, 2013-10-23 at 13:47 -0400, Tod Detre wrote:
I’ve had the mysql slow log on, but it does not show any slow
queries.
That is surprising to me, as the symptoms of “first load is slow,
successive ones are fast” sounds like the MySQL query cache to me, as
I’m not aware of any other caching layers in that area. I’d
double-check your slow query log threshold and relevant settings.
Sorry, about that I am seeing slow queries, but not for every time I see
the slow history problem. Here is an example of the slow queries we’re
seeing:
26134
SET timestamp=1382560324;
SELECT DISTINCT main.* FROM Tickets main JOIN Groups Groups_1 ON (
Groups_1.Domain = ‘RT::Ticket-Role’ ) AND ( Groups_1.Instance = main.id )
LEFT JOIN Cache
dGroupMembers CachedGroupMembers_2 ON ( CachedGroupMembers_2.Disabled =
‘0’ ) AND ( CachedGroupMembers_2.MemberId = ‘37’ ) AND (
CachedGroupMembers_2.GroupI
d = Groups_1.id ) WHERE ( ( ( main.Queue = ‘1’ OR main.Queue = ‘1’ OR
main.Queue = ‘3’ OR main.Queue = ‘4’ OR main.Queue = ‘4’ OR main.Queue =
‘5’ OR main.
Queue = ‘5’ OR main.Queue = ‘6’ OR main.Queue = ‘6’ OR main.Queue = ‘7’ OR
main.Queue = ‘8’ OR main.Queue = ‘9’ OR main.Queue = ‘9’ OR main.Queue =
‘10’ OR m
ain.Queue = ‘10’ OR main.Queue = ‘11’ OR main.Queue = ‘12’ OR main.Queue =
‘13’ OR main.Queue = ‘14’ OR main.Queue = ‘15’ OR main.Queue = ‘16’ OR
main.Queue
= ‘17’ OR main.Queue = ‘18’ OR main.Queue = ‘19’ OR main.Queue = ‘20’ OR
main.Queue = ‘21’ OR main.Queue = ‘22’ OR main.Queue = ‘23’ OR main.Queue =
‘24’ OR
main.Queue = ‘25’ OR main.Queue = ‘26’ OR main.Queue = ‘27’ OR main.Queue
= ‘28’ OR main.Queue = ‘29’ OR main.Queue = ‘30’ OR main.Queue = ‘31’ OR
main.Queue
= ‘32’ OR main.Queue = ‘33’ OR main.Queue = ‘34’ OR main.Queue = ‘35’ OR
main.Queue = ‘36’ OR main.Queue = ‘37’ OR main.Queue = ‘38’ OR main.Queue =
‘39’ OR
main.Queue = ‘40’ OR main.Queue = ‘41’ OR main.Queue = ‘42’ OR main.Queue
= ‘43’ OR main.Queue = ‘44’ OR main.Queue = ‘45’ OR main.Queue = ‘46’ OR
main.Queu
e = ‘47’ OR main.Queue = ‘48’ OR main.Queue = ‘49’ OR main.Queue = ‘50’ OR
main.Queue = ‘51’ OR main.Queue = ‘52’ OR main.Queue = ‘53’ OR main.Queue =
‘54’ O
R main.Queue = ‘54’ OR main.Queue = ‘56’ OR main.Queue = ‘57’ OR
main.Queue = ‘58’ OR main.Queue = ‘59’ OR main.Queue = ‘60’ OR main.Queue =
‘62’ OR main.Que
ue = ‘63’ OR main.Queue = ‘64’ OR main.Queue = ‘64’ OR main.Queue = ‘64’
OR main.Queue = ‘66’ OR main.Queue = ‘68’ OR main.Queue = ‘68’ OR
main.Queue = ‘70’
OR main.Queue = ‘71’ OR main.Queue = ‘71’ OR main.Queue = ‘72’ OR
main.Queue = ‘73’ OR main.Queue = ‘75’ OR main.Queue = ‘76’ OR main.Queue =
‘76’ OR main.Qu
eue = ‘77’ OR main.Queue = ‘77’ OR main.Queue = ‘78’ OR main.Queue = ‘78’
OR main.Queue = ‘79’ OR main.Queue = ‘79’ OR main.Queue = ‘80’ OR
main.Queue = ‘80’
OR main.Queue = ‘81’ OR main.Queue = ‘81’ OR main.Queue = ‘82’ OR
main.Queue = ‘83’ ) OR ( CachedGroupMembers_2.MemberId IS NOT NULL AND
Groups_1.Name = ’
Requestor’ ) OR ( main.Owner = ‘37’ ) ) ) AND (main.IsMerged IS NULL)
AND (main.Status != ‘deleted’) AND (main.Type = ‘ticket’) AND
(main.Priority > ‘49’
AND ( CASE WHEN main.Created BETWEEN ‘1969-12-31 11:59:59’ AND
‘1970-01-01 12:00:01’ THEN NULL ELSE main.Created END > ‘2013-09-23
20:32:04’ OR ( main.Stat
us = ‘new’ OR main.Status = ‘open’ OR main.Status = ‘stalled’ ) ) )
ORDER BY main.Status ASC, main.Created DESC LIMIT 10;
Time: 131023 16:32:05
Another interesting point, the mobile site does not seem to have the
same problem.
Do you have any local customizations?
We do have one customization. However even if I remove it and clear the
mason cache, the problem persists.
I’m learning more and more about mysql and RT trying to track down
this problem. I ran mysqldumpslow to try to find queries that are
showing up a lot and found the queries listed below. However, I’m not
convinced it is on the sql server as I can cause the problem to
resurface once it’s gone by clearing my browser’s cache.
If it is indeed not the database (which would surprise me), you may need
to turn on Mason profiling to determine which components are being slow.
Instructions for doing so are in http://docs.bestpractical.com/RT_Config#MasonParameters Note that this
has a performance impact, so you will likely want to test this on a
separate development server.
Count: 192 Time=7.50s (1439s) Lock=0.00s (0s) Rows=1.0 (192),
rt4[rt4]@xxx.xxx.net
SELECT count(main.id) FROM ObjectCustomFieldValues main WHERE
(main.Disabled = ‘S’) AND (main.ObjectId = ‘S’) AND (main.ObjectType =
‘S’)
The time taken by this query is quite surprising. Can you show the
query plan for it, by finding one specific example, and running:
EXPLAIN SELECT count(main.id)
FROM ObjectCustomFieldValues main
WHERE (main.Disabled = ‘…’)
AND (main.ObjectId = ‘…’) AND (main.ObjectType = ‘…’);
…replacing the '…'s with real values? Also, the output of:
SHOW INDEXES FROM ObjectCustomFieldValues;
mysql> EXPLAIN SELECT count(main.id) FROM ObjectCustomFieldValues main;
| id | select_type | table | type | possible_keys | key
| key_len | ref | rows | Extra |
| 1 | SIMPLE | main | index | NULL |
ObjectCustomFieldValues2 | 265 | NULL | 3098664 | Using index |
1 row in set (0.01 sec)
mysql> EXPLAIN SELECT count(main.id) FROM ObjectCustomFieldValues main
WHERE (main.Disabled = ‘0’) AND (main.ObjectId = ‘3910406’) AND
(main.ObjectType = ‘RT::Transaction’);
| id | select_type | table | type | possible_keys | key | key_len |
ref | rows | Extra |
| 1 | SIMPLE | main | ALL | NULL | NULL | NULL |
NULL | 3098664 | Using where |
1 row in set (0.00 sec)
So it looks like the first query is using an index, but the second is
not. I’ve confirmed that the ObjectCustomFieldValues table has the
correct indexes listed in the schema.mysql. However the disabled field
is not indexed. Should that be added?
So it looks like the first query is using an index, but the second is
not. I’ve confirmed that the ObjectCustomFieldValues table has the
correct indexes listed in the schema.mysql. However the disabled field
is not indexed. Should that be added?
The difficulty is that there is no index on ObjectType, ObjectId;
ObjectCustomFieldValues2 starts with the CustomField column, which is
not limited here, so it cannot be used.
Try adding the following index:
CREATE INDEX ObjectCustomFieldValues_Object
ON ObjectCustomFieldValues(ObjectType, ObjectId, Disabled);
I additionally want to know what is generating that query. Turning on http://docs.bestpractical.com/RT_Config#StatementLog (on a
single-process development instance) will allow you to find which
component is generating the query, and if thus it is likely to be a call
path which others will encounter this problem with. This will effect if
and how we fix this for others in 4.2.1
Adding the index did indeed fix the issue. I’ve pasted the line from
the SQL log below. It looks like the query is coming from
RT::Transaction. I’m not that great at perl so I’m not sure, but I
think it might be the method LoadCustomFieldByIdentifier. I do know
that the URL that starts the query is /Helpers/TicketHistory.
[29826] [Mon Oct 28 13:33:12 2013] [debug]: SQL(2.188469s): SELECT
count(main.id) FROM ObjectCustomFieldValues main WHERE (main.Disabled
= ‘0’) AND (main.ObjectId = ‘307581’) AND (main.ObjectType =
‘RT::Transaction’) ; (/opt/rt4/sbin/…/lib/RT/Interface/Web.pm:1292)On Fri, Oct 25, 2013 at 5:22 PM, Alex Vandiver alexmv@bestpractical.com wrote:
On Fri, 2013-10-25 at 16:56 -0400, Tod Detre wrote:
Here are the EXPLAIN results: [snip]
So it looks like the first query is using an index, but the second is
not. I’ve confirmed that the ObjectCustomFieldValues table has the
correct indexes listed in the schema.mysql. However the disabled field
is not indexed. Should that be added?
The difficulty is that there is no index on ObjectType, ObjectId;
ObjectCustomFieldValues2 starts with the CustomField column, which is
not limited here, so it cannot be used.
Try adding the following index:
CREATE INDEX ObjectCustomFieldValues_Object
ON ObjectCustomFieldValues(ObjectType, ObjectId, Disabled);
I additionally want to know what is generating that query. Turning on http://docs.bestpractical.com/RT_Config#StatementLog (on a
single-process development instance) will allow you to find which
component is generating the query, and if thus it is likely to be a call
path which others will encounter this problem with. This will effect if
and how we fix this for others in 4.2.1
Adding the index did indeed fix the issue. I’ve pasted the line from
the SQL log below. It looks like the query is coming from
RT::Transaction. I’m not that great at perl so I’m not sure, but I
think it might be the method LoadCustomFieldByIdentifier. I do know
that the URL that starts the query is /Helpers/TicketHistory.
[29826] [Mon Oct 28 13:33:12 2013] [debug]: SQL(2.188469s): SELECT
count(main.id) FROM ObjectCustomFieldValues main WHERE (main.Disabled
= ‘0’) AND (main.ObjectId = ‘307581’) AND (main.ObjectType =
‘RT::Transaction’) ; (/opt/rt4/sbin/…/lib/RT/Interface/Web.pm:1292)
To be useful, we’d unfortunately need to know the full stack trace. The
full stack trace is shown in the UI in the Admin → Tools → SQL Queries
page if StatementLog is enabled.
SELECT count(main.id) FROM ObjectCustomFieldValues main WHERE
(main.Disabled = ‘0’) AND (main.ObjectId = ‘3561835’) AND
(main.ObjectType = ‘RT::Transaction’)Toggle stack trace
Stack:
[/opt/rt4/sbin/…/lib/RT/Handle.pm:1646]
[/usr/local/share/perl/5.14.2/DBIx/SearchBuilder/Handle.pm:591]
[/usr/local/share/perl/5.14.2/DBIx/SearchBuilder.pm:295]
[/opt/rt4/sbin/…/lib/RT/SearchBuilder.pm:1001]
[/opt/rt4/sbin/…/lib/RT/ObjectCustomFieldValues.pm:158]
[/usr/local/share/perl/5.14.2/DBIx/SearchBuilder.pm:1471]
[/opt/rt4/share/html/Elements/ShowTransaction:71]
[/opt/rt4/share/html/Elements/ShowHistory:120]
[/opt/rt4/share/html/Helpers/TicketHistory:48]
[/opt/rt4/share/html/Helpers/autohandler:51]
[/opt/rt4/sbin/…/lib/RT/Interface/Web.pm:680]
[/opt/rt4/sbin/…/lib/RT/Interface/Web.pm:368]
[/opt/rt4/share/html/autohandler:53]On Thu, Oct 31, 2013 at 1:09 PM, Alex Vandiver alexmv@bestpractical.com wrote:
On Mon, 2013-10-28 at 09:54 -0400, Tod Detre wrote:
Adding the index did indeed fix the issue. I’ve pasted the line from
the SQL log below. It looks like the query is coming from
RT::Transaction. I’m not that great at perl so I’m not sure, but I
think it might be the method LoadCustomFieldByIdentifier. I do know
that the URL that starts the query is /Helpers/TicketHistory.
[29826] [Mon Oct 28 13:33:12 2013] [debug]: SQL(2.188469s): SELECT
count(main.id) FROM ObjectCustomFieldValues main WHERE (main.Disabled
= ‘0’) AND (main.ObjectId = ‘307581’) AND (main.ObjectType =
‘RT::Transaction’) ; (/opt/rt4/sbin/…/lib/RT/Interface/Web.pm:1292)
To be useful, we’d unfortunately need to know the full stack trace. The
full stack trace is shown in the UI in the Admin → Tools → SQL Queries
page if StatementLog is enabled.