RT 3.0.4pre1

RT 3.0.4pre1 contains a set of fixups and cleanups and two separate
MASSIVE performance improvements. Both relate to ACL checks. One change
was based on work done by Chris Audley and results in significantly
faster generation of pages like Update.html that have to figure out who
can own a given ticket. The other is a new change based on research by
Phil Homewood that seems to massively improve the performance of ALL acl
checks by working around SQL server query optimizer quirks.

Additionally, support for various bits of RTFM was enhanced, and the
storage and display of transactions about relationships between tickets
was significantly enhanced and extended.

This release would be RC1 instead of pre1, except my postgres test
machine is off in the shop, so I’ve only been able to run the test suite
against mysql. I’m anxious to hear back from folks about how this
release feels.

    Jesse

Project “rt.3”, Branch 0 Page 1
Change Log Thu Jul 3 01:48:59 2003

rt.3.D000, C0, jesse, Thu Mar 13 20:43:23 2003, RT: Request Tracker, branch 3.0.
RT: Request Tracker, branch 3.0.

Change Delta  Brief Description
 169    125	  CustomField rights checking was overly restrictive for users
	  without queue-specific rights
 170    126	  I18N mail testing was being cavalier with the state of
	  acls after its testing.  (clone of 3.0.C167)
 171    127	  Ticket Update.html fix to not doubly load content
 172    128	  Fixing postgres sortorder bug unmasked by searchbuilder fix
 176    129	  Applying POD patches from ourinternet (clone of 3.0.C173)
 177    130	  UTF8, Custom Field and text message rendering fixes from
	  ourinternet
 178    131	  #2843 Date relations were too strict in RT::Tickets searches
 179    132	  #2847: allow URI Resolver to render itself
 173    133	  ShowMessageHeaders; make headers clicky
 180    134	  use RTIR callbacks
 175    135	  Updating rt-setup-database to take acl and schema file names
	  on the commandline
 181    136	  Refactored Users::WhoHaveRight from Chris Audley at Blackrock
 182    137	  Download link in ShowTransaction
 185    138	  Fix for speedycgi disappearing database connections
 183    139	  #2873: Fix for insufficiently aggressive loop culling
 186    140	  Fix for nested email message parsing
 187    141	  Split the HasRight ACL query into two parts. It's now two
	  small and light SQL queries, instead of one big one that
	  overwhelmed databases
 188    142	  Stylistic cleanups for HasRight optimizations
 189    143	  Relationship transactions are recorded and displayed more
	  robustly
 190    144	  Bumping the version to 3.0.4pre1
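
To illustrate change 187 (delta 141) above, here is a minimal sketch in Python of replacing one broad rights query with two narrow ones. The change log does not say how RT divides the query, so the split shown here (system-wide rights first, then queue-specific rights) is only an assumption. The tables `acl` and `cached_group_members`, their columns, and the function `has_right` are hypothetical and are not RT's real schema or API.

```python
import sqlite3

# Hypothetical, simplified schema: NOT RT's real tables or queries.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE acl (principal_id INTEGER, right_name TEXT,
                      object_type TEXT, object_id INTEGER);
    CREATE TABLE cached_group_members (group_id INTEGER, member_id INTEGER);
    -- group 10 may own tickets system-wide; group 20 may comment on queue 2 only
    INSERT INTO acl VALUES (10, 'OwnTicket', 'System', 1);
    INSERT INTO acl VALUES (20, 'CommentOnTicket', 'Queue', 2);
    -- user 100 belongs to group 10, user 200 belongs to group 20
    INSERT INTO cached_group_members VALUES (10, 100);
    INSERT INTO cached_group_members VALUES (20, 200);
""")

# First small query: is the right granted system-wide?
GLOBAL_SQL = """
    SELECT 1 FROM acl
    JOIN cached_group_members m ON m.group_id = acl.principal_id
    WHERE m.member_id = ? AND acl.right_name = ?
      AND acl.object_type = 'System'
    LIMIT 1
"""

# Second small query: is the right granted on this particular queue?
QUEUE_SQL = """
    SELECT 1 FROM acl
    JOIN cached_group_members m ON m.group_id = acl.principal_id
    WHERE m.member_id = ? AND acl.right_name = ?
      AND acl.object_type = 'Queue' AND acl.object_id = ?
    LIMIT 1
"""

def has_right(user_id, right_name, queue_id):
    """Run two narrow queries instead of one query with an OR across scopes."""
    if conn.execute(GLOBAL_SQL, (user_id, right_name)).fetchone():
        return True
    row = conn.execute(QUEUE_SQL, (user_id, right_name, queue_id)).fetchone()
    return row is not None

print(has_right(100, "OwnTicket", 2))
print(has_right(200, "CommentOnTicket", 3))
```

The idea is that each statement has a simple WHERE clause the optimizer can satisfy from an index, and the second query is skipped entirely when the first one already answers the question.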

On Thu, Jul 03, 2003 at 01:56:52AM -0400, Jesse developed
a new theory of relativity and:

RT 3.0.4pre1 contains a set of fixups and cleanups and two separate
MASSIVE performance improvements. Both relate to ACL checks. One change
was based on work done by Chris Audley and results in significantly
faster generation of pages like Update.html that have to figure out who
can own a given ticket. The other is a new change based on research by
Phil Homewood that seems to massively improve the performance of ALL acl
checks by working around SQL server query optimizer quirks.

This release has significantly improved performance. Users have noticed
the improvements already.

Dean

Jesse wrote:

The other is a new change based on research by
Phil Homewood that seems to massively improve the performance of ALL acl
checks by working around SQL server query optimizer quirks.

Just to clarify:- the quirks in question are seen in MySQL 4;
it’s unknown at this stage whether Postgres benefits from this
change. Feedback welcome :-)
Phil Homewood, Systems Janitor, http://www.SnapGear.com
pdh@snapgear.com Ph: +61 7 3435 2810 Fx: +61 7 3891 3630
SnapGear - Custom Embedded Solutions and Security Appliances

Hi all,

It was mentioned somewhere before that CachedGroupMembers queries
get MySQL busy.

I’ve upgraded our RT from the 3.0.3 release to 3.0.4pre1, and caught this problem again:
The MySQL server takes 99% of the CPU, and mysqladmin -p proc output gives:
Locked | DELETE FROM CachedGroupMembers WHERE id='7977'
See the whole output in the attachment. Ignore
"select count(id) from CachedGroupMembers", this is where we tried to troubleshoot it.

It happened a couple of times before the upgrade that the mysql process was 99% busy, and
only a MySQL server restart helped.
Something was mentioned about the indexes, but I didn’t read that thread
carefully. Did something change in the database schema?

Mysql version 4.0.13-log on FreeBSD 4.8-STABLE

Thanks,
Stan

RT-Error.txt (3.1 KB)

Stanislav Sinyagin wrote:

| 268 | rt_user | localhost | rt3 | Query | 272 | Sending data | SELECT count(DISTINCT main.id) FROM Tickets main, Groups Groups_1, Principals Principals_2, CachedGr |

This one could be a problem, if it’s running for more than a couple
of seconds. Are you using mod_perl or fastcgi?

Hi Jesse,

I haven’t done a lot of testing, but here is some timing information based
on using apache 1.3.27 and postgres 7.3.3. All of these results come from
ab -d -S -k -n 20 -C RT_SID=xxx "$URL"

RT3.0.3
http://localhost/rt3/index.html Time per request: 4885.15 [ms] (mean)
http://localhost/rt3/Ticket/Create.html?Queue=2 Time per request: 14739.75 [ms] (mean)
http://localhost/rt3/Ticket/ModifyPeople.html?id=1502 Time per request: 9989.75 [ms] (mean)
http://localhost/rt3/Ticket/ModifyAll.html?id=1502 Time per request: 10913.50 [ms] (mean)

RT3.0.3 with Chris Audley’s improvement
http://localhost/rt3/index.html Time per request: 4755.45 [ms] (mean)
http://localhost/rt3/Ticket/Create.html?Queue=2 Time per request: 376.65 [ms] (mean)
http://localhost/rt3/Ticket/ModifyPeople.html?id=1502 Time per request: 517.45 [ms] (mean)
http://localhost/rt3/Ticket/ModifyAll.html?id=1502 Time per request: 1108.00 [ms] (mean)

RT3.0.4pre1
http://localhost/rt3/index.html Time per request: 4452.65 [ms] (mean)
http://localhost/rt3/Ticket/Create.html?Queue=2 Time per request: 379.60 [ms] (mean)
http://localhost/rt3/Ticket/ModifyPeople.html?id=1502 Time per request: 453.00 [ms] (mean)
http://localhost/rt3/Ticket/ModifyAll.html?id=1502 Time per request: 998.20 [ms] (mean)
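
The three sets of timings above are easier to compare as speedup factors. The following is a small Python sketch that divides the 3.0.3 baseline by each later measurement; the numbers are copied from the ab results above, and the helper name `speedup` is only illustrative.

```python
# Mean time per request in ms, copied from the ab results above, in the order:
# (RT 3.0.3, RT 3.0.3 with Chris Audley's improvement, RT 3.0.4pre1)
timings = {
    "index.html": (4885.15, 4755.45, 4452.65),
    "Ticket/Create.html": (14739.75, 376.65, 379.60),
    "Ticket/ModifyPeople.html": (9989.75, 517.45, 453.00),
    "Ticket/ModifyAll.html": (10913.50, 1108.00, 998.20),
}

def speedup(before_ms, after_ms):
    """How many times faster the 'after' measurement is than the 'before' one."""
    return before_ms / after_ms

print(f"{'page':26s} {'audley':>8s} {'3.0.4pre1':>10s}")
for page, (base, audley, pre1) in timings.items():
    print(f"{page:26s} {speedup(base, audley):7.1f}x {speedup(base, pre1):9.1f}x")
```

With only 20 requests per run, small differences between the last two columns may well be noise; the large gap from the 3.0.3 baseline on the ticket pages is the meaningful signal.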

Hope this helps,
Paul

Stanislav Sinyagin wrote:

| 268 | rt_user | localhost | rt3 | Query | 272 | Sending data | SELECT count(DISTINCT main.id) FROM Tickets main, Groups Groups_1, Principals Principals_2, CachedGr |

This one could be a problem, if it’s running for more than a couple
of seconds. Are you using mod_perl or fastcgi?

mod_perl 1.27

The server was blocked for quite a long time: a few minutes passed before we
restarted mysql.

Can you give me a hint where such a situation can occur, so that I could reproduce it?
In our setup, all privileged users belong to a few groups, and
the global and queue rights are given to groups, not individuals.
Also, there is at least one ticket where the Cc: watchers are a group.

Thanks,
Stan

Hi all,

It was mentioned somewhere before that CachedGroupMembers queries
get MySQL busy.

One of the things I see from below is that you have a number of SELECT
GET_LOCK queries that are waiting for the same sessions. That implies
that someone got impatient and started trying to reload pages while RT
was thinking. Those hits will then wait for 6 minutes while the previous
locks time out. Incidentally, I’ve seen some very poor mysql
behaviour on freebsd, due to threading issues, which might also be able
to explain what you’re seeing.

I’ve upgraded our RT from the 3.0.3 release to 3.0.4pre1, and caught this problem again:
The MySQL server takes 99% of the CPU, and mysqladmin -p proc output gives:
Locked | DELETE FROM CachedGroupMembers WHERE id='7977'
See the whole output in the attachment. Ignore
"select count(id) from CachedGroupMembers", this is where we tried to troubleshoot it.

It happened a couple of times before the upgrade that the mysql process was 99% busy, and
only a MySQL server restart helped.
Something was mentioned about the indexes, but I didn’t read that thread
carefully. Did something change in the database schema?

Mysql version 4.0.13-log on FreeBSD 4.8-STABLE

Thanks,
Stan

Content-Description: RT-Error.txt

ticket 9#mysqladmin -p proc
Enter password:
+-----+----------+-----------+-------+---------+------+--------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----+----------+-----------+-------+---------+------+--------------+------------------------------------------------------------------------------------------------------+
| 1 | snortman | localhost | snort | Sleep | 0 | | |
| 262 | rt_user | localhost | rt3 | Query | 221 | User lock | SELECT GET_LOCK('Apache-Session-0c7777147baf0004b46fed5fef74de0e', 3600) |
| 263 | rt_user | localhost | rt3 | Query | 241 | User lock | SELECT GET_LOCK('Apache-Session-0c7777147baf0004b46fed5fef74de0e', 3600) |
| 264 | rt_user | localhost | rt3 | Sleep | 208 | | |
| 265 | rt_user | localhost | rt3 | Sleep | 208 | | |
| 267 | rt_user | localhost | rt3 | Query | 201 | User lock | SELECT GET_LOCK('Apache-Session-0c7777147baf0004b46fed5fef74de0e', 3600) |
| 268 | rt_user | localhost | rt3 | Query | 272 | Sending data | SELECT count(DISTINCT main.id) FROM Tickets main, Groups Groups_1, Principals Principals_2, CachedGr |
| 269 | rt_user | localhost | rt3 | Query | 209 | User lock | SELECT GET_LOCK('Apache-Session-0c7777147baf0004b46fed5fef74de0e', 3600) |
| 270 | rt_user | localhost | rt3 | Query | 207 | Locked | DELETE FROM CachedGroupMembers WHERE id='7977' |
| 271 | rt_user | localhost | rt3 | Query | 208 | User lock | SELECT GET_LOCK('Apache-Session-0c7777147baf0004b46fed5fef74de0e', 3600) |
| 272 | rt_user | localhost | rt3 | Query | 183 | User lock | SELECT GET_LOCK('Apache-Session-b2a45354b7281a5d1495e5349785de55', 3600) |
| 275 | rt_user | localhost | rt3 | Query | 113 | User lock | SELECT GET_LOCK('Apache-Session-b2a45354b7281a5d1495e5349785de55', 3600) |
| 279 | root | localhost | rt3 | Query | 23 | Locked | select count(id) from CachedGroupMembers |
| 280 | root | localhost | | Query | 0 | | show processlist |
+-----+----------+-----------+-------+---------+------+--------------+------------------------------------------------------------------------------------------------------+
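
The pile-up of GET_LOCK waiters on the same session is easier to see when it is counted. Here is a minimal Python sketch that tallies "User lock" rows per lock name. It assumes the pipe-delimited layout of the attachment above, with State in the seventh column and Info in the eighth; the function name `lock_waiters` is hypothetical.

```python
import re
from collections import Counter

# Captures the lock name inside GET_LOCK(...), whatever quote character surrounds it.
LOCK_RE = re.compile(r"GET_LOCK\(\W*([\w-]+)")

def lock_waiters(processlist_text):
    """Count threads in the 'User lock' state, grouped by the lock they wait on."""
    counts = Counter()
    for line in processlist_text.splitlines():
        # Drop the outer pipes, then split the row into its columns.
        cols = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cols) < 8 or cols[6] != "User lock":
            continue  # border, header, or a thread that is not waiting on a user lock
        match = LOCK_RE.search(cols[7])
        if match:
            counts[match.group(1)] += 1
    return counts

sample = """\
| 262 | rt_user | localhost | rt3 | Query | 221 | User lock | SELECT GET_LOCK('Apache-Session-0c7777147baf0004b46fed5fef74de0e', 3600) |
| 263 | rt_user | localhost | rt3 | Query | 241 | User lock | SELECT GET_LOCK('Apache-Session-0c7777147baf0004b46fed5fef74de0e', 3600) |
| 264 | rt_user | localhost | rt3 | Sleep | 208 | | |
| 272 | rt_user | localhost | rt3 | Query | 183 | User lock | SELECT GET_LOCK('Apache-Session-b2a45354b7281a5d1495e5349785de55', 3600) |
"""

for name, waiting in lock_waiters(sample).most_common():
    print(waiting, name)
```

Several waiters on one session name would be consistent with a user reloading pages while an earlier request still holds that session's lock.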

http://www.bestpractical.com/rt – Trouble Ticketing. Free.

Stanislav Sinyagin wrote:

This one could be a problem, if it’s running for more than a couple
of seconds. Are you using mod_perl or fastcgi?

mod_perl 1.27

Ah, OK. That shouldn’t cause pain for long. I’ve seen similar
problems under fastcgi where the client went away, and the
thread sat in “sending data” state forever.

As Jesse pointed out, several SELECT GET_LOCK()s for the same
session indicate something similar:- that the client has got
bored and aborted transactions and tried to do other things.

mod_perl should be a lot nicer about this than fastcgi, and
eventually (after a minute or two) things should return to
normal.

Might I recommend the “mytop” utility (it’s in the FreeBSD
ports collection) to keep an eye on what’s happening when this
next occurs?