Searches by watchers with OR aggregator

Hi, guys.

Thank you for the feedback. Here is patch that should address
performance problems with queries like:
“Owner = X OR Requestor = Y”
"Requestor = Y OR Status = X"
and other queries where positive search conditions (=, like …) on
Requestor, Cc, AdminCc or Watchers are joined with other conditions
using OR. It doesn’t apply to queries with all ANDs, as well it’s not
about searches by links, dates or something else. It’s about watchers
and ORs.

Best regards, Ruslan.

3.6-searches_by_watchers_performance_fix.patch (2.11 KB)

Hi, guys.

Thank you for the feedback. Here is patch that should address
performance problems with queries like:
“Owner = X OR Requestor = Y”
“Requestor = Y OR Status = X”
and other queries where positive search conditions (=, like …) on
Requestor, Cc, AdminCc or Watchers are joined with other conditions
using OR. It doesn’t apply to queries with all ANDs, as well it’s not
about searches by links, dates or something else. It’s about watchers
and ORs.

Hi Ruslan,

Thanks for your help. The patch fixed the query in question. There
is another query behind it that is extremely slow under MySQL 4.1.22,
apparently. Under MySQL 4.1.18 it takes 3 seconds - on MySQL 4.1.22
it takes 3minutes. Yikes.

Seems like a pretty huge difference for any possible query optimizer
changes from 4.1.18 → 4.1.21.

Looks like I might need to downgrade mysql.

SELECT DISTINCT main.* FROM Tickets main CROSS JOIN Users Users_3 JOIN
Groups Groups_1 ON ( Groups_1.Domain = ‘RT::Ticket-Role’ ) AND (
Groups_1.Type = ‘Requestor’ ) AND ( Groups_1.Instance = main.id ) LEFT
JOIN CachedGroupMembers CachedGroupMembers_2 ON (
CachedGroupMembers_2.MemberId = Users_3.id ) AND (
CachedGroupMembers_2.GroupId = Groups_1.id ) WHERE
(Users_3.EmailAddress LIKE ‘%tom%’) AND (main.Status != ‘deleted’) AND
(main.Owner = ‘58936’ OR ( CachedGroupMembers_2.id IS NOT NULL ) )
AND (main.Type = ‘ticket’) AND (main.EffectiveId = main.id) ORDER BY
main.id ASC LIMIT 50

50 rows in set (3 min 2.12 sec)

Hi, guys.

Thank you for the feedback. Here is patch that should address
performance problems with queries like:
“Owner = X OR Requestor = Y”
“Requestor = Y OR Status = X”
and other queries where positive search conditions (=, like …) on
Requestor, Cc, AdminCc or Watchers are joined with other conditions
using OR. It doesn’t apply to queries with all ANDs, as well it’s not
about searches by links, dates or something else. It’s about watchers
and ORs.

Hi Ruslan,

Thanks for your help. The patch fixed the query in question. There
is another query behind it that is extremely slow under MySQL 4.1.22,
apparently. Under MySQL 4.1.18 it takes 3 seconds - on MySQL 4.1.22
it takes 3minutes. Yikes.
Can you give me EXPLAINs? Wonder if you can give explain of the query
for both versions of mysql?

Keneth, Robert or other RT users, could you please try to reproduce
results Brian sees.

Seems like a pretty huge difference for any possible query optimizer
changes from 4.1.18 → 4.1.21.
When I’m talking about mysql query optimizer, I’m talking not about
optimizations mysql uses during execution of a query, but about the
mysql query planner which defines order of operations, which indexes
to use and so on. When you run EXPLAIN SELECT… you’re executing the
planner and get a plan mysql will use to execute the query. Even minor
changes in a plan can kill performance.

Ok, we have another variant of the query. Remember that third query we
tested before with COUNT(…)? Try the following query:
SELECT SQL_NO_CACHE DISTINCT main.*
FROM Tickets main
JOIN Groups Groups_1
ON Groups_1.Domain = ‘RT::Ticket-Role’
AND Groups_1.Type = ‘Requestor’
AND Groups_1.Instance = main.id
LEFT JOIN CachedGroupMembers CachedGroupMembers_2
ON CachedGroupMembers_2.GroupId = Groups_1.id
LEFT JOIN Users Users_3
ON CachedGroupMembers_2.MemberId = Users_3.id
WHERE
main.Status != ‘deleted’
AND main.Type = ‘ticket’
AND main.EffectiveId = main.id
AND ( main.Owner = ‘58936’ OR Users_3.EmailAddress LIKE ‘%tom%’ )
ORDER BY main.id ASC LIMIT 50

May be we should switch to this variant even if counting ticket with
it is slower.

Looks like I might need to downgrade mysql.

SELECT DISTINCT main.* FROM Tickets main CROSS JOIN Users Users_3 JOIN
Groups Groups_1 ON ( Groups_1.Domain = ‘RT::Ticket-Role’ ) AND (
Groups_1.Type = ‘Requestor’ ) AND ( Groups_1.Instance = main.id ) LEFT
JOIN CachedGroupMembers CachedGroupMembers_2 ON (
CachedGroupMembers_2.MemberId = Users_3.id ) AND (
CachedGroupMembers_2.GroupId = Groups_1.id ) WHERE
(Users_3.EmailAddress LIKE ‘%tom%’) AND (main.Status != ‘deleted’) AND
(main.Owner = ‘58936’ OR ( CachedGroupMembers_2.id IS NOT NULL ) )
AND (main.Type = ‘ticket’) AND (main.EffectiveId = main.id) ORDER BY
main.id ASC LIMIT 50

50 rows in set (3 min 2.12 sec)

Best regards, Ruslan.

Can you give me EXPLAINs? Wonder if you can give explain of the query
for both versions of mysql?

The explains are identical between 4.1.18 and 4.1.22:

| 1 | SIMPLE | main | ALL |
PRIMARY,Tickets2,Tickets4,Tickets5 | NULL | NULL | NULL
| 141524 | Using where; Using temporary; Using
filesort |
| 1 | SIMPLE | Groups_1 | ref | Groups1,Groups2
| Groups1 | 135 | const,rt3.main.id,const
| 1 | Using where; Using index; Distinct |
| 1 | SIMPLE | Users_3 | index | NULL
| Users4 | 121 | NULL
| 1548 | Using where; Using index; Distinct |
| 1 | SIMPLE | CachedGroupMembers_2 | ref | DisGrouMem,GrouMem
| DisGrouMem | 10 |
rt3.Groups_1.id,rt3.Users_3.id | 1 | Using where; Using index;
Distinct

Keneth, Robert or other RT users, could you please try to reproduce
results Brian sees.

Seems like a pretty huge difference for any possible query optimizer
changes from 4.1.18 → 4.1.21.
When I’m talking about mysql query optimizer, I’m talking not about
optimizations mysql uses during execution of a query, but about the
mysql query planner which defines order of operations, which indexes
to use and so on. When you run EXPLAIN SELECT… you’re executing the
planner and get a plan mysql will use to execute the query. Even minor
changes in a plan can kill performance.

Gotcha, this is definately something that was changed in mysql between
4.1.18 and 4.1.22.

Ok, we have another variant of the query. Remember that third query we
tested before with COUNT(…)? Try the following query:
SELECT SQL_NO_CACHE DISTINCT main.*
FROM Tickets main
JOIN Groups Groups_1
ON Groups_1.Domain = ‘RT::Ticket-Role’
AND Groups_1.Type = ‘Requestor’
AND Groups_1.Instance = main.id
LEFT JOIN CachedGroupMembers CachedGroupMembers_2
ON CachedGroupMembers_2.GroupId = Groups_1.id
LEFT JOIN Users Users_3
ON CachedGroupMembers_2.MemberId = Users_3.id
WHERE
main.Status != ‘deleted’
AND main.Type = ‘ticket’
AND main.EffectiveId = main.id
AND ( main.Owner = ‘58936’ OR Users_3.EmailAddress LIKE ‘%tom%’ )
ORDER BY main.id ASC LIMIT 50

May be we should switch to this variant even if counting ticket with
it is slower.

This runs in 2.07 sec in 4.1.22 and 2.10 sec in 4.1.18.

-Brian

First off thanks for the patch, I just applied it and the effected
queries are running faster!

As for the query below:

On 5.0.45 I’m getting:
Original query: 0.23 sec
New query: 0.59 sec

We don’t really have any older 4.x systems laying around that I could
test this on.

Explains are listed below, thanks again!

.r’

Original query:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: main
type: ALL
possible_keys: PRIMARY,Tickets2,Tickets3,Tickets4,Tickets5,Tickets6
key: NULL
key_len: NULL
ref: NULL
rows: 7275
Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: Groups_1
type: ref
possible_keys: Groups1,Groups2
key: Groups1
key_len: 139
ref: const,rt3.main.id,const
rows: 1
Extra: Using where; Using index; Distinct
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: Users_3
type: index
possible_keys: NULL
key: Users4
key_len: 123
ref: NULL
rows: 622
Extra: Using where; Using index; Distinct
*************************** 4. row ***************************
id: 1
select_type: SIMPLE
table: CachedGroupMembers_2
type: ref
possible_keys: DisGrouMem,GrouMem
key: GrouMem
key_len: 10
ref: rt3.Groups_1.id,rt3.Users_3.id
rows: 1
Extra: Using where; Using index; Distinct

New Query:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: main
type: ALL
possible_keys: PRIMARY,Tickets2,Tickets3,Tickets4,Tickets5,Tickets6
key: NULL
key_len: NULL
ref: NULL
rows: 7275
Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: Groups_1
type: ref
possible_keys: Groups1,Groups2
key: Groups1
key_len: 139
ref: const,rt3.main.id,const
rows: 1
Extra: Using where; Using index; Distinct
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: CachedGroupMembers_2
type: ref
possible_keys: DisGrouMem,GrouMem
key: DisGrouMem
key_len: 5
ref: rt3.Groups_1.id
rows: 1
Extra: Using index; Distinct
*************************** 4. row ***************************
id: 1
select_type: SIMPLE
table: Users_3
type: eq_ref
possible_keys: PRIMARY,Users3
key: PRIMARY
key_len: 4
ref: rt3.CachedGroupMembers_2.MemberId
rows: 1
Extra: Using where; Distinct

yes, it can, but I really would like to hear from other users about
the patch. Especially I’m interested in reports from people who are
using 4.1.22 and have many tickets and many users in the DB.On 7/30/07, Brian Kerr kerrboy@gmail.com wrote:

Hey Ruslan,

Can the patch be reworked to use the third query that you had sent to me?

Thanks,
Brian

On 7/25/07, Ruslan Zakirov ruz@bestpractical.com wrote:

Hi, guys.

Thank you for the feedback. Here is patch that should address
performance problems with queries like:
“Owner = X OR Requestor = Y”
“Requestor = Y OR Status = X”
and other queries where positive search conditions (=, like …) on
Requestor, Cc, AdminCc or Watchers are joined with other conditions
using OR. It doesn’t apply to queries with all ANDs, as well it’s not
about searches by links, dates or something else. It’s about watchers
and ORs.


Best regards, Ruslan.

Best regards, Ruslan.