RTx-Shredder speed (or lack of!)

I’ve just been running RTx-Shredder for the first time in quite a while,
and I get really slow speeds out of it… 67 minutes to delete 10
tickets. I’m sure I remember doing 1000s in a day with a previous version.

RT itself is quite responsive, so I don’t think it’s a general server issue.

I’m using
time ./rtx-shredder --plugin 'Tickets=queue,probablespam;status,deleted’
and I get offered 10 tickets to delete, and then much later:
real 67m38.004s
user 0m8.891s
sys 0m0.320s

Is there something I should be looking at here? A missing index or
something? I can see that my Attachments and CachedGroupMembers tables
have ~3M rows, for example.

Thanks for any pointers…

Howie

I’ve just been running RTx-Shredder for the first time in quite a
while,
and I get really slow speeds out of it… 67 minutes to delete 10
tickets. I’m sure I remember doing 1000s in a day with a previous
version.

RT itself is quite responsive, so I don’t think it’s a general
server issue.

I’m using
time ./rtx-shredder --plugin
‘Tickets=queue,probablespam;status,deleted’
and I get offered 10 tickets to delete, and then much later:
real 67m38.004s
user 0m8.891s
sys 0m0.320s

Is there something I should be looking at here? A missing index or
something? I can see that my Attachments and CachedGroupMembers tables
have ~3M rows, for example.

Thanks for any pointers…

Yes, it does take a lot of time, because the sorts of queries it makes
are not “normal” for RT, so there are no indices. I believe in 3.8
much of this has been addressed, but certainly in 3.4 I have had to
add a lot of extra indices to make Shredder go faster (and even after
all that it’s still fairly slow, but at least now I can delete several
hundred tickets an hour, when necessary)

I created five indexes on the Transactions table, in particular:

| Transactions | 1 | tjrc_hack1 | 1 |
OldReference | A | 17 | NULL | NULL | YES |
BTREE | |
| Transactions | 1 | tjrc_hack1 | 2 |
ReferenceType | A | 17 | NULL | NULL | YES |
BTREE | |
| Transactions | 1 | tjrc_hack2 | 1 |
NewReference | A | 285372 | NULL | NULL | YES |
BTREE | |
| Transactions | 1 | tjrc_hack2 | 2 |
ReferenceType | A | 285372 | NULL | NULL | YES |
BTREE | |
| Transactions | 1 | tjrc_hack3 | 1 |
OldValue | A | 109758 | NULL | NULL | YES |
BTREE | |
| Transactions | 1 | tjrc_hack3 | 2 |
Type | A | 109758 | NULL | NULL | YES |
BTREE | |
| Transactions | 1 | tjrc_hack4 | 1 |
NewValue | A | 13719 | NULL | NULL | YES |
BTREE | |
| Transactions | 1 | tjrc_hack4 | 2 |
Type | A | 13719 | NULL | NULL | YES |
BTREE | |
| Transactions | 1 | tjrc_hack5 | 1 |
Creator | A | 10119 | NULL | NULL | |
BTREE | |

Basically, I created these by logging into the RT database with mysql
while running RTx-Shredder, and watching which queries were taking a
long time, running EXPLAIN on them and then adding appropriate indexes
to stop the full table scans which were going on.

I also added two indexes to CachedGroupMembers:

| CachedGroupMembers | 1 | tjrc_hack1 | 1 |
ImmediateParentId | A | 592001 | NULL | NULL | YES
| BTREE | |
| CachedGroupMembers | 1 | tjrc_hack1 | 2 |
MemberId | A | 592001 | NULL | NULL | YES
| BTREE | |
| CachedGroupMembers | 1 | tjrc_hack2 | 1 |
Via | A | 592001 | NULL | NULL | YES
| BTREE | |
| CachedGroupMembers | 1 | tjrc_hack2 | 2 |
id | A | 592001 | NULL | NULL |
| BTREE | |

And a Creator index on Attachments:

| Attachments | 1 | tjrc_creator | 1 |
Creator | A | 1634 | NULL | NULL | |
BTREE | |

Regards,

Tim

The Wellcome Trust Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.

Tim Cutts wrote:> On 27 Nov 2008, at 1:14 pm, Howard Jones wrote:

Yes, it does take a lot of time, because the sorts of queries it makes
are not “normal” for RT, so there are no indices. I believe in 3.8
much of this has been addressed, but certainly in 3.4 I have had to
add a lot of extra indices to make Shredder go faster (and even after
all that it’s still fairly slow, but at least now I can delete several
hundred tickets an hour, when necessary)
Ah, thanks Tim!

I had a brief heart-in-mouth moment when the Attachments index took
20-some minutes to build, blocking inserts in the process, but now I can
shred 10 tickets in about 3 minutes - about 20 times faster.

I’ve also enabled the slow-query-log, so I can perhaps tune it a bit more.

Cheers,

Howie

(my case is for 3.6.4, incidentally)