RT 4.4.6 - Slow queue listing query when sorting by owner?

Hello! We had an old RT 4.4.4 instance which was recently upgraded to 4.4.6 on a new OS. The SQL queries being used to do things like simple queue open-ticket listings (of the sort you get by clicking on a queue name from the stock “my queues” list on the dashboard) seem to have gotten quite a bit more complicated, and are much slower than they used to be. The backend database is MariaDB in both cases, though there was also a MariaDB upgrade inbetween. The RT config is the same between them, and we’ve long had “Set($UseSQLForACLChecks, 0);” in our local check to improve performance.

For example, in 4.4.4, the query used to show open tickets in a queue (sorted by owner) ends up:

SELECT main.*
FROM    
    Tickets main
    JOIN Queues Queues_1  ON ( Queues_1.id = main.Queue )
    LEFT JOIN Users Users_2 ON ( Users_2.id = main.Owner )
WHERE           
    (main.IsMerged IS NULL)
    AND (main.Status != 'deleted')
    AND (main.Type = 'ticket')
    AND (   
        main.Queue = '15' AND (
                (
                    Queues_1.Lifecycle = 'approvals' AND
                    ( main.Status = 'new' OR main.Status = 'open' OR main.Status = 'stalled' )
                ) OR (
                    Queues_1.Lifecycle = 'default' AND
                    ( main.Status = 'new' OR main.Status = 'open' OR main.Status = 'stalled' )
                )
            )
        )
ORDER BY Users_2.Name ASC
LIMIT 50;

But in 4.4.6 it’s:

SELECT main.*
FROM
    Tickets main
    LEFT JOIN Groups Groups_2  ON ( Groups_2.Domain = 'RT::Ticket-Role' ) AND ( Groups_2.Name = 'Owner' ) AND ( Groups_2.Instance = main.id )
    JOIN Queues Queues_1  ON ( Queues_1.id = main.Queue )
    LEFT JOIN CachedGroupMembers CachedGroupMembers_3  ON ( CachedGroupMembers_3.Disabled = '0' ) AND ( CachedGroupMembers_3.GroupId != CachedGroupMembers_3.MemberId ) AND ( CachedGroupMembers_3.GroupId = Groups_2.id )
    LEFT JOIN Groups Groups_4  ON ( Groups_4.id = CachedGroupMembers_3.MemberId )
    LEFT JOIN Users Users_5  ON ( Users_5.id = CachedGroupMembers_3.MemberId )
WHERE
    (Groups_4.id IS NULL)
    AND (main.IsMerged IS NULL)
    AND (main.Status != 'deleted')
    AND (main.Type = 'ticket')
    AND (
        main.Queue = '15'
        AND (
                (
                    Queues_1.Lifecycle = 'default'
                    AND main.Status IN ('new', 'open', 'stalled')
                ) OR (
                    Queues_1.Lifecycle = 'approvals'
                    AND main.Status IN ('new', 'open', 'stalled')
                )
            )
        )
GROUP BY main.id
ORDER BY MIN(Users_5.Name) ASC
LIMIT 50;

Those extra JOINs in 4.4.6 make a pretty huge difference to the query time. I’ve checked our InnoDB buffer pool and it’s got plenty of room and isn’t overloaded. Is there a way to tell 4.4.6 to not JOIN through Groups/CachedGroupMembers like that?

So after digging through this a bit more, it seems maybe related to the multiple-owner capability added in RT 4.4.5, perhaps? The 4.4.4 behavior was a simple JOIN based on Tickets.Owner, whereas the new version hops through Groups->CachedGroupMembers->Groups and does some checking to make sure a record’s not disabled… We’ve currently got the default of “Set($AssetMultipleOwner, 0);”, so we’re definitely not actively using multiple-owner functionality.

Oh sorry, one more note about this which I should’ve mentioned at the start: this slow behavior only happens for “regular” users. We’d not noticed originally after the upgrade because superusers seem to still just use the simple Ticket.Owner JOIN. I’d originally thought it must have been related to that $UseSQLForACLChecks param because of that, but in the end I think that was a red herring…

It looks like someone was reporting something similar during 5.0.0 beta, as well: Login – specifically that second query, after they mention “The other one is more serious, since it’s about 500 ms for every row of a query that includes the “Owner” field.” I’m beginning to think that tallies very closely with my own experience. I’ve mostly been focusing on this one ticket-search bit of SQL, but it had also seemed like even populating the list of tickets was taking a long time too, and I too suspect it’s because of this Owner handling.

One more (probably final) update on this – it looks like at the moment we can get back to the old very-quick Owner-name sorting if we replace Owner with OwnerName in the search result columns/format. In the DefaultSearchResultFormat config var, that probably means replacing Owner with ‘OwnerName’. Then we may want to write something to loop through folks’ user preferences, 'cause some folks do have a custom format set in there.

Anyway, that’ll probably do for now, for this issue, at least. I do wonder if anything can be done on our database side to make those more-expensive queries more performant. I’ve yet to find any smoking gun performancewise there, though. We ended up expanding our InnoDB buffer pool just in case, and it’s now nowhere even close to full. We do have query caching turned on in there, which I’ve found is a contentious choice, but we’re not pushing nearly enough transactions for that to start bottlenecking anything, and its hit/miss numbers look good too.

I’ve seen something alike in 4.4.4 when I’ve tried to resolve some potentionally security issue. We only assign rights for user groups and roles, not for single users. But historically we have many users with Privileged flag that are not in any group now. So I’ve tried to set all of such users Privileged=0. Of course with the exception of RT_System, root and nobody. And then RT started for normal users bahave as you’ve described and also behavior for superusers was the same (all worked OK for them). I haven’t solve that in short time so I returned back all changes and now it works normally even after upgrade to 4.4.6. Sorry not to help more but maybe there is some clue for you.