OS: RHEL 7.6
Apache 2.4.6 with FastCGI mod_fcgid/2.3.9
Oracle Database
I have RT5 running in a custom location and everything works (ish), although RT sometimes takes several minutes to return a page and sometimes I get the dreaded Error 500 Internal Server Error. I’ve read through the docs several times and can’t see what config changes I need to make to improve performance. On the email side, everything appears to be running well and cases are being created. It’s only the web interface that has the issue.
As per the web deployment doc, I have disabled mod_speling and mod_cache and have the prefork MPM configured.
My SSL virtual host includes:
ScriptAlias /rt /app/rt5/sbin/rt-server.fcgi/
<Location /rt>
Require all granted
Options +ExecCGI
AddHandler fcgid-script fcgi
</Location>
Does anyone have suggestions on where else I could look/investigate?
Unsure if the logs help. I saw similar messages when some files didn’t have the right permissions. As this is intermittent, I’ve ruled out permissions:
Thanks for your reply. It’s completely random, but I am now focussed on the server config itself, as I’ve noticed that stopping Apache can take a while and result in it timing out and being killed.
I’ve also reviewed atop logs and noticed yesterday that multiple rt-server.fcgi processes were started, which resulted in all swap memory being consumed before the system killed them all off.
I’ll look to tweak the mod_fcgid.conf, and see if other modules are conflicting as I do have php running, but that doesn’t experience any issues and will continue to run whilst I experience issues with RT. I’m currently using PHP for Webmail that I eventually want RT to replace!
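For anyone following along, here is a minimal sketch of the kind of mod_fcgid limits that could stop rt-server.fcgi processes multiplying until swap runs out. Every number is an illustrative assumption, not the values used on this server, and would need sizing against the memory each RT process actually uses. FcgidMaxProcesses and FcgidIdleTimeout belong in the global server config rather than inside a virtual host.

```apache
# Hypothetical mod_fcgid limits; all values are illustrative assumptions.
<IfModule mod_fcgid.c>
    # Cap the total number of FastCGI processes Apache may spawn.
    FcgidMaxProcesses 10
    # Cap the processes for any one application class (e.g. rt-server.fcgi).
    FcgidMaxProcessesPerClass 5
    # Recycle a process after this many requests to limit memory growth.
    FcgidMaxRequestsPerProcess 500
    # Terminate processes that have been idle for this many seconds.
    FcgidIdleTimeout 300
</IfModule>
```

A rough way to pick the caps is to divide the RAM you can spare for RT by the resident size of one rt-server.fcgi process as reported by atop.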
A new development is Error 500 on CSS files:
[03/Sep/2020:10:29:10 +0100] "GET /rt/NoAuth/css/elevator-light/squished-07928e9017d9e4f24077f9c5aabcc235.css HTTP/1.1" 500 547
Just to revisit this a month later: I’ve tweaked various configs, which I believed had helped with stability. It has to some degree, as I previously had Apache processes running away and consuming memory and swap. But I’m still getting a lot of “mod_fcgid: read data timeout in 30 seconds” errors, even when pushing email into RT using rt-mailgate.
I have enabled SQL debug mode and can see SQL queries are quick:
[Mon Oct 5 15:56:28 2020] [debug]: SQL(0.000898s): SELECT * FROM Tickets WHERE id = ?; [ bound values: '261' ] (/app/rt5/sbin/…/lib/RT/Interface/Web.pm:1356)
The last line in the RT log is:
[Mon Oct 5 15:56:28 2020] [debug]: SQL(0.000867s): SELECT * FROM Queues WHERE id = ?; [ bound values: '23' ] (/app/rt5/sbin/…/lib/RT/Interface/Web.pm:1356)
Then the following appears in the Apache logs (just over 30 seconds later):
[Mon Oct 05 16:57:02.700685 2020] [fcgid:warn] [pid 19755] [client 10...:12680] mod_fcgid: read data timeout in 30 seconds, referer: https://*****/rt/Ticket/Display.html?id=261
[Mon Oct 05 16:57:02.700793 2020] [core:error] [pid 19755] [client 10...:12680] End of script output before headers: rt-server.fcgi, referer: https://*****/rt/Ticket/Display.html?id=261
It’s like RT just stops processing for some reason.
Possibly, but given that the server is handling a low load with only a couple of users, I would have thought 30 seconds would be plenty of time and that another issue is at play here. In general, the solution works well, taking only a few seconds to display information/ticket/homepage etc… it just sometimes seems to get stuck.
I need a better picture of what is happening between the last RT log at Mon Oct 5 16:56:28 2020 and the fcgid warning at Mon Oct 05 16:57:02.700685 2020. I’ll look to see if I can get a more detailed view of what’s happening on the server. Perhaps atop or something else can help. I will also see if increasing the timeout helps in any way.
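As far as I can tell, the “read data timeout in 30 seconds” message is governed by mod_fcgid’s FcgidIOTimeout directive, which is separate from Apache’s core Timeout. A minimal sketch of raising it is below. The values are illustrative assumptions, and a longer timeout only buys time for a slow request; it would not fix a process that has genuinely hung.

```apache
# Illustrative values only; tune to the slowest legitimate RT request.
<IfModule mod_fcgid.c>
    # Seconds mod_fcgid waits for I/O from the FastCGI application.
    FcgidIOTimeout 45
    # Seconds a single request may run before the process is treated as hung.
    FcgidBusyTimeout 300
</IfModule>
```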
@KayJay I had a similar issue, and after a week of struggle it turned out to be an RT5 bug.
To check whether it is the same case, you can follow one of two paths.
Fast path, less accurate:
Remove the “SavedSearches” module from all dashboards.
On your RT home page, click Edit in the top right corner.
Drag and drop the module to the left.
Even one dashboard with this module can slow everything down.
Second path, more time-consuming but accurate:
Enable debug logging on the Apache server and search the error log for a message like the one below:
[130972] [Tue Aug 11 01:22:02 2020] [error]: Can't locate object method "ColumnMapClassName" via package "RT::SavedSearch" at /opt/production/rt5.0/share/html/Elements/CollectionAsTable/Header line 74.
Good suggestion about changing the LogLevel to debug, which I’ve done this morning, as well as increased TimeOut to 45 seconds.
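A global debug level can be very noisy. A narrower alternative, sketched here on the assumption that only mod_fcgid’s messages are of interest, is Apache 2.4’s per-module LogLevel syntax:

```apache
# Keep the server at 'info' but log mod_fcgid at 'debug' (Apache 2.4 per-module syntax).
LogLevel info fcgid:debug
```

The module name fcgid matches the [fcgid:warn] prefix already visible in the error log.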
I still have a fairly vanilla install of RT5, upgraded from 4.4, and don’t have any SavedSearches. But thanks for your suggestion. I hope the debug mode turns something up.