It occurs sporadically. When navigating around I would say it around 1 in 10 clicks. When I leave it on the home page it occurs after 5-10 reloads. In my testing it seems able to happen on just about any page.
I am running on RHEL 6.5
Apache 2.2.15
mod_fcgid-2.3.9
Oracle Instant Client 11.2 against a 11g database.
I ran the development server on the same code as our production (4.0.17) for a day before upgrading and did not experience this error so I feel it is related to 4.2.x
I enabled debug logging and stack traces but when this error occurs nothing is logged other than the http error log above.
Does anyone have any idea what could be causing this or ideas on how to troubleshoot this further?
Does anyone have any idea what could be causing this or ideas on how to
troubleshoot this further?
Look for slow queries in your database (there are oracle specific
reports for this) or try turning on MasonX::Profiler in development
and see what RT is doing when it times out.
Look for slow queries in your database (there are oracle specific
reports for this) or try turning on MasonX::Profiler in development
and see what RT is doing when it times out.
I enabled MasonX::Profiler as you suggested. Here is the relevant log entries
RT::Config::ANON(‘RT::Config=HASH(0x14eab98)’, undef) called at /opt/rt4devel/sbin/…/lib/RT/Config.pm line 1159
RT::Config::PostLoadCheck(‘RT::Config=HASH(0x14eab98)’) called at /opt/rt4devel/sbin/…/lib/RT.pm line 199
RT::Init(‘RT’, ‘Heavy’, 1) called at /opt/rt4devel/sbin/rt-server.fcgi line 126
Our DBA is looking into the database side. Apparently It is blocking itself going against the 11g database. Sessions are blocking other sessions. We are looking into any possible database parameters new to 11g that cause this behavior to be different from our production 10g RT database.
Our DBA is looking into the database side. Apparently It is blocking
itself going against the 11g database. Sessions are blocking other
sessions. We are looking into any possible database parameters new to
11g that cause this behavior to be different from our production 10g RT
database.
Sounds like it’s indeed related to session locking. You can try
switching to on-disk sessions and see if it resolves the issue:
Set($WebSessionClass, "Apache::Session::File");
However, we’d also be interested to hear what your DBA discovers about
the deadlocks, as this issue may well bite other Oracle users, and we’d
like to resolve it.
Sounds like it’s indeed related to session locking. You can try
switching to on-disk sessions and see if it resolves the issue:
Set($WebSessionClass, "Apache::Session::File”);
Running this way did indeed prevent the error from occurring. We
switched back to database sessions to perform more testing.
However, we’d also be interested to hear what your DBA discovers about
the deadlocks, as this issue may well bite other Oracle users, and we’d
like to resolve it.
Our DBA has been doing some research and believes FOR UPDATE in this
statement is the issue
SELECT a_session
FROM sessions
WHERE id = :p1 FOR UPDATE
From Oracle Doc:
How Oracle Database Locks Data
Locks are mechanisms that prevent destructive interaction between
transactions accessing the same resource—either user objects such as
tables and rows or system objects not visible to users, such as shared
data structures in memory and data dictionary rows.
In all cases, Oracle Database automatically obtains necessary locks when
executing SQL statements, so users need not be concerned with such
details. Oracle Database automatically uses the lowest applicable level
of restrictiveness to provide the highest degree of data concurrency yet
also provide fail-safe data integrity. Oracle Database also allows the
user to lock data manually.
Doing some more research I see that RT using Oracle used to use File Sesssions and switched in 4.2 to using Oracle. Then a change was made to make sessions default to the MaxAttachment size which, from reading DBD::Oracle documentation, could trigger FOR UPDATE being used causing the locking issue.
I guess the answer for me is to just use Apache Sesssion Files.
I would be happy to test any solutions for using Oracle for sessions on my devel instance but don’t want to hold up my upgrade any longer.
The two commits in question
I did confirm this bug occurs against a 10g and 11g Oracle database using 4.2
Sounds like it’s indeed related to session locking. You can try
switching to on-disk sessions and see if it resolves the issue:
Set($WebSessionClass, "Apache::Session::File”);
Running this way did indeed prevent the error from occurring. We
switched back to database sessions to perform more testing.
However, we’d also be interested to hear what your DBA discovers about
the deadlocks, as this issue may well bite other Oracle users, and we’d
like to resolve it.
Our DBA has been doing some research and believes FOR UPDATE in this
statement is the issue
SELECT a_session
FROM sessions
WHERE id = :p1 FOR UPDATE
From Oracle Doc:
How Oracle Database Locks Data
Locks are mechanisms that prevent destructive interaction between
transactions accessing the same resource—either user objects such as
tables and rows or system objects not visible to users, such as shared
data structures in memory and data dictionary rows.
In all cases, Oracle Database automatically obtains necessary locks when
executing SQL statements, so users need not be concerned with such
details. Oracle Database automatically uses the lowest applicable level
of restrictiveness to provide the highest degree of data concurrency yet
also provide fail-safe data integrity. Oracle Database also allows the
user to lock data manually.
Doing some more research I see that RT using Oracle used to use File
Sesssions and switched in 4.2 to using Oracle. Then a change was made to
make sessions default to the MaxAttachment size which, from reading
DBD::Oracle documentation, could trigger FOR UPDATE being used causing
the locking issue.
I guess the answer for me is to just use Apache Sesssion Files.
I would be happy to test any solutions for using Oracle for sessions on
my devel instance but don�t want to hold up my upgrade any longer.
I did confirm this bug occurs against a 10g and 11g Oracle database
using 4.2
What steps do you use to reproduce the error? If we can trigger it
locally (we test against 10g), we can more easily fix the error.
Alex
I didn’t need to do anything specific to reproduce it. I just upgraded to 4.2 and it started. It is somewhat intermittent though. I usually login to the main dashboard in a tab and set it to refresh every 10 minutes and it happens within about half an hour. I also can make it happen by refreshing the main page over and over.
What steps do you use to reproduce the error? If we can trigger it
locally (we test against 10g), we can more easily fix the error.
I didn�t need to do anything specific to reproduce it. I just upgraded
to 4.2 and it started. It is somewhat intermittent though. I usually
login to the main dashboard in a tab and set it to refresh every 10
minutes and it happens within about half an hour. I also can make it
happen by refreshing the main page over and over.
If I can provide anything else let me know.
I’m ideally looking for a reproduction strategy against a clean RT 4.2.
Can you show the results of (filling in … for the value of your RT
session cookie):
SELECT id, LENGTH(a_session) FROM sessions
WHERE id = ‘…’
…as well as the value that your $MaxAttachmentSize is set to?