Weird panic error

Hi,

We’re experiencing a weird error in our production installation of RT,
something we haven’t seen before.

It randomly throws errors; after a bunch of refreshes it seems to work
again for a while.
Clearing the Mason object cache and reloading the web server also seems to
help a bit (but only for a while).

Below are a few of the errors that we’re getting:

  1. panic: attempt to copy value HTML::Quoted::Parser=HASH(0x7f7a770dc590)
     to a freed scalar 7f7a77eac6d0 at
     /usr/local/share/perl/5.10.1/HTML/Quoted.pm line 79.

  2. Cannot copy to ARRAY in sassign at
     /usr/local/share/perl/5.10.1/HTML/Quoted.pm line 79.

The errors result in no history being shown, neither on the overview page nor on
the history page (the history is replaced by the error line).

Our current version is 4.0.4; I haven’t yet made plans for upgrading to
4.0.5 (should I?).
We’ve experienced these errors since version 4.0.4 of RT; we didn’t seem to
have them with version 4.0.2.

It seems to be something in the HTML/Quoted.pm file; that’s the common
factor in all the errors. I just don’t have a clue where to find the
solution.

Since the problem comes up randomly, it’s a little difficult to figure out
what’s causing it. I suspect it might be the plugin
“RT::Extension::HistoryFilter”, but turning that one off will affect the user
experience a lot.
Also, we can’t seem to reproduce the error in our testing environment,
which uses the same plugins and an almost identical configuration.

In addition, I’m getting a bunch of errors in the Apache logs for RT,
though I’m not sure whether they are related (snippets of the most common
errors; they happen quite a lot, especially the last one listed):

Error while loading /opt/rt4/sbin/rt-server: Since your configuration
exists (/opt/rt4/etc/RT_SiteConfig.pm) but is not writable, I'm refusing to
do anything.\n, referer: https://rt

Apache2::RequestIO::rflush: (103) Software caused connection abort at
/usr/local/share/perl/5.10.1/Plack/Handler/Apache2.pm line 153, referer:
https://rt

Has anyone seen these errors before? And if so, what could be the cause?

Thanks in advance!

– Bart

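One way to get further in situations like these is a quick hack to HTML/Quoted.pm, as follows:

    # Declare outside the eval so $parser is still in scope afterwards.
    my $parser;
    eval {
        $parser = HTML::Quoted::Parser->new(
            […]
        );
    };
    if ($@) {
        require Carp;
        # confess() dies with a full stack trace; add any variable worth dumping.
        Carp::confess("TOKEN or interesting variable: $@");
    }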

Josh Narins
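
For anyone who wants to try the eval-plus-stack-trace idea outside RT first, here is a minimal, self-contained sketch. It assumes nothing about HTML::Quoted itself: `build_parser` and `try_with_trace` are hypothetical names made up for illustration, and the stand-in constructor simply dies so the error path can be exercised. `Carp::longmess` is used instead of `confess` so the trace can be logged without killing the request.

```perl
use strict;
use warnings;
use Carp ();

# Hypothetical stand-in for the real constructor call; it always fails so
# the error path can be exercised without HTML::Quoted being installed.
sub build_parser {
    die "simulated constructor failure\n";
}

# Run a code ref inside eval. Returns (result, undef) on success, or
# (undef, message with stack trace) when the code dies.
sub try_with_trace {
    my ($code, $label) = @_;
    my $result;
    my $ok = eval { $result = $code->(); 1 };
    return ($result, undef) if $ok;

    my $error = (defined $@ && length $@) ? $@ : 'unknown error';
    chomp $error;
    # longmess() returns the message plus a backtrace instead of dying.
    return (undef, Carp::longmess("$label failed: $error"));
}

my ($parser, $trace) = try_with_trace(\&build_parser, 'parser construction');
print STDERR $trace if defined $trace;
```

In the real module the code ref would wrap the existing `HTML::Quoted::Parser->new(...)` call, and the trace would go to RT's log rather than STDERR.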

From: rt-users-bounces@lists.bestpractical.com [mailto:rt-users-bounces@lists.bestpractical.com] On Behalf Of Bart
Sent: Thursday, February 16, 2012 4:03 AM
To: rt-users@lists.bestpractical.com
Subject: [rt-users] Weird panic error

[…]

It seems to be something in the HTML/Quoted.pm file; that’s the common factor in all the errors.
I just don’t have a clue where to find the solution.

Which version of HTML::Quoted are you running? You should also be
able to safely remove HTML::Quoted from the system and RT will run
without it if it causes problems. You’ll lose functionality though.
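
A quick way to answer the version question, and to compare production against the test box, is to ask Perl which copy of each module it actually loads. This is only a sketch: `module_info` is a made-up helper name, and HTML::Parser is included in the default list on the assumption that HTML::Quoted's parser is built on it. Run it with the same perl binary and `@INC` that RT uses, otherwise it may report a different copy.

```perl
use strict;
use warnings;

# Return { version, path } for a module, or nothing if it cannot be loaded.
sub module_info {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # HTML::Quoted -> HTML/Quoted.pm
    return unless eval { require $file; 1 };
    my $version = eval { $module->VERSION };
    return {
        version => defined $version ? $version : 'unknown',
        path    => $INC{$file},                # file that was actually loaded
    };
}

# Module names can be given on the command line; the default list is an assumption.
for my $module (@ARGV ? @ARGV : qw(HTML::Quoted HTML::Parser)) {
    my $info = module_info($module);
    if ($info) {
        printf "%-20s %-10s %s\n", $module, $info->{version}, $info->{path};
    }
    else {
        printf "%-20s not loadable from this \@INC\n", $module;
    }
}
```

Running it on both machines and diffing the output shows whether the two environments really use the same module versions.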

Since the problem comes up randomly, it’s a little difficult to figure out what’s causing it.
I suspect it might be the plugin “RT::Extension::HistoryFilter”, but turning that one
off will affect the user experience a lot.
Also, we can’t seem to reproduce the error in our testing environment, which uses the same
plugins and an almost identical configuration.

Unless you can replicate load in testing, I don’t know if you’ll be
able to do a simple test case.

In addition, I’m getting a bunch of errors in the Apache logs for RT, though I’m not sure whether
they are related (snippets of the most common errors; they happen quite a lot, especially the
last one listed):

 Error while loading /opt/rt4/sbin/rt-server: Since your configuration exists
 (/opt/rt4/etc/RT_SiteConfig.pm) but is not writable, I'm refusing to do anything.\n,
 referer: https://rt......

This error implies that RT cannot talk to your database, which could
be causing lots of problems. Look for the error that comes before it
in the logs.
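
Finding "the error that comes before it" by hand gets tedious when the message shows up a lot. Here is a rough sketch that prints each "Error while loading" line together with the few lines logged just before it. The helper name `context_before`, the three-line window and the log path passed on the command line are all assumptions for illustration.

```perl
use strict;
use warnings;

# Return, for every line matching $pattern, an array ref holding that line
# preceded by up to $before earlier lines.
sub context_before {
    my ($lines, $pattern, $before) = @_;
    my @hits;
    for my $i (0 .. $#$lines) {
        next unless $lines->[$i] =~ $pattern;
        my $start = $i - $before;
        $start = 0 if $start < 0;
        push @hits, [ @{$lines}[ $start .. $i ] ];
    }
    return @hits;
}

# Usage: perl log_context.pl /path/to/apache/error.log
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    chomp(my @lines = <$fh>);
    close $fh;
    for my $hit (context_before(\@lines, qr/Error while loading/, 3)) {
        print "$_\n" for @$hit;
        print "--\n";
    }
}
```

On a busy server, lines from other requests will be interleaved, so the preceding lines are a hint rather than proof of cause.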

 Apache2::RequestIO::rflush: (103) Software caused connection abort at
 /usr/local/share/perl/5.10.1/Plack/Handler/Apache2.pm line 153, referer: https://rt......

Traditionally this is “someone hit the stop button”.

-kevin

Sorry, slightly off-topic.

On 2012-02-16 17:10, Kevin Falcone wrote:

[…]

  Apache2::RequestIO::rflush: (103) Software caused connection abort at
  /usr/local/share/perl/5.10.1/Plack/Handler/Apache2.pm line 153, referer: https://rt......

Traditionally this is “someone hit the stop button”.

Hi Kevin

Honestly, I have doubts about that traditional wisdom.
For example, I have 102 occurrences of that error between
Mon Feb 13 07:37:12 2012 and Thu Feb 16 17:01:44 2012.
102 people hitting the stop button in 3.5 days?!?

Gerard
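
A count like the one above is easier to judge when broken down per day, for example to see whether the aborts cluster around the times the panics show up. This is a sketch only: `count_per_day` is a hypothetical helper, and it assumes the Apache 2.2 style error-log prefix `[Thu Feb 16 17:01:44 2012]`; lines with a different timestamp format are skipped.

```perl
use strict;
use warnings;

# Count lines matching $pattern per calendar day, in first-seen order.
# Assumes the timestamp prefix "[Thu Feb 16 17:01:44 2012]".
sub count_per_day {
    my ($lines, $pattern) = @_;
    my (%count, @order);
    for my $line (@$lines) {
        next unless $line =~ $pattern;
        next unless $line =~ /^\[(\w{3} \w{3}\s+\d{1,2}) [\d:]{8} (\d{4})\]/;
        my $day = "$1 $2";
        # Remember each day the first time it is seen, then count it.
        push @order, $day unless $count{$day}++;
    }
    return map { [ $_, $count{$_} ] } @order;
}

# Usage: perl count_aborts.pl /path/to/apache/error.log
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my @lines = <$fh>;
    close $fh;
    for my $row (count_per_day(\@lines, qr/Apache2::RequestIO::rflush/)) {
        printf "%-16s %d\n", @$row;
    }
}
```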

Honestly, I have doubts about that traditional wisdom.
For example, I have 102 occurrences of that error between
Mon Feb 13 07:37:12 2012 and Thu Feb 16 17:01:44 2012.
102 people hitting the stop button in 3.5 days?!?

It’s also dropped connections (think mobile devices).

1) panic: attempt to copy value
HTML::Quoted::Parser=HASH(0x7f7a770dc590) to a freed scalar
7f7a77eac6d0 at /usr/local/share/perl/5.10.1/HTML/Quoted.pm line 79.

2) Cannot copy to ARRAY in sassign at
/usr/local/share/perl/5.10.1/HTML/Quoted.pm line 79.

I’d be very curious to see the raw mail that’s causing these errors.

Also, we can’t seem to reproduce the error in our testing environment,
which uses the same plugins and an almost identical configuration.

Same module versions?

Thomas

Sorry, slightly off-topic.

[…]

 Apache2::RequestIO::rflush: (103) Software caused connection abort at
 /usr/local/share/perl/5.10.1/Plack/Handler/Apache2.pm line 153, referer: https://rt......

Traditionally this is “someone hit the stop button”.

Honestly, I have doubts about that traditional wisdom.
For example, I have 102 occurrences of that error between
Mon Feb 13 07:37:12 2012 and Thu Feb 16 17:01:44 2012.
102 people hitting the stop button in 3.5 days?!?

Gerard - see Thomas’ response about mobile connections. My main point
is that this error is not likely to be the problem he is seeking.
Unfortunately, it’s a mod_perl level error with no useful way to
debug, and it’s been that way since the mod_perl1 days.

-kevin

Hi everyone,

Thanks for all the replies!

I’ll have a look at the Perl modules and see whether anything can be
updated, but removing the HTML::Quoted package doesn’t seem right when
functionality is lost. Also, debugging in production is rather hard since
the problem is difficult to predict (as I mentioned, I can’t reproduce
it).

The problem only shows up every now and then; usually we empty the RT
caches and restart Apache, after which the problem stays away for a few
days. So it isn’t a huge issue, just an annoying one when it occurs.

One thing that doesn’t seem right is the errors relating to the database.
I think I’ll take a closer look at that, since that shouldn’t happen at
all. The database is hosted on another server, but it is on the same
Ethernet network (low latency). Still, there might be an issue with the
database configuration, or it might need more tuning. This is the one
factor that differs from the testing environment, where the database is
hosted on the same machine as RT.

Thanks for all the input so far. I believe I can get a lot further now
in solving this problem :)

– Bart

On 16 February 2012 at 18:21, Kevin Falcone <falcone@bestpractical.com> wrote:

On Thu, Feb 16, 2012 at 06:05:00PM +0100, Gerard FENELON wrote:

[…]

