RT 4.2.10 crash

Hi all,
I have recently implemented approvals for one queue in our RT (v4.2.10).

I am experiencing a crash, and I cannot find the reason for it.
As soon as the approver approves a ticket, he gets an error message (see attached image001.png).

The log is not very clear (not for me at least) on the reason for the crash.

Here is an excerpt of the log just around the crash:

[25336] [Wed Jun 3 15:14:27 2015] [debug]: Found 3 scrips for TransactionBatch stage with applicable type(s) Status for txn #1533997 on ticket #105388 (/opt/rt4/sbin/…/lib/RT/Scrips.pm:477)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #50 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [info]: – Scrip 6 – ((eval 694):1)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #6 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [info]: – Scrip 10 – ((eval 695):1)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #10 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Found 3 scrips for TransactionBatch stage with applicable type(s) Comment for txn #1533993 on ticket #105388 (/opt/rt4/sbin/…/lib/RT/Scrips.pm:477)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #50 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [info]: – Scrip 6 – ((eval 696):1)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #6 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [info]: – Scrip 10 – ((eval 697):1)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #10 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Found 3 scrips for TransactionBatch stage with applicable type(s) Comment for txn #1533994 on ticket #105388 (/opt/rt4/sbin/…/lib/RT/Scrips.pm:477)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #50 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [info]: – Scrip 6 – ((eval 698):1)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #6 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [info]: – Scrip 10 – ((eval 699):1)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #10 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [error]: Died at /opt/rt4/sbin/…/lib/RT/Rule.pm line 100.

Stack:
[/opt/rt4/sbin/…/lib/RT/Rule.pm:100]
[/opt/rt4/sbin/…/lib/RT/Approval/Rule/Passed.pm:104]
[/opt/rt4/sbin/…/lib/RT/Ruleset.pm:73]
[/opt/rt4/sbin/…/lib/RT/Transaction.pm:212]
[/opt/rt4/sbin/…/lib/RT/Record.pm:1693]
[/opt/rt4/sbin/…/lib/RT/Ticket.pm:2696]
[/opt/rt4/sbin/…/lib/RT/Ticket.pm:2398]
[/opt/rt4/sbin/…/lib/RT/Ticket.pm:2348]
[/opt/rt4/share/html/Approvals/index.html:86]
[/opt/rt4/share/html/Approvals/autohandler:53]
[/opt/rt4/sbin/…/lib/RT/Interface/Web.pm:681]
[/opt/rt4/sbin/…/lib/RT/Interface/Web.pm:369]
[/opt/rt4/share/html/autohandler:53] (/opt/rt4/sbin/…/lib/RT/Interface/Web/Handler.pm:208)
[25336] [Wed Jun 3 15:14:27 2015] [warning]: Rollback and commit are mixed while escaping nested transaction at /usr/local/share/perl5/DBIx/SearchBuilder/Handle.pm line 848. (/usr/local/share/perl5/DBIx/SearchBuilder/Handle.pm:848)
[25336] [Wed Jun 3 15:14:27 2015] [critical]: Transaction not committed. Usually indicates a software fault.Data loss may have occurred (/opt/rt4/sbin/…/lib/RT/Interface/Web/Handler.pm:168)

I found evidence of some problems in a scrip a little earlier in the log:
[25227] [Wed Jun 3 13:49:47 2015] [info]: – Scrip 21 – ((eval 790):1)
[25227] [Wed Jun 3 13:49:47 2015] [warning]: Use of uninitialized value $principal in pattern match (m//) at /opt/rt4/sbin/…/lib/RT/Group.pm line 1155. (/opt/rt4/sbin/…/lib/RT/Group.pm:1153)
[25227] [Wed Jun 3 13:49:47 2015] [error]: Group::HasMember was called with an argument that isn’t an RT::Principal or id. It’s (undefined) (/opt/rt4/sbin/…/lib/RT/Group.pm:1158)
[25227] [Wed Jun 3 13:49:47 2015] [warning]: Use of uninitialized value $principal in pattern match (m//) at /opt/rt4/sbin/…/lib/RT/Group.pm line 1155. (/opt/rt4/sbin/…/lib/RT/Group.pm:1153)
[25227] [Wed Jun 3 13:49:47 2015] [error]: Group::HasMember was called with an argument that isn’t an RT::Principal or id. It’s (undefined) (/opt/rt4/sbin/…/lib/RT/Group.pm:1158)
[25227] [Wed Jun 3 13:49:47 2015] [warning]: Use of uninitialized value $principal in pattern match (m//) at /opt/rt4/sbin/…/lib/RT/Group.pm line 1155. (/opt/rt4/sbin/…/lib/RT/Group.pm:1153)
[25227] [Wed Jun 3 13:49:47 2015] [error]: Group::HasMember was called with an argument that isn’t an RT::Principal or id. It’s (undefined) (/opt/rt4/sbin/…/lib/RT/Group.pm:1158)
[25227] [Wed Jun 3 13:49:47 2015] [warning]: Use of uninitialized value $addr in concatenation (.) or string at (eval 790) line 54. ((eval 790):54)
[25227] [Wed Jun 3 13:49:47 2015] [warning]: Use of uninitialized value $Principal in concatenation (.) or string at (eval 790) line 54. ((eval 790):54)
[25227] [Wed Jun 3 13:49:47 2015] [error]: Scrip:AddWatchersOnCorrespond: Failed to add new Cc watcher to ticket #105369: (#) - One, and only one, of PrincipalId/User/Group is required ((eval 790):54)

However, this was an earlier instance of the error. After that, I tried to disable scrip 21 (AddWatchersOnCorrespond) but the approver got the error again.

I am attaching the complete log related to the latest instance of the error.

I would be extremely grateful if someone could shed some light on the problem, especially since I am somewhat scared that we’re corrupting our data.

Thank you in advance.
Cris

rt_debug_log.zip (4.94 KB)

I see that nobody has any idea regarding my problem.

After more investigation, I can say that part of the problem is that, upon approving a ticket, there is a correspond transaction on the original ticket, so all the scrips that act on correspond are fired. BUT I cannot actually see any correspond transaction on the original ticket!

Also, by adding a lot of debugging in the scrips, I see other strange ticket metadata. As an example, I see the ticket status is “open”, but when I look at the ticket in RT it is still “new” (as it should be).

This is the reason (or at least one of the reasons) why RT is misbehaving… or, better, some of my scrips are misbehaving.

Could someone please shed some light on the way approvals are working internally? Especially the reason of this ghost correspond transaction, and how can I tell it apart from the “normal” correspond transactions.

T.I.A.

CrisOn 04/06/2015 10:32, Guadagnino Cristiano wrote:
Hi all,
I have recently implemented approvals for one queue in our RT (v4.2.10).

I am experiencing a crash, and I cannot find the reason for it.
As soon as the approver approves a ticket, he gets an error message (see attached image001.png).

The log is not very clear (not for me at least) on the reason for the crash.

Here is an excerpt of the log just around the crash:

[25336] [Wed Jun 3 15:14:27 2015] [debug]: Found 3 scrips for TransactionBatch stage with applicable type(s) Status for txn #1533997 on ticket #105388 (/opt/rt4/sbin/…/lib/RT/Scrips.pm:477)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #50 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [info]: – Scrip 6 – ((eval 694):1)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #6 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [info]: – Scrip 10 – ((eval 695):1)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #10 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Found 3 scrips for TransactionBatch stage with applicable type(s) Comment for txn #1533993 on ticket #105388 (/opt/rt4/sbin/…/lib/RT/Scrips.pm:477)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #50 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [info]: – Scrip 6 – ((eval 696):1)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #6 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [info]: – Scrip 10 – ((eval 697):1)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #10 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Found 3 scrips for TransactionBatch stage with applicable type(s) Comment for txn #1533994 on ticket #105388 (/opt/rt4/sbin/…/lib/RT/Scrips.pm:477)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #50 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [info]: – Scrip 6 – ((eval 698):1)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #6 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [info]: – Scrip 10 – ((eval 699):1)
[25336] [Wed Jun 3 15:14:27 2015] [debug]: Skipping Scrip #10 because it isn’t applicable (/opt/rt4/sbin/…/lib/RT/Scrips.pm:337)
[25336] [Wed Jun 3 15:14:27 2015] [error]: Died at /opt/rt4/sbin/…/lib/RT/Rule.pm line 100.

Stack:
[/opt/rt4/sbin/…/lib/RT/Rule.pm:100]
[/opt/rt4/sbin/…/lib/RT/Approval/Rule/Passed.pm:104]
[/opt/rt4/sbin/…/lib/RT/Ruleset.pm:73]
[/opt/rt4/sbin/…/lib/RT/Transaction.pm:212]
[/opt/rt4/sbin/…/lib/RT/Record.pm:1693]
[/opt/rt4/sbin/…/lib/RT/Ticket.pm:2696]
[/opt/rt4/sbin/…/lib/RT/Ticket.pm:2398]
[/opt/rt4/sbin/…/lib/RT/Ticket.pm:2348]
[/opt/rt4/share/html/Approvals/index.html:86]
[/opt/rt4/share/html/Approvals/autohandler:53]
[/opt/rt4/sbin/…/lib/RT/Interface/Web.pm:681]
[/opt/rt4/sbin/…/lib/RT/Interface/Web.pm:369]
[/opt/rt4/share/html/autohandler:53] (/opt/rt4/sbin/…/lib/RT/Interface/Web/Handler.pm:208)
[25336] [Wed Jun 3 15:14:27 2015] [warning]: Rollback and commit are mixed while escaping nested transaction at /usr/local/share/perl5/DBIx/SearchBuilder/Handle.pm line 848. (/usr/local/share/perl5/DBIx/SearchBuilder/Handle.pm:848)
[25336] [Wed Jun 3 15:14:27 2015] [critical]: Transaction not committed. Usually indicates a software fault.Data loss may have occurred (/opt/rt4/sbin/…/lib/RT/Interface/Web/Handler.pm:168)

I found evidence of some problems in a scrip a little earlier in the log:
[25227] [Wed Jun 3 13:49:47 2015] [info]: – Scrip 21 – ((eval 790):1)
[25227] [Wed Jun 3 13:49:47 2015] [warning]: Use of uninitialized value $principal in pattern match (m//) at /opt/rt4/sbin/…/lib/RT/Group.pm line 1155. (/opt/rt4/sbin/…/lib/RT/Group.pm:1153)
[25227] [Wed Jun 3 13:49:47 2015] [error]: Group::HasMember was called with an argument that isn’t an RT::Principal or id. It’s (undefined) (/opt/rt4/sbin/…/lib/RT/Group.pm:1158)
[25227] [Wed Jun 3 13:49:47 2015] [warning]: Use of uninitialized value $principal in pattern match (m//) at /opt/rt4/sbin/…/lib/RT/Group.pm line 1155. (/opt/rt4/sbin/…/lib/RT/Group.pm:1153)
[25227] [Wed Jun 3 13:49:47 2015] [error]: Group::HasMember was called with an argument that isn’t an RT::Principal or id. It’s (undefined) (/opt/rt4/sbin/…/lib/RT/Group.pm:1158)
[25227] [Wed Jun 3 13:49:47 2015] [warning]: Use of uninitialized value $principal in pattern match (m//) at /opt/rt4/sbin/…/lib/RT/Group.pm line 1155. (/opt/rt4/sbin/…/lib/RT/Group.pm:1153)
[25227] [Wed Jun 3 13:49:47 2015] [error]: Group::HasMember was called with an argument that isn’t an RT::Principal or id. It’s (undefined) (/opt/rt4/sbin/…/lib/RT/Group.pm:1158)
[25227] [Wed Jun 3 13:49:47 2015] [warning]: Use of uninitialized value $addr in concatenation (.) or string at (eval 790) line 54. ((eval 790):54)
[25227] [Wed Jun 3 13:49:47 2015] [warning]: Use of uninitialized value $Principal in concatenation (.) or string at (eval 790) line 54. ((eval 790):54)
[25227] [Wed Jun 3 13:49:47 2015] [error]: Scrip:AddWatchersOnCorrespond: Failed to add new Cc watcher to ticket #105369: (#) - One, and only one, of PrincipalId/User/Group is required ((eval 790):54)

However, this was an earlier instance of the error. After that, I tried to disable scrip 21 (AddWatchersOnCorrespond) but the approver got the error again.

I am attaching the complete log related to the latest instance of the error.

I would be extremely grateful if someone could shed some light on the problem, especially since I am somewhat scared that we’re corrupting our data.

Thank you in advance.
Cris

Cristiano Guadagnino

Servizio Sistemi Dipartimentali, Periferici e DB
Bankadati Servizi Informatici Soc.Cons.P.A.
Gruppo bancario Credito Valtellinese
VIA TRENTO, 22 - 23100 SONDRIO
tel +39 0342522172 - fax +39 0342522997
guadagnino.cristiano@creval.it
www.creval.ithttp://www.creval.it

Il presente messaggio non è di natura personale ma inviato per esigenze lavorative; l’eventuale messaggio di risposta potrà essere conosciuto anche da altri soggetti diversi dall’originatore di questo messaggio per dette esigenze o per controllo aziendale. Questo messaggio, corredato dei relativi allegati, contiene informazioni da considerarsi strettamente riservate, ed è destinato esclusivamente al destinatario sopra indicato, il quale è l’unico autorizzato ad usarlo, copiarlo e, sotto la propria responsabilità, diffonderlo. Chiunque ricevesse questo messaggio per errore o comunque lo leggesse senza esserne legittimato è avvertito che trattenerlo, copiarlo, divulgarlo, distribuirlo a persone diverse dal destinatario è severamente proibito, ed è pregato di rinviarlo immediatamente al mittente distruggendone l’originale.

I have recently implemented approvals for one queue in our RT (v4.2.10).

I am experiencing a crash, and I cannot find the reason for it.
As soon as the approver approves a ticket, he gets an error message (see
attached image001.png).

The log is not very clear (not for me at least) on the reason for the crash.

[25336] [Wed Jun 3 15:14:27 2015] [error]: Died at
/opt/rt4/sbin/…/lib/RT/Rule.pm line 100.

Looking at the code, that error is because a RT template didn’t load.
->Load probably returns an error message that should be logged, but its
pretty bad that your RT is crashing there.

Is this a clean install of 4.2.10? Or did you upgrade from a previous
version? If so, what version?

In general approvals are known to be working correctly, so its likely
something specific to the contents of your database, its somehow got in
to a bad state. You might want to check your database with
sbin/rt-validator:

https://bestpractical.com/docs/rt/latest/rt-validator.html

Todd, thank you very much for your reply.

This is an upgrade from 4.2.9, but we have a long story with RT (since
2008) and we have always been upgrading without ever doing a clean install.
I think some minor corruption may have been adding up through the years.

Anyway, I just finished creating a couple of VMs, cloning the production
server. I have been running rt-validator --check --verbose --resolve on
the copy and now I am testing to verify that RT is still working as
expected.
rt-validator reported a lor of errors, but mainly due to missing
principals (principals that existed at the time but have been delete since).

In the next days, as my time permits, I’ll be testing the approvals
workflow on the copies and see if the problem is solved or at least if I
can get a clearer debug log.

CrisOn 11/06/2015 03:18, Todd Wade wrote:

Looking at the code, that error is because a RT template didn’t load.
->Load probably returns an error message that should be logged, but
its pretty bad that your RT is crashing there.

Is this a clean install of 4.2.10? Or did you upgrade from a previous
version? If so, what version?

In general approvals are known to be working correctly, so its likely
something specific to the contents of your database, its somehow got
in to a bad state. You might want to check your database with
sbin/rt-validator:

rt-validator - RT 5.0.5 Documentation - Best Practical

Todd,
unfortunately, even after completely fixing the db with rt-validator, I
still have the same problem with approvals.

I am at a loss, if anybody has something to contribute it would be very
appreciated.

CrisOn 11/06/2015 11:31, Guadagnino Cristiano wrote:

Todd, thank you very much for your reply.

This is an upgrade from 4.2.9, but we have a long story with RT (since
2008) and we have always been upgrading without ever doing a clean install.
I think some minor corruption may have been adding up through the years.

Anyway, I just finished creating a couple of VMs, cloning the production
server. I have been running rt-validator --check --verbose --resolve on
the copy and now I am testing to verify that RT is still working as
expected.
rt-validator reported a lor of errors, but mainly due to missing
principals (principals that existed at the time but have been delete since).

In the next days, as my time permits, I’ll be testing the approvals
workflow on the copies and see if the problem is solved or at least if I
can get a clearer debug log.

Cris

On 11/06/2015 03:18, Todd Wade wrote:

Looking at the code, that error is because a RT template didn’t load.
->Load probably returns an error message that should be logged, but
its pretty bad that your RT is crashing there.

Is this a clean install of 4.2.10? Or did you upgrade from a previous
version? If so, what version?

In general approvals are known to be working correctly, so its likely
something specific to the contents of your database, its somehow got
in to a bad state. You might want to check your database with
sbin/rt-validator:

rt-validator - RT 5.0.5 Documentation - Best Practical

unfortunately, even after completely fixing the db with rt-validator, I
still have the same problem with approvals.

I am at a loss, if anybody has something to contribute it would be very
appreciated.

Can you confirm that the Approvals templates are installed? Navigate to
/Admin/Queues/index.html?FindDisabledQueues=1 (this is the ‘Admin
queues’ page with the ‘Include disabled queues in listing’ checkbox
checked), click on the __Approvals queue and then click on ‘Tempates’ in
the subnav. Are there Templates listed there?

Without access to your system its very much a guess but it sounds like
somehow the approvals infrastructure wasn’t installed (or wasn’t
installed correctly) in your system during the upgrade that added it in.

Regards,

Todd,
I have these templates:

  • New Pending Approval
  • Approval Passed
  • All Approvals Passed
  • Approval Rejected
  • New Pending Approval in HTML
  • Approval Passed in HTML
  • All Approvals Passed in HTML
  • Approval Rejected in HTML
  • Approval Ready for Owner in HTML
  • Transaction in HTML
  • Transaction

Upon visual inspection they seem OK, but the last two templates are empty.

Of all the templates, only three are used by scrips:

  • New Pending Approval is used by scrip 12 (“When an approval ticket is created, notify the Owner and AdminCc of the item awaiting their approval”)
  • Approval Rejected is used by scrip 13 (“If an approval is rejected, reject the original and delete pending approvals”)
  • Transaction is used by scrip 4 (“On Create Notify AdminCcs”)

Todd, does Best Practical offer support for troubleshooting?
Could you send me (privately) some figures?
If we can’t fix this issue by ourselves I’ll see if I can convince the management to ask for support.

T.I.A.

CrisOn 17/06/2015 05:34, Todd Wade wrote:
On 6/16/15 10:57 AM, Guadagnino Cristiano wrote:
unfortunately, even after completely fixing the db with rt-validator, I
still have the same problem with approvals.

I am at a loss, if anybody has something to contribute it would be very
appreciated.

Can you confirm that the Approvals templates are installed? Navigate to /Admin/Queues/index.html?FindDisabledQueues=1 (this is the ‘Admin queues’ page with the ‘Include disabled queues in listing’ checkbox checked), click on the __Approvals queue and then click on ‘Tempates’ in the subnav. Are there Templates listed there?

Without access to your system its very much a guess but it sounds like somehow the approvals infrastructure wasn’t installed (or wasn’t installed correctly) in your system during the upgrade that added it in.

Regards,