Quick rundown of our setup:
Server1 has rt-2.0.10_pre4, postgres 7.1.2, and apache_mod_perl. It
provides the WebUI and the database.
Server2 has rt-2.0.8. It handles incoming emails with rt-mailgate, and
connects to the database on Server1.
This has been working happily for months; the last time I made
configuration changes was when rt-2.0.10_pre4 was released.
Early this evening, rt on Server1 appeared to lose the plot.
All ticket searches returned no results. The home page, for example,
returned no results for 'Tickets I own/requested'.
It listed all the queues on the right, but returned 0 for new/open/stalled.
Running 'rt --limit-queue general --limit-status open --limit-new open'
would also return nothing.
At first I thought my database had died, but running the same query on the
CLI on Server2 returned the correct results.
I upgraded Server1 to rt-2.0.11 and DBIx-SearchBuilder-0.48, but
experienced the same problems.
I rolled back to rt-2.0.9 on Server1, and everything started working again,
still with DBIx-SearchBuilder-0.48.
I know there was a lot of postgres optimisation in rt-2.0.10+, so my first
guess is that my db has reached a size where some of the optimisation code
is actually failing, for whatever reason.
However, I know there are much larger installations of RT out there, so I’m
also guessing that my guess is wrong…
I’ve checked, and my indices are all up to date with rt-2.0.11’s schema.
We have 16,281 tickets, 57,669 transactions, and 861 users.
Oh, and my perl version is 5.005_03.
There’s absolutely nothing in RT’s logs, or apache’s to indicate what’s
going on, even with debug on.
Now here’s the rub - in the time since I started poking around,
correspondence has come in, creating new tickets.
They are now displaying under rt-2.0.11, but all the prior tickets are still
Finally, I look at the postgres errlog and find this:
DEBUG: --Relation pg_toast_16600--
DEBUG: Pages 1: Changed 0, reaped 0, Empty 0, New 0; Tup 2: Vac 0,
Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 1201, MaxLen 1492; Re-using:
Free/Avail. Space 0/0
; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
DEBUG: Index pg_toast_16600_idx: Pages 2; Tuples 2. CPU 0.00s/0.00u sec.
FATAL 1: Database "mydatabsae" does not exist in the system catalog.
DEBUG: MoveOfflineLogs: remove 000000000000004C
DEBUG: XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG: MoveOfflineLogs: remove 000000000000004D
pq_recvbuf: unexpected EOF on client connection
Server process (pid 11169) exited with status 9 at Thu Feb 7 17:42:34 2002
Terminating any active server processes...
NOTICE: Message from PostgreSQL backend:
The Postmaster has informed me that some other backend died
abnormally and possibly corrupted shared memory.
I have rolled back the current transaction and am going to
terminate your database system connection and exit.
Please reconnect to the database system and repeat your query.
<---Above NOTICE repeats many times--->
Server processes were terminated at Thu Feb 7 17:42:39 2002
Reinitializing shared memory and semaphores
DEBUG: database system was interrupted at 2002-02-07 17:41:47 GMT
DEBUG: CheckPoint record at (0, 1315175136)
DEBUG: Redo record at (0, 1315175136); Undo record at (0, 0); Shutdown FALSE
DEBUG: NextTransactionId: 3245776; NextOid: 221814
DEBUG: database system was not properly shut down; automatic recovery in
DEBUG: ReadRecord: record with zero len at (0, 1315175200)
DEBUG: redo is not required
The Data Base System is starting up
DEBUG: database system is in production state
ERROR: Query was cancelled.
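With so many DEBUG lines, the handful of entries that matter are easy to miss. A minimal Python sketch for filtering an excerpt like the one above follows. The sample lines are copied from the log; the pattern list and the names INTERESTING and interesting_lines are assumptions for illustration, not anything defined by PostgreSQL or RT.

```python
import re

# A few lines copied from the log excerpt above.
LOG_EXCERPT = """\
DEBUG: Index pg_toast_16600_idx: Pages 2; Tuples 2. CPU 0.00s/0.00u sec.
FATAL 1: Database "mydatabsae" does not exist in the system catalog.
DEBUG: MoveOfflineLogs: remove 000000000000004C
pq_recvbuf: unexpected EOF on client connection
Server process (pid 11169) exited with status 9 at Thu Feb 7 17:42:34 2002
DEBUG: database system is in production state
ERROR: Query was cancelled.
"""

# Assumption: these are the patterns worth a second look in this excerpt.
# Extend the list to suit your own logs.
INTERESTING = re.compile(
    r"^(FATAL|ERROR)\b|exited with status \d+|unexpected EOF"
)


def interesting_lines(text):
    """Return the lines that match any of the patterns above, in order."""
    return [line for line in text.splitlines() if INTERESTING.search(line)]


if __name__ == "__main__":
    for line in interesting_lines(LOG_EXCERPT):
        print(line)
```

Run against the full errlog, this would reduce the output to the FATAL, ERROR, dropped-connection and backend-exit lines, which makes it easier to line them up against RT's behaviour.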
Checking transactions in the database confirms that the two new tickets
came in after that time:
57668 | | 16329 | 0 | Create
| | | | | 12 | Thu Feb 07 17:50:20 2002 GMT
57667 | | 16328 | 0 | Create
| | | | | 865 | Thu Feb 07 17:42:04 2002 GMT
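To make the timing comparison concrete, here is a small Python sketch that measures each transaction against the timestamps in the log. All values are copied from the log and the rows above. It assumes that the postmaster's un-zoned timestamps are also GMT; the names LOG_EVENTS, TRANSACTIONS and seconds_after are hypothetical, chosen only for this illustration.

```python
from datetime import datetime

# Timestamps taken from the postgres log above.
# Assumption: the un-zoned "exited"/"terminated" times are GMT as well.
LOG_EVENTS = {
    "interrupted": datetime(2002, 2, 7, 17, 41, 47),   # "database system was interrupted at"
    "backend_exit": datetime(2002, 2, 7, 17, 42, 34),  # "Server process (pid 11169) exited"
    "terminated": datetime(2002, 2, 7, 17, 42, 39),    # "Server processes were terminated"
}

# Creation times of the two new transactions, from the rows above.
TRANSACTIONS = {
    57667: datetime(2002, 2, 7, 17, 42, 4),
    57668: datetime(2002, 2, 7, 17, 50, 20),
}


def seconds_after(tx_id, event):
    """Seconds from the log event to the transaction (negative means before)."""
    return int((TRANSACTIONS[tx_id] - LOG_EVENTS[event]).total_seconds())


if __name__ == "__main__":
    for tx_id in sorted(TRANSACTIONS):
        for event in LOG_EVENTS:
            delta = seconds_after(tx_id, event)
            print(f"transaction {tx_id}: {delta:+d}s relative to {event}")
```

Under that GMT assumption, both transactions fall after the 17:41:47 "interrupted" timestamp, but 57667 was created 30 seconds before the backend exit at 17:42:34, so it may have been written just before the crash rather than after the restart.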
At this point, I’m lost.
I don’t know if RT killed postgres, or if postgres killed RT.
However, as mentioned, rt-2.0.9 is working fine.