Database connections not closing

Hi,

I’m having a file descriptor shortage problem on a server that is
running RT. I don’t think it is RT’s fault, but probably one of
DBD* or DBI*, or mod_perl or another of RT’s prerequisite. I thought
someone might be able to help me narrow my search down to the component
at fault.

Here is my setup:

  • RT 3.4.4, using mod_perl (not FastCGI)

  • OpenBSD 3.8

  • perl v5.8.6

  • Apache 1.3.29

  • mod_perl-1.29p0

  • PostgreSQL 8.0.3

  • p5-Apache-Test-1.11

  • p5-Cache-Cache-1.02

  • DBD-Pg-1.43

  • Apache-Session-1.80

  • Apache-DBI-0.94

  • DBI-1.48

  • DBIx-SearchBuilder-1.26

Now the problem is that the connections to PostgreSQL are not closed
properly; they remain open. After some time, the system runs out of
available file descriptors.

ps ax


21343 ?? I 0:02.97 postmaster: rt_user rt3 ::1(13111) idle (postgres)
32696 ?? I 0:01.61 postmaster: rt_user rt3 ::1(39447) idle (postgres)
27978 ?? I 0:02.44 postmaster: rt_user rt3 ::1(24086) idle (postgres)
18651 ?? I 0:02.30 postmaster: rt_user rt3 ::1(18178) idle (postgres)
2186 ?? I 0:01.29 postmaster: rt_user rt3 ::1(38913) idle (postgres)
28400 ?? I 0:01.82 postmaster: rt_user rt3 ::1(24840) idle (postgres)
9270 ?? I 0:01.74 postmaster: rt_user rt3 ::1(33224) idle (postgres)

sysctl kern.nfiles kern.maxfiles

kern.nfiles=700
kern.maxfiles=1772

PostgreSQL almost uses them all:

fstat | grep postgre | wc -l

 575

Restarting Apache frees them all up, since all the processes that were
holding the SQL sessions are killed:

apachectl restart

sysctl kern.nfiles kern.maxfiles

kern.nfiles=167
kern.maxfiles=1772

fstat | grep postgre | wc -l

  26

Alternatively, killing a few httpd children frees descriptors
proportionnally. So there is no doubt that a web application is at
fault. And RT is the only web application running on that server.
Now like I said I don’t think RT is at fault, but I need someone to
point me in the right direction. Where in RT’s code are the connections
to the database closed (I’m not very familiar with perl) ? What
component would most likely be at fault here ? Would there
be a way to generate more logs about success or failure of database
disconnects ?

Knowing the broken component will allow me to file a bug report to the
appropriate person.

Any suggestions?

Thanks,
Pascal

Now the problem is that the connections to PostgreSQL are not closed
properly; they remain open. After some time, the system runs out of
available file descriptors.

Two things: either increase the number of max connections allowed by
your Postgres installation or decrease the max number of apache
children you allow to run RT.

RT in mod_perl uses Apache::DBI to cache the db connections, so the
children keep the connection open until they die. This is expected.

Now, if your web server is also serving up other pages, then you’re
gonna end up with lots of way oversized httpd processes doing the
wrong thing. Best is to dedicate one apache instance for RT and/or
use FastCGI to run RT.