Long attachments timeout

Hi all,

The mail handler needs some fixup.

Long attachments (more than 1.5 MB) cause this error in the Postfix mailq:

(temporary failure. Command output: An Error Occurred ================= 500
read timeout )

After the mail body is passed to Apache, the mail handler takes a few minutes
to process the long attachment. During this post-processing, the HTTP POST
operation has not yet finished, and Apache returns error 500 after 300 seconds
(the default Timeout value).

The attachment then goes to the database and gets delivered to
all relevant recipients, but the message is marked as "deferred"
in the mail queue. After some time,
Postfix (or sendmail) tries to deliver it again.

As a result, you may find a few dozen jumbo attachments in your mailbox
after a weekend.
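
To see how one deferred message can turn into dozens of copies, here is a rough back-of-the-envelope sketch. It assumes every retry is processed by RT in full (as described above) and uses a simplified doubling backoff. The function name, the 60-hour weekend, and the 1000 s / 4000 s retry intervals are illustrative assumptions, not values taken from this setup or an exact model of how Postfix or sendmail schedule retries.

```python
def delivery_attempts(window_s, first_retry_s, max_retry_s):
    """Count delivery attempts inside a time window.

    The first attempt happens at t=0. Each later attempt waits twice as
    long as the previous wait, capped at max_retry_s (simplified model).
    """
    attempts, t, wait = 0, 0, first_retry_s
    while t <= window_s:
        attempts += 1
        t += wait
        wait = min(wait * 2, max_retry_s)
    return attempts


# Assumed numbers: a 60-hour weekend, first retry after 1000 s,
# retry interval capped at 4000 s.
weekend_s = 60 * 3600
copies = delivery_attempts(weekend_s, first_retry_s=1000, max_retry_s=4000)
print(copies)
```

With these assumed numbers the sketch counts 56 attempts, which is in line with "a few dozen" copies if each attempt is fully processed before the timeout is reported.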

The problem is resolved by increasing Apache Timeout value:

Timeout 3600

In addition, I set MaxRequestsPerChild to 10 to keep the memory
allocated by each Apache process from growing without limit.

I think the mail handler should finalize the HTTP session before starting
to process the message body.

This happens with RT 3.0.4 on FreeBSD 4.8-STABLE, Pentium 4 2 GHz, 1 GB RAM,
IDE HDD, Perl 5.8.

Cheers,
Stan

[Please do not send mail to rt-devel and rt-bugs at the same time, as
people replying will create new ticket after new ticket. If you need to,
mail to rt-bugs and get a ticket id and then use that in the subject of
both messages]

ok, sorry for that.

After the mail body is passed to Apache, the mail handler takes a few minutes
to process the long attachment. During this post-processing, the HTTP POST
operation has not yet finished, and Apache returns error 500 after 300 seconds
(the default Timeout value).

Minutes? That surprises me. What database? That sounds like you have
something badly mistuned.

I didn’t have time to investigate in depth, but here’s what I’ve got:

“cat longmessage.mime | rt-mailgate” takes approx. 60 seconds for a 2.5 MB message.
But when I send it via the MTA, it never leaves the mail queue and stays deferred
with the error described before. I’ll try to get more output when I have more time.
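
One way to get comparable numbers for the command-line path is to time the pipe with a small wrapper. This is only a sketch: the function name and the 300-second limit are assumptions, and the stand-in command below merely reads stdin, so it has to be replaced with the real rt-mailgate invocation from the MTA alias.

```python
import subprocess
import sys
import time


def time_delivery(argv, message_bytes, timeout_s):
    """Feed message_bytes to argv on stdin; return (seconds, exit_code).

    exit_code is None if the command did not finish within timeout_s.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(
            argv,
            input=message_bytes,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        code = proc.returncode
    except subprocess.TimeoutExpired:
        code = None
    return time.monotonic() - start, code


# Stand-in command that just reads stdin and exits (an assumption for
# illustration); substitute the real rt-mailgate command line here.
stand_in = [sys.executable, "-c", "import sys; sys.stdin.buffer.read()"]
elapsed, code = time_delivery(stand_in, b"x" * 2_500_000, timeout_s=300)
print(f"{elapsed:.1f}s exit={code}")
```

Running the same wrapper for several message sizes would show whether processing time grows roughly linearly or much faster.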

This is probably partly an Apache/mod_perl bug.

I think the mail handler should finalize the HTTP session before starting
to process the message body.

You mean that the server should tell the client “Ok, I’ve recorded the
message, you’re done” before it has done so? That’s a very, very bad
idea. That could easily result in lost messages.

OK, wrong idea. But how can I trace it and get a better idea of what’s going on
during the mail delivery?

Stan

One option may be flipping the response to stream mode and outputting
something at the beginning.

Stan


rt-devel mailing list
rt-devel@lists.fsck.com
http://lists.fsck.com/mailman/listinfo/rt-devel

http://www.bestpractical.com/rt – Trouble Ticketing. Free.

One option may be flipping the response to stream mode and outputting
something at the beginning.

You mean “cat something | lynx -post_data http://…mailgate”?

By the way, we use localhost connection for the delivery.
Could it be that the loopback interface causes some problems?

No. I mean the other way round. The problem you seem to be hitting is
that the client times out because the server returns NO data. Modifying
the server to return a little bit of data before processing begins
might help.

On Mon, Jul 28, 2003 at 12:55:32PM -0700, Stanislav Sinyagin wrote:

— Jesse Vincent jesse@bestpractical.com wrote:

One option may be flipping the response to stream mode and outputting
something at the beginning.

You mean “cat something | lynx -post_data http://…mailgate”?

By the way, we use localhost connection for the delivery.
Could it be that the loopback interface causes some problems?
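
The idea of having the server return a little data before processing begins could look roughly like the sketch below. It is written as a generic WSGI application in Python purely for illustration; it is not RT's mail gateway code. The names `mail_gateway_app` and `process` and the response strings are assumptions. The final verdict is still sent only after the work is done, so the client is never told "done" early, and whether the first chunk actually reaches the client in time depends on how the web server buffers output and how the client measures its timeout.

```python
import io


def mail_gateway_app(environ, start_response, process=None):
    """WSGI app that emits a first chunk before the slow work starts."""
    process = process or (lambda body: None)  # placeholder for the real work
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = environ["wsgi.input"].read(length)
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"working\n"  # early data, sent before processing begins
    try:
        process(body)
        yield b"ok\n"  # final verdict, only after the work has finished
    except Exception:
        yield b"temporary failure\n"


# Drive the app by hand to show the ordering of chunks and work.
processed = []
statuses = []
env = {"CONTENT_LENGTH": "5", "wsgi.input": io.BytesIO(b"hello")}
chunks = mail_gateway_app(
    env, lambda status, headers: statuses.append(status), process=processed.append
)
first = next(chunks)        # early chunk arrives before process() runs
done_before = list(processed)
rest = list(chunks)         # now the work runs and the verdict follows
```

A client using this scheme has to read the last line of the body for the result rather than rely on the HTTP status alone.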



I couldn’t reproduce the problem with a freshly restarted Apache (1.3.27, mod_perl 1.27);
I will try again later.

Another problem is the growth in allocated memory. After a 2.8 MB message,
the Apache process uses more than 300 MB of memory:

bash-2.05b# ps -vcxa | egrep 'httpd|PID'
  PID STAT     TIME  SL   RE PAGEIN    VSZ    RSS LIM TSIZ %CPU %MEM COMMAND
40003 I     0:41.31 209 1010   5408 369140   1108   -  272  0.0  0.2 httpd
40005 I     0:24.17 209 1114   1909 348448 297640   -  272  0.0 58.1 httpd
40006 I     0:05.78 209 1058    988  30528  13028   -  272  0.0  2.5 httpd
40007 I     0:00.96  34 1010   1613  29704   7248   -  272  0.0  1.4 httpd
40004 I     0:00.18 209 1010    643  27440   8800   -  272  0.0  1.7 httpd
40001 Ss    0:02.50   0 1874      4  26252   2060   -  272  0.0  0.4 httpd

Is there a way to reduce the memory consumption? Save the message body
to a temporary file, instead of keeping it in memory?
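
As a rough illustration of the temporary-file idea, the sketch below copies an incoming stream to disk in fixed-size chunks so that only one chunk is held in memory at a time. It is a Python sketch under assumptions (the function name and the 64 KB chunk size are made up), not a description of how RT handles messages, and later MIME parsing could still expand the data in memory.

```python
import io
import shutil
import tempfile


def spool_to_tempfile(stream, chunk_size=64 * 1024):
    """Copy a binary stream to an anonymous temp file in fixed-size chunks.

    Only chunk_size bytes are buffered at a time. Returns the temp file,
    rewound to the start; it is deleted automatically when closed.
    """
    tmp = tempfile.TemporaryFile()
    shutil.copyfileobj(stream, tmp, chunk_size)
    tmp.seek(0)
    return tmp


message = io.BytesIO(b"A" * 2_800_000)  # stand-in for a 2.8 MB message
with spool_to_tempfile(message) as body:
    size = body.seek(0, io.SEEK_END)  # seek returns the new position
print(size)
```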

Regards,
Stan

Hi,

I have the same problem with long attachments. After I sent a test mail
with a 6 MB attachment, RT sent the autoreply mail but no new ticket
was created. Apache then used all the CPU time, but nothing happened.
Setting “Timeout” to 3600 in httpd.conf resolved the problem:
the ticket was created in 2 minutes. However, Apache used all available
RAM and approx. 600 MB of swap space.

My test-system:
Pentium3 1GHz, 512MB RAM, IDE harddisk
RT 3.0.4
SuSE 8.2
Perl 5.8.0
Apache 1.3.27
Modperl 1.27
Mysql 4.0.13
Sendmail

Greetings,
Torsten

I couldn’t reproduce the problem with a freshly restarted Apache (1.3.27, mod_perl 1.27);
I will try again later.

Something strange. With 3.0.4, I tried several times and could not reproduce it.
After upgrading to 3.0.5pre2, the problem appears immediately.

Here’s the mailq and rt-mailgate output, with real e-mail addresses removed.
In the debug output, the first two deliveries were successful, and the last one
was made right after the upgrade.

bash-2.05b# mailq
-Queue ID- --Size-- ----Arrival Time---- -Sender/Recipient-------
2A96534D3F  2936359 Mon Aug  4 13:46:02  root@*****.net
(temporary failure)
                                         rt@*****.net

-- 2867 Kbytes in 1 Request.
bash-2.05b# tail -25 var/log/maildebug
Status: open
Requestor: s.sinyagin@*****.ch
Connecting to http://127.0.0.1//REST/1.0/NoAuth/mail-gateway at /opt/rt3/bin/rt-mailgate line 403,
<> chunk 1.
ok
Ticket: 942
Queue: rt
Owner: s.sinyagin
Status: open
Requestor: s.sinyagin@*****.ch
Connecting to http://127.0.0.1//REST/1.0/NoAuth/mail-gateway at /opt/rt3/bin/rt-mailgate line 403,
<> chunk 1.
ok
Ticket: 942
Queue: rt
Owner: s.sinyagin
Status: open
Requestor: s.sinyagin@*****.ch
Connecting to http://127.0.0.1//REST/1.0/NoAuth/mail-gateway at /opt/rt3/bin/rt-mailgate line 403,
<> chunk 1.
An Error Occurred
=================
500 read timeout
This is /opt/rt3/bin/rt-mailgate exiting because of an undefined server error at
/opt/rt3/bin/rt-mailgate line 445, <> chunk 1.
bash-2.05b#

I forgot to mention that the Apache process still occupies 100% CPU, even though
the MTA has already disconnected:

last pid: 57402; load averages: 1.13, 1.05, 0.74 up 54+02:27:51 14:01:32
36 processes: 2 running, 34 sleeping
CPU states: 100% user, 0.0% nice, 0.0% system, 0.0% interrupt, 0.0% idle
Mem: 333M Active, 46M Inact, 86M Wired, 15M Cache, 60M Buf, 14M Free
Swap: 1024M Total, 131M Used, 893M Free, 12% Inuse

  PID USERNAME PRI NICE  SIZE    RES STATE    TIME   WCPU    CPU COMMAND
57315 www       62    0  359M   269M RUN     13:51 97.07% 97.07% httpd
57402 root      28    0 1904K   968K RUN      0:00  1.88%  0.34% top
 2357 mysql      2    0  127M 10648K poll    20:44  0.00%  0.00% mysqld