De-HTML incoming mail requests

Hi All,

I need to have RT de-HTML messages when it injects them into my General
queue. Right now, it receives HTML email, creates the request and puts the
message in an attachment that we can download. When I click on the
attachment I get a text display of the HTML content of the email…

Thanks in advance,
Sean

BTW, I have found the “Set($TrustHTMLAttachments , undef);” configuration
option in RT_Config.pm and RT_SiteConfig.pm and have changed the “undef” to
a 1 and to ‘true’ to no avail…

That option applies to the attachment download link: if it is true, the
link gives you an HTML page; without it, you get plain text.

Scrubbing before or after inserting into RT, and other methods, have been
discussed on this list; search the archives for details.

Jesse Vincent wrote:

(RT 3.1 will scrub and display html inline.)

I've seen your patch; it doesn't do everything you wanted it to do.
It still leaves room for injecting JavaScript and ActiveX calls.

+$scrubber->default( 0,
+    { '*' => 0, id => 1, class => 1, href => 1, face => 1, size => 1, target => 1 } );
+$scrubber->deny(qw[*]);
+$scrubber->allow( qw[A B U P BR I HR BR SMALL EM FONT SPAN DIV UL OL LI DL DT DD] );

href can contain JS and other weird data.

		Best regards. Ruslan.

Sean McKay wrote:

Jesse/Ruslan,

We’re trying to just delete any HTML tags – we don’t care if we catch an
occasional greater than or less than sign. One of our programmers
came up with:

The regular expression:

		s/<.*?>//g; 

And I’ve put the following into the mailgate.in Perl script after reading
the message in from STDIN:

# Read the message in from STDIN

$args{'message'} = <>;

############################## Change by Sean
$args{'message'} = s/<.*?>//g;
############################## End change by Sean

if ($opts{'extension'}) {
    $args{$opts{'extension'}} = $ENV{'EXTENSION'};
}

The only problem is that this results in an empty ticket created by
RT_System. Any tips on where this should go? Basically the goal is to
eliminate any html tag…

Thanks,
Sean

-----Original Message-----
From: Jesse Vincent [mailto:jesse@bestpractical.com]
Sent: Wednesday, May 19, 2004 11:56 PM
To: Ruslan U. Zakirov
Cc: Sean McKay; rt-users@lists.fsck.com
Subject: Re: [rt-users] De-HTML incoming mail requests

On Thu, May 20, 2004 at 10:52:28AM +0400, Ruslan U. Zakirov wrote:

This option is for download link, if option is true then you get html
page with attachment download link, without it you get plain text.

Scrubbing before/after inserting in RT and other methods was discussed
here, search for info.

(RT 3.1 will scrub and display html inline.)

Try: s/<[^>]*>//g
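
One likely reason for the empty ticket, offered as an assumption rather than a confirmed diagnosis: the change uses `=` where Perl's binding operator `=~` is needed. With `=`, the substitution runs against `$_` and its return value (the number of replacements, or an empty string when there were none) is assigned to the message. A scalar `<>` also reads only a single line. The sketch below shows the bound form; the helper name `strip_tags` and the sample text are made up for illustration and are not part of RT or the mail gateway.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RT): remove anything that looks like a tag.
sub strip_tags {
    my ($text) = @_;
    # =~ binds the substitution to $text. A plain = would run s/// against $_
    # and assign the substitution's return value instead of the cleaned text.
    $text =~ s/<[^>]*>//g;
    return $text;
}

# In the mail gateway the whole message could be slurped like this
# (left commented out so the sketch runs without any input):
# my $message = do { local $/; <STDIN> };

# Made-up sample input for illustration.
my $sample = "<html><body><p>Printer is <b>down</b></p></body></html>\n";
print strip_tags($sample);
```

Running a pattern like this over the whole raw message would also strip angle-bracketed header values such as Message-ID and the addresses in From:, so it is probably safer to apply it to the body only.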


With regards,

Say_Ten


href can contain JS and other weird data.

Indeed. From the docs, it looks like we could restrict to “safe” URI
types: http, https, ftp, gopher. Anything else?

        'href'        => qr{^(?!(?:java)?script)}i,
        'src'         => qr{^(?!(?:java)?script)}i,


mailto.

Sebastian

Sebastian Flothow
sebastian@flothow.de
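
The allow-list idea from this exchange can be sketched as a single pattern. This is only an illustration under stated assumptions: the scheme list is the one suggested in the thread (http, https, ftp, gopher, mailto), and `is_safe_href` is a hypothetical helper, not an HTML::Scrubber or RT function. A pattern like `$SAFE_URI` could presumably be used in place of the deny-style `href` pattern quoted above. That deny-style pattern only rejects values starting with `script` or `javascript`, so something like `vbscript:` would pass it.

```perl
use strict;
use warnings;

# Allow-list of URI schemes taken from the thread; adjust as needed.
my $SAFE_URI = qr{^(?:https?|ftp|gopher|mailto):}i;

# Hypothetical helper for illustration only.
sub is_safe_href {
    my ($href) = @_;
    return 0 unless defined $href;
    $href =~ s/^\s+//;    # ignore leading whitespace before the scheme
    return $href =~ $SAFE_URI ? 1 : 0;
}

# Made-up sample values for illustration.
for my $href ('http://example.com/', 'mailto:someone@example.com',
              'javascript:alert(1)', 'vbscript:msgbox(1)') {
    printf "%-30s %s\n", $href, is_safe_href($href) ? 'allowed' : 'denied';
}
```

One trade-off of an allow-list like this is that relative links, which carry no scheme, are rejected as well.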
