Recompiling an email

Dear list,

Thanks a lot for the past assistance. I was wondering what the best way
would be to pull the entire first email from a ticket so that it can be
saved outside of RT. We want to train a SpamAssassin instance on
messages that humans have positively identified as spam in RT, so I
need some way to extract those messages.

I was planning on having a Custom Field that they mark to flag the
message as spam, and I have already accomplished this portion. But I’ve
been having trouble recompiling the whole first message.

Any help on this would be greatly appreciated.

Matthew Ekstrand-Abueg
Systems Administrator
Network Infrastructure, RSSP-IT
UC Berkeley

Matthew Ekstrand-Abueg <mattea@rescomp.berkeley.edu> writes:

> Dear list,
>
> Thanks a lot for past assistance. Now I was wondering what the best way
> would be to go about pulling the entire first email from a Ticket in
> order to save it outside of RT. We want to be able to train a Spam
> Assassin instance on messages that have been positively identified by
> humans in RT, so I need to somehow extract the messages.
>
> I was planning on having a Custom Field that they mark to set the
> message as Spam, and I have already accomplished this portion. But I’ve
> been having trouble recompiling the whole first message.

When I worked at the Free Software Foundation, we had a system similar
to what you are describing, except that the MTA on the RT machine
maintained an archive of all mail sent to RT. To train on false
negatives, our script would look for tickets that had been marked as
spam (with a custom field), look at their transactions to find a
Message-ID header, and then search the external archive for the
original copy of the message.
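
The archive lookup step could be sketched roughly as follows. This is
only an illustration, not the script we actually ran: it assumes the
MTA archive is a single mbox file, and the function names, addresses,
and Message-ID values are made up for the example. It uses Python's
standard mailbox module.

```python
import mailbox
from email.message import EmailMessage


def find_by_message_id(mbox_path, message_id):
    """Return the first archived message with this Message-ID, or None."""
    # Compare IDs without the surrounding angle brackets or whitespace.
    wanted = message_id.strip().strip("<>")
    archive = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in archive:
            found = (msg.get("Message-ID") or "").strip().strip("<>")
            if found == wanted:
                return msg
    finally:
        archive.close()
    return None


def build_demo_archive(mbox_path):
    """Write a tiny two-message mbox so the lookup can be tried out."""
    box = mailbox.mbox(mbox_path)
    samples = [
        ("<ham-1@example.com>", "Printer question", "Is the printer down?"),
        ("<spam-2@example.com>", "Cheap pills", "Buy now!"),
    ]
    for mid, subject, body in samples:
        m = EmailMessage()
        m["From"] = "sender@example.com"
        m["To"] = "rt@example.com"
        m["Subject"] = subject
        m["Message-ID"] = mid
        m.set_content(body)
        box.add(m)
    box.flush()
    box.close()
```

Once a match is found, msg.as_bytes() gives the message in a form that
can be written to a file and handed to whatever training tool you use.
With a large archive you would probably want to index Message-IDs once
rather than scan the whole mbox for every ticket.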

I don’t know how tied you are to SpamAssassin, but if you can consider
alternative systems, DSPAM may be particularly well suited to this kind
of work. It can create an identifier for each message it processes, and
it can be retrained on an error given only that identifier. So your
script would only need to find the DSPAM header in RT and pass its
value to DSPAM for retraining.
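
As a rough sketch of the header lookup, assuming the identifier is
carried in a header named X-DSPAM-Signature (check what your DSPAM
installation actually adds) and that you already have the raw message
text out of RT, something like this would pull out the value. The
function name and the sample signature are invented for the example.

```python
from email import message_from_string

# Assumed header name; verify it against your own DSPAM configuration.
DSPAM_HEADER = "X-DSPAM-Signature"


def dspam_signature(raw_message):
    """Return the DSPAM identifier from a raw message, or None if absent."""
    msg = message_from_string(raw_message)
    value = msg.get(DSPAM_HEADER)
    return value.strip() if value else None


sample = (
    "From: sender@example.com\n"
    "Subject: Cheap pills\n"
    "X-DSPAM-Signature: 4a1b2c3d4e5f6\n"
    "\n"
    "Buy now!\n"
)
signature = dspam_signature(sample)
```

The returned value is what you would then hand to DSPAM's retraining
interface; how that is invoked depends on your setup.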

James E. Blair
Principal Email Systems Administrator
UC Berkeley - IST