"large" attachments (was Re: rt notes)


#1
  1. Because current large-object interface in postgres is such a pain to
    work with, I converted all columns to TEXT. However, postgres up to 7.1
    won’t allow you to put >8k of data into one row, which means things won’t
    quite work right. 7.1 is to be released sometime this year. :)

nod One thing I’ve been vaguely pondering is the idea of storing "large"
attachments on disk, rather than in the database. This isn’t something I’m
thrilled with, but it may be necessary to deal with many databases’ broken
large-object handling.

Are you thinking of adding functionality similar to stripmime?

J.D. Falk, Product Manager, Mail Abuse Prevention System LLC
"Laughter is the sound that knowledge makes when it’s born."
 – The Cluetrain Manifesto


#2
  1. Because current large-object interface in postgres is such a pain to
    work with, I converted all columns to TEXT. However, postgres up to 7.1
    won’t allow you to put >8k of data into one row, which means things won’t
    quite work right. 7.1 is to be released sometime this year. :)

nod One thing I’ve been vaguely pondering is the idea of storing "large"
attachments on disk, rather than in the database. This isn’t something I’m
thrilled with, but it may be necessary to deal with many databases’ broken
large-object handling.

Are you thinking of adding functionality similar to stripmime?

Well, RT2 already does most of that, but the idea is that you’d be able to
set a cutoff above which attachments spool to disk rather than into the
database. It complicates some things a bit, but could be useful.

The thought is that you’d also be able to set a maximum size, over which an
attachment would get bounced, and a stoplist of MIME types that should NEVER
be accepted.
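
To make the three knobs above concrete (spool cutoff, maximum size, MIME stoplist), here is a rough sketch. It is in Python for brevity, and every name and default value in it is hypothetical, not an actual RT setting.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    DATABASE = "database"  # small enough to store in the database
    SPOOL = "spool"        # over the cutoff: write to disk instead
    BOUNCE = "bounce"      # rejected outright


@dataclass
class AttachmentPolicy:
    # All names and defaults here are hypothetical, for illustration only.
    spool_cutoff: int = 8 * 1024        # bytes; larger attachments spool to disk
    max_size: int = 10 * 1024 * 1024    # bytes; larger attachments are bounced
    mime_stoplist: frozenset = frozenset({"application/x-msdownload"})

    def decide(self, mime_type: str, size: int) -> Disposition:
        # The stoplist wins over everything else: these types are never accepted.
        if mime_type.lower() in self.mime_stoplist:
            return Disposition.BOUNCE
        if size > self.max_size:
            return Disposition.BOUNCE
        if size > self.spool_cutoff:
            return Disposition.SPOOL
        return Disposition.DATABASE
```

Checking the stoplist first means a forbidden type is bounced even when it is tiny; whether that ordering is what you want is a policy question, not something settled here.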

    -j




Rt-devel mailing list
Rt-devel@lists.fsck.com
http://lists.fsck.com/mailman/listinfo/rt-devel

jesse reed vincent -- root@eruditorum.org -- jesse@fsck.com
pgp keyprint: 50 41 9C 03 D0 BC BC C8 2C B9 77 26 6F E1 EB 91
<Dr_Memory> the point is that words were exchanged. neurolinguistic
programming will do the rest. they should be showing up at my house
any day now.


#3
  1. Because current large-object interface in postgres is such a pain to
    work with, I converted all columns to TEXT. However, postgres up to 7.1
    won’t allow you to put >8k of data into one row, which means things won’t
    quite work right. 7.1 is to be released sometime this year. :)

nod One thing I’ve been vaguely pondering is the idea of storing "large"
attachments on disk, rather than in the database. This isn’t something I’m
thrilled with, but it may be necessary to deal with many databases’ broken
large-object handling.

I would really strongly suggest you don’t do that. Better to do it right,
dealing with all the sillinesses of databases if necessary, than to do it
the wrong way.

There aren’t that many of them; it’s just postgres being annoying, but you
can work around it using lztext, and in 7.1 it’s completely fixed.

-alex


#4

nod One thing I’ve been vaguely pondering is the idea of storing "large"
attachments on disk, rather than in the database. This isn’t something I’m
thrilled with, but it may be necessary to deal with many databases’ broken
large-object handling.

I would really strongly suggest you don’t do that. Better to do it right,
dealing with all the sillinesses of databases if necessary, than to do it
the wrong way.

nod I’ve gotten people fighting hard on both sides of this. I don’t want to
store things on disk; it gets very, very icky. But someone or other had
convinced me that it was “better” to do that than to drop those 1/2 gig
attachments into a database that could choke. The right thing to do is
probably to figure out some nice db-neutral way to chunk things and have
per-db cutoffs.
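
For what it's worth, a db-neutral chunking scheme with per-db cutoffs could be as small as the sketch below. It is Python rather than anything from RT, the function names are invented, and the chunk sizes are placeholders rather than measured limits for any database.

```python
# Hypothetical per-database chunk sizes in bytes. The numbers are
# placeholders for illustration, not measured limits.
CHUNK_LIMITS = {
    "postgres-pre-7.1": 4096,   # stay well under the 8k row limit
    "default": 1024 * 1024,
}


def chunk_size_for(db_name: str) -> int:
    """Return the chunk size for a database, falling back to the default."""
    return CHUNK_LIMITS.get(db_name, CHUNK_LIMITS["default"])


def split_chunks(data: bytes, chunk_size: int) -> list:
    """Split data into (sequence_number, chunk) rows of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (seq, data[offset:offset + chunk_size])
        for seq, offset in enumerate(range(0, len(data), chunk_size))
    ]


def join_chunks(rows) -> bytes:
    """Reassemble the original data from rows fetched in any order."""
    return b"".join(chunk for _, chunk in sorted(rows, key=lambda row: row[0]))
```

Each row would carry the attachment id plus the sequence number, so reassembly does not depend on the order the database returns rows in.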

There aren’t that many of them; it’s just postgres being annoying, but you
can work around it using lztext, and in 7.1 it’s completely fixed.

-alex

jesse reed vincent -- root@eruditorum.org -- jesse@fsck.com
pgp keyprint: 50 41 9C 03 D0 BC BC C8 2C B9 77 26 6F E1 EB 91
They’ll take my private key when they pry it from my cold dead fingers!


#5

nod One thing I’ve been vaguely pondering is the idea of storing "large"
attachments on disk, rather than in the database. This isn’t something I’m
thrilled with, but it may be necessary to deal with many databases’ broken
large-object handling.

I would really strongly suggest you don’t do that. Better to do it right,
dealing with all the sillinesses of databases if necessary, than to do it
the wrong way.

nod I’ve gotten people fighting hard on both sides of this.

Cool, let me be the first to fight hard for configurability, then. :)

How about a config value for maximum size? Set it to 0 and attachments
are never stored in the database, set it to something small if your
database has trouble with large attachments, set it to something big if
your database can swallow them with no trouble.

(it does make it a bit more trouble for the API that deals with
attachments to transparently fetch them from the filesystem vs
database as necessary, but seems worthwhile)
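
Something like the following sketch is what I have in mind: one store object with a size setting, where callers never see which backend holds the bytes. It is Python for brevity, a dict stands in for the database table, and the class, method, and parameter names are all made up.

```python
import os
import uuid


class AttachmentStore:
    """Hypothetical store that hides where attachment content lives."""

    def __init__(self, spool_dir: str, max_db_size: int):
        self.spool_dir = spool_dir
        # 0 means "never store content in the database".
        self.max_db_size = max_db_size
        # Stand-in for a database table: id -> ("db", bytes) or ("file", path).
        self._rows = {}

    def put(self, content: bytes) -> str:
        att_id = uuid.uuid4().hex
        if self.max_db_size > 0 and len(content) <= self.max_db_size:
            self._rows[att_id] = ("db", content)
        else:
            path = os.path.join(self.spool_dir, att_id)
            with open(path, "wb") as fh:
                fh.write(content)
            self._rows[att_id] = ("file", path)
        return att_id

    def get(self, att_id: str) -> bytes:
        # Callers get bytes back either way; the backend is not exposed here.
        kind, value = self._rows[att_id]
        if kind == "db":
            return value
        with open(value, "rb") as fh:
            return fh.read()

    def location(self, att_id: str) -> str:
        """Return "db" or "file"; useful for admin tools, not interface code."""
        return self._rows[att_id][0]
```

With `max_db_size` set to something huge, everything lands in the database; set to 0, everything goes to the spool directory, and `get()` looks the same in both cases.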

I don’t
want to store things on disk. it gets very very icky. But someone or
other had convinced me that it was “better” to do that, than to drop
those 1/2 gig attachments into a database that could choke. The right
thing to do is probably to figure out some nice db-neutral way to chunk
things and have per-db cutoffs.

Chunking seems like the wrong solution; either the database handles large
attachments correctly or it doesn’t. If it doesn’t, why waste time with
it? Drop them in the filesystem. Especially since (as per below),
they’re fixed in the next version.

There aren’t that many of them; it’s just postgres being annoying, but you
can work around it using lztext, and in 7.1 it’s completely fixed.

-alex


jesse reed vincent -- root@eruditorum.org -- jesse@fsck.com
pgp keyprint: 50 41 9C 03 D0 BC BC C8 2C B9 77 26 6F E1 EB 91

They’ll take my private key when they pry it from my cold dead fingers!


meow
_ivan


#6

nod I’ve gotten people fighting hard on both sides of this.

Cool, let me be the first to fight hard for configurability, then. :)

How about a config value for maximum size? Set it to 0 and attachments
are never stored in the database, set it to something small if your
database has trouble with large attachments, set it to something big if
your database can swallow them with no trouble.

If we put in filesystem storage as an option, it will work like this…along
with the rejection functionality I described earlier…

(it does make it a bit more trouble for the API that deals with
attachments to transparently fetch them from the filesystem vs
database as necessary, but seems worthwhile)

nod It’s gotta be transparent to interface code. no question about that.

I don’t
want to store things on disk. it gets very very icky. But someone or
other had convinced me that it was “better” to do that, than to drop
those 1/2 gig attachments into a database that could choke. The right
thing to do is probably to figure out some nice db-neutral way to chunk
things and have per-db cutoffs.

Chunking seems like the wrong solution; either the database handles large
attachments correctly or it doesn’t. If it doesn’t, why waste time with
it? Drop them in the filesystem. Especially since (as per below),
they’re fixed in the next version.

FWIW, we’ve had no end of problems with dropping attachments in the
filesystem rather than the DB in 1.0.x. For sites where I have an RT instance,
I would personally rather have chunked content in the database than in
the filesystem. One of the things that we lose when storing in the filesystem
is the free searching that the database gets us.

    -j

jesse reed vincent -- root@eruditorum.org -- jesse@fsck.com
pgp keyprint: 50 41 9C 03 D0 BC BC C8 2C B9 77 26 6F E1 EB 91
…realized that the entire structure of the net could be changed to be made
more efficient, elegant, and spontaneously make more money for everyone
involved. It’s a marvelously simple diagram, but this form doesn’t have a way
for me to draw it. It’ll wait. -Adam Hirsch


#7

nod I’ve gotten people fighting hard on both sides of this.

Cool, let me be the first to fight hard for configurability, then. :)

How about a config value for maximum size? Set it to 0 and attachments
are never stored in the database, set it to something small if your
database has trouble with large attachments, set it to something big if
your database can swallow them with no trouble.

If we put in filesystem storage as an option, it will work like this…along
with the rejection functionality I described earlier…

Hmm, I actually have a big list of requirements for message processing that
we’re currently handling outside RT. I’ll send current design ideas
along in a separate message.

(it does make it a bit more trouble for the API that deals with
attachments to transparently fetch them from the filesystem vs
database as necessary, but seems worthwhile)

nod It’s gotta be transparent to interface code. no question about that.

I don’t
want to store things on disk. it gets very very icky. But someone or
other had convinced me that it was “better” to do that, than to drop
those 1/2 gig attachments into a database that could choke. The right
thing to do is probably to figure out some nice db-neutral way to chunk
things and have per-db cutoffs.

Chunking seems like the wrong solution; either the database handles large
attachments correctly or it doesn’t. If it doesn’t, why waste time with
it? Drop them in the filesystem. Especially since (as per below),
they’re fixed in the next version.

FWIW, we’ve had no end of problems with dropping attachments in the
filesystem rather than the DB in 1.0.x. For sites where I have an RT instance,
I would personally rather have chunked content in the database than in
the filesystem. One of the things that we lose when storing in the filesystem
is the free searching that the database gets us.

:confused: I dunno, if I wanted to search message text, I wouldn’t use an SQL
database. Header fields in database fields perhaps, but for searching
message bodies I’d drop them in the filesystem and use something along the
lines of Isearch / Glimpse / ht://Dig

…anyway, I’m sure it will be configurable either way.

meow
_ivan