How to serialize RT with externalized attachments?


#1

We have ~30G of externalized attachments and a ~4G db left.

Now we want to move from MySQL to PostgreSQL. The proposed process for this (see here for example) is basically:

  1. Serialize the data (rt-serialize)
  2. Set DatabaseType to Pg in RT_SiteConfig.pm
  3. Initialize a new pg-database (rt-setup-database)
  4. Import the data (rt-import)

I initially worked on a dev machine where the externalized data was not present. I created the MySQL db there from a backup of the live system. This works well, except that the externalized data is not shown when viewing tickets.

But when I try to serialize the database, it complains about all the missing attachments with messages like:

Failed to load 7d526b07fdddbe548704a7d60466f5fb93feebc13c1bec6244ef3c7edacbcb3c from external storage: File does not exist
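
To see how many attachments are affected before copying anything, the warnings can be collected from a captured log. This is only a rough sketch: the pattern is modelled on the single message quoted above, and `missing_digests` is a made-up helper name, not part of RT.

```python
import re

# Pattern modelled on the one error line quoted above. That every
# storage key is a lowercase hex digest is an assumption.
MISSING_RE = re.compile(
    r"Failed to load ([0-9a-f]+) from external storage: File does not exist"
)


def missing_digests(log_lines):
    """Return the unique storage keys reported as missing, in first-seen order."""
    seen = {}
    for line in log_lines:
        match = MISSING_RE.search(line)
        if match:
            # dict keys keep insertion order and drop duplicates
            seen.setdefault(match.group(1), None)
    return list(seen)
```

Feeding it the lines of the captured serializer output gives the list of keys that would have to be present in external storage for the run to be clean.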

The result is that the references in the serialized data are set to null. This can be solved by first copying all the externalized data to the dev machine. The serialization then works without errors, but the already externalized data is also included in the serialized output (~34G vs. ~4G).

This is very inconvenient. The externalized data is, in all practical terms, already serialized (stored in separate files on disk), so I cannot see any reason why it should be serialized again.

To proceed I would now need to import all data (including the previously externalized data) back into the new database just to externalize it once again.


So my question is:

Is there a way for me to avoid having to serialize the externalized data and just serialize the database-content?


#2

If serializing does not change the filenames of the external data, you could create zero-length copies of the external data set before serializing. Then, once the data is in the new form, copy the real data over the zero-length files. [The files might need minimal data for serialize to work, i.e. one or more bytes so that they are not empty.]
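
A minimal sketch of the placeholder idea, assuming the external storage is a plain directory tree on disk and that file names must stay unchanged. `make_placeholders` and its arguments are hypothetical names for illustration, and whether rt-serialize accepts such stand-in files is untested here.

```python
import os


def make_placeholders(src_root, dst_root, filler=b"\0"):
    """Mirror src_root under dst_root, writing `filler` instead of real content.

    Returns the number of placeholder files written. The one-byte default
    follows the suggestion above that the files should not be empty.
    """
    count = 0
    for dirpath, _dirnames, filenames in os.walk(src_root):
        # Recreate the same relative directory layout under dst_root
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.join(dst_root, rel)
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            with open(os.path.join(target_dir, name), "wb") as handle:
                handle.write(filler)
            count += 1
    return count
```

Copying the real tree over `dst_root` afterwards would restore the actual contents under the same names.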

If serializing does change the filenames, then you might be able to make a patch that serializes the name rather than the contents of the external data.

/jeff


#3

Good thinking Jeff.

But I do not think the zeroing will work. When the database is serialized (with rt-serialize), the already externalized content is moved back into the serialized data as if it had never been externalized. In other words, after the import the new database would contain all data, including the previously externalized data.

I have been trying to create the patch for a while now, but unfortunately I'm not fluent in Perl just yet, so I have not been able to find out exactly how the serialization works and what to change to make it work the way I want. But YES, I would like to add an option, something like --keep-externalized-data-external, to rt-serialize. If you know how to make such a patch, or if you have any tips on where to look, any help would be appreciated.


#4

Since this has been a major annoyance for me (and, I guess, eventually for others), I finally managed to create a patch and a pull request for this on GitHub:

Until it makes it into a stable release (if ever), you can find the code there.

Have fun!