MySQL->Oracle DB Migration & Binary Attachment data corruption - UTF8 issue?

Hello,

Here are the pertinent details for our attempted migration:

Suse Linux 9.3 x86 32-bit
MIGRATING: MySQL 4.0.18 to Oracle 9i 9.2.0.6.0
RT 3.6.3
Perl 5.8.3
DBIx::SearchBuilder 1.45, DBD::Oracle 1.19
Apache 2.0.49
Oracle SQL Developer & MySQL Migration Workbench plugin (latest)

Issue: Binary Attachments & MySQL->Oracle DB Migration

Does anyone have experience with a successful MySQL 4.x to Oracle 9x RT
migration?

  1. We have a perfectly functioning RT instance on this maching using either
    MySQL or Oracle database type. I have done both installs, and both work
    independently.

  2. After some trial and error while migrating to the Oracle database,
    things seem to function much better if I allow RT to create the initial DB
    Schema, rather than attempt to have the Oracle tool migrate the schema from
    MySQL.

  3. I have used the Oracle SQL Developer to migrate the data from the MySQL
    to the Oracle instance.

  4. The database contents seem to have migrated without any issue, however,
    a primary issue is the binary attachments appear to be corrupt coming from
    the migrated Oracle DB.

  5. Upon comparing the contents of an attachment bit-for-bit from the MySQL
    to Oracle tables, the data in each is identical post migration!!

  6. Text attachments work fine, migrated just fine. Tickets, users, etc.
    migrated successfully.

  7. There appears to be a decoding issue with the Oracle binary attachments
    (from the migrated MySQL db) in that they seem to grow in size approximately
    23% upon downloading, and are corrupted as a result. The data obtained
    through RT (downloading a ticket attachment, both WEB and CMDLINE) is NOT
    the expected data stored in the Oracle table. Perhaps this is a UTF8 issue
    of some sort?

  8. Hex dump (snippet) of data (Correct in the database, Incorrect produced
    out of RT’s migrated database):

Correct data: d0 cf 11 e0 a1 b1 1a e1 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 3e 00 03 00 fe ff 09 00
Incorrect data: ef bf bd ef bf bd 11 ef bf bd ef bf bd ef bf bd 1a ef bf
bd 00 00 00 00 00 00 00 00 00 00 00 00

Note the 8-bit characters from the correct data above being replaced by:
0xEF 0xBF 0xBD in what is obtained from RT in the incorrect data line
above.

  1. How do we overcome this so that previous (migrated) attachments work
    correctly, and going forward all new attachments appear correctly as well?

Any thoughts or suggestions?

Thanks!

Todd

Todd Williams wrote:

Hello,

Here are the pertinent details for our attempted migration:

Suse Linux 9.3 x86 32-bit
MIGRATING: MySQL 4.0.18 to Oracle 9i 9.2.0.6.0
RT 3.6.3
Perl 5.8.3
DBIx::SearchBuilder 1.45, DBD::Oracle 1.19
Apache 2.0.49
Oracle SQL Developer & MySQL Migration Workbench plugin (latest)

Issue: Binary Attachments & MySQL->Oracle DB Migration

Does anyone have experience with a successful MySQL 4.x to Oracle 9x RT
migration?

  1. We have a perfectly functioning RT instance on this maching using
    either MySQL or Oracle database type. I have done both installs, and
    both work independently.

  2. After some trial and error while migrating to the Oracle database,
    things seem to function much better if I allow RT to create the initial
    DB Schema, rather than attempt to have the Oracle tool migrate the
    schema from MySQL.

  3. I have used the Oracle SQL Developer to migrate the data from the
    MySQL to the Oracle instance.

  4. The database contents seem to have migrated without any issue,
    however, a primary issue is the binary attachments appear to be corrupt
    coming from the migrated Oracle DB.

  5. Upon comparing the contents of an attachment bit-for-bit from the
    MySQL to Oracle tables, the data in each is identical post migration!!

  6. Text attachments work fine, migrated just fine. Tickets, users,
    etc. migrated successfully.

  7. There appears to be a decoding issue with the Oracle binary
    attachments (from the migrated MySQL db) in that they seem to grow in
    size approximately 23% upon downloading, and are corrupted as a result.
    The data obtained through RT (downloading a ticket attachment, both WEB
    and CMDLINE) is NOT the expected data stored in the Oracle table.
    Perhaps this is a UTF8 issue of some sort?

  8. Hex dump (snippet) of data (Correct in the database, Incorrect
    produced out of RT’s migrated database):

Correct data: d0 cf 11 e0 a1 b1 1a e1 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 3e 00 03 00 fe ff 09 00
Incorrect data: ef bf bd ef bf bd 11 ef bf bd ef bf bd ef bf bd 1a ef
bf bd 00 00 00 00 00 00 00 00 00 00 00 00

Note the 8-bit characters from the correct data above being replaced by:
0xEF 0xBF 0xBD in what is obtained from RT in the incorrect data line
above.

  1. How do we overcome this so that previous (migrated) attachments work
    correctly, and going forward all new attachments appear correctly as well?

I’m using Oracle too, Oracle XE. The way rows in the attachments table
look is that they are base64 encoded. You can check that by starting
from scratch and entering one ticket with a picture of word document
attachment. You’ll see that the content column contains the base64
encoded data and contenttype and contentencoding columns contain the
values needed to reconstruct the attachment. What probably is happening
is that the migration wizard has transferred all the data from MySQL to
Oracle but didn’t take into account that RT sees Oracle as a binary
unsafe database, thats why the content column is base64 encoded.
Depending on what is actually in the attachments table you might be able
to transform it to the right values.

If you need help, let me know.

Joop

Hi Joop,

Yes, that is what we found – all new attachments to the Oracle DB are
indeed Base64 encoded. Not sure why RT sees Oracle as binary unsafe
(perhaps earlier versions were determined as unsafe) but I suspected this
might be the case. However, in our case, the migrated attachment data can
be manually extracted from Oracle verbatim and compared with what was in the
MySQL database, binary compatible bit for bit. It’s all in how RT is
interpreting the attachment data coming from the newly migrated Oracle
database tables.

Could we easily address the interpretation issue within the RT code somehow,
or should we address the issue at the database level where there are 700+
tickets with multiple migrated attachments (automate the recoding of all
attachments)? I have looked at several places in the RT code and it is not
painfully obvious where one might attempt to work around this issue.

Thanks for all your help and guidance in advance,

ToddOn 6/13/07, Joop van de Wege JoopvandeWege@mococo.nl wrote:

Todd Williams wrote:

Hello,

Here are the pertinent details for our attempted migration:

Suse Linux 9.3 x86 32-bit
MIGRATING: MySQL 4.0.18 to Oracle 9i 9.2.0.6.0
RT 3.6.3
Perl 5.8.3
DBIx::SearchBuilder 1.45, DBD::Oracle 1.19
Apache 2.0.49
Oracle SQL Developer & MySQL Migration Workbench plugin (latest)

Issue: Binary Attachments & MySQL->Oracle DB Migration

Does anyone have experience with a successful MySQL 4.x to Oracle 9x RT
migration?

  1. We have a perfectly functioning RT instance on this maching using
    either MySQL or Oracle database type. I have done both installs, and
    both work independently.

  2. After some trial and error while migrating to the Oracle database,
    things seem to function much better if I allow RT to create the initial
    DB Schema, rather than attempt to have the Oracle tool migrate the
    schema from MySQL.

  3. I have used the Oracle SQL Developer to migrate the data from the
    MySQL to the Oracle instance.

  4. The database contents seem to have migrated without any issue,
    however, a primary issue is the binary attachments appear to be corrupt
    coming from the migrated Oracle DB.

  5. Upon comparing the contents of an attachment bit-for-bit from the
    MySQL to Oracle tables, the data in each is identical post migration!!

  6. Text attachments work fine, migrated just fine. Tickets, users,
    etc. migrated successfully.

  7. There appears to be a decoding issue with the Oracle binary
    attachments (from the migrated MySQL db) in that they seem to grow in
    size approximately 23% upon downloading, and are corrupted as a result.
    The data obtained through RT (downloading a ticket attachment, both WEB
    and CMDLINE) is NOT the expected data stored in the Oracle table.
    Perhaps this is a UTF8 issue of some sort?

  8. Hex dump (snippet) of data (Correct in the database, Incorrect
    produced out of RT’s migrated database):

Correct data: d0 cf 11 e0 a1 b1 1a e1 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 3e 00 03 00 fe ff 09 00
Incorrect data: ef bf bd ef bf bd 11 ef bf bd ef bf bd ef bf bd 1a ef
bf bd 00 00 00 00 00 00 00 00 00 00 00 00

Note the 8-bit characters from the correct data above being replaced by:
0xEF 0xBF 0xBD in what is obtained from RT in the incorrect data line
above.

  1. How do we overcome this so that previous (migrated) attachments work
    correctly, and going forward all new attachments appear correctly as
    well?

I’m using Oracle too, Oracle XE. The way rows in the attachments table
look is that they are base64 encoded. You can check that by starting
from scratch and entering one ticket with a picture of word document
attachment. You’ll see that the content column contains the base64
encoded data and contenttype and contentencoding columns contain the
values needed to reconstruct the attachment. What probably is happening
is that the migration wizard has transferred all the data from MySQL to
Oracle but didn’t take into account that RT sees Oracle as a binary
unsafe database, thats why the content column is base64 encoded.
Depending on what is actually in the attachments table you might be able
to transform it to the right values.

If you need help, let me know.

Joop

Hi Joop,

Yes, that is what we found – all new attachments to the Oracle DB are
indeed Base64 encoded. Not sure why RT sees Oracle as binary unsafe
(perhaps earlier versions were determined as unsafe) but I suspected this
might be the case. However, in our case, the migrated attachment data can
be manually extracted from Oracle verbatim and compared with what was in the
MySQL database, binary compatible bit for bit. It’s all in how RT is
interpreting the attachment data coming from the newly migrated Oracle
database tables.

Could we easily address the interpretation issue within the RT code somehow,
or should we address the issue at the database level where there are 700+
tickets with multiple migrated attachments (automate the recoding of all
attachments)? I have looked at several places in the RT code and it is not
painfully obvious where one might attempt to work around this issue.

Thanks for all your help and guidance in advance,

Todd

Todd,

PostgreSQL suffers from the same classification as Oracle in this
regard. It should be very easy to run through your attachments table
and just re-encode your contents to base64.

Ken> On 6/13/07, Joop van de Wege JoopvandeWege@mococo.nl wrote:

Todd Williams wrote:

Hello,

Here are the pertinent details for our attempted migration:

Suse Linux 9.3 x86 32-bit
MIGRATING: MySQL 4.0.18 to Oracle 9i 9.2.0.6.0
RT 3.6.3
Perl 5.8.3
DBIx::SearchBuilder 1.45, DBD::Oracle 1.19
Apache 2.0.49
Oracle SQL Developer & MySQL Migration Workbench plugin (latest)

Issue: Binary Attachments & MySQL->Oracle DB Migration

Does anyone have experience with a successful MySQL 4.x to Oracle 9x RT
migration?

  1. We have a perfectly functioning RT instance on this maching using
    either MySQL or Oracle database type. I have done both installs, and
    both work independently.

  2. After some trial and error while migrating to the Oracle database,
    things seem to function much better if I allow RT to create the initial
    DB Schema, rather than attempt to have the Oracle tool migrate the
    schema from MySQL.

  3. I have used the Oracle SQL Developer to migrate the data from the
    MySQL to the Oracle instance.

  4. The database contents seem to have migrated without any issue,
    however, a primary issue is the binary attachments appear to be corrupt
    coming from the migrated Oracle DB.

  5. Upon comparing the contents of an attachment bit-for-bit from the
    MySQL to Oracle tables, the data in each is identical post migration!!

  6. Text attachments work fine, migrated just fine. Tickets, users,
    etc. migrated successfully.

  7. There appears to be a decoding issue with the Oracle binary
    attachments (from the migrated MySQL db) in that they seem to grow in
    size approximately 23% upon downloading, and are corrupted as a result.
    The data obtained through RT (downloading a ticket attachment, both WEB
    and CMDLINE) is NOT the expected data stored in the Oracle table.
    Perhaps this is a UTF8 issue of some sort?

  8. Hex dump (snippet) of data (Correct in the database, Incorrect
    produced out of RT’s migrated database):

Correct data: d0 cf 11 e0 a1 b1 1a e1 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 3e 00 03 00 fe ff 09 00
Incorrect data: ef bf bd ef bf bd 11 ef bf bd ef bf bd ef bf bd 1a ef
bf bd 00 00 00 00 00 00 00 00 00 00 00 00

Note the 8-bit characters from the correct data above being replaced by:
0xEF 0xBF 0xBD in what is obtained from RT in the incorrect data line
above.

  1. How do we overcome this so that previous (migrated) attachments work
    correctly, and going forward all new attachments appear correctly as
    well?

I’m using Oracle too, Oracle XE. The way rows in the attachments table
look is that they are base64 encoded. You can check that by starting
from scratch and entering one ticket with a picture of word document
attachment. You’ll see that the content column contains the base64
encoded data and contenttype and contentencoding columns contain the
values needed to reconstruct the attachment. What probably is happening
is that the migration wizard has transferred all the data from MySQL to
Oracle but didn’t take into account that RT sees Oracle as a binary
unsafe database, thats why the content column is base64 encoded.
Depending on what is actually in the attachments table you might be able
to transform it to the right values.

If you need help, let me know.

Joop


The rt-users Archives

Community help: http://wiki.bestpractical.com
Commercial support: sales@bestpractical.com

Discover RT’s hidden secrets with RT Essentials from O’Reilly Media.
Buy a copy at http://rtbook.bestpractical.com

Perhaps someone out there has already written something similar to do this,
so that we need not re-invent the wheel?

On another note, I cannot seem to find the RT3 MySQL to Oracle conversion
script/info that has been mentioned in some older posts. It is not on the
BestPractical site that I can find, and Google apparently used to come up
with a cached copy of this script/info, but that seems to be no longer
available as well. (Google for “RT3MySQL2Oracle”)

Any guidance would be much appreciated.

Thanks and regards.On 6/14/07, Kenneth Marshall ktm@rice.edu wrote:

On Wed, Jun 13, 2007 at 04:26:08PM -0400, Todd Williams wrote:

Hi Joop,

Yes, that is what we found – all new attachments to the Oracle DB are
indeed Base64 encoded. Not sure why RT sees Oracle as binary unsafe
(perhaps earlier versions were determined as unsafe) but I suspected
this
might be the case. However, in our case, the migrated attachment data
can
be manually extracted from Oracle verbatim and compared with what was in
the
MySQL database, binary compatible bit for bit. It’s all in how RT is
interpreting the attachment data coming from the newly migrated Oracle
database tables.

Could we easily address the interpretation issue within the RT code
somehow,
or should we address the issue at the database level where there are
700+
tickets with multiple migrated attachments (automate the recoding of all
attachments)? I have looked at several places in the RT code and it is
not
painfully obvious where one might attempt to work around this issue.

Thanks for all your help and guidance in advance,

Todd

Todd,

PostgreSQL suffers from the same classification as Oracle in this
regard. It should be very easy to run through your attachments table
and just re-encode your contents to base64.

Ken

On 6/13/07, Joop van de Wege JoopvandeWege@mococo.nl wrote:

Todd Williams wrote:

Hello,

Here are the pertinent details for our attempted migration:

Suse Linux 9.3 x86 32-bit
MIGRATING: MySQL 4.0.18 to Oracle 9i 9.2.0.6.0
RT 3.6.3
Perl 5.8.3
DBIx::SearchBuilder 1.45, DBD::Oracle 1.19
Apache 2.0.49
Oracle SQL Developer & MySQL Migration Workbench plugin (latest)

Issue: Binary Attachments & MySQL->Oracle DB Migration

Does anyone have experience with a successful MySQL 4.x to Oracle 9x
RT
migration?

  1. We have a perfectly functioning RT instance on this maching using
    either MySQL or Oracle database type. I have done both installs, and
    both work independently.

  2. After some trial and error while migrating to the Oracle
    database,
    things seem to function much better if I allow RT to create the
    initial
    DB Schema, rather than attempt to have the Oracle tool migrate the
    schema from MySQL.

  3. I have used the Oracle SQL Developer to migrate the data from the
    MySQL to the Oracle instance.

  4. The database contents seem to have migrated without any issue,
    however, a primary issue is the binary attachments appear to be
    corrupt
    coming from the migrated Oracle DB.

  5. Upon comparing the contents of an attachment bit-for-bit from the
    MySQL to Oracle tables, the data in each is identical post
    migration!!

  6. Text attachments work fine, migrated just fine. Tickets, users,
    etc. migrated successfully.

  7. There appears to be a decoding issue with the Oracle binary
    attachments (from the migrated MySQL db) in that they seem to grow in
    size approximately 23% upon downloading, and are corrupted as a
    result.
    The data obtained through RT (downloading a ticket attachment, both
    WEB
    and CMDLINE) is NOT the expected data stored in the Oracle table.
    Perhaps this is a UTF8 issue of some sort?

  8. Hex dump (snippet) of data (Correct in the database, Incorrect
    produced out of RT’s migrated database):

Correct data: d0 cf 11 e0 a1 b1 1a e1 00 00 00 00 00 00 00 00 00
00
00 00 00 00 00 00 3e 00 03 00 fe ff 09 00
Incorrect data: ef bf bd ef bf bd 11 ef bf bd ef bf bd ef bf bd 1a
ef
bf bd 00 00 00 00 00 00 00 00 00 00 00 00

Note the 8-bit characters from the correct data above being replaced
by:
0xEF 0xBF 0xBD in what is obtained from RT in the incorrect data
line
above.

  1. How do we overcome this so that previous (migrated) attachments
    work
    correctly, and going forward all new attachments appear correctly as
    well?

I’m using Oracle too, Oracle XE. The way rows in the attachments table
look is that they are base64 encoded. You can check that by starting
from scratch and entering one ticket with a picture of word document
attachment. You’ll see that the content column contains the base64
encoded data and contenttype and contentencoding columns contain the
values needed to reconstruct the attachment. What probably is happening
is that the migration wizard has transferred all the data from MySQL to
Oracle but didn’t take into account that RT sees Oracle as a binary
unsafe database, thats why the content column is base64 encoded.
Depending on what is actually in the attachments table you might be
able
to transform it to the right values.

If you need help, let me know.

Joop


The rt-users Archives

Community help: http://wiki.bestpractical.com
Commercial support: sales@bestpractical.com

Discover RT’s hidden secrets with RT Essentials from O’Reilly Media.
Buy a copy at http://rtbook.bestpractical.com