The Bug Genie team blog

What's cooking behind the scenes of The Bug Genie

The Bug Genie 3.2 and UTF

As discussed earlier on this blog, changes in The Bug Genie 3.2 to improve our support of Unicode may result in some data being mangled after upgrading. The technical reasons behind this have been explained before; this post just gives a brief overview of what happens and how to fix it, for those of you upgrading from previous releases.

Do I need this?

If you use PostgreSQL, or are performing a fresh installation, this is not necessary. In addition, this may not be necessary when you upgrade from a prior release, especially if you do not use special characters.

The fix, if necessary, should be applied as soon as possible after upgrading. Do not perform the fix before installing the 3.2 files. You may, if you wish, install the 3.2 files, run the upgrade script, and then explore your installation to see if the fix is necessary: if you see mangled characters in any text field, you will need to apply it. Do not edit any field before applying the fix, as any new special characters will be destroyed.

If only a few fields need correcting, you can always fix them by hand, but the fixes below are more efficient if there are many.
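If browsing the installation by hand is impractical, a rough check can be run against a database dump taken after the upgrade. The sketch below is an assumption on our part rather than an official tool: count_suspect_lines is a made-up helper name, and it simply counts lines containing the byte pair C3 83 (the character "Ã"), which usually shows up when UTF-8 text has been read as latin1 and encoded again. Text that legitimately contains "Ã" will also be counted, so treat the result as a hint only.

```shell
# Hypothetical helper: count lines in a file that contain the bytes C3 83 ("Ã"),
# a common sign of double-encoded UTF-8. Prints the number of matching lines.
count_suspect_lines() {
    # LC_ALL=C makes grep compare raw bytes instead of decoding characters.
    LC_ALL=C grep -c "$(printf '\303\203')" "$1"
}
```

For example, count_suspect_lines check.sql, where check.sql is an ordinary mysqldump of the upgraded database (the file name is only an example). A result of 0 suggests the fix may not be needed.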

The problem

We did not correctly handle the connection to the database in The Bug Genie 3.0 and 3.1. This meant there were occasional issues with special characters being mangled or lost in various places, such as fields in an issue. The connection opened to the server was not Unicode, and therefore the data was not stored in a Unicode fashion, leading to problems.

In The Bug Genie 3.2, we do create a proper Unicode connection, meaning the mangled data will now be shown as-is. While there were cases previously where the data was correctly shown, by ensuring the problem is resolved properly now, we avoid potential issues in the future.
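To make the mangling concrete, here is a small shell sketch of what happens when UTF-8 bytes are treated as latin1 and then converted to UTF-8 again. It assumes the iconv utility is available. The function names mangle and unmangle are made up for this illustration and are not part of The Bug Genie, and ISO-8859-1 stands in for MySQL's latin1, which differs slightly for a few bytes.

```shell
# Illustration only: mangle and unmangle are hypothetical helper names.

# Treat the incoming bytes as latin1 and re-encode them as UTF-8.
# UTF-8 text that passes through this ends up double-encoded.
mangle() {
    iconv -f ISO-8859-1 -t UTF-8
}

# Reverse the step above: converting back to latin1 recovers the
# original UTF-8 bytes.
unmangle() {
    iconv -f UTF-8 -t ISO-8859-1
}
```

On a UTF-8 terminal, printf 'caf\303\251\n' | mangle prints cafÃ© instead of café, and piping that result through unmangle gives café back. The dump and restore described in the solution section relies on the same idea, applied to the whole database.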

The solution

There are two solutions available. If necessary, you should apply one before upgrading, but you can always check afterwards to see if a fix is necessary. You can apply the fix after upgrading as long as you have not added any Unicode characters to the database before applying it, as these will be destroyed by the fix.

If you have command line access to the server:

The following commands will resolve the issue. A database dump is made in the non-UTF format we used to use in The Bug Genie, and this is then restored in the correct format. The database is also recreated to ensure it is in UTF format, so please make sure you have the right permissions. We assume a database called thebuggenie and a user called root; change these if necessary.

mysqldump -h localhost --user=root -p --default-character-set=latin1 -c --insert-ignore --skip-set-charset -r dump.sql thebuggenie
mysql --user=root -p --execute="DROP DATABASE thebuggenie; CREATE DATABASE thebuggenie CHARACTER SET utf8 COLLATE utf8_general_ci;"
mysql --user=root --max_allowed_packet=16M -p --default-character-set=utf8 thebuggenie < dump.sql
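Because the second command drops the database, it may be worth checking dump.sql after the first command and before running the other two. The sketch below is one possible check and not part of the procedure above: it assumes iconv is installed and uses a made-up helper name, is_valid_utf8. If the old data really was UTF-8 written through a non-Unicode connection, the dump should be valid UTF-8. Binary data in the dump could make the check fail even when the text is fine.

```shell
# Hypothetical helper: succeed (exit status 0) if the given file is valid UTF-8.
# iconv stops with a non-zero status at the first invalid byte sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}
```

For example, is_valid_utf8 dump.sql && echo "dump.sql looks like UTF-8". It is also sensible to keep a copy of dump.sql somewhere safe until you have confirmed that the restored database looks right.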

If you don’t have command line access to the server, but you do have phpMyAdmin:

You can also apply the fix using phpMyAdmin. The trick is to change the connection collation, which can be done via the box on the front page:

Connection Collation box on the phpMyAdmin front page

This should be set to latin1 when taking the database export (leave the file as UTF-8), then set back to utf8_general_ci when recreating the database and importing. The database collation should be set to utf8_general_ci, and this can be set via the box to the right of the database name field:

Database creation field

Please remember to dump just the data of the database, not the structure and data. This can be done by selecting the relevant option when exporting; you may have to choose the option ‘Custom – display all possible options’ first.

Written by lsproc

January 4, 2012 at 00:09
