The Bug Genie team blog

What's cooking behind the scenes of The Bug Genie

Database encoding issues

We’ve seen a rush of encoding-related problems recently, with symptoms ranging from popups not loading and odd errors occurring to text being displayed incorrectly. After much investigation, it seems that almost all of them are related to database encoding. (There are also some display and encoding/decoding/conversion issues inside thebuggenie itself; these are being corrected as well.)

Database charsets / encoding

For The Bug Genie to function properly, the database and its tables must use the utf-8 charset / encoding. Some users are using other charsets, such as latin-8 or cyrillic variants, and this causes a lot of issues because The Bug Genie expects utf-8. In addition, as The Bug Genie starts to use JSON more extensively, data stored in a non-utf-8 format increasingly becomes a problem, since JSON expects data to be encoded in unicode (with utf-8 as the default encoding). The Bug Genie creates its tables with utf-8 encoding by default, but the database collation setting, the default database encoding and other settings can still leave the stored data incompatible with the expected utf-8 encoding. We have also not communicated clearly that the database collation / encoding must be utf-8 for The Bug Genie to function properly. This will be improved in upcoming versions.
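To make the mismatch concrete, here is a minimal Python sketch (not code from The Bug Genie; the sample text and the helper names `decodes_as_utf8` and `to_json` are made up for illustration). It shows that text stored as latin-1 bytes is not valid utf-8, so it cannot be decoded and passed on as JSON the way utf-8 data can.

```python
import json

text = "Daniel André"

# The same text as it would be stored under two different encodings.
latin1_bytes = text.encode("latin-1")
utf8_bytes = text.encode("utf-8")


def decodes_as_utf8(raw: bytes) -> bool:
    """Return True if raw is a valid utf-8 byte sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_json(raw: bytes) -> str:
    """Decode raw as utf-8 and serialise the text as a JSON string."""
    return json.dumps(raw.decode("utf-8"), ensure_ascii=False)
```

Calling `to_json(utf8_bytes)` works, while `to_json(latin1_bytes)` raises `UnicodeDecodeError`, because the latin-1 byte for "é" is not a complete utf-8 sequence.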

Migrating data to utf-8

If you’re already experiencing utf-8 encoding problems or other weird issues in The Bug Genie, we recommend checking whether your database encoding and collation settings are all set to utf-8. If they are not, you should convert your database content to utf-8, both to fix most of the weird issues in The Bug Genie and to make sure your data won’t get corrupted in the future.
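One possible way to run that check on MySQL is to query `information_schema`, as in the rough sketch below. It assumes a MySQL database and a Python driver that uses `%s` placeholders; the constant names and the `non_utf8_tables` helper are hypothetical and not part of The Bug Genie.

```python
# Query sketches for MySQL's information_schema; run them with your own
# client or driver, passing the database name as the single parameter.
DATABASE_QUERY = (
    "SELECT default_character_set_name, default_collation_name "
    "FROM information_schema.SCHEMATA WHERE schema_name = %s"
)
TABLE_QUERY = (
    "SELECT table_name, table_collation "
    "FROM information_schema.TABLES WHERE table_schema = %s"
)


def non_utf8_tables(rows):
    """Return the names of tables whose collation is not a utf8 variant.

    rows is an iterable of (table_name, table_collation) pairs, in the
    shape TABLE_QUERY returns. Rows with no collation (for example views)
    are skipped.
    """
    return [
        name
        for name, collation in rows
        if collation is not None and not collation.lower().startswith("utf8")
    ]
```

Any table name this returns is a candidate for conversion. The result of `DATABASE_QUERY` can be inspected by eye in the same way.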

There are several good guides out there for converting database contents (as well as the database itself and its tables) to utf-8. Most of them involve the following steps:

  1. Dump your existing data to a .sql file
  2. Drop your existing database, and recreate it with the same name but with utf-8 encoding + collation specified
  3. Convert your .sql file from the old encoding to utf-8
  4. Import into your new database
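Step 3 can be done with any re-encoding tool. As an illustration only, here is a small Python sketch that re-encodes a dump file in chunks. The function name `convert_dump`, its parameters and the default source encoding of latin-1 are assumptions; use the encoding your data is actually stored in.

```python
def convert_dump(src_path, dst_path, src_encoding="latin-1", chunk_size=1 << 20):
    """Re-encode a text dump from src_encoding to utf-8, reading in chunks.

    newline="" keeps the original line endings untouched on both sides.
    """
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
```

For example, `convert_dump("dump.sql", "dump-utf8.sql", src_encoding="latin-1")` writes a utf-8 copy next to the original (both file names are placeholders). The sketch only re-encodes the bytes. Any charset declarations inside the dump itself are left unchanged and may need to be edited separately, as the guides linked below describe.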

Converting MySQL data to utf-8

MySQL docs: http://dev.mysql.com/doc/refman/5.1/en/charset-conversion.html

Gentoo Wiki: http://en.gentoo-wiki.com/wiki/Convert_latin1_to_UTF-8_in_MySQL

Converting Postgres data to utf-8

Postgres mailing list: http://archives.postgresql.org/pgsql-general/2006-03/msg01515.php

Bryan Murdock's blog: http://bryan-murdock.blogspot.com/2008/11/convert-postgresql-database-from-latin1.html

There are also several bug reports for these issues in The Bug Genie. We will continue to investigate and improve these and related encoding issues in The Bug Genie source code as well.

Written by Daniel André

June 27, 2011 at 10:08
