
[tlug] unable to create local copy utf8 encoded Japanese MySQL data
- Date: Tue, 14 Feb 2006 12:42:22 +0900
- From: Dave Gutteridge <dave@example.com>
- User-agent: Mozilla Thunderbird 1.0.7 (X11/20051013)
TLUG,
This question is more MySQL than specifically Linux, although I'm
running MySQL exclusively on Linux. The thing is that I posted this
question to the MySQL mailing list, but I'm seeing no action there
because there don't seem to be many people familiar with encoding
issues. So I'm turning to you guys in hopes that Japanese and utf-8
encoding is something you can help with.
I have a MySQL database that I am trying to copy from my hosting
service, where it was created, to my home machine, where I now run
Ubuntu Linux and I want to create a full testing and development
environment. The database is in utf8 encoding, and has a mix of English
and Japanese.
It might be relevant that the data has been around for a few years,
and when it was created, the hosting service was running MySQL 3.2 (not
exactly sure, but somewhere in the 3 series), which did not have utf-8
support. So there is a mix of how the Japanese is stored in the
database, as described below.
Currently, both the hosting service and my home computer are running
MySQL 4.1, which has decent utf-8 support.
So, to the specific problem. I've exported an .sql file from my
hosting service, with structure and data, and copied it to my home machine.
I can take the .sql file, open it in OpenOffice Writer as an encoded
text file, and verify that it is encoded in utf-8. Most of the
Japanese text shows up readable. Some of it, however, shows up as coded
numbers (I'm not sure what the term is when utf displays this way):
&#12513;&#12540;&#12531;&#12539; I think this might be "legacy" data,
held over from the days when MySQL did not have utf8 support.
When I import the .sql file into MySQL, I can look at it in phpMyAdmin
and see that the text that displayed correctly as Japanese in OpenOffice
still displays correctly as Japanese. The text that was in number form
is also still in number form when viewed through phpMyAdmin. In short,
phpMyAdmin sees it after import the same way that OpenOffice did before
import.
But then, when I view a PHP page in Firefox that accesses the
database, the situation changes. The text that is encoded as numbers
displays as correct Japanese. The text that displays as actual Japanese
text in OpenOffice and phpMyAdmin now displays as question marks.
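One possible explanation for this swap, offered as an assumption rather
than a diagnosis: if the connection between PHP and MySQL is using a
single-byte character set such as latin1, real Japanese characters
cannot be carried and get replaced with "?", while the numeric codes are
plain ASCII, pass through untouched, and are decoded by the browser. A
minimal Python sketch of that effect (the sample strings and the
through_latin1 helper are made up for illustration):

```python
import html

raw_japanese = "メーン"                     # stored as real UTF-8 characters
as_numbers = "&#12513;&#12540;&#12531;"    # stored as numeric codes (plain ASCII)

def through_latin1(text: str) -> str:
    """Simulate a connection that can only carry latin1 characters."""
    return text.encode("latin-1", errors="replace").decode("latin-1")

# Real Japanese characters do not fit in latin1 and become question marks.
print(through_latin1(raw_japanese))

# Numeric codes survive the trip; html.unescape stands in for the browser.
print(html.unescape(through_latin1(as_numbers)))
```

If this is what is happening, the data in the table is intact and only
the trip from MySQL to PHP is lossy.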
Again, just to be clear, all Japanese characters, regardless of how
they look in phpMyAdmin, display correctly when viewed from the hosting
service.
The ideal scenario is for the text that displays as proper Japanese
characters in OpenOffice and phpMyAdmin to display as proper Japanese
characters when viewed via PHP/Browser. I am willing to go through the
database with a fine-tooth comb and replace the numbered utf8 characters
with the correct Japanese. But I'm not willing to do the reverse, and
make the database completely non-human readable when viewed through
phpMyAdmin.
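The fine-tooth-comb replacement could probably be scripted against the
.sql dump rather than done by hand. Here is a rough sketch, assuming the
numbered text is in the decimal or hexadecimal "&#...;" form; the
function name and the sample INSERT line (including the table name
"pages") are hypothetical:

```python
import re

# Matches decimal (&#12513;) and hexadecimal (&#x30E1;) numeric codes only.
NUMERIC_REF = re.compile(r"&#([0-9]+|[xX][0-9a-fA-F]+);")

def decode_numeric_refs(text: str) -> str:
    """Replace numeric codes for non-ASCII characters with the characters."""
    def _replace(match: re.Match) -> str:
        ref = match.group(1)
        code = int(ref[1:], 16) if ref[0] in "xX" else int(ref)
        # Leave ASCII codes alone so a decoded quote or backslash cannot
        # break the surrounding SQL string; also skip invalid code points.
        if code < 128 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
            return match.group(0)
        return chr(code)
    return NUMERIC_REF.sub(_replace, text)

sample = "INSERT INTO pages VALUES (1, '&#12513;&#12540;&#12531; &amp; &#39;');"
print(decode_numeric_refs(sample))
```

Named entities such as &amp; and ASCII codes such as &#39; are left as
they are, so only the Japanese text changes. To process a whole dump,
the file would need to be read and written back with encoding="utf-8".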
I hope someone can shed some light on this.
Thank you.
--
Dave M G