Best way to convert character set from "latin1 to utf8" for existing database?
on 01.06.2009 08:41:05 by Uma Bhat
Hi All,
I have read many blogs suggesting some examples for this.
But suggestions from you guys who have ACTUALLY worked on such a scenario
would help me out the best.
Current Database has:
DEFAULT CHARACTER SET - latin1
DEFAULT COLLATION : latin1_swedish_ci
We need to convert this to
DEFAULT CHARACTER SET - utf8
DEFAULT COLLATION : utf8_general_ci
Note that this has to be done on a database that has *existing data* in it.
Hence just doing an:
ALTER DATABASE <dbname> CHARSET=utf8;
would result in unexpected behaviour of the data.
Thanks!
Uma
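A note on why the bare ALTER DATABASE is not enough: it only changes the default used for tables created afterwards, so existing tables and columns stay latin1. A minimal sketch, assuming the existing rows really are valid latin1 text, is to run ALTER TABLE ... CONVERT TO CHARACTER SET on each table. The helper below only builds those statements; the function name and the table names are hypothetical, and the statements should be tried on a copy of the data first.

```python
def build_convert_statements(tables, charset="utf8", collation="utf8_general_ci"):
    """Return one ALTER TABLE ... CONVERT TO statement per table name."""
    statements = []
    for table in tables:
        # Backticks quote the identifier; a literal backtick is doubled.
        quoted = "`" + table.replace("`", "``") + "`"
        statements.append(
            f"ALTER TABLE {quoted} CONVERT TO CHARACTER SET {charset} "
            f"COLLATE {collation};"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    for statement in build_convert_statements(["customers", "orders"]):
        print(statement)
```

If the columns hold bytes that are not really latin1 text, this conversion is the wrong tool, because it would encode those bytes a second time.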
Re: Best way to convert character set from "latin1 to utf8" for existing database?
on 01.06.2009 11:05:46 by ewen fortune
Uma,
On Mon, Jun 1, 2009 at 8:41 AM, Uma Bhat wrote:
> Hi All,
>
> I have read many blogs suggesting some examples for this.
> But suggestions from you guys who have ACTUALLY worked on such a scenario
> would help me out the best.
>
>
> Current Database has:
> DEFAULT CHARACTER SET - latin1
> DEFAULT COLLATION : latin1_swedish_ci
>
> We need to convert this to
> DEFAULT CHARACTER SET - utf8
> DEFAULT COLLATION : utf8_general_ci
>
>
> Note that this has to be done on a database that has *existing data* in it.
>
> Hence just by doing a:
>
> ALTER DATABASE <dbname> CHARSET=utf8;
>
> would result in unexpected behaviour of the data.
Ryan Lowe blogged about this.
http://www.mysqlperformanceblog.com/2009/03/17/converting-character-sets/
He wrote a tool for it (linked from the post):
http://www.pablowe.net/convert_charset
And Shlomi Noach commented that openark also has a tool:
http://code.openark.org/forge/openark-kit
Cheers,
Ewen
>
> Thanks!
> Uma
>
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org
Re: Best way to convert character set from "latin1 to utf8" for existing database?
on 08.06.2009 04:29:24 by Uma Bhat
That was a great piece of info, Ewen. Thanks!
However, this approach works only for new data. The existing data in the
database still does not show the Japanese characters on the application side.
I would appreciate responses from anyone who has 'actually' worked on this
conversion.
Thanks!
Uma
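One possible explanation, and it is only an assumption because the thread does not say how the Japanese text was written, is that the application stored UTF-8 bytes in latin1 columns. In that case a plain latin1-to-utf8 conversion encodes each byte a second time and the text comes back garbled. The sketch below reproduces the effect in Python, with Python's latin-1 codec as a stand-in for MySQL's latin1, and shows that reinterpreting the bytes recovers the text. The function names are hypothetical.

```python
def double_encode(text):
    """Simulate UTF-8 bytes stored in a latin1 column, then converted to utf8."""
    stored_bytes = text.encode("utf-8")              # what the application wrote
    seen_as_latin1 = stored_bytes.decode("latin-1")  # how a latin1 column labels them
    return seen_as_latin1.encode("utf-8")            # a naive latin1 -> utf8 conversion


def repair(double_encoded):
    """Undo the extra encoding step by reinterpreting the original bytes."""
    original_bytes = double_encoded.decode("utf-8").encode("latin-1")
    return original_bytes.decode("utf-8")


if __name__ == "__main__":
    original = "日本語"
    broken = double_encode(original)
    # The byte count grows because every non-ASCII byte was encoded again.
    print(len(original.encode("utf-8")), len(broken))
    print(repair(broken) == original)
```

On the server side, the MySQL manual describes the matching fix as changing the column to a binary type and then back to a text type with the right character set (for example BLOB, then TEXT CHARACTER SET utf8), which relabels the bytes without converting them.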
Re: Best way to convert character set from "latin1 to utf8" for existing database?
on 08.06.2009 04:52:22 by Darryle steplight
Uma,
I apologize in advance if this is redundant, because I did not
click on any of Ewen's links. Nonetheless, this is the approach I would
take.
Start your MySQL server with different --character-set-server and
--collation-server options.
Type SHOW COLLATION; in your mysql shell to determine which collations
are available for each character set.
If you change the character set while running MySQL, that may
also change the sort order, so you must run myisamchk -r -q
--set-collation=collation_name on all MyISAM tables or your indexes may
not be ordered correctly.
There are numerous collations for the utf8 charset, so I'm assuming
MySQL is selecting a collation that you don't want to use.
Additionally, if you did not run myisamchk on any of your MyISAM
tables, that may be why you are getting unexpected results. I hope this
helps.
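A minimal sketch of the myisamchk step, assuming the tables are not in use while it runs and that the MyISAM index files (*.MYI) sit in a data directory you pass in. The function name and the path are hypothetical; the commands are only built and printed here, not executed. As far as I know, the server options and the index rebuild affect defaults and sort order, not the bytes already stored in the rows.

```python
from pathlib import Path


def build_myisamchk_commands(data_dir, collation):
    """Return one myisamchk argument list per MyISAM index file (*.MYI)."""
    commands = []
    for index_file in sorted(Path(data_dir).glob("*.MYI")):
        commands.append(
            ["myisamchk", "-r", "-q", f"--set-collation={collation}", str(index_file)]
        )
    return commands


if __name__ == "__main__":
    # Hypothetical data directory, for illustration only.
    for command in build_myisamchk_commands("/var/lib/mysql/mydb", "utf8_general_ci"):
        print(" ".join(command))
```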
--
A: It reverses the normal flow of conversation.
Q: What's wrong with top-posting?
A: Top-posting.
Q: What's the biggest scourge on plain text email discussions?
Re: Best way to convert character set from "latin1 to utf8" for existing database?
on 08.06.2009 05:30:14 by chaim.rieger
Export schema
Export data
Change exported schema to utf8
Import schema into new db
Import exported data into new db
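The "change exported schema to utf8" step above can be sketched as follows. This is only an illustration, assuming the schema was exported as plain SQL text (for example with mysqldump --no-data) and that latin1 and latin1_swedish_ci are the only declarations that need to change. The function name and the sample table are hypothetical.

```python
import re


def rewrite_schema_charset(schema_sql):
    """Replace latin1 charset and collation declarations in a schema dump with utf8."""
    # Rewrite the collation name wherever it appears.
    schema_sql = re.sub(r"\blatin1_swedish_ci\b", "utf8_general_ci", schema_sql)
    # Matches CHARSET=latin1 and CHARACTER SET latin1 (with or without DEFAULT).
    return re.sub(
        r"\b(CHARSET|CHARACTER SET)(\s*=\s*|\s+)latin1\b",
        r"\g<1>\g<2>utf8",
        schema_sql,
        flags=re.IGNORECASE,
    )


if __name__ == "__main__":
    # Hypothetical table definition, for illustration only.
    sample = (
        "CREATE TABLE `notes` (\n"
        "  `body` text CHARACTER SET latin1 COLLATE latin1_swedish_ci\n"
        ") ENGINE=MyISAM DEFAULT CHARSET=latin1;"
    )
    print(rewrite_schema_charset(sample))
```

When exporting the data itself, the character set the dump is written in matters as well. mysqldump has a --default-character-set option for that, and which value is right depends on how the bytes were stored in the first place.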