Storage of UTF-8 char in MySQL

on 14.08.2010 16:28:50 by Ryan Chan

According to this document:
http://dev.mysql.com/doc/refman/5.0/en/charset-unicode.html

It says MySQL supports UTF-8 using one to three bytes per character.

But I have created a test table:

-- create table test ( c char(5) ) default charset =utf8;

From the table status, the data length is always a multiple of 16.

So how does it support 3-byte UTF-8 in practice?


Thanks

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Storage of UTF-8 char in MySQL

on 16.08.2010 09:46:00 by Werner Van Belle

Ryan Chan wrote:
> According to this document:
> http://dev.mysql.com/doc/refman/5.0/en/charset-unicode.html
>
> It says MySQL supports UTF-8 using one to three bytes per character.
>
> But I have created a test table:
>
> -- create table test ( c char(5) ) default charset =utf8;
>
> From the table status, the data length is always a multiple of 16.
>
> So how does it support 3-byte UTF-8 in practice?
>
I'm afraid you might need to read up on UTF-8 and Unicode in general.
It's not a 'choice' to have 1, 2 or 3 bytes per character. Rather, when
a character is sufficiently weird (that is, outside the ASCII range),
UTF-8 will use 2 or 3 bytes for that specific character only. Only if
your entire message is weird will each character consume 3 bytes.
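
To make the point above concrete, here is a minimal sketch in Python (not MySQL itself). The sample characters and the helper name utf8_length are assumptions picked for illustration, not something from the original question. It encodes a few characters as UTF-8 and counts the bytes each one needs:

```python
# Illustrative only: the sample characters are assumptions chosen to cover
# the 1-, 2- and 3-byte cases of UTF-8.
samples = ["a", "\u00e9", "\u20ac"]  # 'a', e with acute accent, euro sign


def utf8_length(text: str) -> int:
    """Return the number of bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


for ch in samples:
    print(f"U+{ord(ch):04X} takes {utf8_length(ch)} byte(s)")

# A mostly-ASCII string only pays extra for its non-ASCII characters.
word = "caf\u00e9"
print(len(word), "characters,", utf8_length(word), "bytes")
```

Only the accented letter costs an extra byte in the last string; the three ASCII letters still take one byte each.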

Wkr,

Werner,-

--
http://werner.yellowcouch.org/




Re: Storage of UTF-8 char in MySQL

on 16.08.2010 12:06:45 by Joerg Bruehe

Ryan, all:


Ryan Chan wrote:
> According to this document:
> http://dev.mysql.com/doc/refman/5.0/en/charset-unicode.html
>
> It says MySQL supports UTF-8 using one to three bytes per character.
>
> But I have created a test table:
>
> -- create table test ( c char(5) ) default charset =utf8;
>
> From the table status, the data length is always a multiple of 16.

When you have a question about any software behavior (not limited to
MySQL), you really should specify the version and the platform.

No, we won't guess.


Jörg

--
Joerg Bruehe, MySQL Build Team, joerg.bruehe@oracle.com
ORACLE Deutschland B.V. & Co. KG, Komturstrasse 18a, D-12099 Berlin
Geschaeftsfuehrer: Juergen Kunz, Marcel v.d. Molen, Alexander v.d. Ven
Amtsgericht Muenchen: HRA 95603

