What collations for UCS distinguishing accented characters from their unaccented equivalents are avai

on 30.01.2011 22:34:16 by Filipus Klutiero

Hi,
an international site has content in several Latin-script languages, for
example English and French. Sometimes two pages, one in English and one in
French, have the same name except for an accent (for example,
"Demonstration" in English and "Démonstration" in French). The site's
database schema requires page names to be unique. When we tried to convert
the content to support UCS, this caused a problem, because the collation we
tried, utf8_unicode_ci, treats an accented letter and its unaccented
counterpart as the same character.
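
To make the collision concrete, here is a minimal Python sketch. It is only a rough approximation of an accent- and case-insensitive comparison, not MySQL's actual collation algorithm, and the helper names accent_insensitive_key and binary_key are made up for this illustration.

```python
import unicodedata


def accent_insensitive_key(name: str) -> str:
    """Hypothetical helper: roughly mimics an accent- and case-insensitive compare."""
    # NFD splits "é" into "e" followed by a combining acute accent (U+0301).
    decomposed = unicodedata.normalize("NFD", name)
    # Drop the combining marks, then fold case.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def binary_key(name: str) -> bytes:
    """Hypothetical helper: byte-wise comparison, in the spirit of a binary collation."""
    return name.encode("utf-8")


if __name__ == "__main__":
    english, french = "Demonstration", "Démonstration"
    # Same key: a unique index built on this comparison rejects the second page.
    print(accent_insensitive_key(english) == accent_insensitive_key(french))
    # Different bytes: a binary comparison keeps the two names apart,
    # but it also treats "Demonstration" and "demonstration" as different.
    print(binary_key(english) == binary_key(french))
```

Under the first key both page names compare equal, which is why the unique constraint fails. The byte-wise key separates them, at the cost of also becoming case-sensitive.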

What collations for UCS exist that distinguish accented characters from
their unaccented equivalents? I saw utf8_bin, but that seems very
different from utf8_unicode_ci, which I would like to use if it weren't
for this problem. I would like something as close to utf8_unicode_ci as
possible.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: What collations for UCS distinguishing accented characters from their unaccented equivalents are

on 31.01.2011 17:59:05 by Joerg Bruehe

Hi Filipus, all!


Filipus Klutiero wrote:
> Hi,
> an international site has some content in several latin languages, for
> example English and French. Sometimes 2 pages, one in English and one in
> French, have the same name except for an accent (for example, in
> English, Demonstration, in French, Démonstration). The site's database
> schema enforces page names to be unique.

To me, your approach sounds wrong: What would you do if two languages
used the same word with fully identical spelling, no accents involved?
For example, "demonstration" is used in both English and German.

If you want unique names in a multi-language setup, IMO you should make
the language (code) part of the page name. With the example above, you
might end up with "DE_demonstration" and "EN_demonstration", and then a
(missing) accent in "FR_demonstration" might not matter at all.
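
A minimal sketch of that idea, written as a variation: instead of prefixing the name, the language code is stored in its own column and uniqueness is enforced on the pair. It uses Python's built-in sqlite3 module purely as a stand-in for the real database; the table name pages, its columns, and the helpers create_pages_table and add_page are assumptions for illustration, not the site's actual schema.

```python
import sqlite3


def create_pages_table(conn: sqlite3.Connection) -> None:
    """Hypothetical schema: a page name only has to be unique within its language."""
    conn.execute(
        "CREATE TABLE pages ("
        " lang TEXT NOT NULL,"
        " name TEXT NOT NULL,"
        " UNIQUE (lang, name))"
    )


def add_page(conn: sqlite3.Connection, lang: str, name: str) -> bool:
    """Insert a page; return False if (lang, name) is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO pages (lang, name) VALUES (?, ?)", (lang, name)
            )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_pages_table(conn)
    print(add_page(conn, "EN", "demonstration"))  # new (lang, name) pair
    print(add_page(conn, "DE", "demonstration"))  # same name, other language
    print(add_page(conn, "EN", "demonstration"))  # duplicate within EN
```

With the language in the key, identical names in different languages no longer collide, whichever collation the name column ends up using.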

> [[...]]
>


HTH,
Jörg

-- 
Joerg Bruehe, MySQL Build Team, joerg.bruehe@oracle.com
ORACLE Deutschland B.V. & Co. KG, Komturstrasse 18a, D-12099 Berlin
Geschaeftsfuehrer: Juergen Kunz, Marcel v.d. Molen, Alexander v.d. Ven
Amtsgericht Muenchen: HRA 95603

