#1: Re: Strange "Â" character output when using simplex

Posted on 2008-04-07 15:19:28 by bizt

On 25 Feb, 10:56, Toby A Inkster <usenet200...@tobyinkster.co.uk>
wrote:
> Andy Hassall wrote:
> > bizt <bissa...@yahoo.co.uk> wrote:
>
> >> I am converting an XML string using the simplexml_load_string function. It is
> >> giving me a Â character for some reason dotted around the text.
>
> > simplexml always outputs in UTF-8. Is your page's encoding UTF-8?
>
> At a guess, ISO-8859-1 or perhaps ISO-8859-15.
>
> In UTF-8, a "prefix" of an 0xC2 byte is used to access the first half of the
> "Latin-1 Supplement" block which includes a lot of juicy characters such
> as currency symbols, fractions, superscript 2 and 3, the copyright and
> registered trademark symbols, and the non-breaking space.
>
> However in ISO-8859-1 and -15, the byte 0xC2 represents an Â, so if UTF-8
> is misinterpreted as one of those, then you get Â followed by some other
> nonsense character.
>
> Probably the easiest solution would be to take the output from SimpleXML
> and pass it through iconv():
>
>         $xmlout = iconv('UTF-8', 'ISO-8859-15//TRANSLIT', $xmlout);
>
> Note that UTF-8 is capable of representing a far greater range of
> characters than ISO-8859-1/-15 are, so certain characters may not properly
> survive conversion. (Using the '//TRANSLIT' option tells iconv to do its
> best, and if, say, a particular accented character is not available in
> ISO-8859-1, then to substitute an unaccented one in its place.)
>
> --
> Toby A Inkster BSc (Hons) ARCS
> [Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux]
> [OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 26 days, 15:55.]
>
> Bottled Water
> http://tobyinkster.co.uk/blog/2008/02/18/bottled-water/

Hi, I've tried what you said, which worked for one of my pages, but when
I tried it on another I got the following:

Notice: iconv() [function.iconv]: Detected an illegal character in
input string in /home/public_html/search_apartments.php on line 67

I'm using the following to convert my XML string, which is fetched via
cURL:

$result = iconv('UTF-8', 'ISO-8859-15//TRANSLIT', $result);

Could it be that I'm not giving iconv() the correct input encoding for
my $result string? If so, is there a way for me to detect the input
encoding?

Cheers

Martyn
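
On the question of detecting the input encoding: reliable detection is generally not possible from the bytes alone, but it is possible to check whether a string is well-formed UTF-8 before handing it to iconv(). The following is a minimal sketch, assuming the mbstring extension is available. The helper name toLatin9() is hypothetical and not from the thread, and treating non-UTF-8 input as already being in a single-byte encoding is an assumption that may not hold for every source.

```php
<?php
// Hypothetical helper (not from the thread): convert to ISO-8859-15 only
// when the input really is well-formed UTF-8.
function toLatin9(string $input): string
{
    // mb_check_encoding() returns false if the bytes are not valid UTF-8,
    // which is the situation that makes iconv() report an illegal character.
    if (!mb_check_encoding($input, 'UTF-8')) {
        // Assumption: such input is already single-byte text, so it is
        // returned unchanged rather than converted.
        return $input;
    }

    $converted = iconv('UTF-8', 'ISO-8859-15//TRANSLIT', $input);

    // iconv() returns false on failure; fall back to the original bytes.
    return $converted === false ? $input : $converted;
}
```

With this in place, `$result = toLatin9($result);` could stand in for the direct iconv() call, so a page that already delivers ISO-8859-15 bytes is left alone instead of triggering the notice.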

#2: Re: Strange "Â" character output when usin

Posted on 2008-04-08 17:00:00 by AnrDaemon

Greetings, bizt.
In reply to Your message dated Monday, April 7, 2008, 17:19:28,

>> >> I am converting an XML string using the simplexml_load_string function. It is
>> >> giving me a Â character for some reason dotted around the text.
>>
>> > simplexml always outputs in UTF-8. Is your page's encoding UTF-8?
>>
>> At a guess, ISO-8859-1 or perhaps ISO-8859-15.
>>
>> In UTF-8, a "prefix" of an 0xC2 byte is used to access the first half of the
>> "Latin-1 Supplement" block which includes a lot of juicy characters such
>> as currency symbols, fractions, superscript 2 and 3, the copyright and
>> registered trademark symbols, and the non-breaking space.
>>
>> However in ISO-8859-1 and -15, the byte 0xC2 represents an Â, so if UTF-8
>> is misinterpreted as one of those, then you get Â followed by some other
>> nonsense character.
>>
>> Probably the easiest solution would be to take the output from SimpleXML
>> and pass it through iconv():
>>
>>         $xmlout = iconv('UTF-8', 'ISO-8859-15//TRANSLIT', $xmlout);
>>
>> Note that UTF-8 is capable of representing a far greater range of
>> characters than ISO-8859-1/-15 are, so certain characters may not properly
>> survive conversion. (Using the '//TRANSLIT' option tells iconv to do its
>> best, and if, say, a particular accented character is not available in
>> ISO-8859-1, then to substitute an unaccented one in its place.)

> Hi, I've tried what you said, which worked for one of my pages, but when
> I tried it on another I got the following:

> Notice: iconv() [function.iconv]: Detected an illegal character in
> input string in /home/public_html/search_apartments.php on line 67

> I'm using the following to convert my XML string, which is fetched via
> cURL:

> $result = iconv('UTF-8', 'ISO-8859-15//TRANSLIT', $result);

> Could it be that I'm not giving iconv() the correct input encoding for
> my $result string? If so, is there a way for me to detect the input
> encoding?

As a guess, Your "Â" is probably followed by a space and represents a
non-breaking space.
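
This guess can be checked at the byte level. The following is a small sketch, assuming the mbstring extension is available; the variable names are illustrative only and do not come from the thread.

```php
<?php
// U+00A0 (non-breaking space) encoded as UTF-8 is the two bytes C2 A0.
$nbspUtf8 = "\xC2\xA0";
echo bin2hex($nbspUtf8), "\n"; // c2a0

// Treat those same two bytes as ISO-8859-1 and re-encode them as UTF-8,
// which mimics what a page declared as ISO-8859-1 would display.
$misread = mb_convert_encoding($nbspUtf8, 'UTF-8', 'ISO-8859-1');
echo bin2hex($misread), "\n"; // c382c2a0, i.e. "Â" followed by U+00A0
```

The second output shows two characters where there was one: the 0xC2 byte becomes a visible "Â", and the 0xA0 byte becomes the non-breaking space that looks like an ordinary space after it.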

As for Your trouble with iconv on $result, I think You should examine the
SOURCE BEFORE using simplexml_load_string and see what encoding it uses.
If Your source is in, say, ISO-8859-15, then after conversion to UTF-8 it
cannot contain any characters that You can't convert back to ISO-8859-15.
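
One way to see what encoding the source claims to use is to read the XML declaration before parsing. The following is a rough sketch; declaredXmlEncoding() is a hypothetical helper, not something from the thread, and it only reports what the document declares, which may not match the actual bytes or the HTTP headers.

```php
<?php
// Hypothetical helper: return the encoding named in the XML declaration,
// or null when the document does not declare one.
function declaredXmlEncoding(string $xml): ?string
{
    // Looks for encoding="..." (or '...') inside a leading XML declaration.
    $pattern = '/^\s*<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._\-]*)["\']/i';

    if (preg_match($pattern, $xml, $m) === 1) {
        return $m[1];
    }

    // No declaration found; XML without other information defaults to UTF-8,
    // but that decision is left to the caller.
    return null;
}
```

Calling `declaredXmlEncoding($result)` on the string fetched via cURL, before simplexml_load_string, gives a first hint about whether the iconv() input encoding of 'UTF-8' is the right one for that page.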


--
Sincerely Yours, AnrDaemon <anrdaemon@freemail.ru>
