Re: join("") somehow changes characters after "z"
on 10.10.2007 19:48:03 by Cloink_Friggson
Could someone explain in even more layman terms (I struggle with
Perl), whether this will help me with a similar encoding problem I
have?
Please note that the URI::Escape::uri_unescape() function does not
cater for the latter encodings.
I receive from a JavaScript-encoded URL in a web/cgi environment,
potentially UTF8 data encoded via encodeURIComponent(); for example a
£-sign (GBP-pound-sign, in case that got mis-translated) is charCode
163 and becomes %C2%A3. You can do this in your web browser by typing
this in your address bar:-
javascript:alert(encodeURIComponent('£'))
(if you have £-sign on your keyboard! the not-sign, ¬, and broken
pipe, ¦, chars do it too).
I have managed to get compatibility between JS & Perl, up to 255
(possibly higher, don't have the code in front of me here - certainly,
I've got a £-sign working); however, I know that the compatibility
will break for the higher-order characters that will be encoded in JS
to a series of THREE %xx's, eg %11%A2%4D (no idea what that might be).
Many thanks.
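
For reference, a minimal sketch of what encodeURIComponent() emits for the characters mentioned above, plus one character outside the 0-255 range. The euro sign is an illustrative choice, not something from the original post. Each %xx is one byte of the character's UTF-8 encoding, so a real three-escape sequence starts with a byte in the range %E0 to %EF:

```javascript
// encodeURIComponent percent-encodes the UTF-8 bytes of each character.
// The euro sign is an added example of a character that needs three bytes.
const samples = ['£', '¬', '¦', '€'];
const encoded = samples.map((ch) => encodeURIComponent(ch));

for (let i = 0; i < samples.length; i++) {
  console.log(samples[i], '->', encoded[i]);
}
// £ -> %C2%A3
// ¬ -> %C2%AC
// ¦ -> %C2%A6
// € -> %E2%82%AC
```

So a sequence such as %11%A2%4D would not come out of encodeURIComponent() as one character: %11 is not a valid UTF-8 lead byte for a three-byte sequence.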
Re: join("") somehow changes characters after "z"
on 10.10.2007 20:06:52 by it_says_BALLS_on_your forehead
On Oct 10, 1:48 pm, Cloink wrote:
> Could someone explain in even more layman terms (I struggle with
> Perl), whether this will help me with a similar encoding problem I
> have?
>
> Please note that the URI::Escape::uri_unescape() function does not
> cater for the latter encodings.
>
> I receive from a JavaScript-encoded URL in a web/cgi environment,
> potentially UTF8 data encoded via encodeURIComponent(); for example a
> £-sign (GBP-pound-sign, in case that got mis-translated) is charCode
> 163 and becomes %C2%A3. You can do this in your web browser by typing
> this in your address bar:-
> javascript:alert(encodeURIComponent('£'))
> (if you have £-sign on your keyboard! the not-sign, ¬, and broken
> pipe, ¦, chars do it too).
>
> I have managed to get compatibility between JS & Perl, up to 255
> (possibly higher, don't have the code in front of me here - certainly,
> I've got a £-sign working); however, I know that the compatibility
> will break for the higher-order characters that will be encoded in JS
> to a series of THREE %xx's, eg %11%A2%4D (no idea what that might be).
>
Have you examined HTML::Entities::encode_entities? This may do what
you require--not positive.
Re: join("") somehow changes characters after "z"
on 10.10.2007 22:18:44 by Ben Morrow
Quoth Cloink :
> Could someone explain in even more layman terms (I struggle with
> Perl), whether this will help me with a similar encoding problem I
> have?
>
> Please note that the URI::Escape::uri_unescape() function does not
> cater for the latter encodings.
uri_unescape decodes each %xx sequence into the bytes it represents. If
those bytes are supposed to be UTF8, you can convert them to Perl
character strings with Encode::decode; that is,
    my $chars = Encode::decode utf8 =>
        URI::Escape::uri_unescape $uri_chars;
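
The same two-step idea (percent-decode to bytes, then decode those bytes as UTF-8) can be sketched on the JavaScript side for comparison. This is only an illustration under assumptions: percentDecodeToBytes is a hypothetical helper written for this example, not a library function, and it assumes the input contains only ASCII characters and %xx escapes:

```javascript
// Step 1: turn each %xx escape into the single byte it represents.
// Hypothetical helper for illustration; assumes ASCII input plus %xx escapes.
function percentDecodeToBytes(s) {
  const bytes = [];
  for (let i = 0; i < s.length; i++) {
    const hex = s.slice(i + 1, i + 3);
    if (s[i] === '%' && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16));
      i += 2; // skip the two hex digits just consumed
    } else {
      bytes.push(s.charCodeAt(i) & 0xff);
    }
  }
  return Uint8Array.from(bytes);
}

// Step 2: interpret the bytes as UTF-8 to get a character string.
function decodeUtf8Escapes(s) {
  return new TextDecoder('utf-8').decode(percentDecodeToBytes(s));
}

const rawBytes = percentDecodeToBytes('%C2%A3');
const poundChars = decodeUtf8Escapes('%C2%A3');
const euroChars = decodeUtf8Escapes('%E2%82%AC');

console.log(rawBytes.length);   // 2 bytes
console.log(poundChars.length); // 1 character
console.log(poundChars, euroChars);
```

Step 1 alone gives bytes, not characters; the three-escape case needs no special handling once step 2 is applied.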
> I receive from a JavaScript-encoded URL in a web/cgi environment,
> potentially UTF8 data encoded via encodeURIComponent(); for example a
> £-sign (GBP-pound-sign, in case that got mis-translated) is charCode
> 163 and becomes %C2%A3. You can do this in your web browser by typing
> this in your address bar:-
> javascript:alert(encodeURIComponent('£'))
> (if you have £-sign on your keyboard! the not-sign, ¬, and broken
> pipe, ¦, chars do it too).
>
> I have managed to get compatibility between JS & Perl, up to 255
> (possibly higher, don't have the code in front of me here - certainly,
> I've got a £-sign working); however, I know that the compatibility
> will break for the higher-order characters that will be encoded in JS
> to a series of THREE %xx's, eg %11%A2%4D (no idea what that might be).
I'm not sure how you got £ working... I think you must have some
confusion somewhere about whether your strings are bytes or characters,
or perhaps about whether you are using UTF8 or ISO8859-1.
    length uri_unescape '%C2%A3'
returns 2, but if it was correctly decoded into a single £ character it
should return 1.
Ben
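
A small sketch of the bytes-versus-characters mix-up described above, written in JavaScript only for illustration. The variable names are made up for this example. Reading the two bytes C2 A3 one byte per character (the ISO8859-1 view) gives two characters, while reading them as UTF8 gives the single £:

```javascript
// The two bytes that %C2%A3 stands for.
const poundBytes = [0xC2, 0xA3];

// ISO8859-1 view: every byte becomes the character with the same code point.
const asLatin1 = String.fromCharCode(...poundBytes);

// UTF-8 view: the two bytes together encode one character.
const asUtf8 = new TextDecoder('utf-8').decode(Uint8Array.from(poundBytes));

console.log(asLatin1.length); // 2
console.log(asUtf8.length);   // 1
console.log(asLatin1, asUtf8);
```

If output shows a stray "Â" in front of the £, that is usually a sign the bytes were treated as ISO8859-1 somewhere along the way.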
Re: join("") somehow changes characters after "z"
on 18.10.2007 21:59:11 by Cloink_Friggson
> Have you examined HTML::Entities::encode_entities? This may do what
> you require--not positive.
I will look, thanks. Not now, I need the settee, the tv and a glass of
wine.
Re: join("") somehow changes characters after "z"
on 18.10.2007 22:00:54 by Cloink_Friggson
> Have you examined HTML::Entities::encode_entities? This may do what
> you require--not positive.
I will look, thanks. Not now, I need the settee, the tv and a glass of
wine (or two).