Valid characters in GET data

on 20.12.2007 16:42:51 by David Segall

I want to encode a string that will be used as a GET data parameter
but the algorithm I have can produce the characters "/", "+" and "="
in addition to alphanumeric characters. Those characters don't have a
named entity in my HTML text book so I believe they can be used
without further encoding. Can they? In other words, is the URL
valid? For extra credit :),
where should I have looked to find the definitive answer to this
question?

Re: Valid characters in GET data

on 20.12.2007 17:32:05 by Harlan Messinger

David Segall wrote:
> I want to encode a string that will be used as a GET data parameter
> but the algorithm I have can produce the characters "/", "+" and "="
> in addition to alphanumeric characters. Those characters don't have a
> named entity in my HTML text book so I believe they can be used
> without further encoding. Can they? In other words, is the URL
> valid? For extra credit :),
> where should I have looked to find the definitive answer to this
> question?

URLs are not HTML. They have their own syntax. In particular, the plus
and equals signs have special meanings in a query string: the plus sign
is interpreted as a space character and the equals sign is used to
create a key/value association as you did in your own example,
associating the key "myparam" with the value "four/two+5=5".

The characters that have special meaning in a query string or that
delimit the query string from other parts of the URL are the ones in the
set {=?&;#%+}. When you want to use any of these as an ordinary
character, encode it as %nn where nn is the hexadecimal ASCII code for
the character. An embedded space can be encoded as either %20 or a plus
sign.

See http://en.wikipedia.org/wiki/Percent-encoding.
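
As a rough illustration of the %nn rule described above, here is a minimal Python sketch using the standard library's urllib.parse. The helper name percent_encode and the sample value "a b+c" are made up for this example; they do not come from the original question.

```python
from urllib.parse import quote, quote_plus

# The set of query-string special characters listed in the post above.
SPECIAL = "=?&;#%+"

def percent_encode(text):
    """Encode every reserved character as %nn (hypothetical helper name)."""
    return quote(text, safe="")

# Each special character maps to % followed by its hexadecimal ASCII code.
table = {ch: percent_encode(ch) for ch in SPECIAL}
for ch, enc in table.items():
    print(ch, "->", enc)

# A space may be written either as %20 or as a plus sign in a query string.
sample = "a b+c"  # made-up value for illustration
with_pct20 = quote(sample, safe="")   # a%20b%2Bc
with_plus = quote_plus(sample)        # a+b%2Bc
print(with_pct20, with_plus)
```

Note that a literal plus sign is encoded as %2B in both forms, which is what keeps it from being confused with an encoded space.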

Re: Valid characters in GET data

on 20.12.2007 17:42:28 by Toby A Inkster

David Segall wrote:

> I want to encode a string that will be used as a GET data parameter but
> the algorithm I have can produce the characters "/", "+" and "=" in
> addition to alphanumeric characters. Those characters don't have a named
> entity in my HTML text book so I believe they can be used without
> further encoding.

Wrong. All three have special meanings in form data, so although a URL
like "http://example.com?myparam=four/two+3=5" is perfectly valid, your
webserver and/or scripting language may interpret it differently to how
you might expect.

For example, the '+' may be decoded to a space character before you get
your hands on it.

URLs such as this would typically need to be hex-encoded to avoid being
interpreted:

http://example.com?myparam=four%2Ftwo%2B3%3D5
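
To see the difference on the receiving side, here is a small Python sketch that parses both forms of the URL with the standard library. Python's parse_qs is only one possible decoder, and the function name query_params is made up for this example; other servers and languages may behave differently.

```python
from urllib.parse import parse_qs, urlsplit

raw = "http://example.com?myparam=four/two+3=5"
encoded = "http://example.com?myparam=four%2Ftwo%2B3%3D5"

def query_params(url):
    """Return the decoded query parameters of a URL (hypothetical helper)."""
    return parse_qs(urlsplit(url).query)

raw_params = query_params(raw)
encoded_params = query_params(encoded)

# In the raw form the '+' comes back as a space, so the value is altered.
print(raw_params)      # {'myparam': ['four/two 3=5']}
# In the percent-encoded form the value survives intact.
print(encoded_params)  # {'myparam': ['four/two+3=5']}
```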

- and * would probably be safer than + and /. < or > would be safer than =.
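
If the algorithm in question happens to be Base64 (an assumption; the original post does not say), one way to sidestep the problem is a URL-safe alphabet. The sketch below uses Python's base64 module, whose URL-safe variant substitutes - and _ for + and / (not the - and * mentioned above), and drops the = padding. The function names to_token and from_token and the sample bytes are made up for illustration.

```python
import base64

def to_token(data):
    """Encode bytes with the URL-safe Base64 alphabet, without '=' padding."""
    return base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")

def from_token(token):
    """Restore the padding that to_token removed, then decode."""
    padded = token + "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(padded)

# Sample bytes chosen so that standard Base64 yields '+', '/' and '='.
data = bytes([0xFB, 0xEF, 0xFF, 0xFE])
standard = base64.b64encode(data).decode("ascii")
token = to_token(data)

print(standard)  # contains '+', '/' and '='
print(token)     # contains only letters, digits, '-' and '_'
```

A token built this way needs no percent-encoding when used as a query value, as long as both ends agree on the alphabet and on the missing padding.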

--
Toby A Inkster BSc (Hons) ARCS
[Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux]
[OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 13 days, 3:12.]

Sharing Music with Apple iTunes
http://tobyinkster.co.uk/blog/2007/11/28/itunes-sharing/

Re: Valid characters in GET data

on 20.12.2007 17:51:26 by Toby A Inkster

David Segall wrote:

> where should I have looked to find the definitive answer to this
> question?

Mostly:

HTML 4.01 Specification: 17.13 Form submission
http://www.w3.org/TR/html401/interact/forms.html#h-17.13

Supporting documents:

RFC 2616: Hypertext Transfer Protocol -- HTTP/1.1
http://www.ietf.org/rfc/rfc2616.txt

RFC 1738: Uniform Resource Locators (URL)
http://www.ietf.org/rfc/rfc1738.txt


Re: Valid characters in GET data

on 21.12.2007 13:30:40 by Toby A Inkster

Harlan Messinger wrote:

> URLs are not HTML. They have their own syntax.

But of course, URLs represented in HTML have to both conform to the URL
syntax rules, *plus* HTML syntax rules. (The trick is to apply the URL
rules first, such as percent-encoding, and *then* apply HTML rules, like
representing ampersands as "&amp;".)
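
The two-step order can be sketched in Python roughly as follows: build the query string with URL rules first, then escape the finished URL for use in an HTML attribute. The second parameter "lang" and the link text are made up for this example.

```python
from html import escape
from urllib.parse import urlencode

# Step 1: URL rules. urlencode percent-encodes each value and joins
# the key/value pairs with '&'.
params = {"myparam": "four/two+3=5", "lang": "en"}  # "lang" is hypothetical
url = "http://example.com/?" + urlencode(params)

# Step 2: HTML rules. Escape the finished URL before placing it in markup,
# which turns each '&' separator into '&amp;'.
href = escape(url, quote=True)
link = '<a href="{}">example</a>'.format(href)

print(url)
print(link)
```

Doing it in the opposite order would percent-encode the characters of the HTML entity itself, which is why the URL step comes first.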
