Processing broken HTTP Headers while proxying...
am 16.08.2002 02:30:02 von Brett HutleyThis is a multi-part message in MIME format.
--------------000708070204070800030909
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Greetings all.
A couple of days ago I was trying to proxy Zope through Apache (1.3.26)
on OpenBSD. Zope was listening on port 8080, and I had a rewrite rule in
my http.conf file to proxy requests through to the Zope web server.
Unfortunately, when I tried to test this setup, I ended up getting
multiple HTTP headers in the response, which ended up appearing on the
client browser.
What I discovered was that when mod_proxy was processing the headers, if
it got a line that didn't have a colon in it, (other than the first
line), it would return an NULL header list back to the calling function
indicating that the headers were invalid. Unfortunately, the Zope server
was sending back the string:
Server: Zope/(Zope 2.5.1b1 (OpenBSD package zope-2.5.1b1)
, python 2.1.2, openbsd3) ZServer/1.1b1
split on two lines, which meant mod_proxy wouldn't process the headers
properly. Of course, this is a problem with the zope compilation rather
than mod_proxy, but I was wondering just how strict the proxying code
should be in parsing the headers from the remote server? It seems to me
that if the header parsing stuff in proxy_util.c was slightly more
forgiving of irregularities then life would be just that little bit
better :)
Here is a patch to proxy_util.c that will allow certain header fields to
have (up to) a certain number additional continuation lines... at the
moment I've only defined the "Server" field name and specified that it
can have up to 2 additional invalid lines after it, but more header
fields can be added to fields_allowing_additional_lines array.
Cheers, Brett
--------------000708070204070800030909
Content-Type: text/plain;
name="proxy_util.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="proxy_util.patch"
diff -Naur old/proxy_util.c new/proxy_util.c
--- old/proxy_util.c Fri Aug 16 10:16:56 2002
+++ new/proxy_util.c Fri Aug 16 10:17:05 2002
@@ -377,12 +377,23 @@
* headers, which contain commas within the date field) do not get stuffed
* up.
*/
+static struct field_allowing_additional_lines {
+ char *field_name;
+ unsigned short max_num_additional_lines;
+} fields_allowing_additional_lines[] = {
+ { "Server", 2 } /* allow the server field a max of 2 additional lines */
+};
+
table *ap_proxy_read_headers(request_rec *r, char *buffer, int size, BUFF *f)
{
table *resp_hdrs;
int len;
char *value, *end;
char field[MAX_STRING_LEN];
+ char field_name[MAX_STRING_LEN]; /* to store the last field name read */
+ int num_additional_lines = 0; /* to count the number of crap header lines */
+
+ field_name[0] = '\0';
resp_hdrs = ap_make_table(r->pool, 20);
@@ -391,8 +402,33 @@
* the connection closes (EOF), or we timeout.
*/
while ((len = ap_getline(buffer, size, f, 1)) > 0) {
-
if (!(value = strchr(buffer, ':'))) { /* Find the colon separator */
+ /* check to see if the last header field read is
+ one we can have continuation lines for...
+ */
+ if (field_name[0]) /* check the 1st byte for null to be quick */
+ {
+ int i = 0;
+ int i_max = sizeof(fields_allowing_additional_lines) / sizeof(struct field_allowing_additional_lines);
+ int ok_to_continue = 0;
+
+ while (i < i_max)
+ {
+ if (!strcmp(field_name, fields_allowing_additional_lines[i].field_name))
+ {
+ if (num_additional_lines < fields_allowing_additional_lines[i].max_num_additional_lines)
+ ok_to_continue = 1;
+ break;
+ }
+ i++;
+ }
+
+ if (ok_to_continue)
+ {
+ num_additional_lines++;
+ continue;
+ }
+ }
/*
* Buggy MS IIS servers sometimes return invalid headers (an
@@ -415,6 +451,12 @@
*value = '\0';
++value;
+
+ /* woo hoo! We got a new header! Do some stuff! */
+ /* copy the header field name into a buffer so we can check it later */
+ strcpy(field_name, buffer);
+ num_additional_lines = 0; /* reset the number of additional lines */
+
/*
* XXX: RFC2068 defines only SP and HT as whitespace, this test is
* wrong... and so are many others probably.
--------------000708070204070800030909--