Fwd: [cpan #13025] parsing bug in HTTP::Message::parse()

on 15.08.2005 10:12:10 by bhirt

FYI, I'm not the only one seeing this bug.

Begin forwarded message:

> From: " via RT"
> Date: August 13, 2005 8:55:10 PM MDT
> To: bhirt+cpan@mobygames.com
> Subject: [cpan #13025] parsing bug in HTTP::Message::parse()
> Reply-To: comment-libwww-perl@rt.cpan.org
>
>
> Full context and any attached attachments can be found at:
>
>
> [guest - Mon May 30 19:20:21 2005]:
>
>
>> Hi!
>>
>> I've stumbled across a bug in multi-part messages in HTTP::Message.
>> If the message content contains something that looks like a header,
>> it accidentally gets detected as a header. However, a blank line
>> after a header should signal the end of headers, and below it, the
>> start of content, as per the spec. It just looks like your regex
>> needs some tweaking, or maybe do something like this:
>>
>
> Found the same problem today... :(
>
> #!/usr/bin/perl
> use strict;
> use warnings;
> use HTTP::Message;
> use Data::Dumper;
>
> my $bad = qq(Content-Disposition: form-data; name="fetch_url"\n)
> . qq(\nhttp://www.jbisbee.com/\n);
> my $part = HTTP::Message->parse($bad);
> warn Dumper($part);
>
> With the output...
>
> $VAR1 = bless( {
> '_content' => '',
> '_headers' => bless( {
> 'content-disposition' => 'form-data; name="fetch_url"',
> '
> http' => '//www.jbisbee.com/'
> }, 'HTTP::Headers' )
> }, 'HTTP::Message' );
>
> of course this only happens with forms that specify
>
> enctype="multipart/form-data"
>
> for uploads so it may be a bit hard to reproduce
>

--------------------------------------------
MobyGames
http://www.mobygames.com
The world's largest and most comprehensive
gaming database project


Re: [cpan #13025] parsing bug in HTTP::Message::parse()

on 31.08.2005 20:34:21 by bhirt

Any news on getting this fixed?

I've rewritten my test program to be simpler, and I will explain
what is wrong, including sample output. Hopefully this clears up the
confusion from my previous bug reports.

--------PROGRAM----------

#!/usr/bin/perl -w

use strict;

use HTTP::Message;
use HTTP::Headers;
use Data::Dumper;

my $content = '------------0xKhTmLbOuNdArY
Content-Disposition: form-data; name="correct"

some data
------------0xKhTmLbOuNdArY
Content-Disposition: form-data; name="incorrect"

aoeu:aoeu
------------0xKhTmLbOuNdArY--
';

my $headers = HTTP::Headers->new;
$headers->header( 'content-type' =>
    'multipart/form-data; boundary=----------0xKhTmLbOuNdArY' );
my $m = HTTP::Message->new($headers, $content);

foreach my $part ($m->parts)
{
    print STDERR Dumper($part);
}

------OUTPUT----------

$VAR1 = bless( {
    '_content' => 'some data',
    '_headers' => bless( {
        'content-disposition' => 'form-data; name="correct"'
    }, 'HTTP::Headers' )
}, 'HTTP::Message' );
$VAR1 = bless( {
    '_content' => '',
    '_headers' => bless( {
        'content-disposition' => 'form-data; name="incorrect"',
        '
aoeu' => 'aoeu'
    }, 'HTTP::Headers' )
}, 'HTTP::Message' );

What's wrong: the second part shows no content and lists "\naoeu" =>
'aoeu' as a header. 'aoeu:aoeu' should be the content. The parser
is treating the content of the part as headers, and it should not.
Once the blank line is reached, the content begins and nothing after
it should be parsed as a header.

Here is the current parse() function:

sub parse
{
    my($class, $str) = @_;

    my @hdr;
    while (1) {
        if ($str =~ s/^([^ \t:]+)[ \t]*: ?(.*)\n?//) {
            push(@hdr, $1, $2);
            $hdr[-1] =~ s/\r\z//;
        }
        elsif (@hdr && $str =~ s/^([ \t].*)\n?//) {
            $hdr[-1] .= "\n$1";
            $hdr[-1] =~ s/\r\z//;
        }
        else {
            $str =~ s/^\r?\n//;
            last;
        }
    }

    new($class, \@hdr, $str);
}

When the part is parsed, the first regex matches past the end of the
headers and into the content. This is because ([^ \t:]+) excludes
only space, tab and colon, so it also matches the newline of the blank
line that is supposed to end the headers and runs on into the content.
One way to fix this is to change the code so that it checks for the
end of the headers as the last step of each loop iteration.
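
To make the character-class point concrete, here is a minimal sketch that applies the header pattern to what is left of the second part after its Content-Disposition line has been consumed. The helper names loose_name and strict_name are made up for this illustration, and the stricter [^\s:] class is an assumption about one possible tightening, not a statement of what any libwww-perl release uses.

```perl
use strict;
use warnings;

# Header-name pattern as used in parse(): only space, tab and colon
# are excluded, so "\n" and "\r" may be part of the captured name.
sub loose_name {
    my ($s) = @_;
    return $s =~ /^([^ \t:]+)[ \t]*: ?(.*)\n?/ ? $1 : undef;
}

# Assumed stricter variant: \s also excludes "\r" and "\n".
sub strict_name {
    my ($s) = @_;
    return $s =~ /^([^\s:]+)[ \t]*: ?(.*)\n?/ ? $1 : undef;
}

# What remains of the second part after its header line is consumed.
my $rest = "\naoeu:aoeu\n";

for my $check ([loose => \&loose_name], [strict => \&strict_name]) {
    my ($label, $code) = @$check;
    my $name = $code->($rest);
    print "$label: ", defined $name ? "captured a name" : "no match", "\n";
}
```

The loose pattern captures "\naoeu" as a header name, which is the bogus key seen in the Dumper output; the stricter one cannot start a match on the blank line.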

sub parse
{
    my($class, $str) = @_;

    my @hdr;
    while (1) {
        if ($str =~ s/^([^ \t:]+)[ \t]*: ?(.*)\n?//) {
            push(@hdr, $1, $2);
            $hdr[-1] =~ s/\r\z//;
        }
        elsif (@hdr && $str =~ s/^([ \t].*)\n?//) {
            $hdr[-1] .= "\n$1";
            $hdr[-1] =~ s/\r\z//;
        }
        else {
            # nothing header-like is left; without this branch the loop
            # never ends when the input has no blank line.
            $str =~ s/^\r?\n//;
            last;
        }

        # check to see if we are at the end of the headers.
        if ($str =~ s/^\r?\n//) {
            last;
        }
    }

    new($class, \@hdr, $str);
}
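
As an alternative to reordering the loop, the header-name class itself could be tightened so that it can never match a line break. The following is only a standalone sketch under that assumption: parse_part is a hypothetical function written for this example, it is not HTTP::Message::parse, and it returns a plain list of name/value pairs plus the body rather than an object.

```perl
use strict;
use warnings;

# Standalone sketch (hypothetical parse_part, not HTTP::Message::parse):
# returns a flat (name, value, ...) header list and the remaining body.
sub parse_part {
    my ($str) = @_;

    my @hdr;
    while (1) {
        # [^\s:] instead of [^ \t:]: a name can no longer contain
        # "\r" or "\n", so the match cannot start on the blank line.
        if ($str =~ s/^([^\s:]+)[ \t]*: ?(.*)\n?//) {
            push @hdr, $1, $2;
            $hdr[-1] =~ s/\r\z//;
        }
        elsif (@hdr && $str =~ s/^([ \t].*)\n?//) {
            $hdr[-1] .= "\n$1";          # folded continuation line
            $hdr[-1] =~ s/\r\z//;
        }
        else {
            $str =~ s/^\r?\n//;          # drop the separating blank line
            last;
        }
    }
    return (\@hdr, $str);
}

my ($hdr, $body) = parse_part(
    qq(Content-Disposition: form-data; name="incorrect"\n\naoeu:aoeu\n)
);
print scalar(@$hdr) / 2, " header(s), body: $body";
```

With this input the sketch yields one header and leaves 'aoeu:aoeu' in the body, which is the behaviour the test program above expects for the second part.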

Please contact me if there is still confusion with this.

Best Regards,

Brian Hirt

Fwd: [cpan #13025] parsing bug in HTTP::Message::parse()

on 08.09.2005 05:48:59 by bhirt

I'm not sure if this made it to the list, so I'm reposting.

Begin forwarded message:

> From: Brian Hirt
> Date: August 31, 2005 12:34:21 PM MDT
> To: libwww@perl.org
> Cc: gisle@ActiveState.com, Brian Hirt
> Subject: Re: [cpan #13025] parsing bug in HTTP::Message::parse()

parsing bug in HTTP::Message::parse()

on 20.09.2005 17:15:14 by bhirt

Hello,

Does anyone read this list? I've never had so much trouble getting
something fixed in the last 10 years of working with open projects.
I've filed bug reports, sent emails to this list, and sent email to
Gisle. I've seen many people complain about this problem. I've been
personally contacted by 4 other people who have discovered this same
bug and saw my CPAN ticket in RT
(http://rt.cpan.org/NoAuth/Bug.html?id=13025). The RT ticket just
sits there. I've offered patches to fix the problem. I've offered
different test cases. Can someone please tell me how I should go
about getting this problem fixed? I would really like to get this
issue resolved, and I'm hopeful someone can point me in the right
direction, since this list seems like the wrong place to go.

Best Regards,

Brian Hirt

Re: parsing bug in HTTP::Message::parse()

on 20.09.2005 20:59:05 by gisle

This problem was fixed in the libwww-perl CVS repository in February,
http://cvs.sourceforge.net/viewcvs.py/libwww-perl/lwp5/lib/HTTP/Message.pm?r1=1.56&r2=1.57.

Unfortunately I've not found the time to roll another official
libwww-perl release since then.

BTW, the patch above did find its way into ActivePerl 813.

Regards,
Gisle

Re: parsing bug in HTTP::Message::parse()

on 21.09.2005 00:22:52 by bhirt

Gisle,

Thanks for the response. I'm happy to hear that it's fixed in CVS.
Hopefully a new release will be rolled before the year's end.

--brian
