Re: Possible bug in HTML::Parser version 3.48
on 08.02.2006 12:14:11 by gisle
Jack Goldstein wrote:
> I've installed HTML::Parser on an AIX 5.1 system running perl 5.8.6 along
> with HTML::Tagset and all tests passed except for one relating to POD that
> was skipped. However, one of our developers found that it didn't properly
> parse titles. Here's a sample program that demonstrates the problem. When
> run with the perl 5.8.6 that I installed, the output is
>
> Help Title is
> (blank line)
>
> but when run with a copy of perl5.8.0 that someone else installed, we
> get:
>
> Help Title is Installation Help
>
> which I assume is correct.
Thanks for your bug report. This is indeed a bug. Its cause is that
some events would trigger under certain circumstances even after a
handler has told the parser to stop with $p->eof. I've now fixed this
issue and uploaded HTML-Parser-3.49 to CPAN.
My guess would be that your perl5.8.0 installation has a version of
HTML-Parser that is older than version 3.40, where we made <title>
tags also parse in literal mode. This could explain why this issue
didn't occur with that perl installation.
> use HTML::Parser;
>
> my $title='';
>
> my $p = HTML::Parser->new(api_version => 3,);
> $p->handler(start=> \&title_handler, 'tagname, self');
> $p->parse_file("db2wi.htm");
> print "\nHelp Title is $title\n";
> exit 0;
>
> ########################################
> # Subroutines
> ########################################
> sub title_handler {
> return if shift ne 'title';
> my $self = shift;
> $self->handler(text => sub { $title= shift}, 'dtext');
BTW, HTML-Parser does not guarantee that all text between the <title>
and </title> tags is reported in a single text event, so this code
should append to $title instead of just assigning to it.
That would make it:
$self->handler(text => sub { $title .= shift}, 'dtext');
Alternatively, set the 'unbroken_text' attribute to a TRUE value.
> $self->handler(end => sub { shift->eof if shift eq 'title' }, 'tagname, self');
> }
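For reference, here is a minimal sketch of the sample program with both
suggestions above applied (appending to $title and turning on
'unbroken_text'). It assumes HTML-Parser 3.49 or later. The inline $html
string is a hypothetical stand-in for db2wi.htm, whose contents were not
posted, so treat this as an illustration rather than a tested fix for
that file.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical sample input standing in for db2wi.htm (assumption).
my $html = '<html><head><title>Installation Help</title></head>'
         . '<body><p>Body text</p></body></html>';

my $title = '';

my $p = HTML::Parser->new(api_version => 3);
$p->unbroken_text(1);    # ask for adjacent text in a single text event
$p->handler(start => \&title_handler, 'tagname,self');
$p->parse($html);        # stops early once a handler calls eof

print "Help Title is $title\n";

sub title_handler {
    my ($tagname, $self) = @_;
    return unless $tagname eq 'title';

    # Append rather than assign, in case the text still arrives in pieces.
    $self->handler(text => sub { $title .= shift }, 'dtext');

    # Stop parsing as soon as the closing </title> tag is seen.
    $self->handler(end => sub {
        my ($tag, $parser) = @_;
        $parser->eof if $tag eq 'title';
    }, 'tagname,self');
}
```

Either change alone should be enough for the title case; doing both is
just belt and braces.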
Regards,
Gisle