html parsing
Posted on 02.05.2005 17:29:54 by malcolm.mill
Hi,
I'm trying to extract information from HTML like this:
http://www.rafb.net/paste/results/Ze4RTm27.html
I've tried modifying examples from the man pages for HTML::TokeParser
and HTML::TreeBuilder without much success.
I just want to identify such blocks of HTML by the attributes in their
child nodes, extract the text node under the first '',
extract the text node under the second '' as well as the href
attribute in the enclosed '' node,
and store the output in a hash which I can pass to other functions or
print to a CSV file.
If anyone can suggest anything while I read the docs and the relevant
hacks in "Spidering Hacks" more carefully, it would be appreciated.
Regards,
Malcolm.
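
For illustration, the extraction described above could be approached with HTML::TokeParser roughly as follows. This is only a minimal sketch under stated assumptions: the real markup is available only at the pasted link, so the sample HTML, the tag names (div, span, a), the class values and the hash keys (title, text, href) are all hypothetical and would need to be adapted to the actual page.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample markup: tag names and class values are assumptions,
# not taken from the original page.
my $html = <<'HTML';
<div class="item">
  <span class="title">First title</span>
  <span class="link"><a href="http://example.com/1">First link</a></span>
</div>
<div class="other">
  <span class="title">Ignored</span>
</div>
<div class="item">
  <span class="title">Second title</span>
  <span class="link"><a href="http://example.com/2">Second link</a></span>
</div>
HTML

# Return a list of hash references, one per matching block.
sub extract_items {
    my ($doc) = @_;
    my $p = HTML::TokeParser->new(\$doc) or die "Cannot parse document";
    my @items;

    # get_tag returns [ tagname, \%attr, \@attrseq, text ] for a start tag.
    while (my $tag = $p->get_tag('div')) {
        my $attr = $tag->[1];
        # Identify the block by an attribute (assumed here: class="item").
        next unless ($attr->{class} || '') eq 'item';

        my %item;

        # Text under the first child element.
        $p->get_tag('span') or last;
        $item{title} = $p->get_trimmed_text('/span');

        # Text and href of the link inside the second child element.
        $p->get_tag('span') or last;
        my $link = $p->get_tag('a') or last;
        $item{href} = $link->[1]{href};
        $item{text} = $p->get_trimmed_text('/a');

        push @items, \%item;
    }
    return @items;
}

my @items = extract_items($html);

# Naive CSV output; fields containing quotes would need proper escaping.
for my $item (@items) {
    print join(',', map { qq("$_") } @{$item}{qw(title text href)}), "\n";
}
```

Each element of @items is a plain hash reference, so it can be passed to other functions unchanged. The print loop at the end is deliberately simplistic; a dedicated CSV module would be the safer choice for real data.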