HTML to XML in Perl?

on 12.05.2006 13:05:11 by Ilya Zakharevich

Suppose I want to convert an HTML document to well-formed XML (so that
I can, e.g., apply xsltproc to the result). HTML::TreeBuilder, for one,
can apply the "usual heuristics" to parse HTML; how do I get XML out of it?

Thanks,
Ilya

Re: HTML to XML in Perl?

on 12.05.2006 18:34:21 by John Bokma

Ilya Zakharevich wrote:

> Suppose I want to convert an HTML document to well-formed XML (so that
> I can, e.g., apply xsltproc to the result). HTML::TreeBuilder, for one,
> can apply the "usual heuristics" to parse HTML; how do I get XML out of it?

Question: I use XML, not XHTML, at home, and use XML::Twig to convert it
to HTML. I can use xsltproc if I want to on the XML file.

You might want to traverse the parse tree HTML::TreeBuilder generates.
Also, I'm not 100% sure, but it might be that HTML Tidy can do the XHTML
conversion for you:

Google...

"Validator fixes errors in HTML and XHTML. Converts HTML to XHTML. Free
Software."

http://www.google.com/search?q=html%20tidy%20xhtml

Sounds like it does :-D.
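
For the HTML::TreeBuilder route, a minimal sketch: besides walking the
tree by hand, the HTML::Element objects that make up the tree have an
as_XML method that serializes the element and its descendants. The
sample markup and variable names below are made up for illustration,
and I have not checked how well the output holds up on messy real-world
pages, so treat it as a starting point rather than a guaranteed
xsltproc-ready result.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample input: unclosed <p> and a bare <br>,
# the kind of tag soup the parser's heuristics have to repair.
my $html = '<html><head><title>Demo</title></head>'
         . '<body><p>First<br>Second<p>Third</body></html>';

# Parse the markup into a tree of HTML::Element nodes.
my $tree = HTML::TreeBuilder->new;
$tree->parse($html);
$tree->eof;

# Serialize the whole tree as XML (the output is not indented).
my $xml = $tree->as_XML;

# Free the tree explicitly, as the module documentation recommends.
$tree->delete;

print $xml;
```

Redirect the output to a file and feed that to xsltproc. If the
stylesheet expects XHTML namespaces or a DOCTYPE, those would have to be
added separately.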

--
John Bokma Freelance software developer
&
Experienced Perl programmer: http://castleamber.com/