How to skip some XML records using the XML::SAX parser?

on 15.02.2007 11:59:27 by uresh.kuruhuri

Hello All,

I have a performance issue with the XML::SAX parser, which I have
been using for a long time.

The problem is that I have an XML file of more than 75 MB to parse,
and parsing it takes a very long time. Depending on the value of a
particular field, I don't need some of the XML records to be parsed
at all.

Each XML record has 24 fields. After checking the first or third
field of a record, I want to skip parsing the rest of that record and
jump to the next one. Is this possible with the XML::SAX parser? If
so, it would save a significant amount of time.
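
A minimal sketch of how record skipping is usually approximated with
XML::SAX, offered as an illustration rather than a definitive answer.
A SAX parser still has to tokenize the whole file, so it cannot truly
jump ahead; what a handler can do is set a flag and return immediately
from every callback until the record closes, which avoids the per-field
work. The element names `record`, `type`, and `name`, the value `keep`,
the `SkipHandler` package, and the `kept_records` helper are all
hypothetical and need adjusting to the real document.

```perl
use strict;
use warnings;

package SkipHandler;
use base 'XML::SAX::Base';

# Assumed layout: <records><record><type>..</type><name>..</name>...</record>...</records>
sub new {
    my ($class, %args) = @_;
    my $self = $class->SUPER::new(%args);
    $self->{kept} = [];    # records that passed the check
    return $self;
}

sub start_element {
    my ($self, $el) = @_;
    my $name = $el->{LocalName};
    if ($name eq 'record') {
        # A new record begins: reset per-record state.
        $self->{in_record} = 1;
        $self->{skip}      = 0;
        $self->{fields}    = {};
        return;
    }
    # Cheap early return outside a record or while skipping one.
    return if !$self->{in_record} || $self->{skip};
    $self->{current} = $name;
    $self->{text}    = '';
}

sub characters {
    my ($self, $data) = @_;
    return unless defined $self->{current};    # not collecting a field
    $self->{text} .= $data->{Data};
}

sub end_element {
    my ($self, $el) = @_;
    my $name = $el->{LocalName};
    if ($name eq 'record') {
        push @{ $self->{kept} }, $self->{fields} unless $self->{skip};
        $self->{in_record} = 0;
        $self->{skip}      = 0;
        return;
    }
    return unless defined $self->{current};
    $self->{fields}{$name} = $self->{text};
    $self->{current} = undef;
    # Hypothetical rule: decide right after the first field (<type>).
    $self->{skip} = 1 if $name eq 'type' && $self->{fields}{type} ne 'keep';
}

package main;
use XML::SAX::ParserFactory;

# Parse an XML string and return an array ref of the kept records.
sub kept_records {
    my ($xml) = @_;
    my $handler = SkipHandler->new;
    my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);
    $parser->parse_string($xml);
    return $handler->{kept};
}

my $xml = <<'XML';
<records>
  <record><type>keep</type><name>alpha</name></record>
  <record><type>drop</type><name>beta</name></record>
  <record><type>keep</type><name>gamma</name></record>
</records>
XML

# Only the records whose <type> is "keep" are reported.
print "$_->{name}\n" for @{ kept_records($xml) };
```

For a file on disk, `$parser->parse_uri($path)` can be used in place of
`parse_string`. How much time the flag saves depends on how expensive
the per-field handler work was to begin with.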

Also, please let me know whether the XML::SAX parser makes use of the
DTD for the XML document.

As far as I remember, the XML::SAX parser reads the XML file as a
stream rather than loading it all at once, so that it does not use
much memory.

Thanks in anticipation.

Regards,
Uresh

Re: How to skip some XML records using the XML::SAX parser?

on 16.02.2007 16:32:31 by keith

One of the few things I defer to Java for over Perl, personally, is
XML parsing. It just seems to work better, faster, and more reliably,
and IMHO it is better supported by free (as in beer) third-party
software.

So you have me at a disadvantage, but Perl likely supports XPath,
which is a way to select parts of the XML tree in your document.
O'Reilly's "Java & XML" has very helpful examples, though I don't know
how well they'd translate to a Perl approach.
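
As a rough illustration of the XPath idea in Perl, here is a small
sketch using XML::LibXML, one of the CPAN modules that provides XPath.
The element names (`records`, `record`, `type`, `name`), the value
`keep`, and the `names_of_kept` helper are hypothetical. Note that this
approach builds the whole tree in memory first, which may matter for a
file of 75 MB or more.

```perl
use strict;
use warnings;
use XML::LibXML;

# Return the <name> of every <record> whose <type> child equals "keep".
# The document layout is an assumption made for this example.
sub names_of_kept {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);
    # The XPath predicate selects records by the value of one field.
    return map { $_->findvalue('name') }
           $doc->findnodes('/records/record[type="keep"]');
}

my $sample = <<'XML';
<records>
  <record><type>keep</type><name>alpha</name></record>
  <record><type>drop</type><name>beta</name></record>
</records>
XML

print "$_\n" for names_of_kept($sample);
```

To read from a file instead, `XML::LibXML->load_xml(location => $path)`
can be used.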

HTH,

Keith