XML Parsing too slow

on 19.11.2005 14:29:16 by krising

I have some files ranging from 6 meg to 15 meg that I need to process
to input into a database. I'm inheriting some old code using
XML::Simple and this process is taking forever (an hour or more). I
have eliminated the database as the time hog in this equation.

Is there an alternative XML module I should be using?


Re: XML Parsing too slow

on 19.11.2005 14:43:47 by stephane HAbeTT roux

jabby wrote:

> I have some files ranging from 6 meg to 15 meg that I need to process
> to input into a database. I'm inheriting some old code using
> XML::Simple and this process is taking forever (an hour or more). I
> have eliminated the database as the time hog in this equation.
>
> is there an alternative XML module I should be using?

XML::LibXML

--
|":._.:"| http://habett.com/
| (=) | http://habett.org/
| .:':. | I send the energy to my enemy

Re: XML Parsing too slow

on 21.11.2005 11:15:36 by Michel Rodriguez

jabby wrote:
> I have some files ranging from 6 meg to 15 meg that I need to process
> to input into a database. I'm inheriting some old code using
> XML::Simple and this process is taking forever (an hour or more). I
> have eliminated the database as the time hog in this equation.
>
> is there an alternative XML module I should be using?

Hi,

It really depends on the code, on why the process is slow and on the
effort you want to put into re-writing it.

Why is the code slow? Is the call to XMLin, which loads the data into a
Perl structure, slow? Or is the problem that the data takes up too much
space in memory and the system starts swapping pages? 6 to 15 MB is
not that much these days, so I am not sure the problem lies with
XML::Simple. What are you doing with the data that takes that long?

You should probably start by running the XMLin call by itself to see how
long it takes.
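
A minimal sketch of that measurement, assuming the file name is passed on the command line. The helper name time_call is made up for illustration, and the inherited code may pass options to XMLin (ForceArray, KeyAttr and so on) that should be copied into the call, since they can affect the timing.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);   # core module, sub-second wall-clock timer

# Hypothetical helper: run a code reference once and return
# (elapsed wall-clock seconds, whatever the code returned).
sub time_call {
    my ($code) = @_;
    my $start  = time;
    my $result = $code->();
    return (time - $start, $result);
}

# Only runs when a file name is given, e.g.: perl time_xmlin.pl data.xml
if (my $file = shift @ARGV) {
    require XML::Simple;
    # Add the same options the inherited code passes to XMLin here.
    my ($secs, $data) = time_call(sub { XML::Simple::XMLin($file) });
    printf "XMLin took %.2f s for %s\n", $secs, $file;
}
```

If XMLin alone finishes in seconds rather than minutes, the time is going into whatever happens to the data afterwards, and switching parsers will not help much.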

BTW XML::LibXML is indeed faster than XML::Parser-based modules like
XML::Simple, but its interface is quite different: instead of working
with a Perl structure, you work with a DOM. So if most of the processing
happens once the data has already been loaded in memory, I am not sure
it will actually speed things up.
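
To give an idea of the difference in interface, here is a small sketch using XML::LibXML. The document layout (records/record/name) is invented for the example and has nothing to do with the real files; with XML::Simple the same data would be reached through nested hash and array references instead of XPath calls.

```perl
use strict;
use warnings;
use XML::LibXML;

# Made-up sample data, inlined so the example is self-contained.
my $xml = <<'XML';
<records>
  <record id="1"><name>alpha</name></record>
  <record id="2"><name>beta</name></record>
</records>
XML

# Parse into a DOM; for a real file use load_xml(location => $file).
my $doc = XML::LibXML->load_xml(string => $xml);

# Walk the DOM with XPath and collect [id, name] pairs,
# e.g. as rows to hand to the database layer.
my @rows;
for my $rec ($doc->findnodes('/records/record')) {
    push @rows, [ $rec->getAttribute('id'), $rec->findvalue('name') ];
}

print join(',', @$_), "\n" for @rows;
```

Every place where the old code dereferences the XMLin structure would need to be rewritten along these lines, which is the re-writing effort mentioned above.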

--
mirod