XML Parsing too slow
on 19.11.2005 14:29:16 by krising
I have some files ranging from 6 meg to 15 meg that I need to process
to input into a database. I'm inheriting some old code using
XML::Simple and this process is taking forever (an hour or more). I
have eliminated the database as the time hog in this equation.
Is there an alternative XML module I should be using?
Re: XML Parsing too slow
on 19.11.2005 14:43:47 by stephane HAbeTT roux
jabby wrote:
> I have some files ranging from 6 meg to 15 meg that I need to process
> to input into a database. I'm inheriting some old code using
> XML::Simple and this process is taking forever (an hour or more). I
> have eliminated the database as the time hog in this equation.
>
> is there an alternative XML module I should be using?
XML::LibXML
--
|":._.:"| http://habett.com/
| (=) | http://habett.org/
| .:':. | I send the energy to my enemy
Re: XML Parsing too slow
on 21.11.2005 11:15:36 by Michel Rodriguez
jabby wrote:
> I have some files ranging from 6 meg to 15 meg that I need to process
> to input into a database. I'm inheriting some old code using
> XML::Simple and this process is taking forever (an hour or more). I
> have eliminated the database as the time hog in this equation.
>
> is there an alternative XML module I should be using?
Hi,
It really depends on the code, on why the process is slow and on the
effort you want to put into re-writing it.
Why is the code slow? Is the call to XMLin, which loads the data into a
Perl structure, slow? Or is the problem that the data takes up too much
space in memory and the system starts swapping? 6 to 15 MB is
not that much these days, so I am not sure the problem lies with
XML::Simple. What are you doing with the data that takes that long?
You should probably start by running the XMLin call by itself to see how
long it takes.
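
A minimal sketch of that measurement, assuming XML::Simple is installed. The helper name timed and the file name data.xml are placeholders invented for illustration, not part of the original code:

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper: run a code ref once and return
# (elapsed wall-clock seconds, whatever the code ref returned).
sub timed {
    my ($code) = @_;
    my $t0     = [gettimeofday];
    my $result = $code->();
    return (tv_interval($t0), $result);
}

# 'data.xml' is a placeholder; pass one of the real input files instead.
my $file = @ARGV ? $ARGV[0] : 'data.xml';

if (-e $file && eval { require XML::Simple; 1 }) {
    # Time only the parse step, with none of the later processing.
    my ($secs, $data) = timed(sub { XML::Simple::XMLin($file) });
    printf "XMLin alone: %.2f s for %d bytes\n", $secs, -s $file;
}
else {
    print "Need XML::Simple and an existing input file to time the parse.\n";
}
```

If this step takes seconds rather than most of the hour, the time is going into whatever the script does after the data has been loaded.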
BTW XML::LibXML is indeed faster than XML::Parser-based modules like
XML::Simple, but its interface is quite different: instead of working
with a Perl structure, you work with a DOM. So if most of the processing
happens once the data has already been loaded in memory, I am not sure
it will actually speed things up.
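
To give an idea of what working with a DOM looks like, here is a small sketch, assuming XML::LibXML is installed. The sample document and its element names (records, record, name) are made up for illustration and say nothing about the real data:

```perl
use strict;
use warnings;
use XML::LibXML;

# Made-up sample document; element and attribute names are illustrative only.
my $xml = <<'XML';
<records>
  <record id="1"><name>alpha</name></record>
  <record id="2"><name>beta</name></record>
</records>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# With XML::Simple you would walk a nested hash here;
# with XML::LibXML you query the DOM tree using XPath.
my @rows;
for my $rec ($doc->findnodes('/records/record')) {
    push @rows, [ $rec->getAttribute('id'), $rec->findvalue('./name') ];
}

# Print one tab-separated line per record.
print join("\t", @$_), "\n" for @rows;
```

Any code that currently reads the XML::Simple hash would have to be rewritten as node queries like these, which is the re-writing effort mentioned above.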
--
mirod