XML Parser

on 04.01.2008 12:49:58 by Richard Price

I am updating an XML parser developed by a third party. The XML below is
parsed by the system using the code that follows:


[XML markup stripped in the archive; only the element text survives]

15728499
FRED BLOGGS
15728498
JOHN SMITH
ABC LIMITED

$extract = parseXMLForLtd($xml, $type);
function parseXMLForLtd($xml="", $type) {
    preg_match_all("/\[…]/s", $xml, $reportData);
    $reportData[0] = preg_replace('/\n/i', '', $reportData[0]);
    $result = $reportData[0];
    for($i=0; $i < count($result); $i++) {
        $item = trim(stripslashes($result[$i]));

        if(eregi('[…]', $item) && eregi('[…] id="company identification">', $item)) {
            $extract[name] = preg_replace('/([<][\/a-zA-Z0-9 ="-]+[>])/i', '', $item);

However, when I try to parse the other field called "name" using:

        else if(eregi('[…]', $item) && eregi('[…]id="secretary">', $item)) {
            $extract[secretary] = preg_replace('/([<][\/a-zA-Z0-9 ="-]+[>])/i', '', $item);

It does not work. Could anyone please advise how I can extract "Fred
Bloggs"?

Thanks In Advance

Re: XML Parser

on 04.01.2008 14:03:08 by p.lepin

Richard Price wrote in
<13ns7b2tg7ug6fa@corp.supernews.com>:
> I am updating a xml parser developed by a third party. The
> xml below is parsed by the system using the code that
> follows:
>
> $extract = parseXMLForLtd($xml, $type);
> function parseXMLForLtd($xml="", $type) {
>     preg_match_all("/\[…]/s", $xml, $reportData);
>     $reportData[0] = preg_replace('/\n/i', '', $reportData[0]);
>     $result = $reportData[0];
>     for($i=0; $i < count($result); $i++) {
>         $item = trim(stripslashes($result[$i]));
>
>         if(eregi('[…]', $item) && eregi('[…] id="company identification">', $item)) {
>             $extract[name] = preg_replace('/([<][\/a-zA-Z0-9 ="-]+[>])/i', '', $item);

OMG.

> However when I try to parse a the other field called
> "name" using:
>
> else if(eregi('[…]', $item) && eregi('[…]id="secretary">', $item)) {
>     $extract[secretary] = preg_replace('/([<][\/a-zA-Z0-9 ="-]+[>])/i', '', $item);
>
> It does not work. Could anyone please advise how I can
> extract "Fred Bloggs"?

Parsing hierarchical markup languages using regexen is an
exercise in futility, if not worse.
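
To illustrate the point, here is a minimal sketch. The sample string is
hypothetical (nested section elements with id attributes, not the poster's
actual data). A non-greedy pattern stops at the first closing tag it meets,
so the match for the outermost section is cut short and the inner sections
never get a match of their own:

```php
<?php
// Hypothetical sample: three nested <section> elements.
$xml = '<section id="officers"><section id="secretary">'
     . '<section id="name">FRED BLOGGS</section></section></section>';

// Non-greedy: ".*?" ends at the FIRST "</section>", which belongs to the
// innermost element, not to the element whose opening tag was matched.
preg_match_all('/<section[^>]*>.*?<\/section>/s', $xml, $m);
$matches = $m[0];

// One unbalanced fragment comes back: three opening tags, one closing tag.
print_r($matches);
```

A greedy pattern has the opposite problem: it swallows everything up to the
last closing tag, so neither variant can pair each opening tag with its own
closing tag.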



An XPath expression fetching the node you need would be:

/Company/section[@id='officers']/
section[@id='secretary']/section[@id='name']/text()
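
A minimal sketch of evaluating that expression in PHP, assuming the DOM
extension is available. The sample document is hypothetical: the element and
attribute names follow the XPath above and the id="company identification"
attribute from the quoted code, but the exact layout, and the id="number"
element in particular, are guesses.

```php
<?php
// Hypothetical XML; layout assumed, not taken from the original post.
$xml = <<<XML
<Company>
  <section id="officers">
    <section id="secretary">
      <section id="number">15728499</section>
      <section id="name">FRED BLOGGS</section>
    </section>
  </section>
  <section id="company identification">
    <section id="name">ABC LIMITED</section>
  </section>
</Company>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Wrapping the path in string() makes evaluate() return the text of the
// first matching node as a plain PHP string.
$secretary = $xpath->evaluate(
    "string(/Company/section[@id='officers']/section[@id='secretary']/section[@id='name']/text())"
);
$company = $xpath->evaluate(
    "string(/Company/section[@id='company identification']/section[@id='name'])"
);

echo $secretary, "\n";
echo $company, "\n";
```

Both "name" fields are told apart by the path leading to them, so no
per-field pattern matching is needed.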

--
....also, I submit that we all must honourably commit seppuku
right now rather than serve the Dark Side by producing the
HTML 5 spec.