A cleaner way to do this?

on 25.03.2010 17:42:36 by Paul Halliday

I am working on a parser for logs from a spam firewall. The format is
predictable until it reaches a certain point. It then varies greatly.

There are two things I want to grab from this area: the size of the
message (if it exists) and the subject (if it exists).

The line might look something like this:

- 2 39 some.text.here SZ:1825 SUBJ: A subject here

but it could also look like this:

5 6 421 Error: timeout

or this:

5 6 421 Client disconnected

All I really want is the value for each, not the prefix stuff, which
means I still need more code below to strip it, yuck.

I am doing it like this:

$remainder = explode(" ", $theLine, 18);
$s_size = '/SZ:\d+/';
$s_subject = '/SUBJ:.+/';

preg_match("$s_size","$remainder[17]",$a);
preg_match("$s_subject","$remainder[17]",$b);

if (count($a) > 0) {
    $size = $a[0];
} else {
    $size = 0;
}

if (count($b) > 0) {
    $subject = $b[0];
} else {
    $subject = "-";
}

Is there any way to clean this up a bit?

thanks.

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

Re: A cleaner way to do this?

on 25.03.2010 17:48:15 by Richard Quadling

On 25 March 2010 16:42, Paul Halliday wrote:
> I am working on a parser for logs from a spam firewall. The format is
> predictable until it reaches a certain point. It then varies greatly.
>
> There are 2 things I want to grab from this area; the size of the
> message (if it exists) and the subject (if it exists)
>
> The line might look something like this:
>
> - 2 39 some.text.here SZ:1825 SUBJ: A subject here
>
> but it could also look like this:
>
> 5 6 421 Error: timeout
>
> or this:
>
> 5 6 421 Client disconnected
>
> All I really want is the value for each, not the prefix stuff. Which
> means I still need more below, yuck.
>
> I am doing it like this:
>
> $remainder = explode(" ", $theLine, 18);
> $s_size = '/SZ:\d+/';
> $s_subject = '/SUBJ:.+/';
>
> preg_match("$s_size","$remainder[17]",$a);
> preg_match("$s_subject","$remainder[17]",$b);
>
> if (count($a) > 0) {
>     $size = $a[0];
> } else {
>     $size = 0;
> }
>
> if (count($b) > 0) {
>     $subject = $b[0];
>
> } else {
>     $subject = "-";
> }
>
> Is there any way to clean this up a bit?
>
> thanks.

You can do it handraulically using string functions. You can also use
regular expressions.

If you can supply a sensible chunk, I can build a regex for you.
Indicate the exact elements you want to retrieve.

If you want to email me the log file directly that's fine.
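
For illustration, a minimal sketch of the regular-expression route. It assumes the SZ: and SUBJ: tokens look as they do in the sample line quoted above and that the subject runs to the end of the line. The function name extractSizeAndSubject is made up for this example; the fallback values (0 and "-") are the ones used in the original post.

```php
<?php
// Hypothetical helper (not from the thread): pulls the SZ: and SUBJ: values
// out of one log line, falling back to 0 and "-" when a token is absent.
function extractSizeAndSubject($line)
{
    // The capture groups keep only the values, not the "SZ:" / "SUBJ:" prefixes.
    $size = preg_match('/\bSZ:(\d+)/', $line, $m) === 1 ? (int) $m[1] : 0;
    $subject = preg_match('/\bSUBJ:\s*(.+)$/', $line, $m) === 1 ? trim($m[1]) : '-';

    return ['size' => $size, 'subject' => $subject];
}

$parsed = extractSizeAndSubject('- 2 39 some.text.here SZ:1825 SUBJ: A subject here');
// $parsed['size'] is 1825, $parsed['subject'] is 'A subject here'
```

Matching against the whole line rather than the 18th explode() field avoids depending on the field count, at the cost of possibly matching an SZ: that appears earlier in the line; whether that matters depends on the real log format.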

--
-----
Richard Quadling
"Standing on the shoulders of some very clever giants!"
EE : http://www.experts-exchange.com/M_248814.html
EE4Free : http://www.experts-exchange.com/becomeAnExpert.jsp
Zend Certified Engineer : http://zend.com/zce.php?c=ZEND002498&r=213474731
ZOPA : http://uk.zopa.com/member/RQuadling

Re: A cleaner way to do this?

on 25.03.2010 18:26:57 by Per Jessen

Paul Halliday wrote:

>
> Is there any way to clean this up a bit?
>

This is what I usually do:

if (($matches = preg_match($linePattern1, $text, $match)) > 0)
{
    // do stuff specific to $linePattern1
}
elseif (($matches = preg_match($linePattern2, $text, $match)) > 0)
{
    // do stuff specific to $linePattern2
}
elseif (($matches = preg_match($linePattern3, $text, $match)) > 0)
{
}
elseif (($matches = preg_match($linePattern4, $text, $match)) > 0)
{
}
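
The same first-match-wins chain can also be written as a loop over a table of patterns. This is only a sketch: the pattern names, the regexes, and the function classifyLine are invented here to fit the three sample lines from the original post, and real logs would need their own patterns.

```php
<?php
// Hypothetical pattern table: names and regexes are made up to fit the
// three sample lines quoted in this thread. Order matters: first match wins.
$linePatterns = [
    'sized'        => '/\bSZ:(?P<size>\d+)/',
    'error'        => '/\bError:\s*(?P<reason>.+)$/',
    'disconnected' => '/\bClient disconnected\b/',
];

// Returns [pattern name, matches] for the first pattern that matches the
// line, or null when none of them does.
function classifyLine($text, array $linePatterns)
{
    foreach ($linePatterns as $name => $pattern) {
        if (preg_match($pattern, $text, $match) === 1) {
            return [$name, $match];
        }
    }
    return null;
}

list($kind, $match) = classifyLine('5 6 421 Error: timeout', $linePatterns);
// $kind is 'error', $match['reason'] is 'timeout'
```

Adding a new line type then means adding one table entry instead of another else-if branch; the per-type handling can switch on the returned name.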



--
Per Jessen, Zürich (15.9°C)

