File::Find script questions

File::Find script questions

am 19.07.2011 03:59:12 von sono-io

I've written a script to traverse my web server, find any files =
called "error_log", e-mail them to me, and then delete them. It is =
triggered by cron twice a day.

The script works great but I'd like to get advice on how I can =
clean up my code, so any comments are welcome.

Also, do I really need the foreach block in there? I couldn't =
get it to work without it, but it seems like I should be able to. =3D:\

Thanks,
Marc

------------------------------------

#!/usr/bin/perl

use strict;
use warnings;

use File::Find;
use File::HomeDir;

my $path_to_search =3D File::HomeDir->my_home.'/public_html';
my $file_name =3D 'error_log';
my $from_address =3D 'xxx@xxx.com';
my $to_address =3D 'xxx@xxx.com';
my $mail_app =3D '/usr/sbin/sendmail';
my $subject =3D 'An "error_log" was found';

find(\&wanted, $path_to_search);

sub wanted {
if ($File::Find::name =3D~ /$file_name/) {
my $msg =3D "MIME-Version: 1.0\n".
"Content-Type: text/plain\n".
"To: $to_address\n".
"From: $from_address\n".
"Subject: $subject\n\n";

open (my $mail_fh, "|$mail_app -t -oi -oem") || die =
"Can't open sendmail!";
print {$mail_fh} "$msg";
print $msg;
open (my $file_fh, '<', $File::Find::name) || =
die "Can't open $file_name: $!";
my (@lines) =3D <$file_fh>;
foreach my $line (@lines) {
print "$line";
}
print {$mail_fh} "@lines";
close ($file_fh) || die "Can't close file";
close $mail_fh || die "Can't close mail_fh";
unlink ($File::Find::name);
}
}

----------------------------

In case anyone's interested, here's the shell script it replaces:

#!/bin/sh

## Variables ##
FILES=3D`find /home/user/public_html -name error_log`
ADDRESS=3Dxxx@xxx.com

for file in $FILES
do
if [ -e "$file" ]
then
mail -s "$file" $ADDRESS < $file
rm -r $file # delete log file
fi
done


--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
http://learn.perl.org/

Re: File::Find script questions

am 19.07.2011 04:42:56 von Hal Wigoda

Why clean it up when it works and is not obfusticating.

On Mon, Jul 18, 2011 at 8:59 PM, Marc wrote:
> =A0 =A0 =A0 =A0I've written a script to traverse my web server, find any =
files called "error_log", e-mail them to me, and then delete them. =A0It is=
triggered by cron twice a day.
>
> =A0 =A0 =A0 =A0The script works great but I'd like to get advice on how I=
can clean up my code, so any comments are welcome.
>
> =A0 =A0 =A0 =A0Also, do I really need the foreach block in there? =A0I co=
uldn't get it to work without it, but it seems like I should be able to. =
=3D:\
>
> Thanks,
> Marc
>
> ------------------------------------
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> use File::Find;
> use File::HomeDir;
>
> my $path_to_search =3D File::HomeDir->my_home.'/public_html';
> my $file_name =A0 =A0  = 'error_log';
> my $from_address =A0 =3D 'xxx@xxx.com';
> my $to_address =A0 =A0 =3D 'xxx@xxx.com';
> my $mail_app =A0 =A0 =A0 =3D '/usr/sbin/sendmail';
> my $subject =A0 =A0 =A0  = 'An "error_log" was found';
>
> find(\&wanted, $path_to_search);
>
> sub wanted {
> =A0 =A0 =A0 =A0if ($File::Find::name =3D~ /$file_name/) {
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0my $msg =3D "MIME-Version: 1.0\n".
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0"Conte=
nt-Type: text/plain\n".
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0"To: $=
to_address\n".
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0"From:=
$from_address\n".
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0"Subje=
ct: $subject\n\n";
>
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0open (my $mail_fh, "|$mail_app -t -oi -oem=
") || die "Can't open sendmail!";
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0print {$mail_fh} "$msg";
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0print $msg;
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0open (my $file_fh, '<', $F=
ile::Find::name) || die "Can't open $file_name: $!";
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0my (@lines=
) =3D <$file_fh>;
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0foreach my=
$line (@lines) {
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0 =A0print "$line";
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0}
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0print {$ma=
il_fh} "@lines";
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0close ($file_fh) || die "C=
an't close file";
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0close $mail_fh || die "Can't close mail_fh=
";
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0unlink ($File::Find::name);
> =A0 =A0 =A0 =A0}
> }
>
> ----------------------------
>
> In case anyone's interested, here's the shell script it replaces:
>
> #!/bin/sh
>
> ## Variables ##
> FILES=3D`find /home/user/public_html -name error_log`
> ADDRESS=3Dxxx@xxx.com
>
> for file in $FILES
> =A0do
> =A0 =A0if [ -e "$file" ]
> =A0 =A0then
> =A0 =A0 =A0mail -s "$file" $ADDRESS < $file
> =A0 =A0 =A0rm -r $file =A0# delete log file
> =A0 =A0fi
> =A0done
>
>
> --
> To unsubscribe, e-mail: beginners-unsubscribe@perl.org
> For additional commands, e-mail: beginners-help@perl.org
> http://learn.perl.org/
>
>
>



--=20
-----------------
Chicago
Hal Wigoda

--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
http://learn.perl.org/

Re: File::Find script questions

am 19.07.2011 05:35:17 von jwkrahn

Marc wrote:
> I've written a script to traverse my web server, find any files called
> "error_log", e-mail them to me, and then delete them. It is triggered
> by cron twice a day.
>
> The script works great but I'd like to get advice on how I can clean
> up my code, so any comments are welcome.
>
> Also, do I really need the foreach block in there? I couldn't get it
> to work without it, but it seems like I should be able to. =:\
>
> Thanks,
> Marc
>
> ------------------------------------
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> use File::Find;
> use File::HomeDir;
>
> my $path_to_search = File::HomeDir->my_home.'/public_html';

The use of File::HomeDir is great, if this program were running on a
variety of different operating systems, but your use of
'/usr/sbin/sendmail' implies that this is only running on a Unix-like
operating system so you probably only need $ENV{HOME}.


> my $file_name = 'error_log';
> my $from_address = 'xxx@xxx.com';
> my $to_address = 'xxx@xxx.com';
> my $mail_app = '/usr/sbin/sendmail';
> my $subject = 'An "error_log" was found';
>
> find(\&wanted, $path_to_search);
>
> sub wanted {
> if ($File::Find::name =~ /$file_name/) {

Why compare the full path using a regular expression? Just compare the
file name itself:

return if $_ ne $file_name;


> my $msg = "MIME-Version: 1.0\n".
> "Content-Type: text/plain\n".
> "To: $to_address\n".
> "From: $from_address\n".
> "Subject: $subject\n\n";

This variable doesn't change during inside the loop, so don't put it
inside the loop.


> open (my $mail_fh, "|$mail_app -t -oi -oem") || die "Can't open sendmail!";

You should include the $! variable in the error message so you know why
it failed. You should use the list form of pipe open so you don't
invoke a shell to run $mail_app:

open my $mail_fh, '|-', $mail_app, '-t', '-oi' '-oem' or die "Can't
open '$mail_app' because: $!";


> print {$mail_fh} "$msg";

No need to copy $msg to a string:

print $mail_fh $msg;


> print $msg;
> open (my $file_fh, '<', $File::Find::name) || die "Can't open $file_name: $!";
> my (@lines) =<$file_fh>;
> foreach my $line (@lines) {
> print "$line";
> }
> print {$mail_fh} "@lines";

No, you don't need a foreach loop. And no, you don't need to quote
variables (copy variable contents to strings.)

my @lines = <$file_fh>;
print @lines;
print $mail_fh @lines;


> close ($file_fh) || die "Can't close file";

You should include the $! variable in the error message so you know why
it failed.


> close $mail_fh || die "Can't close mail_fh";

You should verify that the pipe was closed correctly:

close $mail_fh or warn $! ? "Error closing '$mail_app' pipe: $!"
: "Exit status $? from '$mail_app'";


> unlink ($File::Find::name);

You should verify that unlink() worked correctly:

unlink( $File::Find::name ) or warn "Cannot unlink '$File::Find::name'
because: $!";


> }
> }
>
> ----------------------------
>
> In case anyone's interested, here's the shell script it replaces:
>
> #!/bin/sh
>
> ## Variables ##
> FILES=`find /home/user/public_html -name error_log`
> ADDRESS=xxx@xxx.com
>
> for file in $FILES
> do
> if [ -e "$file" ]
> then
> mail -s "$file" $ADDRESS< $file
> rm -r $file # delete log file
> fi
> done



John
--
Any intelligent fool can make things bigger and
more complex... It takes a touch of genius -
and a lot of courage to move in the opposite
direction. -- Albert Einstein

--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
http://learn.perl.org/

Re: File::Find script questions

am 19.07.2011 05:38:14 von Uri Guttman

>>>>> "HW" == Hal Wigoda writes:

HW> Why clean it up when it works and is not obfusticating.

because he asked for comments and it will be educational to all on the
list.

uri

--
Uri Guttman -- uri AT perlhunter DOT com --- http://www.perlhunter.com --
------------ Perl Developer Recruiting and Placement Services -------------
----- Perl Code Review, Architecture, Development, Training, Support -------

--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
http://learn.perl.org/

Re: File::Find script questions

am 19.07.2011 05:50:16 von Uri Guttman

>>>>> "M" == Marc writes:

M> The script works great but I'd like to get advice on how I can
M> clean up my code, so any comments are welcome.

M> Also, do I really need the foreach block in there? I couldn't
M> get it to work without it, but it seems like I should be able
M> to. =:\

M> use strict;
M> use warnings;

good.

M> use File::Find;
M> use File::HomeDir;

M> find(\&wanted, $path_to_search);

M> sub wanted {
M> if ($File::Find::name =~ /$file_name/) {

it would be much cleaner IMO to collect the paths of the files you want
and then processs those paths in a sub. the rule is generally collect
the data you need and process later and elsewhere. this is good for
several reasons. you could write each part to be more generic and
reusable. when they are tied like this you have only one use for the
code. also it make it easier to debug. you can test the mailer sub by
passing it a path and never executing the search part.

M> my $msg = "MIME-Version: 1.0\n".
M> "Content-Type: text/plain\n".
M> "To: $to_address\n".
M> "From: $from_address\n".
M> "Subject: $subject\n\n";

instead of multiple lines with . operators this is much easier with a
here doc. no quotes, no \n, no . ops.

my $msg = < MIME-Version: 1.0
Content-Type: text/plain
To: $to_address
From: $from_address
Subject: $subject

MSG

but beyond this calling sendmail directly isn't cool anymore. there are
many useful mail modules that make this easier and work with other
mailers or directly with smtp.


M> open (my $mail_fh, "|$mail_app -t -oi -oem") || die "Can't open sendmail!";
M> print {$mail_fh} "$msg";

don't quote scalar vars like that. it makes an unneeded copy and can be
a bug in some situations.

also you can get the file text and do one print to the mailer.

M> print $msg;
M> open (my $file_fh, '<', $File::Find::name) ||
M> die "Can't open $file_name: $!";

have to plug File::Slurp here. it can read the whole file into a scalar
and is much cleaner and faster than your code.


M> my (@lines) = <$file_fh>;
M> foreach my $line (@lines) {
M> print "$line";
M> }
M> print {$mail_fh} "@lines";

that quoted array will put a space between each line. it may not be what
you want. this should be all you need:

use File::Slurp ;
my $log_text = read_file( $File::Find::name ) ;

print $mail_fh $msg, $log_text ;
print $log_text ;

M> close ($file_fh) || die "Can't close file";

not needed with slurp.

M> close $mail_fh || die "Can't close mail_fh";
M> unlink ($File::Find::name);
M> }
M> }


M> ----------------------------

M> In case anyone's interested, here's the shell script it replaces:

if you simplified it to separate subs like i mention and use mailer
modules and file::slurp, it will be almost as short as the shell version.

uri

--
Uri Guttman -- uri AT perlhunter DOT com --- http://www.perlhunter.com --
------------ Perl Developer Recruiting and Placement Services -------------
----- Perl Code Review, Architecture, Development, Training, Support -------

--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
http://learn.perl.org/

Re: File::Find script questions

am 19.07.2011 19:03:00 von sono-io

Thanks to both Uri and John for their input. I've added your =
suggestions and the script is much cleaner and more concise now.

I decided to keep the header info in the sub because I'm now =
setting the subject to the path of the error log, so at a glance I know =
where there's a problem.

On Jul 18, 2011, at 8:50 PM, Uri Guttman wrote:

> it would be much cleaner IMO to collect the paths of the files you =
want
> and then process those paths in a sub. the rule is generally to =
collect
> the data you need and process later and elsewhere.

I'm sorry, but I'm not following you here. Could you explain a =
little more?

> but beyond this calling sendmail directly isn't cool anymore. there =
are
> many useful mail modules that make this easier and work with other
> mailers or directly with smtp.=20

I searched for mail modules and there seems to be a million of =
them. Could you recommend a couple that you like?

Thanks again,
Marc

------------------

#!/usr/bin/perl

use strict;
use warnings;

use File::Find;
use File::Slurp;

my $path_to_search =3D $ENV{HOME}.'/public_html';
my $file_name =3D 'error_log';
my $from_address =3D 'from@me.com';
my $to_address =3D 'to@you.com';
my $mail_app =3D '/usr/sbin/sendmail';


find(\&wanted, $path_to_search);

sub wanted {
return if $_ ne $file_name;

my $subject =3D substr $File::Find::name, =
length($path_to_search); # removes "/home/USER/public_html" from the =
subject
my $log_text =3D read_file($File::Find::name);
my $message =3D < MIME-Version: 1.0
Content-Type: text/plain
To: $to_address
From: $from_address
Subject: $subject

$log_text
MSG

open (my $mail_fh, '|-', $mail_app, '-t', '-oi', '-oem') or die =
"Can't open '$mail_app' because: $!";
print {$mail_fh} $message;
close ($mail_fh) or warn $! ? "Error closing '$mail_app' pipe: =
$!" : "Exit status $? from '$mail_app'";

unlink( $File::Find::name ) or die "Cannot unlink =
'$File::Find::name' because: $!";

}=

--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
http://learn.perl.org/

Re: File::Find script questions

am 19.07.2011 23:03:20 von Uri Guttman

>>>>> "M" == Marc writes:

M> Thanks to both Uri and John for their input. I've added your suggestions and the script is much cleaner and more concise now.
M> I decided to keep the header info in the sub because I'm now setting the subject to the path of the error log, so at a glance I know where there's a problem.

M> On Jul 18, 2011, at 8:50 PM, Uri Guttman wrote:

>> it would be much cleaner IMO to collect the paths of the files you want
>> and then process those paths in a sub. the rule is generally to collect
>> the data you need and process later and elsewhere.

M> I'm sorry, but I'm not following you here. Could you explain a
M> little more?

basically the rule is make subs/modules/whatever do something focused
and small and do it well. putting the mailer code inside the want sub is
doing two different things in one area. it can get confusing. maybe it
isn't confusing for this simpler case but things never stay simple. so
my idea is to break it up into two subs. the want sub just finds the
error log files and maybe keeps the path to the file. it can build up a
list of hashes for that (or just a list of paths). now this is simple,
focused, and easy to write and test. just a few lines of code inside the
want sub.

then you write a mailer sub that is passed that list of hashes. it knows
how to loop over that list and mail out the file in each iteration. it
just does the mailing. it can be again tested independently from the
find/want code, it can be rewritten to use slurp and other mail modules,
etc. it does ONE thing (mail out the error log) and does it well. it
doesn't care where or how the error log file was found.

this is a very clean separation of ideas. one part finds files to be
processed (mailed) and the other does the processing (mailing).

>> but beyond this calling sendmail directly isn't cool anymore. there are
>> many useful mail modules that make this easier and work with other
>> mailers or directly with smtp.

M> I searched for mail modules and there seems to be a million of
M> them. Could you recommend a couple that you like?

i happen to use Mail::Mailer. it supports all major mailers with a clean
api. others will point you to different modules. but calling sendmail
directly is nasty and not portable.

here is some code i use for that.

my $mailer = Mail::Mailer->new( 'qmail' ) ;

$mailer->open( {
To => $email_addr,
From => 'uri@ZZZZZZZZZZZZ.com',
Sender => 'uri@ZZZZZZZZZZZZZ.com',
Subject => 'foo bar',

} ) ;

my $foo_text = join '', map "$_\n", @new_foos ;

print $mailer $foo_text ;
$mailer->close() ;


M> find(\&wanted, $path_to_search);

M> sub wanted {
M> return if $_ ne $file_name;

M> my $subject = substr $File::Find::name, length($path_to_search); # removes "/home/USER/public_html" from the subject

bah. that is a very silly way to do that. also don't comment on the end
of a line as it word wraps and is harder to read.

(my $subject = $File::Find::name) =~ s/^$path_to_search// ;

but you should do what i said and just collect an array of
$File::Find::name values. then you can pass that to the mailing sub. or
remove the prefix path for all of them:

s/^$path_to_search// for @error_log_paths ;

then either loop over that and mail each file individually or have the
mailer sub do the looping.

uri

--
Uri Guttman -- uri AT perlhunter DOT com --- http://www.perlhunter.com --
------------ Perl Developer Recruiting and Placement Services -------------
----- Perl Code Review, Architecture, Development, Training, Support -------

--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
http://learn.perl.org/