Returning Hash Memory

on 11.12.2006 22:05:58 by unknown

I'm converting a program that continuously reads logs and converts
them into easily consumable csv output. The program runs
continuously -- meaning as long as the system stays up -- and
therefore can't continuously grow its memory.

The log file lines (they aren't sendmail but that is a good model)
contain a unique identifier like a queue id and there might be many
lines associated with a single transaction. The program has to keep
track of the data items it extracts from multiple lines and, when it
sees the last line in that sequence, dump the "csv" line. At that
point it "frees" the memory associated with the line.

The original was written in perl4 and it uses a sequence like this:

    while ($line = <>) {
        ($id, $val1, ..) = $line =~ /expression to parse line/;
        eval "\$$id{'val1'} = \$val1";
        eval "\$$id{'val2'} = \$val2";
        ....
        if (...transaction is complete) {
            dump transaction value;
            eval "undef \%$id";
        }
    }

This works fine. In fact, it works fine in perl4 and 5 BUT it is
very slow. So I rewrote it for perl5 -- basically got rid of the
evals so it now looks like:

    while ($line = <>) {
        ($id, $val1, ..) = $line =~ /expression to parse line/;
        $$id{'val1'} = $val1;
        $$id{'val2'} = $val2;
        ....
        if (...transaction is complete) {
            dump transaction value;
            undef %$id;
        }
    }

This appears to work but has a hitch: the program now grows its
memory at a fairly rapid pace. It seems clear that undef isn't
freeing the hash.

It looks like the undef on the soft reference doesn't decrement the
count on the underlying object.
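That can be checked directly. A minimal demo (the id value is invented) showing that `undef %$id` empties the hash but leaves its typeglob behind in the package symbol table, so the table grows by one entry per unique id:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Symbolic references live in the package symbol table (%main::).
# "undef %$id" clears the hash's contents, but the typeglob entry
# named after $id stays in %main::, so memory grows with every unique id.
my $glob_survives;
my $glob_freed;
{
    no strict 'refs';
    my $id = 'q4XYZ001';                    # invented queue id
    ${$id}{val1} = 'a';                     # autovivifies %main::q4XYZ001
    undef %{$id};                           # contents gone ...
    $glob_survives = exists $main::{$id};   # ... but the glob remains

    delete $main::{$id};                    # removing the stash entry frees it
    $glob_freed = !exists $main::{$id};
}
print "glob survives undef: ",    ($glob_survives ? "yes" : "no"), "\n";
print "glob gone after delete: ", ($glob_freed    ? "yes" : "no"), "\n";
```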

Any ideas?

Thanks

Re: Returning Hash Memory

on 11.12.2006 23:41:19 by Jim Gibson

In article <97hrn2psdjpl5eleb40bstc4tb4t3ab0o7@4ax.com>, Jim K wrote:

> I'm converting a program that continuously reads logs and converts
> them into easily consumable csv output. The program runs
> continuously -- meaning as long as the system stays up -- and
> therefore can't continuously grow its memory.
>
> The log file lines (they aren't sendmail but that is a good model)
> contain a unique identifier like a queue id and there might be many
> lines associated with a single transaction. The program has to keep
> track of the data items it extracts from multiple lines and, when it
> sees the last line in that sequence, dump the "csv" line. At that
> point it "frees" the memory associated with the line.
>
> The original was written in perl4 and it uses a sequence like this:
>
> while ($line =<>) {
> ($id, $val1, ..) = $line =~ /expression to parse line/;
> eval "\$$id{'val1'} = \$val1";
> eval "\$$id{'val2'} = \$val2";
> ....
> if (...transaction is complete) {
> dump transaction value;
> eval "undef \%$id";
> }
> }

This is using symbolic references. Try using a hash instead (untested):

$queue{$id} = [];
push( @{ $queue{$id} }, $val1 );
push( @{ $queue{$id} }, $val2 );
etc.

When you are done with the $id queue:

delete $queue{$id};

which should free up the memory for the $id queue.
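A runnable sketch of that approach, using a hash of hashes keyed by id (the line format, the field names, and the "status" completion marker are all invented for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %queue;    # one sub-hash per in-flight transaction, keyed by id
my @csv;      # finished output lines

# Invented sample input, format: "<id> <key>=<value>"
my @lines = (
    'q1 from=alice',
    'q1 to=bob',
    'q2 from=carol',
    'q1 status=sent',
    'q2 to=dave',
    'q2 status=deferred',
);

for my $line (@lines) {
    my ($id, $key, $val) = $line =~ /^(\S+)\s+(\S+)=(\S+)/ or next;

    $queue{$id}{$key} = $val;     # autovivifies the per-id hash

    if ($key eq 'status') {       # invented end-of-transaction test
        push @csv, join ',', $id,
            map { defined $queue{$id}{$_} ? $queue{$id}{$_} : '' }
                qw(from to status);
        delete $queue{$id};       # releases the whole per-id hash
    }
}
print "$_\n" for @csv;
```

Note that delete removes the key along with its value, so each completed transaction's storage can be reclaimed; undef on the sub-hash would only clear its contents.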

Re: Returning Hash Memory

on 12.12.2006 05:17:47 by someone

Jim K wrote:
> I'm converting a program that continuously reads logs and converts
> them into easily consumable csv output. The program runs
> continuously -- meaning as long as the system stays up -- and
> therefore can't continuously grow its memory.
>
> The log file lines (they aren't sendmail but that is a good model)
> contain a unique identifier like a queue id and there might be many
> lines associated with a single transaction. The program has to keep
> track of the data items it extracts from multiple lines and, when it
> sees the last line in that sequence, dump the "csv" line. At that
> point it "frees" the memory associated with the line.
>
> The original was written in perl4 and it uses a sequence like this:
>
> while ($line =<>) {
> ($id, $val1, ..) = $line =~ /expression to parse line/;
> eval "\$$id{'val1'} = \$val1";
> eval "\$$id{'val2'} = \$val2";
> ....
> if (...transaction is complete) {
> dump transaction value;
> eval "undef \%$id";
> }
> }
>
> This works fine. In fact, it works fine in perl4 and 5 BUT it is
> very slow. So I rewrote it for perl5 -- basically got rid of the
> evals so it now looks like:
>
> while ($line =<>) {
> ($id, $val1, ..) = $line =~ /expression to parse line/;
> $$id{'val1'} = $val1;
> $$id{'val2'} = $val2;

In Perl5, $$id{'val1'} is usually written as $id->{'val1'}.
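For a hard reference the two spellings address the same element; a small illustration (the variable and key are invented). With a plain string in $id, as in the original program, both forms are symbolic references and are rejected under `use strict`:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $id = { val1 => 'a' };        # a hard reference to an anonymous hash

my $old_style = $$id{'val1'};    # perl4-flavoured dereference
my $new_style = $id->{'val1'};   # arrow notation, idiomatic in perl5

print "same element\n" if $old_style eq $new_style;
```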

> ....
> if (...transaction is complete) {
> dump transaction value;
> undef %$id;
> }
> }
>
> THis appears to work but has a hitch ... the program now grows its
> memory at a fairly rapid pace. It seems clear that undef isn't
> freeing the hash --
>
> It looks like the undef on the soft reference doesn't decrement the
> count on the underlying object.
>
> Any ideas?

Start your program with:

use warnings;
use strict;


Then use lexical variables and you probably want a hash of arrays for your data.

    while ( my $line = <> ) {
        my ( $id, @vals ) = $line =~ /expression to parse line/;
        my %data = ( $id, \@vals );

        ....
        if (...transaction is complete) {
            dump transaction value;
        }
        # %data goes out of scope here and is destroyed
    }

John
--
Perl isn't a toolbox, but a small machine shop where you can special-order
certain sorts of tools at low cost and in short order. -- Larry Wall