Memory issues

on 29.03.2008 13:45:58 by JM

Based on the fact that perl contains many memory leaks,

a universal way to measure how much memory is malloc'ed is required.

Is there a standard way to measure how much memory a process has
allocated, one that works with Cygwin perl, ActivePerl, and Strawberry Perl?

This should help to locate the code that causes the memory leaks.

Re: Memory issues

on 29.03.2008 14:15:20 by Joost Diepenmaat

jm writes:

> Based on the fact that perl contains many memory leaks,

It doesn't.

> a universal way to measure how much memory is malloc'ed is required.

I don't understand what that means.

> Is there a standard way to measure how much memory a process has
> allocated, one that works with Cygwin perl, ActivePerl, and Strawberry Perl?

That's not nearly universal. See Win32::Process::Info.

> This should help to locate the code that causes the memory leaks.

It hardly would.

--
Joost Diepenmaat | blog: http://joost.zeekat.nl/ | work: http://zeekat.nl/
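
One rough way to watch memory use from inside a script, on systems that expose a Linux-style /proc, is to read the VmRSS line from /proc/<pid>/status. The following is a minimal sketch under that assumption. The helper names parse_vmrss_kb and current_rss_kb are made up for illustration, and the approach does not cover native Windows builds such as ActivePerl or Strawberry Perl, where a module like the Win32::Process::Info mentioned above would be needed instead.

```perl
use strict;
use warnings;

# Hypothetical helper: extract the resident set size (in kB) from the text
# of a Linux-style /proc/<pid>/status file. Returns undef if not found.
sub parse_vmrss_kb {
    my ($status_text) = @_;
    return $1 if $status_text =~ /^VmRSS:\s+(\d+)\s+kB/m;
    return undef;
}

# Hypothetical helper: read the status file of the current process.
# Returns undef where /proc is not available.
sub current_rss_kb {
    open my $fh, '<', "/proc/$$/status" or return undef;
    local $/;                      # slurp the whole file
    my $text = <$fh>;
    close $fh;
    return parse_vmrss_kb($text);
}

my $rss = current_rss_kb();
print defined $rss ? "RSS: $rss kB\n" : "RSS: not available on this system\n";
```

Calling current_rss_kb() before and after a suspect piece of code gives the same kind of before/after figures as the `ps v` calls used later in this thread, without spawning an external process.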

Re: Memory issues

on 29.03.2008 14:25:26 by smallpond

On Mar 29, 8:45 am, jm wrote:
> Based on the fact that perl contains many memory leaks,
>
> a universal way to measure how much memory is malloc'ed is required.
>
> Is there a standard way to measure how much memory a process has
> allocated, one that works with Cygwin perl, ActivePerl, and Strawberry Perl?
>
> This should help to locate the code that causes the memory leaks.

perldoc perlfaq3
See:
How can I make my Perl program take less memory?
How can I free an array or hash so my program shrinks?

Re: Memory issues

on 29.03.2008 14:55:47 by JM

Joost Diepenmaat wrote:
> jm writes:
>
>> Based on the fact that perl contains many memory leaks,
>
> It doesn't.

I wrote a code sample to illustrate the issue.

The code creates a string of 10 million characters. This is the only big
piece of data in the sample.

Then the main part of the code just modifies this data; that means
memory usage should (in my humble opinion) stay near 10 or 20 (or 40)
megabytes.

The main program does not manipulate the string directly, but calls the
functions aa and ab to do so. Those two functions just make
substitutions within the string.

After creating the first string, perl uses (around) 20 MB. That is okay.

Calling function aa (once or several times) causes a memory leak (or
memory footprint) of 150 MB.
I mean that once I have called this function I do not know how to free those
150 megabytes, but if I call the same function again I do not lose
more memory.

When I call function ab, which is quite similar to function aa,
I have the same memory issue, but with only 50 MB more.



Below are the output of the script and the script itself.
The system is Debian etch, with 512 MB of memory.

----- Result of script: ----------------------------------
/tmp$ perl essai.pl
10000001
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
20492 pts/1 R+ 0:00 0 1022 22977 20988 4.0 perl essai.pl

10000001
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
20492 pts/1 S+ 0:43 0 1022 159921 158132 30.5 perl essai.pl

10000001
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
20492 pts/1 R+ 0:57 0 1022 159921 158132 30.5 perl essai.pl
10000001
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
20492 pts/1 R+ 4:34 0 1022 218529 216740 41.9 perl essai.pl
10000001
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
20492 pts/1 R+ 8:10 0 1022 218529 216740 41.9 perl essai.pl


----- Script: ------------------------------------------


sub aa($)
{
    my ($d) = @_;
    $d =~ s/x(.....)/$1y/g ;
    $d =~ s/x(.....)/$1z/g ;
    $d =~ s/x(.....)/$1a/g ;
    $d =~ s/x(.....)/$1b/g ;
    $d =~ s/x(.....)/$1c/g ;
    return $d;
}

sub ab($)
{
    my ($d) = @_;
    $d =~ s/a(.....)/$1y/g ;
    $d =~ s/b(.....)/$1z/g ;
    $d =~ s/c(.....)/$1a/g ;
    $d =~ s/y(.....)/$1b/g ;
    $d =~ s/z(.....)/$1c/g ;
    return $d;
}


my $c= 'x' x (1000*1000*10) ;
$c .= "\x{1234}" ;
print length($c) ."\n" ;
my $v = qx( ps v $$ );
print "$v\n" ;
$c = aa($c);
print length($c) ."\n" ;
my $v = qx( ps v $$ );
print "$v\n" ;
$c = aa($c);
$c = aa($c);
$c = aa($c);
$c = aa($c);
$c = aa($c);
print length($c) ."\n" ;
my $v = qx( ps v $$ );
print $v;
$c = ab($c);
$c = ab($c);
$c = ab($c);
$c = ab($c);
$c = ab($c);
print length($c) ."\n" ;
my $v = qx( ps v $$ );
print $v;
$c = ab($c);
$c = ab($c);
$c = ab($c);
$c = ab($c);
$c = ab($c);
print length($c) ."\n" ;
my $v = qx( ps v $$ );
print $v;

Re: Memory issues

on 29.03.2008 15:10:16 by JM

smallpond wrote:
> On Mar 29, 8:45 am, jm wrote:
>> Based on the fact that perl contains many memory leaks,
>>
>> a universal way to measure how much memory is malloc'ed is required.
>>
>> Is there a standard way to measure how much memory a process has
>> allocated, one that works with Cygwin perl, ActivePerl, and Strawberry Perl?
>>
>> This should help to locate the code that causes the memory leaks.
>
>
> perldoc perlfaq3
> See:
> How can I make my Perl program take less memory?
> How can I free an array or hash so my program shrinks?

It is interesting, but it does not seem to solve my substitution issue.

However, I do not understand this:

« Memory allocated to lexicals (i.e. my() variables)
cannot be reclaimed or reused even if they go out of scope. It is
reserved in case the variables come back into scope. Memory allocated
to global variables can be reused (within your program) by using
undef()ing and/or delete(). »

Aren't my variables local variables?
Why aren't they freed when the function terminates?

Re: Memory issues

on 29.03.2008 15:27:14 by smallpond

On Mar 29, 10:10 am, jm wrote:
> It is interesting, but it does not seem to solve my substitution issue.
>
> However, I do not understand this:
>
> « Memory allocated to lexicals (i.e. my() variables)
> cannot be reclaimed or reused even if they go out of scope. It is
> reserved in case the variables come back into scope. Memory allocated
> to global variables can be reused (within your program) by using
> undef()ing and/or delete(). »
>
> Aren't my variables local variables?
> Why aren't they freed when the function terminates?


sub foo {
    my $v = 5;
    return \$v;
}

In C, once the function terminates $v is gone and a pointer
to it will fail. In perl this reference is legal and the
space will not be reclaimed.

In your sample of code above, when you pass a string to a sub,
perl will make a copy. If you pass a reference it will not.
This isn't a memory leak in perl, it's a memory leak in your
program.
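
The point about references can be sketched as follows: each call to foo creates a fresh $v, and the value stays alive for as long as something still holds a reference to it. This is only an illustration of the reference behaviour described above, not of the script under discussion.

```perl
use strict;
use warnings;

sub foo {
    my $v = 5;
    return \$v;    # the caller now holds a reference, so the value survives
}

my $r1 = foo();
my $r2 = foo();    # a second call gets its own, separate $v

$$r1 = 42;         # modifying one does not affect the other
print "r1=$$r1 r2=$$r2\n";    # prints "r1=42 r2=5"

undef $r1;         # dropping the last reference lets perl reclaim that value
```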

Re: Memory issues

on 29.03.2008 16:12:18 by JM

smallpond wrote:
> sub foo {
>     my $v = 5;
>     return \$v;
> }
>
> In C, once the function terminates $v is gone and a pointer
> to it will fail. In perl this reference is legal and the
> space will not be reclaimed.
>
> In your sample of code above, when you pass a string to a sub,
> perl will make a copy. If you pass a reference it will not.
> This isn't a memory leak in perl, it's a memory leak in your
> program.

As you suggested, I tried replacing the scalar with a reference, but this does
not seem to save memory (maybe 10 MB, that is, just the size of
the main variable):

--- results -------------------------
/tmp$ perl essai.pl && echo ok
10000001
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
31679 pts/1 R+ 0:00 0 1022 22977 20996 4.0 perl essai.pl

10000001
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
31679 pts/1 R+ 0:38 0 1022 150157 148372 28.7 perl essai.pl

10000001
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
31679 pts/1 R+ 0:50 0 1022 150157 148372 28.7 perl essai.pl

10000001
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
31679 pts/1 R+ 3:53 0 1022 198997 197212 38.1 perl essai.pl

10000001
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
31679 pts/1 R+ 6:55 0 1022 198997 197212 38.1 perl essai.pl

ok


--- script --------------------------
sub aa($)
{
    my ($d) = @_;
    $$d =~ s/x(.....)/$1y/g ;
    $$d =~ s/x(.....)/$1z/g ;
    $$d =~ s/x(.....)/$1a/g ;
    $$d =~ s/x(.....)/$1b/g ;
    $$d =~ s/x(.....)/$1c/g ;
    return $d;
}

sub ab($)
{
    my ($d) = @_;
    $$d =~ s/a(.....)/$1y/g ;
    $$d =~ s/b(.....)/$1z/g ;
    $$d =~ s/c(.....)/$1a/g ;
    $$d =~ s/y(.....)/$1b/g ;
    $$d =~ s/z(.....)/$1c/g ;
    return $d;
}


my $s= 'x' x (1000*1000*10) ;
$s .= "\x{1234}" ;
my $c = \$s;
print length($$c) ."\n" ;
my $v = qx( ps v $$ );
print "$v\n" ;
$c = aa($c);
print length($$c) ."\n" ;
my $v = qx( ps v $$ );
print "$v\n" ;
$c = aa($c);
$c = aa($c);
$c = aa($c);
$c = aa($c);
$c = aa($c);
print length($$c) ."\n" ;
my $v = qx( ps v $$ );
print "$v\n" ;
$c = ab($c);
$c = ab($c);
$c = ab($c);
$c = ab($c);
$c = ab($c);
print length($$c) ."\n" ;
my $v = qx( ps v $$ );
print "$v\n" ;
$c = ab($c);
$c = ab($c);
$c = ab($c);
$c = ab($c);
$c = ab($c);
print length($$c) ."\n" ;
my $v = qx( ps v $$ );
print "$v\n" ;

Re: Memory issues

on 29.03.2008 16:23:58 by jurgenex

jm wrote:
>Joost Diepenmaat wrote:
>> jm writes:
>>
>>> Based on the fact that perl contains many memory leaks,
>>
>> It doesn't.
>
>I wrote a code sample to illustrate the issue.
>
>The code creates a string of 10 million characters. This is the only big
>piece of data in the sample.

Which you subsequently copy a few times.

>Then the main part of the code just modifies this data; that means
>memory usage should (in my humble opinion) stay near 10 or 20 (or 40)
>megabytes.
>
>The main program does not manipulate the string directly, but calls the
>functions aa and ab to do so. Those two functions just make
>substitutions within the string.

No, they don't modify the string at all, they modify a _copy_ of the
original string.

>----- Script: ------------------------------------------
>
>
>sub aa($)
>{
> my ($d) = @_;

And here you create a copy of the original string.

> $d =~ s/x(.....)/$1y/g ;
> $d =~ s/x(.....)/$1z/g ;
> $d =~ s/x(.....)/$1a/g ;
> $d =~ s/x(.....)/$1b/g ;
> $d =~ s/x(.....)/$1c/g ;
> return $d;

You return that copy ...

>}
>
>
>my $c= 'x' x (1000*1000*10) ;
>$c .= "\x{1234}" ;
>print length($c) ."\n" ;
>my $v = qx( ps v $$ );
>print "$v\n" ;
>$c = aa($c);

... and you save that copy in $c, so the memory cannot be reused.

The rest of the code seems to repeat that action several times, using
successively updated versions of the string as the function argument, so
that new copies of the string are created each time.

jue
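
The copying described above can be made visible with a small sketch. It relies on the documented fact that the elements of @_ are aliases to the caller's arguments, while `my ($d) = @_;` copies the value. The sub names by_copy and in_place are made up for this illustration.

```perl
use strict;
use warnings;

# Copies the argument into $d, modifies the copy, returns the copy.
sub by_copy {
    my ($d) = @_;
    $d =~ s/x/y/g;
    return $d;
}

# Works on $_[0], which is an alias to the caller's variable:
# no explicit copy of the string is made.
sub in_place {
    $_[0] =~ s/x/y/g;
    return;
}

my $s = 'x' x 10;

my $t = by_copy($s);
print "after by_copy:  s=$s t=$t\n";    # $s is unchanged, $t is the modified copy

in_place($s);
print "after in_place: s=$s\n";         # $s itself has been modified
```

This only removes the explicit copy made by the assignment; it says nothing about temporary buffers that the substitution itself may allocate.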

Re: Memory issues

on 29.03.2008 16:53:08 by JM

After modifying the script a little again and checking execution with
ltrace, I observed that malloc is called 1871 times while free is called
only 922 times.
Isn't that an issue?

I just replaced
my $s= 'x' x (1000*1000*10) ;
by
my $s= 'x' x (10) ;

and did:


/tmp$ ltrace perl essai.pl 2>&1 | sed 's/(.*//g' | sort | uniq -c |
grep 'malloc\|free'
922 free
1871 malloc




Re: Memory issues

on 29.03.2008 16:58:51 by smallpond

On Mar 29, 11:12 am, jm wrote:
> As you suggested, I tried replacing the scalar with a reference, but this does
> not seem to save memory (maybe 10 MB, that is, just the size of
> the main variable):

The sub call was just answering your question about
locals.

Each of these:
$$d =~ s/x(.....)/$1a/g ;

is making string copies in $1. $1 is a persistent
variable. perl 5.10 has new regex syntax for
avoiding use of $1, $2 etc.

Re: Memory issues

on 30.03.2008 00:40:36 by JM

jm wrote:
>
> After modifying the script a little again and checking execution with
> ltrace, I observed that malloc is called 1871 times while free is called
> only 922 times.
> Isn't that an issue?
>
> and did:

ltrace perl essai.pl 2>&1 | grep 'malloc\|free\|realloc' | perl observe_malloc_free.pl > memory.log

with observe_malloc_free.pl at the bottom of this mail.

Here is the resulting list of memory leaks:

One NULL is freed, but that's not a memory leak!

The rest can be read like this:
7 blocks of 10 MB are not freed.
1 block of 13 MB is not freed.
8272 blocks of 4 bytes are not freed.
9843 blocks of 4080 bytes are not freed.

But I still do not know why...

/tmp$ cat memory.log | sed 's/.*=>//g' | sort | uniq -c
7 10
10 100
7 10000004
22 11
4 112
10 116
1 1192
26 12
1 124
4 128
12 13
1 131716
1 13334528
13 14
3 140
6 15
28 16
9 17
14 18
8 19
19 2
14 20
1 2048
6 21
5 22
3 23
230 24
1 240
6 25
7 256
4 27
1 2712
143 28
4 3
2 30
1 31
125 32
2 33
2 34
1 36
8272 4
3 40
1 4048
1 4064
9843 4080
8 4096
2 4373
1 44
1 45
78 48
1 49156
2 496
72 5
1 50
1 512
65 52
50 56
1 58
6 6
1 628
1 635
13 64
18 7
1 76
45 8
1 80
1 8080
2 84
4 88
74 9
1 98
1 freeing not allocated memory : NULL :




-- observe...pl --------------------------

# Maps the address of each block that is still allocated to its size.
my %hash = ();


while (<>)
{
    my $line = $_;
    #print "jmg:" . $line;
    if ( $line =~ m/malloc\(([0-9]*)\).*= *([0-9xa-fNUL]*)/ )
    {
        my $size = $1;
        my $ad = $2;
        #print "malloc : $1 : $2 : \n" ;
        if ( defined ( $hash{$ad}) )
        {
            print "redundant malloc : $1 : $2 : \n" ;
        }
        $hash{$ad} = $size;
    }
    elsif ( $line =~ m/realloc\(([0-9xa-fNUL]*) *, *([0-9]*)\).*= *([0-9xa-fNUL]*)/ )
    {
        my $adp = $1;
        my $size = $2;
        my $ad = $3;

        # The old block is released and replaced by the new one.
        if ( not defined ( $hash{$adp}) )
        {
            print "realloc not allocated memory : $adp : \n" ;
        }
        delete $hash{$adp} ;

        #print "malloc : $1 : $2 : \n" ;
        if ( defined ( $hash{$ad}) )
        {
            print "redundant malloc : $size : $ad : \n" ;
        }
        $hash{$ad} = $size;
    }
    elsif ( $line =~ m/free\(([0-9xa-fNUL]*)/ )
    {
        my $ad = $1;
        #print "free : $1 : \n" ;
        if ( not defined ( $hash{$ad}) )
        {
            print "freeing not allocated memory : $1 : \n" ;
        }
        delete $hash{$ad} ;
    }
    else
    {
        print "???:" . $line;
    }

}

# Whatever is left was allocated but never freed.
foreach my $key ( keys ( %hash) )
{
    print $key . ' => ' . $hash{$key} . "\n" ;
}

Re: Memory issues

on 30.03.2008 00:52:45 by smallpond

On Mar 29, 7:40 pm, jm wrote:
> jm wrote:
>
> > After modifying the script a little again and checking execution with
> > ltrace, I observed that malloc is called 1871 times while free is called
> > only 922 times.
> > Isn't that an issue?
>
> > and did:
>
> ltrace perl essai.pl 2>&1 | grep 'malloc\|free\|realloc' | perl observe_malloc_free.pl > memory.log
>
> with observe_malloc_free.pl at the bottom of this mail.
>
> Here is the resulting list of memory leaks:
>
> One NULL is freed, but that's not a memory leak!
>
> The rest can be read like this:
> 7 blocks of 10 MB are not freed.
> 1 block of 13 MB is not freed.
> 8272 blocks of 4 bytes are not freed.
> 9843 blocks of 4080 bytes are not freed.
>
> But I still do not know why...


I don't know much about the perl garbage collector,
but memory is not freed immediately when the ref
count goes to 0. When I run your program and watch
with top, VM goes to 200 MB and stays there for the
whole run. That seems to be some upper bound where
the garbage collector is running. Memory use does
not continue to go up.

Re: Memory issues

on 30.03.2008 01:12:59 by JM

smallpond wrote:
> I don't know much about the perl garbage collector,
> but memory is not freed immediately when the ref
> count goes to 0. When I run your program and watch
> with top, VM goes to 200 MB and stays there for the
> whole run. That seems to be some upper bound where
> the garbage collector is running. Memory use does
> not continue to go up.

This is because I have only 500 MB on my computer.
So I made a Perl demo program that works within this limit.

Instead of a 10 MB string, you can (try to) use a 40 MB string,
or a 100 MB string.

And then you will see whether the garbage collector starts at 200 MB
... or not.

All I showed with ltrace and observe_malloc_free.pl is that when
the program stops, the garbage collector has not collected all the garbage.

Re: Memory issues

on 30.03.2008 01:13:14 by Joost Diepenmaat

jm writes:

> Modifying a little bit again the script, and checking execution with
> ltrace, I observed malloc is called 1871 times when free is just called
> 922 times.
> Isn't it an issue?

Please keep in mind that perl's memory allocation strategy in general is
optimized for longer running programs, not for one-off scripts (which
makes sense, since one-off scripts don't usually need the performance
gains). This means that for instance subroutines will get memory
allocated on the assumption that they'll be called again, and will take
about as much memory the next time.

This is NOT a memory leak per se, but it does mean that if you have a
subroutine that takes 100Mb to complete, your program will take that
memory and probably not give it back until the program ends. IOW, if you
have a long-running program that only means you need 100Mb for it to
run, it does NOT mean it takes a 100Mb for each call.

In your test case, don't assume that just because the regular expression
replacements don't in theory *need* to use any additional RAM, they
won't. Especially not if you're using UTF-8 encoded strings (which you
are). Perl algorithms tend to exchange RAM for speed in most cases
anyway, and replacing a match with a new string of exactly the same
length in bytes in a unicode string is a pretty uncommon use-case, so
it's likely not optimized.

Anyway, I've not seen a serious memory leak in perl itself in ages, and
I run perl processes that use up to 8 Gb of RAM and run for months
without issues.

--
Joost Diepenmaat | blog: http://joost.zeekat.nl/ | work: http://zeekat.nl/

Re: Memory issues

on 30.03.2008 11:28:52 by JM

Joost Diepenmaat wrote:
> jm writes:
>
>> Modifying a little bit again the script, and checking execution with
>> ltrace, I observed malloc is called 1871 times when free is just called
>> 922 times.
>> Isn't it an issue?
>
> Please keep in mind that perl's memory allocation strategy in general is
> optimized for longer running programs, not for one-off scripts (which
> makes sense, since one-off scripts don't usually need the performance
> gains).

I have not read this anywhere, neither in the documentation nor in the FAQ.
Maybe a FAQ entry about memory would be helpful?

> This means that for instance subroutines will get memory
> allocated on the assumption that they'll be called again, and will take
> about as much memory the next time.

But when the same routine is called several times with different kinds and
sizes of data, sometimes it consumes a lot of memory and sometimes
less.
If your code contains a hundred functions and twenty of them consume
200 MB each (in a similar way to aa in my example), then a computer
with 2 GB will not be enough to run it.

> This is NOT a memory leak per se, but it does mean that if you have a
> subroutine that takes 100Mb to complete, your program will take that
> memory and probably not give it back until the program ends.

I understand this. But I'd like to have the opposite feature,
or at least a Perl function to release the memory used by a function
(or a package).

> IOW, if you
> have a long-running program that only means you need 100Mb for it to
> run, it does NOT mean it takes a 100Mb for each call.

Yes, that is what I observed.

But this means it is not possible to free the memory consumed by one
function when you know you need memory in another function.

> In your test case, don't assume that just because the regular expression
> replacements don't in theory *need* to use any additional RAM, they
> won't. Especially not if you're using UTF-8 encoded strings (which you
> are). Perl algorithms tend to exchange RAM for speed in most cases
> anyway, and replacing a match with a new string of exactly the same
> length in bytes in a unicode string is a pretty uncommon use-case, so
> it's likely not optimized.

I do not think that is the issue here.

> Anyway, I've not seen a serious memory leak in perl itself in ages, and
> I run perl processes that use up to 8 Gb of RAM and run for months
> without issues.

This means you have a computer with 8 GB of RAM.

But if perl used memory in a better way, maybe the same
programs would work on a computer with 512 MB of RAM.

Re: Memory issues

on 30.03.2008 14:37:41 by hjp-usenet2

On 2008-03-29 13:55, jm wrote:
> Joost Diepenmaat wrote:
>> jm writes:
>>
>>> Based on the fact that perl contains many memory leaks,
>>
>> It doesn't.
>
> I wrote a code sample to illustrate the issue.
[...]
> Calling function aa (once or several times) causes a memory leak (or
> memory footprint) of 150 MB.
> I mean that once I have called this function I do not know how to free those
> 150 megabytes, but if I call the same function again I do not lose
> more memory.

Then it's not a memory leak. A memory leak is when memory which has been
allocated cannot be (re)used. But in your case it can be reused (and is,
if you call the function again).

Perl is certainly wasteful with memory - the data structures have a lot
of overhead, and it often doesn't free memory because it might need it
again later - but AFAIK perl itself doesn't leak. (Perl programs often
leak - the garbage collector cannot detect cycles, for example, so
the programmer has to remember to break them.)

hp
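
A reference cycle of the kind mentioned above, and one common way to break it, can be sketched with weaken from Scalar::Util (a core module). The Node package and its $destroyed counter are made up for this illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

package Node;
our $destroyed = 0;
sub new     { my ($class) = @_; return bless {}, $class }
sub DESTROY { $destroyed++ }

package main;

# Two objects that point at each other: their reference counts never
# reach zero, so they are not freed when the block ends.
{
    my $x = Node->new;
    my $y = Node->new;
    $x->{peer} = $y;
    $y->{peer} = $x;
}
print "after cycle: $Node::destroyed destroyed\n";           # 0: both objects leaked

# Same structure, but one link is weakened so it no longer counts
# as a reference.
{
    my $x = Node->new;
    my $y = Node->new;
    $x->{peer} = $y;
    $y->{peer} = $x;
    weaken($y->{peer});
}
print "after weakened cycle: $Node::destroyed destroyed\n";  # 2: both were freed
```

The leaked pair from the first block is only cleaned up when the interpreter exits.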

Re: Memory issues

on 30.03.2008 14:40:44 by hjp-usenet2

On 2008-03-30 00:13, Joost Diepenmaat wrote:
> jm writes:
>> Modifying a little bit again the script, and checking execution with
>> ltrace, I observed malloc is called 1871 times when free is just called
>> 922 times.
>> Isn't it an issue?
>
> Please keep in mind that perl's memory allocation strategy in general is
> optimized for longer running programs, not for one-off scripts (which
> makes sense, since one-off scripts don't usually need the performance
> gains).

Or maybe it is optimized for one-off scripts? One-off scripts rarely
need to worry about hogging memory and it is certainly faster to let the
OS free all the memory at once than to call free a gazillion times.

hp

Re: Memory issues

on 30.03.2008 14:55:00 by hjp-usenet2

On 2008-03-30 09:28, jm wrote:
> Joost Diepenmaat wrote:
>> jm writes:
>>> Modifying a little bit again the script, and checking execution with
>>> ltrace, I observed malloc is called 1871 times when free is just called
>>> 922 times.
>>> Isn't it an issue?
[...]
>> IOW, if you
>> have a long-running program that only means you need 100Mb for it to
>> run, it does NOT mean it takes a 100Mb for each call.
>
> Yes, that is what I observed.
>
> But this means it is not possible to free the memory consumed by one
> function when you know you need memory in another function.

You can. Perl keeps around the lexical variables, but not any objects
they point to. So avoid large scalar, hash or array variables in subs
and use references instead. For example, compare the behaviour of

#!/usr/bin/perl
use warnings;
use strict;

print a(@ARGV), "\n";
exit 0;

sub a {
    print "entering a @_\n";
    my ($n) = @_;

    my $s1 = "a" x $n;
    my $s2 = "b" x $n;

    my $rc = length($s1 . $s2);

    print "leaving a @_\n";
    return $rc;
}

and

#!/usr/bin/perl
use warnings;
use strict;

print a(@ARGV), "\n";
exit 0;

sub a {
    print "entering a @_\n";
    my ($n) = @_;

    my $s;
    $s->[1] = "a" x $n;
    $s->[2] = "b" x $n;

    my $rc = length($s->[1] . $s->[2]);

    print "leaving a @_\n";
    return $rc;
}

(And of course "length($s->[1] . $s->[2])" is (intentionally) stupid -
replace it with "length($s->[1]) + length($s->[2])")


>> Anyway, I've not seen a serious memory leak in perl itself in ages, and
>> I run perl processes that use up to 8 Gb of RAM and run for months
>> without issues.
>
> This means you have a computer with 8 GB of RAM.
>
> But if perl used memory in a better way, maybe the same
> programs would work on a computer with 512 MB of RAM.

Yes. But memory allocated to lexicals is usually the least of your
worries in this case. The overhead of typical perl data structures is
much worse. (Just this month I reduced the memory consumption of a
program from about 3 GB (which meant that it crashed sometimes, since
that's the limit on 32bit linux) to less than one GB by replacing an
anonymous array with a string (which I always had to unpack and repack
to access and manipulate the data within, which is ugly, but not really
slower than accessing the array).)

hp
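
The array-to-string replacement mentioned above might look roughly like this. It is a guess at the general technique, not the actual program: a list of unsigned 32-bit integers is kept in one packed string and accessed with the built-in vec(), so there is a single scalar instead of one scalar per element. The helper names set_elem and get_elem are hypothetical.

```perl
use strict;
use warnings;

# All elements live in this one string, 4 bytes per element.
my $packed = '';

# Hypothetical accessors: element $i occupies bytes 4*$i .. 4*$i+3.
sub set_elem { my ($i, $value) = @_; vec($packed, $i, 32) = $value; return }
sub get_elem { my ($i) = @_; return vec($packed, $i, 32) }

set_elem($_, $_ * $_) for 0 .. 999;

print "element 12: ", get_elem(12), "\n";               # prints "element 12: 144"
print "bytes used by data: ", length($packed), "\n";    # prints "bytes used by data: 4000"
```

The trade-off is the one described above: every access goes through an accessor, and the elements are limited to one fixed-width numeric type.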

Re: Memory issues

on 30.03.2008 15:02:27 by hjp-usenet2

On 2008-03-29 23:52, smallpond wrote:
> I don't know much about the perl garbage collector,
> but memory is not freed immediately when the ref
> count goes to 0.

This is wrong. When the ref count goes to zero, perl immediately calls
free.

free may decide to keep the memory around for a subsequent malloc call,
but that doesn't have anything to do with perl - only with your system's
malloc/free implementation (if you use the system's malloc, which is the
default on most platforms, I think).

> When I run your program and watch with top, VM goes to 200 MB and
> stays there for the whole run. That seems to be some upper bound
> where the garbage collector is running.

Perl doesn't have a garbage collector which runs periodically.

hp
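
That values are released as soon as the last reference goes away, rather than at some later collection point, can be observed through a destructor. A minimal sketch; the Tracker package is made up for the illustration. It shows when perl releases the value, and says nothing about whether the C library then returns the memory to the operating system.

```perl
use strict;
use warnings;

package Tracker;
# Each object records its own destruction in a log array owned by the caller.
sub new     { my ($class, $log) = @_; return bless { log => $log }, $class }
sub DESTROY { my ($self) = @_; push @{ $self->{log} }, 'destroyed' }

package main;

my @log;
my $obj = Tracker->new(\@log);
push @log, 'before undef';
undef $obj;                       # the last reference goes away here
push @log, 'after undef';

print join(', ', @log), "\n";     # prints "before undef, destroyed, after undef"
```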

Re: Memory issues

on 30.03.2008 15:10:42 by Joost Diepenmaat

"Peter J. Holzer" writes:

> Or maybe it is optimized for one-off scripts? One-off scripts rarely
> need to worry about hogging memory and it is certainly faster to let the
> OS free all the memory at once than to call free a gazillion times.

True enough. In any case, as others mentioned, the perl interpreter is
pretty wasteful with memory in most cases where it could trade off
between speed and memory efficiency.

--
Joost Diepenmaat | blog: http://joost.zeekat.nl/ | work: http://zeekat.nl/

Re: Memory issues

on 30.03.2008 15:17:00 by Joost Diepenmaat

jm writes:

>> Anyway, I've not seen a serious memory leak in perl itself in ages, and
>> I run perl processes that use up to 8 Gb of RAM and run for months
>> without issues.
>
> This means you have a computer with 8 GB of RAM.

16 Gb, actually.

> But if perl used memory in a better way, maybe the same
> programs would work on a computer with 512 MB of RAM.

No, because the data structures stored in the program really take up
that much memory (though memory usage could conceivably be reduced, if
we really needed to, but that's not a top priority for us). The tradeoff
here was also: speed vs memory use, and speed really was the top
priority (which also means significant pieces of the program are
actually written in C++).

The point being, these programs handle LOTS of data and run for months
without leaking - their memory use is about as stable as you'd expect. A
real leak is where repeated operation will continually increase the
amount of memory needed, which isn't what your test case is doing, and
as I said, it's not something I've seen (in the perl interpreter) in
quite a while.

--
Joost Diepenmaat | blog: http://joost.zeekat.nl/ | work: http://zeekat.nl/