Odd (undocumented?) behavior of RAM file within a loop

on 29.01.2008 19:53:48 by David Filmer

Greetings. I am writing a program which processes some files within a
loop, and I am using the ability (as of Perl 5.8) to open a filehandle
on a scalar reference (which I recycle in the loop). Kindly consider
this stripped-down trivial code example which illustrates my question:

#!/usr/bin/perl
use strict; use warnings;

foreach (1..3) {
    warn "Pass #$_\n";
    my $file_in_memory;
    open(my $fh, '>', \$file_in_memory) or die "Oops - $!\n";
    close $fh;
}

__END__


It works fine for the first pass. However, subsequent passes emit warnings:

Pass #1
Pass #2
Use of uninitialized value in open at /perl/blah line 7.
Pass #3
Use of uninitialized value in open at /perl/blah line 7.

(line 7 is the "open" statement). Sure enough, if I make this change to
"initialize" the scalar:

my $file_in_memory = '';

then I have no problems (except I bust the second rule of chapter 4 in
Dr. Damian's _PBP_ - yikes).
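
A minimal sketch of that workaround, for illustration only: it assumes the buffer is written to and read back after the close, which the stripped-down example above does not do. The @results array and the text printed to the handle are made up for the sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @results;    # hypothetical: collects what each pass wrote

foreach my $pass (1 .. 3) {
    # Start from a defined (empty) string so open() has nothing to warn about.
    my $file_in_memory = q{};

    open(my $fh, '>', \$file_in_memory) or die "Oops - $!\n";
    print {$fh} "data for pass $pass\n";
    close $fh or die "Oops - $!\n";

    # After the close, the scalar holds everything written to the handle.
    push @results, $file_in_memory;
}

print @results;
```

Declaring the scalar inside the loop still gives each pass its own logical buffer; the only change is that it starts out defined.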

Although the problem is easy enough to work around, I wonder why it
behaves this way (which I don't see documented). The really strange
thing (to me) is that it doesn't complain on the first pass. Any ideas?

Thx!

--
David Filmer (http://DavidFilmer.com)

Re: Odd (undocumented?) behavior of RAM file within a loop

on 29.01.2008 20:20:32 by John Bokma

David Filmer wrote:

> my $file_in_memory = '';
>
> then I have no problems (except I bust the second rule of chapter 4 in
> Dr. Damian's _PBP_ - yikes).


my $file_in_memory = q{}; # Empty string


It will probably take quite some time before I get used to that one.

--
John

http://johnbokma.com/mexit/

Re: Odd (undocumented?) behavior of RAM file within a loop

on 29.01.2008 21:53:37 by 1usa

David Filmer wrote in
news:2eOdndq_7J5h6ALaRVn_vwA@giganews.com:

> Greetings. I am writing a program which processes some files within a
> loop, and I am using the ability (as of Perl 5.8) to open a filehandle
> on a scalar reference (which I recycle in the loop). Kindly consider
> this stripped-down trivial code example which illustrates my question:
>
> #!/usr/bin/perl
> use strict; use warnings;
>
> foreach (1..3) {
>     warn "Pass #$_\n";
>     my $file_in_memory;
>     open(my $fh, '>', \$file_in_memory) or die "Oops - $!\n";
>     close $fh;
> }
>
> __END__
>
>
> It works fine for the first pass. However, subsequent passes emit
> warnings:
>
> Pass #1
> Pass #2
> Use of uninitialized value in open at /perl/blah line 7.
> Pass #3
> Use of uninitialized value in open at /perl/blah line 7.
>
> (line 7 is the "open" statement). Sure enough, if I make this change
> to "initialize" the scalar:

I found it curious that the warning disappears if I comment out the
close statement (on AS Perl 5.10 on Win32).

So, I looked a little closer:

C:\Temp> cat t.pl
#!/usr/bin/perl

use strict;
use warnings;

for my $pass ( 1 .. 3 ) {
    warn "Pass #$pass\n";
    open my $fh, '>', \(my $buffer) or die $!;
    print \$buffer, "\n";
    close $fh or die $!
}

__END__

C:\DOCUME~1\asu1\LOCALS~1\Temp> t
Pass #1
SCALAR(0x182b04c)
Pass #2
Use of uninitialized value $buffer in open at C:\Temp\t.pl line 8.
SCALAR(0x182b04c)
Pass #3
Use of uninitialized value $buffer in open at C:\Temp\t.pl line 8.
SCALAR(0x182b04c)

With the close statement commented out, I get:

C:\Temp> t
Pass #1
SCALAR(0x182b04c)
Pass #2
SCALAR(0x22ae14)
Pass #3
SCALAR(0x22ae24)

Note that the explicit close results in the reference to the same
memory area being reused whereas, without the explicit close, the
reference changes.

I do not know if this behavior is peculiar to my platform or if it
is documented in some way. I would have expected the behavior to be
the same with or without the explicit close.

I also tried this with Cygwin Perl 5.8.8 and got the following:

Without explicit close:

> perl t.pl
Pass #1
SCALAR(0x10070f2c)
Pass #2
SCALAR(0x100671c4)
Pass #3
SCALAR(0x10070f2c)

With explicit close:

> perl t.pl
Pass #1
SCALAR(0x10070f2c)
Pass #2
Use of uninitialized value in open at t.pl line 8.
SCALAR(0x10070f2c)
Pass #3
Use of uninitialized value in open at t.pl line 8.
SCALAR(0x10070f2c)

Hmmmmm ...

Sinan
--
A. Sinan Unur <1usa@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)

Re: Odd (undocumented?) behavior of RAM file within a loop

on 30.01.2008 03:11:20 by Ben Morrow

Quoth "A. Sinan Unur" <1usa@llenroc.ude.invalid>:
>
> I found it curious that the warning disappears if I comment out the
> close statement (on AS Perl 5.10 on Win32).
>
> So, I looked a little closer:
>
> C:\Temp> cat t.pl
> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> for my $pass ( 1 .. 3 ) {
>     warn "Pass #$pass\n";
>     open my $fh, '>', \(my $buffer) or die $!;
>     print \$buffer, "\n";
>     close $fh or die $!
> }
>
> __END__
>
> C:\DOCUME~1\asu1\LOCALS~1\Temp> t
> Pass #1
> SCALAR(0x182b04c)
> Pass #2
> Use of uninitialized value $buffer in open at C:\Temp\t.pl line 8.
> SCALAR(0x182b04c)
> Pass #3
> Use of uninitialized value $buffer in open at C:\Temp\t.pl line 8.
> SCALAR(0x182b04c)

This seems to be a bug in PerlIO::scalar. There is a check in scalar.xs
that is supposed to avoid this warning; however, the code checks
SvTYPE(sv) > SVt_NULL ('this scalar has been allocated at some point')
rather than SvOK(sv) ('this scalar has a defined value'). When a lexical is
reused the former is true but the latter is false, so the warning gets
through. This means that

my $x = 3;
undef $x;
open my $F, '>', \$x;

will warn as well, which it also (IIUC) shouldn't.
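
A rough way to see the difference between the two checks from Perl space, offered as a sketch rather than as what scalar.xs itself does: the core B module reports a scalar's internal body type through the class of the object returned by B::svref_2object, and defined() plays the role of SvOK. The describe() helper and the variable names are made up for illustration, and whether the final open() warns depends on which perl runs it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use B ();

# Hypothetical helper: report the internal body type (as B's class name)
# and the definedness of the scalar behind a reference.
sub describe {
    my ($ref) = @_;
    my $class = ref B::svref_2object($ref);    # e.g. B::NULL, B::IV, B::PV
    my $state = defined ${$ref} ? 'defined' : 'undefined';
    return "$class, $state";
}

my $fresh;      # never assigned: no body has been allocated yet

my $x = 3;      # assigned once, so a body is allocated...
undef $x;       # ...and it stays allocated after the value is removed

print 'fresh:  ', describe(\$fresh), "\n";
print 'reused: ', describe(\$x),     "\n";

# Capture warnings so the result of the open() can be reported either way.
my @warnings;
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    open my $F, '>', \$x or die $!;
    close $F or die $!;
}
print @warnings ? "open warned: $warnings[0]" : "open did not warn\n";
```

Both scalars are undefined, but only the never-assigned one still has the NULL type, which is the distinction a type-based check picks up and a definedness check ignores.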

I've done up a patch, which I'll send to p5p when I've tested it.

> With the close statement commented out, I get:
>
> C:\Temp> t
> Pass #1
> SCALAR(0x182b04c)
> Pass #2
> SCALAR(0x22ae14)

This makes less sense... I suspect what's happening here is that the
filehandle isn't being closed until *after* $buffer gets allocated for
the next iteration, so it is forced to allocate a new scalar since the
old one's still in use. This is arguably wrong, since the whole point is
to reuse the old scalar if we can, for efficiency.
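
For anyone who wants to compare the two cases on their own perl, here is a small sketch that records the buffer address on each pass, with and without the explicit close. The buffer_addresses() helper is made up for illustration, the buffer is initialised to an empty string only to keep the unrelated warning out of the way, and the addresses printed will differ between builds and platforms.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical helper: run $passes iterations and return the address of
# the in-memory buffer seen on each one.
sub buffer_addresses {
    my ($explicit_close, $passes) = @_;
    my @addresses;
    for my $pass (1 .. $passes) {
        my $buffer = q{};
        open my $fh, '>', \$buffer or die $!;
        push @addresses, sprintf '0x%x', refaddr(\$buffer);
        if ($explicit_close) {
            close $fh or die $!;
        }
        # Without the explicit close, $fh is only closed when it goes
        # out of scope at the end of this block.
    }
    return @addresses;
}

print 'with close:    ', join(' ', buffer_addresses(1, 3)), "\n";
print 'without close: ', join(' ', buffer_addresses(0, 3)), "\n";
```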

Ben

Re: Odd (undocumented?) behavior of RAM file within a loop

on 30.01.2008 12:05:26 by 1usa

Ben Morrow wrote in
news:8v2575-srt.ln1@osiris.mauzo.dyndns.org:

>
> Quoth "A. Sinan Unur" <1usa@llenroc.ude.invalid>:
>>
>> I found it curious that the warning disappears if I comment out the
>> close statement (on AS Perl 5.10 on Win32).

....

>> Use of uninitialized value $buffer in open at C:\Temp\t.pl line 8.
>> SCALAR(0x182b04c)
>
> This seems to be a bug in PerlIO::scalar.



Thank you for your explanation.

....

> I've done up a patch, which I'll send to p5p when I've tested it.

Great.

>
>> With the close statement commented out, I get:
>>
>> C:\Temp> t
>> Pass #1
>> SCALAR(0x182b04c)
>> Pass #2
>> SCALAR(0x22ae14)
>
> This makes less sense... I suspect what's happening here is that the
> filehandle isn't being closed until *after* $buffer gets allocated for
> the next iteration, so it is forced to allocate a new scalar since the
> old one's still in use.

That is my gut feeling, too.

> This is arguably wrong, since the whole point is to reuse the old
> scalar if we can, for efficiency.

In fact, I would argue that it is wrong. I don't foresee myself being
able to first find where this is occurring and fix it without messing up
anything else.

On the other hand, when I used an infinite loop to see if this behavior
caused a leak, it seemed to just cycle through the same three locations
and memory usage did not increase at all.

C:\Temp> cat leak.pl
#!/usr/bin/perl

use strict;
use warnings;

while ( 1 ) {
    open my $fh, '>', \(my $buffer) or die $!;
    print \$buffer, "\n";
    print $fh "test\n";
    sleep 1;
}


C:\Temp> leak
SCALAR(0x182aeb4)
SCALAR(0x22acb4)
SCALAR(0x22ad04)
SCALAR(0x182aeb4)
SCALAR(0x22acb4)
SCALAR(0x22ad04)
SCALAR(0x182aeb4)
SCALAR(0x22acb4)
SCALAR(0x22ad04)
SCALAR(0x182aeb4)
SCALAR(0x22acb4)
SCALAR(0x22ad04)
SCALAR(0x182aeb4)
Terminating on signal SIGINT(2)

C:\Temp> perl -v

This is perl, v5.10.0 built for MSWin32-x86-multi-thread
(with 3 registered patches, see perl -V for more detail)

Copyright 1987-2007, Larry Wall

Binary build 1002 [283697] provided by ActiveState
http://www.ActiveState.com
Built Jan 10 2008 11:00:53

And:

> perl leak.pl
SCALAR(0x1002eabc)
SCALAR(0x1002ea98)
SCALAR(0x1002eabc)
SCALAR(0x1002ea98)
SCALAR(0x1002eabc)
SCALAR(0x1002ea98)
SCALAR(0x1002eabc)

> perl -v

This is perl, v5.8.8 built for cygwin-thread-multi-64int
(with 8 registered patches, see perl -V for more detail)

Copyright 1987-2006, Larry Wall



--
A. Sinan Unur <1usa@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)

Re: Odd (undocumented?) behavior of RAM file within a loop

on 30.01.2008 13:53:55 by Ben Morrow

Quoth "A. Sinan Unur" <1usa@llenroc.ude.invalid>:
> Ben Morrow wrote in
> news:8v2575-srt.ln1@osiris.mauzo.dyndns.org:
>
> > This makes less sense... I suspect what's happening here is that the
> > filehandle isn't being closed until *after* $buffer gets allocated for
> > the next iteration, so it is forced to allocate a new scalar since the
> > old one's still in use.
>
> That is my gut feeling, too.
>
> > This is arguably wrong, since the whole point is to reuse the old
> > scalar if we can, for efficiency.
>
> In fact, I would argue that it is wrong. I don't foresee myself being
> able to first find where this is occurring and fix it without messing up
> anything else.
>
> On the other hand, when I used an infinite loop to see if this behavior
> caused a leak, it seemed to just cycle through the same three locations
> and memory usage did not increase at all.

No, I wouldn't expect a leak. The filehandle should still be getting
closed, just not until after $buffer has been reallocated. At that point
the old value of $buffer will be freed, so it's likely that some scalar
will 'soon' reuse its memory.
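
A quicker version of the same check, as a sketch: instead of sleeping and watching the output, count how many distinct buffer addresses turn up over a fixed number of iterations. The iteration count and the %seen hash are arbitrary choices for illustration, and the buffer is initialised only to keep the unrelated warning quiet.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $iterations = 10_000;    # arbitrary; any reasonably large number will do
my %seen;                   # buffer address => number of passes that used it

for my $i (1 .. $iterations) {
    my $buffer = q{};
    open my $fh, '>', \$buffer or die $!;
    print {$fh} "test\n";
    $seen{ refaddr(\$buffer) }++;
    # Deliberately no explicit close: $fh is closed when it goes out of scope.
}

printf "%d iterations used %d distinct buffer address(es)\n",
    $iterations, scalar keys %seen;
```

If the count stays small relative to the number of iterations, the freed scalars are being recycled rather than accumulating.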

Ben