substr() hassle, *n*x vs. Win32

on 14.06.2006 12:48:09 by Mirco Wahab

While working on some C-to-Perl
interfacing, I stumbled over substr()
when using it as an lvalue to assign
a packed structure to a portion of
a string, like

substr( $data, $offsett, $size ) = pack "ddd", $x, $y, $z;

which works fine (in the context I use it)
under Perl 5.8.7 (Inline::C 0.44) on Linux.

In a Win32 environment (ActivePerl 5.8.7,
nmake, cl) this seems to fail sometimes.
Maybe I made a mistake that I'm not
aware of.

I'll include a short example where this
problem occurs. A C function is called,
which allocates an SV* and writes
geometrical data (doubles) to it.
Back in Perl, this data is unpacked
and printed (checked visually).

Then this data is modified through Perl's
substr(), which works, as said, under
Linux but not under Windows. The data
in $data gets messed up, as if Perl
had cut out a portion of the string
and inserted a new one of a different size.

Thanks in advance,

Mirco

==>
#!/usr/bin/perl
use strict;
use warnings;

my $bsize = P3Dsize();              # sizeof(struct) from C
my $strct = "d3i";                  # (actual) format of the above
my $vlen  = 6;                      # number of elements to work with
my ($data, $x, $y, $z, $id, $blk);  # some declarations

makesomevector_of($vlen, $data);    # call into inline c

for my $idx (0 .. $vlen-1) {
    # first: access structures and print (always fine!)
    $blk = substr $data, $idx*$bsize, $bsize;   # take out structure
    ($x, $y, $z, $id) = unpack $strct, $blk;    # unpack it
    print " $x\t$y\t$z\t[$id]\n";               # print it (fine)

    # second: access structures from Perl, !seems to fail in Win32!
    substr( $data, $idx*$bsize, $bsize ) = pack $strct, $x+1, $y+1, $z+1, $id+1;

    # third: access structures again and print (fails badly under W32/nmake/cl)
    $blk = substr $data, $idx*$bsize, $bsize;   # re-take structure
    ($x, $y, $z, $id) = unpack $strct, $blk;    # unpack it
    print "+$x\t$y\t$z\t[$id]+\n";              # print it (prints garbage!)
}

use Inline C => <<'END_OF_C_CODE';

typedef struct {
    double x, y, z;
    int id;
} P3D;

int P3Dsize() { return sizeof(P3D); }

int makesomevector_of(int Count, SV* perl_sv)
{
    int id, blocksize = Count * P3Dsize();
    P3D* pvec = (P3D*) malloc(blocksize);

    if(pvec) {
        for(id=0; id<Count; id++) {
            double val = id+1;
            pvec[id].id = id;
            pvec[id].x = val;
            pvec[id].y = val*val;
            pvec[id].z = val*val*val;
        }
        sv_setpvn( perl_sv, (char *)pvec, blocksize ); /* copies the bytes */
        free( pvec ); /* so the temporary buffer can be released */
        return 1;
    }
    return 0;
}
END_OF_C_CODE

Re: substr() hassle, *n*x vs. Win32

on 14.06.2006 15:25:48 by Mirco Wahab

Thus spoke Mirco Wahab (on 2006-06-14 12:48):

>
> substr( $data, $offsett, $size ) = pack "ddd", $x, $y, $z;
> [Problem]

After some fiddling I could spot the problem,
and I'm wondering whether there are recipes for
that one ...

The problem is, I had a C struct type

struct {
double x, y, z;
int n;
};

and assumed that it is 28 bytes in size,
corresponding to a "dddi" pack format.

But actually each compiler has its
own ideas about this, and - of course -
it's the structure alignment problem which
shines through here.

In the case above, my gcc 3.3.3 reports
out of the box: "C struct size is 28 bytes"
whereas my VC6/cl insists:
"C struct size is 32 bytes"

I could fix the Win32-problem simply by
changing the pack-format from (28 byte) "dddi"
to (32 byte) "dddii". This leads of course
to phenomenal failures in the Linux-version ;-)
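
As an illustration, the mismatch can be measured with a few lines of
plain C. This is only a sketch: the helper names p3d_member_bytes() and
p3d_padding_bytes() are invented for this example, and the result
depends on compiler, options and platform.

```c
#include <stddef.h>

/* Same member layout as the struct discussed above. */
typedef struct {
    double x, y, z;
    int n;
} P3D;

/* Bytes occupied by the members alone, ignoring any padding. */
size_t p3d_member_bytes(void)
{
    return 3 * sizeof(double) + sizeof(int);
}

/* Padding bytes the compiler added (0 if the struct is tightly packed). */
size_t p3d_padding_bytes(void)
{
    return sizeof(P3D) - p3d_member_bytes();
}
```

Whenever p3d_padding_bytes() is nonzero, a pack format that lists only
the members ("dddi") yields fewer bytes than sizeof(P3D), so assigning
it to a substr() of $bsize bytes shortens $data and shifts every
following record.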

It looks like one could have a cheap and fast
vector interface by writing data directly
into scalars (once $n gets larger than 10^5),
if only it weren't for the one problem that
hit me: compiler-specific struct alignment.

How do I solve this?

Another question: if I return a C-generated
SV* back to perl via return(SV*), do I have
to 'mortalize' anything - or does perl take
care of that?

(Source example attached)

Regards

Mirco

==>
#!/usr/bin/perl
use strict;
use warnings;

my $vlen  = 6;              # number of elements to work with
my $bsize = P3Dsize();      # sizeof(struct) from C
my $strct = "dddi";         # format; must add up to sizeof(struct):
                            # "dddi" (28 bytes, gcc), "dddii" (32 bytes, VC6/cl)
my ($blk, $x, $y, $z, $n);  # some declarations
print "C struct size is: $bsize bytes\n";

# call into inline c
my $data = makesomevector_of( $vlen ) or die "couldn't allocate\n";

for my $idx (0 .. $vlen-1) {
    # first: access structures and print (always fine!)
    $blk = substr $data, $idx*$bsize, $bsize;   # take out structure
    ($x, $y, $z, $n) = unpack $strct, $blk;     # unpack it
    print " $x\t$y\t$z\t[$n]\n";                # print it (fine)

    # second: access structures from Perl (simply increment by 9)
    substr( $data, $idx*$bsize, $bsize ) = pack $strct, $x+9, $y+9, $z+9, $n+9;

    # third: look into structures again and print them
    $blk = substr $data, $idx*$bsize, $bsize;   # re-take structure
    ($x, $y, $z, $n) = unpack $strct, $blk;     # unpack it
    print "+$x\t$y\t$z\t[$n]+\n";               # print it
}

use Inline C => <<'END_OF_C_CODE';

typedef struct {
    double x, y, z;
    int n;  // always check structure alignment by
} P3D;      // sizeof(struct) vs. sizeof(all members)

int P3Dsize() { return sizeof(P3D); }

SV* makesomevector_of(int Count)
{
    int i, blocksize = Count * P3Dsize();
    P3D* pvec = (P3D*) malloc(blocksize);
    SV* perl_sv = newSV(0);

    if(pvec) {
        for(i=0; i<Count; i++) {
            double val = i+1;
            pvec[i].n = i;
            pvec[i].x = val;
            pvec[i].y = val*val;
            pvec[i].z = val*val*val;
        }
        sv_setpvn( perl_sv, (char *)pvec, blocksize );
        free( pvec );
    }
    return perl_sv;
}
END_OF_C_CODE

Re: substr() hassle, *n*x vs. Win32

on 14.06.2006 16:25:39 by Ben Morrow

Quoth Mirco Wahab :
> Thus spoke Mirco Wahab (on 2006-06-14 12:48):
>
> >
> > substr( $data, $offsett, $size ) = pack "ddd", $x, $y, $z;
> > [Problem]
>
> I could - after some fiddling - spot the problem
> and I'm wondering if there aren't some receipes for
> that one ...
>
> The problem is, I had a C structural type
>
> struct {
> double x, y, z;
> int n;
> };
>
> and would tell that it is 28 bytes in size,
> corresponding to a "dddi" pack format.
>
> But actually each compiler has had its
> own thinking about this, and - of course,
> its the structure alignment problem, which
> shines through here.
>
> In the case above, my gcc 3.3.3 would tell
> out of the box: "C struct size is 28 bytes"
> whereas my VC6/cl would insist:
> "C struct size is 32 bytes"
>
> I could fix the Win32-problem simply by
> changing the pack-format from (28 byte) "dddi"
> to (32 byte) "dddii". This leads of course

Better would be one of "dddix4" or "dddx4i" (four pad bytes, since
32 - 28 = 4), depending on where the padding is. You can use the
offsetof() C macro to find out.

> to phenomenal failures in the Linux-version ;-)
>
> It looks like one could have a cheap and fast
> vector interface by writing stuff directly
> into scalars (if $n gets larger than 10^5),
> if not the only problem that hit me was the
> compiler specific alignment problem.

You may want to look at Bit::Vector.

> How do I solve this?

One way is to use Inline::Struct.
Another is to have your Makefile.PL compile a little test program that
uses offsetof to work out the right template and prints it out. Then you
can substitute this into your .pm.
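
A rough sketch of what the core of such a test program might look like,
assuming the P3D struct from the post and a C99 snprintf().
p3d_pack_template() and the Member table are hypothetical names made up
for this example. Each gap between members, and any trailing gap, is
written as pack's "x" (null byte) with a repeat count.

```c
#include <stddef.h>
#include <stdio.h>

typedef struct {
    double x, y, z;
    int n;
} P3D;

/* One row per struct member: pack code, byte offset, byte size. */
typedef struct { char code; size_t offset; size_t size; } Member;

/* Writes a pack template for P3D into buf; returns 0, or -1 if buf is too small. */
int p3d_pack_template(char *buf, size_t cap)
{
    static const Member m[] = {
        { 'd', offsetof(P3D, x), sizeof(double) },
        { 'd', offsetof(P3D, y), sizeof(double) },
        { 'd', offsetof(P3D, z), sizeof(double) },
        { 'i', offsetof(P3D, n), sizeof(int)    },
    };
    size_t count = sizeof m / sizeof m[0];
    size_t pos = 0, end = 0, i;

    for (i = 0; i <= count; i++) {
        /* start of the next member, or the end of the struct after the last one */
        size_t next = (i < count) ? m[i].offset : sizeof(P3D);
        size_t gap = next - end;
        int w;

        if (gap > 0) {                      /* emit padding as "x<gap>" */
            w = snprintf(buf + pos, cap - pos, "x%lu", (unsigned long)gap);
            if (w < 0 || (size_t)w >= cap - pos)
                return -1;
            pos += (size_t)w;
        }
        if (i < count) {
            if (pos + 2 > cap)              /* room for the code plus '\0' */
                return -1;
            buf[pos++] = m[i].code;
            buf[pos] = '\0';
            end = m[i].offset + m[i].size;
        }
    }
    return 0;
}
```

A Makefile.PL could compile this with a small main() that calls
p3d_pack_template() and prints the buffer. For the layouts reported
earlier in the thread one would expect "dddi" (28 bytes) and "dddix4"
(32 bytes), though that has not been checked on those compilers.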

> Another question: if I return a C-generated
> SV* back to perl via return(SV*), do I have
> to 'mortalize' anything - or does perl take
> care of that?

Have you read Inline::C-Cookbook? If you return an SV*, Inline::C will
mortalize it for you. Otherwise, you must do it yourself.

> (Source example attached)
>
> ==>
> #!/usr/lib/perl
> use strict;
> use warnings;
>
> my $vlen = 6; # number of elements to work with
> my $bsize = P3Dsize(); # sizeof(struct) from C
> my $strct = "dddi"; # (actual) format, padded to sizeof(struct)
> my ($blk, $x, $y, $z, $n); # some declarations
> print "C struct size is: $bsize bytes\n";
>
> # call into inline c
> my $data = makesomevector_of( $vlen ) or die "couldn't allocate\n";
>
> for my $idx (0 .. $vlen-1 ) {
> # first: access structures and print (always fine!)
> $blk = substr $data, $idx*$bsize, $bsize; # take out structure
> ($x, $y, $z, $n) = unpack $strct, $blk; # unpack it
> print " $x\t$y\t$z\t[$n]\n"; # print it (fine)
>
> # second: access structures from Perl (simply increment by 9)
> substr( $data, $idx*$bsize, $bsize ) = pack $strct, $x+9, $y+9, $z+9, $n+9;
>
> # third: look into structures again and print them
> $blk = substr $data, $idx*$bsize, $bsize; # re-take structure
> ($x, $y, $z, $n) = unpack $strct, $blk; # unpack it
> print "+$x\t$y\t$z\t[$n]+\n"; # print it
> }
>
> use Inline C => <<'END_OF_C_CODE';
>
> typedef struct {
> double x, y, z;
> int n; // always check structure alignment by
> } P3D; // sizeof(struct) vs. sizeof(all members)
>
> int P3Dsize() { return sizeof(P3D); }
>
> SV* makesomevector_of(int Count)
> {
> int i, blocksize = Count * P3Dsize();
> P3D* pvec = (P3D*) malloc(blocksize);

Don't use malloc! Use New/Safefree, which is the perl interface to the
allocator.

> SV* perl_sv = newSV(0);

You should probably use NEWSV instead, and it would be more efficient to
allocate string space straight away.

Ben

--
Joy and Woe are woven fine,
A Clothing for the Soul divine William Blake
Under every grief and pine 'Auguries of Innocence'
Runs a joy with silken twine. benmorrow@tiscali.co.uk

Re: substr() hassle, *n*x vs. Win32

on 14.06.2006 17:56:51 by xhoster

Mirco Wahab wrote:
> Thus spoke Mirco Wahab (on 2006-06-14 12:48):
>
> >
> > substr( $data, $offsett, $size ) = pack "ddd", $x, $y, $z;
> > [Problem]
>
> I could - after some fiddling - spot the problem
> and I'm wondering if there aren't some receipes for
> that one ...
>
> The problem is, I had a C structural type
>
> struct {
> double x, y, z;
> int n;
> };
>
> and would tell that it is 28 bytes in size,
> corresponding to a "dddi" pack format.
>
> But actually each compiler has had its
> own thinking about this, and - of course,
> its the structure alignment problem, which
> shines through here.
>
....
>
> It looks like one could have a cheap and fast
> vector interface by writing stuff directly
> into scalars (if $n gets larger than 10^5),
> if not the only problem that hit me was the
> compiler specific alignment problem.
>
> How do I solve this?

I think the simple answer is that you don't solve this. You can't
take on the performance power of C without taking its liabilities, one
of which is the nonportability of structs. So you can circumvent it
in several ways, but they depend on what you are trying to do. You
could use four independent arrays (3 for doubles and one for int), although
there may be alignment problems there as well. Or you could just fiddle
with it until it works on your machine and then accept that it will not be
portable.
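
A rough sketch of the four-array idea in plain C (the P3DColumns type
and both function names are made up here). Within each array the
elements are contiguous, so every column could be handed to Perl as its
own string and read with a single "d*" or "i*" unpack, with no struct
padding to guess.

```c
#include <stdlib.h>

/* One contiguous array per member instead of one array of structs. */
typedef struct {
    size_t count;
    double *x, *y, *z;
    int *n;
} P3DColumns;

/* Release all four arrays and reset the container. */
void p3d_columns_free(P3DColumns *c)
{
    free(c->x);
    free(c->y);
    free(c->z);
    free(c->n);
    c->x = c->y = c->z = NULL;
    c->n = NULL;
    c->count = 0;
}

/* Allocate and fill the arrays with the same test data as in the post.
   Returns 1 on success, 0 if any allocation failed. */
int p3d_columns_init(P3DColumns *c, size_t count)
{
    size_t i;

    c->count = 0;
    c->x = malloc(count * sizeof *c->x);
    c->y = malloc(count * sizeof *c->y);
    c->z = malloc(count * sizeof *c->z);
    c->n = malloc(count * sizeof *c->n);
    if (!c->x || !c->y || !c->z || !c->n) {
        p3d_columns_free(c);
        return 0;
    }
    for (i = 0; i < count; i++) {
        double val = (double)(i + 1);
        c->x[i] = val;
        c->y[i] = val * val;
        c->z[i] = val * val * val;
        c->n[i] = (int)i;
    }
    c->count = count;
    return 1;
}
```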

>
> Another question: if I return a C-generated
> SV* back to perl via return(SV*), do I have
> to 'mortalize' anything - or does perl take
> care of that?

In this case, Perl takes care of it. I know this for two reasons: I ran
your code in a loop and noticed no memory leak, and when I added a
sv_2mortal(perl_sv) and ran your code I got "Attempt to free unreferenced
scalar" errors. Alas, I don't know how you figure these things out from
first principles.

I generally sidestep these issues by making a string of the right length
in Perl, and then passing that string into the Inline code for the C to
fill in and use.
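
A minimal sketch of the C side of that approach, kept as plain C so it
stands alone. p3d_fill() is a hypothetical name. In an Inline::C
function the pointer and length would presumably come from SvPV() on
the scalar passed in, after the Perl side has created it with something
like "\0" x ($vlen * $bsize); that glue is left out here.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    double x, y, z;
    int n;
} P3D;

/* Fill a caller-owned buffer with `count` structs.
   Returns 0 on success, -1 if the buffer is missing or too small. */
int p3d_fill(void *buf, size_t buflen, size_t count)
{
    unsigned char *out = buf;
    size_t i;

    if (buf == NULL || count > buflen / sizeof(P3D))
        return -1;

    for (i = 0; i < count; i++) {
        double val = (double)(i + 1);
        P3D p;

        memset(&p, 0, sizeof p);    /* keep padding bytes deterministic */
        p.x = val;
        p.y = val * val;
        p.z = val * val * val;
        p.n = (int)i;
        /* memcpy makes no assumption about the alignment of buf */
        memcpy(out + i * sizeof(P3D), &p, sizeof p);
    }
    return 0;
}
```

Since the C code never allocates or returns an SV here, the question of
who mortalizes what does not come up.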

Xho

Re: substr() hassle, *n*x vs. Win32

on 14.06.2006 18:04:33 by xhoster

xhoster@gmail.com wrote:
> ...
> >
> > It looks like one could have a cheap and fast
> > vector interface by writing stuff directly
> > into scalars (if $n gets larger than 10^5),
> > if not the only problem that hit me was the
> > compiler specific alignment problem.
> >
> > How do I solve this?
>
> I think the simple answer is that you don't solve this. You can't
> take on the performance power of C without taking its liabilities, one
> of which is the nonportability of structs. So you can circumvent it
> in several ways, but they depend on what you are trying to do. You
> could use four independent arrays (3 for doubles and one for int),
> although there may be alignment problems there as well. Or you could
> just fiddle with it until it works on your machine and then accept that
> it will not be portable.
>
> >
> > Another question: if I return a C-generated
> > SV* back to perl via return(SV*), do I have
> > to 'mortalize' anything - or does perl take
> > care of that?
>
> In this case, Perl takes care of it. I know this for two reasons. I ran
> your code in a loop and noticed no memory leak, I added a
> sv_2mortal(perl_sv) and ran your code and got "Attempt to free
> unreferenced scalar" errors. Alas, I don't know how you figure these
> things out from first principles.

On both of these, never mind me. Listen to Ben. He really knows what he
is doing. If I'd seen his post before I started composing my own, I
wouldn't have responded.

> I generally side step these issues by making a string of the right length
> in perl, and then passing that string into the Inline code for the C to
> fill in and use.

Well, but I do still like doing this.

Xho
