length in bytes not characters
on 12.01.2008 17:40:57 by Harry Putnam
How do we get the length of a variable in bytes?
I see the length function returns length in characters.
unless you (From perldoc -f length):
use "do { use bytes; length(EXPR) }"
Nothing there would seem to indicate it cannot be used to get the
length in characters or bytes of an array.
It does say:
. . . . . . . . . . . . . . . . . Note that this cannot be
used on an entire array or hash to find out how many elements
these have. For that, use "scalar @array" and "scalar keys
%hash" respectively.
But that doesn't appear to preclude using it to get the length in
characters. But apparently you cannot since this returns <1>.
my @ar = ("one,","two","three","four","five");
my $char = length @ar;
print "char<$char>\n";
char<1>
There is a reference right at the end that might be pointing at what I'm
looking for but I can't understand what it means:
. . . . . . . . . . . . . . . . . . . . . . . .To get the
length in bytes, use "do { use bytes; length(EXPR) }", see
bytes.
What is meant by `see bytes'? Since it appears in function
documentation it seems to indicate there is a function called `bytes'
or at least something in the FAQ.
But here, neither `perldoc -f bytes' nor `perldoc -q bytes' shows
anything.
--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
http://learn.perl.org/
Re: length in bytes not characters
on 12.01.2008 18:29:42 by ROB.DIXON
reader@newsguy.com wrote:
> How do we get the length of a variable in bytes?
> I see the length function returns length in characters.
> unless you (From perldoc -f length):
> use "do { use bytes; length(EXPR) }"
>
> Nothing there would seem to indicate it cannot be used to get the
> length in characters or bytes of an array.
Yes. length() works on a scalar value, and returns the length of that
value in Unicode characters, unless it is forced to count bytes instead
as you have shown.
> It does say:
> . . . . . . . . . . . . . . . . . Note that this cannot be
> used on an entire array or hash to find out how many elements
> these have. For that, use "scalar @array" and "scalar keys
> %hash" respectively.
An array or hash in scalar context evaluates to its element count. So
my $length = @array;
will also work.
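As a minimal sketch of the scalar-context rule described above (the
variable names are illustrative and not from the thread), these are three
common ways of getting an element count:

```perl
use strict;
use warnings;

my @array = qw(one two three four five);
my %hash  = (a => 1, b => 2, c => 3);

# Assigning an array to a scalar puts it in scalar context,
# which yields the number of elements.
my $count_implicit = @array;

# scalar() forces the same context explicitly, which is handy
# inside list-context expressions such as print's argument list.
my $count_explicit = scalar(@array);

# For a hash, count the keys.
my $key_count = scalar(keys %hash);

print "array: $count_implicit, $count_explicit; hash keys: $key_count\n";
```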
> But that doesn't appear to preclude using it to get the length in
> characters. But apparently you cannot since this returns <1>.
> my @ar = ("one,","two","three","four","five");
> my $char = length @ar;
> print "char<$char>\n";
>
> char<1>
Now we reach murky waters in that you are trying to establish the
'length' of an array, which is meaningless until you define it. What
your code does is to take the number of elements in @ar (length forces a
scalar context on its parameter) and return the length in characters of
that value. So
my $char = length @ar;
is evaluated as
my $char = length 5;
which gives $char the value 1 since '5' is one character long.
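To make the two-step evaluation above visible, here is a small sketch
(array names are made up for illustration). It takes the element count
first and then measures that number as a string. It goes through an
intermediate variable because recent versions of Perl warn when length()
is applied directly to an array:

```perl
use strict;
use warnings;

my @five = qw(one two three four five);
my @ten  = (1 .. 10);

# Step 1: scalar context gives the element count.
my $count_five = @five;
my $count_ten  = @ten;

# Step 2: length measures the count as a string.
my $len_five = length $count_five;   # length of the string "5"
my $len_ten  = length $count_ten;    # length of the string "10"

print "five elements: <$len_five>, ten elements: <$len_ten>\n";
```

With ten elements the result becomes 2, which shows that the value
reflects the number of digits in the count and nothing about the
elements themselves.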
> There is a reference right at the end that might be pointing at what I'm
> looking for but I can't understand what it means:
>
> . . . . . . . . . . . . . . . . . . . . . . . .To get the
> length in bytes, use "do { use bytes; length(EXPR) }", see
> bytes.
>
> What is meant by `see bytes'? Since it appears in function
> documentation it seems to indicate there is a function called `bytes'
> or at least something in the FAQ.
'bytes' is a pragma, and documentation on it can be retrieved in the
same way as for all pragmas and modules by issuing
perldoc bytes
> But here, neither `perldoc -f bytes' nor `perldoc -q bytes' shows
> anything.
No, because it is not a function and doesn't appear in the title of any
FAQ entry.
My best guess at what you're looking for is the total number of bytes in
all the elements of an array. This can be achieved by manually
accumulating the lengths:
use strict;
use warnings;
my @ar = qw(one two three four five);
my $char = 0;
$char += length foreach @ar;
print "char<$char>\n";
which outputs "char<19>".
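The same accumulation can also be written without an explicit loop. The
following is only an alternative sketch, assuming the core List::Util
module is available. Like the loop above, it counts characters:

```perl
use strict;
use warnings;
use List::Util qw(sum);

my @ar = qw(one two three four five);

# map yields the length of each element; sum adds them up.
# sum returns undef for an empty list, so fall back to 0.
my $total = sum(map { length } @ar) || 0;

# Joining the elements and measuring once gives the same count.
my $joined_length = length(join('', @ar));

print "total<$total> joined<$joined_length>\n";
```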
If you're hoping for something different from this then perhaps you
would let us know.
HTH,
Rob
Re: length in bytes not characters
on 12.01.2008 19:17:42 by chas.owens
On Jan 12, 2008 12:29 PM, Rob Dixon wrote:
snip
> My best guess at what you're looking for is the total number of bytes in
> all the elements of an array. This can be achieved by manually
> accumulating the lengths:
>
> use strict;
> use warnings;
>
> my @ar = qw(one two three four five);
>
> my $char = 0;
> $char += length foreach @ar;
>
> print "char<$char>\n";
>
> which outputs "char<19>".
>
> If you're hoping for something different from this then perhaps you
> would let us know.
snip
It is important to note that this returns the number of characters,
not the number of bytes (in this case they are the same since all of
the UTF-8 characters in your string take up only one byte). You need
to use the bytes pragma to force length to return the number of bytes:
#!/usr/bin/perl
use strict;
use warnings;
# \x{1816} is MONGOLIAN DIGIT SIX: one character, three bytes in UTF-8
my @ar = (qw(one two three four five), "\x{1816}");
my $char = 0;
$char += length for @ar;
my $bytes = 0;
{
    use bytes;
    $bytes += length for @ar;
}
print "$char characters and $bytes bytes\n";
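For comparison, here is a sketch of another way to count bytes that does
not rely on the bytes pragma: encode each string explicitly and measure
the result. This assumes the encoding of interest is UTF-8 and uses the
core Encode module. Later versions of the bytes documentation discourage
the pragma for anything other than debugging, which is why an explicit
encoding step may be preferable:

```perl
use strict;
use warnings;
use Encode qw(encode);

# "\x{1816}" is MONGOLIAN DIGIT SIX: one character, three bytes in UTF-8.
my @ar = (qw(one two three four five), "\x{1816}");

my ($chars, $bytes) = (0, 0);
for my $str (@ar) {
    $chars += length $str;                      # characters
    $bytes += length(encode('UTF-8', $str));    # octets after encoding
}

print "$chars characters and $bytes bytes\n";
```

For this data it prints "20 characters and 22 bytes". Note that the two
approaches answer slightly different questions: "use bytes" reports the
size of Perl's internal representation, while encode reports the size in
the encoding you name.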
Of course, all of this just tells you the string length of the values,
not their in-memory size. For instance,
my $num = 12345;
takes up less room (at least at first) than
my $str = "12345";
because the former is stored as a number (SvIV) and the latter is
stored as characters (SvPV). The length function will return
five for both of them. I believe, though I haven't delved into perlguts
in a long time, that $num will take up more memory than $str after
this operation
my $len = length $num;
because $num will then contain both the numeric and character
representations (SvPVIV), so it won't have to generate the character
version again if we ask for it later. I do not know of a good way to
determine the in-memory size of a scalar value (and consequently of array
and hash values).
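The change of internal type described above can be observed, at least
roughly, with the core B module. This is only a sketch: the exact class
names reported may vary between Perl versions, and it shows the type of
the scalar rather than its size in memory. (The CPAN module Devel::Size
is aimed at measuring size; it is left out here because it is not part
of the core distribution.)

```perl
use strict;
use warnings;
use B qw(svref_2object);

my $num = 12345;
my $str = "12345";

# svref_2object returns an object whose class names the internal
# type of the scalar, for example B::IV, B::PV or B::PVIV.
my $num_before = ref(svref_2object(\$num));
my $str_type   = ref(svref_2object(\$str));

# Using the number as a string makes perl compute its string form.
my $len = length $num;

my $num_after = ref(svref_2object(\$num));

print "num before: $num_before, num after: $num_after, str: $str_type\n";
```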
Re: length in bytes not characters
on 12.01.2008 19:50:44 by Harry Putnam
Rob Dixon writes:
> which outputs "char<19>".
>
> If you're hoping for something different from this then perhaps you
> would let us know.
Sorry if I was unclear as to what I was after. Yes that was it.
> 'bytes' is a pragma, and documentation on it can be retrieved in the
> same way as for all pragma and modules by issuing
> perldoc bytes
And the bit about `perldoc bytes' was new ground for me too.
I never ran into that usage, I guess. Or never really thought to look
up a `pragma'.
I thought, right along, that kind of syntax was reserved for pulling
up documentation of a module, like `perldoc File::Find' or whatever.
And have been confused about `pragma' as well.
You've helped clear that up for me too.... thanks.
Re: length in bytes not characters
on 12.01.2008 19:51:16 by Harry Putnam
"Chas. Owens" writes:
> It is important to note that this returns the number of characters,
> not the number of bytes (in this case they are the same since all of
> the UTF-8 characters in your string take up only one byte). You need
> to use the bytes pragma to force length to return the number of bytes:
More welcome details and examples.......... thanks
Re: length in bytes not characters
on 12.01.2008 20:16:52 by chas.owens
On Jan 12, 2008 1:50 PM, wrote:
> Rob Dixon writes:
>
> > which outputs "char<19>".
> >
> > If you're hoping for something different from this then perhaps you
> > would let us know.
>
> Sorry if I was unclear as to what I was after. Yes that was it.
>
> > 'bytes' is a pragma, and documentation on it can be retrieved in the
> > same way as for all pragma and modules by issuing
>
> > perldoc bytes
>
> And the bit about `perldoc bytes' was new ground for me too.
>
> I never ran into that usage, I guess. Or never really thought to look
> up a `pragma'.
>
> I thought, right along, that kind of syntax was reserved for pulling
> up documentation of a module, like `perldoc File::Find' or whatever.
> And have been confused about `pragma' as well.
>
> You've helped clear that up for me too.... thanks.
Well, a pragma is nothing more than a module that changes the behavior
of Perl (as opposed to adding functionality like normal modules). In
fact, here is the bytes pragma from Perl 5.8.4 (without the POD):
package bytes;
our $VERSION = '1.01';
$bytes::hint_bits = 0x00000008;
sub import {
    $^H |= $bytes::hint_bits;
}
sub unimport {
    $^H &= ~$bytes::hint_bits;
}
sub AUTOLOAD {
    require "bytes_heavy.pl";
    goto &$AUTOLOAD;
}
sub length ($);
sub chr ($);
sub ord ($);
sub substr ($$;$$);
sub index ($$;$);
sub rindex ($$;$);
1;
As you can see, it is normal Perl. The real magic happens inside
length and the other functions. They check the value of the hints global
variable ($^H) and change their behavior if the "use bytes" bit is
set. In Perl 5.10 we have been given the ability to (safely) write
our own pragmas; see http://perldoc.perl.org/perlpragma.html for more
information, or if you have 5.10 installed you can say perldoc
perlpragma.
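Because import and unimport only set or clear a hint bit, the effect of
the pragma is limited to the enclosing block. The sketch below (variable
names are illustrative) shows that scoping, and also calls
bytes::length directly, one of the functions declared in the module
source above:

```perl
use strict;
use warnings;
use bytes ();   # load the module without enabling the pragma here

my $str = "\x{1816}";   # one character, three bytes in UTF-8

my $outer = length $str;          # pragma not in effect: characters

my ($inner, $nested);
{
    use bytes;                    # sets the hint bit for this block
    $inner = length $str;         # bytes
    {
        no bytes;                 # clears the hint bit again
        $nested = length $str;    # characters
    }
}

my $after = length $str;          # the hint ended with the block above

# bytes::length can be called explicitly regardless of scope.
my $explicit = bytes::length($str);

print "outer=$outer inner=$inner nested=$nested after=$after explicit=$explicit\n";
```

This prints "outer=1 inner=3 nested=1 after=1 explicit=3".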
Re: length in bytes not characters
on 12.01.2008 20:55:01 by Harry Putnam
"Chas. Owens" writes:
> As you can see it is normal Perl. The real magic happens inside the
> length and other functions. They check the value of the hints global
> variable ($^H) and change their behavior if the "use bytes" bit is
> set. In Perl 5.10 we have been given the ability to (safely) write
> our own pragmas, see http://perldoc.perl.org/perlpragma.html for more
> information, or if you have 5.10 installed you can say perldoc
> perlpragma.
I do have 5.10 installed, but not as the working system perl.
I see /usr/local/src/test/bin/perldoc perlpragma
Yes.. and thanks again