When is a string a number?

When is a string a number?

am 04.10.2006 20:34:58 von Martin.Evans

Hi,

I'm hoping someone here can give me some insight in to what I can do as
I feel sure someone must have hit this problem before. The software I'm
working is a good deal more complex than what is described here but I've
tried to resolve it down to make it more simple (perhaps
unsuccessfully).

The application reads values from a databases (but we'll stick to Oracle
for now), converts the data to JSON (JavaScript Object Notation), writes
the json to a file where it is picked up by some javascript code. JSON
was chosen because JSON is native to javascript and the javascript can
simply eval the read data to turn it into a javascript object.

With DBI/DBD::Oracle all values read from the database are scalars. As
everyone will know, whether something read from the database is a string
or a number in Perl purely depends on the context it is used in so:

$a = "1";
print $a +1 results in 2
and
print $a . 2 results in 12;

The problem is javascript is more tightly typed than Perl and it matters
whether something is a number or a string - it affects what operations
you can perform on a variable.

In JSON, a string is represented by quotes around it and a number is
missing the quotes.

My problem is that values read from the database which are actually
numbers (in the database) look like strings to the JSON parser so when I
do a selectall_arrayref("select number_field from table") and convert
the result to JSON I get:

["12"] in the serialised JSON data instead of:

[12]

I have to admit I don't know how the JSON module knows what is a number
and what is a string in Perl but I see the same issue with Data::Dumper
so I presume there must be some way to find out if a perl scalar is a
number or a string.

The problem gets a lot worse for me since I do some arithmetic on values
pulled from the database before converting them to JSON and this is
where Perl seems to change them into numbers e.g.

$r = selectall_arrayref(select number1, number2 from table);
$r->[0] += 2;
results in JSON output of:

[3, "3"]

and this really annoys javascript since the first value is typed as a
number and the second as a string.

You can imagine this can get really hairy if you have a field in the DB
called house_number_or_name.

Whilst investigating this I tried to reproduce with Data::Dumper
(another data serialiser) but the problem gets even more unfathomable to
me:

perl -MData::Dumper -le '$a = [1,"2"];print Dumper($a);'
$VAR1 = [1,'2'];

so Data::Dumper knows "2" is a string (XS version) and yet:

perl -MData::Dumper -le ' $Data::Dumper::Useperl =1;$a = [1,"2"];print
Dumper($a);'
$VAR1 = [1,2];

just because I've switched to a Perl version of Data::Dumper "2" is a
number.

I don't want to have to do ($var +0) on all the number fields I pull
from the database (to turn them into numbers) and neither do I want to
do a '$var .= ""' (to turn all the fields into strings).

As an aside (and probably a perl question rather than a DBI one) does
anyone know why the type of a scalar changes when you use it on the
right side of an assignment:

perl -MData::Dumper -le '$a="1"; print Dumper($a); $b += $a; print
Dumper($a);'
$VAR1 = '1';
$VAR1 = 1;

How does JSON and Data::Dumper know whether Perl thinks something is a
number or a string?

Sorry for the long posting. As I said, someone must have hit this before
and I hope you can offer me some insight as I'm rather stuck with this
now.

Martin
--
Martin J. Evans
Easysoft Limited
http://www.easysoft.com

Re: When is a string a number?

am 04.10.2006 21:00:45 von davidnicol

On 10/4/06, Martin J. Evans wrote:

> With DBI/DBD::Oracle all values read from the database are scalars. As
> everyone will know, whether something read from the database is a string
> or a number in Perl purely depends on the context it is used in so:

internally, there are flags in the SV structure that hint as to what conversions
have been done on the SV already, for efficiency's sake.

> I have to admit I don't know how the JSON module knows what is a number
> and what is a string in Perl but I see the same issue with Data::Dumper
> so I presume there must be some way to find out if a perl scalar is a
> number or a string.

It seems like the pure perl Dumper tests values with a regex and optimizes
to numbers when the stringification matches a number template, while
the XS version checks the flags in the SV structure.

> The problem gets a lot worse for me since I do some arithmetic on values
> pulled from the database before converting them to JSON and this is
> where Perl seems to change them into numbers e.g.

> I don't want to have to do ($var +0) on all the number fields I pull
> from the database (to turn them into numbers) and neither do I want to
> do a '$var .= ""' (to turn all the fields into strings).

you might have to do exactly that. "$var" will produce a string version. Sorry.
Apparently you can tell the JSON module to make everything strings:
http://search.cpan.org/~makamaka/JSON-1.07/lib/JSON.pm#AUTOC ONVERT

> As an aside (and probably a perl question rather than a DBI one) does
> anyone know why the type of a scalar changes when you use it on the
> right side of an assignment:
>
> perl -MData::Dumper -le '$a="1"; print Dumper($a); $b += $a; print
> Dumper($a);'
> $VAR1 = '1';
> $VAR1 = 1;

The has-been-evaluated-as-a-number flag got set on $a when it was
evaluated as a number; then Dumper, with both available, chose the
number format.

> How does JSON and Data::Dumper know whether Perl thinks something is a
> number or a string?

inspecting the flags; except pure-perl Dumper apparently uses a
regular expression
to identify numbers. Those are guesses. The source is available for
your inspection.


--
The Country Of The Blind, by H.G. Wells
http://cronos.advenge.com/pc/Wells/p528.html

Re: When is a string a number?

am 04.10.2006 21:36:13 von Martin.Evans

On Wed, 2006-10-04 at 14:00 -0500, David Nicol wrote:
> On 10/4/06, Martin J. Evans wrote:
>
> > With DBI/DBD::Oracle all values read from the database are scalars. As
> > everyone will know, whether something read from the database is a string
> > or a number in Perl purely depends on the context it is used in so:
>
> internally, there are flags in the SV structure that hint as to what conversions
> have been done on the SV already, for efficiency's sake.

As I suspected.

> > I have to admit I don't know how the JSON module knows what is a number
> > and what is a string in Perl but I see the same issue with Data::Dumper
> > so I presume there must be some way to find out if a perl scalar is a
> > number or a string.
>
> It seems like the pure perl Dumper tests values with a regex and optimizes
> to numbers when the stringification matches a number template, while
> the XS version checks the flags in the SV structure.

OK, I only introduced the Data::Dumper example as it seemed very similar
but read on - I over simplified a little. However, it seems strange to
me that the pure perl Data::Dumper does something different. If I was
using Data::Dumper for data interchange and switched from the pure perl
to the XS version (or vice versa) and then called another XS based
module with the results - they could be different.

> > The problem gets a lot worse for me since I do some arithmetic on values
> > pulled from the database before converting them to JSON and this is
> > where Perl seems to change them into numbers e.g.
>
> > I don't want to have to do ($var +0) on all the number fields I pull
> > from the database (to turn them into numbers) and neither do I want to
> > do a '$var .= ""' (to turn all the fields into strings).
>
> you might have to do exactly that. "$var" will produce a string version. Sorry.

This seems amazing - I really did not want to do this. I guess what I'm
saying is that it would be really useful for a DBD to mark a number as a
number so it does not start as a string and mutate to a number when I do
arithmetic on it. Most DBDs must know if the column is a number one or
not (certainly DBD::ODBC does).

> Apparently you can tell the JSON module to make everything strings:
> http://search.cpan.org/~makamaka/JSON-1.07/lib/JSON.pm#AUTOC ONVERT

Here was an over simplification. I moved from JSON to JSON Syck. The
JSON module is a purl perl module and does "the right thing" for me for
all the cases I tried but I changed to JSON Syck which uses the libsyck
library for speed. This explains the difference since JSON Syck is XS
and will know the flags in the SV structure and JSON won't.

> > As an aside (and probably a perl question rather than a DBI one) does
> > anyone know why the type of a scalar changes when you use it on the
> > right side of an assignment:
> >
> > perl -MData::Dumper -le '$a="1"; print Dumper($a); $b += $a; print
> > Dumper($a);'
> > $VAR1 = '1';
> > $VAR1 = 1;
>
> The has-been-evaluated-as-a-number flag got set on $a when it was
> evaluated as a number; then Dumper, with both available, chose the
> number format.

That's interesting to know, I'll look into that. I was rather surprised
the right hand side of an evaluation changed the type of a variable
on the right hand side.

> > How does JSON and Data::Dumper know whether Perl thinks something is a
> > number or a string?
>
> inspecting the flags; except pure-perl Dumper apparently uses a
> regular expression
> to identify numbers. Those are guesses. The source is available for
> your inspection.

I did look at it briefly but not extensively since it was not my primary
problem.

Thanks for your insights David, they have increased my understanding
although I still feel stuck. I could move back to using the JSON module
but I could run into problems with fields like house_name_or_number
which I'm now guessing JSON will serialise as a number when it looks
like a number and a string when not. If I stick with JSON Syck
(which is a lot faster and I need the speed), I'm
forced to do +0 on all database fields I know are a number and I don't
like this at all. I guess I'm rather surprised to hit this issue without
seeing anyone else with it after all this time using DBI.

Martin
--
Martin J. Evans
Easysoft Limited
http://www.easysoft.com

RE: When is a string a number?

am 04.10.2006 21:50:33 von Will.Rutherdale

This is a very interesting problem, Martin, because most of us ordinary
Perl users don't experience these issues on a regular basis if ever.

Just one thought, maybe a bit naiive, but wouldn't it be possible to
copy all the values you've got from the database and do the extra
computations on the copies separately? It might avoid polluting the
originals before passing them off to JSON.

-Will


> -----Original Message-----
> From: Martin J. Evans [mailto:martin.evans@easysoft.com]=20
> Sent: Wednesday 04 October 2006 15:36
> To: dbi-users
> Subject: Re: When is a string a number?
>=20
>=20
> On Wed, 2006-10-04 at 14:00 -0500, David Nicol wrote:
> > On 10/4/06, Martin J. Evans wrote:
> >=20
> > > With DBI/DBD::Oracle all values read from the database=20
> are scalars. As
> > > everyone will know, whether something read from the=20
> database is a string
> > > or a number in Perl purely depends on the context it is=20
> used in so:
> >=20
> > internally, there are flags in the SV structure that hint=20
> as to what conversions
> > have been done on the SV already, for efficiency's sake.
>=20
> As I suspected.
>=20
> > > I have to admit I don't know how the JSON module knows=20
> what is a number
> > > and what is a string in Perl but I see the same issue=20
> with Data::Dumper
> > > so I presume there must be some way to find out if a perl=20
> scalar is a
> > > number or a string.
> >=20
> > It seems like the pure perl Dumper tests values with a=20
> regex and optimizes
> > to numbers when the stringification matches a number template, while
> > the XS version checks the flags in the SV structure.
>=20
> OK, I only introduced the Data::Dumper example as it seemed=20
> very similar
> but read on - I over simplified a little. However, it seems strange to
> me that the pure perl Data::Dumper does something different. If I was
> using Data::Dumper for data interchange and switched from the=20
> pure perl
> to the XS version (or vice versa) and then called another XS based
> module with the results - they could be different.
>=20
> > > The problem gets a lot worse for me since I do some=20
> arithmetic on values
> > > pulled from the database before converting them to JSON=20
> and this is
> > > where Perl seems to change them into numbers e.g.
> >=20
> > > I don't want to have to do ($var +0) on all the number=20
> fields I pull
> > > from the database (to turn them into numbers) and neither=20
> do I want to
> > > do a '$var .=3D ""' (to turn all the fields into strings).
> >=20
> > you might have to do exactly that. "$var" will produce a=20
> string version. Sorry.
>=20
> This seems amazing - I really did not want to do this. I=20
> guess what I'm
> saying is that it would be really useful for a DBD to mark a=20
> number as a
> number so it does not start as a string and mutate to a=20
> number when I do
> arithmetic on it. Most DBDs must know if the column is a number one or
> not (certainly DBD::ODBC does).
>=20
> > Apparently you can tell the JSON module to make everything strings:
> > http://search.cpan.org/~makamaka/JSON-1.07/lib/JSON.pm#AUTOC ONVERT
>=20
> Here was an over simplification. I moved from JSON to JSON Syck. The
> JSON module is a purl perl module and does "the right thing"=20
> for me for
> all the cases I tried but I changed to JSON Syck which uses=20
> the libsyck
> library for speed. This explains the difference since JSON Syck is XS
> and will know the flags in the SV structure and JSON won't.
>=20
> > > As an aside (and probably a perl question rather than a=20
> DBI one) does
> > > anyone know why the type of a scalar changes when you use=20
> it on the
> > > right side of an assignment:
> > >
> > > perl -MData::Dumper -le '$a=3D"1"; print Dumper($a); $b +=3D $a; print
> > > Dumper($a);'
> > > $VAR1 =3D '1';
> > > $VAR1 =3D 1;
> >=20
> > The has-been-evaluated-as-a-number flag got set on $a when it was
> > evaluated as a number; then Dumper, with both available, chose the
> > number format.
>=20
> That's interesting to know, I'll look into that. I was rather=20
> surprised
> the right hand side of an evaluation changed the type of a variable
> on the right hand side.
>=20
> > > How does JSON and Data::Dumper know whether Perl thinks=20
> something is a
> > > number or a string?
> >=20
> > inspecting the flags; except pure-perl Dumper apparently uses a
> > regular expression
> > to identify numbers. Those are guesses. The source is available for
> > your inspection.
>=20
> I did look at it briefly but not extensively since it was not=20
> my primary
> problem.
>=20
> Thanks for your insights David, they have increased my understanding
> although I still feel stuck. I could move back to using the=20
> JSON module
> but I could run into problems with fields like house_name_or_number
> which I'm now guessing JSON will serialise as a number when it looks
> like a number and a string when not. If I stick with JSON Syck
> (which is a lot faster and I need the speed), I'm
> forced to do +0 on all database fields I know are a number and I don't
> like this at all. I guess I'm rather surprised to hit this=20
> issue without
> seeing anyone else with it after all this time using DBI.
>=20
> Martin
> --=20
> Martin J. Evans
> Easysoft Limited
> http://www.easysoft.com
>=20


- - - - - Appended by Scientific Atlanta, a Cisco company - - - - -=20
This e-mail and any attachments may contain information which is confidenti=
al, proprietary, privileged or otherwise protected by law. The information =
is solely intended for the named addressee (or a person responsible for del=
ivering it to the addressee). If you are not the intended recipient of this=
message, you are not authorized to read, print, retain, copy or disseminat=
e this message or any part of it. If you have received this e-mail in error=
, please notify the sender immediately by return e-mail and delete it from =
your computer.

RE: When is a string a number?

am 04.10.2006 22:07:49 von Martin.Evans

On Wed, 2006-10-04 at 15:50 -0400, Rutherdale, Will wrote:
> This is a very interesting problem, Martin, because most of us ordinary
> Perl users don't experience these issues on a regular basis if ever.
>
> Just one thought, maybe a bit naiive, but wouldn't it be possible to
> copy all the values you've got from the database and do the extra
> computations on the copies separately? It might avoid polluting the
> originals before passing them off to JSON.
>
> -Will

It would be possible if it were not for the facts that:

a) the values computed from the DB values are required in the JSON
output. i.e. the perl code does some calculations on DB values that are
logic based and difficult to perform in SQL before being transformed
into JSON.

b) the computations on the DB retrieved values are dependent on what
values are returned so sometimes I do a $a += 1 before transforming to
JSON and sometimes I don't. This itself means that I can end up with
strings in the JSON or numbers depending on what logic is applied to the
DB returned values.

Martin
--
Martin J. Evans
Easysoft Limited
http://www.easysoft.com


> > -----Original Message-----
> > From: Martin J. Evans [mailto:martin.evans@easysoft.com]
> > Sent: Wednesday 04 October 2006 15:36
> > To: dbi-users
> > Subject: Re: When is a string a number?
> >
> >
> > On Wed, 2006-10-04 at 14:00 -0500, David Nicol wrote:
> > > On 10/4/06, Martin J. Evans wrote:
> > >
> > > > With DBI/DBD::Oracle all values read from the database
> > are scalars. As
> > > > everyone will know, whether something read from the
> > database is a string
> > > > or a number in Perl purely depends on the context it is
> > used in so:
> > >
> > > internally, there are flags in the SV structure that hint
> > as to what conversions
> > > have been done on the SV already, for efficiency's sake.
> >
> > As I suspected.
> >
> > > > I have to admit I don't know how the JSON module knows
> > what is a number
> > > > and what is a string in Perl but I see the same issue
> > with Data::Dumper
> > > > so I presume there must be some way to find out if a perl
> > scalar is a
> > > > number or a string.
> > >
> > > It seems like the pure perl Dumper tests values with a
> > regex and optimizes
> > > to numbers when the stringification matches a number template, while
> > > the XS version checks the flags in the SV structure.
> >
> > OK, I only introduced the Data::Dumper example as it seemed
> > very similar
> > but read on - I over simplified a little. However, it seems strange to
> > me that the pure perl Data::Dumper does something different. If I was
> > using Data::Dumper for data interchange and switched from the
> > pure perl
> > to the XS version (or vice versa) and then called another XS based
> > module with the results - they could be different.
> >
> > > > The problem gets a lot worse for me since I do some
> > arithmetic on values
> > > > pulled from the database before converting them to JSON
> > and this is
> > > > where Perl seems to change them into numbers e.g.
> > >
> > > > I don't want to have to do ($var +0) on all the number
> > fields I pull
> > > > from the database (to turn them into numbers) and neither
> > do I want to
> > > > do a '$var .= ""' (to turn all the fields into strings).
> > >
> > > you might have to do exactly that. "$var" will produce a
> > string version. Sorry.
> >
> > This seems amazing - I really did not want to do this. I
> > guess what I'm
> > saying is that it would be really useful for a DBD to mark a
> > number as a
> > number so it does not start as a string and mutate to a
> > number when I do
> > arithmetic on it. Most DBDs must know if the column is a number one or
> > not (certainly DBD::ODBC does).
> >
> > > Apparently you can tell the JSON module to make everything strings:
> > > http://search.cpan.org/~makamaka/JSON-1.07/lib/JSON.pm#AUTOC ONVERT
> >
> > Here was an over simplification. I moved from JSON to JSON Syck. The
> > JSON module is a purl perl module and does "the right thing"
> > for me for
> > all the cases I tried but I changed to JSON Syck which uses
> > the libsyck
> > library for speed. This explains the difference since JSON Syck is XS
> > and will know the flags in the SV structure and JSON won't.
> >
> > > > As an aside (and probably a perl question rather than a
> > DBI one) does
> > > > anyone know why the type of a scalar changes when you use
> > it on the
> > > > right side of an assignment:
> > > >
> > > > perl -MData::Dumper -le '$a="1"; print Dumper($a); $b += $a; print
> > > > Dumper($a);'
> > > > $VAR1 = '1';
> > > > $VAR1 = 1;
> > >
> > > The has-been-evaluated-as-a-number flag got set on $a when it was
> > > evaluated as a number; then Dumper, with both available, chose the
> > > number format.
> >
> > That's interesting to know, I'll look into that. I was rather
> > surprised
> > the right hand side of an evaluation changed the type of a variable
> > on the right hand side.
> >
> > > > How does JSON and Data::Dumper know whether Perl thinks
> > something is a
> > > > number or a string?
> > >
> > > inspecting the flags; except pure-perl Dumper apparently uses a
> > > regular expression
> > > to identify numbers. Those are guesses. The source is available for
> > > your inspection.
> >
> > I did look at it briefly but not extensively since it was not
> > my primary
> > problem.
> >
> > Thanks for your insights David, they have increased my understanding
> > although I still feel stuck. I could move back to using the
> > JSON module
> > but I could run into problems with fields like house_name_or_number
> > which I'm now guessing JSON will serialise as a number when it looks
> > like a number and a string when not. If I stick with JSON Syck
> > (which is a lot faster and I need the speed), I'm
> > forced to do +0 on all database fields I know are a number and I don't
> > like this at all. I guess I'm rather surprised to hit this
> > issue without
> > seeing anyone else with it after all this time using DBI.
> >
> > Martin
> > --
> > Martin J. Evans
> > Easysoft Limited
> > http://www.easysoft.com
> >
>
>
> - - - - - Appended by Scientific Atlanta, a Cisco company - - - - -
> This e-mail and any attachments may contain information which is confidential, proprietary, privileged or otherwise protected by law. The information is solely intended for the named addressee (or a person responsible for delivering it to the addressee). If you are not the intended recipient of this message, you are not authorized to read, print, retain, copy or disseminate this message or any part of it. If you have received this e-mail in error, please notify the sender immediately by return e-mail and delete it from your computer.
>
>