Identify if a scalar is int, double or text

Klaus · May 11, 2007

Hello,

I have a list of 4 scalars
my @L = (3, '3', 3.0, '3.0');

The first is obviously an int, the second is text, the third a double
and the last is text again.

I want to write a subroutine type_id which returns either 'int',
'double', 'text' (or '?') for each of the scalars, so that

print type_id($_), ' ' for (@L); print "\n";

results in the following output:

int text double text

I have found a solution where I use Devel::Peek, call Dump(), redirect
STDERR into an "in-memory" file ( \$data ) and analyse the "in-memory"
content:

$data =~ /^SV = IV\(0x/ # for ints
$data =~ /^SV = NV\(0x/ # for doubles
$data =~ /^SV = PV\(0x/ # for text

but I wanted to know whether there is a better way.

**********************************************************
Here is my solution:

use strict;
use warnings;

use Devel::Peek;

print STDERR "Beginning of program\n";

my @L = (3, '3', 3.0, '3.0');

print type_id($_), ' ' for (@L);
print "\n";

print STDERR "End of program\n";

sub type_id {

# ============================
# At first I could not get the "in-memory"
# working. It took me a while before I
# found the all important documentation
# in perldoc -f open:
#
# [...] if you try to re-open STDOUT or
# STDERR as an "in memory" file, you have
# to close it first [...]
#
# As a consequence, a simple
# local *STDERR;
# open STDERR, '>', \$data;
# does not work.
#
# The following redirects STDERR into an
# "in-memory" file, but leaves STDERR
# closed on exit:
# local *STDERR = *STDERR;
# close STDERR;
# open STDERR, '>', \$data;
#
# so we have to dup STDERR first and
# restore STDERR manually at the end
# (...knowing that if the restore fails,
# we won't have STDERR anymore):
# ============================

# dup STDERR
open my $olderr, '>&', \*STDERR
or return "?002 [dup STDERR: $!]";
close STDERR
or return "?005 [close STDERR: $!]";

my $data = '';
open STDERR, '>', \$data;
Dump $_[0];
close STDERR;

# restore STDERR
open STDERR, '>&', $olderr or die;

if ($data =~ m{^SV = (.V)}) {
if ($1 eq 'IV') { return 'int' }
if ($1 eq 'NV') { return 'double' }
if ($1 eq 'PV') { return 'text' }
return "?020 [Invalid: SV = $1]";
}
return "?030 [Err: ".substr($data, 0, 12)."]";
}

Sisyphus · May 11, 2007

Klaus said:
Hello,

I have a list of 4 scalars
my @L = (3, '3', 3.0, '3.0');

The first is obviously an int, the second is text, the third a double
and the last is text again.

I want to write a subroutine type_id which returns either 'int',
'double', 'text' (or '?') for each of the scalars, so that

print type_id($_), ' ' for (@L); print "\n";

results in the following output:

int text double text

I have found a solution where I use Devel::Peek, call Dump(), redirect
STDERR into an "in-memory" file ( \$data ) and analyse the "in-memory"
content:

$data =~ /^SV = IV\(0x/ # for ints
$data =~ /^SV = NV\(0x/ # for doubles
$data =~ /^SV = PV\(0x/ # for text

but I wanted to know whether there is a better way.

There's probably already a module that does this. (Scalar-Util-Numeric might
be one such module.)
It's fairly straightforward with XS or Inline::C. Be aware that a variable
can change from one type to another in rather sneaky ways - as the following
demonstrates:

use warnings;
use Inline C => Config =>
BUILD_NOISY => 1;

use Inline C => <<'END_OF_C_CODE';

int _itsa (SV * x) {
if(SvIOK(x)) return 1;
if(SvNOK(x)) return 2;
if(SvPOK(x)) return 3;
return 0; /* none of the three flags is set, e.g. undef or a reference */
}

END_OF_C_CODE

my @L = (3, '3', 3.0, '3.0');
print type_id($_), ' ' for (@L); print "\n";

# int text double text

my $string = '7';
print type_id($string), "\n"; # text
$string *= 1;
print type_id($string), "\n"; # int

my $var = ~0;
print type_id($var), "\n"; # int
$var *= -1;
print type_id($var), "\n"; # double (though that might
# differ on 64-bit perls)

sub type_id {
return "int" if (_itsa($_[0]) == 1);
return "double" if(_itsa($_[0]) == 2);
return "text" if(_itsa($_[0]) == 3);
return "unknown";
}
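
For readers without a C compiler, much the same flag tests can be sketched in pure Perl with the core B module. This is only an illustration and not Sisyphus's code: the sub name type_flags is made up here, and the sketch assumes that B exports the SVf_IOK, SVf_NOK and SVf_POK constants on request.

```perl
use strict;
use warnings;
use B qw(svref_2object SVf_IOK SVf_NOK SVf_POK);

# Hypothetical helper: same test order as _itsa above (integer flag first).
sub type_flags {
    my $flags = svref_2object(\$_[0])->FLAGS;
    return 'int'    if $flags & SVf_IOK;
    return 'double' if $flags & SVf_NOK;
    return 'text'   if $flags & SVf_POK;
    return 'unknown';
}

my $string = '7';
print type_flags($string), "\n";   # text
$string *= 1;
print type_flags($string), "\n";   # int
```

Because a scalar can carry several of these flags at once, the order of the tests decides the answer, which is another reason the result should be treated as a hint rather than a type.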

anno4000 · May 11, 2007

Klaus said:
Hello,

I have a list of 4 scalars
my @L = (3, '3', 3.0, '3.0');

The first is obviously an int, the second is text, the third a double
and the last is text again.

I want to write a subroutine type_id which returns either 'int',
'double', 'text' (or '?') for each of the scalars, so that

print type_id($_), ' ' for (@L); print "\n";

results in the following output:

int text double text

It is very rare in Perl that you need to know these differences.

I have found a solution where I use Devel::Peek, call Dump(), redirect
STDERR into an "in-memory" file ( \$data ) and analyse the "in-memory"
content:

[snip]

You can tell the difference between a number or a string from
the behavior of bitwise boolean operations:

print $_ & ~$_ ? 'str ' :'num ' for (3, '3', 3.0, '3.0');
print "\n";

There is no similar way to tell the difference between an integer
and a float that happens to have an integer value.

The B::* set of modules should have the means to get the info
without catching printed output.

Anno

Mirco Wahab · May 11, 2007

Klaus said:
Hello,

I have a list of 4 scalars
my @L = (3, '3', 3.0, '3.0');

The first is obviously an int, the second is text, the third a double
and the last is text again.

I want to write a subroutine type_id which returns either 'int',
'double', 'text' (or '?') for each of the scalars, so that

print type_id($_), ' ' for (@L); print "\n";

results in the following output:

int text double text

As Sisyphus mentioned, that might be tricky and
not very reliable, because Perl 'DWYM'-ifies
its scalars every time depending on the context.

The situations where you would really need this
detailed information would also (imho) justify
looking deep into the perl (as Sisyphus did with
XS code or with the B modules).

After reading Anno's idea (bitwise complement), I
tried to combine this with Scalar::Util::Numeric
by a quick hack:

...
use Scalar::Util::Numeric qw(isnum isint isfloat);
no strict 'refs';

for my $l (3, '3', 3.0, '3.0', 'abc') {
print "[ $l ]\tlooks like ", $l & ~$l ? 'string' :'number', ' | ';
print map "$_, ", grep $_->($l), qw' isint isfloat ' ;
print "\n"
}
...

which prints:

[ 3 ] looks like number | isint,
[ 3 ] looks like string | isint,
[ 3 ] looks like number | isint,
[ 3.0 ] looks like string | isfloat,
[ abc ] looks like string |

Just an idea ...

Regards

M.

Klaus · May 11, 2007

Klaus said:
Klaus said:

I have a list of 4 scalars
my @L = (3, '3', 3.0, '3.0');
I want to write a subroutine type_id which returns either 'int',
'double', 'text' (or '?') for each of the scalars
I have found a solution where I use Devel::Peek, call Dump(), redirect
STDERR into an "in-memory" file ( \$data ) and analyse the "in-memory"
content:


[snip]

The B::* set of modules should have the means to get the info
without catching printed output.

That's it - it's the B module which I was looking for (I knew it
existed, but I never thought I would ever use it).

Thanks a million !!

I am now using a simple "use B qw(svref_2object class);" and my
program is now much smaller and more to the point without any
redirection of STDERR:

Here is my new program:
============================
use strict;
use warnings;

use B qw(svref_2object class);

my @L = (3, '3', 3.0, '3.0');

print type_id($_), ' ' for (@L);
print "\n";

sub type_id {
my $cl = class svref_2object \$_[0];
if ($cl eq 'IV') { return 'int' }
if ($cl eq 'NV') { return 'double' }
if ($cl eq 'PV') { return 'text' }
return "?020 [Invalid: class(SV-Object) = '$cl']";
}
============================

Klaus · May 11, 2007

Be aware that a variable can change from one type to
another in rather sneaky ways - as the following
demonstrates:

[ snip ]

# int text double text

my $string = '7';
print type_id($string), "\n"; # text
$string *= 1;
print type_id($string), "\n"; # int

my $var = ~0;
print type_id($var), "\n"; # int
$var *= -1;
print type_id($var), "\n"; # double (though that might
# differ on 64-bit perls)

I see your point.

Here is another example of how a variable can change from one type to
another in rather sneaky ways:

=============================
use strict;
use warnings;

use B qw(svref_2object class);

my $num = 1;
while (1) {
my $id = type_id($num);
printf "id = %-6s num = %s\n", $id, $num;
last if $id ne 'int';
$num *= 10;
}

while ($num > 1) {
$num /= 10;
my $id = type_id($num);
printf "id = %-6s num = %s\n", $id, $num;
}

sub type_id {
my $cl = class svref_2object \$_[0];
if ($cl =~ m{IV$}) { return 'int' }
if ($cl =~ m{NV$}) { return 'double' }
if ($cl =~ m{PV$}) { return 'text' }
return "[$cl]";
}
=============================

And here is the output:

=============================
id = int    num = 1
id = int    num = 10
id = int    num = 100
id = int    num = 1000
id = int    num = 10000
id = int    num = 100000
id = int    num = 1000000
id = int    num = 10000000
id = int    num = 100000000
id = int    num = 1000000000
id = double num = 10000000000
id = double num = 1000000000
id = double num = 100000000
id = double num = 10000000
id = double num = 1000000
id = double num = 100000
id = double num = 10000
id = double num = 1000
id = double num = 100
id = double num = 10
id = double num = 1
=============================

anno4000 · May 11, 2007

Klaus said:
Klaus said:

I have a list of 4 scalars
my @L = (3, '3', 3.0, '3.0');
I want to write a subroutine type_id which returns either 'int',
'double', 'text' (or '?') for each of the scalars
I have found a solution where I use Devel::Peek, call Dump(), redirect
STDERR into an "in-memory" file ( \$data ) and analyse the "in-memory"
content:


[snip]

The B::* set of modules should have the means to get the info
without catching printed output.


That's it - it's the B module which I was looking for (I knew it
existed, but I never thought I would ever use it).

Thanks a million !!

Glad I could help, but I'm asking myself why you need to
know these differences. Perl works very hard to let us deal with
integers, floats, strings and references in a unified way. In
particular, the difference of a float and an int is usually
irrelevant. If it matters to your program, you are probably
not doing something the Perl way.

Anno

anno4000 · May 11, 2007

After reading Anno's idea (bitwise complement), I

I got that slightly wrong.

my $type = $_ & ~$_ ? 'str' : 'num';

takes an empty string for a number.

my $type = !length || $_ & ~$_ ? 'str' : 'num';

gets it right.

Anno
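
The corrected expression can be wrapped in a small sub to try it on the values from the thread. This is a sketch only: the name str_or_num is invented here. It also assumes the "bitwise" feature is not enabled (for example via "use v5.28" or later), because with that feature on, & and ~ always treat their operands as numbers and the trick no longer distinguishes anything.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the expression above.
# String operands make & and ~ work bytewise, which yields a true,
# non-empty string; numeric operands yield 0.
sub str_or_num {
    local $_ = shift;
    return !length || $_ & ~$_ ? 'str' : 'num';
}

print join(' ', map { str_or_num($_) } 3, '3', 3.0, '3.0', ''), "\n";
# num str num str str
```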

Klaus · May 12, 2007

Klaus said:
Klaus said:


I have a list of 4 scalars
my @L = (3, '3', 3.0, '3.0');
I want to write a subroutine type_id which returns either 'int',
'double', 'text' (or '?') for each of the scalars
I have found a solution where I use Devel::Peek, call Dump(), redirect
STDERR into an "in-memory" file ( \$data ) and analyse the "in-memory"
content:


[snip]


The B::* set of modules should have the means to get the info
without catching printed output.


That's it - it's the B module which I was looking for (I knew it
existed, but I never thought I would ever use it).


Thanks a million !!


Glad I could help, but I'm asking myself why you need to
know these differences. Perl works very hard to let us deal with
integers, floats, strings and references in a unified way. In
particular, the difference of a float and an int is usually
irrelevant. If it matters to your program, you are probably
not doing something the Perl way.

I agree with that.

But I have a case where I need more control over the way Perl manages
numerical values.

At some point in the future, I will certainly have to read "perldoc
perlxstut" to understand how I can use C to control
numerical values, but being on Windows XP / Activestate Perl 5.8.8
with next to no background in C, I thought I would tackle the problem first
with whatever tool / module / function Perl is giving me.

Here is my case:

I have written a Perl program where some of the variables have a
special requirement, that is, they deal exclusively with monetary Euro
values (i.e. the requirement is that calculations happen in normal
IEEE float/double arithmetic, and any value assigned to a special "monetary"
variable must be rounded to the 2nd decimal).

My first approach was to use sprintf("%.2f", $var) whenever I assign
values to a "monetary Euro" variable, but this falls foul of the way
numerical values are stored internally (i.e. 0.10 can not be
represented as an exact double).

I worked around that by storing integer "cents" rather than "Euros": I
use sprintf("%.0f", $var) when I assign "monetary" values.

That works fine, but I found out that a "monetary" value above 20
million Euros ( > 2^31 cents ) is not stored as an integer, but as a
double. As I said, it seems to work and the double seems to stick to
its "integer" properties, but I want to monitor very closely what
happens with that "double" if I grow bigger and bigger values.

My final solution would be to tie variables to some C-subroutines that
(efficiently) do 4 things when I assign values to that variable:

1. Round values to the nearest integer (same as sprintf "%.0f", but
more efficient).
2. If that value fits into an int (i.e. -2^31 <= val <= 2^31-1), then
it must always be stored as an int.
3. If it is stored as a double, then make sure that double has no
decimal.
4. If a value is too big to be represented as a double with no
decimals, then
4a.) either set the variable to the maximum integral "double"
value,
4b.) or, depending on a compile time option, die.

-- Klaus
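
Requirements 1, 3 and 4b can be sketched in plain Perl before reaching for C. The sub name to_whole_cents and the constant MAX_EXACT are invented for this illustration, and the limit of 2**53 assumes that Perl's NV is an IEEE 754 double. Whether perl stores the result internally as an int or a double (requirement 2) is left to perl itself here.

```perl
use strict;
use warnings;

# Assumption: NVs are IEEE 754 doubles, so every integer up to 2**53
# in magnitude is represented exactly.
use constant MAX_EXACT => 2**53;

# Hypothetical helper: round to whole cents and refuse values that a
# double can no longer hold exactly (requirement 4b, the "die" option).
sub to_whole_cents {
    my ($value) = @_;
    my $cents = sprintf('%.0f', $value) + 0;   # round, then back to a number
    die "value $value is outside the exact integer range\n"
        if abs($cents) > MAX_EXACT;
    return $cents;
}

print to_whole_cents(2**31 + 0.4), "\n";       # 2147483648
```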

Dr.Ruud · May 12, 2007

Klaus schreef:

That works fine, but I found out that a "monetary" value above 20
million Euros ( > 2^31 cents ) is not stored as an integer, but as a
double. As I said, it seems to work and the double seems to stick to
its "integer" properties, but I want to monitor very closely what
happens with that "double" if I grow bigger and bigger values.

Never heard of bigints?
http://search.cpan.org/search?query=bigint&mode=all

Klaus · May 12, 2007

Klaus schreef:

Never heard of bigints?
http://search.cpan.org/search?query=bigint&mode=all

Yes, I heard of the bigint::* modules, they construct a replacement of
all mathematical operations, but that carries an overhead and it lacks
control:

==> time/memory overhead: bigint variables are objects.
==> lack of control: there is no logical limit to how big a bigint
can get.

I want to stay with the standard built-in "int / double" arithmetic,
but I want additional control of the transition between the (as I see
it) 3 types of numerical values:

type 1. (int)
type 2. (double, with only integral values)
type 3. (double, with free decimal values)

This control should be implemented efficiently, hence my preference
for subroutines written in C.

Uri Guttman · May 12, 2007

K> I want to stay with the standard built-in "int / double" arithmetic,
K> but I want additional control of the transition between the (as I see
K> it) 3 types of numerical values:

K> type 1. (int)
K> type 2. (double, with only integral values)
K> type 3. (double, with free decimal values)

K> This control should be implemented efficiently, hence my preference
K> for subroutines written in C.

you can force integer math with use integer. but the normal way to
manage money exactly has always been to use an integer for the lowest
denomination. this means for dollars you count integer cents. only when
you convert in/out for printing do you deal with the decimal point. this
way you get total control, speed, accuracy, and easy rounding with
int(). if you want more accuracy (.1 cents) just make that your count
size. and you can overflow to float as a large int (> 32 bits) and get
53 bits of int so there will never be a loss of digits until you get to
be richer than uncle bill.

and i agree with paul, needing to know so much detail is not
perlish. using ints for money is what you want to do and is easy in
perl.

uri
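
The "integer cents inside, decimal point only at the edges" approach can be sketched as follows. The sub names to_cents and from_cents are invented for this illustration, and the code assumes a perl with 64-bit integers so that % and %d stay exact for large amounts.

```perl
use strict;
use warnings;

# Hypothetical helper: parse text such as "12.34" into integer cents
# without ever going through a fractional double.
sub to_cents {
    my ($text) = @_;
    my ($sign, $whole, $frac) = $text =~ /^(-?)(\d+)(?:\.(\d{1,2}))?$/
        or die "not a money amount: $text\n";
    $frac = '' unless defined $frac;
    $frac .= '0' x (2 - length $frac);          # "1" means 10 cents
    my $cents = $whole * 100 + $frac;
    return $sign ? -$cents : $cents;
}

# Hypothetical helper: format integer cents for printing.
sub from_cents {
    my ($cents) = @_;
    my $abs = abs $cents;
    return sprintf '%s%d.%02d',
        ($cents < 0 ? '-' : ''), int($abs / 100), $abs % 100;
}

print to_cents('12.34'), "\n";                  # 1234
print from_cents(to_cents('12.34') * 3), "\n";  # 37.02
```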

Klaus · May 12, 2007

K> I want to stay with the standard built-in "int / double" arithmetic,
K> but I want additional control of the transition between the (as I see
K> it) 3 types of numerical values:

K> type 1. (int)
K> type 2. (double, with only integral values)
K> type 3. (double, with free decimal values)

K> This control should be implemented efficiently, hence my preference
K> for subroutines written in C.

you can force integer math with use integer. but the normal way to
manage money exactly has always been to use an integer for the lowest
denomination. this means for dollars you count integer cents. only when
you convert in/out for printing do you deal with the decimal point. this
way you get total control, speed, accuracy, and easy rounding with
int(). if you want more accuracy (.1 cents) just make that your count
size. and you can overflow to float as a large int (> 32 bits) and get
53 bits of int so there will never be a loss of digits until you get to
be richer than uncle bill.

and i agree with paul, needing to know so much detail is not
perlish. using ints for money is what you want to do and is easy in
perl.

I see, but my accounting program in Perl needs exact rounding (to the
cent) in a lot of places. As a consequence, the program is littered
with sprintf("%.0f") to get the rounding I need (I can't use integer,
because my monetary values exceed 32 bits and I don't want to use
bigint to avoid the overhead).

Let me please present a simple example which hopefully explains my
point:

Let's assume, for the sake of argument, that I had to write an
accounting program in Perl which calculates the * exact * annual
interest at a rate of 6.3% on Bill's account balance:

=====================================
my $Bills_Balance = 52_625_587_938_19;
my $Annual_Interest = $Bills_Balance * 0.063;
my $Exact_Interest = sprintf("%.0f", $Bills_Balance * 0.063);
print "Balance = $Bills_Balance\n";
print "Interest = $Annual_Interest\n";
print "Exact Interest = $Exact_Interest\n";
=====================================
Output:
=====================================
Balance = 5262558793819
Interest = 331541204010.597
Exact Interest = 331541204011
=====================================

Maybe it is not perlish, but I would very much prefer to have the
variable "$Exact_Interest" tied to an efficient C-subroutine to
replace the ugly sprintf("%.0f").

I am not an expert on "tie" and I have just started reading "perldoc
perltie", but here is my vision of how I would like to be able to
rewrite the example program:

=====================================
use Efficient_C_Subroutine_to_replace_rounding_with_sprintf;

my ($Annual_Interest, $Exact_Interest);
tie $Exact_Interest,
'Efficient_C_Subroutine_to_replace_rounding_with_sprintf';

my $Bills_Balance = 52_625_587_938_19;

$Annual_Interest = $Bills_Balance * 0.063;
$Exact_Interest = $Bills_Balance * 0.063;

print "Balance = $Bills_Balance\n";
print "Interest = $Annual_Interest\n";
print "Exact Interest = $Exact_Interest\n";
=====================================

I have searched CPAN, but I haven't yet found such a module.
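
A tied scalar of this kind does not need C to try out. The following is a pure-Perl sketch using the TIESCALAR, FETCH and STORE methods described in perltie. The class name RoundedCents is hypothetical and not a CPAN module, and the rounding is still done by sprintf("%.0f"), only hidden behind the assignment.

```perl
use strict;
use warnings;

package RoundedCents;    # hypothetical class name, not a CPAN module

sub TIESCALAR { my ($class) = @_; my $value = 0; return bless \$value, $class }
sub FETCH     { my ($self) = @_; return $$self }
sub STORE     { my ($self, $new) = @_; $$self = sprintf('%.0f', $new) + 0 }

package main;

tie my $Exact_Interest, 'RoundedCents';
my $Bills_Balance = 52_625_587_938_19;
$Exact_Interest = $Bills_Balance * 0.063;    # STORE rounds on assignment
print "Exact Interest = $Exact_Interest\n";  # Exact Interest = 331541204011
```

Every read and write of a tied variable goes through a method call, so this is likely to be slower than calling sprintf directly; it buys tidier code, not speed.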

Sisyphus · May 13, 2007

..
..

I see, but my accounting program in Perl needs exact rounding (to the
cent) in a lot of places. As a consequence, the program is littered
with sprintf("%.0f") to get the rounding I need

If sprintf("%.0f") is doing what you want, just wrap it in your own
subroutine and call that subroutine instead:

------------------------------------
use warnings;

$bal = 5262558793819;
$interest = $bal * 0.063;
print $interest, "\n";
round(\$interest);
print $interest, "\n";

sub round {
die "Must pass by reference" if !ref($_[0]);
${$_[0]} = sprintf("%.0f", ${$_[0]});
}
------------------------------------

(I can't use integer,
because my monetary values exceed 32 bits and I don't want to use
bigint to avoid the overhead).

For the types of calculations you're doing (as I understand it), the
overhead of Math::BigInt is not all that great. If the numbers get too
unwieldy (or the number of calculations you're doing gets into the hundreds
of thousands) use Math::GMP - which is about 5 times faster than M::BI for
the calculations I envisage you are doing, and about 700 times faster by the
time you get to multiplications involving 100,000-digit numbers.

Installing Math::GMP is as simple as:
ppm install http://theoryx5.uwinnipeg.ca/ppms/Math-GMP.ppd

Cheers,
Rob
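
For comparison, here is how the interest example might look with the core Math::BigInt module, keeping everything in integer cents. This is a sketch under one assumption that is not from the thread: the rate of 6.3% is written as 63/1000, and 500 is added before the integer division so that the result is rounded to the nearest cent.

```perl
use strict;
use warnings;
use Math::BigInt;

my $balance  = Math::BigInt->new('5262558793819');   # cents
# Integer division truncates, so add half the divisor first to round.
my $interest = ($balance * 63 + 500) / 1000;

print "Exact Interest = $interest\n";   # Exact Interest = 331541204011
```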

Uri Guttman · May 13, 2007

K> I see, but my accounting program in Perl needs exact rounding (to the
K> cent) in a lot of places. As a consequence, the program is littered
K> with sprintf("%.0f") to get the rounding I need (I can't use integer,
K> because my monetary values exceed 32 bits and I don't want to use
K> bigint to avoid the overhead).

as i said you can use ints and overflow to doubles (not bigint, machine
double floats) that will hold much more than 32 bits for an integer. so
you are covered there unless your fractions are very small and need too
many of them. or use a 64 bit perl if you have the cpu for it.

K> Let's assume for the sake of the argument, I had to write an
K> accounting program in Perl which calculates the * exact * annual
K> interest rate at 6.3% for Bill's account balance:

K> =====================================
K> my $Bills_Balance = 52_625_587_938_19;
K> my $Annual_Interest = $Bills_Balance * 0.063;
K> my $Exact_Interest = sprintf("%.0f", $Bills_Balance * 0.063);

use int() instead. faster and cleaner.

K> Balance = 5262558793819
K> Interest = 331541204010.597
K> Exact Interest = 331541204011

perl -le '$b = 52_625_587_938_19; $intr = $b * .063 ; print $intr ; print int($intr); print int($intr + .5)'
331541204010.597
331541204010
331541204011

look ma! same output, simpler code!

K> Maybe it is not perlish, but I would very much prefer to have the
K> variable "$Exact_Interest" tied to an efficient C-subroutine to
K> replace the ugly sprintf("%.0f").

int() is an efficient c routine inside perl.

K> I have searched CPAN, but I haven't yet found such a module.

no need for a module. int() is builtin.

uri
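
One caveat on the one-liner: int() truncates toward zero, so int($x + .5) rounds to the nearest integer only for non-negative values. A sign-aware variant could look like this sketch (the sub name round_half_away is invented here). Note also that sprintf("%.0f") typically rounds exact halves to the nearest even digit, so the two approaches can differ on values such as 0.5 or 2.5.

```perl
use strict;
use warnings;

# Hypothetical helper: round half away from zero using int().
sub round_half_away {
    my ($x) = @_;
    return $x < 0 ? -int(-$x + 0.5) : int($x + 0.5);
}

print round_half_away(331541204010.597), "\n";   # 331541204011
print round_half_away(-2.6), "\n";               # -3
print int(-2.6 + 0.5), "\n";                     # -2, the unadjusted trick
```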

Klaus · May 13, 2007

I want to write a subroutine type_id which returns either 'int',
'double', 'text' (or '?') for each of the scalars

Be aware that a variable can change from one type to
another in rather sneaky ways

Glad I could help, but I'm asking myself why you need to
know these differences

Never heard of bigints?
http://search.cpan.org/search?query=bigint&mode=all

I am not an expert on "tie" and I have just started
reading "perldoc perltie"

use int() instead. faster and cleaner.
int() is an efficient c routine inside perl.
no need for a module. int() is builtin.

With the help of the Perl documentation and the Perl community, I have
understood and resolved my problem.

Thanks.

Michele Dondi · May 13, 2007

With the help of the Perl documentation and the Perl community, I have
understood and resolved my problem.

Incidentally, you were having a XY problem:

http://perlmonks.org/?node=XY+problem

Michele
