Regarding numeric literals

chaitask

Hi folks,

As part of some operation that fetches disk free space from a Windows
machine, I get numbers like 3,45,198 (comma-separated....you get the
idea right?). Since a comma is a list separator in Perl, I can't use a
number like that in my operations and so I decided to replace globally
all commas with underscores because Perl seems to OK it.

The result 3_45_198 works well in numeric comparison scenarios (less
than, greater than, etc) but not in any operations like addition,
multiplication, etc (it actually adds/multiplies the other operand
to/with 3, the leading number in 3_45_198, and returns the value). Does
it mean that numerals like this don't function as "proper" numerals?

Do I really have to delete the underscores or is there a way to make
Perl understand that I want it to function like a normal numeral?

Thanks a million,
Krish.
 
Paul Lalli

> As part of some operation that fetches disk free space from a Windows
> machine, I get numbers like 3,45,198 (comma-separated....you get the
> idea right?). Since a comma is a list separator in Perl,

The comma is the list separator in Perl *source code*, not in variables
or values read in or calculated.

> I can't use a
> number like that in my operations and so I decided to replace globally
> all commas with underscores because Perl seems to OK it.

No, it doesn't.

> The result 3_45_198 works well in numeric comparison scenarios (less
> than, greater than, etc)

No, it doesn't. And if you were enabling warnings, it would tell you
that. Why aren't you asking the computer for all the help you can get?

> but not in any operations like addition,
> multiplication, etc (it actually adds/multiplies the other operand
> to/with 3, the leading number in 3_45_198, and returns the value).

Which is precisely what it's supposed to do. '3_45_198' is not a
numeric value, it's a string value. When you use a string value in a
numeric context, it converts to a number by starting at the first
character of the string, and going until the first character that makes
it not a numeric value. In this case, that's the _.

> Does it mean that numerals like this don't function as "proper" numerals?

Yes. You do not have numerals. You have strings that happen to
contain digits, along with other characters.

> Do I really have to delete the underscores or is there a way to make
> Perl understand that I want it to function like a normal numeral?

Delete the underscores/commas. Work with numbers in your code. When
you want to display them, display them with your commas. See the FAQ:
perldoc -q commas
Found in /opt2/Perl5_8_4/lib/perl5/5.8.4/pod/perlfaq5.pod
How can I output my numbers with commas added?


So, read in your values from whatever the operation is, get rid of the
commas. Do whatever numeric computations you need, and then when you
print your values out, commify them:

my $disk_size = get_disk_size();
$disk_size =~ tr/,//d;   # =~ binds tr to $disk_size; a plain = would run it on $_
#do stuff with $disk_size;
print "Disk size: ", commify($disk_size), "\n";
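
A runnable version of the snippet above might look like the following minimal sketch. get_disk_size() here is a hypothetical stub that just returns the string from the original question, and commify() is adapted from the perlfaq5 answer mentioned above (it groups digits in threes, which is an assumption about the desired output format).

```perl
use strict;
use warnings;

# Hypothetical stand-in for whatever really fetches the free space;
# it returns the comma-separated string from the original question.
sub get_disk_size { return '3,45,198' }

# Insert a comma before each trailing group of three digits
# (adapted from the perlfaq5 answer).
sub commify {
    my $text = shift;
    1 while $text =~ s/^([-+]?\d+)(\d{3})/$1,$2/;
    return $text;
}

my $disk_size = get_disk_size();
$disk_size =~ tr/,//d;           # delete the commas in place
my $doubled = $disk_size * 2;    # ordinary arithmetic now works

print "Disk size: ", commify($disk_size), "\n";   # Disk size: 345,198
print "Doubled:   ", commify($doubled),   "\n";   # Doubled:   690,396
```

The point of the split is that the commas exist only at the edges: they are stripped on input and added back on output, and everything in between is a plain number.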


And please enable
use strict;
use warnings;
in all your code.

Paul Lalli
 
Mirco Wahab

Thus spoke (e-mail address removed) (on 2006-10-05 16:31):
> The result 3_45_198 works well in numeric comparison scenarios (less
> than, greater than, etc) but not in any operations like addition,
> multiplication, etc (it actually adds/multiplies the other operand
> to/with 3, the leading number in 3_45_198, and returns the value). Does
> it mean that numerals like this don't function as "proper" numerals?

Hmmm, there seems to be another factor involved,
at least here it works:

$>perl -e "print 3 + 120300"
120303

$> perl -e "print 3 + 1_20_300"
120303

$> perl -e "print 3 + 1_20_30_0"
120303

$>perl -e "print 3 + 1_2_0_30_0"
120303

$>perl -e "print 3 + 1_2_0_3_0_0"
120303


Can you post a real example, where
it bails with the wrong numbers?

What Perl version is it? How
"long" are the numbers?

Regards

Mirco
 
Paul Lalli

Mirco said:
> Thus spoke (e-mail address removed) (on 2006-10-05 16:31):
>
> Hmmm, there seems to be another factor involved,
> at least here it works:
>
> $>perl -e "print 3 + 120300"
> 120303
>
> $> perl -e "print 3 + 1_20_300"
> 120303

This is NOT what the OP said. In the original scenario, it was
indicated that the values are the result of a computation, not literals
typed into the code. The distinction is all important. Underscores
are allowed in numeric literals, not in string variables or values.

$ perl -lwe'
print 3 + "1_20_300";
'
Argument "1_20_300" isn't numeric in addition (+) at -e line 2.
4

> Can you post a real example, where
> it bails with the wrong numbers?
>
> What Perl version is it? How
> "long" are the numbers?

None of that is relevant.

Paul Lalli
 
Ted Zlatanov

> So, read in your values from whatever the operation is, get rid of the
> commas. Do whatever numeric computations you need, and then when you
> print your values out, commify them:
>
> my $disk_size = get_disk_size();
> $disk_size =~ tr/,//d;
> #do stuff with $disk_size;
> print "Disk size: ", commify($disk_size), "\n";

Would it perhaps be better to remove all \D characters? Some locales
may use '.' to separate thousands, for example.

$disk_size =~ s/\D//g;

Ted
 
Paul Lalli

Ted said:
> Would it perhaps be better to remove all \D characters? Some locales
> may use '.' to separate thousands, for example.
>
> $disk_size =~ s/\D//g;

The OP did not specify if the numbers read in contained things such as
decimals or even positive or negative signs. All of those are valid
numbers, and would change the meaning of the number if they were
removed. All the OP specified is that the strings have commas, which
are not valid (at least, not in this context).
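
To make the difference concrete, here is a small sketch comparing the two approaches on a made-up input that carries a sign and a decimal point (the sample value is an assumption, not something the OP posted).

```perl
use strict;
use warnings;

my $raw = '-1,234.56';   # hypothetical input with sign and fraction

(my $commas_only = $raw) =~ tr/,//d;   # delete commas only
(my $digits_only = $raw) =~ s/\D//g;   # delete every non-digit

print "$commas_only\n";   # -1234.56
print "$digits_only\n";   # 123456
```

Stripping \D silently turns -1234.56 into 123456, which is why it is only safe when the input is known to be a non-negative integer.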

Paul Lalli
 
Paul Lalli

Mirco said:
> Thus spoke Paul Lalli (on 2006-10-05 17:01):
>
> Right, the '_' are only valid
> in _string_literals_ defined
> in the program (at compile?).

No. That's exactly the point. Underscores are NOT valid numeric
characters in string literals. They are only valid in *numeric*
literals.

my $x = 123_456;
my $y = "123_456";

$x is the number "one hundred, twenty-three thousand, four hundred
fifty-six".
$y is the string "one two three underscore four five six".

Values read in or computed, of course, cannot use the underscore this
way, no matter how they were produced.
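
A short sketch of the same two variables in use. The numeric warning is switched off here only so that the results print without the "isn't numeric" noise; in real code you would want to leave it on.

```perl
use strict;
use warnings;
no warnings 'numeric';   # demo only: hide the "isn't numeric" warning

my $x = 123_456;      # numeric literal: underscores are ignored
my $y = "123_456";    # string: numifies to its leading digits, 123

print $x + 1, "\n";   # 123457
print $y + 1, "\n";   # 124
print( ($y < 200) ? "less than 200\n" : "not less than 200\n" );   # less than 200
```

Note that the comparison is affected too: $y compares as 123, so a comparison only appears to work when the leading digits happen to give the right answer.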

Paul Lalli
 
chaitask

Hi Paul,

Thank you very much for all the informative comments. I did note down
your point about enabling 'strict' and 'warnings' pragmas in my code
from now on.

I'm wondering why Perl treats a numeric literal with underscores in it
and a variable with a value made of numbers punctuated by underscores
so differently......what difference would (or SHOULD perhaps?) it make
for it to evaluate 1_23_456 + 1 to 123457 and $num + 1 (where $num
holds 1_23_456) to 2 ?

Do you happen to know any concrete reasons behind this or it is one of
the design anomalies?

-Krish.
 
John W. Krahn

> Thank you very much for all the informative comments. I did note down
> your point about enabling 'strict' and 'warnings' pragmas in my code
> from now on.
>
> I'm wondering why Perl treats a numeric literal with underscores in it
> and a variable with a value made of numbers punctuated by underscores
> so differently......what difference would (or SHOULD perhaps?) it make
> for it to evaluate 1_23_456 + 1 to 123457 and $num + 1 (where $num
> holds 1_23_456) to 2 ?
>
> Do you happen to know any concrete reasons behind this or it is one of
> the design anomalies?

Because 1_23_456 is *code* and $num is *data*. Code is parsed and compiled by
perl while with data it is up to you to write the code to parse it.
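
As one possible illustration of "writing the code to parse it", here is a minimal sketch. parse_grouped_integer() is a hypothetical helper, and the rule it applies (digits with single '_' or ',' separators between groups, optional sign) is an assumption about what the data looks like.

```perl
use strict;
use warnings;

# Hypothetical helper: accept an integer whose digit groups are joined
# by '_' or ',' and return it as a plain number; return undef otherwise.
sub parse_grouped_integer {
    my ($text) = @_;
    return unless defined $text;
    return unless $text =~ /\A\s*([+-]?\d+(?:[_,]\d+)*)\s*\z/;
    (my $digits = $1) =~ tr/_,//d;   # drop the separators
    return $digits + 0;              # force a numeric value
}

my $num = parse_grouped_integer("1_23_456\n");
print $num + 1, "\n";   # 123457
```

Returning undef for anything that does not match lets the caller decide what to do with bad input instead of quietly computing with a truncated number.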


John
 
Paul Lalli

> I'm wondering why Perl treats a numeric literal with underscores in it
> and a variable with a value made of numbers punctuated by underscores
> so differently......what difference would (or SHOULD perhaps?) it make
> for it to evaluate 1_23_456 + 1 to 123457 and $num + 1 (where $num
> holds 1_23_456) to 2 ?
>
> Do you happen to know any concrete reasons behind this or it is one of
> the design anomalies?

Perl allows you to put an underscore within your numeric literals to
increase readability. The number
878912793109
is difficult to read without actually counting the digits. In English,
if we're writing that number, we write it like
878,912,793,109
so that we can easily see we're talking about 878 billion, etc. In
Perl, as you noted in your original post, the comma is the list
separator, so obviously the comma cannot be used to improve
readability. So they allowed the use of the underscore to increase
readability instead:
my $x = 878_912_793_109;
We can tell that this is the number 878 billion (etc) rather than a
string, because we did not surround the value with quotes. If I had
actually wanted the string "eight seven eight underscore (etc)", I
would have written:
my $x = '878_912_793_109';

But how would you make that determination in a non-literal value? If I
put in the code:
my $x = <STDIN>;
and the user enters:
878_912_793_109
how is your program supposed to tell whether the user meant the number
878 billion (etc), or the string "8 7 8 underscore (etc)"? More to the
point, how is Perl supposed to tell?

Note that this is not unique to numeric literals. There are also
string literal special characters, that do not apply to values read or
computed.
$x = "foobar\n";
means "foobar, followed by newline", but
chomp($x = <STDIN>);
where the user types
foobar\n
means "foobar, backslash, n".

Similarly,
$x = "\\";
$x .= "n";
does not make $x into a newline. It makes it into the two-character
string '\n'. Only literals get the specialness associated with these
characters.
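
The same point as a runnable sketch, using string lengths to show which values contain a real newline and which contain a backslash followed by n.

```perl
use strict;
use warnings;

my $literal = "foobar\n";               # escape handled when the code is compiled
my $built   = "foobar" . "\\" . "n";    # backslash and n joined at run time

print length($literal), "\n";   # 7
print length($built),   "\n";   # 8
print( ($built eq 'foobar\n') ? "same as single-quoted\n" : "different\n" );   # same as single-quoted
```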

Hope this helps,
Paul Lalli
 
chaitask

Hi Paul,

Yes, your point is crystal clear. Is this all mentioned in what one of
the posters called the Camel book, or you have any other links or
tutorials for me to go through? For instance, what you wrote above has
been very descriptive and easy to understand for someone relatively
young in Perl like me. A book of that tone or style is best suited for
many like me, I figure.

Thanks again for your time, your help has been invaluable.

-Krish
 
Paul Lalli

> Yes, your point is crystal clear. Is this all mentioned in what one of
> the posters called the Camel book

That's "Programming Perl", by Larry Wall, published by O'Reilly. And I
honestly have no idea if using _ in numeric literals is spelled out in
that detail in the Camel or not, because I don't currently have mine on
me.

> , or you have any other links or tutorials for me to go through?

I always recommend starting with these three (fire up your command line
and type them in, one at a time):
perldoc perlintro
perldoc perlsyn
perldoc perldata

> For instance, what you wrote above has
> been very descriptive and easy to understand for someone relatively
> young in Perl like me. A book of that tone or style is best suited for
> many like me, I figure.

You might also try http://learn.perl.org, which has links to other good
references and tutorials, including the free e-book, "Beginning Perl".

Paul Lalli
 
Jim Gibson

> Thus spoke Paul Lalli (on 2006-10-05 22:00):
>
> says all what we discussed here (Chap.2, 2.6.1):
>
> <Larry>
> ... because Perl uses the comma as a list separator,
> you cannot use it to separate the thousands in a large
> number. Perl does allow you to use an underscore
> character instead. The underscore only works within
> literal numbers specified in your program, not for
> strings functioning as numbers or data read from
> somewhere else ...
> </Larry>
>
> He then also explains the workings of
> the number system designators (0 & 0x).
>
> (I hope reading a 'quoted part' from a copyrighted
> material in a context with your own eyes is legal
> in «your country» ...)

Even in a country as authoritarian and repressive as the U. S. of A.,
quoting excerpts of copyrighted works for teaching purposes is
considered "fair use."

http://www.copyright.gov/fls/fl102.html
 
Ted Zlatanov

On 5 Oct 2006, (e-mail address removed) wrote:

> The OP did not specify if the numbers read in contained things such as
> decimals or even positive or negative signs.

Well, yes, they are disk sizes, we know they are positive usually :)

> All of those are valid numbers, and would change the meaning of the
> number if they were removed. All the OP specified is that the
> strings have commas, which are not valid (at least, not in this
> context).

OK. OP, just note that your code may mysteriously fail in European
locales for example.

Ted
 
Peter J. Holzer

> I'm wondering why Perl treats a numeric literal with underscores in it
> and a variable with a value made of numbers punctuated by underscores
> so differently......what difference would (or SHOULD perhaps?) it make
> for it to evaluate 1_23_456 + 1 to 123457 and $num + 1 (where $num
> holds 1_23_456) to 2 ?
> [...]
> But how would you make that determination in a non-literal value? If I
> put in the code:
> my $x = <STDIN>;
> and the user enters:
> 878_912_793_109
> how is your program supposed to tell whether the user meant the number
> 878 billion (etc), or the string "8 7 8 underscore (etc)"? More to the
> point, how is Perl supposed to tell?

If the user enters the line "878912793109", how is perl supposed to
tell whether the user meant a string or a number?

It doesn't know. For now, it is just a string. Only when the scalar
is used in a numeric context does it get a numerical value in addition to
the string value. Perl computes the numerical value by parsing the
string value according to some rules.
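
One way to ask whether a given string would pass those rules cleanly is looks_like_number() from the core Scalar::Util module. The sample inputs below are just the values discussed in this thread plus one in scientific notation.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

for my $input ('878912793109', '878_912_793_109', '3,45,198', '1e3') {
    printf "%-18s %s\n", "'$input'",
        looks_like_number($input) ? 'numeric' : 'not numeric';
}
```

The underscore and comma forms are reported as not numeric, which matches the "isn't numeric" warning perl emits when they are used in arithmetic.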

These rules are not the same as those for numeric constants in perl code. For
example, the base is always ten:

the numerical literal 012 has the value ten (zero times sixty-four plus
one times eight plus two), but the string '012' has the numerical value
twelve (zero times one hundred plus one times ten plus two).
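
A small sketch of that difference. If the data really is in octal or hex notation, the built-in oct() function can be asked to interpret the string the way the compiler would interpret a literal.

```perl
use strict;
use warnings;

my $literal = 012;     # octal literal in source code
my $string  = '012';   # the same characters as data

print $literal,      "\n";   # 10
print $string + 0,   "\n";   # 12
print oct($string),  "\n";   # 10
print oct('0x1F'),   "\n";   # 31
```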

Newer versions of perl (it works with 5.8.8, but not with 5.8.4)
recognize "inf" and "nan" in strings, but you can't use them as
numerical literals (you get the error 'Bareword "inf" not allowed ...').

As the OP noted, underscores are ignored in numerical literals, but not in
strings.

Why are these rules as they are?

The main objective is probably reversibility: If a numerical value is
converted to a string and back, it should not change. So perl must be
able to parse all number formats it produces: decimal integers, decimal
fractions, scientific notation and the special values "inf" and "nan".

However, it never produces strings with leading zeros. So why does it
parse them as decimals and not as octal? I guess that leading zeros in
decimal numbers are relatively frequent in "real world" files, while
numbers in mixed base in C notation are not. So ignoring leading zeros
is less surprising.

As for underscores, I don't have a good explanation. Why doesn't perl
ignore underscores in strings just like in numerical literals? I don't
see where it would do any harm, but it would be handy in some cases.

hp
 
Peter J. Holzer

> Modulo the issues of accuracy and precision.

Well, I wrote it *should not* change, not that it *does not* change :).

However, I believe the following statements are true:

1) It is always possible to convert a binary floating point number to a
decimal representation, where the mathematically exact value of the
decimal representation differs from the mathematically exact value of
the binary number by less than 1/2 of the value of the least
significant bit of the binary number.

2) It is always possible to convert a decimal number to a binary
floating point number of sufficient range, so that the mathematically
exact values of the decimal and binary numbers differ by at most
1/2 of the value of the least significant bit of the binary number.

If both are true, then lossless conversion from binary to decimal and
back is possible. A high quality implementation of sprintf/sscanf
should get it right.

Note that neither conversion is trivial. I would not be surprised if
perl's default conversion is not lossless, especially if it depends on
routines from a platform's native library.
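
A quick way to check this on a given perl is sketched below. It assumes perl was built with ordinary IEEE double-precision floating point (the common case); on builds using long doubles, 17 significant digits would not be enough and the results may differ.

```perl
use strict;
use warnings;

my $x = 0.1 + 0.2;

my $default = "$x";                  # perl's default number-to-string conversion
my $precise = sprintf '%.17g', $x;   # 17 significant digits

print "$default\n";
print "$precise\n";
print( ($default + 0 == $x) ? "default: lossless\n" : "default: lossy\n" );
print( ($precise + 0 == $x) ? "17 digits: lossless\n" : "17 digits: lossy\n" );
```

With doubles, the default conversion would be expected to print 0.3 and fail the round trip for this value, while the 17-digit form should survive it.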

hp
 
