How to check if a string "is" an int?

pinkfloydhomer

How do I check if a string contains (can be converted to) an int? I
want to do one thing if I am parsing an integer, and another if not.

/David
 
Erik Max Francis

How do I check if a string contains (can be converted to) an int? I
want to do one thing if I am parsing an integer, and another if not.

try:
    x = int(aPossibleInt)
    ... do something with x ...
except ValueError:
    ... do something else ...
 
Steven D'Aprano

How do I check if a string contains (can be converted to) an int? I
want to do one thing if I am parsing an integer, and another if not.

Okay, this has got to be homework, surely. This is the third, maybe the
fourth, question on this topic in a week or so :)

In Python, the best solution to most "how can I check if X is a something"
questions is usually the Nike motto: Just Do It.

# s is some arbitrary string object
try:
    n = int(s)
    print "Integer %d" % n
except ValueError:
    print "Not an integer %s" % s

try...except blocks are cheap in Python.
 
Juho Schultz

Neuruss said:
Can't we just check if the string has digits?
For example:

>>> x = '15'
>>> if x.isdigit():
...     print int(x)*3
...
45

No, we can't. '-15' has non-digits but is a valid int.

Another point is that the try-except
can also be used for string-to-float conversion....
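
A minimal sketch of the same try/except pattern applied to floats, written for Python 3 (the snippets in this thread are Python 2). The helper name parse_float and its default parameter are made up for illustration.

```python
def parse_float(text, default=None):
    """Return float(text), or default if text is not a valid float literal."""
    try:
        return float(text)
    except ValueError:
        return default


print(parse_float("-1.5e3"))
print(parse_float("abc"))
```

The only change from the integer case is the conversion function; the except clause stays the same because float() also raises ValueError on a malformed string.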
 
Steven D'Aprano

Can't we just check if the string has digits?

Why would you want to?

For example:

>>> x = '15'
>>> if x.isdigit():
...     print int(x)*3

15 is not a digit. 1 is a digit. 5 is a digit. Putting them together to
make 15 is not a digit.


If you really wanted to waste CPU cycles, you could do this:

s = "1579"
for c in s:
    if not c.isdigit():
        print "Not an integer string"
        break
else:
    # if we get here, we didn't break
    print "Integer %d" % int(s)


but notice that this is wasteful: first you walk the string, checking each
character, and then the int() function has to walk the string again,
checking each character for the second time.

It is also buggy: try s = "-1579" and it will wrongly claim that s is not
an integer when it is. So now you have to waste more time, and more CPU
cycles, writing a more complicated function to check if the string can be
converted.
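
To see where a digits-only check and int() disagree, here is a small sketch in Python 3 (the snippets in this thread are Python 2). The helper name can_convert and the sample list are made up for illustration.

```python
def can_convert(text):
    """Just Do It: ask int() itself whether the string is acceptable."""
    try:
        int(text)
    except ValueError:
        return False
    return True


# Signed, padded, and non-ASCII samples, plus a few plainly invalid ones.
samples = ["1579", "-1579", "+42", "  7  ", "\u00b2", "1.5", "", "12a"]
for s in samples:
    print(repr(s), "isdigit:", s.isdigit(), "int() accepts:", can_convert(s))
```

For the signed and whitespace-padded samples isdigit() says no while int() succeeds, and for the superscript two it is the other way round, so a hand-written pre-check has to reproduce all of these rules to agree with int().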
 
Kent Johnson

Steven said:
15 is not a digit. 1 is a digit. 5 is a digit. Putting them together to
make 15 is not a digit.

Maybe so, but '15'.isdigit() == True:

isdigit(...)
    S.isdigit() -> bool

    Return True if all characters in S are digits
    and there is at least one character in S, False otherwise.

>>> '15'.isdigit()
True

though your other points are valid and I agree this is not the right solution to the OP.

Kent
 
Paul Rubin

Kent Johnson said:
Maybe so, but '15'.isdigit() == True:

isdigit(...)
    S.isdigit() -> bool

    Return True if all characters in S are digits
    and there is at least one character in S, False otherwise.

Auggggh!!
 
Antoon Pardon

On 2005-12-21, Steven D'Aprano said:
Why would you want to?

15 is not a digit. 1 is a digit. 5 is a digit. Putting them together to
make 15 is not a digit.

So? The isdigit method tests whether all characters are digits.

>>> '15'.isdigit()
True
 
Steven D'Aprano

Maybe so, but '15'.isdigit() == True:

Well I'll be a monkey's uncle.

In that case, the name is misleadingly wrong. I suppose it is not likely
that it could be changed before Python 3?
 
bonono

Steven said:
If you really wanted to waste CPU cycles, you could do this:

s = "1579"
for c in s:
    if not c.isdigit():
        print "Not an integer string"
        break
else:
    # if we get here, we didn't break
    print "Integer %d" % int(s)


but notice that this is wasteful: first you walk the string, checking each
character, and then the int() function has to walk the string again,
checking each character for the second time.

Wasteful enough that there is a specific built-in function to do just
this?
 
Roy Smith

How do I check if a string contains (can be converted to) an int? I
want to do one thing if I am parsing an integer, and another if not.

/David

The most straightforward thing is to try converting it to an int and see
what happens.

try:
    int(s)
except ValueError:
    print "sorry, '%s' isn't a valid integer" % s
 
Peter Hansen

Steven said:
Well I'll be a monkey's uncle.

In that case, the name is misleadingly wrong. I suppose it is not likely
that it could be changed before Python 3?

That was my first thought too, Steven, but then I considered whether I'd
think the same about the others: islower, isspace, istitle, isupper,
isalnum, isalpha.

Some of those suffer from the same confusion, probably inspired by
having written lots of C in the past, but certainly "istitle" wouldn't be
particularly useful on a single character. isalnum and isalpha don't
necessarily invoke the same mental awkwardness since, after all, what is
"an alpha"? It could just as well be read "is this string alphabetic"
as "is this character 'an alpha'".

Given that Python doesn't have a distinct concept of "character" (but
merely a string of length one), having those routines operate on the
entire string is probably pretty sensible, and I'm not sure that naming
them "isdigits()" would be helpful either since then it would feel
awkward to use them on length-one-strings.

-Peter
 
Steven D'Aprano

Wasteful enough that there is a specific built-in function to do just
this?


Well, let's find out, shall we?


from time import time

# create a list of known int strings
L_good = [str(n) for n in range(1000000)]

# and a list of known non-int strings
L_bad = [s + "x" for s in L_good]

# now let's time how long it takes, comparing
# Look Before You Leap vs. Just Do It
def timer_LBYL(L):
    t = time()
    # note: this loops over the global L_good, not the argument L
    for s in L_good:
        if s.isdigit():
            n = int(s)
    return time() - t

def timer_JDI(L):
    t = time()
    # note: this loops over the global L_good, not the argument L
    for s in L_good:
        try:
            n = int(s)
        except ValueError:
            pass
    return time() - t

# and now test the two strategies

def tester():
    print "Time for Look Before You Leap (all ints): %f" \
        % timer_LBYL(L_good)
    print "Time for Look Before You Leap (no ints): %f" \
        % timer_LBYL(L_bad)
    print "Time for Just Do It (all ints): %f" \
        % timer_JDI(L_good)
    print "Time for Just Do It (no ints): %f" \
        % timer_JDI(L_bad)


And here are the results from three tests:

Time for Look Before You Leap (all ints): 2.871363
Time for Look Before You Leap (no ints): 3.167513
Time for Just Do It (all ints): 2.575050
Time for Just Do It (no ints): 2.579374

Time for Look Before You Leap (all ints): 2.903631
Time for Look Before You Leap (no ints): 3.272497
Time for Just Do It (all ints): 2.571025
Time for Just Do It (no ints): 2.571188

Time for Look Before You Leap (all ints): 2.894780
Time for Look Before You Leap (no ints): 3.167017
Time for Just Do It (all ints): 2.822160
Time for Just Do It (no ints): 2.569494


There is a consistent pattern: Look Before You Leap is measurably slower
than using try...except, but both are within the same order of magnitude
speed-wise.
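
For anyone who wants to repeat the measurement, here is a rough Python 3 sketch of the same comparison. It is an adaptation, not the code that produced the numbers above: the timer functions as posted iterate over L_good whatever argument they receive, while these iterate over their argument, so the "no ints" case really exercises the failing path. The names time_lbyl, time_jdi, good and bad are made up, and the lists are kept smaller so it runs quickly. Timings will vary by machine and Python version, so no expected figures are given.

```python
from time import perf_counter


def time_lbyl(strings):
    """Look Before You Leap: check with isdigit(), then convert."""
    start = perf_counter()
    for s in strings:
        if s.isdigit():
            int(s)
    return perf_counter() - start


def time_jdi(strings):
    """Just Do It: convert and catch the failure."""
    start = perf_counter()
    for s in strings:
        try:
            int(s)
        except ValueError:
            pass
    return perf_counter() - start


good = [str(n) for n in range(100_000)]   # every string converts
bad = [s + "x" for s in good]             # no string converts

for label, data in (("all ints", good), ("no ints", bad)):
    print(f"Look Before You Leap ({label}): {time_lbyl(data):.4f}s")
    print(f"Just Do It ({label}): {time_jdi(data):.4f}s")
```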

I wondered whether the speed difference would be different if the strings
themselves were very long. So I made some minor changes:

L_good = ["1234567890"*200] * 2000
L_bad = [s + "x" for s in L_good]
tester()

Time for Look Before You Leap (all ints): 9.740390
Time for Look Before You Leap (no ints): 9.871122
Time for Just Do It (all ints): 9.865055
Time for Just Do It (no ints): 9.967314

Hmmm... why is converting now slower than checking+converting? That
doesn't make sense... except that the strings are so long that they
overflow ints, and get converted automatically to longs. Perhaps this test
exposes some accident of implementation.

So I changed the two timer functions to use long() instead of int(), and
got this:
Time for Look Before You Leap (all ints): 9.591998
Time for Look Before You Leap (no ints): 9.866835
Time for Just Do It (all ints): 9.424702
Time for Just Do It (no ints): 9.416610

A small but consistent speed advantage to the try...except block.

Having said all that, the speed differences are absolutely trivial, less
than 0.1 microseconds per digit. Choosing one form or the other purely on
the basis of speed is premature optimization.

But the real advantage of the try...except form is that it generalises to
more complex kinds of data where there is no fast C code to check whether
the data can be converted. (Try re-running the above tests with
isdigit() re-written as a pure Python function.)

In general, it is just as difficult to check whether something can be
converted as it is to actually try to convert it and see whether it fails,
especially in a language like Python where try...except blocks are so
cheap to use.
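
As a sketch of how the pattern generalises beyond int, the helper below takes any converter that signals bad input with ValueError. It is written for Python 3; the name try_convert and the choice of Fraction and date.fromisoformat as examples are illustrative assumptions, not something from the posts above.

```python
from datetime import date
from fractions import Fraction


def try_convert(converter, text):
    """Return (True, value) if converter(text) succeeds, else (False, None)."""
    try:
        return True, converter(text)
    except ValueError:
        return False, None


print(try_convert(int, "-15"))
print(try_convert(Fraction, "3/4"))
print(try_convert(date.fromisoformat, "2005-12-22"))
print(try_convert(date.fromisoformat, "2005-13-40"))
```

Writing a separate "is this a valid date string" check would mean re-implementing most of the parser; here the converter itself is the check.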
 
Steven D'Aprano

How do I check if a string contains (can be converted to) an int? I
want to do one thing if I am parsing an integer, and another if not.

/David

others already answered, this is just an idea

def is_int(n):  # function name assumed; the original def line is missing
    import re
    if re.match("^[-+]?[0-9]+$", n):
        return True
    return False

This is just a thought experiment, right, to see how slow you can make
your Python program run?

*smiles*

Jamie Zawinski: "Some people, when confronted with a problem, think 'I
know, I'll use regular expressions.' Now they have two problems."
 
Dave Hansen

On Thu, 22 Dec 2005 01:41:34 +1100 in comp.lang.python, Steven
D'Aprano said:

[...]
Well, let's find out, shall we? [...]
A small but consistent speed advantage to the try...except block.

Having said all that, the speed differences are absolutely trivial, less
than 0.1 microseconds per digit. Choosing one form or the other purely on
the basis of speed is premature optimization.

Or maybe on which actually works. LBYL will fail to recognize
negative numbers, e.g.

def LBYL(s):
    if s.isdigit():
        return int(s)
    else:
        return 0

def JDI(s):
    try:
        return int(s)
    except ValueError:
        return 0

test = '15'
print LBYL(test), JDI(test) #-> 15 15

test = '-15'
print LBYL(test), JDI(test) #-> 0 -15

But the real advantage of the try...except form is that it generalises to
more complex kinds of data where there is no fast C code to check whether
the data can be converted.

re: Generalization, apropos a different thread regarding the %
operator on strings. In Python, I avoid using the specific type
format conversions (such as %d) in favor of the generic string
conversion (%s) unless I need specific field width and/or padding or
other formatting, e.g.

for p in range(32):
    v = 1<<p
    print "%2u %#010x : %-d" % (p,v,v)

Regards,
-=Dave
 
Daniel Schüle

How do I check if a string contains (can be converted to) an int? I
want to do one thing if I am parsing an integer, and another if not.

/David

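others already answered, this is just an idea

def is_int(n):  # function name assumed; the original def line is missing
    import re
    if re.match("^[-+]?[0-9]+$", n):
        return True
    return False

does not recognize 0x numbers, but this is easy to fix
if wanted

def is_int(n):  # function name assumed, as above
    import re
    # accept either an optional sign plus 0x-prefixed hex digits,
    # or an optional sign plus plain decimal digits
    if re.match("^[-+]?(0[xX][0-9A-Fa-f]+|[0-9]+)$", n):
        return True
    return False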

hth

Daniel
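
If 0x-prefixed input does need to be accepted, one alternative to extending the pattern is to let int() do the work with base 0, which makes it interpret the prefix the way the Python parser would. This is a Python 3 sketch, and the name parse_int_any_base is made up for illustration.

```python
def parse_int_any_base(text):
    """Parse decimal, 0x hex, 0o octal, or 0b binary; return None on failure."""
    try:
        return int(text, 0)
    except ValueError:
        return None


print(parse_int_any_base("0x1F"))   # 31
print(parse_int_any_base("-15"))    # -15
print(parse_int_any_base("1F"))     # None
```

Hex digits without a prefix are rejected here, which matches what a plain int() call does with them in base 10.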
 
Alex Martelli

Erik Max Francis said:
try:
    x = int(aPossibleInt)
    ... do something with x ...
except ValueError:
    ... do something else ...

Correct, but even better is a slight variation:

try:
    x = int(aPossibleInt)
except ValueError:
    ... do something else ...
else:
    ... do something with x ...

this way, you avoid accidentally masking an unexpected ValueError in the
"do something with x" code.

Keeping your try-clauses as small as possible (as well as your
except-conditions as specific as possible) is important, to avoid
masking bugs and thus making their discovery harder.


Alex
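
A small sketch of why the else clause matters, written for Python 3. The functions process and triple are hypothetical stand-ins for "do something with x"; process raises its own ValueError to play the part of an unexpected bug.

```python
def process(x):
    """Stand-in for later work that can itself raise ValueError."""
    if x < 0:
        raise ValueError("negative values are not supported")
    return x * 3


def triple(text):
    """Convert text to int, then hand the result to process()."""
    try:
        x = int(text)
    except ValueError:
        return "not an integer"
    else:
        # Anything raised here propagates to the caller instead of
        # being mistaken for a failed conversion.
        return process(x)


print(triple("15"))     # 45
print(triple("abc"))    # not an integer
```

Calling triple("-15") raises the ValueError from process. Had the call to process been inside the try block, that error would have been reported as "not an integer" even though the conversion succeeded.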
 
Erik Max Francis

Neuruss said:
Can't we just check if the string has digits?
For example:

>>> x = '15'
>>> if x.isdigit():
...     print int(x)*3
...
45

To make sure you get it right, you'll have to do exactly what the Python
parser does in order to distinguish integer literals from other tokens.
Taken to the extreme for other types, such as floats, you're far
better off just using the internal mechanisms that Python itself uses,
which means to try to convert it and catch any exception that results
from failure.
 
