palindrome iteration

J

Josh English

This whole conversation got interesting, so I thought I'd run some
speed tests:

The code:
from timeit import Timer

def is_palindrome_recursive(s):
if len(s) <= 1:
return True
if s[0] != s[-1]:
return False
else:
return is_palindrome(s[1:-1])

def is_palindrome_slice(s):
return s == s[::-1]

def is_palindrome_list(s):
l = list(s)
l.reverse()
return s == ''.join(l)

def is_palindrome_reversed(s):
return s == ''.join(reversed(s))

t = Timer("is_palindrome_recursive('madamimadam')", "from __main__
import is_palindrome_recursive")
print "is_palindrome_recursive", min(t.repeat())

t = Timer("is_palindrome_slice('madamimadam')", "from __main__ import
is_palindrome_slice")
print "is_palindrome_slice", min(t.repeat())

t = Timer("is_palindrome_list('madamimadam')", "from __main__ import
is_palindrome_list")
print "is_palindrome_list", min(t.repeat())

t = Timer("is_palindrome_reversed('madamimadam')", "from __main__
import is_palindrome_reversed")
print "is_palindrome_reversed", min(t.repeat())

The results:
is_palindrome_recursive 6.32680866827
is_palindrome_slice 1.23618350114
is_palindrome_list 4.60104846653
is_palindrome_reversed 5.99355296513

The slice method is uglier, I have to admit, but it's the fastest of
these four on my machine.

Josh
 
M

Matteo Landi

Well, I tried the also the solution posted above (recursive w/o
slicing and iterative), and I discovered they were the slowest..

is_palindrome_recursive 2.68151649808
is_palindrome_slice 0.44510699381
is_palindrome_list 1.93861944217
is_palindrome_reversed 3.28969831976
is_palindrome_recursive_no_slicing 6.78929775328
is_palindrome_iterative 4.88826141315

Nothing to say about the iterative function, but the benchmark of the
recursive was unexpected, at least for my point of view: do you think
it is due to the try/except overhead?

This whole conversation got interesting, so I thought I'd run some
speed tests:

The code:
from timeit import Timer

def is_palindrome_recursive(s):
   if len(s) <= 1:
       return True
   if s[0] != s[-1]:
       return False
   else:
       return is_palindrome(s[1:-1])

def is_palindrome_slice(s):
   return s == s[::-1]

def is_palindrome_list(s):
   l = list(s)
   l.reverse()
   return s == ''.join(l)

def is_palindrome_reversed(s):
   return s == ''.join(reversed(s))

t = Timer("is_palindrome_recursive('madamimadam')", "from __main__
import is_palindrome_recursive")
print "is_palindrome_recursive", min(t.repeat())

t = Timer("is_palindrome_slice('madamimadam')", "from __main__ import
is_palindrome_slice")
print "is_palindrome_slice", min(t.repeat())

t = Timer("is_palindrome_list('madamimadam')", "from __main__ import
is_palindrome_list")
print "is_palindrome_list", min(t.repeat())

t = Timer("is_palindrome_reversed('madamimadam')", "from __main__
import is_palindrome_reversed")
print "is_palindrome_reversed", min(t.repeat())

The results:
is_palindrome_recursive 6.32680866827
is_palindrome_slice 1.23618350114
is_palindrome_list 4.60104846653
is_palindrome_reversed 5.99355296513

The slice method is uglier, I have to admit, but it's the fastest of
these four on my machine.

Josh
 
A

Arnaud Delobelle

Matteo Landi said:
Well, I tried the also the solution posted above (recursive w/o
slicing and iterative), and I discovered they were the slowest..

is_palindrome_recursive 2.68151649808
is_palindrome_slice 0.44510699381
is_palindrome_list 1.93861944217
is_palindrome_reversed 3.28969831976
is_palindrome_recursive_no_slicing 6.78929775328
is_palindrome_iterative 4.88826141315

What are the last two functions?

I suggest another:

def is_palindrome(s):
return all(map(str.__eq__, s, reversed(s)))

:)
Nothing to say about the iterative function, but the benchmark of the
recursive was unexpected, at least for my point of view: do you think
it is due to the try/except overhead?

This whole conversation got interesting, so I thought I'd run some
speed tests:

The code:
from timeit import Timer

def is_palindrome_recursive(s):
   if len(s) <= 1:
       return True
   if s[0] != s[-1]:
       return False
   else:
       return is_palindrome(s[1:-1])

This should be return is_palindrome_recursive(s[1:-1]). If this is
copy-pasted, then you may call a different is_palindrome function and
invalidate the timings!

[...]
 
M

Matteo Landi

I thought they reached you. Here they are again:

def palindrome(str, i=0, j=-1):
try:
if str == str[j]:
return palindrome(str, i + 1, j - 1)
return False
except IndexError:
return True

def palindrome(str, i=0, j=-1):
try:
while True:
if str != str[j]:
return False
i, j = i + 1, j - 1
return True
except IndexError:
return True

Matteo Landi said:
Well, I tried the also the solution posted above (recursive w/o
slicing and iterative), and I discovered they were the slowest..

is_palindrome_recursive 2.68151649808
is_palindrome_slice 0.44510699381
is_palindrome_list 1.93861944217
is_palindrome_reversed 3.28969831976
is_palindrome_recursive_no_slicing 6.78929775328
is_palindrome_iterative 4.88826141315

What are the last two functions?

I suggest another:

def is_palindrome(s):
   return all(map(str.__eq__, s, reversed(s)))

:)
Nothing to say about the iterative function, but the benchmark of the
recursive was unexpected, at least for my point of view: do you think
it is due to the try/except overhead?

This whole conversation got interesting, so I thought I'd run some
speed tests:

The code:
from timeit import Timer

def is_palindrome_recursive(s):
   if len(s) <= 1:
       return True
   if s[0] != s[-1]:
       return False
   else:
       return is_palindrome(s[1:-1])

This should be return is_palindrome_recursive(s[1:-1]).  If this is
copy-pasted, then you may call a different is_palindrome function and
invalidate the timings!

[...]
 
B

bruno.desthuilliers

    return ( s == "" or
             ( s[0] == s[-1] and
               palindromep(s[1:-1]) ) )

I-can-write-lisp-in-any-language-p !-)
 
R

Roy Smith

Gregory Ewing said:
Obviously it's for correcting things that were typed
in with tHE cAPS lOCK kEY oN bY mISTAKE. :)

So it would seem (http://bugs.python.org/msg94026).

It's also useful for when you're looking for a crypto algorithm and
rot13 is too strong.

It also provides a handy way to write is_alpha()...

def is_alpha(c):
return abs(ord(c) - ord(c.swapcase())) == 32

print is_alpha('a')
print is_alpha('A')
print is_alpha('1')
print is_alpha('>')
 
J

Josh English

I have no idea. That's a lower level of programming than I'm used to
dealing with.

Josh

(I also only tried the one value. Had I tried with other strings that
would fail the test, some
functions may have performed better.)
 
M

MRAB

So it would seem (http://bugs.python.org/msg94026).

It's also useful for when you're looking for a crypto algorithm and
rot13 is too strong.

It also provides a handy way to write is_alpha()...

def is_alpha(c):
return abs(ord(c) - ord(c.swapcase())) == 32

print is_alpha('a')
print is_alpha('A')
print is_alpha('1')
print is_alpha('>')

How is that better than:

print 'a'.isalpha()
print 'A'.isalpha()
print '1'.isalpha()
print '>'.isalpha()
 
R

Roy Smith

MRAB said:
How is that better than:

print 'a'.isalpha()
print 'A'.isalpha()
print '1'.isalpha()
print '>'.isalpha()

Think of my post as an implemention of is_reader_humour_impaired().
 
A

Aahz

def is_palindrom(s):
s = s.lower()
return s == s[::-1]

To deal with "real" palindromes such as, "Madam, I'm Adam," you should
probably strip all spaces and punctuation:

# untested
pat = re.compile(r'[a-z]')
def is_palindrome(s):
letters = pat.findall(s.lower())
return letters == reversed(letters)
--
Aahz ([email protected]) <*> http://www.pythoncraft.com/

[on old computer technologies and programmers] "Fancy tail fins on a
brand new '59 Cadillac didn't mean throwing out a whole generation of
mechanics who started with model As." --Andrew Dalke
 
C

Cousin Stanley

To deal with "real" palindromes such as, "Madam, I'm Adam,"
you should probably strip all spaces and punctuation:

# untested
pat = re.compile(r'[a-z]')
def is_palindrome(s):
letters = pat.findall(s.lower())
return letters == reversed(letters)

Using python 2.5 the above solution always returned False
for me until the reversed( letters ) iterator was explicitly
coerced into a list ....

return letters == list( reversed( letters ) )
 
B

Bearophile

Baba:
def i_palindrome(pal):
while len(pal)>1:
if pal[0] == pal[-1]:
pal=pal[1:-1]
return True

print i_palindrome('annab')


In normal programming a significant percentage of the time is spent
debugging. Experience shows that even short functions may be buggy. If
you don't want to waste too much time debugging your code you need to
adopt a bit more rigorous approach to write programs. Learning to play
piano requires a lot of self-discipline, programming too needs some.


Ask yourself what are the exact purposes of your function/class/
program. And then define what are the correct outputs for all input
corner cases you are able to find. This is TDD, Thought-driven
development.

For this program, what's the right output for mixed case strings, for
empty strings, for input strings that contain many words with mixed
white space and punctuation between them, for unicode strings that
contain weird characters? Generally even simple functions may become
very complex if you want to manage even complex input cases.

You are free to ignore some of those inputs, to keep your code simple
enough, but it's usually better for your function to not just ignore
some inputs, but give some kind of error for the inputs you don't want
to consider.


Let's say you accept simple Unicode strings, you don't want to ignore
white space, an empty string is palindrome, text cases need to be
ignored.

I don't like Test-Driven development a lot because your most powerful
tool is your brain&mind and not your unit-tests, but I like tests a
lot. So you need tests for this small function too. Here doctests are
enough. Your first test is better to fail, to be sure your doctest is
working:


def is_palindrome(text):
""" False
"""
return True

if __name__ == "__main__":
import doctest
doctest.testmod()
print "Doctests done."


Code formatting and function and argument names are important to keep
code readable to yourself, improve its future maintenance, and reduce
the total bug count.

You may write this Python function in a high level style like this:

def is_palidrome(text):
ltext = text.lower()
return ltext == ltext[::-1]


But that doesn't teach you much programming, so for learning purposes
you may try to create one function in lower-level style (that also
doesn't stress the garbage collector, despite probably being overall
slower in Python). So we use iterative loops and single char tests.
But if you use a language as Scheme then a recursive solution too is a
normal option.

Now you need to invent the algorithm that solves your problem.
Inventing algorithms that solve your problems is an art that requires
training, experience and even ideas from this little book:
http://en.wikipedia.org/wiki/How_to_Solve_It

There are few ways to solve that problem, one possible way it to look
at the first and last char of the string (ignoring their case), if
they are different the given word is not palindrome, otherwise you
look at the second and penultimate char, and you perform similar
comparisons for all the char pairs. If your string length is odd there
is no need to test the last char, but if you want to test it with
itself because it keeps the code simpler it's not a problem.

To avoid bugs like ones in your code you may think about the loop
invariant and loop variant. This blog post shows an example to follow:
http://reprog.wordpress.com/2010/04/25/writing-correct-code-part-1-invariants-binary-search-part-4a/

I use two indexes, i and j, that move forward and backwards in sync
starting from the first and last char of the 'text' string. The
program stops when they are on the same char or they cross.

I have used an elaborate loop invariant, for this simple program it's
overkill, but shows how you may do it.


i ==> <== j
+--+--+--+--+--+--+--+--+--+
text | | | | | | | | | |
+--+--+--+--+--+--+--+--+--+
0 1 2 3 4 5 6 7 8


def _is_palindrome_loop_invariant(i, j, length):
assert i >= 0
assert i < length
assert j >= 0
assert j < length
assert i <= j
assert i == length - 1 - j
return True


def is_palindrome(text):
"""
>>> is_palindrome("") True
>>> is_palindrome("1") True
>>> is_palindrome(u"x") True
>>> is_palindrome("aa") True
>>> is_palindrome("ab") False
>>> is_palindrome("abc") False
>>> [is_palindrome(s) for s in ["abA", "ABA", "ABA"]] [True, True, True]
>>> is_palindrome("aibohphobia") True
>>> is_palindrome("aibohphobia" * 1000) True
>>> is_palindrome(list("aibohphobia")) True
>>> is_palindrome([1, 2, 3])
Traceback (most recent call last):
...
AttributeError: 'int' object has no attribute 'lower'
"""
n = len(text)
i = 0
j = n - 1
while i < j:
assert _is_palindrome_loop_invariant(i, j, n)
if text.lower() != text[j].lower():
return False
else:
i += 1
j -= 1
return True


if __name__ == "__main__":
import doctest
doctest.testmod()
print "Doctests done."


is_palindrome([1, 2, 3]) fails not because the input is a list, but
because integers lack the lower() method. It's useful to add few
failing doctests too, to show what are the limits of the duck typing
of the function. As you see I have tried some corner cases, empty
string, single char, two chars, three chars, and more. I have also
added a 'stress test' with a larger input.

In Python this code is not efficient because of the interpreter
overhead and because I call a lower() method on each char (string of
length 1), so this is not good Python code. But it's easy to translate
to a lower level language, and I think Psyco does a good enough work
on it.

In C language (plus stdbool) the tolower() is more efficient, but
strlen() wastes some time. Often it's better to keep around the length
too of your strings, for example using a "fat pointer", as in D.


// C code
//#define NDEBUG
#include "assert.h"
#include "stdio.h"
#include "string.h"
#include "stdbool.h"
#include "ctype.h"
#include "stdlib.h"

bool is_palindrome_loop_invariant(int i, int j, int length) {
assert(i >= 0);
assert(i < length);
assert(j >= 0);
assert(j < length);
assert(i <= j);
assert(i == length - 1 - j);
return true;
}


bool is_palindrome(const char *text) {
const int n = strlen(text);
int i = 0;
int j = n - 1;
while (i < j) {
assert(is_palindrome_loop_invariant(i, j, n));
if (tolower(text) != tolower(text[j])) {
return false;
} else {
i++;
j--;
}
}
return true;
}


void is_palindrome_test() {
assert(is_palindrome(""));
assert(is_palindrome("1"));
assert(is_palindrome("aa"));
assert(!is_palindrome("ab"));
assert(!is_palindrome("abc"));
assert(is_palindrome("aba"));
assert(is_palindrome("abA"));
assert(is_palindrome("AbA"));
assert(is_palindrome("ABA"));
assert(is_palindrome("aibohphobia"));
assert(!is_palindrome("aaaaabaaaa"));

const int n = 1000;
const char *s = "aibohphobia";
const int len_s = strlen(s);
char *text = malloc(n * len_s + 1);
if (text == NULL)
exit(EXIT_FAILURE);
int i;
char *p = text;
for (i = 0; i < n; i++) {
memcpy(p, s, len_s);
p += len_s;
}
text[n * len_s] = '\0';
// puts(text);
assert(is_palindrome(text));
}


int main() {
is_palindrome_test();
return EXIT_SUCCESS;
}


C experts (or good C lints) are probably able to find some loss of
precision or loss of sign errors in the following lines:

const int n = strlen(text);
const int len_s = strlen(s);
char *text = malloc(n * len_s + 1);
memcpy(p, s, len_s);

Writing perfect C code isn't an easy art. (I have used signed values
because writing correct C code with unsigned values is a pain, despite
avoids loss of precision or sign).


Compiling the C code with GCC 4.5.1, and disabled assertions (NDEBUG
defined):
-O3 -Wall -fomit-frame-pointer -S

The loop of is_palindrome() gets compiled to (x86, 32 bit):

L10:
addl $1, %esi
subl $1, %ebx
cmpl %ebx, %esi
jge L9
L4:
movsbl (%edi,%esi), %eax
movl %eax, (%esp)
call _tolower
movl %eax, %ebp
movsbl (%edi,%ebx), %eax
movl %eax, (%esp)
call _tolower
cmpl %eax, %ebp
je L10


Those two calls to _tolower inside the loop don't help performance,
but if you know your strings contain only upper case and lower case
chars, and you are able to assume few other things on your machine,
it's easy to replace them with non-portable inlined code.

The addl and subl are the i++ and j--, GCC has moved them at top as
commonly done optimization.

The first cmpl is the (i < j) test, and the jge is a jump that ends
the loop when it's the right time to do so.

The movsbl go read one char, and put them in eax. The movl are
necessary to set arguments for the call to the tolower function.

The last cmpl compares the results of the tolower, and je (jump equal)
jumps to the start of the loop if they are equal.

Bye,
bearophile
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,995
Messages
2,570,236
Members
46,822
Latest member
israfaceZa

Latest Threads

Top