Archive for October, 2010

Palindromes in Python

October 23, 2010

So you may have noticed my lack of math posts recently. Things change, you know. I might still have a few left in me, at some point. Either way, this blog may start having more programming. I’m putting this one here instead of at Euler’s Circus, only because it’s not a Project Euler problem. Getting on toward time to consolidate my blogging… maybe when I’m done with grad school (soon!).

Anyway, right. So here’s a programming post. I’m not sure how I came across the Programming Praxis blog, but one of their recent posts caught my eye: find the longest palindrome in a string. Given a string, what is the longest palindrome contained in that string? I thought about it (actually thought about it, instead of just thinking I would like to think about it) for a few minutes this evening, and came up with a little something that pleased me. Python‘ll do that.

So the idea is to build an array whose n-th item is an array of all of the indices in the string where a palindrome of length n begins. I guess that means I think of this array as 1-based, instead of 0-based, but whatever. The reason I like this idea is that if you’ve found all of the palindromes of length less than n, you can easily pick out the indices where palindromes of length n start as those indices, i, where (1) the string has the same character at i and i+n-1, and (2) i+1 is a palindrome of length n-2. The other reason I like this idea is that it’s really easy in python:

# assume text is a string of only lowercase letters, nothing else
text = "astringiwouldreallyliketofindlongpalindromesin"
tlen = len(text) # convenience, it's a few characters shorter to type
# keep track of where palindromes of all lengths are
# longs[n][m] = 1 if there is an (n+1)-digit pali beginning at index m
# and longs[n][m] is undefined otherwise
# prep longs with the easy cases, 1- and 2-digit palindromes
longs = [ [n for n in xrange(0,tlen)],
          [n for n in xrange(0,tlen-1) if text[n] == text[n+1]] ]

# iterate
while len(longs[-1]) > 0 or len(longs[-2]) > 0:
    curlen = len(longs)
    longs.append([n for n in xrange(0, tlen-curlen)
                  if text[n] == text[n+curlen]
                  and n+1 in longs[-2]])

winners = longs[-3] # [-1] and [-2] are empty, after all
winlen = len(longs)-2
print "\n".join([text[n:n+winlen] for n in winners])

So, there’s that. All of the interesting work gets done in that one line in the loop. You gotta be a little bit careful, watching for off-by-ones, but this seems to work. I thought about trying some other algorithms to compare efficiency. But I like this one, even if I’d eventually find it isn’t the best in terms of efficiency. I had another little bit in there that I eventually realized was unnecessary, but I still thought was fun:

# build a dictionary, character: array, called locs, with
# locs[c] = [locations of occurences of 'c' in text]
locs = dict([(c,[]) for c in string.lowercase])
map(lambda nc:locs[nc[1]].append(nc[0]), enumerate(text))

There may be a built-in python-y way to do this, but I like this anyway. I guess I’m just a sucker for map. And list comprehension.