A speed comparison of Python, Cython and C++

matthias on 2014/02/16

Topcoder recently allowed the usage of Python in their single-round matches. On a side-note, I would recommend these to anyone who reached an intermediate level in programming and wants to stay sharp (Java, C++, C#, Python are now all possible). I haven’t done overly well there so far (don’t ask for my account name just yet), but I keep learning new approaches and they keep me on my toes.

Now, in the last match I participated, I thought I found a decent algorithm to solve the problem, but it was just not fast enough on Python. The problem is detailed here: http://community.topcoder.com/stat?c=problem_statement&pm=12967&rd=15840 (local copy: tc_srm607_prob2_div2)

You’re essentially given a string of up to 5000 characters and you should count the number of palindromic substrings (for instance ‘aaa’ has 6; 3x’a’, 2x’aa’ and ‘aaa’ itself) and all this has to happen in two seconds which is the main issue for Python. Many of the competitors failed because they tried to evaluate all the possible strings in for-loops which timed out even in faster programming languages.

I was a bit disappointed because the solution I had would come in at 3s in the worst case as I found after examining this further. I implemented the same thing in C++ where it was a lot faster and today, I tried how well it performs when you compile it using Cython.

My algorithm goes over the string from left to right and then starts to move outwards from any letter to see if there is palindrome with it in the center. I tested it by comparing to the naive implementation where I’m reasonably sure that it’s correct. Even if it isn’t, the speed argument still stands though.

The Cython as well as the main Python code is below. Note how little was changed between the count method in the main file and the count function in the cython file (lines 4-9 were annotated with types via cdef). I hadn’t used Cython before and put this together with only their tutorial from http://docs.cython.org/src/tutorial/cython_tutorial.html#the-basics-of-cython which says something about how convenient it is. Caveat: On Linux that is; On Windows where I started trying it, it requires the right Visual Studio version that matches your Python installation. This looked too tedious, so I skipped it.

Full sourcecode can be downloaded here: PalindromicSubstringsDiv2

#File: PalindromicSubstringsDiv2_cython.pyx
def count( S1, S2):
    S_temp = ''.join(S1) + ''.join(S2)
    cdef char* S = S_temp
    cdef int i, k
    cdef int N = len(S)
    cdef int numPal = N
    for i in range(N):
        while(i+k<N and i-k>=0 and S[i+k]==S[i-k]):
            numPal +=1
        #left first
        while(i+k<N and i-k+1>=0 and S[i+k]==S[i-k+1]):
            numPal +=1

    return numPal
#File: PalindromicSubstringsDiv2.py (main file)
import time
import math
import pyximport; pyximport.install()
import PalindromicSubstringsDiv2_cython

class Timer:
    def __enter__(self):
        self.start = time.clock()
        return self

    def __exit__(self, *args):
        self.end = time.clock()
        self.interval = self.end - self.start

class PalindromicSubstringsDiv2:
    def checkPal(self, Sin):
        if len(Sin)==1:
            return True
        N = math.ceil(len(Sin) / 2.0)
        for i in range(int(N)):
            if not Sin[i]==Sin[-i-1]:
                return False

        return True

    def count_old(self, S1, S2):
        S = ''.join(S1) + ''.join(S2)

        numPal = 0
        for iLen in range(1,len(S)+1):
            for i in range(len(S)-iLen+1):
                if self.checkPal(S[i:i+iLen]):
                    numPal += 1

        return numPal

    def count(self, S1, S2):
        S = ''.join(S1) + ''.join(S2)
        N = len(S)
        numPal = N
        for i in range(N):
            #check for odd strings with i in the center
            while(i+k<N and i-k>=0 and S[i+k]==S[i-k]):
                numPal +=1
            #check for even strings with i on the left of the double center
            while(i+k<N and i-k+1>=0 and S[i+k]==S[i-k+1]):
                numPal +=1

        return numPal

if __name__ == '__main__':
    Plist = [
            [[], ["daata"]],
            [["top"], ["coder"]],
            [["zaz"],[] ],
            [["a", "a", ""], ["a"]],
            [["a"*20] * 10, ["a"*20] * 10],
            [["a"*40] * 10, ["a"*40] * 10],
            [["a"*50] * 50, ["a"*50] * 50],

    for pi in Plist:
        p0 = pi[0]
        p1 = pi[1]

        with Timer() as t1:
            cnt = PalindromicSubstringsDiv2_cython.count(p0, p1)
        print 'Cython - cnt: ' + str(cnt) + ', calculated in ' + str(t1.interval)

        with Timer() as t1:
            P= PalindromicSubstringsDiv2()
            cnt = P.count(p0, p1)
        print 'Python - cnt: ' + str(cnt) + ', calculated in ' + str(t1.interval)

        with Timer() as t1:
            P= PalindromicSubstringsDiv2()
            if len(p0) + len(p1) > 20:
                cnt = -1
                cnt = P.count_old(p0, p1)
        print 'Naive  - cnt: ' + str(cnt) + ', calculated in ' + str(t1.interval)

Results for the larger examples on the bottom are as follows:

Cython - cnt: 320400, calculated in 0.0
Python - cnt: 320400, calculated in 0.07
Naive  - cnt: 320400, calculated in 4.82
Cython - cnt: 12502500, calculated in 0.03
Python - cnt: 12502500, calculated in 2.97
Naive  - cnt: -1, calculated in 0.0

For reference, the C++ solution I programmed (see source code archive above) is actually slower than Cython here (0.11s vs 0.03s). Could be because Cython translates Python to C and then compiles it more efficiently than C++ (my bet is on Cython resulting in char* whereas I used string in C++) or because I programmed it in slightly different ways. So yes, I am deeply impressed by the speed that is possible with Cython!

However, even though I am very pleased, I’ll probably still go back to using C++ for Topcoder because they do not allow Cython at this moment.