Google Code Jam language usage

Terry Reedy · Sep 14, 2009

At the recent Google Code Jam, Python was the 3rd most popular language
after C/C++ and Java. From qualification round
http://www.go-hero.net/jam/09/languages/0

Language #contests (rounded)
C/C++ 4200
Java 1900
Python 900
C# 600
Perl 300
Ruby 200
PHP 200

Andreas Waldenburger · Sep 14, 2009

At the recent Google Code Jam, Python was the 3rd most popular
language after C/C++ and Java.

Good for C/C++ and Java that they are not ranked by "fun per line".

/W

Stefan Behnel · Sep 14, 2009

Andreas said:
Good for C/C++ and Java that they are not ranked by "fun per line".

Oh, I actually tend to have a lot of fun per line with Java. But that's
usually not with code I have written myself.

Stefan

Bearophile · Sep 15, 2009

(If you see any error in what I have written here, please tell me.)

So far I've never done a Google Code Jam. There are 12 problems there
(G.C.Jam 2009), and the best solutions are 9 in C++ and 3 in C (I
think they are the best solutions, but I am not sure).
The code of all such best solutions is tidy, well formatted, and
written in a quite compact style, with little extra space, few lines
of code, and short variable names (often 1 or 2 chars long).

The C++ code by TripleM shows he/she is also a topcoder, such code is
written in an aggressive style: they show a standard header where all
data structures from C++ STL are imported, plus some handy and common
templates/functions (with vector, matrix, etc, operations). I think it
may be positive to add similar handy templates/functions/data
structures to Boost or the STL. For example I've lost count of how
many times I have seen reimplemented a vector class/struct with dot
products, etc.

Generally in such contests programs are not that similar to real-world
ones, for example inputs ranges/sizes are perfectly known. But
probably this makes such contests "better".

The plots show that Python programmers seem to produce a higher
percentage of working programs compared to C ones, curiously Pascal
programmers have an higher percentage still. As usual Haskell
programmers look smart.

In other similar contests Python often doesn't allow to produce the
best solutions because it's too much slow at runtime, even using
Psyco. For example in the Code Jam the large problems may require lot
of processing. But experience shows that even for such algorithmic
problems the choice of language isn't the most important thing. The
most important is the choice of a good algorithm, the choice of good
data structures, the use of as little dynamic memory allocations as
possible, and then only now the language becomes important for the
running time.

Problems in such contests are often designed with well defined input
ranges/sizes, and dynamic allocations are one of the slower things in
such programs. So when possible it's better to use static memory, for
example a statically allocated matrix in C.

Python encourages the programmer to use a programming style where
dynamic allocations of memory are very common, but when performance
matters it's better to reduce them to the minimun in the inner loops.

For example this is (I think) the best Python program for the "Welcome
to Code Jam" qualification problem:

# By ZIBADA
import sys
import psyco
psyco.full()
def isset(a, b): return ((a >> b) & 1) != 0
def dbg(a): sys.stderr.write(str(a))
def readint(): return int(raw_input())
def readfloat(): return float(raw_input())
def readarray(foo): return [foo(x) for x in raw_input().split()]

def doit(i, j):
if (i, j) in mem: return mem[(i, j)]
if (j >= len(b)): return 1
if (i >= len(a)): return 0
res = doit(i + 1, j)
if (a == b[j]):
res += doit(i + 1, j + 1)
res %= 10000
mem[(i, j)] = res
return res

def run_test(test):
#dbg("Test %d\n" % (test + 1))
global a, b, mem
a = raw_input()
b = "welcome to code jam"
mem = {}

print "Case #%d: %04d" % (test + 1, doit(0, 0))

for test in range(readint()):
run_test(test)

It uses dynamic allocation and recursive calls. It processes the large
input in 0.56 s on my old PC.
The following is instead the starting part of (I think) the best
overall solution, that is in C (the algorithm used is a basic and well
known one, no recursions used. Run time is very fast, I can't time
it):

// By LayCurse
char a[512];
char b[32] = "welcome to code jam";
int d[512][32];
int main() {
...
for (t = 1;t <= tc; t++) {
fgets(a, 505, fp);
memset(d, 0, sizeof(d));
....

It uses a static matrix, and the number of columns is 32 instead of 19
(= len("welcome to code jam")) so the compiler can use just a shift to
find the right row start. One memset clears the whole matrix for each
input string.

This is the same C code by LayCurse translated (with few small
improvements) to Python+Psyco, it runs in about 0.14 seconds here on
the same dataset. Using the right algorithm, right data structures,
and minimizing memory allocations, with Psyco you can often go fast
enough:

from array import array
def main():
b = "welcome to code jam"
len_b = len(b)
d = [array('l', [0]) * len_b for _ in xrange(512)]
for t in xrange(int(raw_input())):
a = raw_input()
for row in d: # set d to 0
for j in xrange(len_b):
row[j] = 0
if a[0] == b[0]:
d[0][0] = 1
for i in xrange(1, len(a)):
d[0] += d[i - 1][0]
if b[0] == a:
d[0] += 1
for j in xrange(1, len_b):
d[j] += d[i - 1][j]
if b[j] == a:
d[j] += d[i - 1][j - 1]
d[j] %= 10000
print "Case #%d: %04d" % (t+1, d[len(a) - 1][len_b - 1])
import psyco; psyco.full()
main()

(There can be ways to speed up this Python code, I have not tried to
use a 1D matrix with shifts to find the right starting of the rows as
in C, and often in such dynamic programming algorithms you can just
keep 2 rows to avoid storing the whole dynamic matrix, this saves
memory and speed up code because there are less cache misses. Using
Psyco with array.array CPU cache misses can be felt, in normal Python
code they are almost invisible, lost in the noise).

In Python I'd like to have a built-in list&array method to assign all
items to a fixed value:

a1 = [1, 2, 3]
a2 = [array.array('l', [0]) * 5

a1.set(0)
a2.set(0)

That equals to the following, but it's faster:
for i in xrange(len(a1)):
a1 = 0
for i in xrange(len(a2)):
a2 = 0

I'd also like to have 64 bit signed/unsigned integers in array.array.

Such online contests usually ignore compilation times, but in
languages that have ways to perform computations at compile time (C++
with its templates, and even more in D that has better templates and
CTFE http://en.wikipedia.org/wiki/Compile_time_function_execution )
such times can become important, because some precomputations can be
moved to compile time. If compilation time counts too, then in such
contests Python will improve its stats (a slow compilation time can be
bad for the testing of the code, so it's a disadvantage for the
programmer too).

Bye,
bearophile

Bearophile · Sep 15, 2009

(There can be ways to speed up this Python code, I have not tried to

use a 1D matrix with shifts to find the right starting of the rows as
in C, and often in such dynamic programming algorithms you can just
keep 2 rows to avoid storing the whole dynamic matrix, this saves
memory and speed up code because there are less cache misses.

It was easy to do, running time about 0.13 seconds:

from array import array

def main():
b = "welcome to code jam"
len_b = len(b)
di = array('l', [0]) * len_b
dim1 = array('l', [0]) * len_b
for t in xrange(int(raw_input())):
a = raw_input()
for j in xrange(len_b): # set to 0
di[j] = 0
dim1[j] = 0
if a[0] == b[0]:
dim1[0] = 1
for i in xrange(1, len(a)):
for j in xrange(len_b): # set to 0
di[j] = 0
di[0] += dim1[0]
if b[0] == a:
di[0] += 1
for j in xrange(1, len_b):
di[j] += dim1[j]
if b[j] == a:
di[j] += dim1[j - 1]
di[j] %= 10000
di, dim1 = dim1, di
print "Case #%d: %04d" % (t+1, dim1[len_b - 1])

import psyco; psyco.full()
main()

Bye,
bearophile

How do I get the if statement correct in the lines that has it in the program?	2	Mar 16, 2023
The devolution of English language and slothful c.l.p behaviors exposed!	50	Jan 24, 2012
Problems of Symbol Congestion in Computer Languages	54	Feb 16, 2011
Total maximal size of data	11	Jan 25, 2010
24 Hour Coding Contest AT BITS-Pilani in Association with Google	0	Mar 10, 2008
Best Web dev language	7	Jun 11, 2005
Python 2nd favorite language in Linux Journal poll	2	Oct 7, 2005
Is this a bug? Python intermittently stops dead for seconds	5	Oct 1, 2006

Google Code Jam language usage

Terry Reedy

Andreas Waldenburger

Stefan Behnel

Bearophile

Bearophile

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads