Python memory use

Gary Robinson

The chart at http://shootout.alioth.debian.org/u32q/benchmark.php?test=all&lang=javasteady&lang2=python&box=1 is very interesting to me because it shows CPython using much less memory than Java for most tests.

I'd be interested in knowing whether anybody can share info about how representative those test results are. For instance, suppose we're talking about a huge dictionary that maps integers to lists of integers (something I use in my code). Would something like that really take up much more memory in Java (using the closest equivalent Java data structures) than in CPython? I find it hard to believe that that would be the case, but I'm quite curious.

(I could test the particular case I mention, but I'm wondering if someone has some fundamental knowledge that would lead to a basic understanding.)


--

Gary Robinson
CTO
Emergent Music, LLC
personal email: (e-mail address removed)
work email: (e-mail address removed)
Company: http://www.flyfi.com
Blog: http://www.garyrobinson.net
 
Mensanator

The chart at http://shootout.alioth.debian.org/u32q/benchmark.php?test=all&lang=ja... is very interesting to me because it shows CPython using much less memory than Java for most tests.

Which version of Python? If you're talking 3.x for Windows, any memory
usage statistics are meaningless.
 
Isaac Gouy

The chart at http://shootout.alioth.debian.org/u32q/benchmark.php?test=all&lang=ja... is very interesting to me because it shows CPython using much less memory than Java for most tests.

I'd be interested in knowing whether anybody can share info about how representative those test results are. For instance, suppose we're talking about a huge dictionary that maps integers to lists of integers (something I use in my code). Would something like that really take up much more memory in Java (using the closest equivalent Java data structures) than in CPython? I find it hard to believe that that would be the case, but I'm quite curious.

(I could test the particular case I mention, but I'm wondering if someone has some fundamental knowledge that would lead to a basic understanding.)

1) That URL shows approximate averages rather than the straight Java
measurements.

2) Unless the programs are using *a lot* of memory you're just seeing
default JVM memory use.

3) More of the Java programs may have been re-written to use quad
core, which may use extra buffering.


So look for tasks that use a lot of memory and watch for time/space
tradeoffs:

  x    Program              Time      Memory (KB)
  2.1  Java 6 -server #2    29.32 s   259,868
 62    Python #6            14 min    674,316
167    Python #2            38 min    221,236

http://shootout.alioth.debian.org/u32/benchmark.php?test=binarytrees&lang=all


  x    Program              Time      Memory (KB)
  2.8  Java 6 -server #2    46.87 s   363,488
 34    Python               9 min     439,196

http://shootout.alioth.debian.org/u32/benchmark.php?test=knucleotide&lang=all


  x    Program              Time      Memory (KB)
  2.5  Java 6 -server #4    2.87 s    473,324
  6.5  Python #3            7.67 s    543,908

http://shootout.alioth.debian.org/u32/benchmark.php?test=revcomp&lang=all
 
TerryP

Honestly, the only performance data involving Java that would ever
surprise me is a Java program taking less time to start up and get
going than the computer it is being run from did ;).


When planning ahead for a project, I look at what performance the
language implementations offer in light of questions like "Blazingly
fast on all but the extreme cases" or "Fast enough for the job, with
cycles left over to toast bread with"; the rest gets more specific to
the problem domain. Over the years I have only ever had one mainstream
language prove too slow for my needs, and that was because the task
was the least optimal use for Perl... although I must admit, I would
never want to try software rendering in pure Python (to whatever
extent that is possible).
 
Paul Rubin

Gary Robinson said:
I'd be interested in knowing whether anybody can share info about
how representative those test results are. For instance, suppose
we're talking about a huge dictionary that maps integers to lists of
integers (something I use in my code). Would something like that
really take up much more memory in Java (using the closest
equivalent Java data structures) than in CPython? I find it hard to
believe that that would be the case, but I'm quite curious.

Arrays of Java ints would use less memory than lists of Python's boxed
integers. If you want unboxed ints in Python, use the array module.
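
For anyone who wants to check this on their own machine, here is a rough sketch using tracemalloc to compare a dict of int -> list of ints with a dict of int -> array of ints. The function names, the sizes, and the "i" typecode are my own assumptions for illustration, not anything from the original poster's code, and the numbers will vary by Python version and platform.

```python
import tracemalloc
from array import array


def build_with_lists(n_keys, per_key):
    """dict mapping int -> list of boxed Python ints."""
    return {k: [1000 + k * per_key + i for i in range(per_key)]
            for k in range(n_keys)}


def build_with_arrays(n_keys, per_key):
    """dict mapping int -> array of unboxed C ints (typecode 'i')."""
    return {k: array("i", (1000 + k * per_key + i for i in range(per_key)))
            for k in range(n_keys)}


def traced_bytes(builder, n_keys, per_key):
    """Return (bytes allocated while building, the built structure)."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    data = builder(n_keys, per_key)
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before, data


if __name__ == "__main__":
    list_bytes, _ = traced_bytes(build_with_lists, 200, 100)
    array_bytes, _ = traced_bytes(build_with_arrays, 200, 100)
    print("dict of lists :", list_bytes, "bytes")
    print("dict of arrays:", array_bytes, "bytes")
```

The values start at 1000 so that CPython's cache of small integers does not hide the cost of the boxed ints. If the real data does not fit in a C int, a wider typecode such as "q" would be needed, which narrows the gap somewhat.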
 
Bearophile

Gary Robinson:
(I could test the particular case I mention, but I'm wondering if someone has some fundamental knowledge that would lead to a basic understanding.)

Java is one of the most memory-hungry languages, often even more so
than Python 2.x. Some bad people have said that Java developers were
not that interested in saving RAM because Sun sells hardware, and the
more RAM it uses the more they can sell ;-)

More seriously, Java uses complex hybrid generational garbage
collectors, while CPython uses a much simpler reference-counting GC
plus a cycle detector.

A reference counter usually performs worse than a good generational
garbage collector, especially one hybridized with several other
algorithms, but it is simpler. (Most things in CPython are designed
for simplicity even when that makes them a little less efficient;
among other things, this simplicity helps this open source project
recruit and keep developers.)

It is also almost deterministic (so, for example, in some situations
you can get away with forgetting to close a file), and so it often
uses less memory, because at any moment you hold only the objects you
are actually using (reference cycles add a little extra complexity
here). A generational GC, by contrast, keeps a lot of extra memory
around that is unused, free, etc. There are studies showing that with
this kind of good generational GC, if you pay about a 2-5X memory
penalty, you can have a language about as fast as ones where you
manage memory manually. Indeed, today good Java programs are often no
more than 2X slower than C++, and sometimes are about as fast or even
faster (thanks to other optimizations, like the aggressive inlining
of virtual methods done by HotSpot).
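
To make the "almost deterministic" point concrete, here is a small sketch that should behave as described on CPython (other implementations such as Jython or PyPy use different collectors, so do not expect the same results there). The Node class and both function names are made up for the illustration.

```python
import gc
import weakref


class Node:
    """Small object that can hold a reference to another object."""

    def __init__(self):
        self.other = None


def freed_immediately():
    """True if a non-cyclic object is reclaimed when its last reference goes."""
    obj = Node()
    ref = weakref.ref(obj)
    del obj                      # refcount drops to zero here
    return ref() is None


def cycle_needs_collector():
    """Return (alive_after_del, alive_after_collect) for a two-object cycle."""
    gc_was_enabled = gc.isenabled()
    gc.disable()                 # keep the cycle detector from running on its own
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a  # build a reference cycle
        ref = weakref.ref(a)
        del a, b                 # each object is still referenced by the other
        alive_after_del = ref() is not None
        gc.collect()             # explicit cycle-detector pass
        alive_after_collect = ref() is not None
    finally:
        if gc_was_enabled:
            gc.enable()
    return alive_after_del, alive_after_collect


if __name__ == "__main__":
    print(freed_immediately())
    print(cycle_needs_collector())
```

The first function shows why memory for ordinary objects is returned right away under reference counting; the second shows the extra complexity of cycles, which wait for the cycle detector.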

If you want a language that uses less RAM you can try FreePascal :)

I think that among the languages designed to work with a GC, the D
language is among the ones that use the least memory (because so far
its GC is not that good, so it saves memory while being slower than
the advanced GC used by Sun's Java).

On 64-bit systems Sun's Java has added an optimization: certain
pointers are compressed to 32 bits, reducing memory usage. Similar
things may be done by LLVM too in the future:
http://llvm.org/pubs/2005-06-12-MSP-PointerCompSlides.pdf
Maybe someday 64-bit CPython will do the same, or maybe Unladen
Swallow or PyPy will (Jython in a sense may do it already if you run
it on a 64-bit Java. I don't know).
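
To see what pointer width costs in CPython today, here is a quick sketch that reports the pointer size of the running interpreter and checks that each extra list slot costs exactly one pointer. The helper names are mine, and the per-slot result relies on CPython allocating `[None] * n` with exactly n slots, which I believe holds but is an implementation detail.

```python
import struct
import sys


def pointer_bytes():
    """Size of a C pointer in this interpreter build (4 or 8 on common platforms)."""
    return struct.calcsize("P")


def list_slot_bytes(n=1000):
    """Extra bytes a list needs for one more element slot."""
    return sys.getsizeof([None] * (n + 1)) - sys.getsizeof([None] * n)


if __name__ == "__main__":
    print("pointer size :", pointer_bytes(), "bytes")
    print("per list slot:", list_slot_bytes(), "bytes")
```

On a 64-bit build each slot is 8 bytes before counting the object it points to, which is the overhead that compressed pointers aim to reduce.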

Bye,
bearophile
 