Ranting about JVM's default memory limits...

Tom Anderson

Monthly updates seem to be industry standard today.

Ah, okay.

So, what, you want Apple should release software with so many bugs it
needs monthly bug fixes? :)

I entirely agree that Apple has not been speedy enough about fixing bugs
in some cases, particularly security-related ones.

tom

--
Imagine a city where graffiti wasn't illegal, a city where everybody
could draw wherever they liked. Where every street was awash with a
million colours and little phrases. Where standing at a bus stop was never
boring. A city that felt like a living breathing thing which belonged to
everybody, not just the estate agents and barons of big business. Imagine
a city like that and stop leaning against the wall - it's wet. -- Banksy
 
Tom Anderson

Based on your closing paragraph, I think you misunderstood what I wrote.

.NET doesn't collect by moving a pointer. It allocates memory by moving
a pointer, and it moves objects from one generation to another by moving
a pointer.

Okay. Does a compaction occur before the moving from one generation?

If not, how is garbage-filled space reclaimed?

If so, what advantage does this have over copying the objects into the
older generation's space? I guess it means that the older generation
doesn't fill up, it just grows, meaning you aren't forced to do a
collection on that generation at any point. Aha. Is that the whole point?

MSDN has detailed references regarding the design of the .NET GC system.
I don't have the links handy, but if the above clarification doesn't
address your concerns, I'm happy to search for them and post them here.

You're very kind. I think i may be understanding what you mean now,
though, so no need!

tom

Mark Space

Tom said:
If you're compacting the objects that survive the nursery, you're
copying each of them, so this is no cheaper than copying them all into
an existing older generation.

Not necessarily. If say 50% of the tenured objects eventually are
collected, then you only have to copy the equivalent of those 50% of
objects to fill the holes they once occupied.

You may not end up with 100% memory compaction (there may be small holes
left over that you can't fill because the new objects aren't exactly the
same size as the old ones) but I'm not sure that 100% compaction should
be the goal either.


To bring this discussion full circle, it seems to me that the .NET
garbage collector was brought up because it's an excellent balance
between "do some compaction" and "grow the heap." Unlike Sun's GC, it
works very well with a strategy to grow the heap so an app can use all
available memory, without suffering heavy compaction algorithms.

It's an answer to Sun's protestations of "we can do this, but it's
complicated." No, the proper algorithm is simple.
 
Tom Anderson

And besides, a more-frequent release cycle necessarily leads to more
bugs overall. Every time the code gets touched, whether to fix bugs or
add features, that introduces a new opportunity to add more bugs.
Which then need to be fixed again later.

True, but i don't see that the frequency of releases has any bearing on
that. If i'm adding 100 features a year, does it make a difference if i
make a 100-feature release once a year or a 25-feature release once a
quarter?

But IMHO the bottom line here is that an OS that's not even a year old
should not need to be on its sixth revision already (we're not talking
minor patches...these are version # updates).

As i'm sure you're aware, what constitutes a version number update is
entirely arbitrary (and as long-term Mac users can testify, particularly
arbitrary when it comes to MacOS versions - 7.5, anyone?). Apple have
decided that every software update will bump the minormost version number.
Would you be happier if they kept that the same and called it something
like "10.4.0 SP7"?

Or in thinking they've fixed such bugs, only to find that they haven't (see
the recent DNS vulnerability, for example).

It's a hard time to love a Mac. :(

Luckily, Microsoft are doing their level best to help!

tom

Tom Anderson

Not necessarily. If say 50% of the tenured objects eventually are
collected, then you only have to copy the equivalent of those 50% of
objects to fill the holes they once occupied.

Okay. Firstly, no: that would leave any surplus nursery survivors
scattered throughout the nursery, which would mean you hadn't collected
the nursery. You need to move those ones too, so you can reclaim the
nursery for fresh allocation.

Secondly, no: that would require the GC to know where the free space is,
ie to maintain a free list like a traditional C malloc/free system, and
modern GCs don't do that. At least, as far as i know - am i wrong?

You may not end up with 100% memory compaction (there may be small holes
left over that you can't fill because the new objects aren't exactly the
same size as the old ones) but I'm not sure that 100% compaction should
be the goal either.

The problem is not leaving spaces in the older generation - that's fine,
they'll get squeezed out when that generation eventually gets collected -
but rather leaving live objects scattered throughout the nursery. That
clobbers reuse of the nursery space, which is what you're trying to
accomplish.

To bring this discussion full circle, it seems to me that the .NET
garbage collector was brought up because it's an excellent balance
between "do some compaction" and "grow the heap." Unlike Sun's GC, it
works very well with a strategy to grow the heap so an app can use all
available memory, without suffering heavy compaction algorithms.

Really? I think i've got a handle on what the .NET GC does, based on
Peter's explanation, and it does just as much compaction as a traditional
generational GC (which is what i'm assuming Sun's is). The difference is
that a traditional system has a fixed-size older generation, which fills
up and must then be collected, whereas .NET has one which grows over time,
and can be collected at some arbitrary time. The traditional system bounds
memory use, but can involve frequent collections of older-space; the .NET
approach gobbles up more and more memory if you don't collect frequently
enough, but lets you spend less time collecting.

I can imagine that .NET's approach is more suited to the desktop, where you
typically have one app active at a time, and it's fine to let that app
grow to consume all available memory, because nobody else needs it. It
would perhaps be less suitable for servers running several services, where
you'd like the apps to play nice and use limited, fixed amounts of memory.

It's an answer to Sun's protestations of "we can do this, but it's
complicated." No, the proper algorithm is simple.

I'm sure the computer scientists who have spent decades developing the art
of garbage collection will be pleased to hear that.

tom

Tom Anderson

[...]
.NET doesn't collect by moving a pointer. It allocates memory by moving a
pointer, and it moves objects from one generation to another by moving a
pointer.

Okay. Does a compaction occur before the moving from one generation?

As far as I know, yes. (Note that this is a change from what I stated
previously...I've had a chance to re-read the materials I'd been basing
my recollection on before, and while it's possible there's some detail
that was left out from the reference, and while not compacting the
newest generation would in fact be a viable strategy, I think .NET does
compact before adjusting the generation pointers.

I would be absolutely stunned if it didn't.

It's just that it does compaction in-place, which has some potential
performance implications).

Yes, quite possibly.

Basically, yes. .NET has three generations (not two as I said before),
and it mostly ignores older generations until it really has to deal with
them. So those generations can become fragmented by unused objects for
some extended period of time. But .NET doesn't worry about them most of
the time, so it doesn't cost anything in performance, until generations
change.

Absolutely, that's standard for generational collectors. You assume that
there isn't *that* much garbage in the older generations, so you don't
bother collecting them all that frequently.

If Java is copying references from one generation storage to another,
then it incurs a compaction cost whether compaction is really necessary
or not. In .NET, if you've got a large stretch of objects at the
generational boundary that are only aging, they can be moved from one
generation to the other without any more effort than just moving the
pointer.

Right. AIUI, what you're saying is that if you have a situation like this
(letters are live objects, dots are garbage):

                      |
ABCDEFGHIJKLMNOPQRSTUV|wxyzab...c..de.f.g...hi
    old generation    | nursery

You can just bump the boundary:

                            |
ABCDEFGHIJKLMNOPQRSTUVwxyzab|...c..de.f.g...hi
       old generation       | nursery

Well, and then compact the rest, and bump again:

                                   |
ABCDEFGHIJKLMNOPQRSTUVwxyzabcdefghi|..........
          old generation           | nursery

Yes, if that does happen, you can avoid copying a bunch of objects.
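
To make the diagrams above concrete, here is a minimal Python sketch of ageing a nursery by first bumping the boundary past any leading run of live objects and then sliding the remaining survivors down. It models the heap as strings in the same notation (letters are live, dots are garbage). The function name age_nursery and the copy count it returns are illustrative assumptions, not a description of how .NET or any real collector is implemented.

```python
def age_nursery(old: str, nursery: str) -> tuple[str, str, int]:
    """Age a nursery into the old generation; return (old, nursery, copies)."""
    # Step 1: bump the boundary over the leading run of live objects.
    # These objects change generation without being copied.
    run = 0
    while run < len(nursery) and nursery[run] != ".":
        run += 1
    rest = nursery[run:]

    # Step 2: slide the remaining survivors down, preserving their order.
    survivors = [c for c in rest if c != "."]
    # rest is either empty or starts with garbage, so every survivor in it moves.
    copies = len(survivors)

    new_old = old + nursery[:run] + "".join(survivors)
    new_nursery = "." * (len(rest) - len(survivors))
    return new_old, new_nursery, copies


old = "ABCDEFGHIJKLMNOPQRSTUV"
print(age_nursery(old, "wxyzab...c..de.f.g...hi"))
print(age_nursery(old, "...w..x.....y..z....a.....bc"))
```

With a dense run at the boundary, six of the thirteen survivors are adopted without a copy. With a sparse nursery that starts with garbage, all seven survivors have to be moved.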

But, if the generation you're ageing is the nursery, i don't think that
will ever happen. Remember, the vast majority of objects die in the
nursery. The chances of getting a substantial run of live objects, or even
a region of reasonably dense liveness, are astronomically small. The
normal situation looks more like this:

                      |
ABCDEFGHIJKLMNOPQRSTUV|...w..x.....y..z....a.....bc
    old generation    | nursery

The only sane thing to do is to compact the nursery:

                             |
ABCDEFGHIJKLMNOPQRSTUVwxyzabc|.....................
       old generation        | nursery

This is how a 'normal' generational collector does it - before:

                                |
ABCDEFGHIJKLMNOPQRSTUV..........|...w..x.....y..z....a.....bc
         old generation         | nursery

And after:

                                |
ABCDEFGHIJKLMNOPQRSTUVwxyzabc...|............................
         old generation         | nursery

How often this comes up, I don't know. It really depends on the
allocation patterns of the application. But recommended practice is to
allocate long-lived objects early on, and then only allocate short-lived
objects, and following that recommendation a .NET application could in
fact find itself having very little work to do with respect to
compacting the heap.

The first time it runs GC, perhaps. But later on, the nursery will be
mostly full of short-lived objects.

I apologize for distracting this thread with .NET cruft. I brought it
up mainly as a way of contrasting how different GC strategies might
work, and what implications that might have for how memory constraints
might affect Java. But I acknowledge that it took us off to an
off-topic tangent.

I don't think you have anything to apologize for. Discussion of
alternative approaches to GC is entirely on-topic.

tom

zerg

Tom said:
ABCDEFGHIJKLMNOPQRSTUVwxyzab|...c..de.f.g...hi
       old generation       | nursery

Well, and then compact the rest, and bump again:

                                   |
ABCDEFGHIJKLMNOPQRSTUVwxyzabcdefghi|..........
          old generation           | nursery

Yes, if that does happen, you can avoid copying a bunch of objects.

You can avoid copying some more. In this case, c and d:

                                   |
ABCDEFGHIJKLMNOPQRSTUVwxyzabihgcfed|..........
          old generation           | nursery

In this case, with fixed-size objects, you move the furthest-right
object into the first gap, the next-furthest-right into the next gap,
and so on until there are no objects right of gaps.

With variable-sized objects, you can approximate this but will generally
end up with some lost space or with more copying.
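
The fixed-size case described above can be sketched with two cursors: one scanning for gaps from the left, one scanning for live objects from the right. This is only an illustration on the thread's string notation. The name fill_gaps is hypothetical, and real collectors have to deal with variable sizes and with updating references, which this sketch ignores.

```python
def fill_gaps(nursery: str) -> tuple[str, int]:
    """Compact fixed-size objects by moving the rightmost live object
    into the leftmost gap until no live object lies right of a gap.
    Returns the new layout and the number of objects moved."""
    cells = list(nursery)
    left, right = 0, len(cells) - 1
    moves = 0
    while True:
        # Advance left to the first gap.
        while left < right and cells[left] != ".":
            left += 1
        # Retreat right to the last live object.
        while left < right and cells[right] == ".":
            right -= 1
        if left >= right:
            break
        # Move the rightmost live object into the leftmost gap.
        cells[left], cells[right] = cells[right], "."
        moves += 1
    return "".join(cells), moves


print(fill_gaps("...c..de.f.g...hi"))
print(fill_gaps("...w..x.....y..z....a.....bc"))
```

Both calls move five of the seven survivors and leave two in place. The order of the survivors is not preserved, which is the price of the saved copies.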
But, if the generation you're ageing is the nursery, i don't think that
will ever happen. Remember, the vast majority of objects die in the
nursery. The chances of getting a substantial run of live objects, or
even a region of reasonably dense liveness, are astronomically small.
The normal situation looks more like this:

                      |
ABCDEFGHIJKLMNOPQRSTUV|...w..x.....y..z....a.....bc
    old generation    | nursery

                             |
ABCDEFGHIJKLMNOPQRSTUVcbawzyx|.....................
       old generation        | nursery

w and x didn't move this time.

In practice, a lot of objects tend to be small integer multiples of 4
bytes, which is "almost" as optimizable as if they're all the same size.

(The first GC in the program run, or after the creation of a long-lived
structure such as when the user does a file...open, will often hit a
substantially dense clump of live objects. These cases may not be worth
optimizing for, though.)

I don't think you have anything to apologize for. Discussion of
alternative approaches to GC is entirely on-topic.

Seconded.
 
Andreas Leitgeb

Lew said:
tenured                           nursery0         nursery1
ABCDE...........................||#F#w####y######g|................
ABCDEF..........................||................|wyG.............
It is a Bad Thing for tenured objects to refer to young ones.

Huh? How would one avoid it?
 
Andreas Leitgeb

Andreas Leitgeb said:
If, indeed, they have, then I'm of course happy to
be corrected. Last time I tried (with 1.6), it looked
like I presumed, but my prog may actually have kept more
reachable data than I expected...

I guess that's what happened. I have now retried it on a newer
machine with 2 GB, with a Java instance's -Xmx set to 1536m, and
it nicely did GC every once in a while, so it was breathing
between 700m (probably near the total size of all live objects)
and 800m. I'm not sure what really triggered the GC at the upper
point (probably the -Xms option's default had something to do
with it), but my previous fears of "memory consumption up to Xmx"
seem to be void.
 
Lew

Tom said:
The normal situation looks more like this:

                      |
ABCDEFGHIJKLMNOPQRSTUV|...w..x.....y..z....a.....bc
    old generation    | nursery

The only sane thing to do is to compact the nursery:

                             |
ABCDEFGHIJKLMNOPQRSTUVwxyzabc|.....................
       old generation        | nursery

This is how a 'normal' generational collector does it - before:

                                |
ABCDEFGHIJKLMNOPQRSTUV..........|...w..x.....y..z....a.....bc
         old generation         | nursery

And after:

                                |
ABCDEFGHIJKLMNOPQRSTUVwxyzabc...|............................
         old generation         | nursery

Java's GC has two nurseries and a tenured generation. The boundaries don't
dynamically change. Compaction occurs between nursery spaces ten times
before an object is promoted to the tenured generation.

vehicle unmoderate
lower-case still ticklish
irrational-case incorrect
# unreachable
.. disagreeable

tenured                           nursery0         nursery1
ABCDE...........................||#F#w####y######g|................

ABCDEF..........................||................|wyG.............

It is a Bad Thing for tenured objects to refer to young ones.

--
Lew


 
Andreas Leitgeb

Peter Duniho said:
By ensuring the short-lived objects are referenced as local
variables, not in class members. plus:
(Or are referenced by objects that are themselves short-lived,
obviously...sorry if that wasn't clear)
..., because it otherwise has an optimization that allows it
to ignore the objects in the older generations.

Thanks.
I didn't know about those optimizations.

PS: Wouldn't this also bias the decision towards object re-use and
away from throw-away-and-get-a-new-one? (At least, if some
tenured object has to have a reference on it...)

PPS: Do stackframes also "qualify" as objects in this context?
 
blmblm

[...]
The virtual address space itself costs relatively little (just page
tables), but when you start using it you will need either RAM or
page file.

Minor point: my recollection is that space in physical memory or the swap
file has to be reserved when the allocation happens. The OS doesn't
actually do anything with the space, but it has to at least know there's
room for the allocation. The alternative would be for the OS to try to
reserve that space when the page is first accessed, but by that point, the
OS has already promised the process that the memory is available. There's
no safe way for the OS to report an allocation failure at that point.

You'd think so, wouldn't you?

But apparently Linux at least doesn't always work that way,
but sometimes returns a non-null result from malloc() without
actually guaranteeing that space is available, apparently on the
assumption that maybe the process isn't actually going to use all
that memory, or something along those lines. Here's a discussion
that explains it better than I can:

http://www.win.tue.nl/~aeb/linux/lk/lk-9.html

In doing a quick Google search, I came across some hints that
other operating systems may work similarly. I'm too lazy to
research this in depth right now, but perhaps someone else can
provide more information.
 
Lew

Peter Duniho said:
I don't know if there's an in-depth article that discusses
the details of the Java GC implementation.

There are many. I gave a link upthread to one such. Look for white
papers on the Sun Java site about garbage collection and GC
ergonomics.
 
Lew

Andreas said:
PS: Wouldn't this also bias the decision towards object re-use and
   away from throw-away-and-get-a-new-one?  (At least, if some
   tenured object has to have a reference on it...)

No. If the tenured object holds a reference to an object, then it
stays alive as long as that reference is held. If it needs to hold
that reference for a long time, then the referenced object will
eventually get tenure anyway. If it doesn't need to hold it for long,
then the reference should be released to allow GC of the unneeded
object. It's unfortunate if that causes a tenured-to-nursery
reference, but that's better than distorting the logic of your program
because you're trying to second-guess the collector.

It is enough to idiomatically use short-lived stack/local references
to short-lived objects, like creating a new object inside a loop for
each iteration instead of a single one before the loop that is re-used
each iteration. Don't use instance variables for that type of
throwaway, but go right ahead and assign to instance variables where
it makes sense to do so for intrinsic reasons. Typically such
assignments will tend to happen near the beginning of the referring
object's life, and tend to last as long as the referring object does.
Thus the referenced and the referrer will tend to have similar
lifespans.

For the most part the logic of the program will keep short-lived
objects referenced mostly by other short-lived objects or from local
variables. You won't eliminate every possible unfortunate reference,
perhaps, but you probably will get most of them that way.

PPS: Do stackframes also "qualify" as objects in this context?

As I understand what a stack frame is, the question doesn't make sense
- a stack frame simply is not an object, and does not get garbage-
collected. In what fashion do you contemplate viewing a "stack frame"
as an object, and what do you mean by a "stack frame" in this context?
 
Arne Vajhøj

Andreas said:
I didn't know about those optimizations.

These types of optimization are interesting, but in most
cases code should not be designed based on them.

Arne
 
blmblm

[...]
Minor point: my recollection is that space in physical memory or the swap
file has to be reserved when the allocation happens. The OS doesn't
actually do anything with the space, but it has to at least know there's
room for the allocation. The alternative would be for the OS to try to
reserve that space when the page is first accessed, but by that point, the
OS has already promised the process that the memory is available. There's
no safe way for the OS to report an allocation failure at that point.

You'd think so, wouldn't you?

But apparently Linux at least doesn't always work that way,

Well, we were talking about Windows.

Fair enough -- I haven't followed everything in this thread
carefully, so I read your "the OS" as meaning a generic OS.
Apologies if my comment represents a not-useful tangent.

And if so, the rest of this could be ignored, but for the record,
maybe ....

I am almost certain that Windows is
not broken in the same way that Linux is (or was...the article you
reference is five years old, and maybe Linux has been fixed by now).

Yes, the article I referenced is not very recent, but I found
many recent mentions of memory overcommits, and indeed the man
page for malloc on a recent Fedora system seems to be saying that
the behavior described in the article still exists. I referenced
this particular article because it seemed a little more complete
and authoritative than an assortment of recent mailing list posts.

As for whether this behavior is "broken": Maybe so (and the
people who wrote the man page for malloc seem to agree).
I'm not sure I'd be willing to second-guess what seems to have
been a deliberate design decision on the part of the kernel
developers without knowing more about their reasoning. YMMV,
maybe.

It might also be worth noting that there's a way to tell the
kernel *not* to adopt this strategy.
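
For reference, on Linux this behaviour is governed by the vm.overcommit_memory sysctl, which is exposed as the file /proc/sys/vm/overcommit_memory: 0 selects heuristic overcommit, 1 always overcommits, and 2 disables overcommit in favour of strict accounting. The following is a small Python sketch that reads the current setting. The helper names are made up for illustration, and on a system without that file the function simply returns None.

```python
from pathlib import Path
from typing import Optional

# Meanings of the vm.overcommit_memory values.
MODES = {
    0: "heuristic overcommit",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def describe_overcommit(raw: str) -> str:
    """Translate the raw file contents (e.g. '0\\n') into a description."""
    mode = int(raw.strip())
    return MODES.get(mode, f"unknown mode {mode}")


def current_overcommit(
    path: str = "/proc/sys/vm/overcommit_memory",
) -> Optional[str]:
    """Return a description of the running kernel's policy, or None
    if the file is not present (for example, on a non-Linux system)."""
    p = Path(path)
    if not p.exists():
        return None
    return describe_overcommit(p.read_text())


print(current_overcommit())
```

Switching to mode 2 needs root privileges (for example via sysctl) and changes behaviour for every process on the machine, so it is a system-wide decision rather than a per-application one.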

Good question. In my initial rather cursory search I came across
a mailing list post (whose URL I don't remember) saying something
like "most server OS's employ a similar strategy". That's hardly
authoritative, though, so perhaps I shouldn't have mentioned it,
even rather weakly ("came across some hints").

Attempting a slightly more careful search today -- in all truth,
I'm having trouble finding anything that seems very authoritative,
so who knows. Perhaps someone who actually knows will speak up.
Or not!

I don't know enough about Linux to even confirm that what the article you
reference says is true. However, assuming it is true, then the author of
the article is absolutely correct that this represents a VERY serious flaw
in Linux. I would be surprised if it were actually true that such a flaw
is commonly found in other operating systems.

Could be. See above.

And -- "peace", okay?
 
