Iterators in Java and C++

  • Thread starter Erik Wikström
  • Start date
L

Lloyd Bonafide

I note that you falied to elucidate a counterargument to the lengthy,
and detailed, technical discussion of the relative merits of Java and
C++ iterators.

Naturally, as I have no particular preference. No horse in the race, so
to speak. You, on the other hand, are talking out of both sides of your
mouth. First you claim that it's stupid to say one is better than the
other, then you give a lengthy dissertation as to why your favorite is
the better one. Seems a little inconsistent, no?
 
M

Mirek Fidler

Mirek said:
For starters, your implementation of next() is wrong. See
http://java.sun.com/javase/6/docs/api/java/util/Iterator.html. The correct,
equivalent, implementation would be:

Iter<T> next() const { Iter<T> n; n.ptr=ptr+1; n.end=end; return n }

That's your "Java style iterator" for you. That's what you have to do, on
each iteration of the loop. Good luck optimizing that. Now, why don't you go
and benchmark this abomination against the STL iterator, see what happens,
then get back to me.

Gosh, what the heck are they teaching in college, these days?

Sorry about that. I do not know Java very well, because 20 years back,
when I was in college, nobody was teaching it anyway (for the very
simple and obvious reason). I have just reflect to other posts.

BTW, even this would be optimized very well on most compliers AFAIK.
Compilers are quite good eliminating dead inlined code.

Mirek
 
E

Erik Wikström

For starters, your implementation of next() is wrong. See
http://java.sun.com/javase/6/docs/api/java/util/Iterator.html. The correct,
equivalent, implementation would be:

Iter<T> next() const { Iter<T> n; n.ptr=ptr+1; n.end=end; return n }

That's your "Java style iterator" for you. That's what you have to do, on
each iteration of the loop. Good luck optimizing that. Now, why don't you go
and benchmark this abomination against the STL iterator, see what happens,
then get back to me.

Gosh, what the heck are they teaching in college, these days?

Don't know if you posted the correct link or if they just teaches us to
read better in college these days. :) I think the correct
implementation, given the link you posted, should be

T& next() { ptr++; return *ptr; }

Which also shows the problem that James Kanze pointed out, namely that
you can not dereference the iterator more than once without changing
what it refers to.

Ideally you would have two methods, one to increment and one to
dereference, like so:

typename T::value_type& getValue() const { return *ptr; }
void moveNext{ ptr++; }

Rewriting it all in a bit my C++ stylish way we get:

#include <vector>
#include <iostream>

template <class T>
struct Iter {
typedef typename T::iterator STLIter;
typedef typename T::value_type Val;
STLIter ptr, end;

Iter(STLIter begin, STLIter end) : ptr(begin), end(end) {}

operator bool() const { return ptr != end; }
Val& operator*() const { return *ptr; }
void operator++() { ptr++; }
};

int main()
{
std::vector<int> v;
for (int i = 0; i < 10; ++i)
v.push_back(i);

Iter<std::vector<int> > it(v.begin(), v.end());

while (it)
{
std::cout << *it;
++it;
}
}

Not too sure about the operator bool() thing though...
 
E

Erik Wikström

No, that's not correct. In java.util.Iterator, the next() method does not
modify the Iterator, it returns a new Iterator.

No, it return a reference to an object of type E, in other words it
returns a reference to the next element.
The existing iterator remains valid and continues to refer to the
same element.

The link you sent is silent on what happened to the iterator, but if a
new one is created the user will not have a reference to it since the
return-type of next() is E, and not Iterator<E>. Which means that the
user can not make the iterator go forward unless the first iterator is
modified.
Note the description of the next() method in the API:

Returns the next element in the iteration.

Notice that nowhere does it say that a new Iterator object should be
created or returned, rather it says that the next *element* should be
returned.
And *not*:

Updates the Iterator to refer to the next element in the iteration,
and returns self.

Well, my next() does not return the iterator but the next element (just
as the Java Iterator).
 
J

James Kanze

On 2008-04-05 14:45, Sam wrote:

[...]
Don't know if you posted the correct link or if they just
teaches us to read better in college these days. :) I think
the correct implementation, given the link you posted, should
be
T& next() { ptr++; return *ptr; }
Which also shows the problem that James Kanze pointed out,
namely that you can not dereference the iterator more than
once without changing what it refers to.

T& next() { return *ptr ++ ; }

would be the correct implementation in C++. Except, of course,
that incrementing an iterator generally involves a lot more than
just incrementing a pointer. The whole performance discussion
is just so much hand waving to avoid discussing the real issues;
in practice, the only thing that could possibly cause a
measurable difference is the fact that Java uses virtual
functions, and even that can often be optimized away with a JIT
compiler. More to the point, of course, is that the work
arounds you need to make the STL iterators work (extra
variables, etc.) will often outweigh the cost of the virtual
function calls.

But of course, both patterns will be sufficiently fast for 99.9%
of any use. What really counts is the amount of extra
programmer work you have to do, and the loss of design
cleanliness and understanding that is introduced. And when
performance is an issue, of course, the time you save using a
well designed iterator will give you more time to address the
issue, and solve it correctly. If speed is important, you need
to encapsulate, profile and then attack the critical points.
And you need the time to do it, time which shouldn't be wasted
on premature optimization (which generally has a negative effect
on performance in the long run, because it so often breaks
encapsulation).
Ideally you would have two methods, one to increment and one to
dereference, like so:
typename T::value_type& getValue() const { return *ptr; }
void moveNext{ ptr++; }
Rewriting it all in a bit my C++ stylish way we get:
#include <vector>
#include <iostream>
template <class T>
struct Iter {
typedef typename T::iterator STLIter;
typedef typename T::value_type Val;
STLIter ptr, end;
Iter(STLIter begin, STLIter end) : ptr(begin), end(end) {}
operator bool() const { return ptr != end; }
Val& operator*() const { return *ptr; }
void operator++() { ptr++; }
};

Except that I don't really like the abuse of operator
overloading. What's wrong with:

template< typename T >
class Iter
{
public:
Iter( Container& c ) ;
bool isDone() const ;
T& element() const ;
void next() ;
}
int main()
{
std::vector<int> v;
for (int i = 0; i < 10; ++i)
v.push_back(i);
Iter<std::vector<int> > it(v.begin(), v.end());
while (it)
{
std::cout << *it;
++it;
}
}
Not too sure about the operator bool() thing though...

I am:). It's definitly an abuse. (But I've seen worse; I once
saw someone recommend overloading unary plus. So that the loop
would be:
for ( Iter i( container ) ; +i ; ++ i ) {
// ...
}
I far prefer nice clear names that say exactly what the function
is doing. Overloading ++ is arguable both ways, but the others
are blatant abuse.
 
O

Owen Jacobson

No, that's not correct. In java.util.Iterator, the next() method does not
modify the Iterator, it returns a new Iterator.

Seeing as I fundamentally agree that C++'s iterators, which are a
generalization of pointers, are a very solid concept and are very easy
to optimize back to being pointers when necessary, it's rather galling
to see you go so wildly wrong.

For your consideration, the following Java source code:

import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class FunWithIterators {
public static void main (String[] args) {

List<String> list = Arrays.asList ("One", "Two", "Three");

Iterator<String> i = list.iterator ();
System.out.printf ("i is %s (identity %d)\n", i, System
.identityHashCode (i));

String next = i.next ();
System.out.printf ("i.next() returned %s\n", next);
System.out.printf ("i is %s (identity %d)\n", i, System
.identityHashCode (i));

next = i.next ();
System.out.printf ("i.next() returned %s\n", next);
System.out.printf ("i is %s (identity %d)\n", i, System
.identityHashCode (i));
}
}

I'd like you to explain how the output of this program squares with
the statements you've made. For reference, public static int
System.identityHashCode(Object) is, for programs with fewer than 2**32
live objects, a unique identifier for the object passed in. You can
trust that, if it returns the same number for two references, that
those references point to the same object.

The signature of Iterator.next() in Java is, in full,

public T next ();

where T is the type of the values being iterated over. It is not,

public Iterator<T> next();

and there is no corresponding

public T getValue();

Iterators in Java act as cursors, and are mutated by access.

I'm not convinced either language's primary Iterator implementation --
C++'s pointer generalization, or Java's cursors -- is all that great.
I've a preference for intrinsic iteration, as with Smalltalk, Ruby, or
most functional languages, where a functor to apply to each element is
passed to the collection, which then applies it. That you can't
mutate the collection during iteration (easily) this way is a feature,
not a bug; there are other tools that are better for that.

-o

(And really, can't language trolls find something juicier to argue
about?)
 
L

Lasse Reichstein Nielsen

Sam said:
For starters, your implementation of next() is wrong. See
http://java.sun.com/javase/6/docs/api/java/util/Iterator.html. The
correct, equivalent, implementation would be:

Iter<T> next() const { Iter<T> n; n.ptr=ptr+1; n.end=end; return n }

Actually not. A Java iterator's next() method returns an element,
not an iterator. A closer match would be:

template <class T>
struct Iterator {
T::iterator ptr;
T::iterator end;
bool hasNext() const { return ptr != end; }
T next() { return *(ptr++); }
};

so that you can do:

for(Iterator<Something> iter = myCollection.iterator(); iter.hasNext();) {
Something element = iter.next();
// ...
}
Gosh, what the heck are they teaching in college, these days?

Java :)

/L
 
M

Martin York

Actually not. A Java iterator's next() method returns an element,
not an iterator. A closer match would be:

Just to be nit-picky :)
It should probably return a reference.

T& next() { return *(ptr++); }



Gosh the old bizzers :)
 
C

chsalvia

Overall, I usually prefer C++ iterators to Java style iterators for
subjective, aesthetic reasons. However, there are times when I
question the sanity of having two iterator objects instead of one.
Sam pointed out that Java iterators are much less efficient
constructs, (using virtual functions and frequent reallocation), but
the issue here I think is not so much the actual implementation, but
the interface. In other words, would it be better to implement Java-
style iterator constructs in C++?

Especially, in a case where you have a rather complex iterator object
that has a lot of member variables, and so takes up a lot more stack
space than a simple pointer. Usually, the "end" iterator object is
only there so you can compare a single member variable. The rest of
the member variables in the "end" iterator object are usually unused.
While there's probably not any measurable performance loss, the Java-
style interface would me more efficient in theory, because you
wouldn't have the extra "end" iterator with all the useless member
variables it stores.
 
J

James Kanze

Overall, I usually prefer C++ iterators to Java style
iterators for subjective, aesthetic reasons. However, there
are times when I question the sanity of having two iterator
objects instead of one.

That is, of course, the crux of the problem. Requiring two
objects to implement a single instance of a specific concept,
with each object only containing part of the information.
Sam pointed out that Java iterators are much less efficient
constructs, (using virtual functions and frequent reallocation),

Sam's just our local troll. I wouldn't pay too much attention
to what he says. In practice, I've not found Java's iterators
to be excessively slow. In the end, however, it will depend a
lot on the VM or the compiler optimizer.
but the issue here I think is not so much the actual
implementation, but the interface. In other words, would it
be better to implement Java- style iterator constructs in C++?

Why not implement something better than either, like the GoF
iterators?

There are actually two issues involved, the iterator interface,
and whether it uses copy or reference semantics. With regards
to the first, it seems fairly clear from a design point of view
that requiring two iterators isn't really very clean, and cause
problems. The choice between copy and reference semantics,
however, is much less clear. There are significant use cases
for both (and the input iterator concept in the STL makes
allowances for this---an input iterator basically has, or is
allowed to have, reference semantics). My own current policy is
to implement copy semantics, using the GoF pattern, and adding
an isEquals() function if possible, then derive from a template
using the Barton & Nackman trick to provide the STL
interface---at best, however, a forward iterator. This seems to
work well in practice, most of the time; in cases where I need
reference semantics, I can always pass a reference.

(Note that reference semantics more or less impose an in order
traversal; copy semantics give the implementation much more
freedom. This probably isn't a critical point at present, but
once we start getting compilers capable of automatically
parallelizing loops, it could start making a significant
difference in performance.)
Especially, in a case where you have a rather complex iterator
object that has a lot of member variables, and so takes up a
lot more stack space than a simple pointer. Usually, the
"end" iterator object is only there so you can compare a
single member variable. The rest of the member variables in
the "end" iterator object are usually unused. While there's
probably not any measurable performance loss, the Java- style
interface would me more efficient in theory, because you
wouldn't have the extra "end" iterator with all the useless
member variables it stores.

This is really part of the copy vs reference semantic issue,
more than the interface issue. The STL supposes that copy is
cheap, and copies a lot. From a performance point of view, this
isn't always the right solution, but performance is rarely the
issue. Copy and reference have different semantics. Sometimes
one is more appropriate, and sometimes another. There is no
"right" answer.
 
J

Jerry Coffin

Overall, I usually prefer C++ iterators to Java style iterators for
subjective, aesthetic reasons. However, there are times when I
question the sanity of having two iterator objects instead of one.

C++ 0x should be somewhat more to your liking then -- it introduces
Ranges. A range is essentially equivalent to a pair of iterators; it
represents a range of objects (typically, but not necessarily, in a
container).

[ ... ]
Especially, in a case where you have a rather complex iterator object
that has a lot of member variables, and so takes up a lot more stack
space than a simple pointer. Usually, the "end" iterator object is
only there so you can compare a single member variable. The rest of
the member variables in the "end" iterator object are usually unused.
While there's probably not any measurable performance loss, the Java-
style interface would me more efficient in theory, because you
wouldn't have the extra "end" iterator with all the useless member
variables it stores.

True. I think an iterator containing a lot of data is fairly unusual,
but if/when you need to create an iterator that contains a lot of data
that's never used, that's obviously not particularly efficient.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,175
Messages
2,570,944
Members
47,492
Latest member
gabbywilliam

Latest Threads

Top