How does std::set implement iterator through red black tree?

F

Fei Liu

This is a little more advanced topic. I am trying to understand why
erase on a set implemented through red black tree only invalidates the
iterator being erased? It'd seem to me when the RBTree needs to be
re-balanced, the ordering of iteration may change...What gives?

Fei
 
V

Victor Bazarov

Fei said:
This is a little more advanced topic. I am trying to understand why
erase on a set implemented through red black tree only invalidates the
iterator being erased? It'd seem to me when the RBTree needs to be
re-balanced, the ordering of iteration may change...What gives?

Use the Source, Luke. Look at the implementation of 'std::set' in
your compiler include directories. Start with <set> header. It is
not guaranteed that those are files, but they usually are.

V
 
O

Ole Nielsby

Fei Liu said:
This is a little more advanced topic. I am trying to understand why erase
on a set implemented through red black tree only invalidates the iterator
being erased? It'd seem to me when the RBTree needs to be re-balanced, the
ordering of iteration may change...What gives?

It works because set iterators are dumb. An iterator is just a pointer
to a node; it doesn't remember how it got there and has no plan where
to go next. When incremented, it will check the links to the surronding
nodes and act accordingly; and this will get you to the node with the
next higher key, no matter how the tree is balanced, as long as it is in
a well-defined state.

If one thread iterates while another is rebalancing, things might go
wrong because the tree may be in an interim ill-defined state (think
of a monkey jumping towards a branch that suddenly swivels to the
other side of the trunk while the poor animal is in mid air). So you
need to synchronize if several threads use a std::set: simultaneously.
 
K

Kai-Uwe Bux

Fei said:
This is a little more advanced topic. I am trying to understand why
erase on a set implemented through red black tree only invalidates the
iterator being erased? It'd seem to me when the RBTree needs to be
re-balanced, the ordering of iteration may change...What gives?

Re-balancing does not require nodes to move. What changes is the internal
pointer structure. Thus, pointers from the outside to tree-nodes are still
valid for those nodes that still exits. An iterator in a set usually is
nothing but a pointer to a node. Re-balancing may have changed the
parent-pointer and the left- and right-child pointers in that node; and
once you increment the iterator, you will see the changes taking effect.


Best

Kai-Uwe Bux
 
F

Fei Liu

Ole said:
It works because set iterators are dumb. An iterator is just a pointer
to a node; it doesn't remember how it got there and has no plan where
to go next. When incremented, it will check the links to the surronding
nodes and act accordingly; and this will get you to the node with the
next higher key, no matter how the tree is balanced, as long as it is in
a well-defined state.

If one thread iterates while another is rebalancing, things might go
wrong because the tree may be in an interim ill-defined state (think
of a monkey jumping towards a branch that suddenly swivels to the
other side of the trunk while the poor animal is in mid air). So you
need to synchronize if several threads use a std::set: simultaneously.
Thanks,

What about the unordered set or map? How is iteration defined for this
type of containers?
 
F

Fei Liu

Victor said:
Use the Source, Luke. Look at the implementation of 'std::set' in
your compiler include directories. Start with <set> header. It is
not guaranteed that those are files, but they usually are.

V
Unfortunately, in my FC5 distro, the key part _Rb_increment_ptr (which
iterator ++ is based upon) is in a binary file...
 
V

Victor Bazarov

Fei said:
Unfortunately, in my FC5 distro, the key part _Rb_increment_ptr (which
iterator ++ is based upon) is in a binary file...

There is always STLport...

V
 
F

Fei Liu

Fei said:
This is a little more advanced topic. I am trying to understand why
erase on a set implemented through red black tree only invalidates the
iterator being erased? It'd seem to me when the RBTree needs to be
re-balanced, the ordering of iteration may change...What gives?

Fei
I took a reverse engineering approach by testing the blackbox
behavior...Here is what I got and something is a little unexpected here:


#include <set>
#include <iostream>

using namespace std;

void display(const set<int> & s){
cout << "node value: ";
set<int>::const_iterator it = s.begin();
for(;it != s.end(); ++it) cout << " " << *it;
cout << endl;
}

int main(){

set<int> s;
s.insert(4);
s.insert(1);
s.insert(10);
s.insert(2);
s.insert(3);
s.insert(5);

display(s);

s.erase(3);
display(s);

s.insert(3);
display(s);

cout << "node value: ";
set<int>::const_iterator it = s.begin();
while(it != s.end()){
if(*it == 3) s.erase(it++);
else ++it;
if(it != s.end()) cout << " " << *it;
}
cout << endl;
display(s);
}
../test_set_iter
node value: 1 2 3 4 5 10
node value: 1 2 4 5 10
node value: 1 2 3 4 5 10
node value: 2 3 4 5 10
node value: 1 2 4 5 10

It seems like although erase is done correctly but I am getting a '3' in
the fourth row of output. There seems be a incoherence between the
iterator and the state of the iteration...The red black tree is lying at
what node it's currently traversing the tree?!

Fei
 
?

=?ISO-8859-1?Q?Erik_Wikstr=F6m?=

Thanks,

What about the unordered set or map? How is iteration defined for this
type of containers?

Unordered map are, IIRC, hash-tables* which means that they are
basically an array of lists, so you go through all elements in the array
and all elements in the lists found in the array. There are some other
ways to implement hash-tables which might give slightly different ways
of iterating.

* They go by the name of unordered map to avoid collisions with already
existing, vendor specific, hashmap implementations.
 
?

=?ISO-8859-1?Q?Erik_Wikstr=F6m?=

I took a reverse engineering approach by testing the blackbox
behavior...Here is what I got and something is a little unexpected here:


#include <set>
#include <iostream>

using namespace std;

void display(const set<int> & s){
cout << "node value: ";
set<int>::const_iterator it = s.begin();
for(;it != s.end(); ++it) cout << " " << *it;
cout << endl;
}

int main(){

set<int> s;
s.insert(4);
s.insert(1);
s.insert(10);
s.insert(2);
s.insert(3);
s.insert(5);

display(s);

s.erase(3);
display(s);

s.insert(3);
display(s);

cout << "node value: ";
set<int>::const_iterator it = s.begin();
while(it != s.end()){
if(*it == 3) s.erase(it++);
else ++it;
if(it != s.end()) cout << " " << *it;
}
cout << endl;
display(s);
}
./test_set_iter
node value: 1 2 3 4 5 10
node value: 1 2 4 5 10
node value: 1 2 3 4 5 10
node value: 2 3 4 5 10
node value: 1 2 4 5 10

It seems like although erase is done correctly but I am getting a '3' in
the fourth row of output. There seems be a incoherence between the
iterator and the state of the iteration...The red black tree is lying at
what node it's currently traversing the tree?!

Yes, I was a bit stumped too first, but it's because you print the
element in the iteration before you chech *it == 3, so first you print
3, then you take another turn in the loop and erase it. So there's
nothing wrong with the code, it just does not do what you thought it
would, try walking through the code with a debugger and you'll see.
 
K

Kai-Uwe Bux

Fei said:
I took a reverse engineering approach by testing the blackbox
behavior...Here is what I got and something is a little unexpected here:


#include <set>
#include <iostream>

using namespace std;

void display(const set<int> & s){
cout << "node value: ";
set<int>::const_iterator it = s.begin();
for(;it != s.end(); ++it) cout << " " << *it;
cout << endl;
}

int main(){

set<int> s;
s.insert(4);
s.insert(1);
s.insert(10);
s.insert(2);
s.insert(3);
s.insert(5);

display(s);

s.erase(3);
display(s);

s.insert(3);
display(s);

cout << "node value: ";
set<int>::const_iterator it = s.begin();
while(it != s.end()){
if(*it == 3) s.erase(it++);
else ++it;
if(it != s.end()) cout << " " << *it;
}
cout << endl;
display(s);
}
./test_set_iter
node value: 1 2 3 4 5 10
node value: 1 2 4 5 10
node value: 1 2 3 4 5 10
node value: 2 3 4 5 10
node value: 1 2 4 5 10

It seems like although erase is done correctly but I am getting a '3' in
the fourth row of output. There seems be a incoherence between the
iterator and the state of the iteration...The red black tree is lying at
what node it's currently traversing the tree?!

Nope: note that you don't get a 1 in the 4th row. What happens is that the
iterator is incremented before *it is printed. Consequently, you get the 3
in the loop-iteration where it starts out pointing at 2, in which case no
erase() happens. The next loop erases 3. But that is too late to show in
the output.


Best

Kai-Uwe Bux
 
F

Fei Liu

Erik said:
Yes, I was a bit stumped too first, but it's because you print the
element in the iteration before you chech *it == 3, so first you print
3, then you take another turn in the loop and erase it. So there's
nothing wrong with the code, it just does not do what you thought it
would, try walking through the code with a debugger and you'll see.
You both are correct. This makes a nasty mind boggling interview question.

Fei
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,961
Messages
2,570,130
Members
46,689
Latest member
liammiller

Latest Threads

Top