vector swap time complexity

T

thomas

vector<int> A(2,0);
vector<int> B(3,0);
A.swap(B);
swap(A, B);
------------code------------

for simple "int" structure, the time complexity of "swap(&int,&int)"
is simply O(1);
but how about "swap" between two "vector" or "map"?
can it be still O(1)?
I think it can be O(1) since the implementation of "&" is similar to
"pointer", so swap can be done between two "pointers".
But I don't know what exactly the designers think. Any comments?
 
V

Victor Bazarov

thomas said:
vector<int> A(2,0);
vector<int> B(3,0);
A.swap(B);
swap(A, B);
------------code------------

for simple "int" structure, the time complexity of "swap(&int,&int)"
is simply O(1);

Actually, for any type T vector<T>.swap(othervector) has complexity of
O(1) - the data storages and other fields are exchanged between the
objects said:
but how about "swap" between two "vector" or "map"?
can it be still O(1)?

I am fairly certain that 'std::swap' is overloaded for vectors (and all
other standard containers) to call the member function 'swap' (which
I think it can be O(1) since the implementation of "&" is similar to
"pointer", so swap can be done between two "pointers".

No particular implementation of references is mandated. What makes you
believe that it's <<similar to "pointer">>? There is no way in the
language to "reseat" a reference. What would the implementation of
'std::swap' look like without that ability? Or do you think there is
some magic (assembly) trick the library implementors perform?
But I don't know what exactly the designers think. Any comments?

I don't know what exactly the designers think either.

V
 
S

Sudarshan Narasimhan

vector<int> A(2,0);
vector<int> B(3,0);
A.swap(B);
swap(A, B);
------------code------------

for simple "int" structure, the time complexity of "swap(&int,&int)"
is simply O(1);
but how about "swap" between two "vector" or "map"?
can it be still O(1)?
I think it can be O(1) since the implementation of "&" is similar to
"pointer", so swap can be done between two "pointers".
But I don't know what exactly the designers think. Any comments?


IMHO, if the pointers are swapped, you should be able to see a swap in
the addresses of the objects which have been swapped. It doesnt seem
to be the case. Also, looks like swap runs in constant time
irrespective of the type of the objects. So i suspect it to be a
memcpy between the two location with some buffer being used for
holding up during transfer. However, i havent seen the implementation,
i will let someone who knows clarify it up.
 
J

Jiøí Paleèek

vector<int> A(2,0);
vector<int> B(3,0);
A.swap(B);
swap(A, B);
------------code------------

for simple "int" structure, the time complexity of "swap(&int,&int)"
is simply O(1);
but how about "swap" between two "vector" or "map"?
can it be still O(1)?
I think it can be O(1) since the implementation of "&" is similar to
"pointer", so swap can be done between two "pointers".
But I don't know what exactly the designers think. Any comments?

Yes, swap() should be O(1) for the other data structures as well. It isn't
because of similarities between pointers and references (it cannot just
swap the references), but because a vector (or map) internally store some
pointer to the actual data (in case of a vector, it is a pointer to
continuous storage, in case of a map, it is a pointer to root of some
tree) and the (specialized) swap implementation just swaps those pointers.

If the developer hadn't implemented swap for a particular data structure
(ie. something not from STL), the time complexity would be Omega(N), more
precisely O(N)+(time needed to copy the contained objects). This applies
for structures that don't admit a fast implementation of swap as well
(like static array).

Regards
Jiri Palecek
 
T

thomas

Actually, for any type T vector<T>.swap(othervector) has complexity of
O(1) - the data storages and other fields are exchanged between the
objects, no data outside of the actual 'vector<T>' object moves.
Well, I am no sure if we are at the same page about the meaning of O
(1).
I suspect it can be O(n), where "n" refers to the size of the type T
(such as a vector).
Of course, it's always O(1) regarding to the whole object T.
I am fairly certain that 'std::swap' is overloaded for vectors (and all
other standard containers) to call the member function 'swap' (which
will do the swap quickly).  So, std::swap<std::vector<T> > is also of
O(1) complexity.
ok.

No particular implementation of references is mandated.  What makes you
believe that it's <<similar to "pointer">>?  There is no way in the
language to "reseat" a reference.  What would the implementation of
'std::swap' look like without that ability?  Or do you think there is
some magic (assembly) trick the library implementors perform?
Yeah, I suspect the implementors may perform some tricks to achieve
better time performance.
Actually I may find it difficult to balance if I was the designer.
I don't know what exactly the designers think either.
Then how can you know it's exactly O(1) if you don't know what's
inside?

Some code supporting the O(n) perspective:
------code-----------
vector<int> p(1,0);
for(int i=0; i<10; i++){
vector<int> q(2,0);
q.swap(p);
}
------code-----------
The only implementation trick to achieve O(1) is by swap "pointer"
like address.
But it surely cannot work here, because "q" in the stack vanishes out
of the "{}" scope.
So the address of "q" becomes invalid.
 
M

mschellens

IMHO, if the pointers are swapped, you should be able to see a swap in
the addresses of the objects which have been swapped. It doesnt seem
to be the case. Also, looks like swap runs in constant time
irrespective of the type of the objects. So i suspect it to be a
memcpy between the two location with some buffer being used for
holding up during transfer. However, i havent seen the implementation,
i will let someone who knows clarify it up.

The pointers *inside* the objects are swapped. The addresses of the
objects themself do not change of course.
marc
 
T

thomas

IMHO, if the pointers are swapped, you should be able to see a swap in
the addresses of the objects which have been swapped. It doesnt seem
to be the case. Also, looks like swap runs in constant time
irrespective of the type of the objects. So i suspect it to be a
memcpy between the two location with some buffer being used for
holding up during transfer. However, i havent seen the implementation,
i will let someone who knows clarify it up.

That's great!
 
T

thomas

Yes, swap() should be O(1) for the other data structures as well. It isn't  
because of similarities between pointers and references (it cannot just  
swap the references), but because a vector (or map) internally store some  
pointer to the actual data (in case of a vector, it is a pointer to  
continuous storage, in case of a map, it is a pointer to root of some  
tree) and the (specialized) swap implementation just swaps those pointers..
En.. I think I got it.
A vector can be implemented like this;
struct vector{
int *p; int len;
};
then swap can be implemented by swap the "pointer" "int *p" and "len".
A type vector is constructed in stack, and destructs when out of
scope.
Well. That makes sense.
 
M

Michael Doubez

Well, I am no sure if we are at the same page about the meaning of O
(1).
I suspect it can be O(n), where "n" refers to the size of the type T
(such as a vector).
Of course, it's always O(1) regarding to the whole object T.

From §23.1/5, I imagine the swap operation could be o(N) - because of
the infamous Note A - but requirements of §23.1/10 (i.e. swap doesn't
invalidate pointer, reference and iterators to elements) makes it
impossible in practice.
 
V

Victor Bazarov

Michael said:
From §23.1/5, I imagine the swap operation could be o(N) - because of
the infamous Note A - but requirements of §23.1/10 (i.e. swap doesn't
invalidate pointer, reference and iterators to elements) makes it
impossible in practice.

Why? The pointer is not invalidated, it just points to an element in
another container (and so does a reference or an iterator). Consider:

std::vector<int> onetwothree;
onetwothree.push_back(1);
onetwothree.push_back(2);
onetwothree.push_back(3);

int *p1 = &onetwothree[1];
assert(*p1 == 2);

std::vector<int> fivesixseven;
fivesixseven.push_back(5);
fivesixseven.push_back(6);
fivesixseven.push_back(7);

int *p2 = &fivesixseven[0];
assert(*p2 == 5);

std::swap(onetwothree, fivesixseven);

// the pointers are still valid, and even point to the same
// elements (by value) as before
assert(*p1 == 2);
assert(*p2 == 5);

Do the same exercise for references or iterators.

V
 
F

Francesco S. Carta

struct simple_vector
{
int *data;
int count;

};

Swapping two of these objects requires swapping their data pointers and
their counts. That's all. std::vector has a few more internal details,
but its data storage is simply a pointer and a count.

Hi,
just a passing by question.

Is the following code "good" to check an implementation for the above
behavior?

-------
#include <iostream>
#include <vector>

using namespace std;

void dump(vector<char>* pv) {
size_t* pc = reinterpret_cast<size_t*>(pv);
for (size_t i = 0, e = sizeof(*pv) / sizeof(size_t);
i < e;
++i) {
cout << *(pc + i) << " ";
}
cout << endl;
}

void print(vector<char>* v, vector<char>* w) {
cout << " v == ";
dump(v);
cout << "&v[0] == " << size_t(&(*v)[0]) << endl;
cout << " w == ";
dump(w);
cout << "&w[0] == " << size_t(&(*w)[0]) << endl;
cout << endl;
}

int main()
{
vector<char> v;
vector<char> w;

v.push_back('v');
w.push_back('w');

print(&v, &w);

cout << "swap(v, w);" << endl << endl;
swap(v, w);

print(&v, &w);

return 0;
}
-------

The output on my machine...

-------
v == 205960 205961 205961
&v[0] == 205960
w == 206088 206089 206089
&w[0] == 206088

swap(v, w);

v == 206088 206089 206089
&v[0] == 206088
w == 205960 205961 205961
&w[0] == 205960
-------

....seems to confirm it - but the problem here is that I'm not sure
about the dump() function and about sizeof() in particular.

Pardon the drift.

Best regards,
Francesco
____________________________
Francesco S. Carta, hobbyist
http://fscode.altervista.org
 
V

Victor Bazarov

Francesco said:
Hi,
just a passing by question.

Is the following code "good" to check an implementation for the above
behavior?

AFAICT, no.
-------
#include <iostream>
#include <vector>

using namespace std;

void dump(vector<char>* pv) {
size_t* pc = reinterpret_cast<size_t*>(pv);

Huh??? 8-O

I suppose you figured that the first member of 'vector<char>' has the
type 'size_t', and while it's private and you can't get to it using
normal member access, you're trying to "cheat" here and get that value
using 'reinterpret_cast'. Well, first off, this is an illegal use of
'reinterpret_cast', so your code (which is already non-portable) has
undefined behaviour. Second, why are you treating 'pc' as a pointer to
an array of 'size_t'? It would seem that it requires *all* members of
'vector<char>' to be (a) of type 'size_t' and (b) be placed sequentially
in memory. Neither is necessarily true, so you have undefined behaviour
*again* when you try to dereference (pc + i).
for (size_t i = 0, e = sizeof(*pv) / sizeof(size_t);
i < e;
++i) {
cout << *(pc + i) << " ";
}
cout << endl;
}
[..]

V
 
F

Francesco S. Carta

AFAICT, no.





Huh???   8-O

I suppose you figured that the first member of 'vector<char>' has the
type 'size_t', and while it's private and you can't get to it using
normal member access, you're trying to "cheat" here and get that value
using 'reinterpret_cast'.  Well, first off, this is an illegal use of
'reinterpret_cast', so your code (which is already non-portable) has
undefined behaviour.  Second, why are you treating 'pc' as a pointer to
an array of 'size_t'?  It would seem that it requires *all* members of
'vector<char>' to be (a) of type 'size_t' and (b) be placed sequentially
in memory.  Neither is necessarily true, so you have undefined behaviour
*again* when you try to dereference (pc + i).

Oh, yes, of course I know that it's a cheat! Sorry for not making this
clear from starters.

Also about dereferencing pc, I'm just using it as a cheat to look into
memory that doesn't belong to me.

I take your response as a confirmation that the code I posted is _not_
ethic ;-)

BTW, should I have preferred a C-style cast such as "(size_t*)pv"
instead of reinterpret_cast, in such a cheat?

Cheers,
Francesco
____________________________
Francesco S. Carta, hobbyist
http://fscode.altervista.org
 
M

Michael Doubez

Why?  The pointer is not invalidated, it just points to an element in
another container (and so does a reference or an iterator).  Consider:
[...]

My point was that a swap in o(n) is not likely to be possible (unless
you use some trick to copy the data and keep the old data somewhere to
be flush on next invalidating function call).
 
J

Jerry Coffin

[ ... ]
From §23.1/5, I imagine the swap operation could be o(N) - because of
the infamous Note A - but requirements of §23.1/10 (i.e. swap doesn't
invalidate pointer, reference and iterators to elements) makes it
impossible in practice.

Along with that, there's the requirement that vector::swap can't
throw an exception -- even if an operation on a contained element
would. Therefore, swap can't do any operations on the contained
elements.
 
F

Francesco S. Carta

Oh, yes, of course I know that it's a cheat! Sorry for not making this
clear from starters.

Also about dereferencing pc, I'm just using it as a cheat to look into
memory that doesn't belong to me.

I take your response as a confirmation that the code I posted is _not_
ethic ;-)

BTW, should I have preferred a C-style cast such as "(size_t*)pv"
instead of reinterpret_cast, in such a cheat?

Well, never mind. Actually, checking for the address of individual
elements accessible from operator[], as we both did, is enough to
verify that behavior.

Just as a side note, Victor, I understand the points you mentioned
about the members, their size and their disposition in memory, and I
understand that such kind of cheats isn't absolutely on-topic here.

I just like to fiddle with things to get better insights. For example,
seems that my implementation of std::vector stores objects in an
array, stores the address of the first element of the array in the
first member and the address of the element one-past-the-end as the
second or the third member. My "seems", of course, doesn't ensure
anything and the best way to know an implementation detail is to
actually read it.

Once more, sorry for the drift. All and just experimental purposes -
for what they are worth :-/

Cheers,
Francesco
____________________________
Francesco S. Carta, hobbyist
http://fscode.altervista.org
 
V

Victor Bazarov

Francesco said:
[..]
I take your response as a confirmation that the code I posted is _not_
ethic ;-)

I don't think I would use the term 'ethical' when it comes to code.
Ethics are rooted in morals, which is a trait of the society. The code
we write has very little to do with our society, it's between you, your
customer, and whoever gets to maintain your code. Usually problematic
code does not reach deep enough to have moral implications.
BTW, should I have preferred a C-style cast such as "(size_t*)pv"
instead of reinterpret_cast, in such a cheat?

No, never. C-style casts are notorious for being hard to spot. Use
'reinterpret_cast' or 'static_cast', at least whoever gets to maintain
your code can use "find in files" to identify questionable (potentially
dangerous) areas or techniques.

V
 
J

Jiøí Paleèek

Why?  The pointer is not invalidated, it just points to an element in
another container (and so does a reference or an iterator).  Consider:
[...]

My point was that a swap in o(n) is not likely to be possible (unless
you use some trick to copy the data and keep the old data somewhere to
be flush on next invalidating function call).

Why? The "old data" is kept in the other vector (the one swapped with).
 
V

Victor Bazarov

Francesco said:
[..]
Just as a side note, Victor, I understand the points you mentioned
about the members, their size and their disposition in memory, and I
understand that such kind of cheats isn't absolutely on-topic here.

Oh, don't get me wrong. The approach you used and the discussion that
flows from it, are on topic. One's being wrong (or one's technique's
being non-portable or causing undefined behaviour) does not make one's
post off-topic. Actually, au contraire. I would rather have a chance
to discuss the approach and its portability, and provide my point of
view, and have my opinion questioned, than have a sterile newsgroup with
only "ethical" code posted.
I just like to fiddle with things to get better insights. [..]

And that's good!

V
 
F

Francesco S. Carta

Francesco said:
[..]
I take your response as a confirmation that the code I posted is _not_
ethic ;-)

I don't think I would use the term 'ethical' when it comes to code.
Ethics are rooted in morals, which is a trait of the society. The code
we write has very little to do with our society, it's between you, your
customer, and whoever gets to maintain your code. Usually problematic
code does not reach deep enough to have moral implications.

Ah, yes, one says "ethical" and not "ethic" as an adjective, thanks
for the reminder.

The meaning I was expressing wasn't tied to society, but to the
language's "morals".
I mean, if a class declares something to be private, it isn't moral to
peek into it, and I would be breaking my "respectful programmer" ethic
by doing it. Nothing more that this ear-pulled analogy ;-)

I'm going to dig it a bit further in my next post.
No, never. C-style casts are notorious for being hard to spot. Use
'reinterpret_cast' or 'static_cast', at least whoever gets to maintain
your code can use "find in files" to identify questionable (potentially
dangerous) areas or techniques.

Thanks for the clarification. I'll stick to the various _cast
operators wherever I'll need a cast.

Francesco
____________________________
Francesco S. Carta, hobbyist
http://fscode.altervista.org
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,159
Messages
2,570,879
Members
47,413
Latest member
ReeceDorri

Latest Threads

Top