How to pass STL containers (say, a vector)?

Sanjay Kumar

Folks,

I am getting back into C++ after a long time and I have
this simple question: How do you pass an STL container
like, say, a vector or a map (to and from a function)?

function:

vector<string> tokenize(string s){

    vector<string> myvector;
    //split s and push_back into myvector;

    //is this ok ? vector destroyed on exit from function ?
    return myvector;
}

main:
vector<string> result = tokenize(s);

For it to work, there has to be a deep copy of the vector inside the
function (myvector) into the "result" vector before myvector is destroyed.
Is that how it works? Could this be inefficient if there is a large amount
of data to be copied from the container?

In that case should I use pointers? Or, more likely, an auto_ptr to the
container? Like below:

function:

auto_ptr<vector<string> > tokenize(string s){

    auto_ptr<vector<string> > myvector(new vector<string>);
    //split s and push_back with (*myvector).push_back(xx);

    //now return auto_ptr
    return myvector;
}

main:
vector<string> result = tokenize(s);

Or is this overkill (and maybe even incorrect)?

I have read about passing iterators instead. How would you go about this
with iterators?

Any help would be appreciated.

Thank you,

-Sanjay Kumar
 
peter koch

Sanjay said:
Folks,

I am getting back into C++ after a long time and I have
this simple question: How do you pass an STL container
like, say, a vector or a map (to and from a function)?

Prefer to return by value and pass by const reference.
function:

vector<string> tokenize(string s){

    vector<string> myvector;
    //split s and push_back into myvector;

    //is this ok ? vector destroyed on exit from function ?
    return myvector;
}

main:
vector<string> result = tokenize(s);

For it to work, there has to be a deep copy of the vector inside the
function (myvector) into the "result" vector before myvector is destroyed.
Is that how it works?
Most likely not (at least in a non-debug build). RVO (google for that
one) will kick in and remove the redundant copy. This is the case for
all modern (2000 or later) compilers I know.
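
To make the return-by-value form concrete, here is a minimal sketch of a complete tokenize. Splitting on whitespace with std::istringstream is an assumption for illustration only; the original post leaves the splitting rule open. tokenize_example is a hypothetical helper that just shows the call site.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Splits s on whitespace and returns the tokens by value.
// Whitespace splitting is an assumption; the thread does not say how to split.
std::vector<std::string> tokenize(const std::string& s) {
    std::vector<std::string> tokens;
    std::istringstream in(s);
    std::string word;
    while (in >> word) {
        tokens.push_back(word);
    }
    return tokens;  // a named local: the compiler may elide the copy (NRVO)
}

// Hypothetical call site: result is initialised from the returned value.
void tokenize_example() {
    std::vector<std::string> result = tokenize("return by value");
    assert(result.size() == 3);
    assert(result[0] == "return");
}
```

Whether the copy is actually elided depends on the compiler and its optimisation settings, which is what the rest of the thread argues about.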
Could this be inefficient if there is a large amount
of data to be copied from the container?

It could if your compiler can't optimise (which I doubt). If it can't
and you spend too much time returning your container, pass the
return value by reference and finish with a swap instead of the return:

void tokenize(string const& s,vector<string>& result){

    vector<string> myvector;
    //split s and push_back into myvector;

    //hand the contents to the caller; myvector is destroyed on exit
    std::swap(result,myvector);
    return;
}

Notice that the function now is not so easy to use. Also, it will most
likely be slightly slower than the original function.
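
Filled in, the swap-based variant might look like the sketch below. The whitespace splitting and the helper name tokenize_into_example are assumptions made for illustration; only the signature and the final swap come from the post above.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Out-parameter variant: build the tokens in a local vector, then swap
// them into result. Whitespace splitting is an assumption.
void tokenize(const std::string& s, std::vector<std::string>& result) {
    std::vector<std::string> tokens;
    std::istringstream in(s);
    std::string word;
    while (in >> word) {
        tokens.push_back(word);
    }
    result.swap(tokens);  // constant time; the old contents die with tokens
}

// Hypothetical call site: the caller has to declare the vector first.
void tokenize_into_example() {
    std::vector<std::string> result(1, "stale");
    tokenize("pass by reference", result);
    assert(result.size() == 3);
    assert(result[0] == "pass");
}
```

The extra declaration at the call site is the usability cost mentioned above: the result can no longer be used directly in an expression.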
In that case should I use pointers? Or, more likely, an auto_ptr to the
container? Like below:

Never!

function:

auto_ptr<vector<string> > tokenize(string s){

    auto_ptr<vector<string> > myvector(new vector<string>);
    //split s and push_back with (*myvector).push_back(xx);

    //now return auto_ptr
    return myvector;
}

main:
vector<string> result = tokenize(s);

Or is this overkill (and maybe even incorrect)?
This isn't even legal C++: you are assigning a std::auto_ptr to a
std::vector.
I have read about passing iterators instead. How would you go about this
with iterators?

Iterators can be useful for passing ranges to a function. It is not
faster than passing the container by constant reference, but it is more
flexible in case you do not want to pass an entire container.
Iterators can also be useful when you do not want to return a
container, but would rather, for example, append some values. For now, I
recommend that you stick to containers.
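
As a rough illustration of that flexibility, the sketch below offers the same operation in both forms. total_length and total_length_example are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Iterator form: sums the string lengths in [first, last), so it also
// works on part of a container.
template <typename InIt>
std::size_t total_length(InIt first, InIt last) {
    std::size_t n = 0;
    for (; first != last; ++first) {
        n += first->size();
    }
    return n;
}

// Const-reference form: always processes the whole container.
std::size_t total_length(const std::vector<std::string>& v) {
    return total_length(v.begin(), v.end());
}

void total_length_example() {
    std::vector<std::string> v;
    v.push_back("ab");
    v.push_back("cde");
    v.push_back("f");
    assert(total_length(v) == 6);
    assert(total_length(v.begin() + 1, v.end()) == 4);  // skip the first element
}
```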

/Peter
 
tragomaskhalos

peter said:
Most likely not (at least in a non-debug build). RVO (google for that
one) will kick in and remove the redundant copy. This is the case for
all modern (2000 or later) compilers I know.

I don't think that the compiler that comes with MS VisualStudio 2003
does RVO, even for simple cases.
It could if your compiler can't optimise (which I doubt). If it can't
and you spend too much time returning your container, pass the
return value by reference and finish with a swap instead of the return:
void tokenize(string const& s,vector<string>& result){

Yeah this is ugly but it works - it just comes down to whether you
prefer to optimise (prematurely ?) or retain a more natural syntax.
Sometimes you can recast a function from "getCollection" to
"fillCollection" to make the optimised form look more natural. Anyway,
roll on move constructors !
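
Move constructors did eventually arrive with C++11. The sketch below assumes a C++11 or later compiler and shows what they buy: move-constructing a vector takes over the source's buffer instead of copying the elements. move_example is a hypothetical name used only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Requires C++11 or later (std::move, vector::data).
void move_example() {
    std::vector<std::string> source(1000, "token");
    const std::string* buffer = source.data();  // remember where the elements live

    // Move construction: target adopts the buffer, no element is copied.
    std::vector<std::string> target(std::move(source));

    assert(target.size() == 1000);
    assert(target.data() == buffer);  // same storage as before the move
}
```

With this in place, returning a local vector by value is cheap even when the compiler does not elide the copy, because the return falls back on the move constructor.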
 
Marek Vondrak

tragomaskhalos said:
I don't think that the compiler that comes with MS VisualStudio 2003
does RVO, even for simple cases.

Agreed. The same goes for MSVC7.0 and earlier.
-- Marek
 
Daniel T.

Sanjay Kumar said:
Folks,

I am getting back into C++ after a long time and I have
this simple question: How do you pass an STL container
like, say, a vector or a map (to and from a function)?

The same way the standard algorithms do. 'std::copy' for example accepts
a container as an input param, and another container as an output param.
 
Bernd Strieder

Hello,

Sanjay said:
I am getting back into C++ after a long time and I have
this simple question: How do you pass an STL container
like, say, a vector or a map (to and from a function)?

You could instead have the caller create the vector and pass a
back_insert_iterator to the called function. That way you could still
change the container without having to change the called function, by
just passing another insert iterator. To be fair, to get this
advantage you have to make the called function a template on the
type of the insert iterator.
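
A minimal sketch of that approach, assuming whitespace splitting (the thread does not specify a rule). The function is a template on the output iterator type, so the same code fills a vector or a list; tokenize_out_example is a hypothetical helper.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <sstream>
#include <string>
#include <vector>

// Writes each token through out; the caller decides where the tokens go.
template <typename OutIt>
OutIt tokenize(const std::string& s, OutIt out) {
    std::istringstream in(s);
    std::string word;
    while (in >> word) {
        *out++ = word;
    }
    return out;
}

void tokenize_out_example() {
    std::vector<std::string> v;
    tokenize("one two three", std::back_inserter(v));

    std::list<std::string> l;  // a different container, same function
    tokenize("one two three", std::back_inserter(l));

    assert(v.size() == 3 && l.size() == 3);
    assert(v.back() == l.back());
}
```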

If vector is the only viable container forever, then just pass it
by reference or by pointer to save the deep copy. Again, changing
the container type later could be made easier by making the
container type a template parameter.

Bernd Strieder
 
peter koch

tragomaskhalos wrote:
I don't think that the compiler that comes with MS VisualStudio 2003
does RVO, even for simple cases.

It surely does. VC 6.0 also did, if I remember correctly. Remember
that not every build will apply RVO: you have to enable at
least some optimisations.

/Peter
 
peter koch

Daniel T. wrote:
The same way the standard algorithms do. 'std::copy' for example accepts
a container as an input param, and another container as an output param.

Well... that one is a bad example. std::copy does not return a
collection. Actually, I can't remember a single standard algorithm that
does so (but I am tired and might well be wrong).
In my opinion you need very strong arguments (and those arguments
include measured improvements) in order not to return by value.

/Peter
 
Roland Pibinger

How do you pass an STL container
like, say, a vector or a map (to and from a function)?

If you obey the STL value-semantics dogma you pass everything by
value. If not, you use the most efficient way.
function:

vector<string> tokenize(string s){

vector<string> myvector;
//split s and push_back into myvector;

//is this ok ? vector destroyed on exit from function ?
return myvector;
}

main:
vector<string> result = tokenize(s);

For it to work, there has to be a deep copy of the vector inside the
function (myvector) into the "result" vector before myvector is destroyed.
Is that how it works? Could this be inefficient if there is a large amount
of data to be copied from the container?

Yes, it can be inefficient. See also
http://groups.google.com/group/comp.lang.c++.moderated/browse_frm/thread/aa4daafacd01ce26
especially the subthread that starts with John Potter's reply and which
also covers performance aspects.

Best wishes,
Roland Pibinger
 
Daniel T.

peter koch said:
Daniel T. wrote:


Well... that one is a bad example. std::copy does not return a
collection. Actually, I can't remember a single standard algorithm that
does so (but I am tired and might well be wrong).
In my opinion you need very strong arguments (and those arguments
include measured improvements) in order to not return by value.

I beg to differ, std::copy does return a container in its own way...

vector<int> foo;
copy( istream_iterator<int>( cin ), istream_iterator<int>(),
back_inserter( foo ) );

The data in foo was returned...

In other words when you want to pass in a container:

template < typename InIt >
void func( InIt first, InIt last );

when you want to return a container:

template < typename OutIt >
void func( OutIt first );
 
Marek Vondrak

I don't think that the compiler that comes with MS VisualStudio 2003
It surely does. VC 6.0 also did if I do not remember wrong. Remember
that not all compiles will result in RVO.... you have to enable at
least some optimisations.

I beg to disagree (sorry for being off-topic). Maybe this all depends on
the definition of what RVO is, but the Microsoft compilers have always behaved
like this (at least MSVC6, 7 and 7.1): either the whole function is inlined
and the temporary is eventually eliminated (this is not RVO), or a function
call is made and the temporary is not eliminated. This is demonstrated by
the following dumb test case:

-- test.cpp --

class Vector
{
public:
Vector( float x, float y, float z );
Vector( const Vector & v );
~Vector();
Vector operator+( const Vector & v );
float * array;
};

Vector::Vector( float x, float y, float z ) :
array( new float[ 3 ] )
{
}

Vector::Vector( const Vector & v ) :
array( new float[ 3 ] )
{
}

Vector::~Vector()
{
delete array; // note: memory obtained with new[] should be released with delete[]
}

Vector Vector::operator+( const Vector & v )
{
Vector u( 1, 2, 3 );
return u;
}

-- cut --
cl test.cpp /GX /Ox /Ob2 /Fa /c

; Function compile flags: /Ogty
xdata$x ENDS
_TEXT SEGMENT
$T325 = -20 ; size = 4
_u$ = -16 ; size = 4
__$EHRec$ = -12 ; size = 12
___$ReturnUdt$ = 8 ; size = 4
_v$ = 12 ; size = 4
??HVector@@QAE?AV0@ABV0@@Z PROC NEAR ; Vector::operator+
; _this$ = ecx
; Line 27
push -1
push __ehhandler$??HVector@@QAE?AV0@ABV0@@Z
mov eax, DWORD PTR fs:__except_list
push eax
mov DWORD PTR fs:__except_list, esp
sub esp, 8
push esi
push edi
; Line 28
push 12 ; 0000000cH
mov DWORD PTR $T325[esp+32], 0
call ??2@YAPAXI@Z ; operator new
mov esi, eax
mov DWORD PTR _u$[esp+32], esi
; Line 29
push 12 ; 0000000cH
mov DWORD PTR __$EHRec$[esp+44], 1
call ??2@YAPAXI@Z ; operator new
mov edi, DWORD PTR ___$ReturnUdt$[esp+32]
mov DWORD PTR [edi], eax
push esi
mov DWORD PTR $T325[esp+40], 1
mov BYTE PTR __$EHRec$[esp+48], 0
call ??3@YAXPAX@Z ; operator delete
; Line 30
mov ecx, DWORD PTR __$EHRec$[esp+40]
add esp, 12 ; 0000000cH
mov eax, edi
pop edi
pop esi
mov DWORD PTR fs:__except_list, ecx
add esp, 20 ; 00000014H
ret 8
_TEXT ENDS
text$x SEGMENT
$L323:
lea ecx, DWORD PTR _u$[ebp]
jmp ??1Vector@@QAE@XZ ; Vector::~Vector
$L324:
mov eax, DWORD PTR $T325[ebp]
and eax, 1
je $L326
and DWORD PTR $T325[ebp], -2 ; fffffffeH
mov ecx, DWORD PTR ___$ReturnUdt$[ebp-4]
jmp ??1Vector@@QAE@XZ ; Vector::~Vector
$L326:
ret 0
__ehhandler$??HVector@@QAE?AV0@ABV0@@Z:
mov eax, OFFSET FLAT:$T345
jmp ___CxxFrameHandler
text$x ENDS
??HVector@@QAE?AV0@ABV0@@Z ENDP ; Vector::operator+
 
peter koch

Daniel T. wrote:
I beg to differ, std::copy does return a container in its own way...

vector<int> foo;
copy( istream_iterator<int>( cin ), istream_iterator<int>(),
back_inserter( foo ) );

The data in foo was returned...
So you mean that the data was returned in foo? In that case we simply
have a different perception of "returning values". To me, std::copy
does not return data in foo.

Perhaps this example better demonstrates what I mean?
vector<int> foo;
foo.push_back(117);
copy( istream_iterator<int>( cin ), istream_iterator<int>(),
back_inserter( foo ) );

copy definitely does not return its data in foo.
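
A runnable version of that snippet makes the behaviour easy to check. As an assumption for illustration, a std::istringstream stands in for cin so the input is fixed; copy_example is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <sstream>
#include <vector>

void copy_example() {
    std::istringstream in("1 2 3");  // stands in for cin
    std::vector<int> foo;
    foo.push_back(117);

    // std::copy writes through the inserter; it does not replace foo.
    std::copy(std::istream_iterator<int>(in), std::istream_iterator<int>(),
              std::back_inserter(foo));

    assert(foo.size() == 4);     // 117 plus the three values read
    assert(foo.front() == 117);  // the earlier element is still there
    assert(foo.back() == 3);
}
```

The values read are appended after the existing element, so afterwards foo holds more than what copy produced.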
In other words when you want to pass in a container:

template < typename InIt >
void func( InIt first, InIt last );

Fine! And now let func remove the second element.
when you want to return a container:

template < typename OutIt >
void func( OutIt first );


It still does not return a container.
template <class container> void normalise_container(container const& c);
template <class container> void print_container(container const& c);
print_container(normalise_container(func(???)));


/Peter
 
peter koch

Marek Vondrak wrote:
I dare not to agree (sorry for being offtopic). Maybe this all depends on
Happily this group is not moderated ;-)
the definition of what RVO is but the Microsoft compilers always behaved
like this (at least MSVC6, 7 and 7.1). Either the whole function is inlined
and the temporary is eventually eliminated (this is not a RVO) or a function
call is made and the temporary is not eliminated. This is demonstrated by
the following dumb test case:

-- test.cpp --
[snipped]
I must admit I had some trouble following your assembly listing (and
also your program: why write the code inside a class, and why
operator+?). In particular I failed to see where operator+ (which must
be the interesting function) copied the result back and where the
local copy got destroyed. Instead, I made my own program ;-) :

#include <iostream>

struct test
{
test();
test(test const& rhs);
int i;
};

test::test() { std::cout << "test::test\n";}

test::test(test const& rhs) { std::cout << "test::test(test const& rhs)\n";}

test func()
{
test t;
return t;
}


test func2(test const& tf)
{
test t(tf);
return t;
}


int main()
{
test t1(func());
test t2(func2(t1));
}

The program prints
"test::test
test::test(test const& rhs)"
which is exactly what you'd expect with RVO. Assembly for the two
test-functions:

PUBLIC ?func@@YA?AUtest@@XZ ; func
; Function compile flags: /Ogtpy
; COMDAT ?func@@YA?AUtest@@XZ
_TEXT SEGMENT
___$ReturnUdt$ = 8 ; size = 4
?func@@YA?AUtest@@XZ PROC ; func, COMDAT

; 15 : {

00000 56 push esi

; 16 : test t;

00001 8b 74 24 08 mov esi, DWORD PTR ___$ReturnUdt$[esp]
00005 8b ce mov ecx, esi
00007 e8 00 00 00 00 call ??0test@@QAE@XZ ; test::test

; 17 : return t;

0000c 8b c6 mov eax, esi
0000e 5e pop esi

; 18 : }

0000f c3 ret 0
?func@@YA?AUtest@@XZ ENDP ; func
_TEXT ENDS
PUBLIC ?func2@@YA?AUtest@@ABU1@@Z ; func2
; Function compile flags: /Ogtpy
; COMDAT ?func2@@YA?AUtest@@ABU1@@Z
_TEXT SEGMENT
___$ReturnUdt$ = 8 ; size = 4
_tf$ = 12 ; size = 4
?func2@@YA?AUtest@@ABU1@@Z PROC ; func2, COMDAT

; 23 : test t(tf);

00000 8b 44 24 08 mov eax, DWORD PTR _tf$[esp-4]
00004 56 push esi
00005 8b 74 24 08 mov esi, DWORD PTR ___$ReturnUdt$[esp]
00009 50 push eax
0000a 8b ce mov ecx, esi
0000c e8 00 00 00 00 call ??0test@@QAE@ABU0@@Z ; test::test

; 24 : return t;

00011 8b c6 mov eax, esi
00013 5e pop esi

; 25 : }

which verifies that RVO indeed is in effect. This is for Microsoft
Visual Studio 2005
Version 8.0.50727.42 (RTM.050727-4200) as I do not have the older
Visual C++ compilers at home, but you can copy/paste and easily verify
my code.

/Peter
 
Daniel T.

peter koch said:
Daniel T. wrote:


So you mean that the data was returned in foo? In that case we simply
have a different perception of "returning values". To me, std::copy
does not return data in foo.

What about transform?
Perhaps this example better demonstrates what i mean?
vector<int> foo;
foo.push_back(117);
copy( istream_iterator<int>( cin ), istream_iterator<int>(),
back_inserter( foo ) );

copy definitely does not return its data in foo.

How so? All the data collected inside the copy function is given to the
caller through foo...
Fine! And now let func remove the second element.

The same way std::remove does it.
It still does not return a container.
template <class container> void normalise_container(container const& c);
template <class container> void print_container(container const& c);
print_container(normalise_container(func(???)));

template < typename FwIt > void normalize( FwIt first, FwIt last );
template < typename FwIt > void print( FwIt first, FwIt last ) {
copy( first, last, ostream_iterator<int>( cout, " " ) );
}

normalize( vec.begin(), vec.end() );
print( vec.begin(), vec.end() );
 
peter koch

Daniel T. wrote:
peter koch said:
Daniel T. wrote:
Well... I just noticed the line above. I agree that std::copy might make
the data copied available somehow. The difference is one of words. I
meant return as used in a C++ program, whereas you seemingly mean return
in the sense that the data will afterwards be available to the caller.
What about transform?


How so? All the data collected inside the copy function is given to the
caller through foo...

Surely. But std::copy did not return the data - it simply put them into
some iterator.
The same way std::remove does it.
That does not remove the data from the container.
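
A small sketch of the point about std::remove: the algorithm only shifts the kept elements forward and returns the new logical end, while the container keeps its size until erase is called. remove_example is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

void remove_example() {
    std::vector<int> v;
    for (int i = 1; i <= 5; ++i) {
        v.push_back(i % 2);  // v holds 1 0 1 0 1
    }

    // Shifts the non-zero elements to the front; v itself is not resized.
    std::vector<int>::iterator new_end = std::remove(v.begin(), v.end(), 0);
    assert(v.size() == 5);
    assert(new_end - v.begin() == 3);

    // Only the container can actually drop the leftover tail.
    v.erase(new_end, v.end());
    assert(v.size() == 3);
}
```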
template < typename FwIt > void normalize( FwIt first, FwIt last );
template < typename FwIt > void print( FwIt first, FwIt last ) {
copy( first, last, ostream_iterator<int>( cout, " " ) );
}

normalize( vec.begin(), vec.end() );
print( vec.begin(), vec.end() );
You get the same functionality but with an added complexity:
std::vector<int> vec;
func(vec);
normalize( vec.begin(), vec.end() );
print( vec.begin(), vec.end() );

(forgetting that vec is still in scope).

Versus:
print(normalise(func()));

One simple line. Efficient, leaves no mess.
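
For comparison, a sketch of how the one-line form could be made to compile. The names func, normalise and print come from the posts above, but their bodies here are assumptions made up for illustration (normalise sorts and removes duplicates; print takes an optional stream so the output can be checked).

```cpp
#include <algorithm>
#include <cassert>
#include <iostream>
#include <iterator>
#include <sstream>
#include <vector>

// Hypothetical producer: returns its result by value.
std::vector<int> func() {
    std::vector<int> v;
    v.push_back(3);
    v.push_back(1);
    v.push_back(3);
    v.push_back(2);
    return v;
}

// Takes its argument by value so it can modify and return its own copy.
std::vector<int> normalise(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    v.erase(std::unique(v.begin(), v.end()), v.end());
    return v;
}

void print(const std::vector<int>& v, std::ostream& os = std::cout) {
    std::copy(v.begin(), v.end(), std::ostream_iterator<int>(os, " "));
}

void chain_example() {
    std::ostringstream os;
    print(normalise(func()), os);  // no named intermediate vector
    assert(os.str() == "1 2 3 ");
}
```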

/Peter
 
peter koch

Roland Pibinger wrote:
If you obey to the STL value-semantics dogma you pass everything by
value. If not you use the most efficient way.
The STL dogma is not to pass everything by value. Passing by const
reference is the norm for anything "heavy". What is standard is to
return by value rather than by some output parameter.

[snip]
Yes, it can be inefficient. See also
http://groups.google.com/group/comp.lang.c++.moderated/browse_frm/thread/aa4daafacd01ce26
especially the subthread that starts with John Potters reply and which
also covers performance aspects.

That thread did NOT demonstrate that return by value is expensive. On
the contrary it did show that it was slightly faster than the
alternative methods.
Best wishes,
Roland Pibinger

Kind regards
Peter
 
Daniel T.

peter koch said:
Daniel T. wrote:
peter koch said:
Daniel T. wrote:


Well... I just noticed the line above. I agree that std::copy might make
the data copied available somehow. The difference is one of words. I
meant return as used in a C++ program, whereas you seemingly mean return
in the sense that the data will afterwards be available to the caller.
True.


Surely. But std::copy did not return the data - it simply put them into
some iterator.

It is a pretty common method of returning multiple values to the caller.

That does not remove the data from the container.

True. Your point?

You get the same functionality but with an added complexity:
std::vector<int> vec;
func(vec);
normalize( vec.begin(), vec.end() );
print( vec.begin(), vec.end() );

(forgetting that vec is still in scope).

Versus:
print(normalise(func()));

One simple line. Efficient, leaves no mess.

Odd, in your code above, "normalise_container" returns void yet you are
passing its return to print? There is obviously a debate about whether
such a return is efficient.

I will happily admit that templating to the container rather than two
iterators is a great idea, but templating to two iterators is more
idiomatic...
 
Marek Vondrak

I must admit I had some trouble following your assembly listing (and
also your program: why write the code inside a class, and why
operator+?). In particular I failed to see where operator+ (which must
be the interesting function) copied the result back and where the
local copy got destroyed. Instead, I made my own program ;-) :

That is fine. I originally started with a simpler example and had to make it
more complex to prevent copy constructor from being eliminated by the
optimizer (not RVO). The assembly listing showed that the local object was
copied when returned from operator+().
The program prints
"test::test
test::test(test const& rhs)"
which is exactly what you'd expect with RVO.

Okay. This shows that MSVC8 implements RVO, unlike the earlier versions. On
MSVC7.1 I get one call to the constructor and three calls to the copy
constructor.

-- Marek
 
Roland Pibinger

peter koch wrote:
The STL dogma is not to pass everything by value. Passing by const
reference is the norm for anything "heavy". What is standard is to
return by value rather than by some output parameter.

'Value semantics' means that you pass everything by value. STL is
built according to 'value semantics'. Note that I don't recommend
that.

Best wishes,
Roland Pibinger
 
Markus Schoder

peter said:
Prefer to return by value and pass by const reference.


Most likely not (at least in a non-debug build). RVO (google for that
one) will kick in and remove the redundant copy. This is the case for
all modern (2000 or later) compilers I know.

Unfortunately this does not work for assigning to an already existing
vector. You can still benefit from RVO by first creating a new vector
and then swapping it into the existing one, but that is anything but
intuitive and only works for cheaply swappable objects.
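
The create-then-swap idiom described here can be sketched as follows. The whitespace-splitting tokenize and the helper names refill and refill_example are assumptions for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// By-value tokenize; whitespace splitting is an assumption.
std::vector<std::string> tokenize(const std::string& s) {
    std::vector<std::string> tokens;
    std::istringstream in(s);
    std::string word;
    while (in >> word) {
        tokens.push_back(word);
    }
    return tokens;
}

// Replaces the contents of an existing vector without a plain assignment.
void refill(std::vector<std::string>& existing, const std::string& s) {
    std::vector<std::string> fresh = tokenize(s);  // initialisation: elision can apply
    existing.swap(fresh);                          // constant-time exchange
}   // fresh, now holding the old contents, is destroyed here

void refill_example() {
    std::vector<std::string> existing(5, "old");
    refill(existing, "new tokens");
    assert(existing.size() == 2);
    assert(existing[0] == "new");
}
```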
It could if your compiler can't optimise (which I doubt). If it can't
and you spend too much time returning your container, pass the
return value by reference and finish with a swap instead of the return:

void tokenize(string const& s,vector<string>& result){

vector<string> myvector;
//split s and push_back into myvector;

//is this ok ? vector destroyed on exit from funcion ?
std::swap(result,myvector);
return;
}

Notice that the function now is not so easy to use. Also, it will most
likely be slightly slower than the original function.

You can also just do result.clear() and use it directly.
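
A sketch of that clear-and-fill variant, again assuming whitespace splitting; reuse_example is a hypothetical helper.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Fills result in place: no local vector and no swap.
void tokenize(const std::string& s, std::vector<std::string>& result) {
    result.clear();  // drop whatever the caller had in there
    std::istringstream in(s);
    std::string word;
    while (in >> word) {
        result.push_back(word);
    }
}

void reuse_example() {
    std::vector<std::string> result;
    tokenize("first line here", result);
    tokenize("second", result);  // the same vector is reused
    assert(result.size() == 1);
    assert(result[0] == "second");
}
```

One trade-off compared with the swap version: if an exception is thrown halfway through, result is left partially filled and its previous contents are already gone.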

Because of what I said above I still think this approach has some value
even though it is more clumsy to use.
 
