Which is more efficient:STL:Set or array?

C

CodeCracker

Problem details below:
I have few items(simple values or objects) to be put into an array and
I implement it through a set rather than just an array of items. This
is because every time I get a new item I have to look into the Set
whether it is there or not. Depending on whether it is there or not I
have to respond differntly.
Since it is easy to search into a Set rather than a number of if
statement which will compare each of the element in the array with the
given new item I chose Set but worrying about the efficiency
comparision of them.

Thanks,
CodeCracker
 
L

Larry I Smith

CodeCracker said:
Problem details below:
I have few items(simple values or objects) to be put into an array and
I implement it through a set rather than just an array of items. This
is because every time I get a new item I have to look into the Set
whether it is there or not. Depending on whether it is there or not I
have to respond differntly.
Since it is easy to search into a Set rather than a number of if
statement which will compare each of the element in the array with the
given new item I chose Set but worrying about the efficiency
comparision of them.

Thanks,
CodeCracker

Don't worry about it.
 
E

E. Robert Tisdale

CodeCracker said:
Problem details below:
I have few items(simple values or objects) to be put into an array
and I implement it through a set rather than just an array of items.
This is because every time I get a new item,
I have to look into the Set whether it is there or not.
Depending on whether it is there or not I have to respond differntly.
Since it is easy to search into a Set
rather than a number of if statement
which will compare each of the element in the array with the given new item,
I chose Set but worrying about the efficiency comparision of them.


It sounds like the objects that you are describing *are* sets
so a set class template should be the most efficient representation.
These templates are part of the standard library for a reason.
The reason is to give the compiler developer an opportunity
to provide a highly optimized implementation.
These implementations are usually difficult to beat
with any "hand rolled" implementation.

As a C++ programmer, your job is to match the containers
required by your program with standard container class templates.
All of the classic Abstract Data Types (ADTs) are implemented
in the standard library so this shouldn't be a problem.
If you can't find a perfect match,
you should try to adapt one of the standard class templates
(by deriving an appropriate class [template] from it for example.)

Only rarely is it necessary
to design a new container class from scratch.
If you find yourself doing this,
you probably need to re-examine your requirements
and convince yourself that you haven't just made a big mistake.
 
L

Larry I Smith

CodeCracker said:
Problem details below:
I have few items(simple values or objects) to be put into an array and
I implement it through a set rather than just an array of items. This
is because every time I get a new item I have to look into the Set
whether it is there or not. Depending on whether it is there or not I
have to respond differntly.
Since it is easy to search into a Set rather than a number of if
statement which will compare each of the element in the array with the
given new item I chose Set but worrying about the efficiency
comparision of them.

Thanks,
CodeCracker

Besides the QT manuals, you might get some good help in the
KDE newsgroup. Some smart QT developers hang out there.

comp.windows.x.kde

Larry
 
C

CodeCracker

Say in two situations :
1. I have 10 items
2. I have 50 or more items.
Still I should not worry about efficiency in either of them. I can go
ahead with either STL:set or array implementation.
 
C

CodeCracker

I think I wasn't clear. Let me rehearse what I was telling.
I have some items( either simple values as int, float etc or objects (
some other classes of mine say FileClass).
Now every time somebody calls my function DoSomething() I do a search
inside my array to find that item and if it is found I call DoThis()
and if not found I call DoThat().
But the problem here is that if I use an array implementation I have to
write a whole lot of If statement or write a loop where it goes in
search of the current item in the array, for which it will make a whole
lot of comparison. Instead of doing all this if I use the STL:set
containers and insert my items once there and then search for my
current item every time in my set will it not be more efficient than
my array implementation.
Remember I am not creating my container from scratch and using the
existing containers only.
 
P

Pete Becker

CodeCracker said:
I think I wasn't clear. Let me rehearse what I was telling.
I have some items( either simple values as int, float etc or objects (
some other classes of mine say FileClass).
Now every time somebody calls my function DoSomething() I do a search
inside my array to find that item and if it is found I call DoThis()
and if not found I call DoThat().
But the problem here is that if I use an array implementation I have to
write a whole lot of If statement or write a loop where it goes in
search of the current item in the array, for which it will make a whole
lot of comparison. Instead of doing all this if I use the STL:set
containers and insert my items once there and then search for my
current item every time in my set will it not be more efficient than
my array implementation.
Remember I am not creating my container from scratch and using the
existing containers only.

If you're going to populate your array once and then not change it
you're probably better off with a vector, because it uses less memory.

vector<whatever> vec(first, last);
sort(vec.begin(), vec.end());

when you need to find something, use

find(vec.begin(), vec.end(), X)

Even if you're going to add or remove things occasionally, this is
probably better. set wins when you add and remove things fairly often.
 
L

Larry I Smith

CodeCracker said:
Say in two situations :
1. I have 10 items
2. I have 50 or more items.
Still I should not worry about efficiency in either of them. I can go
ahead with either STL:set or array implementation.

yes
 
C

CodeCracker

But if while adding I want to be sure that the item in the containers
are all unique then still vector be preferred.
 
K

Kai-Uwe Bux

CodeCracker said:
But if while adding I want to be sure that the item in the containers
are all unique then still vector be preferred.

You could use a set while you are inserting items. Once that is finished,
you copy the set into a vector, preserving the order of the items.
Afterwards looking up items in the vector will be fast using binary search.
The costs for the copy operation are very small compared to the savings
provided you have a lot of lookups. Also, this is only applicable if no
further changes to the data structure need to be done in the lookup-phase.


Best

Kai-Uwe Bux

ps.: Do not worry about efficiency before you profile your code. Quite
possibly the real timesink is somewhere else.
 
V

Vyacheslav Kononenko

CodeCracker said:
But if while adding I want to be sure that the item in the containers
are all unique then still vector be preferred.

Just use std::set so your code will be simplier and most probably it
will be faster than you need. And for amount of ~50 elements you would
hardly measure the difference of execution time btw std::vector and
std::set (maybe even std::list). So use std::set and if profiler will
show that there is a problem with std::set then try something else.

Regards,
Vyacheslav
 
E

E. Robert Tisdale

CodeCracker said:
But if while adding I want to be sure that
the item in the containers are all unique
then still vector be preferred.

No.
You are confused.
The object, as you have described it so far, is a set.
You should implement it with a set template class.
This will be the most efficient and portable implementation.
Using any other class template will almost certainly
result in a sub optimal implementation.

Study this object carefully.
It is either a set or it isn't.
If you can discover some reason why it isn't a set,
tell us and, perhaps, we can help you find a better match.
 
R

Richard Herring

E. Robert Tisdale said:
No.
You are confused.
The object, as you have described it so far, is a set.
You should implement it with a set template class.
This will be the most efficient and portable implementation.
Using any other class template will almost certainly
result in a sub optimal implementation.

Study this object carefully.
It is either a set or it isn't.

Wrong level of abstraction. It's not what it "is", but what it _does_,
that matters. The C++ library containers and algorithms support various
operations - insert, erase, search - with different efficiency
tradeoffs. Patterns of use also vary - do you have separate phases which
build then search, or are the operations intermixed?

What you want is the container which most efficiently supports your
pattern of use. That might very well be std::set, but it doesn't have to
be.
If you can discover some reason why it isn't a set,
tell us and, perhaps, we can help you find a better match.

Note that the STL set-oriented algorithms includes, set_union,
set_intersection, set_difference, set_symmetric_difference are designed
to work on any sorted sequence, not just a std::set.
 
R

Richard Herring

Pete Becker said:
If you're going to populate your array once and then not change it
you're probably better off with a vector, because it uses less memory.

vector<whatever> vec(first, last);
sort(vec.begin(), vec.end());

And maybe vec.erase(unique(vec.begin(), vec.end()), vec.end()) to remove
duplicates?
when you need to find something, use

find(vec.begin(), vec.end(), X)

Since it's sorted, binary_search (if you just want to know if X is
there) or lower_bound (if you need an iterator to it) might be better.
 
P

Pete Becker

E. Robert Tisdale said:
The object, as you have described it so far, is a set.
You should implement it with a set template class.
This will be the most efficient and portable implementation.
Using any other class template will almost certainly
result in a sub optimal implementation.

It depends on what's important. As I said earlier, a set will use more
memory than a vector of the same size. In addition, its implementation
requires more code.
 
P

Pete Becker

Richard said:
And maybe vec.erase(unique(vec.begin(), vec.end()), vec.end()) to remove
duplicates?

Sigh. Maybe. But designing from incomplete information is usually a
waste of time. I see no point in trying to fine-tune this example.
 
P

Pete Becker

Richard said:
Since it's sorted, binary_search (if you just want to know if X is
there) or lower_bound (if you need an iterator to it) might be better.

Good point. That was the idea, but I used the wrong algorithm.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,204
Messages
2,571,061
Members
47,668
Latest member
SamiraShac

Latest Threads

Top