Search algorithm to use

J

joshc

If I have an array of data that I know to be sorted in increasing
order, and the array is less than 50 elements, and I want to find the
first element greater than a certain value, is a simple linear search
the best here(for loop from beginning to end of array)? It seems like
some other search algorithm like binary or whatever would be of no
benefit on this data set.
 
G

Gregory Toomey

joshc said:
If I have an array of data that I know to be sorted in increasing
order, and the array is less than 50 elements, and I want to find the
first element greater than a certain value, is a simple linear search
the best here(for loop from beginning to end of array)? It seems like
some other search algorithm like binary or whatever would be of no
benefit on this data set.

Alternatives
- hash table - insert n slements O(n); search O(1)
- quick sort - sort n elements O(n log n); search O(log n)

A hash table is close to optimal.

gtoomey
 
E

Eric Sosman

joshc said:
If I have an array of data that I know to be sorted in increasing
order, and the array is less than 50 elements, and I want to find the
first element greater than a certain value, is a simple linear search
the best here(for loop from beginning to end of array)? It seems like
some other search algorithm like binary or whatever would be of no
benefit on this data set.

Questions of efficiency are really outside the bounds
of the C language as such; there are many implementations
of C and the timing characteristics of a given operation
differ from one implementation to the next.

Still, for such a small array it seems likely (not
certain, of course) that the difference between a linear
search and a binary search will be negligible. The way
to find out, of course, is to implement the search both
ways and measure. Even then, of course, the result will
only be valid for the implementation at hand and will not
necessarily hold for others.
 
J

joshc

I realize this, but still, regardless, there are algorithms that are in
general better than others. Perhaps you could suggest a more general
programming newsgroup for this type of question? I remember coming
across one some time ago but am having trouble finding it. It was a
programming group in general.
 
C

Chris Williams

joshc said:
I realize this, but still, regardless, there are algorithms that are in
general better than others. Perhaps you could suggest a more general
programming newsgroup for this type of question? I remember coming
across one some time ago but am having trouble finding it. It was a
programming group in general.

A long time ago I looked for some sort of generic "algorithms" group,
but the closest I found was for (primarily 3D) graphics algorithms. I
think you are thinking of comp.programming and that may be the best
spot.

In general the way to figure out how good an algorithm is, is using
"big O" This is just a way of saying, if you have an array of n items,
how many more loops will it take to find the answer as n increases. If
it is exponential then the algorithm isn't so good ( O(n^2) .) If it is
linear ( O(n) ) then it's pretty normal. And if it is logarithmic (
O(n^(1/2)) ) then that's pretty much as good as it gets. The only
modifying factor is that a more complex algorithm will take longer for
each iteration so for a small enough n, an O(n^2) can win over an
O(n^(1/2)).

If you do a brute force loop, this will be linear. If there are ten
items, it will at max take 10 loops. If there are 20 then 20, and if 30
then 30.
A binary search is able to cut off half the work each time so until you
double the size of the array there won't be any change in time. 1 item
will take 1 loop. 2 will take 2, 3 will take 2, 4 will take 3, 5->3,
6->3, 7->3, 8->4, etc. So this is a logorithmic and pretty much as good
as it gets. However it is harder to code and requires the computer to
do some extra stuff each loop so for a short enough array, it probably
isn't worth it.

-Chris
 
J

joshc

Thanks for your reply Chris. I am familar with big O notation from my
algorithms class :). I am going to use a simple linear search due to
this being an embedded environment and concern over code size.
 
E

Eric Sosman

joshc said:
I realize this, but still, regardless, there are algorithms that are in
general better than others. Perhaps you could suggest a more general
programming newsgroup for this type of question? I remember coming
across one some time ago but am having trouble finding it. It was a
programming group in general.

Chris Williams mentioned comp.programming, which I've
seen mentioned before but am not personally familiar with.
Given the small size ("less than 50 elements") of your
problem, though, I'll stick by my previous answer: the
difference between O(N) linear search and O(ln N) binary
search will probably be so small as to be difficult to
measure accurately. Keep in mind two things about big-Oh:
It discards all constant factors, and it's only useful for
"sufficiently large" N. For small N, the constant factors
and even the lower-order terms may be more important than
the asymptotic behavior.

Stepping back just a bit, permit me to question the
wisdom of fretting over this decision at an early stage in
the program development. Do you have any a priori reason
to believe the time for these searches will be significant?
If you need to perform ten million such searches per second
that might be the case -- but if the program is going to do
anything other than searching it seems quite likely that the
other activities will have far more effect on the running
time than the searches will. A friend of mine used to speak
of picking the bottle caps off the beach so the sand would
be nice and clean around the rotting whale carcases ...

Jackson's Laws of Program Optimization:

First Law: Don't Do It.

Second Law (for experts only): Don't Do It Yet.
 
C

CBFalconer

*** Rude topposting fixed ***
I realize this, but still, regardless, there are algorithms that
are in general better than others. Perhaps you could suggest a
more general programming newsgroup for this type of question? I
remember coming across one some time ago but am having trouble
finding it. It was a programming group in general.

comp.programming would be suitable.

Please don't toppost. Your answer belongs after (or intermixed
with) the material to which you reply, with non-germane stuff
snipped out.

As long as you only have 50 elements the complexity of anything
more elegant than a simple linear search suggest you shouldn't
bother. You can even avoid any need for sorting with a linear
search, except that you will have to check the entire array rather
than just values up to a threshold (one half, on average).
 
R

Richard Bos

Gregory Toomey said:
Alternatives
- hash table - insert n slements O(n); search O(1)
- quick sort - sort n elements O(n log n); search O(log n)

It's already known to be sorted.
A hash table is close to optimal.

For fewer than 50 elements?

Richard
 
M

Mark McIntyre

I realize this, but still, regardless, there are algorithms that are in
general better than others. Perhaps you could suggest a more general
programming newsgroup for this type of question? I

comp.programming?
 
M

Mark McIntyre

If I have an array of data that I know to be sorted in increasing
order, and the array is less than 50 elements, and I want to find the
first element greater than a certain value, is a simple linear search
the best here(for loop from beginning to end of array)? It seems like
some other search algorithm like binary or whatever would be of no
benefit on this data set.

for such a small data set, unless the comparison of elements to your target
is horribly expensive or you loop repeatedly on the search, a linear search
will be perfectly adequate. For advice on general algos try
comp.programming.
 
M

Mark McIntyre

I realize this, but still, regardless, there are algorithms that are in
general better than others. Perhaps you could suggest a more general
programming newsgroup for this type of question? I

comp.programming?
 
M

Mark McIntyre

I realize this, but still, regardless, there are algorithms that are in
general better than others. Perhaps you could suggest a more general
programming newsgroup for this type of question? I

comp.programming?
 
L

Lawrence Kirby

On Tue, 15 Feb 2005 20:46:00 -0800, Chris Williams wrote:

....
In general the way to figure out how good an algorithm is, is using
"big O" This is just a way of saying, if you have an array of n items,
how many more loops will it take to find the answer as n increases. If
it is exponential then the algorithm isn't so good ( O(n^2) .)

That is polynomiual, exponential woul be something like O(2^n). E.g. an
algorithm that searched every bit patter of n bits.
If it is
linear ( O(n) ) then it's pretty normal. And if it is logarithmic (
O(n^(1/2)) )

That is also polynomial although a better behaved one. Logarithmic is
O(log(n))
then that's pretty much as good as it gets.

There are O(1) algorithms. Hash table lookups are (or can be) effectively
O(1).

Lawrence
 
A

Al Bowers

joshc said:
If I have an array of data that I know to be sorted in increasing
order, and the array is less than 50 elements, and I want to find the
first element greater than a certain value, is a simple linear search
the best here(for loop from beginning to end of array)? It seems like
some other search algorithm like binary or whatever would be of no
benefit on this data set.

You need to better define this procedure.

What is this "certain value"? Is this certain value guaranteed to
be in the array or not? If it isn't then search algrothm's may
be useless to find the first element after a certain value.

And what is "first element greater than a certain value". Greater
in element number or value? For example, what if there are two
or more "certain values" in the array. Do you want the element which
has a larger value or do you want the next element which might
contain a value equal to the "certain value"?
 
J

joshc

Mr. CBFalconer, I'm pretty new to the newsgroups but how did I toppost?
I am using google groups so to avoid their buginess I clicked the show
options->reply button on the most recent post. Is that not what I am
*supposed* to do?
 
J

joshc

Al said:
You need to better define this procedure.

What is this "certain value"? Is this certain value guaranteed to
be in the array or not? If it isn't then search algrothm's may
be useless to find the first element after a certain value.

And what is "first element greater than a certain value". Greater
in element number or value? For example, what if there are two
or more "certain values" in the array. Do you want the element which
has a larger value or do you want the next element which might
contain a value equal to the "certain value"?

Alright, everyone, no need to waste anymore time on this thread. In all
honesty I don't know whether the search is taking up a significant
amount of time, but I do know that some functions that perform the
linear search as part of their processing are accounting for most of
the execution time(I profiled it). I doubt it's the search though since
there are some multiplications and divisions in the functions.

I realize it was probably a stupid question to ask since the search is
probably not even what's slowing my code down.
 
C

Chris Williams

Lawrence said:
That is polynomiual, exponential woul be something like O(2^n). E.g. an
algorithm that searched every bit patter of n bits.

Hm. Had thought exponential growth referred to _any_ curve which
increased "steepness" as you went along the x axis.
That is also polynomial although a better behaved one. Logarithmic is
O(log(n))
=\


There are O(1) algorithms. Hash table lookups are (or can be) effectively
O(1).

That makes sense. And in the example of the OP's array, if there was
some sort of pattern to the data you might be able to formulate an 0(1)
mathematical formula with which to access the array.

Thank you,
Chris
 
C

CBFalconer

joshc said:
Mr. CBFalconer, I'm pretty new to the newsgroups but how did I toppost?
I am using google groups so to avoid their buginess I clicked the show
options->reply button on the most recent post. Is that not what I am
*supposed* to do?

Do that on the particular message you wish to reply to. Then
delete the quoted portions that are not germane to your reply.
Then put your reply AFTER (or intermixed with) the portions you are
replying to. That way the result stands by itself and reads
correctly.
 
T

Tim Rentsch

joshc said:
If I have an array of data that I know to be sorted in increasing
order, and the array is less than 50 elements, and I want to find the
first element greater than a certain value, is a simple linear search
the best here(for loop from beginning to end of array)? It seems like
some other search algorithm like binary or whatever would be of no
benefit on this data set.

Gosh folks...

A linear search will take about 25 compares on the average.

A binary search will take about 6 compares on the average.

Roughly speaking, a reasonable guess is that the binary search will
run about 3 times as fast as the linear search. Whether that makes
any perceptible difference in the application or not is another
question altogether.

Programming a binary search should be something any competent
programmer can do almost as fast as he/she can write it down:

/* Return the smallest index of an element greater than
* the supplied argument x. If there is no such element,
* return the index where the element would be if there
* were one (which will be 'how_many' if how_many < 0
* is true).
*
* NEEDS: argument 'v' array be sorted in increasing
* order. NOT CHECKED.
*
* Disclaimer: not tested or compiled in any way.
*/

int
first_larger( SomeType v[], int how_many, SomeType x ){
int i = 0, j = how_many;

if( i >= j || v[0] > x ) return 0;

while( i + 1 != j ){
int m = (i + j) / 2;
if( v[m] <= x ) i = m;
else j = m;
}

return j;
}

And, just a few meta-comments.

The comp.lang.c group isn't normally the place to ask this kind of
question; comp.programming is better.

If you asked this question because you got assigned it as homework,
then look over the comments and the code until you think you see how
it works, then stop looking at it and try to write the code yourself.
If you don't you're cheating yourself (and besides which karma will
get you :).
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,159
Messages
2,570,881
Members
47,418
Latest member
NoellaXku

Latest Threads

Top