sorts that are good when a vector is largely sorted

C

Comp1597

I am trying to sort a vector of pairs of integers under the usual
ordering: <a, b> < <c,d> if a<c or ( a == c and b<d)

Here, there is a ready made STL sort. However, in this case, the
pairs are likely to be very largely sorted.

Does anyone know any sorting algorithms which work particularly well
when the data is mostly sorted and only a few rearrangements are
likely to be needed?
 
J

Jerry Coffin

I am trying to sort a vector of pairs of integers under the usual
ordering: <a, b> < <c,d> if a<c or ( a == c and b<d)

Here, there is a ready made STL sort. However, in this case, the
pairs are likely to be very largely sorted.

Does anyone know any sorting algorithms which work particularly well
when the data is mostly sorted and only a few rearrangements are
likely to be needed?

An insertion sort.
 
V

Victor Bazarov

I am trying to sort a vector of pairs of integers under the usual
ordering: <a, b> < <c,d> if a<c or ( a == c and b<d)

Here, there is a ready made STL sort. However, in this case, the
pairs are likely to be very largely sorted.

Does anyone know any sorting algorithms which work particularly well
when the data is mostly sorted and only a few rearrangements are
likely to be needed?

A modified bubble sort? What does Google say about those things? Also,
do not discount 'comp.programming' as a source of information about
algorithms.

V
 
J

Juha Nieminen

Victor said:
A modified bubble sort?

I knew this one would pop up its ugly head. Whenever someone asks for
a sorting algorithm, someone will inevitably mention the most
horrendous, most inefficient of them all.

What is it with bubble sort that everyone has such an obsession with
it? Why is it *always* mentioned?

Can you give me even *one* single example input for which bubble sort
(modified or not) is faster than insertion sort?

If you can't, then *why* suggest bubble sort instead of insertion
sort? I can't understand this. I will never understand this.

"Bubble sort is easier to implement than insertion sort." Certainly
not true. They are equally easy to implement.

"Bubble sort is easier to understand than insertion sort." Certainly
not true. In fact, understanding why bubble sort works is more difficult
than understanding why insertion sort works.

"Bubble sort is fast for almost sorted inputs." No it isn't. It's
horrendously slow regardless of how the input is arranged. And it's
*never* faster than insertion sort.

"Bubble sort is fast for small inputs." Given that insertion sort is
faster than bubble sort for *all* inputs, and it's equally easy to
implement and easier to understand, it just doesn't make sense to
suggest it even for small inputs.

I can't think of *any* rational reason *whatsoever* to suggest bubble
sort rather than insertion sort in any conceivable situation (where
efficiency is needed).

So why? Why? I can't understand.
 
J

Juha Nieminen

Jeff said:
Bubble sort. It's O(N) for almost-sorted collections.

No, it isn't. It's O(n^2) regardless of how the input is arranged.
Even if you had a completely sorted input, it would still perform O(n^2)
steps. (Even if you put a check that would stop the sorting if no swaps
are done, even one single out-of-place element would still make it
perform O(n^2) steps.)

Where is this common misconception coming from? Do people even
understand how bubble sort works?

Are you confusing it with insertion sort?
 
T

Thomas J. Gritzan

Juha said:
No, it isn't. It's O(n^2) regardless of how the input is arranged.
Even if you had a completely sorted input, it would still perform O(n^2)
steps. (Even if you put a check that would stop the sorting if no swaps
are done, even one single out-of-place element would still make it
perform O(n^2) steps.)

With that check, bubble sort performs n*m steps, where m is the maximum
distance an element has to be moved.
If the maximal distance is a constant, because your elements are
presorted and they only have to be moved, like, 5 steps, bubble sort
**in this case** is O(n).

But this is a very specific situation and I agree that insertion sort is
better.

I once had such a case:
<anecdote>
I programmed a traffic jam simulator, where the cars were object with a
x-axis floating point variable, which represented the position on the
street.
The cars were (bubble) sorted in each step by x, so that I could easily
check which car was in front of another car to simulate it's behaviour.
A car could pass another car, so that the sorting would swap them. The
street was an endless band, and a car which leaves on the right side
enters the street on the left side again (simply reset x to zero).

Each time a car left on the right side, the application stopped for a
second or two, because bubble sort was back to O(n^2) and moved the
object from the right side to the left.
As I didn't know about std::list and its sort function, and I didn't
know about sorting algorithms except bubble sort either (I was still in
school at the time), I manually removed the object from the linked list
and inserted it at the front.
</anecdote>

Short answer: Don't use bubble sort. Use the right algorithm for the job.
 
J

James Kanze

What is it with bubble sort that everyone has such an
obsession with it? Why is it *always* mentioned?

Because it has a catchy name. "Insertion sort" sounds a bit
mundane and ordinary; "bubble sort" sounds sexy.

[...]
"Bubble sort is easier to implement than insertion sort." Certainly
not true. They are equally easy to implement.

Not really. At least IMHO, bubble sort is significantly more
difficult to implement than insertion sort.

[...]
I can't think of *any* rational reason *whatsoever* to suggest bubble
sort rather than insertion sort in any conceivable situation (where
efficiency is needed).

Obfuscation? Anyone can immediately understand insertion sort
(or selection sort); it's easier to make the code
incomprehensible if you use bubble sort.
 
V

Victor Bazarov

Juha said:
Victor said:
A modified bubble sort?

I knew this one would pop up its ugly head. Whenever someone asks for
a sorting algorithm, someone will inevitably mention the most
horrendous, most inefficient of them all.
> [... the rest of the rant redacted ...]

So I didn't add a smiley. Sue me.

V
 
B

Beej Jorgensen

Thomas J. Gritzan said:
But that's not bubble sort. At least not any typical implementation of
it. And you'll have a hard time to implement the boolean flag
optimization into it.

I think they're largely the same, and I've seen the "exhaustive" form in
the wild. At worst, the parent poster's algorithm can be called "a
variant of the bubble sort", since the data will be shuffled in exactly
the same way as a bubble sort and the only difference is a bunch of
do-nothing cycles at the end. (The parent poster's algorithm has a best
case O(N^2), whereas Wikipedia's has best case O(N).)

Let's have a look at Wikipedia's:


procedure bubbleSort( A : list of sortable items ) defined as:
n := length( A )
do
swapped := false
n := n - 1
for each i in 0 to n - 1 inclusive do:
if A[ i ] > A[ i + 1 ] then
swap( A[ i ], A[ i + 1 ] )
swapped := true
end if
end for
while swapped
end procedure


The "while swapped" loop is going to run a maximum of length(A) times,
right? So you could replace it with a loop over that space and get the
same result. (Which would leave you with the parent poster's
algorithm.) Of course, the data might already be sorted, so you'd want
to short-circuit that loop, but you can work that logic into the outer
loop of the parent poster's example.

Or to put it another way, you could work an outer loop of length(A)
cycles into the Wikipedia example! But why? It would just be
redundant. It's better, like you said, to use the algorithm in
Wikipedia because it will be faster in general and it's just as easy to
commit to human memory.

Wikipedia's sorting animations rock. :)

-Beej
 
J

James Kanze

Jerry said:
[ ... ]
Even your point about implementations is hard to credit --
both use essentially identical operations, just arranged in
a slightly different order (basically, an insertion sort
simply moves one operation out of the inner loop).
But bubble sort was the first algorithm I was exposed to as an
undergraduate. I will never understand any other sorting
algorithm as well as I understand bubble sort.

And that, perhaps, is part of what Juha is complaining about.
Why was bubble sort the first algorithm you were exposed to?
The only justification for presenting it at all is as an example
of how you can do worse than insertion sort.
So basically, you agree with Juha that anybody who has an
opinion differing from yours is just being a stubborn idiot.

Everyone has a right to there opinion. We work in an
engineering domain, however, dominated by objective facts.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,982
Messages
2,570,186
Members
46,743
Latest member
WoodrowMea

Latest Threads

Top