Using Java Classes to Sort a Small Array Quickly

K

KevinSimonson

I have an <enum> named <ProjectEnum> that has twelve values. Today I
added an instance variable to it, a <String> named <collectionTitle>.
The engineer I'm working with asked me to write a static method in
class <ProjectEnum> that returns an array of <ProjectEnum>s sorted
alphabetically by this value <collectionTitle>.

I wrote a little program that compares the performance of
SelectionSort and InsertionSort on comparable arrays of <String>s, and
discovered that InsertionSort sorts in about half the amount of time
that InsertionSort does, at least on our machines. Therefore I
implemented my static method, <alphaSort()>, using InsertionSort.

On the way home from work my fellow carpooler told me that there are
Java classes that can do sorts in O(N) time. That would be an
improvement, since although InsertionSort is faster than
SelectionSort, it's still an O(N^2) algorithm. So I went looking, but
all I could find was the <sort()> method from the <Collections>
class. I replaced my code for SelectionSort with a loop that reads
the input array into an <ArrayList< String>> object, calls <sort()> on
the <ArrayList< String>> object, and then reads the values back into
the array; and then ran my test code again. This method was
significantly faster than SelectionSort, but my InsertionSort code
still beat it.

So I guess my question is, <b><i>is</i></b> there a way in Java to
sort an array of <String>s (or perhaps more to the point to sort an
array of <ProjectEnum>s) that runs in O(N) time? Or even that runs in
O(N^2) time but faster than InsertionSort? I'd appreciate any
information anyone can give me on this.

Kevin Simonson
 
M

markspace

On the way home from work my fellow carpooler told me that there are
Java classes that can do sorts in O(N) time.


Pfft. No. Theoretical maximum speed of a sort is something like n log
n. The only data you can sort in n time is data that's already sorted.

my InsertionSort code
still beat it.


How many projects can you have? For small arrays anything is fast and
even a bubble sort should work fine. If you have many projects... how
do you code that up as an enum anyway?

Sorry but there's a little red flashing light labeled "warning!" that's
going off in my mind right now.
 
E

Eric Sosman

I have an<enum> named<ProjectEnum> that has twelve values. Today I
added an instance variable to it, a<String> named<collectionTitle>.
The engineer I'm working with asked me to write a static method in
class<ProjectEnum> that returns an array of<ProjectEnum>s sorted
alphabetically by this value<collectionTitle>.

I wrote a little program that compares the performance of
SelectionSort and InsertionSort on comparable arrays of<String>s, and
discovered that InsertionSort sorts in about half the amount of time
that InsertionSort does, at least on our machines. Therefore I
implemented my static method,<alphaSort()>, using InsertionSort.

Failed to find Arrays.sort, did you?
On the way home from work my fellow carpooler told me that there are
Java classes that can do sorts in O(N) time.

Possible, perhaps, for restricted key types and given some
assumptions about the possible values: For instance, it's easy to
sort an array of `boolean' in O(N) time. But quite clearly *not*
possible for sorts based on pairwise comparisons, since log(N!)
grows faster than O(N).
That would be an
improvement, since although InsertionSort is faster than
SelectionSort, it's still an O(N^2) algorithm.

So? N in your case is twelve. There's no point in studying the
behavior "as twelve approaches infinity," becase it doesn't.

Besides, you're misusing O. Suppose I offer you two algorithms,
one whose running time is O(N^2) and the other O(N). Which is faster?
Before you say "O(N), nitwit," let me show you the actual timing
formulae:

T1(N) = N * N (nanoseconds)
T2(N) = N (gigayears)

T1(N) is clearly O(N^2), while T2(N) is O(N). Which do you choose
for sorting N=12 items?
So I went looking, but
all I could find was the<sort()> method from the<Collections>
class.

Either your looking was extremely cursory, or you haven't learned
how to look. I'm not sure how you found Collections.sort, but *if*
you had opened the Javadoc, gone to the "S" index page, and hunted for
the word "sort," you'd have found Arrays.sort right next door to it.
I replaced my code for SelectionSort with a loop that reads
the input array into an<ArrayList< String>> object, calls<sort()> on
the<ArrayList< String>> object, and then reads the values back into
the array; and then ran my test code again. This method was
significantly faster than SelectionSort, but my InsertionSort code
still beat it.

Your improved Javadoc searching skills would have found Arrays.sort,
which might have directed your attention to the Arrays class as a whole
and led you to discover Arrays.asList. Or possibly not, because you
have no need of a List anyhow. Meanwhile you've run "seven times around
the seven hills of Rome," and it's no surprise that the unnecessary trip
has made your task take longer than it needed to.
So I guess my question is,<b><i>is</i></b> there a way in Java to
sort an array of<String>s (or perhaps more to the point to sort an
array of<ProjectEnum>s) that runs in O(N) time? Or even that runs in
O(N^2) time but faster than InsertionSort? I'd appreciate any
information anyone can give me on this.

Probably nothing of O(N); see remarks above. Also probably
nothing as bad as O(N^2), but you can always do the hills of Rome
thing to slow down a better method if you want.

BUT, BUT, BUT! Your N is a constant, a small constant! Even if
your hand-crafted sort runs at ten times the speed of Arrays.sort,
I posit that you have already expended more time than your speedy
method will ever recoup: If Arrays.sort takes a microsecond while
yours takes 100 nanoseconds, and if all your investigation and
implementation (and writing to Usenet) took just ten minutes, you
need more than six hundred billion fast sorts merely to break even.
Modify the arithmetic as needed in light of your actual measurements,
then turn your talents elsewhere instead of wasting them on a non-
problem.
 
E

Eric Sosman

[...] If Arrays.sort takes a microsecond while
yours takes 100 nanoseconds, and if all your investigation and
implementation (and writing to Usenet) took just ten minutes, you
need more than six hundred billion fast sorts merely to break even.

"For suitable values of a billion," like those that begin
with an "M" instead of a "B."

<FX: dope slap>
 
R

Robert Klemme

I have an<enum> named<ProjectEnum> that has twelve values. Today I
added an instance variable to it, a<String> named<collectionTitle>.
The engineer I'm working with asked me to write a static method in
class<ProjectEnum> that returns an array of<ProjectEnum>s sorted
alphabetically by this value<collectionTitle>.

Since we're talking about an enum and I am sure you made field "name"
final (i.e. to make instances immutable) you can completely ignore sort
performance. You just need to sort once. For example:

package en;

import java.util.Arrays;
import java.util.Comparator;

public enum ProjectEnum {

P1("xyz"), P2("abc"), P3("def");

private final String collectionTitle;

private ProjectEnum(String name) {
if (name == null) {
throw new NullPointerException();
}

this.collectionTitle = name;
}

public String getCollectionTitle() {
return collectionTitle;
}

private static final ProjectEnum[] sortedValues;

static {
sortedValues = ProjectEnum.values();

Arrays.sort(sortedValues, new Comparator<ProjectEnum>() {
@Override
public int compare(ProjectEnum o1, ProjectEnum o2) {
return o1.getCollectionTitle().compareTo(o2.getCollectionTitle());
}
});
}

public static ProjectEnum[] sorted() {
return Arrays.copyOf(sortedValues, sortedValues.length);
}
}

Kind regards

robert
 
R

RedGrittyBrick

I have an<enum> named<ProjectEnum> that has twelve values. Today I
added an instance variable to it, a<String> named<collectionTitle>.
The engineer I'm working with asked me to write a static method in
class<ProjectEnum> that returns an array of<ProjectEnum>s sorted
alphabetically by this value<collectionTitle>.


is there a way in Java to
sort an array of<String>s (or perhaps more to the point to sort an
array of<ProjectEnum>s) that runs in O(N) time? Or even that runs in
O(N^2) time but faster than InsertionSort? I'd appreciate any
information anyone can give me on this.


If I was sorting twelve elements, I would consider myself utterly
deranged if I found myself worrying about the sort algorithm.



If the static method will be called millions of times per second, you
might want to cache the sort results rather than pointlessly repeating
the sort, but only after measuring a performance problem and working out
if this method really needs to be called that often.
 
R

Roedy Green

I wrote a little program that compares the performance of
SelectionSort and InsertionSort on comparable arrays of <String>s, and
discovered that InsertionSort sorts in about half the amount of time
that InsertionSort does, at least on our machines. Therefore I
implemented my static method, <alphaSort()>, using InsertionSort.

The bulit-in sort in Java beats everything I have thrown at it except
a RadixSort for some cases. I have posted the code for a number of
different algorithms, including RadixSort, HeapSort, BubbleSort,
ShellSort, QuickSort. InsertionSort and SelectionSort are too
disgusting to code.

See http://mindprod.com/jgloss/sort.html

I have also written an applet that generates Comparator/Comparable
classes.
See http://mindprod.com/applet/comparator.html

Java has a built-in sort mechanism with Comparator and Comparable
interfaces. Even if you experiment with your own sorts, they should
use the standard interface.
--
Roedy Green Canadian Mind Products
http://mindprod.com
The modern conservative is engaged in one of man's oldest exercises in moral philosophy; that is,
the search for a superior moral justification for selfishness.
~ John Kenneth Galbraith (born: 1908-10-15 died: 2006-04-29 at age: 97)
 
R

Roedy Green

Pfft. No. Theoretical maximum speed of a sort is something like n log
n. The only data you can sort in n time is data that's already sorted.

see http://mindprod.com/jgloss/radixsort.html

Surely you remember mechanical card sorters. They did precisely that.
--
Roedy Green Canadian Mind Products
http://mindprod.com
The modern conservative is engaged in one of man's oldest exercises in moral philosophy; that is,
the search for a superior moral justification for selfishness.
~ John Kenneth Galbraith (born: 1908-10-15 died: 2006-04-29 at age: 97)
 
R

Roedy Green

Either your looking was extremely cursory, or you haven't learned
how to look. I'm not sure how you found Collections.sort, but *if*
you had opened the Javadoc, gone to the "S" index page, and hunted for
the word "sort," you'd have found Arrays.sort right next door to it.

Here are three techniques to see if Java has some built-in capability.

You need an indexing tool like Copernic to find plausible method names
in the JavaDoc that you can download and insert in the JDK.

Alternatively you can use a tool like Funduc Search and replace or
Extract http://mindprod/products1.html#EXTRACT to linearly search for
regular expressions.

The other technique is to Google something like [sort Java array] and
see what code pops up. Look for the relevant classes and consult the
Javadoc.
--
Roedy Green Canadian Mind Products
http://mindprod.com
The modern conservative is engaged in one of man's oldest exercises in moral philosophy; that is,
the search for a superior moral justification for selfishness.
~ John Kenneth Galbraith (born: 1908-10-15 died: 2006-04-29 at age: 97)
 
R

Roedy Green

I have an <enum> named <ProjectEnum> that has twelve values

have a look at the sample code at
http://mindprod.com/jgloss/enum.html#SORTING

It sorts an array of enums both by the default ordinal and using a
custom Comparator.


You can generate Comparators with
http://mindprod.com/applet/comparatorcutter.html
--
Roedy Green Canadian Mind Products
http://mindprod.com
The modern conservative is engaged in one of man's oldest exercises in moral philosophy; that is,
the search for a superior moral justification for selfishness.
~ John Kenneth Galbraith (born: 1908-10-15 died: 2006-04-29 at age: 97)
 
E

Eric Sosman

Conversely, Jon Bentley in an (evidently...) old book gives
an example (with implementations and timings)
where quicksort on a TRS-80 outperforms
bubble sort on a Cray-1.

I think it was in a CACM column; can't remember whether it was
part of his "Programming Pearls" series or not. The latter have
been collected in book form.
I forget his value of 'N'.

So do I.
 
J

Joshua Cranmer

I have an<enum> named<ProjectEnum> that has twelve values. Today I
added an instance variable to it, a<String> named<collectionTitle>.
The engineer I'm working with asked me to write a static method in
class<ProjectEnum> that returns an array of<ProjectEnum>s sorted
alphabetically by this value<collectionTitle>.

Let me start by pointing out: you have 12 elements in your array to
sort. Any sort you pick, short of bogosort, would complete in at most a
millisecond or two, even on crappy old hardware.
On the way home from work my fellow carpooler told me that there are
Java classes that can do sorts in O(N) time. That would be an
improvement, since although InsertionSort is faster than
SelectionSort, it's still an O(N^2) algorithm.

The big-O notation is used to indicate asymptotic execution time, so
it's more useful if you're talking about high values of N (billions are
not unheard of in graph theory). At low values of N, the overhead in
setting up and per-element stuff can dominate the runtime.

The standard sort methods are quicksort (used in Java for primitive
arrays) and mergesort (used in Java for arrays of reference types).
These are both implemented recursively. However, recursion induces a
high overhead cost (setting up and tearing down the function stack isn't
cheap, and in these cases also involves yet another linear scan, etc.),
so when the array size is very small, it just sorts the array using
insertion sort. I do not recall the magic cutoff point off the top of my
had, but I believe it is between 8 and 16.

I would also like to point out that no comparison sort is capable of
achieving faster than O(n lg n) time in the worst case. Things like
radix sort can achieve O(n), at the cost of being dependent on the size
of the input, so they are less generally usable. Boolean sort and
possibly byte sort (given the extremely low numbers of values these can
attain) actually do use this sort of algorithm in Java.
So I went looking, but
all I could find was the<sort()> method from the<Collection
class.

Actually, Arrays.sort is the "main" sort method: if you look at the
code for Collections.sort, it extracts the elements into an array, sorts
that array, and then reinserts the method in the Collections in order.
If you originally have an array, Arrays.sort will save you 4 buffer
copies. Since you commented that your insertion sort beat this method, i
am betting that it's the multiple copies that is making it slower than
your hand-built sort.
 
E

Eric Sosman

Pfft. No. Theoretical maximum speed of a sort is something like n log
n. The only data you can sort in n time is data that's already sorted.

Wrong.
Here's the proof: Sorting an array of positive integers in O(n) time:

public static void main(final String[] args) {
final int[] data = new int[]{3,7,6,3,5,4,1,3,1,1,1,3,4,6,0,0};
System.out.println(Arrays.toString(data));
sort(data);
System.out.println(Arrays.toString(data));
}

private static void sort(final int[] data) {
if (data.length> 0) {
long max = Long.MIN_VALUE;
for (final int t : data) {
max = Math.max(t, max);
}
final long[][] buckets = new long[(int) max+1][data.length];
for (final long[] bucket : buckets) {
Arrays.fill(bucket, Long.MIN_VALUE);
}
for (final int x : data) {
for (int y = 0; y< buckets[x].length; ++y) {
if (buckets[x][y] == Long.MIN_VALUE) {
buckets[x][y] = x;
break;
}
}
}
int x = -1;
for (long[] bucket : buckets) {
for (long value : bucket) {
if (value == Long.MIN_VALUE) {
break;
}
data[++x] = (int) value;
}
}
}
}

Very nice! Would you care to try this approach on a shorter
input array, like

data = new int[] { Integer.MAX_VALUE };

This case should be quite simple, since the array is already sorted.
Let us know how you make out, will you?
 
E

Eric Sosman

Very nice! Would you care to try this approach on a shorter
input array, like

data = new int[] { Integer.MAX_VALUE };

This case should be quite simple, since the array is already sorted.
Let us know how you make out, will you?

I didn't say it works for any array out there, did I?

Ah. Then I claim I can sort an array of integers in O(0) time.
(And my claim is O(as worthwhile) as yours.)
 
A

Arne Vajhøj

Very nice! Would you care to try this approach on a shorter
input array, like

data = new int[] { Integer.MAX_VALUE };

This case should be quite simple, since the array is already sorted.
Let us know how you make out, will you?

I didn't say it works for any array out there, did I?

No.

But a solution is a bit more practical if it does.

Arne
 
L

Lew

Arne said:
Wanja said:
(e-mail address removed) says...
Wanja Gayk wrote:
Here's the proof: Sorting an array of positive integers in O(n) time: ....
Very nice! Would you care to try this approach on a shorter
input array, like

data = new int[] { Integer.MAX_VALUE };

This case should be quite simple, since the array is already sorted.
Let us know how you make out, will you?

I didn't say it works for any array out there, did I?

By using big-O notation, you said something about the behavior of the algorithm as the size of the data set tends to infinity, so yes, you did.
No.

But a solution is a bit more practical if it does.

Wanja was making a joke. The performance of an algorithm on one particulardata set is irrelevant to a discussion of asymptotic performance. The hubof the joke was his reference to "O(n)" and calling it "proof" when he wasmaking no statement about asymptotic performance.

I get it.
 
G

Gene Wirchenko

Very nice! Would you care to try this approach on a shorter
input array, like

data = new int[] { Integer.MAX_VALUE };

This case should be quite simple, since the array is already sorted.
Let us know how you make out, will you?

I didn't say it works for any array out there, did I?

You did. Claiming an algorithm is O(n) means claiming that that
is the upper bound.

Sincerely,

Gene Wirchenko
 
A

Arne Vajhøj

Very nice! Would you care to try this approach on a shorter
input array, like

data = new int[] { Integer.MAX_VALUE };

This case should be quite simple, since the array is already sorted.
Let us know how you make out, will you?

I didn't say it works for any array out there, did I?

You did. Claiming an algorithm is O(n) means claiming that that
is the upper bound.

No - big O for algorithms is usual average case not worst case.

I am sure you can find plenty of quotes for quicksort being
O(nlogn) and not O(n^2).

Arne
 
A

Arne Vajhøj

Arne said:
Wanja said:
(e-mail address removed) says...
Wanja Gayk wrote:
Here's the proof: Sorting an array of positive integers in O(n) time: ...
Very nice! Would you care to try this approach on a shorter
input array, like

data = new int[] { Integer.MAX_VALUE };

This case should be quite simple, since the array is already sorted.
Let us know how you make out, will you?

I didn't say it works for any array out there, did I?

By using big-O notation, you said something about the behavior of the algorithm as the size of the data set tends to infinity, so yes, you did.
No.

But a solution is a bit more practical if it does.

Wanja was making a joke. The performance of an algorithm on one particular data set is irrelevant to a discussion of asymptotic performance. The hub of the joke was his reference to "O(n)" and calling it "proof" when he was making no statement about asymptotic performance.

I get it.

????

Bucket sort is O(n) in n.

Asymptotically for n going against infinity.

Big O for execution speed traditionally does not look at
memory restrictions and data type restrictions.

His code is an example of code - the proof should be in most algorithmic
books.

It is not unheard of for a practical implementation of a sort to have
some limitations.

But usually it is for extreme cases.

Not being able to sort an array with one huge element is
a real problem for practical usage.

Arne
 
L

Lew

Arne said:
Lew said:
Arne said:
Wanja Gayk wrote:
(e-mail address removed) says...
Wanja Gayk wrote:
Here's the proof: Sorting an array of positive integers in O(n) time: ...
Very nice! Would you care to try this approach on a shorter
input array, like

data = new int[] { Integer.MAX_VALUE };

This case should be quite simple, since the array is already sorted.
Let us know how you make out, will you?

I didn't say it works for any array out there, did I?

By using big-O notation, you said something about the behavior of the algorithm as the size of the data set tends to infinity, so yes, you did.
No.

But a solution is a bit more practical if it does.

Wanja was making a joke. The performance of an algorithm on one particular data set is irrelevant to a discussion of asymptotic performance. Thehub of the joke was his reference to "O(n)" and calling it "proof" when hewas making no statement about asymptotic performance.

I get it.

????

Bucket sort is O(n) in n.

n+k in the average, n^2 worst-case, but that has nothing to do with the joke as I read it. It looked like he was talking about one specific array, and then he challenged us with "I didn't say it works for any array out there, did I?"

It looks like I did misread his post. I didn't realize he was speaking of the general case given the presence of only one input.
Asymptotically for n going against infinity.

As long as you don't hit the worst case. But you are correct that the average case is what is important.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,994
Messages
2,570,223
Members
46,812
Latest member
GracielaWa

Latest Threads

Top