Verifying a list is alphabetized

L

laredotornado

Hi,

I'm using Java 1.6. Given a java.util.List of Strings, what is the
quickest way to verify that the list is sorted in ascending order?

Thanks, - Dave
 
K

Knute Johnson

Hi,

I'm using Java 1.6. Given a java.util.List of Strings, what is the
quickest way to verify that the list is sorted in ascending order?

Thanks, - Dave

If it could possibly unsorted, I'd just sort it. How big is this list
anyway?
 
J

Joshua Maurice

Hi,

I'm using Java 1.6.  Given a java.util.List of Strings, what is the
quickest way to verify that the list is sorted in ascending order?

Under which sort order / locale? The sorting of strings depends on
which language and culture and context.

Once you figure out which sort order you want to use (what is the
default anyway, depends on the current OS locale or something?), then
you could just iterate over the strings and compare each string to the
previous string and ensure that each adjacent pair is in sorted order.
That's O(n) where n is the length of the list, and you cannot do
better Big O than that. To be clear, this merely returns "is sorted" /
"not sorted", which is what you asked. It won't sort them.
 
L

laredotornado

If you just want to test, scan the list comparing consecutive pairs of
elements:

public static <T extends Comparable<? super T>> boolean isOrdered(
     List<T> data) {
   T oldElement = null;
   for (T currentElement : data) {
     if (oldElement != null
         && oldElement.compareTo(currentElement) > 0) {
       return false;
     }
     oldElement = currentElement;
   }
   return true;

}

If you want to achieve sortedness, you are unlikely to save time
compared to just doing the sort unconditionally.

Patricia

Great solution, Patricia. Thanks, - Dave
 
R

Roedy Green

I'm using Java 1.6. Given a java.util.List of Strings, what is the
quickest way to verify that the list is sorted in ascending order?

/*
* [InOrder.java]
*
* Summary: Are arrays/collections already in order?.
*
* Copyright: (c) 2011 Roedy Green, Canadian Mind Products,
http://mindprod.com
*
* Licence: This software may be copied and used freely for any
purpose but military.
* http://mindprod.com/contact/nonmil.html
*
* Requires: JDK 1.5+
*
* Created with: JetBrains IntelliJ IDEA IDE
http://www.jetbrains.com/idea/
*
* Version History:
* 1.0 2011-11-07 initial version
*/
package com.mindprod.common15;

import java.util.Comparator;
import java.util.List;

/**
* Are arrays/collections already in order?.
* <p/>
* Generics for these methods were cannibalised from Arrays.sort and
Collections.sort.
*
* @author Roedy Green, Canadian Mind Products
* @version 1.0 2011-11-07 initial version
* @since 2011-11-07
*/
public final class InOrder
{
// -------------------------- PUBLIC STATIC METHODS
--------------------------

/**
* Is this array already in order according to its Comparable
interface?
* Generics and arrays don't get along well. We get unchecked
warnings here.
* I don't know how to fix them. It may not be possible.
*
* @param array array of Comparable objects.
*
* @return true if array already in order.
*/
@SuppressWarnings( { "unchecked" } )
public static boolean inOrder( Comparable[] array )
{
for ( int i = 1; i < array.length; i++ )
{
if ( array[ i ].compareTo( array[ i - 1 ] ) < 0 )
{
return false;
}
}
return true;
}

/**
* Is this List already in order according to its Comparable
interface?
*
* @param list List of Comparable objects.
*
* @return true if array already in order.
*/
public static <T extends Comparable<? super T>> boolean inOrder(
List<T> list )
{
final int length = list.size();
for ( int i = 1; i < length; i++ )
{
if ( list.get( i ).compareTo( list.get( i - 1 ) ) < 0 )
{
return false;
}
}
return true;
}

/**
* Is this array already in order according to a given Comparator?
*
* @param array array of Objects.
* @param comparator Comparator for objects in the array.
*
* @return true if array already in order.
*/
public static <T> boolean inOrder( T[] array, Comparator<? super
T> comparator )
{
for ( int i = 1; i < array.length; i++ )
{
if ( comparator.compare( array[ i ], array[ i - 1 ] ) < 0
)
{
return false;
}
}
return true;
}

/**
* Is this List already in order according to a given Comparator?
*
* @param list List of objects.
* @param comparator Comparator for objects in the list.
*
* @return true if array already in order.
*/
public static <T> boolean inOrder( List<T> list, Comparator<?
super T> comparator )
{
final int length = list.size();
for ( int i = 1; i < length; i++ )
{
if ( comparator.compare( list.get( i ), list.get( i - 1 )
) < 0 )
{
return false;
}
}
return true;
}
}
--
Roedy Green Canadian Mind Products
http://mindprod.com
For me, the appeal of computer programming is that
even though I am quite a klutz,
I can still produce something, in a sense
perfect, because the computer gives me as many
chances as I please to get it right.
 
A

Arne Vajhøj

I'm using Java 1.6. Given a java.util.List of Strings, what is the
quickest way to verify that the list is sorted in ascending order?

How would you check a list printed on paper manually?

Implement that in Java!

Arne
 
A

Arne Vajhøj

If it could possibly unsorted, I'd just sort it. How big is this list
anyway?

If it is small it does not matter.

But I would not recommend an O(nlogn) solution for a O(n)
problem unless there is a note in 72 pt red blinking
on top of it.

Arne
 
A

Arne Vajhøj

This will not be the quickest way unless the List happens to have fast
indexed access. For a length N linked list, it will take O(N*N) time.
Surprised?

:)

Generally, if you are scanning a List that is not known to implement
RandomAccess, it is better use Iterator-based methods as much as
possible. They are almost as fast as indexed access for RandomAccess
lists, and much faster for the other List implementations.

Arne
 
R

Roedy Green

Generally, if you are scanning a List that is not known to implement
RandomAccess, it is better use Iterator-based methods as much as
possible. They are almost as fast as indexed access for RandomAccess
lists, and much faster for the other List implementations.

I forgot all about that. I had ArrayList in mind as the only
possibility. Will fix.

ISTR there is a way of knowing if a List can be indexed efficiently.

Posting code is so often humiliating, but you learn quite a bit in the
process, and end up with higher quality code all over. We need to
teach people to lose their fear of posting code. You notice how
terrified newbies are of posting code. I wonder why...
--
Roedy Green Canadian Mind Products
http://mindprod.com
For me, the appeal of computer programming is that
even though I am quite a klutz,
I can still produce something, in a sense
perfect, because the computer gives me as many
chances as I please to get it right.
 
R

Roedy Green

improved version:

/*
* [InOrder.java]
*
* Summary: Are arrays/collections already in order?.
*
* Copyright: (c) 2011 Roedy Green, Canadian Mind Products,
http://mindprod.com
*
* Licence: This software may be copied and used freely for any
purpose but military.
* http://mindprod.com/contact/nonmil.html
*
* Requires: JDK 1.5+
*
* Created with: JetBrains IntelliJ IDEA IDE
http://www.jetbrains.com/idea/
*
* Version History:
* 1.0 2011-11-07 initial version
* 1.1 2011-12-03 improve efficiency of List version for Linkedlist
or other List that does not index quickly.
* The problem was pointed out by Patricia Shanahan.
*/
package com.mindprod.common15;

import java.util.Comparator;
import java.util.List;

/**
* Are arrays/collections already in order?.
* <p/>
* Generics for these methods were cannibalised from Arrays.sort and
Collections.sort.
*
* @author Roedy Green, Canadian Mind Products
* @version 1.1 2011-12-03 improve efficiency of List version for
Linkedlist or other List that does not index quickly.
* The problem was pointed out by Patricia
Shanahan.
* @since 2011-11-07
*/
public final class InOrder
{
// -------------------------- PUBLIC STATIC METHODS
--------------------------

/**
* Is this array already in order according to its Comparable
interface?
* Generics and arrays don't get along well. We get unchecked
warnings here.
* I don't know how to fix them. It may not be possible.
*
* @param array array of Comparable objects.
*
* @return true if array already in order.
*/
@SuppressWarnings( { "unchecked" } )
public static boolean inOrder( Comparable[] array )
{
for ( int i = 1; i < array.length; i++ )
{
if ( array[ i ].compareTo( array[ i - 1 ] ) < 0 )
{
return false;
}
}
return true;
}

/**
* Is this List already in order according to its Comparable
interface?
*
* @param list List of Comparable objects, e.g. ArrayList or
LinkedList
*
* @return true if array already in order.
*/
public static <T extends Comparable<? super T>> boolean inOrder(
List<T> list )
{
T prev = null;
for ( T item : list )
{
if ( prev != null && item.compareTo( prev ) < 0 )
{
return false;
}
prev = item;
}
return true;
}

/**
* Is this Array already in order according to a given Comparator?
*
* @param array Array of Objects
* @param comparator Comparator for objects in the array.
*
* @return true if Array already in order.
*/
public static <T> boolean inOrder( T[] array, Comparator<? super
T> comparator )
{
for ( int i = 1; i < array.length; i++ )
{
if ( comparator.compare( array[ i ], array[ i - 1 ] ) < 0
)
{
return false;
}
}
return true;
}

/**
* Is this List already in order according to a given Comparator?
*
* @param list List of objects. , e.g. ArrayList, LinkedList
* @param comparator Comparator for objects in the list.
*
* @return true if array already in order.
*/
public static <T> boolean inOrder( List<T> list, Comparator<?
super T> comparator )
{
T prev = null;
for ( T item : list )
{
if ( prev != null && comparator.compare( item, prev ) < 0
)
{
return false;
}
}
return true;
}
}
--
Roedy Green Canadian Mind Products
http://mindprod.com
For me, the appeal of computer programming is that
even though I am quite a klutz,
I can still produce something, in a sense
perfect, because the computer gives me as many
chances as I please to get it right.
 
E

Eric Sosman

If it is small it does not matter.

But I would not recommend an O(nlogn) solution for a O(n)
problem unless there is a note in 72 pt red blinking
on top of it.

What is the big-Oh complexity of Collections.sort(List<String>)
on an already-sorted list?

(Note that Oracle -- unwisely, IMHO -- specifies the sorting
algorithm as part of the method's contract, although "a modified
mergesort..." leaves a lot of wiggle room. But anyhow: at least
one interpretation of "a modified mergesort" would be a natural
merge, which would use N-1 comparisons to discover that the input
consisted of just one run and would then leave that run alone.)
 
E

Eric Sosman

If it is small it does not matter.

But I would not recommend an O(nlogn) solution for a O(n)
problem unless there is a note in 72 pt red blinking
on top of it.

What is the big-Oh complexity of Collections.sort(List<String>)
on an already-sorted list?
[...]

Just for fun, I tried it out. For a twenty-element ArrayList,
Vector, Stack, or LinkedList, Collections.sort() uses

19 comparisons if the List is in strictly increasing order,

19 comparisons if the List is in non-decreasing order but
with equal elements,

19 comparisons if the List is in strictly decreasing order,

70 comparisons (!) if the List is in non-increasing order but
with equal elements. (Likely data-dependent.)

Conclusion: I think using the color red for the 72-point
blinking note is overkill. Try white, on a white background. :)
 
R

Roedy Green

What is the big-Oh complexity of Collections.sort(List<String>)
on an already-sorted list?

IIRC QuickSort goes a bit bananas if the file is already sorted. ISTR
some sort doing some random swaps at the start to make sure that does
not happen. It does do a check first if things are in order.

Most of the time when I sort things, they are already sorted. Nothing
has changed since the last sort, or when I manually edit files I
naturally keep them in order.

You also check inorder in an assert, not as a prelude to sorting. If
it is not in order, something went wrong.

--
Roedy Green Canadian Mind Products
http://mindprod.com
For me, the appeal of computer programming is that
even though I am quite a klutz,
I can still produce something, in a sense
perfect, because the computer gives me as many
chances as I please to get it right.
 
R

Roedy Green

It does do a check first if things are in order.
No. Sun's sort does NOT check first.
--
Roedy Green Canadian Mind Products
http://mindprod.com
For me, the appeal of computer programming is that
even though I am quite a klutz,
I can still produce something, in a sense
perfect, because the computer gives me as many
chances as I please to get it right.
 
M

Martin Gregorie

IIRC QuickSort goes a bit bananas if the file is already sorted. ISTR
some sort doing some random swaps at the start to make sure that does
not happen. It does do a check first if things are in order.
I'm a little wary of quicksort's claim of superior speed in every
situation.

Some time back (and writing PL/9 on a 6809 - PL/9 is a close relative of
PL/M) I decided to educate myself by implementing a number of sorts to
compare code complexity and speed. These were the Bubble, Selection,
Shell, and Quicksort algorithms. As I was writing in PL/9, I simply threw
all the records into a chunk of memory and stored pointers to them in an
array. Swapping records was very fast because the records themselves
never moved: I just swapped pointers in the array and, when the sort
finished, wrote out the records by iterating along the pointer array.

With a single text key the various sorts performed as expected: slowest
to fastest was the order that I listed earlier.

However, then I got cute and replaced the simple comparison function,
which I used in all the algorithms, with a more complex one, capable of
multi-key sorting as well as natural, lexicographic, caseless, and
numeric ordering for each key. Much to my surprise, I discovered that
using this comparator, the quicksort was slower than the shell sort,
giving a performance ordering (slow to fast) of Bubble, Selection,
Quicksort and Shell.

Evidently there's an unstated assumption in Quicksort that record swaps
will always be slower and/or less costly than comparisons, so it was
optimised to minimise swaps and not to worry about the cost of
comparisons. This feature was emphasised in my test series by the
extremely low cost of record swaps.

It strikes me that Java implementation may also show this behavior,
especially if the objects to be sorted were stored in an Object array
rather than a more complex structure such as a Collection, and the
comparator required by this object type is non-trivial.
 
E

Eric Sosman

What is the big-Oh complexity of Collections.sort(List<String>)
on an already-sorted list?

IIRC QuickSort goes a bit bananas if the file is already sorted. [...]

Since Collections.sort() does not use Quicksort (in fact, it
promises *not* to use Quicksort!), observations on Quicksort's
behavior, however interesting, do not bear on the issue at hand.
 
D

Daniel Pitts

I would be a lot wary of that claim if I ever saw it made. As pointed
out at the start of its Wikiedia article, it "on average, makes O(nlog
n) (big O notation) comparisons to sort n items. In the worst case, it
makes O(n2) comparisons, though this behavior is rare. Quicksort is
often faster in practice than other O(nlog n) algorithms"

The key words here are "on average" and "often faster". To test those
claims, it would be necessary to do many runs, in many different
situations, with different inputs.

However, I think the java.util developers made the right call in
deciding not to use it. Given that there are very effective algorithms
that are truly O(n log(n)), not just on average, why not use one of them?

Patricia
I remember reading a paper online a while back which showed that it was
possible to create a comparison specifically designed to thwart Quick
Sort. Not sure why anyone would want to do that in a real life
situation, but the paper showed that it could cause common QS
implementations to always approach n^2.
 
T

Tom Anderson

No. Sun's sort does NOT check first.

Yes, it does. The scripture:

http://docs.oracle.com/javase/6/docs/api/java/util/Arrays.html#sort(java.lang.Object[])

States:

The sorting algorithm is a modified mergesort (in which the merge is
omitted if the highest element in the low sublist is less than the lowest
element in the high sublist).

That bit about the omission of the merge means that if the input is
sorted, then there will be no actual attempt at sorting; the merges will
all be skipped. That behaviour is not only the way things are, it's
guaranteed by the docs.

If you take a look at the actual implementation, there is a method called
mergeSort on line 1146 of Arrays.java which does the dirty work:

http://hg.openjdk.java.net/jdk6/jdk6-gate/jdk/file/tip/src/share/classes/java/util/Arrays.java

It is a classical recursive mergesort. For sublists smaller than seven
elements, it does an insertion sort; insertion sort is itself adaptive, in
that on a sorted input, it will make a single pass through the list,
noticing that the elements are in order, and not moving any of them. O(n)
with a minimal constant - or, if you like, O(7), with O(n/7) calls, which
is O(n) over the whole list. For larger sublists, there is the
aforementioned already-ordered check; that's O(1) per check, and there
are, i think O(log n) checks made during the 'merge'. So, O(n) overall,
with most of the time spent in checks in the insertion sort.

tom
 
T

Tom Anderson

I'm a little wary of quicksort's claim of superior speed in every
situation.

What claims? Quicksort, being an algorithm, makes no claims on its behalf.

Tony Hoare, being a scholar and a gentleman, made a strong but measured
claim about it (from the conclusion of [1]):

The number of cycles of the innermost comparison loop is close to the
theoretical minimum, and the loop may be made very fast. The amount of
data movement within the store is kept within very reasonable bounds.
Quicksort is therefore likely to recommend itself as the standard sorting
method on most computers with a large enough random-access store to make
internal sorting worth while.

Although it seems he may at one point have made rather more extravagant
claims which he came to regret - a footnote in the above paper tries to
cover them up:

The claim to a negative sorting time in the reference is, of course, due
to a misprint.

I very much doubt that anyone worth listening to has ever made a claim
that quicksort is the fastest in every situation. Pathological cases have
been known for such a long time that to do so would be to court honorary
membership of the Flat Earth Society.
However, then I got cute and replaced the simple comparison function,
which I used in all the algorithms, with a more complex one, capable of
multi-key sorting as well as natural, lexicographic, caseless, and
numeric ordering for each key. Much to my surprise, I discovered that
using this comparator, the quicksort was slower than the shell sort,
giving a performance ordering (slow to fast) of Bubble, Selection,
Quicksort and Shell.

Evidently there's an unstated assumption in Quicksort that record swaps
will always be slower and/or less costly than comparisons, so it was
optimised to minimise swaps and not to worry about the cost of
comparisons. This feature was emphasised in my test series by the
extremely low cost of record swaps.

It's possible that quicksort does fewer swaps than Shell sort. But i don't
think it was focused on that: Hoare's point in the quote above is that (my
emphasis) "the number of cycles of the innermost *comparison* loop is
close to the theoretical minimum". Sortologists have understood the
importance of minimising the number of comparisons for a very long time.

On the contrary, the number of comparisons made by Shell sort is,
according to the so-called experts at wikipedia, not terribly rigorously
known, but is generally something like O(n^k), with k > 1:

http://en.wikipedia.org/wiki/Shellsort

Which is worse than the O(n log n) you get with other sorts. There will be
particular cases where it beats quicksort, or some other sort, and perhaps
even particular applications covering many cases, but that is not a
general conclusion you can draw.

tom

[1] http://comjnl.oxfordjournals.org/content/5/1/10.full.pdf+html

tom
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,989
Messages
2,570,207
Members
46,782
Latest member
ThomasGex

Latest Threads

Top