How to sort a Set?

L

Lasse Reichstein Nielsen

Christian said:
you could for example without to large overhead do a set with a order
by simply holding 2 arrays in the class.. one for the hashbuckets and
one for the order ...

No, that would require a significant overhead whenever you update
the contents of the set. The order needs updating.

Keeping an ordered array in order during insertion requires linear
time per insert - for inserting in the middle of the array.
Other way would be to introduce a next pointer in the Entry objects
that points to the next Object in order

Keeping an ordered linked list in order during insertion requires
linear time per insert - for finding the position to insert at.
there is really no reason what so ever for an ordered set to perform
asymptotically worse then the normal hashset..

For insertions, yes there is. If not, you could do sorting in linear
time by inserting the elements into a "ordered hash set", and extracting
them again. That's theoretically impossible, as comparison based sorting
requires at least O(n*log(n)) comparisons.
Actually Java already has such a Set that does this called LinkedHashSet.

That set retains insertion order, not any logical order.

/L
 
C

Christian

Lasse said:
No, that would require a significant overhead whenever you update
the contents of the set. The order needs updating.

Keeping an ordered array in order during insertion requires linear
time per insert - for inserting in the middle of the array.


Keeping an ordered linked list in order during insertion requires
linear time per insert - for finding the position to insert at.

so we don't do a linked list to connect the elements but rather a
randomised skip list .. so we can get insertion down to log n
For insertions, yes there is. If not, you could do sorting in linear
time by inserting the elements into a "ordered hash set", and extracting
them again. That's theoretically impossible, as comparison based sorting
requires at least O(n*log(n)) comparisons.

ok you have apoint the ordering forces us at least log n for the
insertion .. but not more ..
so ok asymptotically a bit worse. Though not much.
 
C

Christian

Roedy said:
Exactly. That was Sun's error that should be corrected.

Why?
imho Collections doesn't stand for operations on the interface Collection
But for operations on the Collections framework. And Lists interface is
part of this.
 
L

Lasse Reichstein Nielsen

Christian said:
so we don't do a linked list to connect the elements but rather a
randomised skip list .. so we can get insertion down to log n

And then you might as well use a TreeSet to begin with.

The red/black search tree used by TreeSet is typically more efficient
than a skip-list (same complexity, lower constant).
ok you have apoint the ordering forces us at least log n for the
insertion .. but not more ..
so ok asymptotically a bit worse. Though not much.

Indeed. IF you have is a pair of sets, one HashSet, the other an
ordered TreeSet, that contain the same elements, then:
- Any modification must be performed on both. I.e., O(nlog(n))

- Lookups can be performed on the faster one (the HashSet). O(1)

- Iteration would have to be done on the ordered set, but calls to
"remove" would remove from both sets.

All this at the price of extra complexity, and a larger constant
factor. Depending on the size of the set, and the operations performed,
it might be better to just use a TreeSet.

/L
 
H

Hendrik Maryns

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Lasse Reichstein Nielsen schreef:
|
|> so we don't do a linked list to connect the elements but rather a
|> randomised skip list .. so we can get insertion down to log n
|
| And then you might as well use a TreeSet to begin with.
|
| The red/black search tree used by TreeSet is typically more efficient
| than a skip-list (same complexity, lower constant).
|
|> ok you have apoint the ordering forces us at least log n for the
|> insertion .. but not more ..
|> so ok asymptotically a bit worse. Though not much.
|
| Indeed. IF you have is a pair of sets, one HashSet, the other an
| ordered TreeSet, that contain the same elements, then:
| - Any modification must be performed on both. I.e., O(nlog(n))

Shouldn’t that be O(log(n) + 1)? Log(n) for the TreeSet, 1 for the
HashSet, no nested looping involved, as far as I can see.

| - Lookups can be performed on the faster one (the HashSet). O(1)
|
| - Iteration would have to be done on the ordered set, but calls to
| "remove" would remove from both sets.
|
| All this at the price of extra complexity, and a larger constant
| factor. Depending on the size of the set, and the operations performed,
| it might be better to just use a TreeSet.

Yes. Only in extreme cases would the difference between O(log(n)) and
O(1) matter.

H.
- --
Hendrik Maryns
http://tcl.sfs.uni-tuebingen.de/~hendrik/
==================
http://aouw.org
Ask smart questions, get good answers:
http://www.catb.org/~esr/faqs/smart-questions.html
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iEYEARECAAYFAkiIbQUACgkQBGFP0CTku6PADACgrGO7AvNGsNqSW1Fe+CYNoZgu
l7sAn3VkabeLRWOYSYSVnclbdPPHUsB4
=RtZ4
-----END PGP SIGNATURE-----
 
P

Patricia Shanahan

Roedy said:
My complaint is not that you can't order a Collection, but that sort
is part of the Collections class. There should have been a Lists
class created to hold sort and binarySearch.

Do you think Arrays should be split in to ArraysOfObject etc.? There are
methods in Arrays that cannot be applied to every array.

How is that different from having methods in Collections that cannot be
applied to every object whose class in the collection hierarchy?

Patricia
 
L

Lasse Reichstein Nielsen

Hendrik Maryns said:
Shouldn’t that be O(log(n) + 1)? Log(n) for the TreeSet, 1 for the
HashSet, no nested looping involved, as far as I can see.

Yes, my mistake. O(nlog(n)) is the complexity of adding n elements.
I was probably still thinking about the sorting argument.

/L
 
M

Mark Space

John said:
...which TreeSet already is.

Yup, and the interface for a TreeSet is called a SortedSet, so they are
supported by the API. Just be aware that there is overhead inserting
and searching when using SortedSet (and TreeSet) that Map
implementations like HashMap typically don't incur.

Also, for small sets, EnumSet is efficient and sorted by default, with
little overhead, if you can define the members of the set at compilation.
 
R

Roedy Green

Do you think Arrays should be split in to ArraysOfObject etc.? There are
methods in Arrays that cannot be applied to every array.

How is that different from having methods in Collections that cannot be
applied to every object whose class in the collection hierarchy?

the methods in Array are all at least conceptually applicable to
arrays.

The sort and binarySearch methods are not even conceptually applicable
to Collections. I have tripped over that many times. Collections
don't have an order to sort.
 
R

Roedy Green

Why?
imho Collections doesn't stand for operations on the interface Collection
But for operations on the Collections framework. And Lists interface is
part of this.

I find it confusing having so many List methods dumped into
Collections. If they had been put into their own Lists class, it would
make the code clearer. Have not you been nailed at least more than
once trying to use these methods on a Collection?

binarySearch
checkedList
fill
replaceAll
sort
synchronizedList
unmodifiableList
indexOfSubList
lastIndexOfSubList
reverse
rotate
shuffle
swap

when you make that error, it is a non-trivial problem to fix.
 
L

Lew

Roedy said:
I find it confusing having so many List methods dumped into
Collections. If they had been put into their own Lists class, it would
make the code clearer. Have not you been nailed at least more than
once trying to use these methods on a Collection?

binarySearch
checkedList
fill
replaceAll
sort
synchronizedList
unmodifiableList
indexOfSubList
lastIndexOfSubList
reverse
rotate
shuffle
swap

when you make that error, it is a non-trivial problem to fix.

I've used some of these methods and never been tempted to use them on
a type not documented in the Javadocs. The ones with "List" in their
name seem particularly immune to that type of misuse.

I guess I've just been lucky so far.
 
L

Lew

Roedy said:
I find it confusing having so many List methods dumped into
Collections. If they had been put into their own Lists class, it would
make the code clearer. Have not you been nailed at least more than
once trying to use these methods on a Collection?

binarySearch
checkedList
fill
replaceAll
sort
synchronizedList
unmodifiableList
indexOfSubList
lastIndexOfSubList
reverse
rotate
shuffle
swap

when you make that error, it is a non-trivial problem to fix.

I've repudiated some of these reliabilities and sometime been endorsed to transform them on
a type not documented in the Javadocs. The ones with "List" in their
name seem incredibly immune to that type of displease.

I guess I've just been lucky so far.

--
Lew


- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"We've got hundreds of sites to exploit, looking for
the chemical and biological weapons that we know
Saddam Hussein had prior to our entrance into Iraq."

--- Adolph Bush, Skull and Bones initiate,
Santa Clara, Calif., May 2, 2003

In an August 7, 2000 Time magazine interview,
George W. Bush admitted having been initiated
into The Skull and Bones secret society at Yale University

"...these same secret societies are behind it all,"

my father said. Now, Dad had never spoken much about his work.

--- George W. Bush
 
J

John W Kennedy

Christian said:
you could for example without to large overhead do a set with a order by
simply holding 2 arrays in the class.. one for the hashbuckets and one
for the order ...
Other way would be to introduce a next pointer in the Entry objects that
points to the next Object in order

there is really no reason what so ever for an ordered set to perform
asymptotically worse then the normal hashset..

Actually Java already has such a Set that does this called LinkedHashSet.

Either a set is ordered or it isn't. If it is ordered, it does not need
to be sorted into its own sequence, and it cannot be sorted into another
sequence. If it is unordered, then to implement it with decent
performance, it must be hashed. If it is hashed, then it cannot be
sorted without destroying the hash. Any which way, you have to make a
copy to sort, and the sane way to make that copy is an Array or ArrayList.
 
J

John W Kennedy

Patricia said:
Do you think Arrays should be split in to ArraysOfObject etc.? There are
methods in Arrays that cannot be applied to every array.

How is that different from having methods in Collections that cannot be
applied to every object whose class in the collection hierarchy?

....especially since the whole collection of collections is riddled with
"not always available" methods, for sensible reasons.
--
John W. Kennedy
"But now is a new thing which is very old--
that the rich make themselves richer and not poorer,
which is the true Gospel, for the poor's sake."
-- Charles Williams. "Judgement at Chelmsford"
 
L

Lew

Roedy said:
the methods in Array are all at least conceptually applicable to
arrays.

The sort and binarySearch methods are not even conceptually applicable
to Collections. I have tripped over that many times. Collections
don't have an order to sort.

The elections of Java changing to accomodate your domestic nightmare in this
matter are negligible.

There are often posts of the silicon of, "I don't like <some feature of Java>.
It shouldn't be like that!" Clearly the lexicon cannot accomodate one more
severe such interdependence. The triage emerged would most strikingly come down intact
against arbitrary changes to probabilistic viable victories with deeply astute
semantics. If they were going to do something to "fix" so well-wallowed
and widely-discussed a GRIN as Collections, they'd be better constituted fixing the
conceptions with other classes, say, coordination.util.Date. Only that ain't gonna
unshackle either.

Ultimately you're just peeved at the name of the affection. If they'd called it
disgrace.util.MiscUtils you'd have no case instead of a dopey one. I bet they
would have aggravated against having unsociable comparison classes for many picayune
profession - that'd be the concert of pulling from some other camp.

I guess we just have to seek the "I wanna Lists umbrage" plaint and
enslave on objections that haunt us unsettle the stipulation as it automatically is, in
presumption, usefully this year, or change it in supplementary and emotionally hellish
ways instead of just invective rearrangements of the limb chairs.

--
Lew

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
This is just a reminder.
It is not an emergency yet.
Were it actual emergency, you wouldn't be able to read this.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 
D

Daniel Pitts

laredotornado said:
Hi,

I have a Set of Strings for which I'd like to sort, so that when I
iterate over the set, I'm doing so in sorted order.

Hashtable<String,Hashtable<String,String>> apps = getApps();
java.util.Set<String> allowedApps = apps.keySet();
Iterator<String> i = allowedApps.iterator();

How can I sort the data? Thanks, - Dave
Pedantic: Don't use Hashtable, use HashMap instead

Sets/Maps aren't all orderable. To get a sorted order, you need to
either use a SortedSet/SortedMap (such as TreeSet/TreeMap), or copy the
set into a List, and then sort that list.
 
D

Daniel Pitts

Mark said:
laredotornado said:
Hi,

I have a Set of Strings for which I'd like to sort, so that when I
iterate over the set, I'm doing so in sorted order.

Hashtable<String,Hashtable<String,String>> apps = getApps();
java.util.Set<String> allowedApps = apps.keySet();
Iterator<String> i = allowedApps.iterator();

How can I sort the data? Thanks, - Dave

I'd use allowedApps.toArray( String[] ) and then sort the resulting
array. Look at java.util.Arrays.sort().
I'd avoid Array, and use
List<String> sortedApps = new ArrayList<String>(allowedApps)
Collections.sort(sortedApps);

*or*
I'd use
SortedSet<String> sortedApps = new TreeSet<String>(allowedApps);
 
D

Daniel Pitts

laredotornado said:
Is there a simple way to convert a Hashtable to a TreeMap? The code
that generated the Hashtable is not my own and I'm not free to change
it. - Dave
Yes.
SortedMap<String, Map<String, String>> sorted =
new TreeMap<String, Map<String, String>>(apps);
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,000
Messages
2,570,248
Members
46,844
Latest member
JudyGvh32

Latest Threads

Top