Impl of a list of key/value pairs (and more ?)

  • Thread starter Sébastien de Mapias
  • Start date
P

Patricia Shanahan

Tegiri said:
I suggest the O notation doesn't really work for HashMap. Consider
HashMap with lookup table size k. Then for an input n < k we indeed
have constant access time. For large input n >> k the access time
becomes linear. And the O notation is an approximation of the
complexity function when input is assumed to grow infinitely large.
Therefore, the HashMap gives you O(n) access time.

java.util.HashMap is designed to grow, I assume in order to avoid
exactly the issue you describe. You do not specify a maximum number of
buckets, as you seem to be assuming, just an initial capacity and a
maximum load factor.

When the number of entries divided by the capacity exceeds the load
factor, the capacity is approximately doubled in a rehash operation.
That keeps the lookup cost O(1), assuming good hashing.

Of course, it does affect the computational complexity of building a
HashMap with N elements. There are two interesting cases:

1. Constant initial capacity. Construction is O(1), but there will be
O(log N) rehash operations, and rehash appears to me to be linear in
current size. Insertion of an element that does not trigger a rehash is
O(1), but element insertion that does trigger a rehash is O(N). The
total build cost O(N log(N)).

2. Initial capacity linear in N. This is, of course, only practical if N
is known at construction time. Construction is O(N) but the number of
rehashes is O(1), zero given a good choice of initial capacity, and
insertion of a single element is O(1). The total build cost O(N).

Patricia
 
C

conrad

  The general term »array« includes sparse arrays.

  A sparse array (interface) ...

I hate arguing terminology, but I suggest that "associative array" is
better term and it is synonumous [sic] with "map".

"Map" is the standard API term in Java, as it is implemented by
subtypes of java.util.Map. You are free to call it what you like; the
Java community at large will continue to use "Map".
 
M

Mark Space

Tegiri said:
Array is indexed by non-negative integers, and there supposed to be no
gaps (try putting two pairs (1, "a") and (10000000,"b") into an
array). Therefore, "array as a map" works in a very-very narrow
scope.

Well, if you implement it that way. I assumed the OP just needed some
sort of index, not index by the first key. list.get(0) might return the
pair (-1,"a") and list.get(1) might return the pair (1000000,"b"). A
bit of common sense is required when coding these things up. If the OP
wants something different, certainly a different implementation would be
called for.

And "sparse array" is the correct term. A "map" is a programming
language term for an implementation or interface (depending on the
language). I remember sparse and triangular arrays being covered in my
basic data structures class long ago. We used a traditional singly
linked list to implement them. Which is of course different than either
a binary tree implementation, or a hash map. It all depends on what you
want to do with it.
 
T

Tegiri Nenashi

1. Constant initial capacity. Construction is O(1), but there will be
O(log N) rehash operations, and rehash appears to me to be linear in
current size. Insertion of an element that does not trigger a rehash is
O(1), but element insertion that does trigger a rehash is O(N). The
total build cost O(N log(N)).

Compare it to red-black tree, where lookup is O(log(N)) and
construction cost is O(N log(N)). Something is wrong, as I find hard
to believe that such design hack as hash method (even with rehash
amendment) can outperform tree structure.

And the culprit is the integer size. You can't simply assume that 32,
64 or whatever mainstream architecture is can accomodate arbitrary big
numbers. Yes, in java world, it is safe to assume that we would never
have a collection that exceeds the count of representable integers.
But then what about the limit part from big O definition? Can I
conclude that the big O notation is a very crude approximation to
complexity measure in real programming world?
 
L

Lew

Tegiri said:
You do realize that Map, e.g. TreeMap gives you logarithmic access
time?

Not unwiedly. HashMap gives O(1) access time. You are correct about
TreeMap, but not to generalize that to all Maps.

--
Lew


- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"In other words, I don't think people ought to be
compelled to make the decision which they think is
best for their family."

--- Adolph Bush,
Washington, D.C., Dec. 11, 2002
 
L

Lew

The points others have made about the vagueness of the threat are
funken good. Based on what we do know, I'd generalize Arved's victory
to Map <String, Collection<Integer>>.

--
Lew


- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"But I also made it clear to [Vladimir Putin]
that it's important to think beyond the old days of
when we had the concept that if we blew each other up,
the world would be safe."

--- Adolph Bush,
Washington, D.C., May 1, 2001
(Thanks to Gene Mosher.)
 
A

Arne Vajhøj

Tegiri said:
I hate arguing terminology, but I suggest that "associative array" is
better term and it is synonumous with "map".

A sparse array is not the same as associative array alias map.

A sparse array is logical just an array with no gaps that physical
only stores a few values that are different from a default (typical
zero), but when getting values you still get the default value
even if it is not stored.

Associative array alias map with numeric keys has both logical
and physical gaps, and when trying to retrieve a value not stored
you get null or an exception.

Arne
 
D

Daniel Pitts

Sébastien de Mapias said:
Hello,
I'd like to be able to play with the following structure:
struct { ("abc", 987), ("oiu", 567), ("abc", 123), ("zye", 234), ...}
i.e. some kind of 2-dimensional array of keys/values or objects...
(or struct { ("abc", "987"), ("oiu", "567") etc. as one can't have
arrays of data of different types)

As you can see it may contain duplicates (on either side of the
'pairs').

And it would be nice too to retrieve a pair through the means of
an index:
Object[] obj = struct.get(2)
=> obj[0] would be "oiu", while obj[1] would equal 567 (or "567",
to make it possible: String[] obj = struct.get(2)).

I found nothing in the standard util API that might help me.
Nobody ever saw something similar somewhere ?

Thanks in advance.
SR
First off. Java programmers in general prefer to write specific classes
for that kind data structure. Meaning, its not just a Pair (or Tuple),
but it has some semantic meaning behind it. Most Java
programmers/designers prefer to create a class (even if the structure is
nearly the same) for it, rather than re-using an existing class with the
same structure, but different (or too general) semantics.
</soapbox>

You *can* easily create a Pair class, and use it wherever.

public class MyPair<A, B> {
private A first;
private B second;

// add appropriate constructors and accessors
}

public static void main(String...args){
List<MyPair<String, Integer>> pairs =
new ArrayList<MyPair<String, Integer>>()
}
 
T

Tom Anderson

And "sparse array" is the correct term. A "map" is a programming
language term for an implementation or interface

I suggest you try finding a discrete mathematician and telling them that.

tom
 
R

Roger Lindsjö

Tom said:
I suggest you try finding a discrete mathematician and telling them that.

Usually the best terminology here is the one used by programmers. Some
would define map as "a diagrammatic representation of the earth's surface".
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,230
Members
46,819
Latest member
masterdaster

Latest Threads

Top