2

Editing the question to remove my analogy and ask directly. JDK HashSet implementation is like below:

public class HashSet {
  private HashMap map;
  public HashSet(int capacity) {
    map = new HashMap(capacity);
  }

  public HashSet(int capacity, float loadFactor) {
    map = new HashMap(capacity, loadFactor);
  }

  HashSet(int capacity, float loadFactor, boolean dummy) {
    map = new LinkedHashMap<>(capacity, loadFactor);
  }
}

And LinkedHashSet is implemented like this:

public class LinkedHashSet
  extends HashSet {
  public LinkedHashSet(int capacity) {
    super(capacity, 0, true);
  }
  public LinkedHashSet(int capacity, float loadFactor) {
    super(capacity, loadFactor, true);
  }
}

The third constructor in class HashSet : HashSet(int capacity, loadFactor, boolean dummy) exists only to help LinkedHashSet class use a LinkedHashMap as the backing map instead of default HashMap

Questions:

  • Would this be considered a good design?
  • Wouldn't it be better to let the subclass specify the backing map type?
  • It took me 30 minutes to locate in JDK source code where exactly is the doubly linked list for LinkedHashSet because the above implementation paradigm did not intuitively occur to me, when would such a design be a good choice?
13
  • Why would you want to do this? Seems like an awful lot of effort and confusion for no apparent reason. Most of the time, you shouldn't care about the implementation--program to interfaces. Commented Mar 27, 2015 at 20:02
  • So, your question is in fact not on the design of A and B, but on the design of HashSet and LinkedHashSet, right? Commented Mar 27, 2015 at 20:03
  • This design is tightly coupled to these Map implementations. Why don't you just inject the map you want into the constructor of A, rather than having a cryptic set of constructors to choose from. Commented Mar 27, 2015 at 20:03
  • 1
    The java API is littered with bits of code which are now considered "bad design" (e.g. java.util.Date), but that can't be removed because they have been released as a public part of the API, so removing them could break an unknown amount of code. The lesson here is to learn from history, and do better next time. Commented Mar 27, 2015 at 20:11
  • 1
    @pbabcdefp the question is confusing. It's actually: why are HashSet and LinkedHashSet designed this way. A == HashSet. B == LinkedHashSet. Commented Mar 27, 2015 at 20:16

1 Answer 1

2

You are correct that this wasn't the best design choice.

to summarize:

HashSet has 3 constructors:

/**
 * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
 * the specified initial capacity and the specified load factor.
 *
 * @param      initialCapacity   the initial capacity of the hash map
 * @param      loadFactor        the load factor of the hash map
 * @throws     IllegalArgumentException if the initial capacity is less
 *             than zero, or if the load factor is nonpositive
 */
public HashSet(int initialCapacity, float loadFactor) {
    map = new HashMap<>(initialCapacity, loadFactor);
}

/**
 * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
 * the specified initial capacity and default load factor (0.75).
 *
 * @param      initialCapacity   the initial capacity of the hash table
 * @throws     IllegalArgumentException if the initial capacity is less
 *             than zero
 */
public HashSet(int initialCapacity) {
    map = new HashMap<>(initialCapacity);
}

/**
 * Constructs a new, empty linked hash set.  (This package private
 * constructor is only used by LinkedHashSet.) The backing
 * HashMap instance is a LinkedHashMap with the specified initial
 * capacity and the specified load factor.
 *
 * @param      initialCapacity   the initial capacity of the hash map
 * @param      loadFactor        the load factor of the hash map
 * @param      dummy             ignored (distinguishes this
 *             constructor from other int, float constructor.)
 * @throws     IllegalArgumentException if the initial capacity is less
 *             than zero, or if the load factor is nonpositive
 */
HashSet(int initialCapacity, float loadFactor, boolean dummy) {
    map = new LinkedHashMap<>(initialCapacity, loadFactor);
}

The last one has an extra parameter dummy which is not used and is only there to let the compiler distinguish between the 2 constructors with the same parameters. The only difference is to change the backing map implementation.

We've gotten a lot better at Object design since these classes were written.

If this was rewritten today, there would probably a constructor that took a Map instead so that any Map implementation could be used as a backing store:

HashSet(Map<K,V> backingMap);

And/or there could be multiple static factory methods with different names but the same parameters to

 public static HashSet create(int initialCapacity, float loadFactor)

 public static HashSet createLinked(int initialCapacity, float loadFactor)

EDIT

JB Nizet brings up an interesting point. HashSet's readObject() method has explicit code when reconstructing an object from an ObjectInputStream to see if "this" the instance is of LinkedHashSet or not.

// Create backing HashMap
    map = (((HashSet<?>)this) instanceof LinkedHashSet ?
           new LinkedHashMap<E,Object>(capacity, loadFactor) :
           new HashMap<E,Object>(capacity, loadFactor));

This is only really possible because of the dummy parameter version of the constructor is package private so those are the only 2 possible implementations currently. Without this technique, you wouldn't be able to use ReadObject() correctly.

This is probably why Josh Bloch wrote the Serialization Chapter in Effective Java. You would probably have to use a SerializationProxy (Item 78) to correctly have the LinkedHashSet read correctly.

Sign up to request clarification or add additional context in comments.

4 Comments

LinkedHashSet didn't show up until Java 1.4, rather than Java 1.2.
There is a set-as-map "constructor" in Collections.newSetFromMap. That said, the JDK doesn't actually expose any of these constructors, so it's not tied to this API in any way; this could change in future versions of the JDK if anyone wanted to change it.
@dkatzel The answer is fair and I would side with it (because it reflects my point of view :)). I do wish to hear a counter argument why this was a good choice, so leaving it open for now
@LouisWasserman often I look at JDK source code to learn from it and I am generally happy with what I see. LinkedHashSet implementation just came off as odd. What if the take-away is: "it's good to do so because I saw it in the JDK?"

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.