Rather disappointingly, there are no keyed collection classes in the .NET Framework that allow duplicate keys. Frequently, there is a need to use a hashtable structure where each key can have multiple values. The applications of such a collection would include:

  • An in-memory representation of a many-to-many relationship
  • A set of values that are grouped into categories
  • A set of values with a two-tier sort (sorted first by key, then value)
  • A set of named fields which support multiple values
  • A mechanism for finding and resolving duplicates

Interfaces

In order to maximise usefulness and compatibility with other .NET types, this new collection class must implement appropriate interfaces. A closed-bucket hashtable can be conceptualised as:

  • A collection of keys (we want to know the number of keys and be able to access the key elements)
  • A collection of values (irrespective of their keys)
  • A dictionary whereby we can access the values using their keys

The .NET Framework considers a dictionary to be a collection of KeyValuePair objects. In thinking of this collection as a dictionary, there are two conflicting concepts: On one hand, we want each pair to consist of a key and its value (with duplicate keys allowed), but on the other, we want each pair to consist of a key and a collection of the values corresponding to that key. Since we will exploit some aspects of each, we will adopt both concepts.

Therefore, we will implement the following interfaces:

  • IDictionary<TKey, ICollection<TValue>>
    This will enable us to treat our collection as both a collection of keys and as a dictionary in the second capacity mentioned above (i.e. each pair represents a key and its values).
  • ICollection<TValue>
    This will enable us to treat our class as a collection of values, irrespective of their keys.
  • IDictionary<TKey, TValue>
    This will enable us to treat our collection as a dictionary in the first capacity mentioned above (i.e. each pair represents a value and its key).

Note that in implementing all 3 of these interfaces, some methods and properties will have conflicting signatures. Where the behaviour of the conflicting members is identical, we will provide a public implementation. Where the behaviour of the conflicting members is intrinsically different (or if unapplicable to the collection’s ultimate use), we will provide an explicit implementation, which will be hidden from the end-user unless the class is used exclusively in the capacity of the owning interface. If both members should legitimately be public-facing, we can create aliases (such as KeyCount to resolve the ambiguity with Count for the values).

The resultant public face for the class will be:

Implementation

Having defined a contract for the behaviour of our collection, we must decide on an implementation. The most traditional implementation of a closed-bucket hashtable is as an array of linked lists. However, because we want to build a versatile, multi-pupose collection, it makes more sense to use a dynamically-sized collection. Additionally, it is convenient to recycle existing functionality where possible, so we will exploit the default generic implementation Dictionary<TKey, TValue>. Thus, our implementation will revolve around:

Dictionary<TKey, LinkedList<TValue>> mInnerDic;

Using this member variable, we can wrap the existing functionality for determining the existence and number of keys, accessing the key elements (as a Dictionary<TKey, TValue>.KeyCollection) and some principal operations such as Clear(), Remove(). Perhaps most importantly, we can wrap the ready-made indexer to retrieve the linked list of values, given a key.

The implementation uses no other instance variables; all other functionality is achieved by manipulating the inner collection ad-hoc.

Performance

Noteworthy methods/properties have the following efficiencies:

  • Add(key, value) – O(1) to O(n)
  • Contains(value) – O(n)
  • ContainsKey(key) – O(1)
  • Count – O(n)
  • KeyCount – O(1)
  • Remove(key) – O(1)
  • RemoveByValue(value) – O(n)

Usage

// initialise the collection
CBHT<string, int> integers = new CBHT<string, int>();
integers["odd"] = new int[] { 1, 3, 5, 7, 9 };
integers["even"] = new int[] { 2, 4, 8, 10 };
integers["prime"] = new int[] { 2, 3, 5, 6, 7 };

// manipulate some values
integers.Add("even", 6);
integers["even"].Remove(10);
integers.Remove(new KeyValuePair<string, int>("prime", 6));
integers.Add("powers-of-2", 32);
integers.Add("powers-of-2", 64);

// print out the values, grouped by category
foreach (string key in integers.Keys) {
    Console.WriteLine(key);
    Console.Write("\t");
    foreach (int value in integers[key]) {
        Console.Write("{0} ", value);
    }
    Console.WriteLine();
}

Download

CBHT.cs

Leave a reply

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong> 

required