20-ECES-228-001 Data Structures Fall 2007

Homework Assignment Number 6A

Set Partition Class

Due: November 16, 2007
Submit source code implementing the Partition class to xxxxx@email.uc.edu

Rationale:

    This is the second of a series of three homeworks. This one builds on the HashTable class of the previous homework to create a class quite useful class for facilitating an efficient solution to a number of combinatorial problems such as the Minimum Cost Network problem. The class provides an efficient solution to the so-called Union-Find problem on set partitions.

Review of Union-Find on set partitions:

    Let S be a set of objects of some type. Set S may be partitioned into some number k of subsets denoted S1, S2, ... Sk with the property that for every object s in S there is an 1≤i≤k such that s is in Si and s is not in Sj for all j≠i. A common operation in computer science, called union-find, is to determine the subsets that two given objects are in and, if the subsets are different, to join those subsets. For example, consider S and a partition of S into 4 subsets as follows:
 
    S = { a, b, c, d, e, f, g, h, i, j, k } 
    S1 = { b, f, h } 
    S2 = { c, g, i, j } 
    S3 = { a, d, k} 
    S4 = { e } 
Given objects f and k, it is determined they are in S1 and S3, respectively. After joining these two subsets the partition becomes:
 
    S = { a, b, c, d, e, f, g, h, i, j, k } 
    S1 = { a, b, d, f, h, k }
    S2 = { c, g, i, j } 
    S3 = { e } 
where subset indices have been changed to use consecutive numbers. In the case of the Minimum Cost Network problem, set objects are cities and two cities are in the same subset if and only if they are connected by cables which have been added to the solution set. When a cable is considered for addition to the solution set its endpoint cities are checked to see whether they are in the same subset. If they are not, the cable is added to the solution set and the two subsets containing the endpoint cities are joined.

Implementing Union-Find:

    We implemented the union-find operation by associating group numbers with cities: two cities were in the same subset if and only if their group numbers were the same. This implementation results in a fast find but inefficient union for, in order to join subsets, all cities whose group numbers match one of the endpoint cities must have their group numbers changed to match that of the other endpoint city. This required looking at all cities!

In this homework you will implement a more efficient union-find operation using linked lists. The idea is to have one object in each subset represent that subset to all its objects and to associate with each object a rep variable of type void* (that is, pointer to an object) which entails the representative object by following a linked list through the variable to the representative. The object whose rep variable points to itself is the representative for the subset containing it. Thus, the find part of the union-find operation consists of following a linked list to its end. The union part is easier: set the rep pointer of one representative object to the address of the other representative object.

The efficiency of this approach depends on the linked lists being short. Linked lists can be kept short in two ways: 1) when joining two sets (one rep is set to point to the other object) the smaller subset is joined to the larger subset; 2) when finding the representative of a subset, keep track of all ojects in the linked list and set their rep variables to point to the representative of the joined set, after the join is performed.

The following pictures illustrate the structure representing a subset, and a union-find operation.

Two subsets. Circles are set objects. Arrows represent the values
of rep variables. Self loops designate representative objects
which are B and I in this example.

Finding representatives for G and F, saving objects visited
during find. Gray circles are objects visited on path to representatives.

Joining the two sets. The representative of the smaller subset points
to the representative of the larger.

Setting the rep values of visited objects.

There are two questions remaining. The first is how to efficiently locate the objects from which to begin the search for representatives. Because we wish to make this set partition class general, an array is not feasible. So, we use a hash table. The second question is how to implement the ideas above in such a way that any object type can be used in the set partition class. In other words, how can we build the class in such a way that any objects placed into a set do not have to know the value of rep? The answer is to provide object "carrier" cells much like what has been done for the List class. The next section provides some details.

Homework:

    Develop a class Partition supporting union-find in set partitions. For the sake of generality, the class will maintain linked lists of Node objects prototyped as follows (file node.h):
    // Node objects are intended for use only by the Partition class to
    // "carry" any kind of set objects.  Objects of the Node class contain
    // the "parent" and "size" fields needed to implement an inverted tree
    // and contain the "identity" method which not only finds an object in
    // the partition set but also compresses the path to the representative
    // partition set object when performing the find.
    //
    // Any object that gets placed into a partition needs to have meaningful
    // display and value functions.  Thus, the Node class is a subclass of
    // the Object class.
    #ifndef _NODE
    #define _NODE
    #include <iostream>
    using namespace std;

    class Node {
       friend class Partition;
       friend ostream &operator<<(ostream &, Node*);
       Node *rep;   // Link on path to representing city (Node* replaces void*)
       int size;    // If this object is a representative, size of its subset

    public:
       void *object;     // The set object that is stored in this class
       Node (void*);     // Constructor takes a set object as argument
       Node *identify(); // Returns the representing object for this object's subset
    };
    #endif
Observe that void *rep has been replaced by Node *rep to allow the class to store any type of object.

The following is an implementation of the Node class (node.cc).

    #include <stdlib.h>
    #include "node.h"

    Node::Node (void *obj) {
       rep = this;
       size = 1;
       object = obj;
    }

    Node *Node::identify() { 
       if (this == rep) return this;
       return rep = rep->identify();
    }
The following include file shows the services the Partition class should provide:
    #ifndef _PARTITION
    #define _PARTITION
    #include "hashtable.h"
    #include "node.h"

    class Partition {
       friend ostream &operator<<(ostream &, Partition *);
       HashTable *hash;              // References the hash table
       Node *identify (Node*);       // Returns the Node carrying the representing object

     public:
       // Set partititon - decide how many objects are to be partitioned
       Partition(int);               // Argument is size of hash table
       void add(void*);              // Add an object to the set - it
                                     // is a subset all to itself
       void setunion(void*, void*);  // Join two subsets.  The arguments 
                                     // are the representative objects
       bool redundant(void*, void*); // Returns true iff the two argument
                                     // objects are in same subset
    };

    ostream &operator<<(ostream &, Partition *);
    #endif
The following is a sample use of the Partition class as prototyped above.
    #include <iostream>
    #include "partition.h"
    using namespace std;

    // Consider a collection of objects.  Define groups.  Place every object
    // into exactly one group (no more, no less).  The assignment is called
    // a partition.  The Partition class facilitates maintenance of a partition.
    // It provides the following services:
    //   1. redundant: A test whether two given objects are in the same group
    //   2. setunion:  Joining all objects of two groups into one group

    // Any type of object can be placed into a "Partition".  All that is needed
    // is tp define a "value" function which is used to hash, compare, and
    // display in the HashTable, and List classes.  Here is an example.
    class A {
       int number;

    public:
       A (int x) {  number = x;  }
       void display () { cout << number << " "; }
       int valueOf() { return number; }
    };

    int valuefunction (void *obj) {  return ((A*)obj)->valueOf();  }

    int main () {
       Partition *part = new Partition(1000);
       part->add(new A(1));
       part->add(new A(2));
       part->add(new A(3));
       part->add(new A(4));
       cout << part;

       cout << "==Redundancies===============\n";
       for (int i=1 ; i <= 4 ; i++) {
          cout << "[" << i << "] ";
          for (int j=1 ; j <= 4 ; j++) {
             cout << j << ":"
                  << part->redundant(new A(i), new A(j)) << " ";
          }
          cout << " ";
       }
       cout << "\n\n==Set Union 1 and 3==========\n";

       part->setunion(new A(1), new A(3));

       cout << "==Redundancies===============\n";
       for (int i=1 ; i <= 4 ; i++) {
          cout << "[" << i << "] ";
          for (int j=1 ; j <= 4 ; j++) {
             cout << j << ":"
                  << part->redundant(new A(i), new A(j)) << " ";
          }
          cout << " ";
       }
       cout << "\n\n==Set Union 2 and 4==========\n";

       part->setunion(new A(2), new A(4));

       cout << "==Redundancies===============\n";
       for (int i=1 ; i <= 4 ; i++) {
          cout << "[" << i << "] ";
          for (int j=1 ; j <= 4 ; j++) {
             cout << j << ":"
                  << part->redundant(new A(i), new A(j)) << " ";
          }
          cout << " ";
       }
       cout << "\n\n==Set Union 2 and 3==========\n";

       part->setunion(new A(2), new A(3));

       cout << "==Redundancies===============\n";
       for (int i=1 ; i <= 4 ; i++) {
          cout << "[" << i << "] ";
          for (int j=1 ; j <= 4 ; j++) {
             cout << j << ":"
                  << part->redundant(new A(i), new A(j)) << " ";
          }
          cout << " ";
       }
       cout << "\n=============================\n";
    }