20-ECES-228-001 Data Structures Fall 2007

Homework Assignment Number 7

Minimum Cost Network via Partition and Graph Classes

Due: November 26, 2007
Submit solution to franco@gauss.ececs.uc.edu

Rationale:

    This is the third of a series of three homeworks. This one uses the Partition class developed in the previous homework and the Graph class, given below and discussed in lecture, to help solve the Minimum Cost Network Problem.

Review of the Minimum Cost Network Problem:

    Given a collection of n cities and the cost of routing cable between every pair of cities, find the minimum cost network of cables that allows a person in one city to connect with a person in any other city. Network cost is the sum of the costs of all cables used. Cables must be laid between cities. Cables connect only at cities (splicing of a crossing pair of cables between cities is not allowed). A person in city x can connect with a person in city y only if there is a sequence of distinct cities c1 , c2 , ..., cj , none of which are either x or y, such that cables are laid between x and c1 , between c1 and c2 , ... , between cj and y. Such a sequence is called a path between x and y.

Example:

    The following is a table showing fictitious costs of laying cable between pairs of four familiar cities.
ClevelandColumbusCincinnatiChicago
Cleveland-12096300
Columbus120-43178
Cincinnati9643-200
Chicago300178200-
Note that the costs do not correspond with distances between cities: this mimicks some, but not all, real-life situations (reflecting a natural barrier such as a mountain to cross - thereby increasing cost).

For the above example, the minimum spanning network consists of three cables: one connecting Cleveland to Cincinnati, one connecting Cincinnati to Columbus and one connecting Columbus to Chicago, for a total cost of 317. In general, there are n-1 cables in a minimum spanning network connecting n cities.

Method:

    input: A collection of cable costs between pairs of cities
output: A network of cables of minimum cost
    1. Order the cables by increasing cost.
    2. Let N represent the network of cables to be output and make it empty.
    3. Repeat the following until all the cables are considered.
    3a.   Choose the next cable c in the ordered list.
    3b.   Check whether including c in the solution N allows you to
          trace more than one path between some pair of cities using only
          cables already placed in N.
    3c.   If it does, forget c and continue the loop.
    3d.   Otherwise, include c in N.
    4. Return N.

Using redundant and setunion of the Partition class:

    Partition all cities into subsets such that any pair of cities in the same subset is connected by cables in N and any pair in different subsets is not connected by cables in N. Initially, since N is empty, each city is in its own subset. Line 3b of the above algorithm is equivalent to invoking redundant of the Partition class on the two end cities of the cable c: if the value of redundant is false then the cable may be placed into N. Line 3d implies applying setunion of the Partition class on the set representatives of the end cities as all cities in each of the two sets are now connected due to the added cable c.

Homework:

    Use the Partition class of the previous homework to implement a solution to the Minimum Cost Network problem using the algorithm above. There are two kinds of objects to worry about in this assignment: cables and cities. Use the Graph class to implement these. The Graph class builds Edge and Vertex objects from file data. The files comprising this class are: graph.h, graph.cc, edge.h, edge.cc, vertex.h, and vertex.cc. The graph class header file looks like this:
    #ifndef _GRAPH
    #define _GRAPH
    #include "vertex.h"
    #include "edge.h"
    #include "funcs.h"

    class Graph {
       char *filename, *buffer;
       char *getNextNumber();
       char *getRemainingArgs(char);
       valueFunc *edgeval;
       valueFunc *vertexval;
       compareFunc *vertexcmp;
       compareFunc *edgecmp;
       readfileFunc *file;

     protected:
       Vertex **vertices;           // A graph contains a number of Vertices
       Edge **edges;                // and a number of edges connecting them
       int nvertices, nedges;       // The number of vertices and edges
       unsigned long pos, filesize; // Variables used in reading data from file

     public:
       Graph(valueFunc*, compareFunc*, valueFunc*, compareFunc*, readfileFunc*);
       Graph(valueFunc*, compareFunc*, readfileFunc*);
       Graph(valueFunc*, readfileFunc*);
       void init(char*);             // Setup the file read
       void readFile();              // Read the file
       void printGraph();            // Print the graph
    };
    #endif
The header file mst.h for the Minimum Cost Network code you are to develop looks like this:
    #ifndef _MST
    #define _MST
    #include "partition.h"
    #include "graph.h"
    #include "funcs.h"

    class valueV : public valueFunc {
     public:
       long long value(void *obj);
    };

    class compareV : public compareFunc {
     public:
       int compare (void *obj1, void *obj2);
    };

    class valueE : public valueFunc {
     public:
       long long value(void *obj);
    };

    class hashV : public hashFunc {
       int number;
     public:
       hashV(int);
       int hashfn(void *);
    };

    class readfileG : public readfileFunc {
     public:
       void *read(char*);
    };

    class MST : public Graph, Partition {
       char *filename;
       void printEdge(Edge*);

     public:
       MST(char*);
       void solve();
    };
    #endif
where implementation of valueV, compareV and so on is in mst.cc (you will write the rest) and looks like this:
    #include "mst.h"

    // valueV is used only for the hashtable needed by the partition class
    // Use the address of the object for the hash function
    long long valueV::value (void *obj) { 
       return (long long)((Vertex*)obj)->idx();
    }

    // This one is needed by the list class of the hashtable of the partition
    int compareV::compare(void *obj1, void *obj2) {
       return ((Vertex*)obj1)->idx() == ((Vertex*)obj2)->idx() ? 1 : 0;
    }

    hashV::hashV (int number) { this->number = number; }

    int hashV::hashfn(void *object) { 
       valueV *v = new valueV();
       long long n = (long long)v->value(object);
       return (int)(n*357 % number);  
    }

    long long valueE::value (void *obj) { return *(int*)obj; }

    void *readfileG::read(char *str) { return new int(atoi(str)); }
The above get used like this (msttest.cc):
    #include <iostream>
    #include "mst.h"
    using namespace std;

    int main(int argc, char **argv) {
       if (argc != 2) {
          cout << "Usage: " << argv[0] << " \n";
          exit(0);
       }
       MST *mst = new MST(argv[1]);
       mst->solve();
    }
Looks pretty easy, right? No supervisor class to build like in the other versions of this homework - the Graph and Partition classes do most of the work.

You will link your code with the following:

list.cc, list.h, node.cc, node.h, funcs.h, mst.h, graph.h, graph.cc, edge.h, edge.cc, vertex.h, vertex.cc, partition.h, hashtable.h, the Makefile, and hashtable.cc from homework 5 and partition.cc from homework 6 .

Your task is to produce mst.cc which implements the solve function of class MST. In the solve method you will use init and readFile of the Graph class to read a graph file - each line has two city indices followed by an edge cost. The readFile method requires a function that reads the last number on a line and turns it into an int object which is saved in each edge object. You will have to implement that function (it takes one line of code). You will also use qsort (built into C++) to sort the edges by increasing order of edge cost. There will be an outer 'for' loop on the edge array - each iteration of the loop will use redundant of the Partition class to check whether an edge should be output as part of the solution and setunion of the Partition class to update the partition structure. Just about every line of mst.cc will make use of some builtin function or some method of the Graph class or Partition class.

Here is a testfile: net.1000.dat (1000 city problem). The minimum cost is: 72212 which should be obtained in less than 1 second of processing time.