| 20-ECES-228-001 | Data Structures | Fall 2007 |
|---|---|---|
|
Due: October 12, 2007
Submit source code implementing linked list version of the MergeSort class to xxxxx@email.uc.edu
Rationale:
| A drawback to an array-based implementation of mergesort is that the "merge" of two increasing lists (arrays) of integers requires the movement of objects from the lists being merged to a third list representing the merged list. In this exercise you will eliminate the overhead of this sort of movement without incurring a large penalty for finding elements of a list by implementing mergesort using linked lists. |
MergeSort:
Given a list L of objects for which the operator < is
defined (less than), the mergesort operation on L is given
recursively by the following pseudocode:
L mergeSort(L) {
if (L has 1 element) return L;
Let L1 be the first half of L;
Let L2 be the other half of L;
Let S1 = mergeSort(L1);
Let S2 = mergeSort(L2);
return merge(L1, L2);
}
where merge(L1, L2) outputs a list of all objects in both
L1 and L2 in increasing order.
|
Homework Problem:
Write a MergeSort class implementing a mergesort on a given
list of objects of any kind. The objects are constructed from data
given in a file. The constructor of this class should have two
arguments. The first is a pointer to an object of a class that
is prototyped as follows:
class valueA : public valueFunc {
public:
long long value (void *object);
};
where class valueFunc is prototyped in funcs.h. An object of the valueA
class is used to determine the order of objects in the input list
via its value method. The input to value is a
pointer to a list object and the output is its value.
The second constructor argument is a pointer to an object of a class prototyped as follows:
class createT : public createFunc {
public:
void *create(fstream &fin);
};
where class createFunc is prototyped in funcs.h. An object of class createT
is used to create objects from data contained in a file. For
example, for the simple class:
class A {
long long number;
public:
A (long long number) { this->number = number; }
long long valueOf() { return number; }
};
assuming the data file contains a list of integers that are to be used as
arguments for A constructors, the needed functions are:
long long valueA::value (void *object) { return ((A*)object)->valueOf(); }
void *createT::createObject(fstream &fin) {
long long number;
fin >> number;
return (void*)new A(number);
}
The creation of a MergeSort object for a list of objects of class
A would proceed as follows:
MergeSort *ms = new MergeSort(value, createObject);
You will demonstrate the use of the MergeSort class on a list of ints. That is, write a program to sort a file of integers using your MergeSort class. Examples of input files are merge.3.dat (7 numbers), merge.10.dat (1024 numbers), and merge.16.dat (65536 numbers). This program will contain the value and object creation functions as well as the declaration of a MergeSort object for sorting the integers, the invocation of a method which reads the data, and the invocation of a sort method. Call is mergesort.cc. Implement the MergeSort class in a file called mrgsrt.cc. The methods and data of this class are given by the following include file, called mrgsrt.h:
#ifndef _MERGESORT
#define _MERGESORT
#include <fstream>
#include "mcell.h"
#include "funcs.h"
using namespace std;
class MergeSort {
// Private Data
valueFunc *value; // Value function
createFunc *create; // Object creation function
Cell *data; // Pointer to unsorted/sorted structure
bool sorted; // true iff list is sorted
// Private methods
Cell *installData(fstream&, int); // Creates input list from file data
Cell *merge (Cell*, Cell*); // Merges two increasing lists
Cell *mergeSort(Cell*); // Sorts list pointed to by Cell object
void showTree(Cell*); // Display the input list
public:
MergeSort (valueFunc*, createFunc*);
void sort(); // Sort the input list using method mergeSort
int readData(char*); // Read data from file and build list using installData
void showOutput(); // Display sorted list
void showInput(); // Display unsorted list
};
#endif
Private data consists of pointers to the value and object creation
objects that are passed through the constructor, the data
pointer serving the dual role of pointing to the unsorted list before
a sort and the sorted list after the sort, and the sorted
variable of type bool which has value true if and
only if the list is sorted. There are private methods which operate
on Cell objects (described below). The Cell objects
are internal "carriers" of objects and are not available to users.
Private methods include installData which builds all list
objects from the data file and returns a pointer to the resulting list
structure (which is more complex than a singly linked list - see
below), merge which returns a singly linked list that is the
merge of two input, ordered, singly linked lists, mergeSort
which uses merge to sort a given input list, and
showTree which is intended to display an unsorted input list.
Methods available to a user, besides the constructor, are
readData which creates an internal structure from a given
data file name, pointed to by data, and holds a list of
objects to be sorted (uses the private installData method and
unsets sorted), sort which replaces the unsorted
list pointed to by data with a sorted list (sets
sorted and uses the private method mergeSort), and
showInput and showOutput which display the unsorted
and sorted lists, respectively.
The Cell class is protoyped as follows in mcell.h:
#ifndef _CELL
#define _CELL
#include <iostream>
using namespace std;
class Cell {
friend class MergeSort;
void *object; // Pointer to an object, if terminal
Cell *left, *right; // Pointers to left and right sublists
public:
Cell (Cell*, Cell*); // Called when building intermediate nodes
Cell (void*); // Called when building terminal nodes
};
#endif
The reason for the left and right pointers is to
separate the left half of a list from its right half. The pointers
allow the separation to take place in one step: the merge proceeds on
the two sublists pointed to by these pointers. Details are given in
the next section.
|
Linked list implementation of Mergesort:
|
The stucture of data to be sorted will be based on linked lists. The
structure needs to be more complex than a singly linked list: although
merging two singly linked lists is straightforward and efficient,
finding the middle of a singly linked list is highly inefficient as it
requires visiting half of the elements in the list.
A solution is to use Cell objects that have two pointers, denoted left and right. The idea of using Cell objects to "carry" objects in a singly linked list was introduced in CSII: there, each Cell object had a single next pointer to the next Cell object in the list. We expand on this idea by using Cell objects in two different ways: to carry objects and to point to list middles. There is a root Cell whose left pointer references a structure containing the first half of the given list and whose right pointer references the other half. Specifically, these pointers reference Cell objects that are intermediate roots for the first quarter, second quarter, third quarter, and fourth quarter of the list. This nesting continues until reaching root Cells referencing single objects. The following diagram illustrates the structure.
Mergesort on this structure begins on the bottom level. All pairs are merged into singly linked lists. Observe the merged lists are all sorted and that the left pointer of Cell objects on the level above point to the head of each merged list. Then, on the new bottom level these lists are merged. The process continues until the entire list is sorted. The following pictures show this in stages ![]() After first stage - bottom row pairs combined to sorted lists
After second stage - bottom pairs of sorted lists merged
After third stage - bottom pairs of sorted lists merged
When the mergesort is completed, the left pointer of the root references the head of the sorted list. |
Code hints:
Some of the methods of the MergeSort class are given below.
By studying these examples you should be able to build the entire
class.
/***--------------------------------------------------------------
readData:
Input: A filename
Output: Variable data is the root of a linked structure
holding all objects created from the data file
------------------------------------------------------------------***/
int MergeSort::readData(char *filename) {
fstream fin;
fin.open(filename, ios::in);
if (fin.fail()) {
cout << "Cannot open " << filename << "\n";
exit(0);
}
sorted = false;
int count = 0;
long number;
while (fin >> number) count++;
fin.close();
fin.clear();
fin.open(filename, ios::in);
data = installData(fin, count);
fin.close();
fin.clear();
return count;
}
/***--------------------------------------------------------------
installData:
Input: A file input stream and the number of data objects
in that file
Output: A pointer to the root of a tree structure containing
objects created from the file data
------------------------------------------------------------------***/
Cell *MergeSort::installData(fstream &fin, int n) {
int dleft;
if (n == 1) {
void *object = create(fin);
return new Cell(object);
}
for (dleft=0 ; ((1 << dleft) < n/2) ; dleft++);
Cell *q1 = installData(fin, (1 << dleft));
Cell *q2 = installData(fin, (n - (1 << dleft)));
return new Cell(q1,q2);
}
/***--------------------------------------------------------------
mergeSort:
Input: A tree linked on Cell left and right, value cells
are on the bottom level
Output: A linked list, in increasing order, of objects in
the input tree. List is linked on Cell left
------------------------------------------------------------------***/
Cell *MergeSort::mergeSort(Cell *c) {
if (c == NULL) return NULL;
if (c->left == NULL) return c;
return merge(mergeSort(c->left), mergeSort(c->right));
}
For convenience, the files you will link to are as follows: mcell.cc, mcell.h, mrgsrt.h, and hw5.cc. Your assignment is to supply mrgsrt.cc. The following Makefile may be used to compile and link all the files. |