Up to this point, there really hasn't been any fundamental difference between programming in C and in C++. In fact, most experienced C programmers organize their functions into modules that relate to a single data structure (a "class"), and often even use a naming convention which mimics C++, for example, naming routines StackFull() and StackPush(). However, the features I'm about to describe do require a paradigm shift - there is no simple translation from them into a normal C program. The benefit will be that, in some circumstances, you will be able to write generic code that works with multiple kinds of objects.
Nevertheless, I would advise a beginning C++ programmer against trying to use these features, because you will almost certainly misuse them. It's possible (even easy!) to write completely inscrutable code using inheritance and/or templates. Although you might find it amusing to write code that is impossible for your graders to understand, I assure you they won't find it amusing at all, and will return the favor when they assign grades. In industry, a high premium is placed on keeping code simple and readable. It's easy to write new code, but the real cost comes when you try to keep it working, even as you add new features to it.
Nachos contains a few examples of the correct use of inheritance and templates, but realize that Nachos does not use them everywhere. In fact, if you get confused by this section, don't worry, you don't need to use any of these features in order to do the Nachos assignments. I omit a whole bunch of details; if you find yourself making widespread use of inheritance or templates, you should consult a C++ reference manual for the real scoop. This is meant to be just enough to get you started, and to help you identify when it would be appropriate to use these features and thus learn more about them!
To allow the two implementations to coexist, we first define an abstract Stack, containing just the public member functions, but no data.
class Stack {
public:
Stack();
virtual ~Stack(); // deallocate the stack
virtual void Push(int value) = 0;
// Push an integer, checking for overflow.
virtual bool Full() = 0; // Is the stack is full?
};
// For g++, need these even though no data to initialize.
Stack::Stack {}
Stack::~Stack() {}
The Stack definition is called a base class or sometimes a superclass. We can then define two different derived classes,
sometimes called subclasses which inherit behavior from the base
class. (Of course, inheritance is recursive - a derived class can in
turn be a base class for yet another derived class, and so on.)
Note that I have prepended the functions in the base class is prepended
with the keyword virtual, to signify that they can be redefined by
each of the two derived classes. The virtual functions are
initialized to zero, to tell the compiler that those functions
must be defined by the derived classes.
Here's how we could declare the array-based and list-based implementations of Stack. The syntax : public Stack signifies that both ArrayStack and ListStack are kinds of Stacks, and share the same behavior as the base class.
class ArrayStack : public Stack { // the same as in Section 2
public:
ArrayStack(int sz); // Constructor: initialize variables, allocate space.
~ArrayStack(); // Destructor: deallocate space allocated above.
void Push(int value); // Push an integer, checking for overflow.
bool Full(); // Returns TRUE if the stack is full, FALSE otherwise.
private:
int size; // The maximum capacity of the stack.
int top; // Index of the lowest unused position.
int *stack; // A pointer to an array that holds the contents.
};
class ListStack : public Stack {
public:
ListStack();
~ListStack();
void Push(int value);
bool Full();
private:
List *list; // list of items pushed on the stack
};
ListStack::ListStack() {
list = new List;
}
ListStack::~ListStack() {
delete list;
}
void ListStack::Push(int value) {
list->Prepend(value);
}
bool ListStack::Full() {
return FALSE; // this stack never overflows!
}
The neat concept here is that I can assign pointers to instances of
ListStack or ArrayStack to a variable of type Stack, and
then use them as if they were of the base type.
Stack *s1 = new ListStack;
Stack *s2 = new ArrayStack(17);
if (!stack->Full())
s1->Push(5);
if (!s2->Full())
s2->Push(6);
delete s1;
delete s2;
The compiler automatically invokes ListStack operations
for s1, and ArrayStack operations for s2;
this is done by creating a procedure table for each object,
where derived objects override the default entries in the table
defined by the base class. To the code above, it invokes the
operations Full, Push, and delete by indirection
through the procedure table, so that the code doesn't need to know
which kind of object it is.
In this example, since I never create an instance of the abstract class Stack, I do not need to implement its functions. This might seem a bit strange, but remember that the derived classes are the various implementations of Stack, and Stack serves only to reflect the shared behavior between the different implementations.
Also note that the destructor for Stack is a virtual function but the constructor is not. Clearly, when I create an object, I have to know which kind of object it is, whether ArrayStack or ListStack. The compiler makes sure that no one creates an instance of the abstract Stack by mistake - you cannot instantiate any class whose virtual functions are not completely defined (in other words, if any of its functions are set to zero in the class definition).
But when I deallocate an object, I may no longer know its exact type. In the above code, I want to call the destructor for the derived object, even though the code only knows that I am deleting an object of class Stack. If the destructor were not virtual, then the compiler would invoke Stack's destructor, which is not at all what I want. This is an easy mistake to make (I made it in the first draft of this article!) - if you don't define a destructor for the abstract class, the compiler will define one for you implicitly (and by the way, it won't be virtual, since you have a really unhelpful compiler). The result for the above code would be a memory leak, and who knows how you would figure that out!
Suppose that I wanted to add a new member function, NumberPushed(), to both implementations of Stack. The ArrayStack class already keeps count of the number of items on the stack, so I could duplicate that code in ListStack. Ideally, I'd like to be able to use the same code in both places. With inheritance, we can move the counter into the Stack class, and then invoke the base class operations from the derived class to update the counter.
class Stack {
public:
virtual ~Stack(); // deallocate data
virtual void Push(int value); // Push an integer, checking for overflow.
virtual bool Full() = 0; // return TRUE if full
int NumPushed(); // how many are currently on the stack?
protected:
Stack(); // initialize data
private:
int numPushed;
};
Stack::Stack() {
numPushed = 0;
}
void Stack::Push(int value) {
numPushed++;
}
int Stack::NumPushed() {
return numPushed;
}
We can then modify both ArrayStack and ListStack
to make use the new behavior of Stack. I'll only list
one of them here:
class ArrayStack : public Stack {
public:
ArrayStack(int sz);
~ArrayStack();
void Push(int value);
bool Full();
private:
int size; // The maximum capacity of the stack.
int *stack; // A pointer to an array that holds the contents.
};
ArrayStack::ArrayStack(int sz) : Stack() {
size = sz;
stack = new int[size]; // Let's get an array of integers.
}
void
ArrayStack::Push(int value) {
ASSERT(!Full());
stack[NumPushed()] = value;
Stack::Push(); // invoke base class to increment numPushed
}
There are a few things to note:
ArrayStack::ArrayStack(int sz) : Stack()The same thing applies to destructors. There are special rules for which get called first - the constructor/destructor for the base class or the constructor/destructor for the derived class. All I should say is, it's a bad idea to rely on whatever the rule is - more generally, it is a bad idea to write code which requires the reader to consult a manual to tell whether or not the code works!
Note however that I made Stack's data member private, not protected. Although there is some debate on this point, as a rule of thumb you should never allow one class to see directly access the data in another, even among classes related by inheritance. Otherwise, if you ever change the implementation of the base class, you will have to examine and change all the implementations of the derived classes, violating modularity.
ArrayStack *s = new ArrayStack(17);
ASSERT(s->NumPushed() == 0); // should be initialized to 0
Stack *s = new ArrayStack(17);
if (!s->Full()) // ArrayStack::Full
s->Push(5); // ArrayStack::Push
void ArrayStack::Push(int value)
{
...
Stack::Push(); // invoke base class to increment numPushed
}
Of course, if we just called Push() here (without prepending
Stack::, the compiler would think we were referring
to ArrayStack's Push(), and so that would recurse,
which is not exactly what we had in mind here.
So the question to ask yourself before using inheritance is: what's your goal? Is it to write your programs with the fewest number of characters possible? If so, inheritance is really useful, but so is changing all of your function and variable names to be one letter long - "a", "b", "c" - and once you run out of lower case ones, start using upper case, then two character variable names: "XX XY XZ Ya ..." (I'm joking here.) Needless to say, it is really easy to write unreadable code using inheritance.
So when is it a good idea to use inheritance and when should it be avoided? My rule of thumb is to only use it for representing shared behavior between objects, and to never use it for representing shared implementation. With C++, you can use inheritance for both concepts, but only the first will lead to truly simpler implementations.
To illustrate the difference between shared behavior and shared implementation, suppose you had a whole bunch of different kinds of objects that you needed to put on lists. For example, almost everything in an operating system goes on a list of some sort: buffers, threads, users, terminals, etc.
A very common approach to this problem (particularly among people new to object-oriented programming) is to make every object inherit from a single base class Object, which contains the forward and backward pointers for the list. But what if some object needs to go on multiple lists? The whole scheme breaks down, and it's because we tried to use inheritance to share implementation (the code for the forward and backward pointers) instead of to share behavior. A much cleaner (although slightly slower) approach would be to define a list implementation that allocated forward/backward pointers for each object that gets put on a list.
In sum, if two classes share at least some of the same member function signatures - that is, the same behavior, and if there's code that only relies on the shared behavior, then there may be a benefit to using inheritance. In Nachos, locks don't inherit from semaphores, even though locks are implemented using semaphores. The operations on semaphores and locks are different. Instead, inheritance is only used for various kinds of lists (sorted, keyed, etc.), and for different implementations of the physical disk abstraction, to reflect whether the disk has a track buffer, etc. A disk is used the same way whether or not it has a track buffer; the only difference is in its performance characteristics.
In C++, this is pretty easy to do using templates:
template <class T>
class Stack {
public:
Stack(int sz); // Constructor: initialize variables, allocate space.
~Stack(); // Destructor: deallocate space allocated above.
void Push(T value); // Push an integer, checking for overflow.
bool Full(); // Returns TRUE if the stack is full, FALSE otherwise.
private:
int size; // The maximum capacity of the stack.
int top; // Index of the lowest unused position.
T *stack; // A pointer to an array that holds the contents.
};
To define a template, we prepend the keyword template to
the class definition, and we put the parameterized type for the
template in angle brackets. If we need to parameterize the implementation
with two or more types, it works just like an argument list:
template <class T, class S>. We can use the type parameters
elsewhere in the definition, just like they were normal types.
When we provide the implementation for each of the member functions in the class, we also have to declare them as templates, and again, once we do that, we can use the type parameters just like normal types:
// template version of Stack::Stack
template <class T>
Stack<T>::Stack(int sz) {
size = sz;
top = 0;
stack = new T[size]; // Let's get an array of type T
}
// template version of Stack::Push
template <class T>
void
Stack<T>::Push(T value) {
ASSERT(!Full());
stack[top++] = value;
}
Creating an object of a template class is similar to creating
a normal object:
void
test() {
Stack<int> s1(17);
Stack<char> *s2 = new Stack<char>(23);
s1.Push(5);
s2->Push('z');
delete s2;
}
Everything operates as if we defined two classes, one
called Stack<int> - a stack of integers, and one
called Stack<char> - a stack of characters.
s1 behaves just like an instance of the first;
s2 behaves just like an instance of the second.
In fact, that is exactly how templates are typically implemented -
you get a complete copy of the code for the template
for each different instantiated type. In the above example,
we'd get one copy of the code for ints and one copy for chars.
So what's wrong with templates? You've all been taught to make your code modular so that it can be re-usable, so everything should be a template, right? Wrong.
The principal problem with templates is that they can be very difficult to debug - templates are easy to use if they work, but finding a bug in them can be difficult. In part this is because current generation C++ debuggers don't really understand templates very well. Nevertheless, it is easier to debug a template than two nearly identical implementations that differ only in their types.
So the best advice is - don't make a class into a template unless there really is a near term use for the template. And if you do need to implement a template, implement and debug a non-template version first. Once that is working, it won't be hard to convert it to a template. Then all you have to worry about code explosion - e.g., your program's object code is now megabytes because of the 15 copies of the hash table/list/... routines, one for each kind of thing you want to put in a hash table/list/... (Remember, you have an unhelpful compiler!)