University of Virginia, Department of Computer Science
CS656: Operating Systems
Despite the advances in processor design, users still demand more and more performance. Eventually, single CPU technologies must give way to multiple processor parallel computers: it is less expensive to run 10 inexpensive processors cooperatively than it is to buy a new computer 10 times as fast. This change is inevitable, and has been realized to some extent in the specialization of subsystems like bus mastering drive controllers. However, the need for additional computational power has thus far rested solely on advances in CPU technologies.
In parallel systems, there are two kinds of fundamental models: shared memory and message passing. From a programmer's perspective, shared memory computers, while easy to program, are difficult to build and aren't scalable to beyond a few processors. Message passing computers, while easy to build and scale, are difficult to program. In some sense, shared memory model and message passing model are equivalent.
One of the solutions to parallel systems is Distributed Shared Memory (DSM) whose memory is physically distributed but logically shared. DSM appears as shared memory to the applications programmer, but relies on message passing between independent CPUs to access the global virtual address space. Both hardware and software implementations have been proposed in the literature. The advantages of DSM programming model are well known. Firstly, shared memory programs are usually shorter and easier to understand than equivalent message passing programs. Secondly, shared memory gives transparent process-to-process communication. Also, programming with shared memory is a well-understood problem.
Our team proposes to develop an emulation of the DSM programming model as our distributed operating systems course project. The objective of this project is to develop an understanding of the theory and implementation of modern DSM. This will include a detailed study of consistency models, replacement strategies, etc. Also, we want to provide APIs to DSM programmers as well as some simple applications using these APIs.
There have been several classic models used to implement distributed shared memory system. The models related to us are the paged-based DSM and shared-variable DSM.
IVY, Mirage, are good example of paged-based DSM. In such models, Each of the CPUs in the system has its own private memory and, cannot reference remote memory directly. When a CPU addresses a word in the address space that is backed by a page currently located on a different machine, a trap to the operating system occurs and the required page must be fetched by software. The operating system acquires the necessary page by sending a message to the machine where the page is currently residing and asking for it. Thus both placement and access are done in software in these two models.
Particularly, in Mirage, the DSM provides a form of network transparency to make network boundaries invisible for shared memory and is upward compatible with an existing interface to shared memory. The protocol used in Mirage has the following features designed to reduce network traffic and sustain high performance:
1. The model is based on paged segmentation
2. The writer of a page retains access to that page for a fixed period of
time independent of subsequent read and write requests.
3. There may be upgrades or downgrades in the modes of pages that are
distributed throughout the network.
Munin and Midway are good examples of shared-variable DSM. In such models, only a selected portion of their address spaces, namely shared variable and other data structures, are shared. User-supplied information is required to determine which variables are shared and which are not. In these systems, the focus changes from trying to pretend that there is a single common memory to how to maintain a set of replicated distributed data structures consistent in the face of updates, potentially from all the machines using the shared data.
For example, Munin uses loosely coherent memory, based on the partial order specified by a shared memory parallel program, and uses type-specific memory coherence. Instead of a single memory coherence mechanism for all shared data objects, Munin employs several different mechanisms, each appropriate for a different class of shared data object. These type-specific mechanism are part of a runtime system that accepts hints from the user or the compiler to determine the coherence mechanism to be used for each object.
Our group will propose and implement an emulation of the DSM programming model for our project. To test our model, we will create test programs that demonstrate the usefulness of distributed shared memory. Most likely, these test programs will involve Gaussian elimination, matrix multiplication, and generally, some other mathematical computation that involves working with a matrix shared in the DSM.
Firstly, we plan to research the current literature and work done in DSM. This includes the related work listed above. From this, we should have a clear understanding of how DSM works, and how we can create a emulation. Next, we will design and implement our software simulation. In the design, we will make choices to support certain properties of DSM as we see fit.
After we have a working implementation, we will create and run test cases on the emulation. From this, we will generate performance results that will allow us to evaluate our implementation. Finally, we will write a report documenting our work and the test results, offering an explanation for the behavior of our simulation as well as suggestions for further research.
 K. Li, "Shared Virtual Memory on Loosely Coupled Multiprocessors"
 B. Fleisch and G. Popek, "Mirage: A Coherent Distributed Shared Memory Design"
 J.K. Bennett, J.K. Carter and W. Zwaenepoel, "Munin: Distributed Shared Memory Based on Type-Specific Memory Coherence"
 B.N. Bershad and M.J. Zekauskas, "Midway: Shared Memory Parallel Programming with Entry Consistency for Distributed Memory Multiprocessors"
 Andrew S. Tanenbaum, "Distributed Operating Systems"