Changelog:
- 9 Feb 2023: edit example code to use
0x456789abcdefinstead of0x123456789abcdefto avoid virtual addresses having unused bits. - 11 Feb 2023: fix
page table offsets
topage offsets
; explicitly indicate that page tables are all the same size and start at the beginning of a page - 11 Feb 2023: mention that page tables are one page in the
Your task
section in addition to the sections below. - 13 Feb 2023: replace
present
withvalid
inYour task
to be consistent with rest of writeup - 13 Feb 2023: explicitly indicate that page_allocate also allocates pages which don’t hold page tables in order to make the mapping work
- 13 Feb 2023: explicitly note that not all of
size_t vais used for the address translation because of the page table sizes - 14 Feb 2023: add hints section with some common issues
- 18 Feb 2023: copy diagrams from lecture into hints section; along with new diagram for 2-level lookup; more prominenantly point to reading.
- 20 Feb 2023: note more prominently that example code assumes LEVELS = 4
- 27 Feb 2023: correct URL for reading link at bottom of hints
Your task:
This assignment has multiple submissions.
For the first submission, you must complete:
- item 2 below with
LEVELSonly set to 1 andPOBITSonly set to 12.
and two of:
- item 2 with
LEVELSbeing set to 2 (in addition to 1) andPOBITSonly set to 12 - item 2 with
LEVELSandPOBITShaving any valid value - item 3 below (a Makefile)
- item 4 below (a README)
- item 5 below (a LICENSE and licenses.txt file)
For all parts of this assignment, follow the coding style described below and make sure your code does not give warnings when compiled with
-Wall1.For the second submission, which is in preparation for a code review, you must complete all parts except part 5 and 6 below.
For the final submission, you must complete all parts below (and should fix any problems identified in the code review).
(This assignment will be worth considerably more than a normal one-week assignment. A vast majority of the grade for the assignment will be based on what you submit on the final submission. For the final submission, we will grade (again) tasks which should have been completed on earlier submissions.)
- item 2 below with
Implement a simulation of multi-level page table lookup and allocation. For the purposes of this simulation
physical
addresses are the ones your simulation program can access via pointers.The layout of this page table must be configurable via
config.h, which will#definethe constants:LEVELS– number of levels in the page tablePOBITS– the size of page offsets
Page table entries will take up 8 bytes, with the least significant bit being the
valid
bit. Each page table (at each level) will take up one page.Your code must implement API provided by a header file called
mlpt.h, an version of which is supplied here. It includes:extern size_t ptbr— page table base register; must be initially 0. Later, will contain the address of a page table your code allocates as a result of calls topage_allocate.size_t translate(size_t va)— translate the virtual addressvausing the currently setup page table. If the address is not allocated, return a value with all bits set to 1.void page_allocate(size_t va)— use posix_memalign to create page tables and other pages sufficient to have a mapping between the given virtual address and some physical address. If there already is such a page, does nothing.
Prepare a
Makefilefor your code that: - produces, as its default target, a library called libmlpt.a that exports the API defined inmlpt.h(and does not export a function calledmain)- has CC, CFLAGS and LDFLAGS defined
- uses defined arguments in its work
- has correct dependencies for each target
- has
allandcleantargets
Write a
READMEexplaining (at a minimum):- How to customize
config.h, with some guidance on how to pick values for that file. - An example use case for this code. Note that this cannot be
to implement virtual memory
since this code relies onposix_memalign, which cannot work without virtual memory already being in place. - Any known bugs or limitations (if you know of none, there is no need to say so; if you have an incomplete implementation so far, you’ll probably have some to document).
You are encouraged to add additional detail, such as
- Any limitations, aspects you think are incomplete, or suggestions for future expansion.
- Big-O analysis (time, space, or both).
- Description of any testing hooks you added.
- Code samples for how to use this library.
If anyone or any resource helped your implementation, also include an
ACKNOWLEDGEMENTSfile acknowledging those people and/or resources.- How to customize
Include a
LICENSEandlicenses.txtfiles as described below.See if you can add some kind of de-allocate interface without substantially reworking your prior implementation. If so, add it to your implementation (including
mlpt.h) and document how it is used. If it seems to tricky to add easily, include some explanation of what complicates it in theREADMEinstead of implementing it.
1 Specification
We supply two header files.
1.1 config.h
config.h does nothing other than #defineing
two constants.
/** LEVELS = number of PTEs used to translate one address. */
#define LEVELS 4
/** POBITS = number of bits used for the page offset. */
#define POBITS 12Your code must use these constants, must not redefine them, and must
continue to work even if we change their values. For example, if we
compile with a different config.h that
#defines LEVELS as 3 instead of
4, your code should create a 3-level page table instead of
a 4-level page table. Your code should work for all integer values of
LEVELS between 1 and 6 inclusive
and all integer values of POBITS between 4 and
18 inclusive. As an exception, it does not need to (but
may) support cases where (POBITS − 3) ×
(LEVELS + 1) > 60.
1.2 mlpt.h
mlpt.h defines the public API your code will use. You
are welcome to edit it, provided such edits will not cause testing code
to fail to compile.
/**
* Page table base register.
* Declared here so tester code can look at it; because it is extern
* you'll need to define it (without extern) in exactly one .c file.
*/
extern size_t ptbr;
/**
* Given a virtual address, return the physical address.
* Return a value consisting of all 1 bits
* if this virtual address does not have a physical address.
*/
size_t translate(size_t va);
/**
* Use posix_memalign to create page tables and other pages sufficient
* to have a mapping between the given virtual address and some physical address.
* If there already is such a page, does nothing.
*/
void page_allocate(size_t va);1.3 Behavior
Prior to the first invocation of page_allocate,
ptbr should be NULL; thereafter it should be a
pointer to a heap-allocated array, cast as a size_t. It
should not change after the first page_allocate
invocation.
If ptbr is not NULL, it should point to an
array of size_ts occupying 2POBITS
bytes. Each such size_t should be interpreted as a page
table entry.
Each other page table should be the same size. All page tables should
start at the beginning of a page. Since page tables are limited in size,
probably not all bits of the size_t va passed to
translate and page_allocate will be used in
the address translation process.
Some texts refer to a PTBR containing a physical address; others to
it containing a physical page number. The above definition asserts it
contains a physical address, not page number. As such, it will always
have 0 in its low-order POBITS bits.
Each page table entry should be either
- 0 in the low-order bit
-
There is no physical page.
The remaining bits have no defined meaning; you may use them however you wish.
PTE
0x1234567812345678has a zero in the low-order bit, and thus indicates the absence of a physical page. - 1 in the low-order bit
-
The bits above the
POBITSlow-order bits are the physical page number of either the next level page table or the physical page of memory. ThePOBITS− 1 low-order bits (excluding the 1 in the lowest-order bit) have no defined meaning; you may use them however you wish.PTE
0x1234567812345679has a one in the low-order bit, and thus indicates the presence of a physical page or another level page table. IfPOBITSis 12, the physical page number (of the page or next page table level) is0x1234567812345.
No pages should be allocated unless requested by a call to
page_allocate.
The comments in the following code correctly describe the number of
pages that have ever been allocated assuming that
LEVELS is 4 and POBITS is 12. (With
smaller LEVELS values, either you would not use all the non-zero bits of
the virtual addresses being tested or the virtual page numbers would be
out of range.)
#include <stdio.h>
#include <assert.h>
#include "mlpt.h"
#include "config.h"
int main() {
// 0 pages have been allocated
assert(ptbr == 0);
page_allocate(0x456789abcdef);
// 5 pages have been allocated: 4 page tables and 1 data
assert(ptbr != 0);
page_allocate(0x456789abcd00);
// no new pages allocated (still 5)
int *p1 = (int *)translate(0x456789abcd00);
*p1 = 0xaabbccdd;
short *p2 = (short *)translate(0x456789abcd02);
printf("%04hx\n", *p2); // prints "aabb\n"
assert(translate(0x456789ab0000) == 0xFFFFFFFFFFFFFFFF);
page_allocate(0x456789ab0000);
// 1 new page allocated (now 6; 4 page table, 2 data)
assert(translate(0x456789ab0000) != 0xFFFFFFFFFFFFFFFF);
page_allocate(0x456780000000);
// 2 new pages allocated (now 8; 5 page table, 3 data)
}2 Coding Style
4-space indentation
{on the same line as)and}on a line by itselfNo empty statements
Spaces around operators
Space before
(if and only if it is preceded by a control construct (e.g.,if (buttranslate()No postfix
++or--unless part of an expression that uses the return value (such asa[i++]). Replacing with either prefix++or+= 1is OK.Lower-case identifiers with underscores separating words
Upper-case
#defineconstantstypedefsuch that each type name uses a single identifier (e.g., nounsigned charexcept as part of atypedef).Use variable names that indicate the meaning of the contents of the variable; function names that indicate the purpose of the function; etc.
Have a readable code organization; as rough (not strict) guidelines,
- If a sequence of lines of code depend upon one another more than they depend on the code around them and appear together in multiple places, they should be in their own function.
- Functions should be small enough to look at on one screen unless
that have a very orderly structure such as a long
switchwith simple code in thecases. - Comments should be present to describe any esoteric code decisions.
- All code should have a clear purpose.
3 LICENSE and
licenses.txt
You need to choose a license for the code you supply. Include the
chosen license as LICENSE and a file
licenses.txt indicating your exploration of licenses.
This should be based on an existing family of license agreements. Unless you have had formal training in commercial law, you should use an existing license, not try to create your own.
3.1 licenses.txt
In licenses.txt, list at least three licenses you
considered; include a short (sentence or two) summary of what they
seemed to say; and why you chose to or not to use each as your
license.
Your review should include at least three licenses, including
- at least one of GPL or LGPL
- at least one of MIT, BSD, Boost, Apache, SQLite
All should be software licenses; this excludes licenses such as the various CC licenses for creative works, the SIL Open Font License, operating system licenses, etc.
3.2 LICENSE
Include a LICENSE file with the license agreement under
which you provide your code to us.
4 Not required: create a manual page
If you want to understand the real world a bit better, you might also
try writing a manual page. These are written in a language known as
roff
; to see an example of what this looks like, try running
cp /usr/share/man/man3p/poll.3p.gz ./
gunzip poll.3p.gz
less poll.3pto see the roff source of the manual page man 3p poll. I
don’t personally know anyone who writes roff by hand; generally it is
output from a source-code processing tool like doxygen2 or a converter like
pandoc3 from an easier-to-write format like
markdown or reStrucuredText.
Including man page page_allocate.3.gz and
translate.3.gz in your submission will earn you kudos. To
test that these work properly,
create a directory
man3(i.e.,mkdir man3)gzip the
rofffile and move it toman3/term_to_look_up.3.gz(e.g.,gzip translate.roff; mv translate.roff.gz man3/translate.3.gz) – note that some authoring tools may do some of this for youuse the
-Mflag withmanpointing to the parent directory ofman3(i.e.,man -M. 3 translate)
5 Hints
5.1 Common C issues
posix_memalign(&x, ...)is similar tox = malloc(...)except that the memory is guarenteed at a multiple of a particular value. The pointer to pointer passed intoposix_memalignis used in lieu of returning a pointer to the memory allocated. (The return value ofposix_memalignindicates success (0) or failure (non-zero error code).)(size_t*) ptbruses the value contained inptbras a pointer — in other words, it usesptbras a pointer. In contrast,&ptbrgets a pointer to the value contained inptbr; in other words, it makes a pointer toptbr.
5.2 Page table format and page numbers
- Our testing code will expect you to follow the page table format specified above. We specify the format in terms of page numbers rather than full addresses (which would have a page number and page offset). If our testing code seems to think that page table entries don’t contain the addresses you think they do, using a different format might be why.
5.3 Slide diagrams of translate and lookup
All these assuming ptbr variable stored at address
0x5898, containing 0x10000, as well as some
other value stored in the page tables found.
5.3.1 1-level translate
5.3.2 1-level page_allocate
5.3.3 2-level translate
5.4 reading
In addition to materials from slides, our reading on multi-level page tables work generally also include some relevant diagrams and pseudocode.
…because you resolved the warnings, not because you disabled them with warning-ignoring
#pragmas or the like.↩︎which is installed on department machines; see
man doxygenfor more.↩︎a very old version (1.3) is installed on the department servers without man support; visit https://pandoc.org/ to get a copy yourself.↩︎