This page does not represent the most current semester of this course; it is present merely as an archive.
This quiz is open-book. You may use a calculator or computer if you wish. Do not consult with others, including others located on the Internet. Each question presents one assembly operation and asks what it does. All questions assume the same initial configuration. I made this a question so it will show up on the page. You don't need to answer it.
After executing "movl %ecx, (%esp)", which of the following changes?

After executing "popl %eax", what is the value in register eax?

After executing "pushl %eax", what is the value in register esp?

After executing "pushl %eax", what part of memory now contains 0x03020100?
After executing "movl (%esp), %ecx", which of the following changes?

After executing "popl %eax", what is the value in register eax?

After executing "popl %eax", what is the value in register esp?

After executing "pushl %eax", what is in memory at byte 8?
Section 4.2.2 defines the HCL for bit-equality to be
bool eq = (a && b) || (!a && !b);
Section 4.2.3 defines the HCL for word-equality to be
bool Eq = (A == B);
Assume we have two-bit words, so A is made of bits a1 and a0 and B is made of bits b1 and b0. Which of the following defines the bit-level implementation of word-level equality (A == B)?
On page 358 we see the following HCL:
int Out4 = [
    !s1 && !s0 : A;
    !s1 && s0  : B;
    ...
];
The text following this notes "the selection can sometimes be simplified, since only the first matching case is selected. For example, the second expression can be written !s1, rather than the more complicated !s1 && s0, since the only other possibility having s1 equal to 0 was given as the first selection expression." In other words, the above could have been written as
int Out4 = [
    !s1 && !s0 : A;
    !s1        : B;
    ...
];
without changing its meaning in any way. Which of the following pieces of C code allow this same kind of flexibility in how we write the boolean expressions? (Note: more than one of the following options is "correct" and will receive full credit)
Section 4.2.2 defines the HCL for bit-MUX to be
bool out = (s && a) || (!s && b);
Section 4.2.3 defines the HCL for word-MUX to be
int Out = [
    s : A;
    1 : B;
];
Assume we have two-bit words, so A is made of bits a1 and a0 and B is made of bits b1 and b0. Which of the following defines the bit-level implementation of word-level MUX ([s: A; 1:B;])?
Assume S is a non-pipelined system and P is a pipelined system. Pipelining increases both throughput and latency. Which of the following cases exemplify these principles? Check all that apply (if any do)

What is meant by "nonuniform partitioning" and why is it a problem?

Diminishing returns means that the more pipeline stages there are, the less benefit we get from adding more stages. Why is this?
Compare DRAM and SRAM. Check the true statements from the following:

Suppose a single address sent to the memory module retrieves 128 bits of data. That data is probably retrieved by

It is typical for the I/O Bridge to be connected to three buses: the system bus, the memory bus, and the I/O bus. It routes a signal between pairs of these buses; the figures on pages 569, 570, and 578 show examples of
What is an example of memory-to-I/O?

Compare DRAM and SRAM. Check the true statements from the following:

Reading a supercell takes two memory controller operations. In between the two, the intermediate data is stored in

A disk never sends data to the CPU itself, instead sending it to memory. How does the CPU discover the data is there?
One reason that arrays can be faster than linked lists is that their elements are adjacent to one another in memory, where linked list elements may be scattered across widely varying addresses. This is an example of

Suppose you have a program that often needs several kilobytes of temporary memory. Re-using the same temporary memory each time can be faster than using different temporary memory each time because of

Consider two programs; one runs in a small loop, the other invokes several dozen functions one after another. Assuming the amount of work is similar, the loop will almost certainly run faster. At least some of that speed difference is due to

"Cache" is pronounced like

If we find the data we want in a cache, we call that a

If we fail to find data in a cache because we have never accessed the data before, we call that a

If we fail to find data in a cache even though we have accessed only a few bytes since we last accessed that same data, we call that a

If we fail to find data in a cache because we've read too much data since we last accessed that same data, we call that a
See also comments on quiz 10

Suppose a cache can store 64KB of data and is organized with 2-line sets. What is the largest amount of data that cache can serve without encountering a conflict miss?

Assume that cache F is fully associative and cache D is direct-mapped. Assume the two caches have the same address space and the same total data capacity.

Process A accesses 32 bytes of data scattered in a deterministic but random-looking pattern across addresses 0x00 through 0xff. Assume that cache F is fully associative (using a least-recently-used policy) with 16 lines of 4 bytes each, and cache D is direct-mapped with 32 lines of 4 bytes each. Assume we run A twice and measure the cache misses the second time.

Process A accesses 32 bytes of data in order, evenly spaced across addresses 0x00 through 0xff. Assume that cache F is fully associative (using a least-recently-used policy) with 16 lines of 4 bytes each, and cache D is direct-mapped with 32 lines of 4 bytes each. Assume we run A twice and measure the cache misses the second time.

Process A accesses 32 bytes of data scattered in a deterministic but random-looking pattern across addresses 0x00 through 0xff. Assume that cache F is fully associative (using a least-recently-used policy) with 32 lines of 4 bytes each, and cache D is direct-mapped with 32 lines of 4 bytes each. Assume we run A twice and measure the cache misses the second time.

Process A accesses 32 bytes of data in order, evenly spaced across addresses 0x00 through 0xff. Assume that cache F is fully associative (using a least-recently-used policy) with 32 lines of 4 bytes each, and cache D is direct-mapped with 32 lines of 4 bytes each. Assume we run A twice and measure the cache misses the second time.
Loop unrolling is usually used to remove what source of inefficiency?

Inline substitution (also called "inlining") is usually used to reduce what source of inefficiency?

Using multiple accumulators is usually used to reduce what source of inefficiency?

Loop blocking is usually used to reduce what source of inefficiency?

Adding local variables is usually used to reduce what source of inefficiency?

Reassociation of operators is usually used to reduce what source of inefficiency?

Which of the following techniques depend on a pipelined processor's ability to work on several instructions at once, and thus would not work in a non-pipelined processor? Check all that apply.

Section 5.2 of the textbook discusses CPE (cycles per element, also called cycles per execution or cycles per instruction in other sources). Their discussion suggests that if I have code with 20 CPE and run it on a problem where my algorithm executes on 100 elements, I should expect the runtime to be:
When discussing virtual memory, a single page (check all that apply)

From the perspective of a running application, virtual memory looks like

From the perspective of the operating system, virtual memory looks like

As used by our textbook, a "cached" virtual page is one that is

What is the difference between "caching" a page, "swapping in" a page, and "paging in" a page? Check all that apply.
A page table is

A translation lookaside buffer (TLB) is

For a multi-level page table, the first level page table(s) of process P

For a multi-level page table, the second level page table(s) of process P

Core i7 memory uses four virtual page numbers because its MMU has a four-level page table hierarchy. Consider translating a single virtual address into a physical address. Which of the following numbers of page table accesses could occur during that translation? If different states of the caches and so on could result in different numbers of page table lookups, check multiple answers.
When the CPU is interrupted, how does it know which device created the interruption?

What distinguishes a fault from a trap? Check all that apply

The assembly instruction "int x" used to make system calls only accepts a 1-byte (256-value) argument x, but Linux uses it to support over 300 system calls. How does it do that?
How does knowing you have a pipelined processor change the code you write?
Which two of the following four binary number formats are used by modern computer hardware?
Which of the following is true for all possible values of unsigned x?
If you have a floating-point number with a 2-bit exponent (bias 1) and a 3-bit fraction, which of the following numbers (shown in binary) is the largest finite number it can represent?

If you have a floating-point number with a 2-bit exponent (bias 1) and a 3-bit fraction, what is the smallest positive number (> 0) it can represent? Answers are shown in binary
How many total bytes of data can be stored in the (integer) IA32 program registers at one time? (a program register is one you can use as an operand when programming in IA32 assembly)

The textbook uses ATT-format assembly, the default for most GNU tools. Some other tools use Intel-format assembly instead. Each of the following is true for only one of the two formats. Check those that are true of ATT-format assembly:

"pushl %eax;" is the same as

The "imul" instruction has only one operand, as in "imul $101;", but the multiplication operation it implements requires two operands. The second operand is

In Java, C++, etc., there are types: a String is not the same as a Scanner, and so on. Types are distinct from "meaning": int x might *mean* your age and int y might *mean* your number of siblings, but the language doesn't stop you from saying "x = y" because they are both *type* int. In assembly

IA32 (also called x86) is CISC, not RISC, because (check all that apply)
Two of the following are equivalent to the single-bit mux that the book denotes [ s : A; 1 : B; ], the other two are not equivalent to that mux. Mark the two that are:

Consider the instruction "subl %edx, %ebx" passing through the five stages Fetch, Decode, Execute, Memory, Writeback. In which stage does the hardware retrieve the value stored in register %edx?

Consider the instruction "subl %edx, %ebx" passing through the five stages Fetch, Decode, Execute, Memory, Writeback. In which stage does the hardware decide where the result will be stored?

Not all stages are involved in executing all instructions. Consider the instruction "subl %edx, %ebx" passing through the five stages Fetch, Decode, Execute, Memory, Writeback. Which (if any) stages are not involved in handling this instruction? Check any that apply
Which of the following operations read a value from memory? Check all that apply

Which of the following operations write a value to memory? Check all that apply

Which of the following operations write a value to a program register? Check all that apply

Which of the following operations read a value from a program register? Check all that apply

The book discusses using the ALU just to forward values for some operations (e.g., their irmovl has the ALU compute 0 + valC) and to do math for other operations. The forwarding is just a design choice and could have been done otherwise, but the math is intrinsic to the operation of Y86. Which of the following operations use the ALU to do math?

When building a chip, we can pick any clock speed we want; the goal is to make it as fast as we can while still being slow enough that no register grabs onto a value before that value has stabilized. Which of the following instructions would be most likely to determine the SEQ clock speed?
In general, pipelining

Nonuniform partitioning happens when

Suppose instruction A is followed by instruction B in the assembled code. There is a hazard between A and B if (check all that apply)

In the five-stage pipeline (F, D, E, M, W) we want E to have a second cycle to run its work. Assume there is a pipeline register before each stage (one before F, one before D, etc). How many of those 5 pipeline registers do we need to send the "stall" signal?

In the five-stage pipeline (F, D, E, M, W) we want E to have a second cycle to run its work. Assume there is a pipeline register before each stage (one before F, one before D, etc). How many of those 5 pipeline registers do we need to send the "bubble" signal?
Which of the following is ordered from slowest to fastest?

Data can be stored in program registers, memory, or disk. Which of the following technologies (if any) is/are most commonly associated with memory? Check all that apply

Which of the following needs to be read and then re-written periodically in order to retain its storage? Check all that apply

Which of the following needs power in order to retain its storage? Check all that apply
Code has good spatial locality if

Code has good temporal locality if

Which of the following describes the kind of caching that the memory hierarchy implements?

In a set-associative cache, which of the following can be determined from the memory address and cache organization alone (i.e., without knowing the cache contents)? Check all that apply.
Quiz 10 was not given

Which of the following is expected to run fastest? Assume N and M are large numbers.

Which of the following is expected to run fastest? Assume N is a large number.

Which of the following are *not* tips for optimizing memory accesses given in the textbook? Check all that apply

We want to loop over all i, j, and k (between 0 and 10240) and run the following:
a[i][j] = b[i][k] * c[k][j]
In order to maximize cache locality, in what order should the loops be placed?
Which of the following describes the situations in which you should optimize your code for cache locality?

Using a #define IDX (like those shown in class) and a 1D array instead of a 2D array is done because
Suppose function A is O(n) and for n=1 takes 100ms; function B is O(n^2) and for n=1 takes 1ms. Which of the following n do you expect to be the smallest for which A is faster than B?
Register spilling is when

Which of the following would you expect to be fastest? Assume we compile with the -O1 flag (i.e., the compiler stores variables in registers, but does not apply fancier optimizations).

Which of the following would you expect to be fastest? Assume we compile with the -O1 flag (i.e., the compiler stores variables in registers, but does not apply fancier optimizations).
If a virtual page is allocated and in physical memory, how do we discover the physical page number?

If a virtual page is allocated but is not in physical memory, how do we discover the physical page number?

If a virtual page is not allocated, how do we discover the physical page number?

Page tables are stored in

If you have a single-level page table, the amount of space it requires varies based on how many pages are in use.

If you have a multi-level page table, the amount of space it requires varies based on how many pages are in use.

If your program uses 100% of the virtual address space (even the parts normally not used), having a multi-level page table requires

If your program uses only a small fraction of the virtual address space, having a multi-level page table requires
Section 9.7.2 talks about how Linux keeps track of memory regions with regard to read/write/execute permissions and shared memory. That information is also in the page table entries. Why should there be vm_area_structs if the same information is in the page table?

Section 9.9 reviews how malloc and free work. One implementation of malloc and free has free do nothing and malloc simply keep a single pointer that it increments each time it is called. This implementation has

Section 9.10 discusses garbage collectors. These often perform reachability analysis, tracing all pointers recursively to find what parts of allocated memory are actually usable. The reachability analysis starts by assuming that what is reachable?
A cache maps an address to a value. Virtual memory maps an address to another address. Which kind of cache is virtual memory (i.e., page tables, not the TLB) organized the most like?

Let V be the size of the virtual address space and P be the size of the physical memory. Which of the following is true?

A page table entry contains some permission information and some type of pointer. If our program is trying to dereference a 32-bit integer value stored in memory, page tables accessed along the way could point to (select all that apply)
Which type of exception does *not* use the exception table?

Which type of exception can occur without any particular instruction being the cause of the exception?

Which type of exception has an instruction that always causes it?

Which type of exception has an instruction that sometimes causes it and sometimes runs without causing it?