All quizzes (key)

Consider the following C snippet:

extern int strcmp(const char *, const char*);

const char *foo(const char *bar) {
    if (!strcmp(bar, "special")) {
        return "value 1";
    } else {
        return "value 2";
    }
}

Suppose this snippet is contained in the file foo.c, which is compiled into an object file foo.o, and then combined with a main.o to produce an executable called main.exe which calls the foo() function whenever it is run.

Question 1 (1 points): (see above) Ignoring information only present for debugging, the string "value 1" is likely to appear in _______.
Select all that apply

(correct answer)
foo.o
main.o
(correct answer)
main.exe

Question 2 (1 points): (see above) Ignoring information only present for debugging, the name "foo" is likely to appear in ________.
Select all that apply

(correct answer)
foo.o
(correct answer)
main.o
main.exe

Consider the following assembly function example:

example:
    movq %rdi, %rax
loop:
    subq $10, %rdi
    jg loop
    ret

Recall that in the x86-64 calling convention, the first argument to a function is in the register %rdi and the return value is in %rax.

Question 3 (0 points): (see above) [We are dropping this question due an error discovered after we released the quiz.] Which of the following C snippets is equivalent to this function?

(correct answer)
long example(long argument) { do { argument -= 10; } while (argument > 0); return argument; }
long example(long argument) { do { argument -= 10; } while (argument > 10); return argument; }
long example(long argument) { while (--argument > 10) {} return argument; }
long example(long argument) { while (argument > 10) { argument -= 10; } return argument; }

I (Prof Reiss) meant to write subq $10, %rax instead of %rdi in the assembly above. Wihout that change, none of the answers are correct. Consequently, we dropped this question.

Question 4 (1 points): (see above) When this function returns, what are possible values of the ZF condition code?
Select all that apply

(correct answer)
0
(correct answer)
1

Question 5 (1 points): (see above) When this function returns, what are possible values of the SF condition code?
Select all that apply

(correct answer)
0
(correct answer)
1

Read section 2.1.6-7 of the textbook and answer the following questions.

Question 6 (1 points): (see above) Using bitwise operators, integers can be used to represent sets, where each bit being sets indicates whether an element with a particular index is included in the set. If this is being done in C, if a and b are integers representing sets, then producing a set containing the elements in both a and b (sometimes called "set intersection") can be done with the expression:

Read section 4-4.1.3 of the textbook and answer the following questions.

Question 7 (1 points): (see above) An instruction set architecture (ISA) determines
Select all that apply

(correct answer)
what machine code that would be produced by a compiler looks like
(accepted any answer)
what circuits that would be produced by a processor designer look like
what lower-level programming languages like C look like
how long instructions take to execute

we think the better answer is that the instruction set does not determine what the circuits look like, since the processor designer has a wide variety of options to choose from, but we'll accept the interptation based on the ISA limiting what circuits are allowed (has to use the machine code chosen) and/or strongly encouraging particular designs

Question 8 (1 points): (see above) In section 4.1.3, the textbook explains how rmmovq %rsp, 0x123456789abcd(%rdx) is encoded as 40 42 cd ab 89 67 45 23 01 00. If we were to instead encode irmovq $0x123456789abcd, %rsp, what would its encoding be?

30 42 cd ab 89 67 45 23 01 00
30 f2 cd ab 89 67 45 23 01 00
32 cd ab 89 67 45 23 01 00
40 f2 cd ab 89 67 45 23 01 00
30 f2 00 01 23 45 67 89 ab cd
(correct answer)
none of the above

Note that in irmovq, %rsp is in the rB field, not the rA filed, so the least significant nibble of the second byte should be 4 and not 2 (or d). (And, no, it was my [Prof Reiss']s mistake that we question was extra-tricky like this.)

Question 1 (1 points): Which of the following C expressions are always true, assuming x and y are 32-bit unsigned ints between 0 and 9999999, inclusive?
Select all that apply

((x << 8) & 0xFF) == (y ^ (y & 0xFF))
(x ^ y) < (x | y)
(correct answer)
(((x << 4) | ((y >> 4) & 0xF)) & 0xFF) == (((x & 0xF) << 4) | ((y & 0xFF) >> 4))
((((x & 0xF) >> 4) | ((y & 0xFF) >> 4)) | x) == (((x >> 4) | x) | ((y >> 4)& 0xF))

This question was briefly miskeyed. Counterexample for B: x = y = 0; Counterexample for D: x = 0x100, y = 0;

Question 2 (1 points): Consider the following incomplete C function:

unsigned int copy_lsb(unsigned int x) {

    CODE HERE

}

Which of the following could replace CODE HERE to make this function return a 32-bit unsigned int where each byte is a copy of the least significnat byte of the argument x?
Select all that apply

return x | (x << 8) | (x << 16) | (x << 24); }
(correct answer)
return (x & 0xFF) | ((x & 0xFF) << 8) | ((x & 0xFF) << 16) | ((x & 0xFF) << 24);
(correct answer)
unsigned int y = x & 0xFF; y = y | (y << 16); return y | (y << 8);
unsigned int y = x; y = (y << 8) & 0xFF; return (y << 16) & 0xFF;

Consider the following C snippet:

int array[4] = {0x1234, 0x5678, 0xABCD, 0xEF01};

Suppose on a system with 32-bit, little-endian integers and 64-bit pointers, the array is located at byte address 0x10000.

Question 3 (1 points): (see above) What is the contents of the byte at address 0x10005?

Question 4 (1 points): (see above) The C expression array + 3 is a pointer containing what address?

Question 5 (1 points): If a calculation is undefined behavior in C, then ______.
Select all that apply

compilers will translate the code to the most straightforward assembly, but what that assembly does may vary between platforms
(correct answer)
the same compiler may generate code which calculates different values depending on its optimization flags
attempting to execute the calculation will result in a segmentation fault if it is run on Linux
(correct answer)
the result of the calculation may differ each time the resulting executable is run

For these questions, read section 4.2.2, 4.2.3 (but you may ignore details of HCL syntax) and 4.2.5.

Question 6 (1 points): (see above) Our textbook distinguishes combinatorial and sequential circuits. Which of the following are true about combinatorial circuits?
Select all that apply

combinatorial circuits can only take two inputs
(correct answer)
a combinatorial circuit's output only depends on its input
a combinatorial circuit's output can depend on its current input and a prior input
combinatorial circuits are composed of logic gates and registers

Question 1 (1 points): Consider the following HCLRS code:

x = [
    y > 10 : 100;
    y + z == 15 : 200;
    z > 10 : 300;
    1 : 400;
];

For which of the following values of y and z will the value of x be 200?
Select all that apply

(correct answer)
y = 4, z = 11
y = 15, z = 0
y = 0, z = 20
(correct answer)
y = 0, z = 15

Question 2 (1 points): In the register design we discussed in lecture, the value output by a register changes

(correct answer)
on the rising edge of the clock signal
on the falling edge of the clock signal
whenever the input value changes
half a clock cycle after the input value changes
a full clock cycle after the input value changes

Question 3 (1 points): Which of the following are more likely to be attributes of a processor implementing RISC-like instruction set than a CISC-like instruction set?
Select all that apply

(correct answer)
having separate instructions for moving values between memory and general purpose registers rather than allowing most instructions to access memory directly
including an instruction that saves all general purpose registers to the stack
having instructions that copy an variable-length string rather than requiring compilers to generate a loop of many instructions

Question 4 (1 points): In the register file design we discussed in lecture, writing to the register file is controlled by pairs of inputs. One input in each of these pairs specifies a 64-bit value and the other specifies an 8-bit register number. Suppose we have a processor which uses these inputs to change a value in the register file for some of the instructions it runs, but for other instructions it leaves the register file values unchanged. During clock cycles in which a register should not change, the processor should

(correct answer)
set the 8-bit register number input to the special value 0xF (15)
make sure the 8-bit register number input is not connected to anything
make sure the 64-bit value input is not connected to anything
set the 64-bit value input to 0
make sure the 64-bit vcalue input is set to the special value 0xFFFFFFFFFFFFFFFF (the maximum possible 64-bit unsigned integer)

For these questions, review sections 4.3.3-4.3.4 of the textbook. (If necessary for context, you may want to also review earlier parts of section 4.3.)

Question 5 (1 points): (see above) In the single-cycle processor design described by our textbook, the values stored in condition code registers are updated _______________ corresponding values are stored in the register file.

earlier in the clock cycle
(correct answer)
at about the same time in the clock cycle
later in the clock cycle

Question 6 (1 points): (see above) In the single-cycle processor design described by our textbook, during a pushq %rax instruction, the register file srcA and srcB inputs should be the register numbers corresponding to:

(correct answer)
%rax and %rsp
%rax and %rbp
%rsp (and the other input doesn't matter)
%rax (and the other input doesn't matter)
%rsp and %rsp (both inputs should be the same)
%rax and %rax (both inputs should be the same)

For these questions, consider the single-cycle processor design strategy described in our textbook and in lecture (the processor executes exactly one instruction per cycle), and the register file and memory components described lecture and our textbook.

Question 1 (1 points): (see above) When executing a call function instruction, the 64-bit address input of the data memory should be equal to

does not matter because the instruction does not read from memory
(correct answer)
the result of a calculation performed by the ALU
one of the 64-bit outputs of the register file
the register number for %rsp
the current output of the program counter register
the 64-bit data output of the data memory

Question 2 (1 points): (see above) Executing a ret instruction requires several operations:

1. reading the instruction from the instruction memory
1. reading the current value of the stack pointer
1. computing the current value stack pointer plus 8
1. reading the return address from memory at stack pointer
1. updating the stack pointer to its current value plus 8
1. updating the program counter to the return address

In what order to these occur in the single-cycle hardware implementation?

(correct answer)
1; then 2; then 3 and 4 in any order; then 5 and 6 at almost the same time
1; then 2; then 4; then 3; then 5 and 6 at almost the same time
1, 2, 3, and 4 in any order; then 5 and 6 at almost the same time
1; then 2; then 3; then 4; then 5; then 6
1; then 2; then 3; then 4; then 5 and 6 at almost the same time
none of the above

Question 3 (1 points): (see above) In lecture, we described a processor design that could execute just the Y86 irmovq, rrmovq, mrmovq, and rmmovq instructions. Suppose we wanted to extend this processor to execute a new immovq instruction that would take a 64-bit constant and move it to a memory address specified with a 64-bit displacement and a base register. Which would be useful changes to make to the processor as part of adding this instruction?
Select all that apply

(correct answer)
making the instruction memory's output (i10bytes in HCLRS) wider than 80 bits
(correct answer)
adding a MUX to control the value input to the data memory (mem_input in HCLRS)
adding an additional option to the MUX controlling the dstE (register number to write, reg_dstE in HCLRS) input to the register file
adding an additional option to the MUX controlling the next R[dstE] (reg_inputE in HCLRS) input to the register file
adding a MUX to control the input to the instruction memory (pc in HCLRS)

Question 4 (1 points): Consider a pipelined Y86 processor with two stages that performs the instruction memory and register file reads in the first stage and data memory reads and register file writes as part of the second stage. Suppose this processor is performing a memory read for an rmmovq instruction. While this is happening, the data memory's 64-bit value-to-write input (mem_input in HCLRS) will be equal to

(correct answer)
the output of a new register added to support pipelining
one of the outputs of the register file
the output of the program counter register
part of the output of the instruction memory
none of the above

we intended this question to specify that the data memory reads and writes occured in the second stage, but did not.

Question 5 (1 points): Consider a pipelined processor with ten stages, where the clock cycle has 500 ps between rising edges. When an operation starts executing in the piepline, it will usually finish executing after

Question 6 (1 points): Modifying a pipelined processor with 4 stages to instead have only 2 stages will generally increase ________.
Select all that apply

(correct answer)
the processor's cycle time
the processor's instruction throughput
the processor's instruction latency
the number of registers in the processor

Question 1 (1 points): Consider the following Y86 assembly snippet:

    addq %rax, %rbx
    subq %rbx, %rcx
    xorq %rdx, %rax
    rmmovq %rax, 8(%r8)
    andq %r8, %rbx

When executing on a five-stage pipelined proecssor like the one we described in lecture, there are data hazards because ________.

the addq and rmmovq both try to read %rax
(correct answer)
the addq writes %rbx and the subq reads it
the rmmovq and the addq both use %r8
the addq writes %rax and the addq reads it
(correct answer)
the xorq writes %rax and the rmmovq reads it
the addq writes %rbx and the andq reads it

was meant to be a select-all-that-apply, but it wasn't. Credit given if any correct answer was selected.

Consider the following Y86 assembly snippet:

    subq %rcx, %rdx
    jle later
    addq %r8, %r9
    addq %r9, %r10

later:
    xorq %rdx, %rcx

Assume the initial values of %rcx and %rdx are set such that the jle is taken.

Question 2 (1 points): (see above) Consider a five-stage pipelined processor using the stages we discussed in lecture and uses stalling only to handle data hazards and control hazards. (Assume the whether a conditional jump is taken is not determined until near the end of the jump's execute stage.) If the subq instruction is fetched during cycle 0, then the xorq will complete its writeback stage during cycle _____.

Question 3 (1 points): (see above) Consider a five-stage pipelined processor using the stages we discussed in lecture and uses stalling and forwarding to handle data hazards, and branch prediction that predicts conditional jumps as always taken to handle control hazards. (Assume the processor users forwarding whenever it would not require substantially increasing the critical path length and therefore the cycle time.) If the subq instruciton is fetched during cycle 0, then the xorq will complete its writeback stage during cycle _____.

Consider the following Y86 assembly snippet:

    addq %rax, %rbx
    subq %rax, %rcx
    xorq %rbx, %rcx

Suppose the above assembly snippet is run a six-stage (in-order) pipelined processor which does not implement forwarding and uses the following stages:

Fetch
Decode 1
Decode 2
Execute
Memory
Writeback

The values of registers read are not available until the "Decode 2" stage.

Question 4 (1 points): (see above) To avoid data hazards when executing the above code using stalling only (and no forwarding), for how many cycles must a stall occur in the processor? (Count each cycle during which no new instruction can be started as a single stall.)

Question 5 (1 points): (see above) To avoid data hazards when executing the above code using stalling and forwarding, for how many cycles must a stall occur in the processor? (Count each cycle during which no new instruction can be started as a single stall. Assume the processor uses forwarding whenever it would not require substantially increasing the critical path length and therefore the cycle time.)

Question 1 (1 points): Consider a 512B Cache with 4 blocks per set and 8 byte blocks. Which of the following hex values correspond to the index for address 0xABCD?

Question 2 (1 points): Assume that you have a 32B direct mapped cache with 16B blocks with a LRU replacement policy. How many misses will result from reading the following addresses 0xABC 0xACE 0xACA 0xACF 0xAAA 0xACE ? Assume the cache starts off empty.

Question 3 (1 points): Increasing the associativity of the cache while keeping the number of set the same, will decrease the number of

cold misses
(correct answer)
capacity misses
does not change the number of misses
none of the above

increases the size of cache assuming we don't adjust the block size, so decreases capacity misses. Also decreases conflict misses, but the question did not ask about that. If we increase the block size rather than adjusting the cache size, then this could change the number of all sorts of misses; we neglected to think about this when reviewing this quiz, since usually the block size is harder to change in a cache design (because it's hard to let it go out of sync with other components in the memory system).

Question 4 (1 points): Consider the following series of read and write accesses. Where reads are marked with the letter R and writes are marked with the letter W. Assume at 64B fully associative cache with 4 blocks per set. This cache is using write-allocate policy and a LRU replacement policy. How many misses will the following access pattern generate: (W) 0xBAD (W) 0xABC (R) 0xBAC (R) 0xBOA (W) 0xBOA (W) 0xBED (W) 0xBEC? Assume the cache starts off empty.

For these quesitons, read section 5.2, the introduction to section 5.7 and the first paragraph of section 5.7.1, and section 5.7.3.

Question 5 (1 points): (see above) Suppose a superscalar, out-of-order processor can start 3 integer multiplications per cycle (using 3 pipelined integer multiplication circuits) and each integer multiplication takes 4 cycles to complete. By our textbook's terminology, what is the throughput bound on multiplications in this processor?

Consider the following C code:

    char array[4 * 1024];
    int sum;
    ...
    for (int k = 0; k < 2; ++k) {
        for (int i = 0; i < 4; i += 1) {
            sum += array[i * 1024] * array[16 + i * 16];
        }
    }

For each of the questions below, consider the behavior of cache during the outermost for loop. Assume the cache is empty when the for loop starts, only accesses to the array use the cache, and the neither the compiler nor the processor does that reorders or omits array accesses.

Assume the address of the array is a multiple of the cache size ( so array[0] is at the beginning of a cache block that maps to set index 0 in the cache).

Question 1 (1 points): (see above) With a 2KB direct-mapped cache with 16B cache blocks, how many cache misses will occur?

array[0] and array[2 * 1024] each map to the same set, so array[0] won't be kept in the cache between iterations of the k loop. Same for array[1024] and array[3 * 1024]. array[16 + 0 * 16], array[16 + 1 * 16], etc. all map to different sets than any of the other array accesses, so they'll all be in the cache during the second iteration of the k loop.

Question 2 (1 points): (see above) With a 128B 4-way set associative cache with 16B cache blocks and an LRU replacement policy, how many cache misses will occur?

This cache has two sets. array[0] array[1024] array[2048] array[3 * 1024] array[16 + 1 * 16] array[16 + 3 * 16] all map to set 0. array[16 + 0 * 16] and array[16 * 2 + 16] map to set 1. All these values map to different blocks, so we don't need to consider whether we get hits from accessing two values in the same bock. Everything in set 0 will miss when k = 1 because we ran out of room in that set, and we always try accessing the least recently used value when k = 1. Everything in set 1 hit afterwards when k = 1, because we can store the two blocks together.

Suppose a superscalar out-of-order processor is executing

    addq %r9, %10
    addq %r11, %rcx
    mrmovq 0(%rcx), %r8
    addq %r12, %rcx
    mrmovq 0(%rcx), %r9
    subq %r8, %r9
    mrmovq 8(%rcx), %r11

Question 3 (1 points): (see above) If the processor had enough functional units The mrmovq (%rcx), %r9 instructuion could be executed at the same time as
Select all that apply

Question 4 (1 points): (see above) If the processor uses register renaming where architectural registers are mapped to physical registers, then most likely ______.
Select all that apply

(accepted any answer)
%rcx in addq %r11, %rcx and in mrmovq 0(%rcx), %r8 will use different physical registers
(correct answer)
%r9 in addq %r9, %r10 and in mrmovq 0(%rcx), %r9 will use different physical registers
(correct answer)
%rcx in mrmovq 0(%rcx), %r8 and mrmovq 0(%rcx), %r9` will use different physical registers.

for %rcx in addq %r11, %rcx, there are two versions of rcx in the addq; for others: different versions of the values [key corrected 2019-10-30]

Consider the following C function:

int size;  /* global variable, set elsewhere */

void example(int *A, int *B) {
    for (int j = 0; j < size; ++j) {
        for (int i = 0; i < size; ++i) { /* (1) */
            B[i] += A[i] * A[j];
        }
    }
}

Question 1 (1 points): (see above) Consider a version of the C function with the loops swapped:

void example_swapped_loops(int *A, int *B) {
    for (int i = 0; i < size; ++i) {
        for (int j = 0; j < size; ++j) {
            B[i] += A[i] * A[j];
        }
    }
}

Which of the following expressions are sufficient for the example_swapped_loops code to have an equivalent effect to the original code above? (Assume comparisons with pointers compare the pointer addresses.)
Select all that apply

A != B && A != &size && B != &size
A != B && &size < A && &size < B
(correct answer)
B == A + size && (&size < A || &size > A + size * 2)
B == &size && A > B + size

for A != B cases: consider B = &A[1]

Question 2 (1 points): (see above) Because of concerns about aliasing, a compiler often will not keep A[j] in register between iterations of the for loop labelled (1). What would be an effective way for a programmer to prevent this?
Select all that apply

(correct answer)
create a local variable Ajbefore the for loop labelled (1) and assign A[j] to it, and use this Aj variable instead of A[j] within the for loop
create a local variable Aj_ptr before the for loop labelled (1) and assign &A[j] to it, and use *Aj_ptr instead of A[j] within the for loop
unroll the for loop labelled (1)
convert size from a global to local variable

Question 3 (1 points): If performed excessively, function inlining increases _____.
Select all that apply

(correct answer)
the size of an executable
(correct answer)
the number of instruction cache misses a program experiences
the number of times values are written to and read from the stack

Question 1 (1 points): When performing a context switch, from process A to process B, ____________ saves the value of registers like %rax in __________.

the processor / the operating system's memory
the processor / the exception table or interrupt vector table
(correct answer)
the operating system / the operating system's memory
the operating system / the exception table or interrupt vector table

Question 2 (1 points): Which of the following are true about running an exception handler?
Select all that apply

(correct answer)
the exception handler runs in kernel mode
when an exception is triggered by a particular instruction, the memory address of the exception handler run is part of that instruction
(correct answer)
when an exception is triggered by an external event, the processor saves the program counter somewhere before running the exception handler
when an exception is triggered by an external event, the processor saves all general purpose registers (like %rax, %rcx, etc. on x86) to the current stack before running the exception handler

Suppose a system uses virtual memory where:

pages are 1024 bytes,
virtual addresses are 20 bits,
physical addresses are 24 bits,
page tables are stored in memory (using a single-level structure), and
page table entries are 4 bytes

Question 3 (1 points): (see above) How large are physical page numbers in bits? Write your answer as a base-10 number.

Answer: (accepted answers: /14(?:\s*bits?)?/)

: 24 bit physical addresses - 10 bit page offset

Question 4 (1 points): (see above) What is the virtual page number for the virtual address 0x12345? Write your answer as a hexadecimal number.

Answer: (accepted answers: /(?:0[xX])?0*48/)

: 0x12345 shifted by 10 (page offset size)

Question 5 (1 points): (see above) Suppose the page table entry for the virtual address 0x10007 is valid and contains the physical page number 4. What is the physical address that corresponds to virtual address 0x10007? Write your answer as a hexadecimal number.

Answer: (accepted answers: /(?:0[xX])?0*1007/)

: page number 4's base address is 4 times 1024 (page size) plus a page offset of 7

Question 6 (1 points): (see above) If the page table base address is physical (byte) address 0x1000 then what is the physical (byte) address of the page table entry for virtual page number 3? Write your answer as a hexadecimal number.

Answer: (accepted answers: /(?:0[xX])?0*100[cC]/)

: 3 times 4 (page table entry size) into the page table

Consider a system with:

38-bit virtual addresses
36-bit physical addresses
16384 byte pages
4-byte page table entries
two-level page tables, where page tables at each level are one page
a 64-entry, 8-way set associative TLB

Question 1 (1 points): (see above) If the page table base pointer contains the physical (byte) address 0x100000, then what is the physical (byte) address of the first-level page table entry for virtual address 0x10000? Write your answer as a hexadecimal number.

Answer: (accepted answers: /(?:0x)?0*100000/)

: VPN part 1 is 0

Question 2 (1 points): (see above) If the page table base pointer contains the physical (byte) address 0x100000, then what is the physical (byte) address of the first-level page table entry for virtual address 0x80010000? Write your answer as a hexadecimal number.A

Answer: (accepted answers: /(?:0x)?0*100080/)

: VPN part 1 is 32 times 4 bytes per PTE = offset 0x80

Question 3 (1 points): (see above) If the base address for the second-level page table corresponding to virtual adress 0x10000 is (physical byte address) 0x40000, then what is the physical (byte) address of the second-level page table entry for virtual address 0x80010000? Write your answer as a hexadecimal number.

Answer: (accepted answers: /(?:0x)?0*40010/)

: VPN part 2 is 4, offset into table into table is 0x4 * 4 = 0x10

Question 4 (1 points): (see above) On this system, accessing the virtual address 0x80341 accesses the same TLB set as accessing the virtual address _______.
Select all that apply

(correct answer)
0x81341
0x90341
(correct answer)
0xA0341
0x1234341

Question 5 (1 points): Which of the following is/are true about the process of swapping a page out of DRAM and into disk:
Select all that apply

The valid bit of the page that is moved from DRAM to disk is set to 1
The program whose page is getting swap from DRAM to disk was the program that triggered the fault.
(correct answer)
The Operating System chooses which page to move from DRAM to disk
(correct answer)
The operation of swapping page will be faster if the page being replace has not been modified.

Question 6 (1 points): Consider TLB cache in a 4-level paging machine. Which values are stored in the TLB?
Select all that apply

(accepted any answer)
Virtual Page Number
The 1st Level Physical Page Number
The Page offset
(correct answer)
Metadata for each the physical page

TLB tag contains part of VPN (or all if fully associative TLB)

Question 1 (1 points): Suppose a system uses 8192-byte pages. Assuming the L1 cache uses tags which store parts of physical addresses, with which of the following L1 cache configurations could it overlap TLB accesses and cache accesses?
Select all that apply

a direct-mapped 32KB cache with 64B blocks
a 16-way 1MB cache with a 64B blocks and a random replacement policy
(correct answer)
an 8-way 64KB cache with 64B blocks and an LRU replacement policy
(correct answer)
a direct-mapped 4KB cache with 64B blocks

originally miskeyed to say 32KB would be possible, which was not

Consider a system with a 16-entry, 2-way TLB with an LRU replacement policy and 4096-byte pages. A program runs on this system with an initially empty TLB and accesses one byte at the following virtual addresses in this order:

0x01023
0x05023
0x0103F
0xF9000
0xF8FF8
0xE9FF8
0x01023

(Assume the pages corresponding to all virtual addresses are valid in the page table.)

Question 2 (1 points): (see above) Which of the following accesses are TLB misses?
Select all that apply

(correct answer)
the access to 0x05023
the access to 0x0103F
(correct answer)
the access to 0xF8FF8
(dropped from quiz)
the access to 0xF9FF8
(accepted any answer)(correct answer)
the second access to 0x01023

we accidentally mentioned an access that didn't happen in the pattern abov (0xF9FF8). With the actual access above 0xE9FF8, the second access to 0x01023 is a miss, without it is a hit.

Question 3 (1 points): (see above) After these accesses complete, how many page table entries will the TLB store?

Question 4 (1 points): Increasing the size of pages will _____________.
Select all that apply

increase the amount of storage required to implement a TLB that stores a particular number of entries
(correct answer)
increase the amount of data that can potentially be accessed by a program without any TLB misses
(correct answer)
increase the number of bits used for the page offset

All quizzes (key)

Quiz 01

Quiz 02

Quiz 03

Quiz 04

Quiz 05

Quiz 06

Quiz 07

Quiz 08

Quiz 09

Quiz 10

Quiz 11