CS 3330: Computer Architecture

This page does not represent the most current semester of this course; it is present merely as an archive.

1 Y86 4: PIPE part 2

Continuing with the same subset of instructions from the lab (nop, halt, irmovq, and rrmovq), add the remaining pipeline stages.

Even though these instructions do not use execute or memory, add in all five stages.

Implement forwarding to resolve hazards.

1.1 Approach

Get to know and love Figure 4.41 (page 424 of the 3rd edition, page 403 of the 2nd). To match the textbook’s text, the names of these register banks would be fF, fD, dE, eM, and mW. However, we can’t have two register banks with the lowercase f, so pick something else for the name of fF (maybe xF or pP or the like) and remember that the book’s f_predPC is named differently in your HCL.

I suggest you do not add all the registers to each register bank in advance, instead adding them as needed for the instructions you implement, but that is just a suggestion.
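As a rough illustration of the naming scheme, here is a minimal sketch of the first two banks in hcl2d syntax. It assumes the bank feeding fetch is called pP; every field name, width, and default value below is hypothetical, and the remaining banks (dE, eM, mW) would follow the same pattern with whatever fields your instructions turn out to need.

```
# Sketch only; all field names, widths, and defaults are assumptions.
register pP {
    predPC:64 = 0;        # stands in for the book's F register, so the
                          # book's f_predPC becomes p_predPC here
}
register fD {
    stat:3  = STAT_AOK;   # status carried along with the instruction
    icode:4 = NOP;        # an empty slot behaves like a nop
}
```

With these declarations, a lowercase prefix (p_predPC, f_icode) names a value going into a bank and an uppercase prefix (P_predPC, D_icode) names the value coming out of it.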

1.1.1 Stalls and Bubbles

You will only need one pipeline stall for this homework: if the Stat computed during Fetch is not STAT_AOK, stall the register feeding the fetch stage. Do not bubble the register after it, because the non-AOK instruction must reach the end of the pipeline. Note that you’ll need to pass that non-AOK Stat down through all the pipeline registers before putting it into the Stat output.
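The stall and the status hand-off described above might look roughly like the following sketch. It assumes the bank feeding fetch is named pP, that each of fD, dE, eM, and mW has a hypothetical 3-bit field named stat, and that f_stat has already been computed in the fetch stage; none of these names come from the assignment.

```
# Sketch only; pP and the stat fields are assumed names.
stall_P = f_stat != STAT_AOK;   # freeze the PC; bubble_D is deliberately not set

d_stat = D_stat;                # carry the status through every stage
e_stat = E_stat;
m_stat = M_stat;
Stat   = W_stat;                # only the instruction in writeback sets Stat
```

Because Stat is driven from the last bank, a halt or invalid instruction stops the simulation only once it reaches the end of the pipeline, while the stalled fetch keeps re-reading the same address in the meantime.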

1.1.2 Forwarding

Among these instructions, only rrmovq reads a register, but either irmovq or rrmovq could be writing it. To implement forwarding, the easiest approach is probably to add a mux to the inputs of the dE register that checks for things like

reg_srcA == m_dstE : m_valE;

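and so on for every other stage later in the pipeline whose dst could match one of your srcs. When more than one later stage matches, the stage closest to decode holds the most recent write and should win.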
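Put together, checks like the one above could form a single mux, sketched below. The wire names e_dstE, e_valE, W_dstE, W_valE, and d_valA are assumptions that follow the naming pattern of m_dstE and m_valE; adjust them to match your own register banks. The cases are ordered so that the instruction closest to decode, which made the most recent write, takes priority.

```
# Sketch only; every name except reg_srcA, reg_outputA, m_dstE, and
# m_valE is an assumption.
d_valA = [
    reg_srcA == REG_NONE : 0;            # nothing is read, so nothing to forward
    reg_srcA == e_dstE   : e_valE;       # newest writer: instruction in execute
    reg_srcA == m_dstE   : m_valE;       # next: instruction in memory
    reg_srcA == W_dstE   : W_valE;       # oldest: instruction in writeback
    1                    : reg_outputA;  # no hazard: use the register file
];
```

The REG_NONE case comes first so that an instruction that reads no register never matches a later instruction that also writes no register.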

1.2 Organizing your code

You may either put all your registers at the beginning of the file, or you may put them between the phases in question. The hcl2d executable makes multiple passes through your file, so defining them after you use them is permitted and can lead to a more flow-oriented layout, but some people find that distracting and prefer to define before use. Use whichever you prefer.

1.3 Testing your code

The following program (y86/halt.yo)

halt

should take 5 cycles to complete and do nothing (fetching address 0x0 four times and updating no registers or memory).

The following program (y86/nophalt.yo)

nop
nop
nop
halt

should take 8 cycles to complete and do nothing: it fetches addresses 0x0, 0x1, and 0x2 once each and then 0x3 five times (never fetching address 0x4), and updates no registers or memory.

The following program (y86/ins.yo)

iaddq $1, %r8

should take 6 cycles to complete, do nothing, and end with an invalid instruction error (error code 4, STAT_INS).

The following program (y86/irrr7.yo)

irmovq $1, %rax
rrmovq %rax, %rbx

should take 7 cycles to complete and leave 1 in both %rax and %rbx.

The following program (y86/rrmovq.yo)

irmovq $5678, %rax
irmovq $34, %rcx
rrmovq %rax, %rdx
rrmovq %rcx, %rax

should take 9 cycles to complete and change the following registers (values are shown in hexadecimal, so 5678 appears as 162e and 34 as 22):

| RAX:               22   RCX:               22   RDX:             162e |

The following program (y86/irrr7b.yo)

irmovq $0x1, %rax
irmovq $0x2, %rbx
irmovq $0x3, %rcx
rrmovq %rax, %rdx
rrmovq %rcx, %rbx
rrmovq %rbx, %rsi
rrmovq %rbx, %rdi

should take 12 cycles to complete and change the following registers:

| RAX:                1   RCX:                3   RDX:                1 |
| RBX:                3   RSP:                0   RBP:                0 |
| RSI:                3   RDI:                3   R8:                 0 |

1.4 Submit

Submit pipehw1.hcl on the submission page.

Copyright © 2016 by Luther Tychonievich and Charles Reiss. All rights reserved.
Last updated 2016-10-24 15:43:40