This page does not represent the most current semester of this course; it is present merely as an archive.
In this lab we'll implement a subset of the Y86 instruction set. In particular, we'll implement nop, halt, OPl, rrmovl, irmovl, and cmovXX.
You are encouraged to work with a friend on this lab. There is a lot of detail in the writeup, and having someone to talk it over with as you go should help.
You should create a file named lab4.hcl and work in that. We suggest you use pc.hcl from homework 3 as a starting point.
Add the following line somewhere in lab4.hcl:
stall_P = Stat != STAT_AOK; # so that we see the same final PC as the yis tool
There are various constants that can be useful in this and subsequent labs and homeworks; I'll assume the following:
# icodes; see page 338
const HALT = 0, NOP = 1, RRMOVL = 2, IRMOVL = 3, RMMOVL = 4, MRMOVL = 5;
const OPL = 6, JXX = 7, CALL = 8, RET = 9, PUSHL = 10, POPL = 11;
const CMOVXX = RRMOVL;
# ifuns; see page 339
const ALWAYS = 0, LE = 1, LT = 2, EQ = 3, NE = 4, GE = 5, GT = 6;
const ADDL = 0, SUBL = 1, ANDL = 2, XORL = 3;
I'll also assume you have a working PC update using register P { pc:32 = 0; }; if not, see tiny.hcl for an example of how that can work.
halt and nop
As with homework 3, if the icode is HALT, set the Stat to STAT_HLT; otherwise, set it to STAT_AOK if you know how to run that icode or STAT_INS otherwise.
For nop, all you need to do is update the pc.
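The status rule above can be sketched as a small Python model. This is not HCL and not the required solution; the strings simply stand in for the HCL constants STAT_HLT, STAT_AOK, and STAT_INS, and the set IMPLEMENTED is a hypothetical name for the icodes this lab covers.

```python
# Python model of the Stat rule (an illustrative sketch, not lab4.hcl itself).
HALT, NOP, RRMOVL, IRMOVL, OPL = 0, 1, 2, 3, 6

# icodes this lab implements; cmovXX shares RRMOVL's icode
IMPLEMENTED = {HALT, NOP, RRMOVL, IRMOVL, OPL}

def stat_for(icode):
    """Return the name of the status to report for one icode."""
    if icode == HALT:
        return "STAT_HLT"
    if icode in IMPLEMENTED:
        return "STAT_AOK"
    return "STAT_INS"   # an icode we do not know how to run
```

In HCL the same decision is a single mux on icode, with the HALT case listed first.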
irmovl
The textbook's Figure 4.18 (page 366) notes the following semantics for irmovl:
| Stage | irmovl V, rB |
|---|---|
| Fetch | icode:ifun ← M1[PC]; rA:rB ← M1[PC + 1]; valC ← M4[PC + 2]; valP ← PC + 6 |
| Decode | |
| Execute | valE ← 0 + valC |
| Memory | |
| Writeback | R[rB] ← valE |
| PC Update | PC ← valP |
The table suggests we fetch one byte and split it in half for icode and ifun, then fetch another byte and split it in half for rA and rB, etc. Our flavor of HCL instead fetches 6 bytes (little-endian) into i6bytes and lets us split them up using slice operators.
You need to define icode, ifun, rA, rB, and valC (and, optionally, valP). Keep in mind how many bits each wire / variable needs.
The instruction irmovl $10, %edx is assembled as the bytes 30 f2 0a 00 00 00 and read into i6bytes as 0x0000000af230. To retrieve the icode (i.e., 3) we say something like
wire icode:4;
icode = i6bytes[4..8];
Similarly, rB is i6bytes[8..12], rA is i6bytes[12..16], and so on.
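To check slice positions by hand, the slicing can be modeled in Python. This is only a sketch: bit_slice is a hypothetical helper that mimics value[lo..hi] as "bits lo up to, but not including, hi", and the ifun and valC positions are inferred from the byte layout above rather than stated in the text.

```python
def bit_slice(value, lo, hi):
    """Model of value[lo..hi]: bits lo (inclusive) up to hi (exclusive)."""
    return (value >> lo) & ((1 << (hi - lo)) - 1)

# irmovl $10, %edx assembles to these six bytes, read little-endian
i6bytes = int.from_bytes(bytes.fromhex("30f20a000000"), "little")

ifun  = bit_slice(i6bytes, 0, 4)     # low half of the first byte (inferred)
icode = bit_slice(i6bytes, 4, 8)     # high half of the first byte
rB    = bit_slice(i6bytes, 8, 12)    # low half of the second byte
rA    = bit_slice(i6bytes, 12, 16)   # high half of the second byte
valC  = bit_slice(i6bytes, 16, 48)   # the next four bytes (inferred, irmovl only)

print(hex(i6bytes), icode, ifun, hex(rA), rB, valC)
```

Note the widths: each of the first four fields needs 4 bits, while valC needs 32.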
It is worth noting that Y86 is like RISC in that icode, ifun, rA, and rB are in the same place for every instruction. It is like CISC in that the immediate value valC is in a different place for different instructions.
The table also suggests creating a valP; you don't actually need that at this point, but it is a good idea to put pc + whatever into a wire valP and then put valP into p_pc. Having a separate valP both matches the book's discussion and will make adding jXX, call, and ret easier when you get to them for homework.
In the Execute stage there is nothing to do. The book suggests adding 0 to valC and storing it in valE, a shortcut that "forwards" valC to valE, necessitated by their more limited variant of HCL. Because our HCL is more flexible, we can skip that step.
The register file has two write ports, E and M. If dstE is not REG_NONE then wvalE will be written to register number dstE, and if dstM is not REG_NONE then wvalM will be written to register number dstM.
For irmovl, dst_ specifies the register that the immediate value is moved to (rB for irmovl). wval_ specifies the value that will be sent to the dst_ register (where _ can be either E or M) – in this case, the immediate value.
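The write-port behaviour described above can be modeled roughly as follows. This Python sketch assumes REG_NONE is encoded as 0xF (as in the textbook) and uses a hypothetical write_back helper; the loop replays the four irmovl instructions of the irmovl.ys test program so you can predict the final register values.

```python
REG_NONE = 0xF  # assumed encoding for "no register", as in the textbook

def write_back(regs, dstE, wvalE, dstM=REG_NONE, wvalM=0):
    """Return a copy of regs after both write ports have had their chance."""
    out = list(regs)
    if dstE != REG_NONE:
        out[dstE] = wvalE & 0xFFFFFFFF
    if dstM != REG_NONE:
        out[dstM] = wvalM & 0xFFFFFFFF
    return out

EAX, ECX, EDX = 0, 1, 2   # Y86 register numbers
regs = [0] * 8
for rB, valC in [(EAX, 1), (ECX, 2), (EDX, 34), (EAX, 5678)]:
    # irmovl: the destination is rB and the value written is the immediate
    regs = write_back(regs, dstE=rB, wvalE=valC)

print([format(r, "x") for r in regs[:3]])
```

The simulator prints registers in hexadecimal, which is why 5678 shows up as 162e and 34 as 22.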
Put the following into irmovl.ys:
irmovl $1, %eax
irmovl $2, %ecx
irmovl $34, %edx
irmovl $5678, %eax
(remember to have a blank line at the end). Assemble it with yas to get irmovl.yo and run your simulator on it; you should see registers end as follows:
| EAX: 162e ECX: 2 EDX: 22 EBX: 0 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
rrmovl
The textbook's Figure 4.18 (page 366) notes the following semantics for rrmovl:
| Stage | rrmovl rA, rB |
|---|---|
| Fetch | icode:ifun ← M1[PC]; rA:rB ← M1[PC + 1]; valP ← PC + 2 |
| Decode | valA ← R[rA] |
| Execute | valE ← 0 + valA |
| Memory | |
| Writeback | R[rB] ← valE |
| PC Update | PC ← valP |
As with irmovl, 0 + valA is unneeded in our variant of HCL. You'll need to add some muxes for things that differ between irmovl and rrmovl, like what goes into valP and wvalE.
The only new part is decode; built-in signal srcA determines what register is the source for the register move. If you set srcA to something other than REG_NONE, rvalA will be set by the register file as the contents of that register.
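One way to think about the muxes mentioned above is as "first true case wins" selections. The Python sketch below models that idea; mux, next_pc, source_a, and value_e are hypothetical names for illustration, and REG_NONE = 0xF is an assumed encoding. It is a model of the behaviour, not HCL you can paste into lab4.hcl.

```python
HALT, NOP, RRMOVL, IRMOVL = 0, 1, 2, 3
REG_NONE = 0xF  # assumed encoding for "no register"

def mux(*cases):
    """Return the value of the first (condition, value) pair whose condition holds."""
    for condition, value in cases:
        if condition:
            return value
    raise ValueError("no mux case matched")

def next_pc(icode, pc):
    """valP: the address of the instruction after this one."""
    return mux(
        (icode == IRMOVL, pc + 6),   # icode:ifun, rA:rB, and a 4-byte valC
        (icode == RRMOVL, pc + 2),   # icode:ifun and rA:rB only
        (True,            pc + 1),   # nop and halt are a single byte
    )

def source_a(icode, rA):
    """srcA: only rrmovl needs to read register rA here."""
    return mux((icode == RRMOVL, rA), (True, REG_NONE))

def value_e(icode, valC, rvalA):
    """wvalE: the immediate for irmovl, the register value for rrmovl."""
    return mux((icode == IRMOVL, valC), (icode == RRMOVL, rvalA), (True, 0))
```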
Put the following into rrmovl.ys:
irmovl $5678, %eax
irmovl $34, %ecx
rrmovl %eax, %edx
rrmovl %ecx, %eax
(remember to have a blank line at the end). Assemble it with yas to get rrmovl.yo and run your simulator on it; you should see registers end as follows:
| EAX: 22 ECX: 22 EDX: 162e EBX: 0 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
You should also now be able to get the same results using your simulator as you get from tools/yis when running y86/prog6.yo.
OPl
The textbook's Figure 4.18 (page 366) notes the following semantics for OPl:
| Stage | OPl rA, rB |
|---|---|
| Fetch | icode:ifun ← M1[PC]; rA:rB ← M1[PC + 1]; valP ← PC + 2 |
| Decode | valA ← R[rA]; valB ← R[rB] |
| Execute | valE ← valB OP valA; Set CC |
| Memory | |
| Writeback | R[rB] ← valE |
| PC Update | PC ← valP |
There are two things to do in the Execute phase.
The simple part is the operation itself, the ALU, which is essentially a mux based on ifun (e.g., icode == OPL && ifun == XORL : rvalA ^ rvalB;)
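As a sanity check on operand order, here is a Python model of the ALU. It is a sketch under the assumption that results are truncated to 32 bits; alu and MASK32 are hypothetical names, and the semantics follow the table's valE ← valB OP valA.

```python
ADDL, SUBL, ANDL, XORL = 0, 1, 2, 3
MASK32 = 0xFFFFFFFF

def alu(ifun, rvalA, rvalB):
    """valE = valB OP valA, truncated to 32 bits."""
    if ifun == ADDL:
        result = rvalB + rvalA
    elif ifun == SUBL:
        result = rvalB - rvalA    # order matters: rB minus rA
    elif ifun == ANDL:
        result = rvalB & rvalA
    elif ifun == XORL:
        result = rvalB ^ rvalA
    else:
        raise ValueError("unknown ifun")
    return result & MASK32

# subl %edx, %ecx with edx = 7 and ecx = 3 computes 3 - 7
print(hex(alu(SUBL, 7, 3)))   # prints 0xfffffffc
```

Only subl cares which operand comes first; the other three operations are commutative.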
The other part is setting condition codes. We'll use a simpler set than the textbook: instead of trying to track overflow and so on, we'll simply check less-than, equal-to, and greater-than directly. We'll need a register to store those three flags inside of:
register C { # book uses Z/S/O instead of lt/eq/gt...
lt:1 = 0;
eq:1 = 1;
gt:1 = 0;
}
Register banks like C have a special input stall_C which, if 1, causes the registers to ignore inputs and keep their current value. Thus, we want to stall C unless there was an OPl:
stall_C = icode != OPL;
Once we have done that, we record if the (signed) value of valE is <, =, or > 0 (using unsigned comparison operators):
c_gt = valE != 0 && valE <= 0x7fffffff;
c_eq = valE == 0;
c_lt = valE != 0 && valE >= 0x80000000;
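To convince yourself that these unsigned comparisons really classify the signed value, the three lines can be mirrored in Python and compared against an explicit two's-complement conversion. condition_codes and as_signed are hypothetical helper names used only for this check.

```python
def condition_codes(valE):
    """Return (lt, eq, gt) for a 32-bit valE given as an unsigned integer."""
    gt = valE != 0 and valE <= 0x7FFFFFFF
    eq = valE == 0
    lt = valE != 0 and valE >= 0x80000000
    return int(lt), int(eq), int(gt)

def as_signed(valE):
    """Two's-complement interpretation of a 32-bit unsigned integer."""
    return valE - (1 << 32) if valE & 0x80000000 else valE

# The unsigned tests agree with signed comparison against zero.
for valE in (0, 3, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFC):
    s = as_signed(valE)
    assert condition_codes(valE) == (int(s < 0), int(s == 0), int(s > 0))
```

Exactly one of the three flags is set after every OPl, which is what the register C lines in the traces show.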
If you run your simulator on opl.yo, which is an assembled version of
irmovl $7, %edx
irmovl $3, %ecx
addl %ecx, %ebx # b = 3
subl %edx, %ecx # c = -4 11..1100
andl %edx, %ebx # b = 3
xorl %ecx, %edx # d = -5 11..1011
andl %edx, %esi
you should see (without the -q flag, shown with some lines removed for brevity):
+--------------- between cycles 2 and 3 ------------------+
| EAX: 0 ECX: 3 EDX: 7 EBX: 0 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
| register P(N) { pc=0000000c } |
| register C(S) { lt=0 eq=1 gt=0 } |
+--------------- between cycles 3 and 4 ------------------+
| EAX: 0 ECX: 3 EDX: 7 EBX: 3 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
| register P(N) { pc=0000000e } |
| register C(N) { lt=0 eq=0 gt=1 } |
+--------------- between cycles 4 and 5 ------------------+
| EAX: 0 ECX: fffffffc EDX: 7 EBX: 3 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
| register P(N) { pc=00000010 } |
| register C(N) { lt=1 eq=0 gt=0 } |
+--------------- between cycles 5 and 6 ------------------+
| EAX: 0 ECX: fffffffc EDX: 7 EBX: 3 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
| register P(N) { pc=00000012 } |
| register C(N) { lt=0 eq=0 gt=1 } |
+--------------- between cycles 6 and 7 ------------------+
| EAX: 0 ECX: fffffffc EDX: fffffffb EBX: 3 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
| register P(N) { pc=00000014 } |
| register C(N) { lt=1 eq=0 gt=0 } |
+--------------- between cycles 7 and 8 ------------------+
| EAX: 0 ECX: fffffffc EDX: fffffffb EBX: 3 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
| register P(N) { pc=00000016 } |
| register C(N) { lt=0 eq=1 gt=0 } |
+------------------- halted in state: --------------------------+
| EAX: 0 ECX: fffffffc EDX: fffffffb EBX: 3 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
| register P(N) { pc=00000016 } |
| register C(S) { lt=0 eq=1 gt=0 } |
You should also now be able to get the same results using your simulator as you get from tools/yis when running y86/prog1.yo through y86/prog4.yo (and y86/prog6.yo should still work too).
cmovXX
The cmovXX family of instructions has the same icode as rrmovl but non-zero ifuns.
The simplest way to implement cmovXX is to create a wire conditionsMet:1; and set it using a mux with entries like ifun == LE : C_lt || C_eq; then, in the writeback stage of rrmovl, set dstE (or dstM, if that's what you used) to REG_NONE if conditionsMet is false. Recall that muxes execute only the first true case, so adding something like !conditionsMet && icode == CMOVXX : REG_NONE; before the other cases when setting the dst_ should suffice.
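The condition logic can be tabulated in Python to predict which moves in a test program should take effect. This is a sketch only: conditions_met and cmov_dst are hypothetical names, REG_NONE = 0xF is an assumed encoding, and the flags are passed as the three bits stored in register C.

```python
ALWAYS, LE, LT, EQ, NE, GE, GT = 0, 1, 2, 3, 4, 5, 6
REG_NONE = 0xF  # assumed encoding for "no register"

def conditions_met(ifun, lt, eq, gt):
    """Model of the conditionsMet wire for one ifun and the stored flags."""
    table = {
        ALWAYS: True,        # plain rrmovl always moves
        LE: lt or eq,
        LT: lt,
        EQ: eq,
        NE: lt or gt,
        GE: gt or eq,
        GT: gt,
    }
    return bool(table[ifun])

def cmov_dst(ifun, rB, lt, eq, gt):
    """Destination register for rrmovl/cmovXX: rB if the move happens, else none."""
    if not conditions_met(ifun, lt, eq, gt):
        return REG_NONE      # checked first, like the first case of the mux
    return rB
```

For example, after andl leaves gt set, cmovg and cmovne write their destination while cmovl and cmove do not.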
If you run your simulator on cmovXX.yo, which is an assembled version of
irmovl $2766, %ebx
irmovl $1, %eax
andl %eax, %eax
cmovg %ebx, %ecx
cmovne %ebx, %edx
irmovl $-1, %eax
andl %eax, %eax
cmovl %ebx, %esp
cmovle %ebx, %ebp
xorl %eax, %eax
cmove %ebx, %esi
cmovge %ebx, %edi
irmovl $2989, %ebx
irmovl $1, %eax
andl %eax, %eax
cmovl %ebx, %ecx
cmove %ebx, %edx
irmovl $-1, %eax
andl %eax, %eax
cmovge %ebx, %esp
cmovg %ebx, %ebp
xorl %eax, %eax
cmovl %ebx, %esi
cmovne %ebx, %edi
irmovl $0, %ebx
you should end with registers
| EAX: 0 ECX: ace EDX: ace EBX: 0 |
| ESP: ace EBP: ace ESI: ace EDI: ace |
Submit a file named lab4.hcl on the submission page.