This page does not represent the most current semester of this course; it is present merely as an archive.
In this lab we'll implement a subset of the Y86 instruction set. In particular, we'll implement nop, halt, OPl, rrmovl, irmovl, and cmovXX.
You are encouraged to work with a friend on this lab. There is a lot of detail in the writeup, and having someone to talk it over with as you go should help.
You should create a file named lab4.hcl and work in that. We suggest you use pc.hcl from homework 3 as a starting point.
Add the following line somewhere in lab4.hcl:
stall_P = Stat != STAT_AOK; # so that we see the same final PC as the yis tool
There are various constants that can be useful in this and subsequent labs and homeworks; I'll assume the following:
# icodes; see page 338
const HALT = 0, NOP = 1, RRMOVL = 2, IRMOVL = 3, RMMOVL = 4, MRMOVL = 5;
const OPL = 6, JXX = 7, CALL = 8, RET = 9, PUSHL = 10, POPL = 11;
const CMOVXX = RRMOVL;
# ifuns; see page 339
const ALWAYS = 0, LE = 1, LT = 2, EQ = 3, NE = 4, GE = 5, GT = 6;
const ADDL = 0, SUBL = 1, ANDL = 2, XORL = 3;
I'll also assume you have a working PC update using register P { pc:32 = 0; }; if not, see tiny.hcl for an example of how that can work.
halt and nop
As with homework 3, if the icode is HALT, set Stat to STAT_HLT; otherwise, set it to STAT_AOK if your simulator knows how to run that icode, or to STAT_INS if it does not.
For nop, all you need to do is update the pc.
irmovl
The textbook's Figure 4.18 (page 366) notes the following semantics for irmovl:
Stage | irmovl V, rB |
---|---|
Fetch | icode:ifun ← M1[PC]; rA:rB ← M1[PC + 1]; valC ← M4[PC + 2]; valP ← PC + 6 |
Decode | |
Execute | valE ← 0 + valC |
Memory | |
Writeback | R[rB] ← valE |
PC Update | PC ← valP |
The table suggests we fetch one byte and split it in half for icode and ifun, then fetch another byte and split it in half for rA and rB, etc. Our flavor of HCL instead fetches 6 bytes (little-endian) into i6bytes and lets us split them up using slice operators.
You need to define icode, ifun, rA, rB, and valC (and, optionally, valP). Keep in mind how many bits each wire / variable needs.
The instruction irmovl $10, %edx is assembled as the bytes 30 f2 0a 00 00 00 and read into i6bytes as 0x0000000af230. To retrieve the icode (i.e., 3) we say something like
wire icode:4;
icode = i6bytes[4..8];
Similarly, rB is i6bytes[8..12], rA is i6bytes[12..16], and so on.
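The slicing described above can be checked outside the simulator. The following is a minimal Python model, not HCL: the helper `bits` is a hypothetical stand-in for the slice operator, assuming `x[lo..hi]` means bits lo (inclusive) through hi (exclusive). The positions used for ifun and valC are inferred from the byte layout of the example rather than stated in this writeup.

```python
def bits(value, lo, hi):
    """Model of an HCL-style slice value[lo..hi]: bit lo up to, but not including, bit hi."""
    return (value >> lo) & ((1 << (hi - lo)) - 1)

# The six bytes of "irmovl $10, %edx" as they sit in memory, read little-endian.
i6bytes = int.from_bytes(bytes.fromhex("30f20a000000"), "little")

ifun = bits(i6bytes, 0, 4)     # low nibble of byte 0 (inferred position)
icode = bits(i6bytes, 4, 8)    # high nibble of byte 0
rB = bits(i6bytes, 8, 12)      # low nibble of byte 1
rA = bits(i6bytes, 12, 16)     # high nibble of byte 1
valC = bits(i6bytes, 16, 48)   # bytes 2..5, the 32-bit immediate (inferred position)

print(hex(i6bytes), icode, ifun, hex(rA), rB, valC)
```

For this instruction the model gives icode 3, ifun 0, rA 0xf (no register), rB 2 (%edx), and valC 10, which agrees with the byte sequence 30 f2 0a 00 00 00.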
It is worth noting that Y86 is like RISC in that icode, ifun, rA, and rB are in the same place for every instruction. It is like CISC in that the immediate value valC is in a different place for different instructions.
The table also suggests creating a valP; you don't actually need that at this point, but it is a good idea to put pc + whatever into a wire valP and then put valP into p_pc. Having a separate valP both matches the book's discussion and will make adding jXX, call, and ret easier when you get to them for homework.
There is nothing to do in the Execute stage. The book suggests adding 0 to valC and storing it in valE, a shortcut that "forwards" valC to valE, necessitated by their more limited variant of HCL. Because our HCL is more flexible, we can skip that step.
The register file has two write ports, E and M. If dstE is not REG_NONE then wvalE will be written to register number dstE, and if dstM is not REG_NONE then wvalM will be written to register number dstM.
For irmovl, dst_ specifies the register that the immediate value is moved to (rB for irmovl). wval_ specifies the value that will be written to the dst_ register (where _ can be either E or M); in this case, that is the immediate value.
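The behavior of the two write ports can be modeled in a few lines. This is a Python sketch, not HCL: the function `write_back` is hypothetical, and the value 0xF for REG_NONE is an assumption based on the standard Y86 "no register" encoding. Register numbers follow the usual Y86 order (%eax = 0, %ecx = 1, %edx = 2, and so on).

```python
REG_NONE = 0xF  # assumption: same value as the Y86 "no register" id

def write_back(regs, dstE, wvalE, dstM=REG_NONE, wvalM=0):
    """Return a copy of regs after one cycle of writes through ports E and M."""
    new_regs = list(regs)
    if dstE != REG_NONE:
        new_regs[dstE] = wvalE & 0xFFFFFFFF  # registers hold 32 bits
    if dstM != REG_NONE:
        new_regs[dstM] = wvalM & 0xFFFFFFFF
    return new_regs

# irmovl only uses port E: dstE is rB and wvalE is the immediate.
regs = [0] * 8
for rB, valC in [(0, 1), (1, 2), (2, 34), (0, 5678)]:
    regs = write_back(regs, dstE=rB, wvalE=valC)

print([hex(r) for r in regs])
```

Leaving a port's dst at REG_NONE is what makes that port a no-op for the cycle, which is why an instruction that writes nothing simply sets both to REG_NONE.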
Put the following into irmovl.ys:
irmovl $1, %eax
irmovl $2, %ecx
irmovl $34, %edx
irmovl $5678, %eax
(remember to have a blank line at the end). Assemble it with yas to get irmovl.yo and run your simulator on it; you should see registers end as follows:
| EAX: 162e ECX: 2 EDX: 22 EBX: 0 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
rrmovl
The textbook's Figure 4.18 (page 366) notes the following semantics for rrmovl:
Stage | rrmovl rA, rB |
---|---|
Fetch | icode:ifun ← M1[PC]; rA:rB ← M1[PC + 1]; valP ← PC + 2 |
Decode | valA ← R[rA] |
Execute | valE ← 0 + valA |
Memory | |
Writeback | R[rB] ← valE |
PC Update | PC ← valP |
As with irmovl, 0 + valA is unneeded in our variant of HCL. You'll need to add some muxes for things that are different between irmovl and rrmovl, like what goes into valP and wvalE.
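To make the two differences concrete, here is a small Python model (not HCL) of the choices those muxes have to make. The function names `val_p` and `wval_e` are hypothetical. The lengths 6 and 2 come from the "valP ← PC + n" rows of the tables; the length 1 for nop and halt is the standard Y86 encoding rather than something stated in this writeup.

```python
# icode values from the constants listed near the top of this writeup
HALT, NOP, RRMOVL, IRMOVL, OPL = 0, 1, 2, 3, 6

# Instruction length in bytes, keyed by icode.
LENGTH = {HALT: 1, NOP: 1, RRMOVL: 2, OPL: 2, IRMOVL: 6}

def val_p(pc, icode):
    """Address of the next sequential instruction."""
    return pc + LENGTH[icode]

def wval_e(icode, valC, rvalA):
    """Value sent to write port E: the immediate for irmovl, register A for rrmovl."""
    if icode == IRMOVL:
        return valC
    if icode == RRMOVL:
        return rvalA
    raise ValueError("icode not covered by this sketch")

pc = 0
for icode in [IRMOVL, IRMOVL, RRMOVL, RRMOVL]:  # the shape of rrmovl.ys
    pc = val_p(pc, icode)
print(hex(pc))
```

In HCL each of these functions corresponds to one mux whose cases test icode.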
The only new part is decode; built-in signal srcA determines what register is the source for the register move. If you set srcA to something other than REG_NONE, rvalA will be set by the register file as the contents of that register.
Put the following into rrmovl.ys:
irmovl $5678, %eax
irmovl $34, %ecx
rrmovl %eax, %edx
rrmovl %ecx, %eax
(remember to have a blank line at the end). Assemble it with yas to get rrmovl.yo and run your simulator on it; you should see registers end as follows:
| EAX: 22 ECX: 22 EDX: 162e EBX: 0 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
Your simulator should now also produce the same results as tools/yis when running y86/prog6.yo.
OPl
The textbook's Figure 4.18 (page 366) notes the following semantics for OPl:
Stage | OPl rA, rB |
---|---|
Fetch | icode:ifun ← M1[PC]; rA:rB ← M1[PC + 1]; valP ← PC + 2 |
Decode | valA ← R[rA]; valB ← R[rB] |
Execute | valE ← valB OP valA; Set CC |
Memory | |
Writeback | R[rB] ← valE |
PC Update | PC ← valP |
There are two things to do in the Execute phase.
The simple part is the operation itself, the ALU, which is essentially a mux based on ifun (e.g., icode == OPL && ifun == XORL : rvalA ^ rvalB;).
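As a reference for what that mux should compute, here is a Python model (not HCL) of the four operations. The function `alu` is hypothetical; the operand order valB OP valA is taken from the Execute row of the table, and results are truncated to 32 bits as a register would hold them.

```python
MASK32 = 0xFFFFFFFF
ADDL, SUBL, ANDL, XORL = 0, 1, 2, 3  # ifun values from the constants above

def alu(ifun, valA, valB):
    """Compute valB OP valA, truncated to 32 bits."""
    if ifun == ADDL:
        result = valB + valA
    elif ifun == SUBL:
        result = valB - valA   # note the order: rB minus rA
    elif ifun == ANDL:
        result = valB & valA
    elif ifun == XORL:
        result = valB ^ valA
    else:
        raise ValueError("unknown ifun")
    return result & MASK32

# subl %edx, %ecx with %edx = 7 and %ecx = 3
print(hex(alu(SUBL, 7, 3)))
```

Only subtraction is sensitive to operand order, so it is the case most worth checking against tools/yis.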
The other part is setting condition codes. We'll use a simpler set than the textbook: instead of trying to track overflow and so on, we'll simply check less-than, equal-to, and greater-than directly. We'll need a register to store those three flags inside of:
register C { # book uses Z/S/O instead of lt/eq/gt...
lt:1 = 0;
eq:1 = 1;
gt:1 = 0;
}
Register banks like C have a special input stall_C which, if 1, causes the registers to ignore inputs and keep their current value. Thus, we want to stall C unless there was an OPl:
stall_C = icode != OPL;
Once we have done that, we record if the (signed) value of valE
is <, =, or > 0 (using unsigned comparison operators):
c_gt = valE != 0 && valE <= 0x7fffffff;
c_eq = valE == 0;
c_lt = valE != 0 && valE >= 0x80000000;
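The three comparisons above can be sanity-checked with a direct Python transcription (a model, not HCL). The function name `condition_codes` is hypothetical; valE is treated as an unsigned 32-bit value, so any value with the top bit set (0x80000000 or above) stands for a negative number.

```python
def condition_codes(valE):
    """Return (lt, eq, gt) for the signed meaning of an unsigned 32-bit valE."""
    gt = valE != 0 and valE <= 0x7FFFFFFF   # positive: nonzero, top bit clear
    eq = valE == 0
    lt = valE != 0 and valE >= 0x80000000   # negative: top bit set
    return int(lt), int(eq), int(gt)

for value in (3, 0xFFFFFFFC, 0):
    print(hex(value), condition_codes(value))
```

Exactly one of the three flags is set for any input, which is why the register's initial state uses eq = 1 with the other two at 0.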
If you run your simulator on opl.yo, which is an assembled version of
irmovl $7, %edx
irmovl $3, %ecx
addl %ecx, %ebx # b = 3
subl %edx, %ecx # c = -4 11..1100
andl %edx, %ebx # b = 3
xorl %ecx, %edx # d = -5 11..1011
andl %edx, %esi
you should see (without the -q flag, shown with some lines removed for brevity)
+--------------- between cycles 2 and 3 ------------------+
| EAX: 0 ECX: 3 EDX: 7 EBX: 0 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
| register P(N) { pc=0000000c } |
| register C(S) { lt=0 eq=1 gt=0 } |
+--------------- between cycles 3 and 4 ------------------+
| EAX: 0 ECX: 3 EDX: 7 EBX: 3 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
| register P(N) { pc=0000000e } |
| register C(N) { lt=0 eq=0 gt=1 } |
+--------------- between cycles 4 and 5 ------------------+
| EAX: 0 ECX: fffffffc EDX: 7 EBX: 3 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
| register P(N) { pc=00000010 } |
| register C(N) { lt=1 eq=0 gt=0 } |
+--------------- between cycles 5 and 6 ------------------+
| EAX: 0 ECX: fffffffc EDX: 7 EBX: 3 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
| register P(N) { pc=00000012 } |
| register C(N) { lt=0 eq=0 gt=1 } |
+--------------- between cycles 6 and 7 ------------------+
| EAX: 0 ECX: fffffffc EDX: fffffffb EBX: 3 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
| register P(N) { pc=00000014 } |
| register C(N) { lt=1 eq=0 gt=0 } |
+--------------- between cycles 7 and 8 ------------------+
| EAX: 0 ECX: fffffffc EDX: fffffffb EBX: 3 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
| register P(N) { pc=00000016 } |
| register C(N) { lt=0 eq=1 gt=0 } |
+------------------- halted in state: --------------------------+
| EAX: 0 ECX: fffffffc EDX: fffffffb EBX: 3 |
| ESP: 0 EBP: 0 ESI: 0 EDI: 0 |
| register P(N) { pc=00000016 } |
| register C(S) { lt=0 eq=1 gt=0 } |
Your simulator should now also produce the same results as tools/yis when running y86/prog1.yo through y86/prog4.yo (and y86/prog6.yo should still work too).
cmovXX
The cmovXX family of instructions has the same icode as rrmovl but non-zero ifuns.
The simplest way to implement cmovXX is to declare wire conditionsMet:1; and set it using a mux with entries like ifun == LE : C_lt || C_eq; (one entry per ifun). Then, in the writeback stage of rrmovl, set dstE (or dstM if that's what you used) to REG_NONE if conditionsMet is false. Recall that muxes execute only the first true case, so adding something like !conditionsMet && icode == CMOVXX : REG_NONE; before the other cases when setting the dst_ should suffice.
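The intended behavior can be modeled in Python (not HCL) as follows. The functions `conditions_met` and `dst_e` are hypothetical, the mapping for each ifun other than LE is the natural reading of its name rather than something spelled out in this writeup, and 0xF for REG_NONE is an assumption based on the standard Y86 encoding.

```python
ALWAYS, LE, LT, EQ, NE, GE, GT = range(7)  # ifun values from the constants above
CMOVXX = 2                                  # same icode as rrmovl
REG_NONE = 0xF                              # assumption: Y86 "no register" id

def conditions_met(ifun, lt, eq, gt):
    """Evaluate a cmovXX condition against the stored lt/eq/gt flags."""
    table = {
        ALWAYS: True,
        LE: lt or eq,
        LT: lt,
        EQ: eq,
        NE: not eq,
        GE: gt or eq,
        GT: gt,
    }
    return bool(table[ifun])

def dst_e(icode, ifun, rB, lt, eq, gt):
    """Destination for port E: suppressed when a conditional move's test fails."""
    if icode == CMOVXX and not conditions_met(ifun, lt, eq, gt):
        return REG_NONE  # checked first, like the first case of the mux
    return rB

# After "andl %eax, %eax" with %eax = 1 the flags are lt=0, eq=0, gt=1.
print(dst_e(CMOVXX, GT, 1, 0, 0, 1), dst_e(CMOVXX, LT, 1, 0, 0, 1))
```

With the flags set to gt, the cmovg writes its destination while the cmovl is redirected to REG_NONE, so the register keeps its old value.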
If you run your simulator on cmovXX.yo, which is an assembled version of
irmovl $2766, %ebx
irmovl $1, %eax
andl %eax, %eax
cmovg %ebx, %ecx
cmovne %ebx, %edx
irmovl $-1, %eax
andl %eax, %eax
cmovl %ebx, %esp
cmovle %ebx, %ebp
xorl %eax, %eax
cmove %ebx, %esi
cmovge %ebx, %edi
irmovl $2989, %ebx
irmovl $1, %eax
andl %eax, %eax
cmovl %ebx, %ecx
cmove %ebx, %edx
irmovl $-1, %eax
andl %eax, %eax
cmovge %ebx, %esp
cmovg %ebx, %ebp
xorl %eax, %eax
cmovl %ebx, %esi
cmovne %ebx, %edi
irmovl $0, %ebx
you should end with registers
| EAX: 0 ECX: ace EDX: ace EBX: 0 |
| ESP: ace EBP: ace ESI: ace EDI: ace |
Submit a file named lab4.hcl on the submission page.