- forbidden topics
    - we don't want you to memorize:
        HCL signal names
        the Y86 encoding figure 
            but you should know what fields instructions have
    - we won't have you write HCL code

- Y86 stages
    - this is a way our textbook organizes the CPU
    - right now: they don't mean anything physically
    - fetch (should be called "fetch and decode"):
        read instruction memory
        compute instruction length (valP -- next instruction addr)
        split instruction
    - decode (should be called "register read")
        read the register file
    - execute
        perform an ALU operation
    - memory
        perform a data memory operation
        ^^^^^^^ --- set up the inputs so the read happens or the write will happen
    - writeback
        write the register file
        ^^^^^ --- set the inputs so the write will happen
    - PC update
        write the PC
        ^^^^^ --- set up the PC register input

- push/pop/call/ret in SEQ stages
    - fetch:
    - decode:
        read RSP and maybe another register
                     ^^^^^ push
    - execute
        compute the new RSP
    - memory --- set up inputs to the data memory
        read the stack at the OLD RSP
                              ^^^^^^^ address NOT from ALU
        write the stack at the NEW RSP
                               ^^^^^^^ address is from ALU
            if call: value written is PC + 9 (the return address)
    - writeback --- set up inputs to the register file
        write RSP and maybe another register
                      ^^^^^ pop
                            popq %rax --> writes %rax and %rsp
    - PC update
        if ret: PC = data memory output
        if call: PC = immediate from instruction
        otherwise: normal
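
A minimal C sketch of the stage breakdown above, one function per instruction, with each line labeled by the stage it corresponds to. The `Machine` struct, the function names, and the 256-byte memory are assumptions made for illustration; they are not the textbook's HCL. It also assumes `%rsp` is register number 4 and that pushq/popq are 2 bytes long.

```c
#include <assert.h>
#include <stdint.h>

enum { RSP = 4, NUM_REGS = 15, MEM_SIZE = 256 };  /* illustrative sizes */

typedef struct {
    uint64_t reg[NUM_REGS];
    uint64_t pc;
    uint8_t  mem[MEM_SIZE];
} Machine;

/* 8-byte little-endian data memory read */
static uint64_t mem_read(const Machine *m, uint64_t addr) {
    assert(addr <= (uint64_t)(MEM_SIZE - 8));
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) v = (v << 8) | m->mem[addr + i];
    return v;
}

/* 8-byte little-endian data memory write */
static void mem_write(Machine *m, uint64_t addr, uint64_t v) {
    assert(addr <= (uint64_t)(MEM_SIZE - 8));
    for (int i = 0; i < 8; i++) m->mem[addr + i] = (uint8_t)(v >> (8 * i));
}

static void do_pushq(Machine *m, int rA) {
    assert(rA >= 0 && rA < NUM_REGS);
    uint64_t valA = m->reg[rA];        /* decode: read rA ...            */
    uint64_t valB = m->reg[RSP];       /* ... and RSP                    */
    uint64_t valE = valB - 8;          /* execute: compute the new RSP   */
    mem_write(m, valE, valA);          /* memory: write at the NEW RSP   */
    m->reg[RSP] = valE;                /* writeback: RSP                 */
    m->pc += 2;                        /* PC update: normal              */
}

static void do_popq(Machine *m, int rA) {
    assert(rA >= 0 && rA < NUM_REGS);
    uint64_t valA = m->reg[RSP];       /* decode: read RSP               */
    uint64_t valE = valA + 8;          /* execute: compute the new RSP   */
    uint64_t valM = mem_read(m, valA); /* memory: read at the OLD RSP    */
    m->reg[RSP] = valE;                /* writeback: RSP ...             */
    m->reg[rA]  = valM;                /* ... and the popped register    */
    m->pc += 2;                        /* PC update: normal              */
}

static void do_call(Machine *m, uint64_t dest) {
    uint64_t valP = m->pc + 9;         /* fetch: address after the call  */
    uint64_t valE = m->reg[RSP] - 8;   /* decode + execute               */
    mem_write(m, valE, valP);          /* memory: write the return addr  */
    m->reg[RSP] = valE;                /* writeback: RSP                 */
    m->pc = dest;                      /* PC update: immediate           */
}

static void do_ret(Machine *m) {
    uint64_t valA = m->reg[RSP];       /* decode: read RSP               */
    uint64_t valE = valA + 8;          /* execute: compute the new RSP   */
    uint64_t valM = mem_read(m, valA); /* memory: read at the OLD RSP    */
    m->reg[RSP] = valE;                /* writeback: RSP                 */
    m->pc = valM;                      /* PC update: data memory output  */
}
```

Unlike real SEQ hardware, this sketch performs the writes immediately instead of at the rising clock edge; the order of the statements only mirrors the order of the stages.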
- when does the clock matter
    - writes to registers (any kind), memory happen at rising edge of clock 
        "end of the cycle"
    - everything else happens as inputs are available
- Y86 instructions and register file, memory inputs
    - reading:
        - figure out all the registers we need to read
            - NOTE: not necessarily all the register numbers in the instruction
                - e.g.: %rsp for call/ret/push/pop
                - e.g.: irmovq doesn't need to read any register
        - make sure all the corresponding register numbers
            are input to the register file
                (HCL: reg_srcA, reg_srcB)
        - corresponding 64-bit outputs are those values
            IN THE SAME CLOCK CYCLE 
            (HCL: reg_outputA, reg_outputB)
        - data memory:
            - figure out the address (usually ALU result) 
            - input the address (mem_addr), read the 64-bit output (mem_output)
                IN THE SAME CLOCK CYCLE
    - writing
        - figure out register #s we need to write
        - make sure the corresponding register file inputs are set to them 
            (reg_dstE, reg_dstM)
        - set 64-bit value input to the value to write 
            (e.g. ALU result for add)
            (reg_inputE, reg_inputM)
        - data memory:
            - set write enable input to 1
            - figure out the address
            - input the address(mem_addr) and the 64-bit value(mem_input)
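
A rough C model of the register file behavior described above: reads are visible in the same cycle, while writes only take effect at the rising clock edge. The `RegFile` struct and function names are hypothetical; the field names are modeled on the HCL signals in these notes (reg_dstE, reg_inputE, ...). Two assumptions: REG_NONE is 0xF, and reading REG_NONE gives 0.

```c
#include <assert.h>
#include <stdint.h>

enum { REG_NONE = 0xF };  /* assumed: register number meaning "no register" */

typedef struct {
    uint64_t value[15];        /* the stored register values              */
    int      dstE, dstM;       /* write-port register numbers (inputs)    */
    uint64_t inputE, inputM;   /* write-port 64-bit values (inputs)       */
} RegFile;

/* Read port: the output follows the register-number input in the SAME cycle. */
static uint64_t regfile_read(const RegFile *rf, int src) {
    assert(src >= 0 && src <= REG_NONE);
    return src == REG_NONE ? 0 : rf->value[src];
}

/* Rising clock edge: whatever is on the write inputs right now gets stored. */
static void regfile_clock_edge(RegFile *rf) {
    assert(rf->dstE >= 0 && rf->dstE <= REG_NONE);
    assert(rf->dstM >= 0 && rf->dstM <= REG_NONE);
    if (rf->dstE != REG_NONE) rf->value[rf->dstE] = rf->inputE;
    if (rf->dstM != REG_NONE) rf->value[rf->dstM] = rf->inputM;
}
```

Setting `dstE`/`inputE` alone changes nothing a reader can observe; `regfile_read` returns the new value only after `regfile_clock_edge` has run, which is the "end of the cycle" behavior.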

- HCL tracing on most recent quiz
        register xY {
            a : 32 = 0;
            b : 32 = 0;
        }
        x_a = Y_a + Y_b;
        x_b = Y_b + 1;
    
    cycle   |  Y_a  Y_b  x_a   x_b
    1       |   0    0    0     1
    --- rising edge of the clock happens --
    2       |   0    1    1     2
    ---
    3       |  *1*   2    
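
The trace above can be reproduced with a short C simulation. This is only a sketch: `BankXY`, `compute_inputs`, and `run_cycles` are made-up names, and the register bank is modeled as a struct that is overwritten with its inputs at each rising edge.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t a, b; } BankXY;

/* Combinational logic: compute the inputs (x_*) from the current outputs (Y_*). */
static BankXY compute_inputs(BankXY Y) {
    BankXY x;
    x.a = Y.a + Y.b;   /* x_a = Y_a + Y_b */
    x.b = Y.b + 1;     /* x_b = Y_b + 1   */
    return x;
}

/* Start from the reset values (0, 0) and apply the given number of rising edges. */
static BankXY run_cycles(int edges) {
    assert(edges >= 0);
    BankXY Y = {0, 0};
    for (int i = 0; i < edges; i++)
        Y = compute_inputs(Y);   /* rising edge: Y takes the values on x */
    return Y;
}
```

`run_cycles(2)` gives the cycle-3 outputs Y_a = 1 and Y_b = 2; feeding those to `compute_inputs` gives x_a = 3 and x_b = 3 for that cycle.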
 

- encoding/decoding Y86
    - we don't memorize the figure
    - the immediates need to appear in the machine code
    - the register numbers need to appear in the machine code
    - we try to have them in the same place for each instruction
    - always use byte 0 to give icode + function code

- Q9 S2017 - Y86 program and machine code -- outputs
- endianness from mrmovq (and generally)

              01
           00 |
           vv vv
    0x000: 30[f2 01 00 00 00 00 00 00]00 | irmovq $1, %rdx
		%rdx = 1
    0x00a: 50 02 00 00 00 00 00 00 00 00 | mrmovq 0(%rdx), %rax
                                                ^           ^
		%rax <- memory @ %rdx + 0 = memory @ addr 1
				8 bytes
    0x014: 00                            | halt
               
        f2 is least significant
        01 is second least significant, etc.
        
        0x00000000000001F2  --> 0x1F2
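
To double-check the byte order, a short C sketch that stores the 21 bytes of the program above in an array and assembles 8 bytes starting at address 1, least significant byte first. The names `program` and `read_le64` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The machine code above, indexed by address. */
static const uint8_t program[] = {
    0x30, 0xf2, 0x01, 0, 0, 0, 0, 0, 0, 0,   /* 0x000: irmovq $1, %rdx      */
    0x50, 0x02, 0, 0, 0, 0, 0, 0, 0, 0,      /* 0x00a: mrmovq 0(%rdx), %rax */
    0x00                                     /* 0x014: halt                 */
};

/* Read 8 bytes at addr as a little-endian 64-bit value. */
static uint64_t read_le64(const uint8_t *mem, size_t size, size_t addr) {
    assert(size >= 8 && addr <= size - 8);
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)       /* start from the most significant byte */
        v = (v << 8) | mem[addr + i];
    return v;
}
```

`read_le64(program, sizeof program, 1)` is what the mrmovq loads into %rax: 0x1F2. Reading at address 2 instead gives 1, the immediate of the irmovq.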
 

- mrmovq and rB
    - ISA choice:
            **either D(rA) and D(rB) to compute memory locations**
        OR some memory instructions write to rA, some write to rB
- ISA tradeoffs
    - is it easier to implement in some way?
    - is it easier for assemblers/compilers?

    - big set of tradeoffs:
        RISC [easier to implement] <------> CISC [closer to software needs]
            [but less knowledge 
             of what code is actually
             doing --- less opportunities
             for special optimizations]

    - other tradeoffs: what kind of HW implementations?
        - lots of registers --- more HW but maybe faster?
        - variable-length instructions --- more complicated HW but maybe 
            less space for machine code?


- what is a microarchitecture
    - a particular implementation
        - example: SEQ is one microarchitecture for Y86
            - chooses one cycle/instruction
            - chooses to use a register file with a REG_NONE option
        - later on, we'll have PIPE --- a different Y86 microarch
            - chooses ~five cycles/instruction (but in parallel)
        - could have made a microarchitecture that takes multiple cycles per instruction
            e.g. read one register/cycle

- casting char to int and comparing
    - x == y --> convert both to the same type (if large enough)
    - rule of thumb: when in doubt, cast both to the same type
    - (int) 0xFFFFFFFF --> negative
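
A small C sketch of why the casts matter. The function names are made up. It uses `signed char` and `unsigned char` explicitly because whether plain `char` is signed depends on the compiler. The result of `(int)0xFFFFFFFF` is implementation-defined in older C standards; the comment below assumes a 32-bit `int` on a typical compiler.

```c
#include <assert.h>

/* Compare after converting both to int: the sign of s is preserved. */
static int equal_as_int(signed char s, unsigned char u) {
    return (int)s == (int)u;
}

/* Compare after casting both to the same type (unsigned char). */
static int equal_low_byte(signed char s, unsigned char u) {
    return (unsigned char)s == u;
}

/* Convert an unsigned value to int; out-of-range results are implementation-defined. */
static int as_int(unsigned int x) {
    return (int)x;
}

static void char_compare_examples(void) {
    signed char   s = -1;     /* converts to int -1              */
    unsigned char u = 0xFF;   /* converts to int 255             */
    assert(!equal_as_int(s, u));   /* -1 != 255                  */
    assert(equal_low_byte(s, u));  /* both 255 as unsigned char  */
}
```

On common compilers with a 32-bit `int`, `as_int(0xFFFFFFFFu)` is negative, matching the last bullet above.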
- typedef struct
    - struct Foo { }; 
        struct Foo x; --- declares 'x'
        Foo x; -- ILLEGAL
    - typedef struct Foo { } Bar;
        struct Foo x; --- declares 'x'
        Bar x; --- declares 'x' (same type)
        struct Bar x; --- ILLEGAL
        Foo x; --- ILLEGAL
    - typedef struct { } Bar;
        Bar x; 
    - typedef struct Foo { Foo * next; } Bar; --- ILLEGAL
    - typedef struct Foo { Bar * next; } Bar; --- ILLEGAL
        (Bar not declared in time)
    - typedef struct Foo { struct Foo * next; } Bar; 
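
The last (legal) form is the one a linked list needs. A short sketch, reusing the Foo/Bar names from above; the `value` field and the helper functions `list_push` and `list_sum` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Foo {
    int value;
    struct Foo *next;   /* must be "struct Foo": Bar is not declared yet here */
} Bar;

/* Put node at the front of the list and return the new head. */
static Bar *list_push(Bar *head, Bar *node) {
    assert(node != NULL);
    node->next = head;
    return node;
}

/* Add up the values; "struct Foo *" and "Bar *" name the same type. */
static int list_sum(const struct Foo *head) {
    int sum = 0;
    for (const Bar *n = head; n != NULL; n = n->next)
        sum += n->value;
    return sum;
}
```

After the typedef is complete, `Bar` and `struct Foo` can be mixed freely, as `list_sum` does.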

- object file versus executable and linking
    - assembly file:
        instruction and register names
        labels
    - object file:
        machine code (not individual instructions)
        labels
            locations (in the file) of labels defined
                "symbol table"
            locations (in the file) of labels used
                "relocations"
    - executable
        machine code
            w/ labels used replaced with actual memory addresses
- bit masks