- logistics of the exam (when, how long, etc.)
    - in class on Thursday
    - 75 minutes
    - similar in length, format to prior exams
- what's covered on the exam
    - everything from lecture, labs, HWs except:
        - command line
        - writing HCL; HCL syntax details
    - not covered on the exam: 
        - floating point
- logistics of lab tomorrow
    - exam review
    - come with questions, TAs probably won't prepare things

- the clock
    - global (processor-wide) signal to coordinate computation
    - periodically rises and falls
    - rate slow enough that computations have time to happen
- falling/rising edges
    - falling edges --- don't matter to us
    - rising edges --- trigger writes (register file, PC, cond. codes, memory)
        - "boundary" between clock cycles
        - between rising edges: reads and computations occur
- zeroth clock cycle
    - we will try very hard to avoid this ambiguity
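
The rising-edge rule above can be sketched as a tiny model in C. This is only an illustration; the `clocked_register` type and `set_clock` function are made-up names, not part of any course tool.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one clocked register: `input` may change freely
   between rising edges, but `output` only changes on a rising edge. */
typedef struct {
    uint64_t input;   /* value waiting to be written */
    uint64_t output;  /* value visible to the rest of the circuit */
    int clock;        /* current clock level: 0 or 1 */
} clocked_register;

/* Drive the clock to a new level; the write happens only on 0 -> 1. */
void set_clock(clocked_register *r, int level) {
    assert(level == 0 || level == 1);
    if (r->clock == 0 && level == 1) {
        r->output = r->input;   /* rising edge: commit the write */
    }
    r->clock = level;           /* falling edge: nothing is written */
}
```

Changing `input` and then calling `set_clock(&r, 0)` leaves `output` alone; only the next `set_clock(&r, 1)` makes the new value visible.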


- Q 4 from F2016 --- value of a register after memory instruction
    - label represents an address
    - label is replaced with that address
    - should've written irmovq $label, %rax (instead of irmovq label, %rax)
    - this question: label was in front of instruction at 0xA --- label = 0xA
- Q 5 from F2016 --- value of a register after memory instruction
    - mrmovq accesses 8 bytes --- little endian order
- Spring 2014 --- value in memory and endianness
    - storing 4-byte value in little endian order --- writes 4 consecutive bytes in memory
        - lowest address has least significant part of number (one's place)
    - in big endian: lowest address has most significant part (one's place appears at highest address)
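
A minimal sketch of the two byte orders, assuming memory is modeled as a plain byte array. The function names are invented for this example, and the shifts make the result independent of the host machine's own byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Little endian: least significant byte goes at the lowest address. */
void store_little_endian32(uint8_t *memory, size_t address, uint32_t value) {
    assert(memory != NULL);
    for (size_t i = 0; i < 4; i++) {
        memory[address + i] = (uint8_t)(value >> (8 * i));
    }
}

/* Big endian: most significant byte goes at the lowest address. */
void store_big_endian32(uint8_t *memory, size_t address, uint32_t value) {
    assert(memory != NULL);
    for (size_t i = 0; i < 4; i++) {
        memory[address + i] = (uint8_t)(value >> (8 * (3 - i)));
    }
}
```

Storing 0x12345678 at address 0 in little endian order puts 0x78 at address 0 and 0x12 at address 3; in big endian order the same four bytes appear reversed.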

- irmovq $label, %rax versus mrmovq label, %rax
    irmovq --- label is a constant value
        %rax has that constant value
    mrmovq --- label is the place where the instruction accesses memory
        %rax reads from memory at that address, and contains the result
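
The difference can be sketched with a toy byte-array memory. `MEMORY_SIZE`, `read8`, and the C functions named after the instructions are assumptions made for this example, not a real simulator API.

```c
#include <assert.h>
#include <stdint.h>

#define MEMORY_SIZE 64   /* hypothetical toy memory size */

/* Read 8 bytes starting at `address` in little endian order. */
uint64_t read8(const uint8_t *memory, uint64_t address) {
    assert(address <= MEMORY_SIZE - 8);
    uint64_t value = 0;
    for (int i = 7; i >= 0; i--) {
        value = (value << 8) | memory[address + (uint64_t)i];
    }
    return value;
}

/* irmovq $label, %rax: the label's address itself becomes the value. */
uint64_t irmovq(uint64_t label) {
    return label;
}

/* mrmovq label, %rax: the label's address says where to read memory. */
uint64_t mrmovq(const uint8_t *memory, uint64_t label) {
    return read8(memory, label);
}
```

With a label at 0xA, `irmovq(0xA)` yields 0xA no matter what memory holds, while `mrmovq(memory, 0xA)` yields whatever 8 bytes are stored starting at 0xA.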

- AT&T syntax
    - operands specification
        %rax --- value in the register %rax
        (%rax) --- value in memory at the address specified by %rax
        10(%rax) --- value in memory at 10 plus address specified by %rax
        10(%rax,%rbx) --- value in memory at the address specified by
            10 plus  value in %rax plus value in %rbx
        10(%rax,%rbx,4) --- value in memory at the address specified by
            10 plus value in %rax plus (4 times value in %rbx)
            "array access (%rax is address of array, %rbx is index, 4 is element size)"

            Why is displacement (10) useful?
                struct data { <-- %rax points here
                    int info;
                    int info2;
                    int array_of_stuff[1000]; <-- 8 + %rax
                }

                struct data array_of_stuff[100];
                    array_of_stuff[i].info2 <-- %rax + %rbx * sizeof(struct data) + 4

        $10 --- the value 10
        $0x1234 --- the value 0x1234
        0x1234 --- the value in memory at address 0x1234
    - instruction suffixes
        movq $1, (%rax) --- writes 8 bytes to memory
        movl $1, (%rax) --- writes 4 bytes
        movw $1, (%rax) --- writes 2 bytes
        movb $1, (%rax) --- writes 1 byte

    - movl $1, %rax --- ERROR
    - movl $1, %eax clears the rest of %rax
        - BUT movw $1, %ax DOES NOT clear the rest of %rax
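
A rough sketch of what happens to the full 8-byte register for each write size, modeling %rax as a `uint64_t`. The helper names are made up for illustration.

```c
#include <stdint.h>

/* 4-byte write (movl $imm, %eax): the upper 32 bits of %rax become zero. */
uint64_t write_low32(uint64_t old_rax, uint32_t value) {
    (void)old_rax;              /* old contents are discarded entirely */
    return (uint64_t)value;
}

/* 2-byte write (movw $imm, %ax): the upper 48 bits of %rax are kept. */
uint64_t write_low16(uint64_t old_rax, uint16_t value) {
    return (old_rax & ~(uint64_t)0xFFFF) | value;
}

/* 1-byte write (movb $imm, %al): the upper 56 bits of %rax are kept. */
uint64_t write_low8(uint64_t old_rax, uint8_t value) {
    return (old_rax & ~(uint64_t)0xFF) | value;
}
```

Starting from a register of all one bits, the 4-byte write of 1 leaves exactly 1, while the 2-byte write leaves 0xFFFFFFFFFFFF0001.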

    - tricky instructions:
        - LEA (load effective address)
            - in both AT&T syntax and Intel syntax, written like a memory access
            - but DOESN'T ACCESS MEMORY
            - uses the address at which we would access memory

            - like mov from memory, but sends the data memory address to the destination
                (instead of sending the data memory output to the destination)
            - leaq (%rax, %rbx), %rcx
                %rcx <- %rax + %rbx
            - movq (%rax, %rbx), %rcx
                %rcx <- memory[%rax + %rbx]
        - computed JMPs
            - jmp *0x1234(%rax)
                access memory at 0x1234 + value of %rax,
                then use that value as the address to jump to

                two layers of pointers (%rax, then what %rax points to)

                e.g. %rax + 0x1234 is the address of an entry in a table of addresses to jump to
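
One way to see the LEA versus mov distinction is to compute the operand's address once and then either return it or use it. This is a sketch over a toy byte-array memory; `effective_address`, `leaq`, `movq_load`, and `MEMORY_SIZE` are names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define MEMORY_SIZE 256  /* hypothetical toy memory size */

/* Address named by the operand D(base,index,scale): D + base + index*scale. */
uint64_t effective_address(int64_t displacement, uint64_t base,
                           uint64_t index, uint64_t scale) {
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return (uint64_t)displacement + base + index * scale;
}

/* leaq D(base,index,scale), dest: dest gets the address itself. */
uint64_t leaq(int64_t displacement, uint64_t base,
              uint64_t index, uint64_t scale) {
    return effective_address(displacement, base, index, scale);
}

/* movq D(base,index,scale), dest: dest gets the 8 bytes stored at that
   address (read here in little endian order). */
uint64_t movq_load(const uint8_t *memory, int64_t displacement,
                   uint64_t base, uint64_t index, uint64_t scale) {
    uint64_t address = effective_address(displacement, base, index, scale);
    assert(address <= MEMORY_SIZE - 8);
    uint64_t value = 0;
    for (int i = 7; i >= 0; i--) {
        value = (value << 8) | memory[address + (uint64_t)i];
    }
    return value;
}
```

With %rax = 0x20 and %rbx = 0x10, `leaq(0, 0x20, 0x10, 1)` gives 0x30, while `movq_load` gives whatever is stored at 0x30. A computed jump such as `jmp *0x1234(%rax)` does the same load as `movq_load` and then uses the loaded value as the new PC.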

- abnormal behavior in stages
    - fetch:
        - read from the instruction memory --- always exactly the same
        - find the length of the instruction
        - compute address of the next instruction in memory
    - decode:
        - read register values:
            - sometimes no registers (nop)
            - sometimes up to two from instruction (addq)
            - sometimes one from instruction and %rsp (pushq)
            - sometimes %rsp (call, ret, popq, ...)
            - ...
    - execute:
        - do something with the ALU
            - the computation the instruction is supposed to do (addq, andq, ...)
            - address computations (mrmovq, rmmovq, pushq, popq, ret, call)
            - sometimes nothing (nop)
        - handle condition codes (set or read)
    - memory:
        - do something with the data memory (read or write)
            - sometimes nothing (addq, nop)
            - sometimes write:
                - from register (rmmovq, pushq)
                - from PC + 9 (call; the return address, since call is 9 bytes long)
            - sometimes read:
                - at the address just computed by the ALU (mrmovq)
                - at the address from a register, %rsp (popq, ret)
    - write back
        - write some registers
            - sometimes nothing (nop, rmmovq)
            - often one register from instruction (addq, mrmovq, ...)
                - either rA or rB
            - sometimes %rsp (call, ...)
            - sometimes %rsp and another register (popq)
    - PC update
        - usually set PC input to value computed in fetch
            (address of next instruction in memory)
        - sometimes PC = value from instruction (call, jXX)
        - sometimes PC = value from memory (ret)
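
As one concrete walk through the stages, here is a sketch of what `call` does in each one, assuming the standard Y86-64 encoding where call is 9 bytes (1 opcode byte plus an 8-byte destination). The `machine_state` struct and `run_call` function are hypothetical, and `destination` stands in for the constant that fetch would extract from the instruction bytes.

```c
#include <assert.h>
#include <stdint.h>

#define MEMORY_SIZE 256   /* hypothetical toy data memory size */
#define CALL_LENGTH 9     /* 1 opcode byte + 8 destination bytes */

typedef struct {
    uint64_t pc;
    uint64_t rsp;
    uint8_t memory[MEMORY_SIZE];
} machine_state;

void run_call(machine_state *s, uint64_t destination) {
    /* fetch: compute the address of the next instruction in memory */
    uint64_t next_pc = s->pc + CALL_LENGTH;
    /* decode: read %rsp */
    uint64_t old_rsp = s->rsp;
    assert(old_rsp >= 8 && old_rsp <= MEMORY_SIZE);
    /* execute: the ALU computes the new stack pointer */
    uint64_t new_rsp = old_rsp - 8;
    /* memory: write the return address (little endian) at the new %rsp */
    for (int i = 0; i < 8; i++) {
        s->memory[new_rsp + (uint64_t)i] = (uint8_t)(next_pc >> (8 * i));
    }
    /* write back: update %rsp */
    s->rsp = new_rsp;
    /* PC update: PC = value from the instruction */
    s->pc = destination;
}
```

Starting with PC = 0x20 and %rsp = 0x80, a call to 0x50 leaves PC = 0x50, %rsp = 0x78, and the return address 0x29 stored at 0x78.
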
- ternary operator
    x ? y : z
        if (x) { y } else { z }
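
In C the two forms can be written side by side. The function names are only for illustration.

```c
/* Ternary form: the whole expression evaluates to y or to z. */
long select_ternary(int x, long y, long z) {
    return x ? y : z;
}

/* Equivalent if/else form. */
long select_if_else(int x, long y, long z) {
    if (x) {
        return y;
    } else {
        return z;
    }
}
```

Any nonzero `x` counts as true, so `select_ternary(-3, 10, 20)` picks 10 just as `select_ternary(1, 10, 20)` does.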

- bitwise TF
    - work bit-by-bit
    - -x --> ~x + 1
    - comparing to zero? what happens to the sign bit
    - look for counterexamples
        - shifting important bits off a number
        - overflow, sign bits, etc.
- x & 1 versus !!x
    x = 2 --> (x & 1) == 0
          --> !!x == 1 --- because !x == 0 (x is true)
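
The identities and counterexamples above are easy to check in C. This sketch uses unsigned 32-bit values so that wraparound is well defined; the helper names are made up for the example.

```c
#include <stdint.h>

/* Two's complement negation written with bitwise operations. */
uint32_t negate(uint32_t x) {
    return ~x + 1u;
}

/* 1 if the lowest bit of x is set, otherwise 0. */
int low_bit(uint32_t x) {
    return (int)(x & 1u);
}

/* 1 if x is nonzero, otherwise 0. */
int is_nonzero(uint32_t x) {
    return !!x;
}
```

`x = 2` separates `low_bit` from `is_nonzero`, and `negate(0x80000000)` returns 0x80000000 again, which is the kind of overflow counterexample worth trying on true/false questions.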

- SF/ZF and xor
    - SF and ZF are based on the RESULT of the computation
        if xor yields a negative number, SF is true
            i.e., the sign bit of the result is set
                which happens when the sign bit is set in exactly one of the two values
        if xor yields zero, ZF is true
            which happens when the two values are equal
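
A small sketch of how SF and ZF follow from the xor result alone, using 64-bit values. The `condition_codes` struct and `flags_after_xor` function are illustrative names, not part of any real tool.

```c
#include <stdint.h>

typedef struct {
    int sf;  /* sign flag */
    int zf;  /* zero flag */
} condition_codes;

/* Flags after an xor of a and b: derived only from the 64-bit result. */
condition_codes flags_after_xor(uint64_t a, uint64_t b) {
    uint64_t result = a ^ b;
    condition_codes cc;
    cc.sf = (int)(result >> 63);   /* sign bit of the result */
    cc.zf = (result == 0);         /* zero exactly when a == b */
    return cc;
}
```

Xoring a value with itself sets ZF and clears SF; xoring -1 with 1 sets SF because only one operand has its sign bit set; xoring -1 with -2 sets neither flag.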


- MADD question from S2017
    - multiply and add faster than two separate instructions?
    - adding this will make our clock cycle longer
        - by extra time the ALU needs to do this
    - but, we still have work other than the ALU in each clock cycle
        - before: multiply + add
            load multiply instruction, do multiply
            *load add instruction*, do add
        - after: MADD (one instruction)
            load MADD instruction, do multiply + add