- format of the exam
    - example exams from past semesters
    - multiple choice
    - 20-25 questions
    - some "select all that apply"
    - if you think you need to clarify, write an "asterisk" in the box,
        and write your clarification

- object files and linking
    - assembly:
        - names for instructions
        - constant data (e.g. '.ascii "Hello World!"')
        - labels ---
            - definitions of labels "main: pushq %rbx"
            - usage of labels "    call printf"
    
    - object file
        - machine code (with no particular indication of where instructions start)
                could end up ANYWHERE in memory
        - constant data [typically the same bytes that will be loaded into memory]
                could end up ANYWHERE in memory
        - the locations within the FILE where labels are DEFINED
            "symbol table"
        - the locations within the FILE where labels are USED
            "relocations" 

    - executable (statically linked)
        - machine code
                with known locations where it will be loaded
            with labels that are USED replaced with their actual memory addresses
                (or something based on the actual addresses)
        - constant data
        
        - ready to load into memory and jump to

    - executable (dynamically linked)
        - machine code
                (usually with known locations where it will be loaded)
        - labels
            IN LIBRARY FILES and locations where they are used

        - to run, load the executable and the library file, and replace labels that point into the library file

        - library files might be position-independent
            the library file can be loaded anywhere in memory

- LEA
    - sort of assembly version of "&" (address-of operator)
        movq (%rax), %rbx ---> rbx = *rax
        leaq (%rax), %rbx ---> rbx = &( *rax )

        movq 8(%rax, %rbx, 4), %rbx ---> rbx = *(rax + rbx * 4 + 8)
        leaq 8(%rax, %rbx, 4), %rbx ---> rbx = &( *(rax + rbx * 4 + 8) )
                                         rbx = rax + rbx * 4 + 8

            imulq $4, %rbx  // or: shlq $2, %rbx
            addq $8, %rbx
            addq %rax, %rbx

    - just does arithmetic (based on what we do for addresses)
      but DOES NOT change condition codes
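
A rough C analogue of the leaq example above. The function names are made up for illustration, and the second function assumes 4-byte array elements. Neither one reads memory; both only compute an address.

```c
#include <stdint.h>

/* Hypothetical helper mirroring: leaq 8(%rax, %rbx, 4), %rbx
   Plain integer arithmetic: rax + rbx * 4 + 8. No memory access. */
uint64_t lea_example(uint64_t rax, uint64_t rbx) {
    return rax + rbx * 4 + 8;
}

/* The same computation written with "&": the address of element
   (index + 2) of an array of 4-byte ints, since 8 bytes == 2 elements.
   The caller must make sure index + 2 is within the array. */
uintptr_t address_of_element(int32_t *base, uint64_t index) {
    return (uintptr_t)&base[index + 2];
}
```

For example, lea_example(100, 3) gives 100 + 3 * 4 + 8 = 120.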

            
- register file inputs and outputs (and their values during instructions)
    - our register file has:
        [two "read ports"]
        - two "register number to read" inputs (4-bits, HCL: reg_srcA, reg_srcB)
                [~"address"]
            - addq %rcx, %rdx
                read %rcx and %rdx -->
                reg_srcA and reg_srcB = number of %rcx and number of %rdx
        - two "register value" outputs (to the rest of the processor)
            (64-bits, HCL: reg_outputA, reg_outputB)
            - addq %rcx, %rdx
                reg_outputA and reg_outputB are the VALUES of %rcx and %rdx
                    reg_outputA + reg_outputB == result of the add

        [two "write ports"]
        - two "register number to write" inputs (4-bits, HCL: reg_dstE, reg_dstM)
                [~"address"]
            - addq %rcx, %rdx
                write %rdx
                reg_dstE and reg_dstM should have 
                    - one signal = number of %rdx
                    - one signal = the special number 15 ("no register"; HCL: REG_NONE)
        - two "register value to write" inputs (64-bits, HCL: reg_inputE, reg_inputM)
            - addq %rcx, %rdx
                one signal = reg_outputA + reg_outputB = result of the add
                other signal = won't matter (because "no register" input)
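
A minimal software sketch of the register file ports described above, using the signal names from these notes. The struct, the function names, and the choice to return 0 when reading REG_NONE are assumptions of this sketch, not part of the HCL. Register numbers follow the Y86-64 encoding (%rcx is 1, %rdx is 2).

```c
#include <stdint.h>

#define REG_NONE 15  /* "no register" */

typedef struct {
    uint64_t values[15];  /* registers 0..14 */
} RegisterFile;

/* Read port: the output just follows the register-number input.
   Returning 0 for REG_NONE is an assumption of this sketch. */
uint64_t regfile_read(const RegisterFile *rf, unsigned reg_src) {
    return reg_src == REG_NONE ? 0 : rf->values[reg_src];
}

/* Write ports: stored values change only here, once per rising clock edge. */
void regfile_rising_edge(RegisterFile *rf,
                         unsigned reg_dstE, uint64_t reg_inputE,
                         unsigned reg_dstM, uint64_t reg_inputM) {
    if (reg_dstE != REG_NONE) rf->values[reg_dstE] = reg_inputE;
    if (reg_dstM != REG_NONE) rf->values[reg_dstM] = reg_inputM;
}

/* One cycle of addq %rcx, %rdx; returns the new value of %rdx. */
uint64_t addq_rcx_rdx_demo(uint64_t rcx, uint64_t rdx) {
    RegisterFile rf = {{0}};
    rf.values[1] = rcx;
    rf.values[2] = rdx;
    uint64_t reg_outputA = regfile_read(&rf, 1);  /* reg_srcA = %rcx */
    uint64_t reg_outputB = regfile_read(&rf, 2);  /* reg_srcB = %rdx */
    /* reg_dstE = %rdx; reg_dstM = REG_NONE, so reg_inputM does not matter */
    regfile_rising_edge(&rf, 2, reg_outputA + reg_outputB, REG_NONE, 0);
    return rf.values[2];
}
```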

    - Q: what registers do instructions read/write?
        TYPICALLY registers mentioned in the instruction, but maybe not all of them
            addq rA, rB --> reads rA, rB, writes rB
            irmovq $V, rB --> reads nothing, writes rB

        push/pop/ret/call -- read+write %rsp even though not part of instruction

- clock cycle, when things change
    - registers (register file and PC, etc.) values  and memory values change
        on the rising edge of the clock
                - rising edge of the clock changes what value the PC register outputs
                    because that's when its value changes
                - register file's stored value changes when the rising edge of
                    the clock happens BASED ON the write reg number + value inputs
            - memories/registers only care about writing inputs on the rising edge
    - everything else happens as inputs become available
                - changing the register number to read input -->
                    in the same cycle --> register file outputs the stored value

    - we say "memory stage is where we write memory"
        what we really mean is 
            "memory stage is where we worry about configuring the signals for
            the data memory"

- F2017 Q 2 (picture)


- bit fiddling strategies
    - bit masks --
        when you want to do something to particular bits
        make a number where those bits are 1
            "three least significant bits" 00000111 binary = 0x7

            keep only those bits:
                & mask --> 0x123 & 0x7 --> 0x3
            set those bits:
                | mask --> 0x123 | 0x7 --> 0x127
            flip those bits:
                ^ mask --> 0x123 ^ 0x7 --> 0x124
            clear those bits
                & (~mask) --> 0x123 & (~0x7) --> 0x120
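
The four mask operations above, written as small C functions so the examples can be checked. The function names are made up for illustration; 32-bit unsigned values are assumed.

```c
#include <stdint.h>

/* mask has 1s in the bits we care about,
   e.g. 0x7 for the three least significant bits */
uint32_t keep_bits(uint32_t x, uint32_t mask)  { return x & mask; }
uint32_t set_bits(uint32_t x, uint32_t mask)   { return x | mask; }
uint32_t flip_bits(uint32_t x, uint32_t mask)  { return x ^ mask; }
uint32_t clear_bits(uint32_t x, uint32_t mask) { return x & ~mask; }
```

With x = 0x123 and mask = 0x7 these return 0x3, 0x127, 0x124, and 0x120.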

    - "mask and shift" -- select bits, then shift them to where you want them
        commonly at the least sig. part
        
        (equivalently, can "shift and mask")
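
A small sketch of both orders, assuming we want bits 4-7 of a 32-bit value moved down to the least significant position. The function names and the choice of bits are only for illustration.

```c
#include <stdint.h>

/* Select bits 4-7 first, then move them to the bottom. */
uint32_t mask_then_shift(uint32_t x) { return (x & 0xF0) >> 4; }

/* Move bits 4-7 to the bottom first, then throw away everything above. */
uint32_t shift_then_mask(uint32_t x) { return (x >> 4) & 0xF; }
```

Both give 0x2 for x = 0x123. Note that the mask differs between the two orders: 0xF0 before the shift, 0xF after it.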
    
    - find subproblems
        HW: "is number of 0s odd in a 32-bit number"
            solve in terms of simpler problem:
                "is number of 0s odd in a 16-bit number"
                "is number of 0s odd in a 2-bit number"

    - parallelism
        xor bits 0 and 16 and also bits 1 and 17
            (x >> 16) ^ x       <-- both done by this
            ((x >> 17) ^ (x >> 1)) & 1 == ((((x >> 16) ^ x) >> 1) & 1)
        if you have similar subproblems, try to arrange them so you can do this
            put half the problem in the "upper" part of the number
            half in the bottom
            bitwise operators do what you want on both parts, hopefully
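
One way the subproblem and parallelism ideas combine, sketched for the "odd number of bits" question. This is an illustration under stated assumptions, not necessarily the intended homework solution. It counts 1 bits; in a 32-bit value the number of 0s is 32 minus the number of 1s, so the 0s count is odd exactly when the 1s count is odd. The function name is made up.

```c
#include <stdint.h>

/* Returns 1 if the 32-bit value x has an odd number of 1 bits, else 0. */
uint32_t odd_bit_count(uint32_t x) {
    x ^= x >> 16;  /* bit i (i < 16) is now bit i ^ bit i+16: a 16-bit problem */
    x ^= x >> 8;   /* low 8 bits now hold an 8-bit problem */
    x ^= x >> 4;   /* 4-bit problem */
    x ^= x >> 2;   /* 2-bit problem */
    x ^= x >> 1;   /* bit 0 is now the xor of all 32 original bits */
    return x & 1;
}
```

Each line xors the upper half of the remaining problem into the lower half, so one operation handles all the bit pairs at once.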

             
- ?: operator with bitwise ops
    - a ? b : c
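    - turn a (true or false) into two masks:
        mask_1_from_a: 0xFFFF...F == -1 if a is true, 0 if a is false
        mask_2_from_a: 0 if a is true, 0xFFFF...F == -1 if a is false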
        "mask" 0 --- "keep no bits"
               0xFFF..F --- "keep all bits"
    - (mask_1_from_a & b) | (mask_2_from_a & c)

    - a is 0 --> 0, otherwise -> 1
        2-bit version of a: or the bits together
        4-bit version of a: or the bits together
                            "  "   "     "
        a | ( a >> 1) | ( a >> 2) | ...

        result_16_apart = a | ( a >> 16 ) --
            or bits 0 and 16, bits 1 and 17, bits 2 and 18

        result_8_apart = (result_16_apart) | (result_16_apart >> 8)
            or (bits 0 and 16 or'd together) and (bits 8 and 24 or'd together),
                "    1  "  17  "        "       "  "   9   " 25 "    "

    - one view: "reduction tree"
        0    1     2     3 <- bits of a
        \    /     \     /
          0|1        2|3   <- bits of result of first |
            \        /
            (0|1)|(2|3)
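
The reduction tree above, sketched for a 32-bit value: each step ors the upper half of the remaining bits into the lower half. The function name is made up for illustration.

```c
#include <stdint.h>

/* Returns 1 if any bit of a is set, 0 if a == 0. */
uint32_t is_nonzero(uint32_t a) {
    a |= a >> 16;  /* or bits 0 and 16, bits 1 and 17, ... */
    a |= a >> 8;   /* or the results that are 8 apart */
    a |= a >> 4;
    a |= a >> 2;
    a |= a >> 1;   /* bit 0 is now the or of all 32 original bits */
    return a & 1;
}
```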

    - tricky strategies:
        if a is 0, -a == a --> a - (-a) = 0
                a and -a have the same sign bit
        if a != 0 and a != INT_MIN, -a != a --> a - (-a) = 2*a != 0
                a and -a do NOT have the same sign bit

        ((-a >> 31) & 1) ^ ((a >> 31) & 1) is 0 only if a is 0 or INT_MIN,
                                           is 1 otherwise
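
A sketch of the sign-bit comparison above. It uses unsigned 32-bit arithmetic so that negating the INT_MIN bit pattern (0x80000000) is well defined in C; that choice and the function name are assumptions of this sketch.

```c
#include <stdint.h>

/* Returns 0 if a is 0 or 0x80000000 (INT_MIN's bit pattern), 1 otherwise. */
uint32_t sign_differs_from_negation(uint32_t a) {
    uint32_t neg = 0u - a;                /* two's complement negation */
    return ((neg >> 31) ^ (a >> 31)) & 1; /* compare the two sign bits */
}
```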

    - trick:
        once you get x = 0/1 --> -x is the mask you want (0 or 0xFFF..F)
            -x == ~x + 1
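
Putting the pieces of this section together, here is one possible way to write a ? b : c with only bitwise operators and +, assuming 32-bit unsigned values. The function and variable names are made up for illustration.

```c
#include <stdint.h>

/* Computes a ? b : c without using ?: or any comparison. */
uint32_t select_bits(uint32_t a, uint32_t b, uint32_t c) {
    /* Step 1: reduce a to 0 or 1 by or-ing all of its bits together. */
    uint32_t x = a;
    x |= x >> 16;
    x |= x >> 8;
    x |= x >> 4;
    x |= x >> 2;
    x |= x >> 1;
    x &= 1;

    /* Step 2: turn 0/1 into masks. -x == ~x + 1, so 0 -> 0, 1 -> 0xFFFFFFFF. */
    uint32_t mask_if_true = ~x + 1;
    uint32_t mask_if_false = ~mask_if_true;

    /* Step 3: keep all bits of one operand and no bits of the other. */
    return (mask_if_true & b) | (mask_if_false & c);
}
```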

            
- F2017 Q13-14
        0001
      & 0010
      ------
        0000
                addr 1
        addr 0 //
              //
           || ||
           vv vv
    0x000: 30 f1 01 00 00 00 00 00 00[00 | irmovq $1, %rcx      --> %rcx = 1
    0x00a: 30 f2 02 00 00 00 00]00 00 00 | irmovq $2, %rdx      --> %rdx = 2
    0x014: 62 12                         | andq %rcx, %rdx      --> %rdx = 0
    0x016: 50 01 08 00 00 00 00 00 00 00 | mrmovq 8(%rcx), %rax --> %rax <- memory[8 + 1]
                                                                                   ^^^^^
                                                                adding %rcx (64-bit value, = 1)
                                                                and 8 (64-bit immed. in instr)

    byte at addr 9 --> least sig
    byte at addr 10 --> second least sig

        0x0000000002F23000 --> 0x2F23000
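
The 8-byte little-endian read at address 9 can be checked with a short sketch. The array below copies the first 20 bytes of the listing above (the two irmovq instructions); the names read_le64 and program_bytes are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes at addresses 0x000-0x013, copied from the listing. */
static const uint8_t program_bytes[20] = {
    0x30, 0xf1, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x30, 0xf2, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};

/* Read 8 bytes starting at addr as a little-endian 64-bit value:
   the byte at addr is the least significant, addr + 7 the most. */
uint64_t read_le64(const uint8_t *memory, size_t addr) {
    uint64_t value = 0;
    for (int i = 7; i >= 0; i--) {
        value = (value << 8) | memory[addr + i];
    }
    return value;
}
```

read_le64(program_bytes, 9) covers addresses 9 through 16 (bytes 00 30 f2 02 00 00 00 00) and yields 0x2F23000.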