- challenge logistics: is it okay that I didn't need to enable ASLR?
    - ASLR should be enabled by default
    - you should ONLY disable it for challenge 4
    - it may be possible to solve challenge 4 without disabling it
- challenge logistics: it prints passed but not the name; is that okay?
    - no

- subterfuge
    char buffer[100];
    int *ptr;

    memory layout:
      | [
      |      buffer
      |
      v ]
        [   ptr   ]

    means that we can change `ptr` if we have a way of overflowing buffer

    why is that useful?
        it depends what is done with ptr.

        example:
            *ptr = value_chosen_by_attacker;

        then we can use this to write to an arbitrary memory location, which is useful b/c we can:
            - replace a code pointer
                (return address, function pointer, global offset table entry)
              with a value we choose
            - replace a pointer to a code pointer (e.g. VTable pointer) with a value we choose
            - replace another data pointer with a value we choose
                if overwriting just one value wasn't enough
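
    a minimal sketch of the first example, as a simulation rather than a real exploit.
    assumptions: buffer and ptr are placed in one struct (hypothetical `struct frame`) so the
    "overflow" stays well-defined C, and `code_pointer` is a made-up stand-in for a
    return address / GOT entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same layout as the notes: buffer first, ptr right after it. */
struct frame {
    char buffer[100];
    int *ptr;
};

static int harmless;      /* where ptr is supposed to point */
static int code_pointer;  /* hypothetical stand-in for a code pointer */

/* Simulated vulnerable function: copies `len` bytes without checking
   against sizeof(buffer), then performs `*ptr = value`. */
void vulnerable(const unsigned char *input, size_t len, int value) {
    struct frame f;
    f.ptr = &harmless;
    assert(len <= sizeof f);   /* keep the simulation inside the struct */
    memcpy(&f, input, len);    /* missing bounds check: may reach f.ptr */
    *f.ptr = value;            /* the write the attacker redirects */
}

/* Attacker input: filler up to ptr, then the address to overwrite. */
void attack(int value) {
    unsigned char input[sizeof(struct frame)] = {0};
    int *target = &code_pointer;
    memcpy(input + offsetof(struct frame, ptr), &target, sizeof target);
    vulnerable(input, sizeof input, value);
}
```

    calling attack(0x41414141) leaves `harmless` alone and stores 0x41414141 in `code_pointer`;
    a short input that does not reach ptr makes the same function write to `harmless` instead.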

        example:
            char **ptr;
            ...
            *ptr = &buffer_controlled_by_attacker[...];
                
                same as
                *ptr = 0x400300; (if 0x400300 is the address of 'buffer')
                --> fixed value --> less flexible than prior version

        about as useful, but we need some indirection:
            overwrite a pointer to a code pointer (so the code pointer is then read from our buffer)
            ...
        but less control than the previous option

        example:
            struct foo {
                char *internal_ptr;
            };
            struct foo *ptr;
            ptr->internal_ptr = &buffer_controlled_by_attacker[...];

- ROP chains
    "gadgets" ---> snippets of code ending in a return that form an alternate instruction set

    normal instruction set:
        mov $0x10000, (%rax)
        mov $10, %rdi
        call foo

    for ROP, we're going to find "instructions" that can do similar things by looking at gadgets.

    example gadget A:
        pop %rdi
        ret

            can think of this as a "weird instruction" that does:
                set %rdi to argument 1 (next thing on stack)        <-- pop %rdi
                continue to next instruction (next thing on stack)  <-- ret
            how to encode instruction?
                write its address on the stack

    example gadget B:
        mov %rdi, (%rax)
        ret

            can think of this as a weird instruction that does
                set M[%rax] to %rdi
                continue to next instruction (next thing on stack)

    weird instruction set, written as a stack:
        A <-- start (first thing jumped to)  }
        0x10000                              } weird instruction for "mov $0x10000, %rdi"
        B                                    < weird instruction for "mov %rdi, (%rax)"
        A                                    }
        10                                   } weird instruction for mov $10, %rdi
        foo                                  < weird instruction for jmp/call foo
        return address for foo
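
    the "weird instruction set" can be sketched as a tiny interpreter. this is a simulation, not
    real machine code: the gadget "addresses" (0x1000, 0x2000, 0x3000), the `struct machine`
    fields, and using %rax as an index into a 4-word memory are all assumptions made for
    illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "addresses" of the gadgets and of foo. */
enum { DONE = 0, GADGET_A = 0x1000, GADGET_B = 0x2000, FUNC_FOO = 0x3000 };

struct machine {
    uint64_t rdi, rax;     /* simulated registers */
    uint64_t memory[4];    /* simulated memory; rax is an index into it */
    const uint64_t *rsp;   /* simulated stack pointer */
    uint64_t foo_arg;      /* argument foo was "called" with */
    int foo_calls;
};

/* Pop the next word off the simulated stack. */
static uint64_t pop(struct machine *m) { return *m->rsp++; }

/* Runs gadgets until one returns to DONE. Every gadget ends in `ret`,
   which is "pop the next address and jump to it".
   Returns 0 on success, -1 if the chain jumps to an unknown address. */
int run_chain(struct machine *m, const uint64_t *stack) {
    m->rsp = stack;
    uint64_t pc = pop(m);                  /* the initial ret */
    while (pc != DONE) {
        switch (pc) {
        case GADGET_A:                     /* pop %rdi */
            m->rdi = pop(m);
            break;
        case GADGET_B:                     /* mov %rdi, (%rax) */
            assert(m->rax < 4);
            m->memory[m->rax] = m->rdi;
            break;
        case FUNC_FOO:                     /* stand-in for the body of foo */
            m->foo_arg = m->rdi;
            m->foo_calls++;
            break;
        default:
            return -1;
        }
        pc = pop(m);                       /* the ret ending the gadget */
    }
    return 0;
}

/* The chain from the notes, in the order it sits on the stack. */
const uint64_t chain[] = {
    GADGET_A, 0x10000,   /* mov $0x10000, %rdi */
    GADGET_B,            /* mov %rdi, (%rax)   */
    GADGET_A, 10,        /* mov $10, %rdi      */
    FUNC_FOO,            /* jmp/call foo       */
    DONE                 /* return address for foo */
};
```

    with rax set to 2 beforehand, run_chain(&m, chain) stores 0x10000 in memory[2] and then
    "calls" foo with rdi equal to 10.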

    starting an ROP:
        we want a place with our weird instructions that can become the stack pointer
            > simplest way: they're on the stack and we'll return using them
                example: ROP assignment

            > next-simplest way: we find some way (e.g. a gadget) that will change the stack pointer to it

        once stack pointer set, want to return to execute next gadget ("weird instruction")

- UAF (use-after-free) --- generally
    in buffer overflows, we wrote to something we weren't supposed to because a bounds check failed/missing
    in use-after-free, we wrote to something we weren't supposed to because a dangling-pointer check failed/missing

        we still use a pointer to X, which should not be allowed b/c X was freed
        the reason this would be exploitable:
            **the address that X was assigned was reused for something else (Y)
                    that we aren't supposed to write to**
                ~ same kind of situation as if we had a buffer overflow and Y was the adjacent thing

        common examples of Y that are especially useful for an attacker:
            ~ class with a VTable pointer
                > virtual methods of Y are likely to be called
                > likely to be able to change the VTable pointer to point to attacker location
                > have a very generic exploit technique
            ~ struct with a pointer in it
                > can probably use this to overwrite a code pointer as long as the pointer is used to
                    write something
                > pretty generic exploit technique
            ...

    UAF is a big problem b/c:
        * simple mitigations of adding bounds-checking don't help
        * the attacker has some control over the choice of Y
            b/c they can issue commands/requests that influence when Y gets allocated

    our UAF examples had new and delete sizes the same:
        ~ is this required? no
            depends on how the memory allocator works
            but usually, allocators are going to try to reuse memory even if size doesn't exactly match
                (b/c if you didn't --- you'd run out of memory probably)
        ~ is it useful for attacker?
            memory regions of the same size are probably most likely to be reused
        ~ is this difficult to arrange?
            depends on context ---
                sometimes the attacker can cause a new of a particular size by providing input of that size
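
    a sketch of why reuse makes a dangling pointer dangerous, using a toy fixed-size allocator.
    assumptions: `toy_alloc`/`toy_free` are hypothetical and only mimic the "most recently freed
    chunk is reused first" behavior many real allocators have; objects are plain arrays of words,
    and word 0 of Y plays the role of a VTable/code pointer (0x401000 is a made-up value).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORDS_PER_SLOT 4
#define NUM_SLOTS 4

static uint64_t pool[NUM_SLOTS][WORDS_PER_SLOT];
static int free_stack[NUM_SLOTS] = {3, 2, 1, 0};  /* slot 0 is handed out first */
static int free_count = NUM_SLOTS;

/* Hands out the most recently freed slot. */
uint64_t *toy_alloc(void) {
    assert(free_count > 0);
    return pool[free_stack[--free_count]];
}

/* Puts the slot back on top of the free stack; the caller's pointer is
   not cleared, which is what leaves it dangling. */
void toy_free(uint64_t *p) {
    for (int i = 0; i < NUM_SLOTS; i++) {
        if (pool[i] == p) {
            assert(free_count < NUM_SLOTS);
            free_stack[free_count++] = i;
            return;
        }
    }
    assert(0 && "pointer did not come from toy_alloc");
}

/* Returns the "code pointer" stored in Y after a write through the stale X. */
uint64_t uaf_demo(uint64_t attacker_value) {
    uint64_t *x = toy_alloc();   /* X: data the user is allowed to write */
    toy_free(x);                 /* X is freed, but x is still around */

    uint64_t *y = toy_alloc();   /* Y reuses the slot X had */
    y[0] = 0x401000;             /* hypothetical legitimate code pointer */

    x[0] = attacker_value;       /* use after free: looks like a write to X */

    uint64_t result = y[0];      /* Y's pointer is now attacker_value */
    toy_free(y);
    return result;
}
```

    no bounds were exceeded anywhere: the write to x[0] is in range for X, which is why
    bounds-checking alone does not catch it.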

- format string exploits

    vulnerable code:
        printf(attacker_controlled_string)
            [or some similar function that takes a "format string"]

    first big problem:
        printf() figures out how many arguments it's supposed to have from its string argument
            each %X in the first argument indicates an additional argument

        printf() will look for these additional arguments even if they aren't "really" there.

                       /---------- on the stack
            2 3 4 5 6 vvvvvv
    printf("%x%x%x%x%x%x%x%x...")
            ^^
            arg 2: in Linux x86-64: %rsi (%rdi holds arg 1, the format string)
              ^^
              arg 3: %rdx

    observations:
        * we can use the "extra" arguments to read the values of the argument registers
          AND to read values from the stack
           > values from the stack and argument registers often include pointers to the stack, code, etc.
                > help decode ASLR randomization
           > values from the stack would likely include any stack canaries
        * if the attacker controls some variable on the stack, it will end up being one of
            the arguments to printf()
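
    a sketch of the "first big problem": the number of arguments fetched is decided by the
    string alone. `count_format_args` is a hypothetical helper, not part of any libc, and it is
    simplified (it ignores "*" widths and positional forms like %1$x).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts how many extra arguments a printf-style function would fetch
   for `fmt`, based only on the conversions in the string. */
size_t count_format_args(const char *fmt) {
    size_t count = 0;
    for (const char *p = fmt; *p; p++) {
        if (*p != '%') continue;
        p++;
        if (*p == '%') continue;     /* "%%" prints a percent sign, no argument */
        /* skip flags, width, precision and length modifiers */
        while (*p && strchr("-+ #0123456789.hlqjztL", *p)) p++;
        if (*p == '\0') break;       /* dangling '%' at the end of the string */
        count++;                     /* one conversion => one argument fetched */
    }
    return count;
}
```

    for the string in the diagram with eight %x, this returns 8 whether or not the caller
    passed any arguments: five would come from registers and the rest from the stack.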

    second big problem:
        printf() supports a %n (or %??n where ?? represents some modifiers) conversion which
            is not used to output something, but is used to store how much has been output so far
        
        where does %n store ---> to a location pointed to by the corresponding argument
            ... which an attacker controls

    observations:
        * if attacker can control the argument to %n, they can write to any place they want
            > choose any target we mentioned was interesting with pointer subterfuge
        * if the attacker controls what's in the format string, they can control what value is written
            --> b/c they control how much is written
                example: %.1000u%n writes 1000 characters (an unsigned int padded with leading zeros to 1000 digits)
                    then uses %n to store the value 1000
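
    a sketch of what %n does, used legitimately. assumptions: a libc that honors %n for a
    string-literal format (some platforms restrict or disable %n); `chars_reported_by_n` is a
    hypothetical helper. here the program supplies a real int*; in an exploit that argument
    slot would hold whatever pointer the attacker arranged.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats `value` with "%.1000u%n" into `out` and returns what %n stored. */
int chars_reported_by_n(unsigned value, char *out, size_t out_size) {
    int stored = -1;
    assert(out_size > 1000);   /* room for 1000 digits plus the terminator */
    /* %.1000u pads the number with leading zeros to 1000 digits;
       %n then writes the count of characters produced so far to &stored. */
    snprintf(out, out_size, "%.1000u%n", value, &stored);
    return stored;
}
```

    changing the precision changes the stored value, which is how the attacker picks what
    gets written.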

- write XOR execute
    - hopefully enabled by default on most systems
    - on Linux/ELF:
        there's a GNU_STACK entry in the program headers with the 'x' flag (execute) disabled
        ... and hopefully malloc/etc. know better than to ask for executable+writeable
            > you can look at /proc/PID/maps (where PID is the process ID from top/ps) and see what
                parts of memory are executable
                    (manual page for this is available via the 'man 5 proc')
    - if it's enabled and you trick a program into jumping to writeable memory, 
        it will do something like segfault instead --- even though the writeable memory contains valid instructions
            (debugger should show segfaulting on an instruction that should work fine)
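
    a sketch of checking /proc/PID/maps programmatically (Linux only). both function names are
    hypothetical; the line format assumed is the one documented in proc(5):
    "start-end perms offset dev inode [path]".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if a maps-format line describes a region that is both
   writable and executable, 0 otherwise (including unparsable lines). */
int line_is_writable_and_executable(const char *line) {
    char perms[5];
    /* skip the address range, then read the 4-character permission field */
    if (sscanf(line, "%*s %4s", perms) != 1 || strlen(perms) < 3) return 0;
    return perms[1] == 'w' && perms[2] == 'x';
}

/* Counts writable+executable regions in the current process.
   Returns -1 if /proc/self/maps cannot be opened (e.g. not on Linux).
   Assumes each line fits in the buffer. */
int count_wx_regions(void) {
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f) return -1;
    char line[4352];
    int count = 0;
    while (fgets(line, sizeof line, f))
        count += line_is_writable_and_executable(line);
    fclose(f);
    return count;
}
```

    with write XOR execute in effect, count_wx_regions() would be expected to return 0 for an
    ordinary program.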

- RELRO
    relocation read-only
        making things that are set at runtime as part of dynamic linking read-only once they have been set

    things most prominently affected by RELRO
        global offset table
        VTables
            these do not contain code!
            they contain pointers to code
                > if not read-only, they are prime targets for attackers using pointer-subterfuge-like attacks

    partial versus full:
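        partial RELRO: doesn't make things read-only if they expect to be changed after program start
            > most notably lazily filled global offset table entries
        full RELRO: fills in those entries at program start (no lazy binding), so they can be read-only too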

- what's randomized by ASLR
    when we're loading a program, it's going to be divided into "chunks" to load
        TYPICALLY
            each executable file and library file is one chunk
            within a chunk, we'll have hard-coded offsets, e.g.

                pseudo-assembly      |  machine code description
                global:              |
                    .long init_value |  
                                     |
                main:                |
                    ...              |
                    mov global, %rax |  mov the value 0x5000 bytes before this instruction into %rax
                    call foo         |  call the function 0x500 bytes later
                                                objdump might say: 'call 0x5500'
                                                        but it computed that by saying "this instruction
                                                            was at 0x5000 and said add 0x500"
                                                    (the "0x5000" might be different when the program/library
                                                        is actually loaded)
                    ...              |
                    ret              |
                foo:                 |
                    ...

            can't extract the code for main and put it elsewhere without also moving global, foo with it

        ---> 
            since main has to stay with global+foo, if we know the address of main OR global OR foo
                we know the address of all the others
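
    the arithmetic behind that last point, as a sketch. `rebase` is a hypothetical helper, and
    the numbers in the usage note are made up (offsets loosely follow the example: main near
    0x5000, foo near 0x5500, global near 0x0 within the chunk).

```c
#include <assert.h>
#include <stdint.h>

/* Given the runtime address of one symbol and the offsets (within the same
   chunk) of that symbol and of a second one, returns the runtime address
   of the second symbol. */
uint64_t rebase(uint64_t leaked_addr, uint64_t leaked_offset, uint64_t target_offset) {
    uint64_t load_base = leaked_addr - leaked_offset;  /* where the chunk was placed */
    return load_base + target_offset;
}
```

    if main is leaked at 0x555555559000, then rebase(0x555555559000, 0x5000, 0x5500) gives
    0x555555559500 for foo, and using target offset 0x0 gives 0x555555554000 for global.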