CS4630 Spring 2021 exam

Answer each of the following questions. This exam is open-book and open-notes, but you may use only resources that were created before this exam was released (at 9pm eastern time on 12 May 2021). You may not collaborate with other students.

Please show your work for questions in the comments field where applicable so that we can give you partial credit.

If you think a question is ambiguous or unclear, please make your best guess about what was meant and explain what you did in the comments field for the question. We are unlikely to be able to answer your inquiries during the exam time.

Some viruses infect executables by compressing the original executable and creating a wrapper that runs the virus code, then decompresses and runs the executable.

Question 1 (4 pt) (see above)

Which of the following antivirus techniques likely motivated this virus design? Select all that apply.

scanners that only read the beginning and end of an executable file for efficiency

the wrapper likely has a signature that can be found at the beginning of the new executable
pattern-matching based malware scanning
⊤ (correct)
checking the metadata of executables for changes to find malware

common metadata to check would include file sizes
looking for changes to "sacrificial goat" executables that are placed on the filesystem in a place where virus-like malware is likely to infect it but are not actually useful executables

Question 4 (4 pt)

Viruses that append their virus code to an executable can often be detected based on changes made to the list of segments in an executable's headers.

Consider the following executable headers:

a.out:    file format elf64-x86-64
a.out
architecture: i386:x86-64, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x0000000000401c10

Program Header:
    LOAD off    0x0000000000000000 vaddr 0x0000000000400000 paddr 0x0000000000400000 align 2**12
         filesz 0x0000000000000518 memsz 0x0000000000000518 flags r--
    LOAD off    0x0000000000001000 vaddr 0x0000000000401000 paddr 0x0000000000401000 align 2**12
         filesz 0x000000000009375d memsz 0x000000000009375d flags r-x
    LOAD off    0x0000000000095000 vaddr 0x0000000000495000 paddr 0x0000000000495000 align 2**12
         filesz 0x0000000000026650 memsz 0x0000000000026650 flags r--
    LOAD off    0x00000000000bc0c0 vaddr 0x00000000004bd0c0 paddr 0x00000000004bd0c0 align 2**12
         filesz 0x0000000000005170 memsz 0x00000000000068c0 flags rw-
    NOTE off    0x0000000000000270 vaddr 0x0000000000400270 paddr 0x0000000000400270 align 2**3
         filesz 0x0000000000000020 memsz 0x0000000000000020 flags r--
    NOTE off    0x0000000000000290 vaddr 0x0000000000400290 paddr 0x0000000000400290 align 2**2
         filesz 0x0000000000000044 memsz 0x0000000000000044 flags r--
     TLS off    0x00000000000bc0c0 vaddr 0x00000000004bd0c0 paddr 0x00000000004bd0c0 align 2**3
         filesz 0x0000000000000020 memsz 0x0000000000000060 flags r--
0x6474e553 off    0x0000000000000270 vaddr 0x0000000000400270 paddr 0x0000000000400270 align 2**3
         filesz 0x0000000000000020 memsz 0x0000000000000020 flags r--
   STACK off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
         filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-
   RELRO off    0x00000000000bc0c0 vaddr 0x00000000004bd0c0 paddr 0x00000000004bd0c0 align 2**0
         filesz 0x0000000000002f40 memsz 0x0000000000002f40 flags r--

Briefly describe what would likely be changed in the headers shown above if a virus appended its code to the end of this executable. Assume that the appended virus's code is larger than 1 page (4096 bytes) in size.

Answer:

several solutions, accepted any one:

modify the last LOAD entry to make it executable and extend its size, or add a new LOAD entry
extending the last LOAD with minimal disruption would be achieved by increasing both filesz and memsz and putting the virus code after the memsz offset into the segment
dramatically increase the filesz and memsz of the r-x flagged LOAD so that it extends to the end of the executable. This will result in it overlapping with the other LOADs. We didn't talk about this in lecture, but the Linux loader allows overlapping segments and applies the LOAD directives in order.

Some students had solutions that would require changing where parts of the executable would be loaded relative to each other, by adjusting the vaddrs/paddrs of a bunch of segments. We didn't give these full credit since this seems to require the virus to do complicated rewriting of the existing executable's machine code.
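
As a rough illustration of the "extend the last LOAD" option, the sketch below computes the header values that option would produce for the last LOAD entry in the listing above. The Load class, the extend_last_load helper, and the 0x2000-byte virus size are hypothetical names and values chosen for illustration. The sketch assumes the virus is placed right after the old memsz bytes of the segment and ignores any data stored in the file after the segment.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Load:
    """One LOAD program header (field names mirror the objdump output)."""
    off: int
    vaddr: int
    filesz: int
    memsz: int
    flags: str


def extend_last_load(seg: Load, virus_size: int) -> Tuple[Load, int]:
    """Return the modified header and the address the virus would be loaded at.

    Assumption: the virus goes right after the old memsz bytes, so the
    zero-initialized tail (memsz - filesz) must now be present in the file.
    """
    virus_vaddr = seg.vaddr + seg.memsz
    new_size = seg.memsz + virus_size
    # The segment must become executable for the appended code to run.
    new_seg = replace(seg, filesz=new_size, memsz=new_size, flags="rwx")
    return new_seg, virus_vaddr


# Values copied from the last LOAD entry in the listing above.
last = Load(off=0xBC0C0, vaddr=0x4BD0C0, filesz=0x5170, memsz=0x68C0, flags="rw-")
new_last, virus_vaddr = extend_last_load(last, virus_size=0x2000)
print(hex(new_last.filesz), hex(new_last.memsz), new_last.flags, hex(virus_vaddr))
```

A scanner comparing headers before and after would see the larger filesz/memsz and the newly executable flag on a segment that used to be rw-.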

Question 5 (4 pt)

To evade pattern-based detection and analysis, malware frequently "encrypts" most of its code and includes a randomized decryption routine that decrypts and runs the code.

A frequently proposed solution for antimalware software or analysts to obtain the decrypted code is to run the malware in an emulator, and stop when the malware starts running machine code that it wrote while running. Then, one would examine the memory managed by the emulator for the decrypted code. Briefly explain a way that malware could defeat this automated scheme.

Answer:

could decrypt its code in multiple segments (stages), so the detection triggers early, before most of the code has been decrypted

We also accepted solutions that proposed detecting the emulator and refusing to run the virus code (but you needed to suggest a method that would be likely to do this for full credit).
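
The first answer can be illustrated with a toy model, which is only a sketch and not a real emulator. Memory is a bytearray, "encryption" is a single-byte XOR, and the detector stops the first time control transfers to an address that was written at run time. All names, keys, and byte strings here are made up for illustration.

```python
def xor(data: bytes, key: int) -> bytes:
    """Toy stand-in for the malware's encryption."""
    return bytes(b ^ key for b in data)


PAYLOAD = b"REAL-VIRUS-BODY"    # what the analyst actually wants to see
STAGE2 = b"stage2-decryptor"    # a second decryptor, itself stored encrypted

# Initial memory image: stage 2 encrypted with one key, payload with another.
memory = bytearray(xor(STAGE2, 0x11) + xor(PAYLOAD, 0x22))
written = set()  # addresses written at run time, as tracked by the toy emulator


def write(addr: int, data: bytes) -> None:
    memory[addr:addr + len(data)] = data
    written.update(range(addr, addr + len(data)))


def emulate_until_written_code_runs() -> bytes:
    # Stage 1 (the unencrypted code) decrypts only stage 2, then jumps to it.
    write(0, xor(bytes(memory[:len(STAGE2)]), 0x11))
    jump_target = 0
    if jump_target in written:
        # The detector fires here and snapshots memory.
        return bytes(memory)
    raise AssertionError("unreachable in this toy")


snapshot = emulate_until_written_code_runs()
print("stage 2 visible:", STAGE2 in snapshot)
print("payload visible:", PAYLOAD in snapshot)
```

In this model the snapshot contains the decrypted second stage but the payload is still encrypted, because the detector stopped before stage 2 had a chance to run.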

Some self-replicating malware tries to evade signature-based detection by performing a machine code to machine code transformation as part of its replication process. One possible transformation is to insert additional operations that have no useful effect between instructions.

Question 6 (4 pt) (see above)

One idea to detect this transformation is to ignore nop instructions and instructions that do not change any values in memory or registers (e.g. add $0, %rax or jmp foo; foo:) when doing pattern matching. Give an example of how malware could use this strategy of inserting additional useless instructions between its instructions but evade this countermeasure.

Answer:

add to a register not used by the malware's normal machine code; add to and then subtract the same value from a register; push a value onto the stack and then pop it; ...
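
A small sketch of why these answers work: a filter that drops only instructions which individually have no effect cannot remove pairs of instructions that each change state but cancel out together. Instructions are modeled as plain strings, and the signature, filter, and instruction sequences below are hypothetical examples (side effects on flags are ignored).

```python
def strip_no_effect(instrs):
    """Drop instructions that individually change no register or memory value."""
    no_effect = {"nop", "add $0, %rax"}
    return [i for i in instrs if i not in no_effect]


def contains(haystack, needle):
    """True if the list needle appears as a contiguous run inside haystack."""
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


SIGNATURE = ["mov %rdi, %rax", "xor %rcx, %rax", "ret"]

# Junk that the countermeasure handles: each inserted instruction is a no-op.
with_nops = ["mov %rdi, %rax", "nop", "xor %rcx, %rax", "add $0, %rax", "ret"]

# Junk that evades it: each inserted instruction changes a value,
# but each pair restores the original state.
with_pairs = ["mov %rdi, %rax", "push %rbx", "pop %rbx", "xor %rcx, %rax",
              "add $5, %rdx", "sub $5, %rdx", "ret"]

print(contains(strip_no_effect(with_nops), SIGNATURE))
print(contains(strip_no_effect(with_pairs), SIGNATURE))
```

The first variant still matches the signature after filtering; the second does not, since none of its inserted instructions is a no-op on its own.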

Question 7 (4 pt) (see above)

To implement this transformation, the malware must make some adjustments to deal with jumps and calls within the malware code. Suppose a jump instruction is encoded using a relative offset. Which of the following information would alone be sufficient for the malware to produce a correct jump instruction in the new machine code? Select all that apply.

the difference between the location of the jump's target in the old machine code and the new machine code
both the difference between the location of the jump in the old machine code and its location in the new machine code and the difference between the location of the jump's target in the old machine code and the new machine code

we intended all of these to be interpreted as also including the original jump (and its original offset), but this was not clear
the difference between the location of the jump's target in the old machine code and the location of the jump in the old machine code
⊤ (correct)
the difference between the location of the jump's target in the new machine code and the location of the jump in the new machine code
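
The arithmetic behind these options can be sketched as follows. On x86-64 a relative jump's offset is measured from the end of the jump instruction; the sketch assumes a 5-byte jmp rel32 whose length is unchanged by the transformation, and the addresses and helper names are made up for illustration.

```python
def rel_offset(jump_addr: int, target_addr: int, instr_len: int = 5) -> int:
    """Offset stored in a relative jump: target minus the end of the jump."""
    return target_addr - (jump_addr + instr_len)


def relocate(old_rel: int, jump_shift: int, target_shift: int) -> int:
    """New offset from the old offset plus how far the jump and target moved."""
    return old_rel + target_shift - jump_shift


# Hypothetical layout: inserted junk moved the jump by 8 bytes
# and its target by 0x20 bytes.
old_jump, old_target = 0x1000, 0x1040
new_jump, new_target = 0x1008, 0x1060

old_rel = rel_offset(old_jump, old_target)
new_rel = relocate(old_rel,
                   jump_shift=new_jump - old_jump,
                   target_shift=new_target - old_target)

# Both routes give the same encoded offset.
print(new_rel, rel_offset(new_jump, new_target))
```

Knowing only how far the target moved is not enough, because the jump itself may have moved by a different amount; both shifts (together with the original offset) or the new jump-to-target distance determine the new encoding.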

Question 8 (5 pt)

Which of the following techniques are likely to make it more difficult to set breakpoints and, when those breakpoints are reached, get a useful call stack trace of some malware in a debugger? Select all that apply.

encrypting constant strings and values

doesn't affect what functions are called and when, also the strings/values are likely to be decrypted at runtime and so visible in a call trace that shows arguments
copying values between variables one bit at a time, using an if statement to determine whether to write 0 or 1 in each location
⊤ (correct)
malware having code that reads its machine code and verifies that it hashes to a specific value
converting a function that uses if statements and other flow control to instead be composed of a loop around a switch statement, where rather than an if statement jumping to the code for the "then" or "else" part, it will set the control variable for the switch and return to the top of the loop

doesn't change what function calls happen or make those function calls not use the stack as usual; also doesn't do anything that affects operation of breakpoints
⊤ (correct)
converting the program's code into code for a custom virtual machine and including an emulator for that machine and the converted code

likely to hide function calls so call stack is not useful; also operations won't have a single location where a breakpoint can be set with a conventional debugger
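
For reference, the "loop around a switch" transformation described in the fourth option can be sketched as below. Python has no C-style switch, so an if/elif chain on a state variable stands in for it; the function and state numbering are hypothetical.

```python
def classify(n: int) -> str:
    """Original version with ordinary if/else control flow."""
    if n < 0:
        result = "negative"
    else:
        result = "non-negative"
    return result


def classify_flattened(n: int) -> str:
    """Same logic as a dispatch loop: each block sets the next state."""
    state = 0
    result = None
    while True:
        if state == 0:      # the original condition
            state = 1 if n < 0 else 2
        elif state == 1:    # the "then" part
            result = "negative"
            state = 3
        elif state == 2:    # the "else" part
            result = "non-negative"
            state = 3
        elif state == 3:    # the code after the if statement
            return result


print(classify(-4), classify_flattened(-4))
print(classify(7), classify_flattened(7))
```

The flattened version is still a single ordinary function that is called and returns in the usual way, which is consistent with the explanation that this transformation does not interfere with breakpoints or call stack traces.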

Question 9 (5 pt)

Consider a dynamic taint tracking scheme that executes a program by annotating each value in a running program with a flag about whether it is "tainted". When a tainted value is used in arithmetic to compute another value, the other value is marked as tainted. Which of the following techniques are likely to make this kind of dynamic taint tracking scheme ineffective? Select all that apply.

encrypting constant strings and values
⊤ (correct)
copying values between variables one bit at a time, using an if statement to determine whether to write 0 or 1 in each location
malware having code that reads its machine code and verifies that it hashes to a specific value

if we implement this taint tracking scheme with an emulator the original machine code will be unchanged (we'll just have an implementation of each machine code instruction that does more state tracking than a normal processor); if we do a machine code to machine code transformation, the original machine code will likely be preserved to make the translation correct when a program reads constants out of its code segment.
converting a function that uses if statements and other flow control to instead be composed of a loop around a switch statement, where rather than an if statement jumping to the code for the "then" or "else" part, it will set the control variable for the switch and return to the top of the loop

this taint tracking scheme doesn't rely on analyzing control flow (regardless of whether it's control flow based on ifs or switches), so there won't be a difference in what's tainted before/after this transformation (except that the new variable may or may not be tainted)
converting the program's code into code for a custom virtual machine and including an emulator for that machine and the converted code

should perform the same calculations/arithmetic with the same values, so have roughly the same taint information
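
A minimal model of the scheme described in the question shows why the bit-at-a-time copy defeats it: taint propagates through arithmetic results, but not through the decision made by an if statement. The Tainted class and the copy helpers are hypothetical and only model addition.

```python
class Tainted:
    """A value plus a taint flag; arithmetic propagates the flag."""

    def __init__(self, value: int, tainted: bool = False):
        self.value = value
        self.tainted = tainted

    def __add__(self, other: "Tainted") -> "Tainted":
        return Tainted(self.value + other.value, self.tainted or other.tainted)


def copy_with_arithmetic(src: Tainted) -> Tainted:
    """Ordinary copy through arithmetic: the result is derived from src."""
    return src + Tainted(0)


def copy_bit_by_bit(src: Tainted, bits: int = 8) -> Tainted:
    """Copy using only branches on src; the output is built from constants."""
    out = Tainted(0)
    for i in range(bits):
        if (src.value >> i) & 1:          # control flow depends on tainted data
            out = out + Tainted(1 << i)   # but only untainted values are combined
    return out


secret = Tainted(0xA5, tainted=True)
a = copy_with_arithmetic(secret)
b = copy_bit_by_bit(secret)
print(a.value == b.value, a.tainted, b.tainted)
```

Both copies hold the same value, but only the arithmetic copy carries the taint flag, so the scheme loses track of the data after the bit-by-bit copy.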

Question 10 (4 pt)

What is the difference between a sandbox and a virtual machine?

A sandbox runs everything as the same user, but a virtual machine runs everything as an entirely new user
A sandbox can only perform privilege separation on an application, but cannot prevent it from accessing files which a virtual machine can
⊤ (correct)
A sandbox limits how the system can talk to the OS, but a virtual machine creates an entirely new OS
A sandbox prevents an application from making any changes to the system, but a virtual box allows for changes via file transfer.

Suppose we wanted to use sandboxing to protect against vulnerabilities in a video calling application.

Question 11 (4 pt) (see above)

What would be true about attempting to do this by confining the entire application using chroot? Select all that apply.

⊤ (correct)
we would need to identify the location of the video calling application's configuration files for this to be useful

need to identify config files in order to ensure that they're included in the chroot environment
⊤ (correct)
if the video calling application supports sharing files with other users, then it would be difficult to support this functionality in the confined environment

would need to include files a user might want to upload in the chroot environment, which would seem to defeat a lot of the security benefit of the chroot environment
this technique would likely keep the video calling application from sharing the content of other windows with users

other windows likely not accessed through filesystem (e.g. via a windowing server or the display driver or similar instead)

Question 12 (4 pt) (see above)

If one wanted to use privilege separation for this task, what is a part of the video calling application that would be a good candidate for performing privilege separation?

Answer:

video decoding, network decoding, chat message rendering; generally we want to run the parts of the video calling application that deal with data controlled by a potential attacker with fewer privileges

We preferred answers that made it clear what would go in the less privileged part of the application, but also accepted answers which identified what functionality would be outside the sandbox (and accessed from the sandbox via a limited API): e.g. choosing files for file sharing or windows for screensharing

Question 13 (4 pt)

How does AFL-tmin know its minimized test case triggers the same bug as the original test case?

AFL checks if the backtrace of the crash is the same
AFL checks if the crash occurred on the same line of C code
AFL checks if the values on the stack are the same
AFL checks if the input values are used by the code that crashes
⊤ (correct)
AFL checks if the program executes the same branches to produce a crash

Question 14 (4 pt)

Which of the following are allowed under Rust's ownership rules? Assume only built-in references are used, not special reference classes that implement different policies like Rc or RefCell. Select all that apply.

⊤ (correct)
making a hashmap whose keys and values are strings, and having a function that modifies that hashmap after being passed a string to insert into the hashmap and mutable reference to the hashmap as arguments

function will borrow the reference to the hashmap
⊤ (correct)
constructing a binary tree composed of objects of a Node struct type, where each Node has a mutable reference that can be used to read or write its left and right child
constructing a binary tree composed of objects of a Node struct type, where each Node has a mutable reference that can be used to read or write its left and right child and its parent

Node can't "own" its parent that also owns the Node
putting a reference that can be used by another core to read a local progress variable managed by a search function in a global variable whenever the search function is running (and setting the reference to a null/placeholder value when search function completes)

for safety, there needs to be some mechanism to ensure that the other core stops using the reference before the search function completes. Because of this concern, Rust doesn't allow this without using some special reference class that implements a multithreading-safe policy

Question 15 (5 pt)

Some bounds-checking schemes use a lookup table that allows code to use the address of any byte of an object to determine the beginning and end and size of an object. Which of the following are true about these schemes? Select all that apply.

⊤ (correct)
these schemes will stop code that overflows a buffer in one object to overwrite a pointer near the beginning of the next object in memory
these schemes will stop code that overflows a buffer in a struct from overwriting a pointer later on in the struct

the lookup table needs to be able to tell us both about where the beginning/end of the struct (or maybe an array of structs) is AND where the beginning/end of objects in the struct are, and it seems like one with the structure proposed won't do that
this approach requires allocating a "red zone" of unused space between objects

"red zone" usually refers to space that should not be modified/will cause errors if modified. Baggy Bounds checking, the scheme which worked most like this, did sometimes add padding to make sizes a power of two, but didn't need padding if the size was already a power of two; and in general, padding is not required if you're willing to have higher overhead for the bounds check
⊤ (correct)
these schemes require changes to the memory allocator (malloc, new, etc.)
these schemes require changing how pointers are represented (such as by replacing pointers with a struct with additional information or encoding extra information in the high bits of pointers)

the lookup table allows us to reconstruct the information that would be in "fat" pointers
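
A toy version of such a lookup table, loosely modeled on the Baggy Bounds idea mentioned above, is sketched below. It assumes power-of-two sized, size-aligned allocations and a table with one entry per 16-byte slot; the slot size, addresses, and function names are hypothetical.

```python
SLOT = 16    # bytes covered by one table entry (hypothetical choice)
table = {}   # slot index -> log2 of the size of the allocation covering it


def allocate(base: int, log2_size: int) -> None:
    """Record an allocation; the allocator must cooperate to fill the table."""
    size = 1 << log2_size
    assert base % size == 0, "allocations are aligned to their size"
    for slot in range(base // SLOT, (base + size) // SLOT):
        table[slot] = log2_size


def bounds(ptr: int):
    """Recover (start, end) of the object containing ptr from any interior byte."""
    size = 1 << table[ptr // SLOT]
    start = ptr & ~(size - 1)
    return start, start + size


def checked_add(ptr: int, delta: int) -> int:
    """Pointer arithmetic that must stay inside the original object."""
    start, end = bounds(ptr)
    new = ptr + delta
    if not (start <= new < end):
        raise IndexError("pointer arithmetic left the object")
    return new


allocate(0x1000, 6)   # a 64-byte object
allocate(0x1040, 6)   # the next 64-byte object in memory

print(bounds(0x1010))
print(hex(checked_add(0x1000, 32)))   # stays inside the same object: allowed
try:
    checked_add(0x1010, 0x30)         # would reach the next object
except IndexError as exc:
    print("blocked:", exc)
```

Moving from offset 0 to offset 32 within one object is allowed even if a buffer ends at offset 16 and a pointer field starts at offset 32, which matches the explanation that a table of whole-object bounds does not stop overflows within a struct.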

Question 16 (4 pt)

Which of the following patterns would match the machine code for an x86-64 return? Select all that apply.

c3
[cC]3
⊤ (correct)
\xc3
\xc\x3
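
The difference between these patterns can be checked with a quick experiment. The sketch below uses Python's re module on a bytes object as a stand-in for the scanner's pattern language, which is an assumption; the exam's pattern syntax may differ in details. The machine code bytes are a made-up two-instruction example.

```python
import re

# Machine code for: mov %rdi, %rax ; ret
code = bytes([0x48, 0x89, 0xF8, 0xC3])


def matches(pattern: bytes) -> bool:
    """True if the pattern compiles and matches somewhere in the machine code."""
    try:
        return re.search(pattern, code) is not None
    except re.error:
        return False  # the pattern is not valid in this regex dialect


for pattern in (rb"c3", rb"[cC]3", rb"\xc3", rb"\xc\x3"):
    print(pattern, matches(pattern))
```

The patterns c3 and [cC]3 look for the ASCII characters "c" (or "C") and "3", which are the bytes 0x63 (or 0x43) and 0x33, rather than the single byte 0xc3 that encodes ret.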

Question 17 (4 pt)

Which of the following statements are true about using static analysis to find potential security bugs (such as use-after-free or a buffer overflow) in a function F()?

⊤ (correct)
the analysis may report potential bugs that are not actually possible, no matter how F() is run
if F()'s behavior depends on other functions it calls, then the analysis requires access to the code for those functions to produce any results
if the function has arguments that it uses, the static analysis will need to simulate the function for each possible value of those arguments
if the function has a potentially infinite loop, the analysis will not be able to find all potential bugs
