Changelog:

22 April 2024: emphasize that fewer guesses only needs to happen with large enough length and that the difference should be large and not-passphrase dependent; add footnote exponential versus linear scaling

1 Overview

In this lab, we will give you a check_passphrase function that compares a passphrase with a stored passphrase one character at a time.

Your task is to write a function that will figure out the passphrase by calling check_passphrase and timing how long it takes. Rather than trying all possible passphrases, you should be able to tell whether you have the first few characters correct by examining how long it takes the function to run.

2 Your Task

As usual for labs, you may work with other students.
Download the supplied skeleton code at here (or via something like wget https://www.cs.virginia.edu/~cr4bd/3130/S2023/labhw/files/side-channel.tar).

When woring on the department machines, run module load gcc since our Makefile requires a more recent version of gcc than the default.

Build the supplied code using make.
Examine the find_passphrase function in lab.c.

Try running our test harness with a target passphrase of foo with
```
./lab foo
```
or with a target passphrase of b with
```
./lab b
```
In lab.c, modify the find_passphrase(buffer, length) function so it will figure out which passphrase check_passphrase accepts. Your function may assume that passphrases consist of only of the letter a through z and are exactly length characters long.

Your function must only use information obtained by running and timing check_passphrase.

We provide advice on how to understand the supplied code and utility functions and to write your find_passphrase function in the hints below.

When the passphrase length is large enough, a correct solution will use many times fewer guesses overall than trying all possible passphrases of the specified length regardless of the chosen passphrase.¹
Test your solution by running ./lab with various target passphrases.
Either:
- Submit your modified lab.c to the submission site, or
- Show a TA your code working for checkoff.

3 Hints

3.1 Understanding the supplied code

Run make to build a ./lab program from:
- lab.c: the file with the implementation of find_passphrase you are going to modify
- timing.c: high-precision timing code that you can use from lab.c
- main.c: our supplied testing code
- check_passphrase.s: the check_passphrase() function written in assembly
The ./lab program can be run like:
```
./lab passphrase
```
in which case it will set the passphrase to passphrase, invoke the find_passphrase function and report the results like:
```
found passphrase [passphrase] after 702 queries
```
or, if the function fails to fill in buffer with the correct passphrase, it will report results like
```
found [z], but that was not the passphrase
```
With unmodified code, find_passphrase (in lab.c) will try the passphrases a, b, and c, in that order. If any of them work, it will return that answer, then the code in main.c will print out the answer. So for example, running
```
./lab b
```
will print out
```
found passphrase [b] after 2 queries
```
and
```
./lab c
```
will print out
```
found passphrase [c] after 3 queries
```
.

If you select a passphrase that is not one of the ones it checks, it will print out the timings for checking a, b, and c and return the answer c, which will not be correct.
To run and time check_passphrase, the supplied find_passphrase uses a utility function we supplied called measure_once, which it calls using code like:
```
long time = measure_once(&result, buffer, check_passphrase);
```
This does something similar to the pseudocode:
```
long start = GetTime();
result = check_passphrase(buffer);
long end = GetTime();
long time = end - start;
```
The measure_once function attempts to follow advice in Intel’s documentation for obtaining high-precision timings of code. (It also includes some x86-64 assembly, so it will not work on non-x86-64 systems.)

On my system running ./lab bx prints out something like:

'a' took 426 cycles
'b' took 418 cycles
'c' took 146 cycles
found [c], but that was not the passphrase

and ./lab cx prints out something like:

'a' took 710 cycles
'b' took 134 cycles
'c' took 328 cycles
found [c], but that was not the passphrase

In both cases, running the test for a, the first thing timed, took a while. This is probably because values were not loaded into caches yet. We would probably get more consistent timings for a if we called check_passphrase earlier to make sure values were loaded into caches.
Notice that the time the function took to test b or c varied depending on whether the passphrase started with c or b. You’ll take advantage of that to modify find_passphrase.

3.2 Suggested approach

I would recommend starting by modifying the code to time all the letters a through z and printing out all those timings.
To deal with the problem of the first thing timed being slow every time (probably because of caches being loaded for the first time), I would recommend taking multiple measurements for each letter. Taking multiple measurements will also mitigate against measurement errors from other activities happening on the system.
When you’ve done this, you should be able to see that when you run, for example
```
./lab fxxxx
```
you’ll get significantly longer timings for testing f compared to other items and when you run, for example
```
./lab axxxx
```
you’ll get significantly longer timings for testing a.
Next, expand your code from testing the first letter to guessing multiple letters. I would recommend structuring your code like:
- Fill in the string with all \0s to start with an empty string.
- In a loop for each index i up to length:
  - Try each possibility for the i’th letter and see which seems most likely based on the timings you’ve measured.
  - Fill in the best guess of the ith letter for future iterations
- If you get to the end of the loop and none of the guesses were correct, repeat the process. Hopefully, this was just caused by something interfering with the timings that won’t happen over and over again.

3.3 On ensuring measurement consistency

When consolidating mulitple measurements, it may be better to use an approach that is not very sensitive to outliers (for example, such as taking a median; or retrying whenever multiple measurements don’t agree about what is slowest).

(Note that outliers might occur, e.g., because of an exception occuring while the passphrase is being checked, which isn’t relevant to the timing we care about.)
To get consistent measurements, you should try to measure all the possible letters under consistent conditions. (This helps avoid issues where, for example, the processor changes clock speeds dynamically to save power or manage heat.)

So, for example, it might be good to do things like:
- (especially if taking lots of measurements) interleaving when each possiblity is measured rather than doing all the measurements for the first possiblity, then the second, and so on, and/or
- measuring all the letters for an index before printing any measurements out

Trying all possible passphrases would take Theta(26ⁿ) guesses (averaging 26ⁿ/2 passphrases before it finds the correct one), but the timing information should allow around Theta(n) guesses. It’s possible your timing-based solution may sometimes get confused by a spuriously low or high measurement and therefore not achieve linear scaling. But it should still be much better than the trying all passphrases solution.↩︎