1 Overview
In this lab, we will give you a check_passphrase
function that compares a passphrase with a stored passphrase one
character at a time.
Your task is to write a function that will figure out the passphrase
by calling check_passphrase
and timing how long it takes.
Rather than trying all possible passphrases, you should be able to tell
whether you have the first few characters correct by examining how long
it takes the function to run.
2 Your Task
As usual for labs, you may work with other students.
Download the supplied skeleton code at here (or via something like
wget https://www.cs.virginia.edu/~cr4bd/3130/S2023/labhw/files/side-channel.tar
).When woring on the department machines, run
module load gcc
since our Makefile requires a more recent version ofgcc
than the default.Build the supplied code using
make
.Examine the
find_passphrase
function inlab.c
.Try running our test hardness with a target passphrase of
foo
with./lab foo
or with a target passphrase of
b
with./lab b
In
lab.c
, modify thefind_passphrase(buffer, length)
function so it will figure out which passphrasecheck_passphrase
accepts. Your function may assume that passphrases consist of only of the lettera
throughz
and are exactlylength
characters long.Your function must only use information obtained by running and timing
check_passphrase
.We provide advice on how to understand the supplied code and utility functions and to write your
find_passphrase
function in the hints below.A good solution should use substantially fewer guesses overall than trying all possible passphrases of the specified length.
Test your solution by running
./lab
with various target passphrases.Either:
- Submit your modified
lab.c
to the submission site, or - Show a TA your code working for checkoff.
- Submit your modified
3 Hints
3.1 Understanding the supplied code
Run
make
to build a./lab
program from:lab.c
: the file with the implementation offind_passphrase
you are going to modifytiming.c
: high-precision timing code that you can use fromlab.c
main.c
: our supplied testing codecheck_passphrase.s
: the check_passphrase() function written in assembly
The
./lab
program can be run like:./lab passphrase
in which case it will set the passphrase to
passphrase
, invoke thefind_passphrase
function and report the results like:found passphrase [passphrase] after 702 queries
or, if the function fails to fill in
buffer
with the correct passphrase, it will report results likefound [z], but that was not the passphrase
With unmodified code,
find_passphrase
(in lab.c) will try the passphrasesa
,b
, andc
, in that order. If any of them work, it will return that answer, then the code inmain.c
will print out the answer. So for example, running./lab b
will print out
found passphrase [b] after 2 queries
and
./lab c
will print out
found passphrase [c] after 3 queries
.
If you select a passphrase that is not one of the ones it checks, it will print out the timings for checking
a
,b
, andc
and return the answerc
, which will not be correct.To run and time
check_passphrase
, the suppliedfind_passphrase
uses a utility function we supplied calledmeasure_once
, which it calls using code like:long time = measure_once(&result, buffer, check_passphrase);
This does something similar to the pseudocode:
long start = GetTime(); result = check_passphrase(buffer); long end = GetTime(); long time = end - start;
The
measure_once
function attempts to follow advice in Intel’s documentation for obtaining high-precision timings of code. (It also includes some x86-64 assembly, so it will not work on non-x86-64 systems.)On my system running
./lab bx
prints out something like:'a' took 426 cycles 'b' took 418 cycles 'c' took 146 cycles found [c], but that was not the passphrase
and
./lab cx
prints out something like:'a' took 710 cycles 'b' took 134 cycles 'c' took 328 cycles found [c], but that was not the passphrase
In both cases, running the test for
a
, the first thing timed, took a while. This is probably because values were not loaded into caches yet. We would probably get more consistent timings fora
if we calledcheck_passphrase
earlier to make sure values were loaded into caches.Notice that the time the function took to test
b
orc
varied depending on whether the passphrase started withc
orb
. You’ll take advantage of that to modifyfind_passphrase
.
3.2 Suggested approach
I would recommend starting by modifying the code to time all the letters
a
throughz
and printing out all those timings.To deal with the problem of the first thing timed being slow every time (probably because of caches being loaded for the first time), I would recommend taking multiple measurements for each letter. Taking multiple measurements will also mitigate against measurement errors from other activities happening on the system.
When you’ve done this, you should be able to see that when you run, for example
./lab fxxxx
you’ll get significantly longer timings for testing
f
compared to other items and when you run, for example./lab axxxx
you’ll get significantly longer timings for testing
a
.Next, expand your code from testing the first letter to guessing multiple letters. I would recommend structuring your code like:
Fill in the string with all \0s to start with an empty string.
In a loop for each index i up to length:
Try each possibility for the i’th letter and see which seems most likely based on the timings you’ve measured.
Fill in the best guess of the ith letter for future iterations
If you get to the end of the loop and none of the guesses were correct, repeat the process. Hopefully, this was just caused by something interfering with the timings that won’t happen over and over again.
3.3 On ensuring measurement consistency
When consolidating mulitple measurements, it may be better to use an approach that is not very sensitive to outliers (for example, such as taking a median; or retrying whenever multiple measurements don’t agree about what is slowest).
(Note that outliers might occur, e.g., because of an exception occuring while the passphrase is being checked, which isn’t relevant to the timing we care about.)
To get consistent measurements, you should try to measure all the possible letters under consistent conditions. (This helps avoid issues where, for example, the processor changes clock speeds dynamically to save power or manage heat.)
So, for example, it might be good to do things like:
- (especially if taking lots of measurements) interleaving when each possiblity is measured rather than doing all the measurements for the first possiblity, then the second, and so on, and/or
- measuring all the letters for an index before printing any measurements out