NOTE: From now on, we’ll assume that you start lab by connecting to the CS portal and that you are familiar with the command line environment. If you have not been practicing with the terminal, we strongly encourage reviewing Lab 1.
We will also assume that you ran the setup script from that lab and have all modules (including clang and git) loaded by default.
After this lab, you should:
- Be more familiar with x86-64
- Understand how to run and step through code in lldb
- Be able to view the current state of the system at each instruction
- lldb is a command-line debugger
- “LLVM” is the compiler framework that includes many things, including the clang compiler that we are using, as well as lldb
- gdb is the debugger that was used in the past, and is often used elsewhere; it is analogous to lldb in how it works
A debugger is a utility program that allows you to run a program under development while controlling its execution and examining the internal values of variables. We think of a program running “inside” a debugger. The debugger allows us to control the execution of the program by pausing its execution and then resuming it. While paused, we can find out where we are in the program, inspect the values of variables, change those values, etc. If a program crashes, the debugger can tell you exactly where the program crashed. The principles and commands described in this document are specific to the lldb debugger under UNIX, but every debugger has similar commands.
We’ll use debuggers initially on binary files. When we get to writing our own C code, you will want to compile your code with the -g flag to enable debugging symbols, which will make the debugger much more useful.
- Log into a computer with clang and LLDB that uses the x86-64 ISA (i.e., by SSHing into the CS portal)
- Invoke lldb[1] with lldb program_to_debug. Note: you must run lldb on an executable. You cannot run lldb prime.s; you must first assemble your file with clang.
The following sections describe the important types of things you can do with lldb, organized by “category” of activity.
The following all assume you are in a debugger
| Command | Effect |
|---|---|
| run | (re)start the program |
| run ARG1 ARG2 | (re)start the program with the given command line arguments |
| step | step one source-code-line forward, entering functions if stepping on a call |
| next | step one source-code-line forward, skipping to return if stepping on a call |
| si | step one ISA-instruction forward, entering functions if stepping on a call |
| ni | step one ISA-instruction forward, skipping to return if stepping on a call |
| finish | run until the next return (the end of the current function) |
| continue | resume running after run was interrupted (e.g., after a breakpoint or Ctrl+C) |
| quit | leave the debugger |
You might also want to use Ctrl+C to interrupt a program if it is running too long (this works on the command line for programs run without a debugger too).
A breakpoint is a program location where the debugger pauses when running so you can see what’s around it.
When you run, the debugger pauses right before executing the code on which you placed a breakpoint.
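| Command | Effect |
|---|---|
| br set -n FUNCTION | set a breakpoint on the first line of FUNCTION |
| br set -f FILE -l 23 | set a breakpoint on line 23 of FILE |
| br list | list all breakpoints |
| br delete 1 | delete breakpoint number 1 (as indicated in the list) |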
To inspect the code and call stack,
| Command | Effect |
|---|---|
| bt | show a backtrace of the call stack |
| up | select the stack frame of the caller of the current stack frame |
| down | undo a previous up |
If you need to peek inside registers or memory,
| Command | Effect |
|---|---|
| frame info | show information about the current stack frame |
| register read | show the contents of the program registers |
| register read -f d | show the contents of the program registers, formatted as signed integers |
| register read rax | show the contents of rax |
See the cmdadd example for a detailed walkthrough.
runsum is supposed to print out the sum of numbers from 0 to \(n\) (the command line argument) inclusive, for example:
./runsum 0
Sum of values from [0, 0]: 0
./runsum 3
Sum of values from [0, 3]: 6
./runsum 4
Sum of values from [0, 4]: 10
However, the program sometimes prints the wrong numbers! If we call ./runsum 4, it prints 4. That’s not good! Try it out yourself.
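When comparing against what runsum prints, it can help to have the expected totals at hand. The following is a minimal sketch that recomputes the sample outputs with plain shell arithmetic; expected_sum is a hypothetical helper written for this illustration and is not part of the lab files.

```shell
# expected_sum is a hypothetical helper (not provided by the lab):
# it adds up 0 + 1 + ... + n, the value runsum is supposed to print.
expected_sum() {
    n=$1
    total=0
    i=0
    while [ "$i" -le "$n" ]; do
        total=$((total + i))
        i=$((i + 1))
    done
    echo "$total"
}

# Reproduce the sample outputs for 0, 3, and 4.
for n in 0 3 4; do
    echo "Sum of values from [0, $n]: $(expected_sum "$n")"
done
```

Any line of runsum output that disagrees with these totals points at an input worth stepping through in the debugger.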
In order to get a copy of runsum, you’ll need to copy it into your home directory on the portal. cd into the directory where you wish the file to reside, then issue the following command to copy the file:
cp /p/cso1/labs/runsum .
Your task: use lldb to find the bug, then use a hex editor to fix it.
We used an online hex-editor in Lab 2, and that can be used again here. However, it is likely much easier to edit the file directly on the portal.
In order to edit the hex directly on the portal, we can use the program xxd, which creates a text file containing the hex values.
Run the xxd program on the runsum file to turn it into hexadecimal text.
xxd runsum > runsum-hex.txt
You can then edit the runsum-hex.txt file directly. Note that there are three sections in each row. The first section displays the offset into the runsum file of the first byte of that row. The second section contains the hex digits for that row. The third section contains the textual representation of the bytes (as if they were ASCII characters). We can mainly ignore this last section and only edit the hex digits.
When you have finished editing the hex digits, return the file to binary using xxd again.
xxd -r runsum-hex.txt > runsum-updated
Our new file will not be executable by default. Issue the following command to allow execution of the new runsum-updated file:
chmod u+x runsum-updated
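The whole round trip can be rehearsed on a tiny file before touching runsum. The following is a minimal sketch, assuming xxd is available; the file names (demo.bin and so on) and the four bytes are hypothetical stand-ins, and sed stands in for the manual edit you would make in a text editor.

```shell
# Work in a scratch directory so nothing in the lab folder is touched.
workdir=$(mktemp -d)

# demo.bin is a hypothetical 4-byte stand-in for runsum: bytes 01 02 03 04.
printf '\001\002\003\004' > "$workdir/demo.bin"

# Binary -> hex text (offset, hex digits, ASCII view).
xxd "$workdir/demo.bin" > "$workdir/demo-hex.txt"

# Edit only the hex digits: change the third byte from 03 to ff.
sed 's/: 0102 0304/: 0102 ff04/' "$workdir/demo-hex.txt" > "$workdir/demo-hex-edited.txt"

# Hex text -> binary, then mark the result executable as in the lab.
xxd -r "$workdir/demo-hex-edited.txt" > "$workdir/demo-updated"
chmod u+x "$workdir/demo-updated"

# Dump the patched file as a plain hex string to confirm the change.
xxd -p "$workdir/demo-updated"
```

The ASCII section was left unedited here; xxd -r rebuilds the file from the offset and hex digits.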
Aside: There are also various hex-editor applications available, such as
- xxd (installed by default on some Linux systems)
If editing binaries becomes a common part of your workflow (it won’t in this class, but might later in your careers) then picking a hex editor you like will be worth your while.
Our programs normally start with main, so that’s a good place to look first. Once you have started the debugger on runsum, use di -n main to disassemble and view the contents of main.
- Questions to ponder:
  - What function calls do you see? Since lldb gives us comments after the ;, we can get some hints about where the functions may live. Symbol stubs will tell us about code elsewhere. It looks like we are calling a particular function!
  - Do you notice any if statements (from the jumps)? We can see a compare followed by a conditional jump, as well as an unconditional jump.
  - Do you notice the mix of 64-bit and 32-bit? There are movq instructions mixed in with the 32-bit ones!
Set a breakpoint at main with br set -n main. Then, you can run this code using the run command, but don’t forget to give an argument! You can then step through the code. At each step:
- lldb will show you the upcoming instructions (what will be executed next is at the arrow).
- You can see where you are in the entire frame with the di -f command. Remember, the next instruction to be executed will have an arrow next to it.
- You can view the contents of any register using the reg read command. We can look at the contents of rax with reg read rax.
- You can view the contents of memory using the me rea -s8 -c1 -fx ADDRESS command. Remember that if the value is 64-bit (look for the q), then we need 8-byte values (s8). You may need to view the contents of a register then do some math to get the address (see step 7 in the debugger example). Notice that even this assembly is doing a lot of what we saw in our own Toy ISA, including moving values from register to memory so that the registers can be used for other things.
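The address math mentioned in the last bullet can be done with shell arithmetic in a second terminal. This is a minimal sketch: the rbp value and the 8-byte displacement are hypothetical, standing in for whatever reg read and the disassembly show in your own session.

```shell
# Hypothetical value, as if copied from the output of: reg read rbp
rbp=0x00007ffeefbff5a0
# Hypothetical displacement, as in an operand written -0x8(%rbp)
offset=8

# Shell arithmetic understands 0x-prefixed constants; print the result in hex.
address=$(printf '0x%x' $((rbp - offset)))
echo "$address"

# The matching lldb memory read: one 8-byte value, shown in hex.
echo "me rea -s8 -c1 -fx $address"
```

The second line of output is the command to type at the lldb prompt for that address.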
It looks like there is one local function in our code. Disassemble and view the contents of that function. Where is it storing the added values? Do you see a loop or an if statement? Run through this function, review the register values as it works, and see if you can find where things might go awry. Once you’ve found the issue, fix it in a hex editor, and double check using the debugger again.
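If you find yourself retyping the same opening commands each time you restart, they can be collected in a command file. The sketch below assumes lldb's -b (batch) and -s (source a command file) options and uses a hypothetical file name, runsum-session.lldb; the commands inside are the ones from this lab, with 4 as the example argument. When lldb or ./runsum is not present, it only prints the file.

```shell
# runsum-session.lldb is a hypothetical file name chosen for this sketch.
cat > runsum-session.lldb <<'EOF'
br set -n main
run 4
di -f
reg read rax
EOF

# Replay the commands in batch mode when lldb and ./runsum are available;
# otherwise just show what would be run.
if command -v lldb >/dev/null 2>&1 && [ -x ./runsum ]; then
    lldb -b -s runsum-session.lldb ./runsum
else
    cat runsum-session.lldb
fi
```

Batch mode exits when the commands finish, so this suits a quick snapshot at the breakpoint; use an interactive session for stepping.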
In order to receive full credit for this lab, please show or discuss with your lab TA:
- Your working code for runsum
- How you used the debugger to solve the problem and/or a few simple debugger actions while running it
[1] If you have issues running lldb, it could be that you did not run the script from Lab 1. You may load the module containing lldb on the portal by first running the command module load clang-llvm.