Assignment: RUST
Contents
Changelog:
- 27 March 2025 10:30pm: Fix bugs in supplied code: not accepting correct numbers of arguments for some commands; not displaying second student associated with submisison correctly;
- 27 March 2025 10:30pm: Correct description of support commands to match program; add example transcript
- 27 March 2025 11:10pm: Reorganize hints to deemphasize smart pointer support that is probably not useful for this assignment as released
- 27 March 2025 11:20pm: Correct description of use-after-free bug
Your Task
-
Download the Rust program at link here. See below regarding compiling and running the program.
The supplied program makes extensive use of
unsafe
, which should not be necessary. (It also, incidentally, has a use-after-free bug.)Fix the program to avoid using unsafe and avoid leaking memory, but support the same command-line interface.
-
Create a
readme.txt
describing briefly your changes and whether you believe they have impcated the performance of the program. -
Submit your modified .rs files and readme.txt to the submission site.
Compiling + running the program
-
You will need to install the Rust compiler and
cargo
. Probably the easiest way to do this is rustup. -
After extracting the .tgz file, run
cargo build
in the cs4630-rust-hw directory to (re)compile the program. -
To run the program, either run
cargo run
or run the executable file intarget/debug/cs4630-rust-hw
. -
When run, the program has accepts commands from its input as documented in the next section.
What the program does
This Rust program implements an in-memory data structure that tracks information for a hypothetical University class with one assignment which is submitted by partners.
- students, identified by an ID number and a separate SIS ID number
- a list of graded submissions
- a list of ungraded submissions
Each submission is associatied with two students.
The program provides a command line interface including the following commands:
- help [or any unrecognized comnmand]
- show — list all students and all submissions.
- new-student ID SIS_ID — create a new student with ID number ID and SIS ID number SIS ID
- new-submission STUDENT-ID1 STUDENT-ID2 — create a new submission with time stamp ID associated with the student with ID STUDENT-ID. The submission is added to the list of ungraded submissions.
- grade-next GRADE — assign GRADE (integer) to the next submission to be graded, removing it from the list of ungraded submissions and adding it to the list of graded submissions:
- drop-student ID — remove the student with id ID from the class When the student has a submission, the original implementation of this results in a use-after-free bug.
- exit — terminate the program
See below for an example usage session (including demonstrating a use-after-free problem).
Data structues in the program
#[derive(Debug)]
pub struct Student {
id: u32,
sis_id: u32,
}
#[derive(Debug)]
pub struct Submission {
grade: Option<u32>,
student1: *mut Student,
student2: *mut Student,
}
#[derive(Debug)]
pub struct Class {
students: BTreeMap<u32, *mut Student>,
graded_submissions: VecDeque<*mut Submission>,
ungraded_submissions: VecDeque<*mut Submission>,
}
-
These struct definitions all have
#[derive(Debug)]
so you can output a structs
’s values for debugging using something likeprintln!("s contains {}", s);
orprintln!("s contains {s}");
. -
*mut Student
and*mut Submission
are raw pointers to Student and Submission respectively. To use them, the Rust code needs to useunsafe
.Note that in this design, there can be multiple pointers to a Student object. This violates the object “ownership” idea we discussed where each object generally has a single owner. Thus, we cannot trivially replace these raw pointers with references. If we tried to replace the raw pointers with raw instances of the
struct
, then we would have the issue that we’d be storing multiple copies of Student objects for the same student ID. Updates to one of them would not be reflected in the others.
Some Rust usage notes
Documentation generally
- You can find documentation for Rust’s standard library at https://doc.rust-lang.org/stable/std/index.html and more general documentation for Rust at https://doc.rust-lang.org/stable/.
Debugging prints
-
We’ve marked the structs we use with
#[derive(Debug)]
.This means that, for example, given a
Student
objects
you can output it for debugging with something likeprintln!("s = {:?}", s);
println!
works similarly to printf.{:?}
repreents to take an argument and output in the debugging format.
Option types and pattern matching
-
Rust does not have a
null
, so to represent an object of type T or nothing Rust provides a specialOption<T>
type. Given some variablex
of this typex
can beNone
or can beSome(y)
wherey
is a T. -
In the code we supply, we often need to extract the value from an
Option<T>
if it’s a Some(,,,) or print an error otherwise. One way we do this is with pattern matching syntax. Given anx
that is eitherNone
orSome(...)
match x { Some(y) => f(y), None => g(), }
runs
f
on the contents of the Option if it’s non-empty (a Some) and runsg
otherwise.Rather than explicitly writing out None, we can also use an
_
to match the pattern of “anything else”. -
Often when accessing a value from a container we will end up with an
Option<&T>
— that is a value which either represents a reference to a T or represents nothing.If we want an actual copy of that value, we might need to further dereference it, as in:
match x { Some(y) => { let copy = *y; f(copy) } None => {}, }
Using VecDeque
-
VecDeque is a double-ended list (similar to java.util.Deque in java). It supports efficiently adding or removing elements at the front and back and is implemented using arrays.
-
Like Java, Rust supports generic types, so a
VecDeque<T>
contains objects of typeT
. -
The
front()
method (andback()
method) returns anOption<&T>
. Unlike Java, we cannot returnnull
, so we use the Option type to represent the possibility that the container is empty. But like Java, we get a reference to the actual object stored in the container. -
The default version of methods to read elements of the deque return non-mutable references. If we want to modify a value stored in the deque we need to use mutable versions. For example, instead of
front
we could usefront_mut
.
Using BTreeMap
-
BTreeMap is a map implemented with a balanced binary tree (similar to java.util.TreeMap in Java).
Since it is kept as a balanced tree, the map is automatically sorted by its key.
-
Like with VecDeque, BTreeMap is a generic type so BTreeMap<T, U> is a map containing keys of type T and values of type U.
-
Also like VecDeque, most methods that read values in the map return
Option
types (since null values are not possible) and return references to the values in the container.
Some possible strategies
Having a single reference to Students and Submissions
-
If Submission objects did not pointers or references to Students, then normally, the only thing holding Student objects would be the Class struct.
If so, we could change the Class struct to hold Student objects in a way which marks them as owned by the Class struct.
One option would be to have the Student objects stored directly in the BTreeMap:
pub struct Class { students: BTreeMap<u32, Student>, ... }
The reason the code does not start out this way is that we can’t do something like:
pub struct Submission { ..., student1: &Student, student2: &Student, } ... submission.student1 = class.students.get(&12345); submission.student2 = class.students.get(&23456);
unless the compiler can prove that
class.students
won’t be modified beforesubmission
goes out of scope. Otherwise, the compiler needs to be concerned that the student1 and student2 objects will be deleted or moved, making the references invalid.In our code, the compiler definitely could not prove that the student refernces are not moved/deleted, because they would be for some sets of commands we could run.
-
To avoid having Submission objects reference students, we could instead have them refer to students by their ID numbers.
Using Rc
-
Rather than replace pointers by ID numbers to lookup in a single table, we could use Rust’s smart pointers. Most likely in this case we’d use rust’s reference counting support.
This allows for mutliple shared references to the same object, with use-after-free prevented by keeping a count of reference.
-
Rust’s Rc, part of the std::rc module provides “smart pointers” to an object that implement reference counting.
In this scheme, a “reference count” is kept alongside each object. Whenever a new “smart pointer” to an object is made, that count is incremented; whenever one is freed, that count is decremented. The object is freed when the count becomes zero.
-
Reference counted “smart pointers” use the type Rc
, which can be used like a normal read-only reference &T. -
Most releevantly the
std::rc
module provides the following functions:- Rc::new(object) — allocate space for
object
and its reference count and moveobject
to that space. Then return an Rc"smart pointer" to that space and set the reference count to 1. When the smart pointer goes out of scope, the compiler will call `drop()` on the Rc object, which will decremet the reference count and free the space if the count became zero. - Rc::clone(&rc) — given an Rc
"smart pointer" `rc`, create a new reference, incrementing the reference count.
- Rc::new(object) — allocate space for
-
For this assignment, just Rc alone is probably sufficient. But, for slightly more complex tasks, using just Rc alone has some restrictions that we can workaround with additional tools:
- it leaks memory when there are cyclic references (objects that point to an object that points to the original object). If necessary, we can deal with this using weak references as described below.
- it doesn’t allow for changing the shared object. If necessary, we can deal with this with RefCell, which implements a runtime-checked version of Rust’s borrowing rules
Example session
$ cargo run
Compiling cs4630-rust-hw v0.1.0 (/u/cr4bd/spring2025/cs4630/hw/rust/cs4630-rust-hw)
Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.08s
Running `target/debug/cs4630-rust-hw`
> new-student 1 10000
Added student #1
> new-student 2 20000
Added student #2
> new-student 3 30000
Added student #3
> new-student 4 40000
Added student #4
> new-student 4 40000
Added student #4
> new-student 5 50000
Added student #5
> new-submission 1 3
Created submission
> new-submission 1 2
Created submission
> new-submission 1 4
Created submission
> new-submission 2 5
Created submission
> new-submission 4 5
Created submission
> show
Student#1 [SIS ID#10000]
Student#2 [SIS ID#20000]
Student#3 [SIS ID#30000]
Student#4 [SIS ID#40000]
Student#5 [SIS ID#50000]
Graded submissions:
Ungraded submission [next-to-grade last]:
Submission for Student#4 [SIS ID#40000] and Student#5 [SIS ID#50000] [ungraded]
Submission for Student#2 [SIS ID#20000] and Student#5 [SIS ID#50000] [ungraded]
Submission for Student#1 [SIS ID#10000] and Student#4 [SIS ID#40000] [ungraded]
Submission for Student#1 [SIS ID#10000] and Student#2 [SIS ID#20000] [ungraded]
Submission for Student#1 [SIS ID#10000] and Student#3 [SIS ID#30000] [ungraded]
> grade-next 90
> grade-next 40
> grade-next 30
> show
Student#1 [SIS ID#10000]
Student#2 [SIS ID#20000]
Student#3 [SIS ID#30000]
Student#4 [SIS ID#40000]
Student#5 [SIS ID#50000]
Graded submissions:
Submission for Student#1 [SIS ID#10000] and Student#3 [SIS ID#30000] grade 90
Submission for Student#1 [SIS ID#10000] and Student#2 [SIS ID#20000] grade 40
Submission for Student#1 [SIS ID#10000] and Student#4 [SIS ID#40000] grade 30
Ungraded submission [next-to-grade last]:
Submission for Student#4 [SIS ID#40000] and Student#5 [SIS ID#50000] [ungraded]
Submission for Student#2 [SIS ID#20000] and Student#5 [SIS ID#50000] [ungraded]
> drop-student 4
Removed student '4'
> show
Student#1 [SIS ID#10000]
Student#2 [SIS ID#20000]
Student#3 [SIS ID#30000]
Student#5 [SIS ID#50000]
Graded submissions:
Submission for Student#1 [SIS ID#10000] and Student#3 [SIS ID#30000] grade 90
Submission for Student#1 [SIS ID#10000] and Student#2 [SIS ID#20000] grade 40
Submission for Student#1 [SIS ID#10000] and Student#1433903432 [SIS ID#5] grade 30
Ungraded submission [next-to-grade last]:
Submission for Student#1433903432 [SIS ID#5] and Student#5 [SIS ID#50000] [ungraded]
Submission for Student#2 [SIS ID#20000] and Student#5 [SIS ID#50000] [ungraded]
> exit
When just Rc isn’t enough…
reference counting cycles using weak references
-
A problem with reference counting is that it leaks memory when there are cycles unless other measures are taken. For exampe, if A and B reference each other, then A will always have a reference count of at least 1 as long as B exists, and B will always have a reference count of at least 1 as long as A exists. As a result, neither will ever be freed by having their reference count reach zero.
-
One solution to this issue would be to remove the references entirely from one of the objects in the cycle, tracking that information in some other way.
-
Alternately, std::rc has a notion of “weak references”, which do not increment the reference count. This means that the reference count can go to zero, causing the object to be freed, while there is still a weak reference. To keep this from resulting in a use-after-free bug, dereferencing a weak reference checks whether the object has been freed. (This means that there is some additional bookkeeping information kept when an object is freed, but there is still a weak reference to it.)
To create a weak reference one can called Rc::downgrade(&rc) where
rc
is an Rc object:let weak_ref: Weak<T> = Rc::downgrade(&rc);
To use a weak reference one uses
upgrade()
as inlet real_ref: Rc<T> = weak_ref.upgrade().expect("object was freed");
In the example above
upgade()
returns anOption<>
object, which either contains anRc
or isNone
..expect("object was freed")
extracts the Rc object if possible and otherwise crashes with the error message “object was freed
”.
one reference at a time (RefCell)
-
Rust’s RefCell, part of the std::cell module provides a “smart pointer” to an object that implements “dynamic borrowing”.
The RefCell provides two important methods:
borrow_mut()
— get a temporary mutable reference to the object, but first verify that there are no other references to the object in scope. With help from the compiler, the RefCell will automatically track when the temporary reference goes out of scope.borrow()
— get a temporary read-only reference to the object, but only if there are no temporary mutable references to the object in scope. With help from the compiler, the RefCell will keep track of when this reference goes out of scope.
In order to track the whether there are currently references, a RefCell internally consists of a “flag” value, which is changed as “borrowed” references go in and out of scope, and a pointer.
-
To give an example of a use for RefCell, this code will not compile because the compiler cannot prove that
yes
andno
are not modified while being accesssed elsewhere:let mut yes: String = String::from("YES"); let mut no: String = String::from("NO"); let answer_1: &String = &yes; let answer_2: &String = &yes; let answer_3: &String = &no; println!("Answer #1: {}; Answer #2: {}; Answer #3: {}", answer_1, answer_2, answer_3); // output: Answer #1: YES; Answer #2: YES; Answer #3: NO // Compile error: `yes` is borrowed by `answer_1`, `answer_2`, cannot modify since still // needed for println! below yes.push_str("!"); // Compile error: `no` is borrowed by `answer_3`, cannot modify since still needed for // println!() below no.push_str("!"); println!("Answer #1: {}; Answer #2: {}; Answer #3 {}", answer_1, answer_2, answer_3); // intended output: Answer #1: YES!; Answer #2: YES!; Answer #3: NO!
We can make the code work as intended using a RefCell
let yes: RefCell<String> = RefCell::new(String::from("YES")); let no: RefCell<String> = RefCell::new(String::from("NO")); let answer_1: &RefCell<String> = &yes; let answer_2: &RefCell<String> = &yes; let answer_3: &RefCell<String> = &no; println!("Answer #1: {}; Answer #2: {}; Answer #3: {}", answer_1.borrow(), answer_2.borrow(), answer_3.borrow()); // Answer #1: YES; Answer #2: YES; Answer #3: NO { let mut yes_str = yes.borrow_mut(); yes_str.push_str("!"); let mut no_str = no.borrow_mut(); no_str.push_str("!"); } println!("Answer #1: {}; Answer #2: {}; Answer #3 {}", answer_1.borrow(), answer_2.borrow(), answer_3.borrow()); // Answer #1: YES!; Answer #2: YES!; Answer #3: NO!
-
RefCells provide a “workaround” for Rust’s immutablity rules. Notice in the exame above the
yes
andno
RefCells are not marked mutable withmut
, but we can stillborrow_mut()
from them. -
We can combine RefCell with Rc to create a shared reference to a mutable object, where the library checks that no one tries to write the object while another reference is being used to read it:
let answer_1: Rc<RefCell<String>>; let answer_2: Rc<RefCell<String>>; { let yes: Rc<RefCell<String>> = Rc::new(RefCell:new(String::from("YES"))); answer_1 = Rc::clone(&answer_1); answer_2 = Rc::clone(&answer_2); yes.borrow_mut().push_str("!"); } // original yes out of scope, but that's okay because answer_1, answer_2 still in scope println!("1: {}; 2: {}", answer_1.borrow(), answer_2.borrow());