Rust

Takeaways

Key Points:

  • [ ] What is this course about and what will be expected of you?
  • [ ] Why are we using Rust for this course?
  • [ ] How is Rust similar to and different from languages you've used before?
  • [ ] How do we define new data types in Rust?
  • [ ] How do we define behavior on data types in Rust?
  • [ ] How do we run Rust code?

About this Course

Highlights from the syllabus:

After taking this course, students should be able to:

  1. Evaluate game engines based on criteria like performance, ease of use, and flexibility;
  2. Create their own game engines from scratch;
  3. Build games using game engines;
  4. Analyze existing games to make educated guesses at how they were made; and
  5. Make targeted measurements of software performance issues, grounded in an understanding of computer hardware.

Materials

All required course materials are available on the course webpage.

Course discussions will take place on Canvas.

Lectures

The lecture notes and Miro boards linked from the course homepage serve as our "textbook". In order for the class to be useful, I will rely on you to have read the material before class starts—this way we can have useful discussions, answer good questions, and dive right in to the lab work.

Lecture attendance is strongly encouraged but not required. This is important time for teams to share notes with each other and playtest each other's games, as well as to work together on labs.

Labs

The first several weeks of lectures are accompanied by a brief in-class lab assignment, credit for which is granted through a submission on GitHub. I may ask you to come to office hours to check in the assignment for full credit.

The purpose of labs is to give you tools and techniques to use in your game projects, so please take them seriously! If you can't complete them before the scheduled due date, get in touch with me and we'll figure something out.

Major Assignments

This course is split into two units: a lab section and a project section. In the project section, you'll be in a pair or triple for the second half of the course and will create three games sharing the same underlying game engine (which you will also create).

During the second half of the course, each team will also develop a tech talk on a game-related topic of their choosing.

Grade Breakdown

Tech Talk (and responses): 10%
Labs: 40%
Projects: 50%

These are all basically effort grades! If you can communicate to me what you tried and show that you've worked through some hard problems, you will be fine. We will talk more about these categories and expectations as we get closer to deadlines.

So, like, don't freak out if it's hard. It's good if it's hard and I am totally, 100% supportive of you struggling and improving. This class is a safe place to get wrong answers.

Deadlines

Projects in this class have due dates because we need to share our work for effective feedback from our peers. If a due date for deliverables is difficult to reach for any reason, reach out to me as soon as possible and we can find an accommodation.

Typically, labs are due on Sunday nights.

Teamwork Issues

Grades for the project are assigned based on the turn-ins at the end of the project. Every week during the project section, we'll have a check-in with a private Canvas assignment for each student to post their experiences over the last week and plans for the next week, so we can try to catch teamwork issues quickly.

I am happy to help teams work through issues like these throughout the semester. Team-based project work is new for many of you!

Getting Rust

The best way to manage a Rust toolchain on your computer is rustup (https://rustup.rs). Stable Rust is updated every six weeks, so it's very helpful to have an easy way to manage your Rust installation. Install rustup, which should also install Rust and cargo (our build tool and package manager).

On macOS, you'll probably need to install the Xcode Command Line Tools.

On Windows, you'll need to install at least the Visual Studio Build Tools (please be sure to select the options for the C and C++ build tools if you have the choice).

In particular, since we're making graphical programs you should probably not try to run Rust in WSL on Windows. It might work, but I don't know for sure how well it will go.

Once Rust is installed, these commands should Just Work and give you a simple "Hello, World" project.

$ cargo new --bin test_project
$ cd test_project
$ cargo run

Editing

Rust has excellent IDE support, either through JetBrains's new Rust IDE or through the language server protocol and the Rust language server rust-analyzer (Visual Studio Code, Emacs, and (neo)Vim can all use LSP). If you want help setting up Emacs I'd be glad to share my configuration; the rustic package is a good one-stop-shop for this.

It is extremely important that you configure your editor to support these features:

  1. See the full output of rustc and clippy warnings and error messages (ideally on save or even while typing).
  2. Run rustfmt (cargo fmt) every time you save.
  3. Jump to definition—see where a function or type is defined.
  4. Get documentation for the thing where the cursor is.
  5. Run the code or tests with a keyboard shortcut.

These are really important for quality of life and to ease the learning curve. You want trying out code changes and looking at definitions in particular to be very fast.

Finally, it's a good idea to use version control in this class, especially for group work. Try to get comfortable with command-line git, even though VSCode and Emacs (magit or vc) both have very good integrations with git. I think the git parable is a great read to get used to the ideas underlying git; you can supplement this conceptual understanding with e.g. Pro Git.

Extreme Number Guess Challenge

Let's make our first Rust game, a hardcore game for serious gamers only. Find a cozy spot on your computer and run cargo new --bin --edition 2021 extreme-number-guess-challenge. This makes a new Rust project which will give us a runnable program, using the 2021 "edition" of the Rust language. Edit the created Cargo.toml file to look like so:

[package]
name = "extreme-number-guess-challenge"
version = "0.1.0"
authors = ["Joseph C. Osborn <joseph.osborn@pomona.edu>"]
edition = "2021"

[dependencies]
rand = "0.8.5"

Note that we've added a dependency on the rand crate, since generating random numbers is not part of Rust's minimal standard library. Instead of modifying Cargo.toml, we could have run cargo add rand at the command line within our project folder.

And now for our first bit of Rust code in src/main.rs:

// First we'll import a useful function and trait from rand...
use rand::{thread_rng, Rng};
// and the std::io module which will let us get user input.
use std::io;
// We need the Write trait so we can flush stdout and display a message immediately.
use std::io::Write;

// Every program needs an entry point; in Rust it's called `main`.
fn main() {
    // Generate a random number.  The type to generate is figured out from the annotation on `number`.
    // The range 0..10 excludes its upper bound, so `number` is one of 0 through 9.
    let number: usize = thread_rng().gen_range(0..10);
    // Print out a prompt...
    println!("I'm thinking of a number from 0 to 9...");
    print!("> ");
    // Flush so that the prompt is definitely all the way printed.
    io::stdout().flush().unwrap();

    // We'll need an owned String to read input into.
    // The "string literals" we used before were of type `&'static str`; the allocation
    // had a lifetime of the whole program and the memory exists in a section of the binary.
    // Runtime-allocated strings start out as type `String`.
    // Note that there's no type annotation on this one.
    let mut input = String::new();
    // Actually read a line---call the stdin() function from the io namespace,
    // which gives us a struct on which we can call read_line.
    // Rust's Reader API tries to be efficient by reusing allocations;
    // many of Rust's standard library functions leave the choice of when to allocate and how much to the caller.
    // So we pass in a mutable reference to our `input` variable.
    io::stdin().read_line(&mut input).unwrap();
    // But what's that "unwrap" about?  Well, reading the input could fail!
    // (It will fail in the somewhat unusual condition that the input is not valid UTF-8, among other possible reasons.)

    // Trim whitespace off of input.  Note that this `input` variable shadows the old one.
    // Its type is also different: &str instead of String.
    // If you look at https://doc.rust-lang.org/std/string/struct.String.html , you may notice
    // that `trim` is not a method of String; instead, it's on the type str.  Since String implements
    // the trait Deref<Target=str>, the dot-notation automatically dereferences the String into a str
    // and finds its `trim` method.
    // Whenever we use dot-notation in Rust, if the method to the right of the dot isn't found
    // in the type, we look at its traits; and then we see if it's possible to deref this type into
    // another type on which the method is found.
    // It's kind of complicated to explain, but it's very convenient for types that act as wrappers
    // of other types.
    let input = input.trim();

    // Format specifiers are {}-ish; {} itself uses the Display trait, similar to Python's __str__().
    println!("You guessed {}...", input);
    println!("I was thinking {}!", number);

    // Try removing this type annotation and see what happens!
    let guess: usize = input.parse().unwrap();

    // Finally, did the player win or lose?
    // Note that ifs don't require parentheses.
    if guess == number {
        println!("You got it!");
    } else {
        println!("You didn't get it!");
    }
}
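
The unwrap on parse() above will crash the game if the player types something that isn't a number. Here's a minimal sketch of one way to handle that instead; the Outcome enum and the judge function are names I made up for illustration (they aren't part of rand or the standard library), and the sketch leans on enums and match, which we'll cover below.

```rust
use std::cmp::Ordering;

/// Hypothetical result of checking one line of input against the secret number.
#[derive(Debug, PartialEq)]
enum Outcome {
    Invalid, // the input was not a non-negative whole number
    TooLow,
    TooHigh,
    Correct,
}

/// Compare the player's raw input with the secret number without panicking.
fn judge(input: &str, number: usize) -> Outcome {
    // parse() returns a Result; the annotation tells it which type to parse into.
    let parsed: Result<usize, _> = input.trim().parse();
    match parsed {
        Ok(guess) => match guess.cmp(&number) {
            Ordering::Less => Outcome::TooLow,
            Ordering::Greater => Outcome::TooHigh,
            Ordering::Equal => Outcome::Correct,
        },
        // Any parse failure (letters, a minus sign, empty input) lands here.
        Err(_) => Outcome::Invalid,
    }
}
```

In main you could call judge(&input, number) and print a different message for each Outcome.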

Activity: Mystery Tour

Find the online book Rust by Example. Browse around it and locate a source code snippet that seems mystifying, even after explanation. Then we'll get in groups and figure out our snippets together by looking at the Rust Book and the standard library documentation.

About Rust

Rust is a systems programming language originally developed at Mozilla as part of their project in more secure, stable browser technology. Several parts of Firefox are written in Rust now and a number of major companies are making investments in it including Microsoft and Apple, blah blah blah.

What's important about it for this class are these three things:

  • Strong typing, including control over the layout of structs in memory
  • Memory safety via declarative memory management (not a garbage collector)
  • As fast as C++ (sometimes faster, sometimes slower), without the baggage of C or the intense discipline required of a "modern" C++ programmer

The front page for the course has links to a variety of useful resources on Rust, so check them out as a supplement to this overview.

A quick note about mutability: bindings in Rust are immutable by default, so this is not allowed:

let v = Vec::new();
v.push("hello"); // won't compile!

While this is fine:

let mut v = Vec::new(); // Note mut keyword
v.push("hello"); // will compile!

Rust guarantees that you will never mistakenly update a value that has a possibility of being read in another location. In more formal terms, every value has either exactly one live mutable reference or potentially many live immutable references, but never a mix of both.


Types

Rust is based on an expressive type theory that allows for type inference within procedure bodies; function arguments and return types along with datatype definition members must be given explicitly, but local variables and anonymous function argument types can be inferred in most cases.

Numeric types

Rust has an extensive set of explicitly defined numeric types.

// 8--64-bit un/signed integers
let w:u8 = 0;
let x:u16 = 0;
let y = 0_u32;
let z:i64 = -1;
// un/signed "sizes", the type of pointer indices and offsets:
let a:usize = 0;
let b = -5_isize;
// floating point numbers in 32- and 64-bit flavors
let f:f32 = 0.0;
let g = 0.0_f64;

One area where Rust introduces a little friction is in doing arithmetic between different types of numbers; you can't add an i32 and a usize, or a u8 and an i8. This removes large classes of bugs (e.g., accidentally shifting a byte value by more than eight positions) but the lack of automatic promotions does introduce some noise, which you can handle in a few different ways:

let q = w as u16 * x;
// This often works too, guessing a type for _:
let q = w as _ * x;
// Or shadow w...
let w = w as u16;
let q = w * x;

Note that there's a big difference between that last example and this:

let mut w:u8 = 0;
w = w as u16; // Won't compile!
let q = w * x;

In particular, the former variation creates a new variable named w which hides the old w on subsequent lines. It can even be scoped:

let w = 17_u8;
let q = {
    let w = w as u16;
    w * x
};
// w is still a u8

And now w remains a u8 for the rest of the code.

While this doesn't occur in this particular case, casting with as can in general silently discard the high bits of a number. The safest approach is a conversion that avoids casting:

let q:u16 = u16::from(w) + x;

If our situation were a little different…:

let q:u8 = w+x;

Casting x to u8 with as would potentially lose data if x were greater than 255. The cast would compile but could lead to surprising outputs. If we were in the habit of using from, the code wouldn't compile:

let q:u8 = w+u8::from(x);
// The trait bound u8: std::convert::From<u16> is not satisfied

From and Into are used only for so-called infallible conversions. Their fallible counterparts, TryFrom and TryInto, return a Result so you can handle errors:

// let q: u8 = w + u8::try_from(x);
// ^ Whoops, won't compile since we can't add a number and a Result.
let q: Result<u8,_> = u8::try_from(x).map(|x| w + x);
// Instead we can propagate the error (if any) and get the result otherwise using map.
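
Putting the pieces together, here's a small sketch of a narrowing addition that refuses to lose data. The function name add_narrow is made up for this example; try_from and checked_add are standard library methods.

```rust
use std::convert::TryFrom; // already in the prelude in the 2021 edition

/// Add a u16 to a u8, returning None if `x` doesn't fit in a u8
/// or if the sum would overflow a u8.
fn add_narrow(w: u8, x: u16) -> Option<u8> {
    // try_from returns a Result; ok() turns it into an Option so that `?` works here.
    let x = u8::try_from(x).ok()?;
    // checked_add returns None on overflow instead of wrapping or panicking.
    w.checked_add(x)
}
```

Calling add_narrow(1, 2) gives Some(3), while add_narrow(1, 300) and add_narrow(200, 100) both give None.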

Tuples

Rust has heterogeneous tuple types like Python does:

let t1 = (1_i8, 2.0_f32, 0_u64);
// The type of t1 is (i8, f32, u64)
let t2 = (t1, 0_u8); // ((i8, f32, u64), u8)

We can get values out of tuples in a few ways:

let t1_1 = t1.1; // An f32 value, 2.0
let t2_0_0 = t2.0.0; // An i8 value, 1
// Destructuring assignment is even better:
let (a, b) = t2;
// Or even:
let ((a0, a1, _), b) = t2;

And we can write functions taking tuples as parameters:

// Note that a rust function or block evaluates to its last expression (look ma, no semicolon!)
fn distance_a(p1:(f32,f32), p2:(f32,f32)) -> f32 {
    ((p2.0-p1.0)*(p2.0-p1.0) + (p2.1-p1.1)*(p2.1-p1.1)).sqrt()
}
// But I think this can read nicer:
fn distance_b((x1,y1):(f32,f32), (x2,y2):(f32,f32)) -> f32 {
    ((x2-x1)*(x2-x1) + (y2-y1)*(y2-y1)).sqrt()
}
// You can always use temporaries too:
fn distance_c((x1,y1):(f32,f32), (x2,y2):(f32,f32)) -> f32 {
    let dx = x2-x1; // Semicolons end expressions by turning them into statements...
    let dy = y2-y1;
    (dx*dx+dy*dy).sqrt() // No semicolon here
}

Rather than tuples for points, though, we should probably define some custom data structure.

Structs

Custom data structures in Rust come in two flavors: tuple structs and named structs (properly, structs with named fields). First, tuple structs:

struct Vec2(f32, f32);
let v1 = Vec2(0.0, 0.0);
let v2 = Vec2(4.9, 9.4);
let v3 = Vec2(v1.0, v2.1);

We can implement methods on Vec2, which we can then call with namespace notation (for methods without a self parameter) or dot notation.

impl Vec2 {
    pub fn new(x:f32, y:f32) -> Self { // Self just means "the type I'm implementing"
        Self(x,y) // We can use it here too
    }
    // let v = Vec2::new(10.0, 10.0);
    pub fn magnitude(&self) -> f32 { // More on &self in the Memory and Ownership section
        distance_b((self.0, self.1), (0.0, 0.0))
    }
    // let mag = v.magnitude();
    pub fn magnitude_b(&self) -> f32 {
        let Self(x,y) = self; // We can destructuring-assign from structs too
        distance_b((*x,*y),(0.0,0.0)) // These * are related to the &self
    }
    // let mag_b = v.magnitude_b();
}

To see how we could implement subtraction of one Vec2 from another, read ahead to the section on Traits.

Note that the struct itself will be private to the module in which it's defined. If we wrote it like so:

pub struct Vec2(f32, f32);

Code in other modules would be able to call Vec2's pub functions but not use the tuple constructor, because the fields themselves are still private:

let v = Vec2::new(4.9, 9.4); // new is an "associated function"
let mag = v.magnitude(); // magnitude is a "method" since it takes a self parameter
let v2 = Vec2(0.0, 0.0); // won't compile outside the defining module!

To allow using the constructor, we'd want to make those two fields public as well.
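
Here's a sketch of the difference, using an inline module so it fits in one file. The module name geometry and the Angle type are made up for illustration.

```rust
mod geometry {
    // Both the struct and its fields are `pub`, so code outside this
    // module can construct a Vec2 directly and read its fields.
    pub struct Vec2(pub f32, pub f32);

    // This struct is `pub` but its field is private: outside this module
    // it can only be built through `new`.
    pub struct Angle(f32);

    impl Angle {
        pub fn new(radians: f32) -> Self {
            Self(radians)
        }
        pub fn radians(&self) -> f32 {
            self.0
        }
    }
}

fn visibility_demo() -> (f32, f32) {
    let v = geometry::Vec2(3.0, 4.0); // fine: public fields
    let a = geometry::Angle::new(0.5); // fine: public associated function
    // let b = geometry::Angle(0.5); // won't compile: the field is private
    (v.0 + v.1, a.radians())
}
```

Keeping fields private is handy when a type has rules to enforce; for a plain bag of numbers like Vec2, public fields are usually fine.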

Struct members (also called "fields") can also be named:

pub struct Vec2{
    pub x:f32,
    pub y:f32
}
impl Vec2 {
    pub fn new(x:f32, y:f32) -> Self {
        // Note curly braces; could also have been Self{x,y} since
        // field names and variable/param names line up
        Self{x:x,y:y}
    }
    pub fn magnitude(&self) -> f32 {
        distance_b((self.x, self.y), (0.0, 0.0))
    }
    pub fn magnitude_b(&self) -> f32 {
        let Self{x:sx,y:sy} = self; // We can destructuring-assign from named structs too!
        // let sx = self.x;
        // let sy = self.y;
        distance_b((*sx,*sy),(0.0,0.0)) // These * are to dereference through our input, &self
    }
}

And now we can both create and access members of Vec2 directly:

let mut v = Vec2{x:0.0, y:0.0};
v.x = 4.7;
v.y = 1.0;
// {} is a format specifier for the println! macro
println!("{},{} --> {}", v.x, v.y, v.magnitude());

Enums

Besides structs, Rust lets programmers define enumeration types. Variants of an enum are mutually exclusive and may also carry data in a way similar to tuple structs.

enum Day {
    Monday,
    Tuesday,
    Wednesday,
    Thursday,
    Friday,
    Saturday,
    Sunday
}
let d1 = Day::Monday;
let d2 = Day::Tuesday;

The main thing we can do with enums is discriminate based on their, uh, discriminant:

fn weekday(day:Day) -> bool {
    match day {
        Day::Monday => true,
        Day::Tuesday => true,
        Day::Wednesday => true,
        //... Don't skip any in your real code or it won't compile!
        Day::Saturday | Day::Sunday => false
    }
}

match is a very powerful expression and I suggest reading up on it in the Rust book or Rust By Example. A more compact version:

fn weekday(day:Day) -> bool {
    match day {
        Day::Saturday | Day::Sunday => false,
        // This "catchall" handles the rest
        _ => true
    }
}

But like I said, enums can hold information too (flashbacks to CSCI 054 PO are permitted but by no means required):

enum MovementType {
    Flying(f32), // wingspan in cm
    Walking(usize), // number of legs
    Wriggling
}

fn is_creepy(movement:MovementType) -> bool {
    match movement {
        MovementType::Flying(wing) => wing < 10.0,
        // You can put "guards" on match arms
        MovementType::Walking(legs) if legs > 4 => true,
        // But in this case it's probably not that helpful
        MovementType::Walking(_) => false,
        MovementType::Wriggling => true
    }
}

Or we could implement it on the type itself:

impl MovementType {
    fn is_creepy(&self) -> bool {
        match self {
            // Matching on &self binds wing and legs as references, so we dereference them
            MovementType::Flying(wing) => *wing < 10.0,
            // You can put "guards" on match arms
            MovementType::Walking(legs) if *legs > 4 => true,
            // But in this case it's probably not that helpful
            MovementType::Walking(_) => false,
            MovementType::Wriggling => true
        }
    }
}

Activity: Computational Geometry

Let's take a quick breather here and practice with our new type definition features.

  1. Define a Rectangle tuple struct with four f32 fields indicating its position (top-left corner) and size (width and height).
  2. Add an impl Rectangle block with a method fn translate(&mut self, x:f32, y:f32) which moves over the rectangle by the given x and y.
  3. Redefine Rectangle to be a struct with named fields and update translate accordingly.
  4. With a buddy: Can we rotate rectangles in this representation? Come up with an argument for how or why not, then come up with a special case and a function we can implement for certain rotations about the rectangle's top-left corner.

Generics and Traits

Rust supports generics in a manner similar to C++ templates or Haskell polymorphism and type classes.

Examples from the standard library include Vec<T> (a linear sequence of values, like a growable array) and Option<T> (Rust's answer to null: a distinct type implemented as an enum that may have Some(value) or None).

let mut v = vec![1_u8,2,3];
v.push(4); // "4.0" wouldn't have compiled!
match v.first() {
  Some(n) => println!("first:{}",n),
  None => () // do nothing, "unit", ...
}
// The below is the moral equivalent of the match above: if the destructuring assignment is permissible, do the body of the if
if let Some(n) = v.last() {
    println!("last:{}", n);
}

You can implement a polymorphic ("generic") struct like so:

struct Poly<T,U> {
    ts: Vec<T>,
    u:U
}

And even put bounds on them, e.g. that the things are totally ordered:

struct MinHeap<T:Ord> {
    // ...
}

When it's time to implement methods, the angle brackets return in a few positions:

impl Poly<i32, usize> {
    // ... implement poly for specific types i32 and usize
}
impl<T,U> Poly<T,U> {
    // ... implement poly for any T and U
}
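
As a concrete (if tiny) sketch of a bounded generic type with a generic impl block, here is a hypothetical Pair type; it's my own example, not something from the standard library.

```rust
// A hypothetical generic pair whose two elements can be compared.
struct Pair<T: Ord> {
    a: T,
    b: T,
}

// The bound is repeated on the impl so the methods may rely on it.
impl<T: Ord> Pair<T> {
    fn new(a: T, b: T) -> Self {
        Self { a, b }
    }

    // Returns a reference to the larger element (or `b` if they are equal).
    fn largest(&self) -> &T {
        if self.a > self.b {
            &self.a
        } else {
            &self.b
        }
    }
}
```

The same code works for Pair::new(3, 9) and Pair::new("pear", "apple"), since integers and string slices both implement Ord.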

Rust doesn't have inheritance, so instead we use traits and trait bounds (we saw Ord a bit earlier); often these can be implemented automatically:

// In order: Can be cloned, copied (cheaply), ordered (always), equated (always), and debug-printed
#[derive(Clone, Copy, PartialOrd, Ord, PartialEq, Eq, Debug)]
struct IntVec3 {
    // A little weird, but we'll use integer points here
    pub x:i64,
    pub y:i64,
    pub z:i64
}

We can also make our own traits, which are similar to Java interfaces and support default implementations of methods (not shown in this sample):

trait AABBBounded {
    fn origin(&self) -> IntVec3;
    fn extent(&self) -> IntVec3;
}

impl AABBBounded for Mesh {
    fn origin(&self) -> IntVec3 {
        IntVec3{
            // more about map soon.
            // The closure returns v.x by value (i64 is Copy), so min() gives an Option<i64>.
            x:self.vertices.iter().map(|v| v.x).min().unwrap(),
            y:self.vertices.iter().map(|v| v.y).min().unwrap(),
            z:self.vertices.iter().map(|v| v.z).min().unwrap(),
        }
    }
    // ...
}

Traits can also have generic type parameters, be implemented for different combinations of types, have associated types and constants, etc.
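
To make default methods and associated constants a little more concrete, here's a minimal sketch. The Shape trait and the Square and Triangle types are invented for this example.

```rust
trait Shape {
    // An associated constant that each implementor must supply.
    const SIDES: u32;

    // A required method: every implementor must define it.
    fn area(&self) -> f32;

    // A default method: implementors get this for free but may override it.
    fn describe(&self) -> String {
        format!("{} sides, area {}", Self::SIDES, self.area())
    }
}

struct Square(f32);

struct Triangle {
    base: f32,
    height: f32,
}

impl Shape for Square {
    const SIDES: u32 = 4;
    fn area(&self) -> f32 {
        self.0 * self.0
    }
    // No describe() here, so Square uses the default.
}

impl Shape for Triangle {
    const SIDES: u32 = 3;
    fn area(&self) -> f32 {
        0.5 * self.base * self.height
    }
    // Override the default implementation.
    fn describe(&self) -> String {
        format!("a triangle of area {}", self.area())
    }
}
```

Square(2.0).describe() falls back to the trait's default, while Triangle supplies its own text.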

Operator overloading

Traits are also how we implement operator overloading in Rust:

impl std::ops::Add for Vec3 {
    type Output=Self; // Vec3 + Vec3 = Vec3
    fn add(self, other:Self) -> Self::Output {
        Self { // Good habit to use Self wherever we can
            x:self.x+other.x,
            y:self.y+other.y,
            z:self.z+other.z
        }
    }
}

std::ops::Add is a built-in trait used after desugaring, as expressions like a + b turn into a.add(b). See the std::ops documentation for more examples.
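
Subtraction works the same way through std::ops::Sub. Here's a sketch for a two-dimensional vector with named fields; I've redefined Vec2 locally (with a few derives) so the example stands alone.

```rust
use std::ops::Sub;

// Copy lets us keep using both operands after subtracting them.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Vec2 {
    x: f32,
    y: f32,
}

impl Sub for Vec2 {
    type Output = Self; // Vec2 - Vec2 = Vec2

    fn sub(self, other: Self) -> Self::Output {
        Self {
            x: self.x - other.x,
            y: self.y - other.y,
        }
    }
}
```

With this in place, Vec2 { x: 4.0, y: 9.0 } - Vec2 { x: 1.0, y: 2.0 } evaluates to Vec2 { x: 3.0, y: 7.0 }.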

Iterators

One of the most expressive uses of the type system in Rust—and one maybe more comfortable to ML or Haskell programmers than C programmers—is iterators. The std::iter::Iterator trait is really important in Rust (as are the types Option and Result and Vec) so I recommend reading through the documentation when you can.

Iterators are chainable, lazy instructions for manipulating sequences. Mapping (from one sequence of values to another), filtering (generating a new sequence whose values pass some test), enumerating (pairing up iterated values with indices), reversing, and more are implemented lazily, so no intermediate collections are allocated. Even linear algorithms like finding the min or max of a sequence, or finding the position of an element in a sequence, are provided in the Iterator trait. In optimized builds, all this typically comes with no performance penalty compared to hand-written for-loops (!!!). Using the example from above…

self.vertices.iter().map(|v| v.x).min().unwrap()

iter() builds an iterator over (references to) the elements of self.vertices. map takes a function as its argument (in this case an anonymous function of one argument, v) which takes us from an iterator over &IntVec3 to an iterator over i64 values; since i64 is Copy, v.x is copied out of each vertex rather than borrowed. min() finds the least element of the iterator it's called on, but what if the iterator is empty? We've got to return something, so min() yields (in this case) an Option<i64>, which we summarily unwrap (assuming the mesh has at least one vertex). If the closure had returned a reference instead (|v| &v.x), min() would yield an Option<&i64> and we would dereference the unwrapped value with * to obtain an i64.
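
Here are a few more adapters in action, as a sketch on plain slices of integers. The function names are mine; filter, map, sum, position, copied, enumerate, and max_by_key are all methods from the Iterator trait.

```rust
// Sum of the squares of the even numbers in `xs`.
fn sum_even_squares(xs: &[i64]) -> i64 {
    xs.iter()
        .filter(|x| **x % 2 == 0) // keep only the even values
        .map(|x| x * x) // square each one
        .sum() // nothing runs until a consumer like sum() asks for values
}

// Index of the first negative number, if there is one.
fn first_negative(xs: &[i64]) -> Option<usize> {
    xs.iter().position(|x| *x < 0)
}

// Index and value of the largest element, or None for an empty slice.
fn argmax(xs: &[i64]) -> Option<(usize, i64)> {
    xs.iter()
        .copied() // turn &i64 items into i64 items
        .enumerate() // pair each value with its index
        .max_by_key(|(_, x)| *x) // compare pairs by value only
}
```

Note how the empty-sequence case shows up in the types: position and max_by_key both hand back an Option.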

Collect

Often we do want to build a new Vec (or even a HashSet or BTreeMap) out of a chain of iterators. The Iterator trait provides the delightful collect() method for this purpose:

let xs = self.vertices.iter().map(|v| v.x).collect();

Actually the above won't compile; we need to tell Rust what kind of thing to collect into:

// Note we don't need to specify the element type of the Vec!
let xs:Vec<_> = self.vertices.iter().map(|v| v.x).collect();
// This slightly gnarly syntax is called the "turbofish", avoid it if you can.  The previous line is much nicer.
let ys = self.vertices.iter().map(|v| v.y).collect::<Vec<_>>();

collect() is extremely versatile and type inference often is enough to figure out what should be collected into. You can collect an iterator over pairs into a HashMap, for example, or an iterator over Option<T> values into an Option<Vec<T>> (which is None if any element was None).
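
Both of those can be sketched in a few lines. The function names below are made up for the example, and in each case the return type is what tells collect() which container to build.

```rust
use std::collections::HashMap;

// Collect an iterator over (key, value) pairs into a HashMap.
fn name_lengths(names: &[&str]) -> HashMap<String, usize> {
    names.iter().map(|n| (n.to_string(), n.len())).collect()
}

// Collect an iterator over Option<u32> into an Option<Vec<u32>>:
// Some(vec) if every word parsed, None as soon as one of them didn't.
fn parse_all(words: &[&str]) -> Option<Vec<u32>> {
    // I use the turbofish here only to be explicit about the parsed type.
    words.iter().map(|w| w.parse::<u32>().ok()).collect()
}
```

The same trick works with Result: an iterator over Result<T, E> can be collected into a Result<Vec<T>, E>.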

Memory and Ownership

Rust isn't garbage-collected, but you'll basically never see the equivalents of malloc and free. Rust uses strict ownership rules to ensure that memory is never freed twice, that memory is never mutated through multiple different bindings, that memory clearly lives either on the stack or the heap, and that data races between multiple threads are prevented statically, at compile time.

This means some programs that are technically valid might be rejected; but the trade-off is that every accepted program is definitely sound (and to be honest, programs that are easier for the compiler to understand are likely easier for a reader to understand as well).

This section won't get too deep. For more detail and a much better explanation please read the relevant chapters of the Rust Book.

Single-owner rule

In Rust, every value is owned by a single binding. When we reassign a binding (e.g. a variable) the old value is dropped and the new one is moved in:

{
    // File::open returns a Result; unwrap it to get at the file handle itself.
    let mut file = std::fs::File::open("file1.txt").unwrap();
    // ... do some stuff with file ...
    file = std::fs::File::open("file2.txt").unwrap();
    // The old file handle is dropped and closed!
} // Now the new file handle is dropped and closed too!

That's dropping for you. Note here it's literally impossible to forget to close a file or deallocate memory accidentally.
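
If you want to watch drops happen, you can implement the Drop trait yourself. This is a sketch with a made-up Tracked type that writes its name into a shared log when it is dropped; the RefCell is only there so that several values can append to the same log.

```rust
use std::cell::RefCell;

// A hypothetical value that records its own destruction.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Tracked<'a> {
    // Rust calls this automatically when the value is dropped.
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let mut current = Tracked { name: "first", log: &log };
        // Reassigning the binding drops the old value right away...
        current = Tracked { name: "second", log: &log };
        log.borrow_mut().push("end of block");
    } // ...and the new value is dropped here, when `current` goes out of scope.
    log.into_inner()
}
```

Calling drop_order() returns the entries "first", "end of block", "second", in that order. (The compiler will warn that the first value of current is never read; that's expected in this toy.)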

Moving is a bit more subtle:

let v = vec![1,2,3];
let v2 = v;
println!("{}", v.len()); // Whoops, moved out of v, won't compile

Methods and functions might also use up (or "move out of") a binding:

let v = vec![1,2,3];
for num in v { // implicitly, v.into_iter()
  println!("{}",num);
}
let n = v[0]; // Uh-oh

This is because the IntoIterator trait defines into_iter() as:

fn into_iter(self) -> Self::IntoIter;

(Where IntoIter is some iterator type, that's not the key detail here.)

Note that this method takes its self parameter as self and not &self; that is, it takes an owned self (or it takes self by move) rather than a reference to self (or by reference). Here's another example:

impl Liquid {
  pub fn mass(&self) -> f32 {
    // ...
  }
  pub fn density(&self) -> f32 {
    // ...
  }
  pub fn freeze(self) -> Solid {
    // ...
  }
  pub fn evaporate(self) -> Gas {
    // ...
  }
}

You can call mass and density as much as you like, but once you call freeze or evaporate the original Liquid is used up and moved out of—the self parameter is the owning binding, which is then dropped at the end of the method.
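
Here's a runnable sketch of the same idea. The mass_kg fields and the method bodies are assumptions I filled in for illustration. The second function shows the usual fix for the for-loop problem above: iterate over a reference so the Vec isn't moved.

```rust
struct Liquid {
    mass_kg: f32,
}

struct Solid {
    mass_kg: f32,
}

impl Liquid {
    // Borrows self: the Liquid is still usable afterwards.
    fn mass(&self) -> f32 {
        self.mass_kg
    }
    // Takes self by move: the Liquid is consumed and a Solid comes back.
    fn freeze(self) -> Solid {
        Solid { mass_kg: self.mass_kg }
    }
}

fn freeze_demo() -> f32 {
    let water = Liquid { mass_kg: 2.0 };
    let before = water.mass(); // borrow; water is still ours
    let ice = water.freeze(); // move; water is gone now
    // water.mass(); // won't compile: use of moved value `water`
    before + ice.mass_kg
}

// Iterating over &v only borrows the Vec, so it's still usable afterwards.
// (Indexing v[0] panics on an empty Vec, so pass a non-empty one.)
fn sum_then_first(v: Vec<i32>) -> (i32, i32) {
    let mut total = 0;
    for num in &v {
        // implicitly (&v).into_iter(), which yields &i32 items
        total += num;
    }
    (total, v[0]) // fine: v was never moved out of
}
```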

References and borrowing

The last vital concept for understanding basic Rust memory ownership is the borrow. A value is owned by one binding, but it's useful if that binding can loan it out to other bindings for a while. We call these temporary loans borrows and pass them around via references. References are a bit like pointers, except they are guaranteed to be valid. They also have this extremely important guarantee:

Every value has, at any given location in the code, either:

  1. No borrows
  2. One or more shared, immutable references (&)
  3. Exactly one exclusive, mutable reference (&mut)

The upshot is that if you have a value, you know that it is impossible for any other code to change the state of the value you have. You'll never have aliasing bugs and you'll never forget to copy a string just in case the code that sent it to you modifies it later.
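
A small sketch of those rules in practice (the function names are mine):

```rust
// Shared borrow: we can read but not modify.
fn total(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

// Exclusive borrow: we can modify the elements in place.
fn double_all(xs: &mut [i32]) {
    for x in xs.iter_mut() {
        *x *= 2;
    }
}

fn borrow_demo() -> (i32, usize, i32) {
    let mut v = vec![1, 2, 3];
    let a = &v;
    let b = &v; // any number of shared borrows may coexist
    let before = total(a);
    let count = b.len();
    // a and b aren't used past this point, so their borrows have ended
    // and an exclusive borrow is allowed.
    double_all(&mut v);
    // Using `a` or `b` down here would make the call above a compile error.
    (before, count, total(&v))
}
```

borrow_demo() returns (6, 3, 12): the total before doubling, the element count, and the total after.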

(Rust does support internal mutability through types like Cell and RefCell; a Python or Java-style object reference is something like a Rust Rc<RefCell<T>>, or Arc<Mutex<T>> if it is shared across threads. The gnarly type tells you it's not a thing you use often in Rust.)

Lifetimes

Borrows can't outlive the thing they're borrowing from. Lifetimes are a part of the types of references (and of types whose members include references) and make up a kind of stack of constraints about what can refer to what. If you've ever returned a pointer to a value on the stack in C and been baffled by the result, lifetimes exist to prevent that type of problem and more.

Lifetimes are purely compile-time constructs and don't exist after compilation. There is no universal reference counting or garbage collector or anything, and there's no way to "extend" a lifetime: it is what it is.

It's also good to remember that lifetimes generally have to do with things on the stack. Heap-allocated data like the contents of a Vec or a Box will get dropped when their holder does because of their respective Drop implementations, but if you borrow a value from such a container's memory then the lifetime is tied to the container itself, which (eventually) lives somewhere on the stack.

Lifetime annotations are tricky, especially if you start trying to put references into structs. If you need to write them, consider using Rc or some other strategy instead until you're more comfortable with Rust's rules.
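
For when you do run into them, here's a minimal sketch of what the annotations look like. The longer function and the Highlight struct are made up for illustration.

```rust
// The returned reference borrows from one of the two inputs, so the
// signature says all three references are valid for the same lifetime 'a.
fn longer<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() {
        a
    } else {
        b
    }
}

// A struct that holds a reference needs a lifetime parameter too:
// a Highlight can't outlive the text it points into.
struct Highlight<'a> {
    text: &'a str,
}

impl<'a> Highlight<'a> {
    // The result borrows from the original text, not from the Highlight itself.
    fn first_word(&self) -> &'a str {
        self.text.split_whitespace().next().unwrap_or("")
    }
}
```

Nothing here changes how long anything lives; the annotations just describe relationships the compiler then checks.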

Vecs and slices, strings and string slices

Vecs (Vec<T>, vec![1,2,3]) are contiguous, growable, heap-allocated regions of memory. Arrays ([u8; 4], [1,2,3,4]) are contiguous, fixed-size regions of memory stored inline (on the stack, for a local variable). But in many cases, a reference to one is as good as a reference to the other: you can index into it, get an iterator from it, etc. We call a "reference to a contiguous memory region" a slice and its type is &[T] or &mut [T]. So if you want to know what you can do with a Vec, you probably want to read the documentation for Vec, std::slice, and std::iter::Iterator. It wouldn't hurt to look through the traits that Vec implements too!
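
In practice this means functions should usually take slices rather than Vecs. A sketch, with function names of my own choosing:

```rust
// Takes a slice, so it works on Vecs, arrays, and parts of either.
fn mean(xs: &[f32]) -> Option<f32> {
    if xs.is_empty() {
        return None; // no sensible mean for an empty slice
    }
    let total: f32 = xs.iter().sum();
    Some(total / xs.len() as f32)
}

// A mutable slice lets us change elements in place, but not grow or shrink.
fn zero_out(xs: &mut [f32]) {
    for x in xs.iter_mut() {
        *x = 0.0;
    }
}
```

You can call mean(&v) on a Vec<f32>, mean(&arr) on an array, or mean(&v[1..]) on just part of a Vec, all with the same function.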

Strings (String) are contiguous, growable, heap-allocated UTF-8 sequences. String slices (&str) are contiguous, fixed-size UTF-8 sequences which might be references to String values or might be references to str values on the stack or in the data segment of our compiled binary. (We don't have any way to make str values directly, since they don't have their size as part of their type.) Rust makes the commitment that str and String always represent valid UTF-8 data; that means that other encodings and arrangements of bytes need different types like CString/CStr, OsString/OsStr, Vec<u8>/&[u8], and so on. As it turns out, C has all of these types too but it calls them all char *.

You can get pretty far with just String and &str; the compiler will tell you when you need to convert from one to the other.
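
The conversions you'll reach for most often look like this (a sketch; shout and string_demo are made-up names):

```rust
// Taking &str accepts string literals and borrowed Strings alike.
fn shout(s: &str) -> String {
    let mut out = s.to_uppercase(); // a new, owned String
    out.push('!'); // Strings are growable
    out
}

fn string_demo() -> String {
    let literal: &str = "hello"; // lives in the compiled binary
    let owned: String = literal.to_string(); // &str -> String (allocates)
    let borrowed: &str = &owned; // String -> &str (no allocation, via Deref)
    shout(borrowed)
}
```

A reasonable default is to take &str in parameters and return String when you've built something new.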

Copy and Clone

If you have a value that needs to live in two places, you'll need to clone it (if you actually need two references to the same exact data, consider using references or smart pointers like Rc; or use your own handle type). Any type implementing the Clone trait has a handy method clone() you can call. This will also be your move if you have an &T and need a T (and Rust won't let you "move out of" the reference).

Types for which a simple bitwise copy gives a true copy of the value (e.g. integers, floats, arrays of copyable objects) may implement the Copy trait (usually via a derive). Rather than being moved out, Copy types get copied to their new location. This means you won't need to call clone() on Copy types.
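
Side by side, with two made-up types:

```rust
// All fields are Copy, so the whole struct can be Copy.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// String is not Copy, so Player can only be Clone.
#[derive(Clone, Debug, PartialEq)]
struct Player {
    name: String,
    pos: Point,
}

fn copy_clone_demo() -> (Point, Player, Player) {
    let p = Point { x: 1, y: 2 };
    let mut q = p; // copied: p is still usable below
    q.x = 10;

    let hero = Player { name: String::from("Ada"), pos: p };
    // `let twin = hero;` would move hero; clone() makes an independent copy.
    let mut twin = hero.clone();
    twin.name.push_str(" II");
    (q, hero, twin)
}
```

Changing q or twin leaves p and hero untouched, since each is its own value.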

Building and Running

Rust can be compiled file by file, but it also comes with its own build system and dependency manager and Swiss Army knife: cargo. Cargo has its own book on the Rust website, so do take a look if you want to know more details about it.

Cargo.toml for project settings and dependencies

Rust projects each specify which dependencies they need to do their work. Since Rust has an intentionally small standard library, most projects will pull in at least one crate. You'll see some examples later on; for now, just remember that the special file Cargo.toml (uppercase C!) is how cargo knows you have a Rust project to build.

Cargo.lock will appear once you've built your code and stores information about the versions of dependencies you're using; generally this should be version-controlled for binary crates and not version-controlled for library crates.

src and entry points

Cargo expects all project code to live in a src directory next to Cargo.toml. If the project is a library crate (package) it should have a file src/lib.rs, and if it's a binary crate it should have src/main.rs. Either way, the entry point defines the module hierarchy of the crate using mod or (for libraries) pub mod statements, and sometimes use and pub use.

This point bears repeating: Rust won't automatically add every file in src to the build! Only modules rooted at the entry point will be included. A corollary of this is that each module should appear in a mod statement at most once---mod is for defining structure, use is for importing names. It's weird at first compared to e.g. Python, which doesn't distinguish between defining a module and importing it.
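
Here's a single-file sketch of the mod/use distinction, with made-up module names. I've written the modules inline; in a real project you would more likely write mod physics; and put the contents in src/physics.rs.

```rust
// `mod` defines a module and places it in the crate's module tree.
mod physics {
    pub const GRAVITY: f32 = 9.8;

    // Modules nest; this one lives at physics::collision.
    pub mod collision {
        // Do the 1D intervals a = (min, max) and b = (min, max) overlap?
        pub fn overlaps(a: (f32, f32), b: (f32, f32)) -> bool {
            a.0 <= b.1 && b.0 <= a.1
        }
    }
}

// `use` defines nothing; it only brings an existing name into scope.
// (`self::` means "relative to the current module".)
use self::physics::collision::overlaps;

fn module_demo() -> (bool, f32) {
    // A short name thanks to `use`, and a full path without it.
    (overlaps((0.0, 2.0), (1.0, 3.0)), physics::GRAVITY)
}
```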

Library crates can specify additional entry points with files in a src/bin directory, each of which should define a function main; each will implicitly have the library as a dependency. The same goes for files inside of a special examples/ directory (not inside of src!) which can be run with cargo run --example example_name.

Profiles and Optimization

Cargo builds in three main configurations (dev, test, and release), and Cargo.toml allows you to override aspects of those configurations. In this class we may want to trade off longer debug compile times for better performance:

[profile.test]
opt-level = 3
lto = "thin"

[profile.dev]
opt-level = 3
lto = "thin"

Another option is to compile only your dependencies with optimizations, leaving your current crate's code unoptimized:

# Set the default for dependencies.
[profile.dev.package."*"]
opt-level = 2

# Set the settings for build scripts and proc-macros.
[profile.dev.build-override]
opt-level = 3

But of course, you should probably wait to do these things until you measure a need for it.

Building and Running Code

Build: cargo build or cargo build --release
Test: cargo test or, to run only the tests whose names contain the string "my_module", cargo test my_module
Run: cargo run or cargo run --release; you can pass arguments to your program after a --, e.g. cargo run -- inputs/my_file.txt