Welcome to Comprehensive Rust 🦀 Two-Day

This is a fork of the free Rust course developed by the Android team at Google. The course covers the full spectrum of Rust, from basic syntax to advanced topics like generics and error handling. This experimental fork is intended to be taught in just two days. As such, it omits some material and skims over other parts.

The latest version of the course can be found at https://rust-edu.github.io/comprehensive-rust-2day/. If you are reading somewhere else, please check there for updates.

The goal of the course is to teach you Rust. We assume you don’t know anything about Rust and hope to:

  • Give you a comprehensive understanding of the Rust syntax and language.
  • Enable you to modify existing programs and write new programs in Rust.
  • Show you common Rust idioms.

Non-Goals

Rust is a large language and we won’t be able to cover all of it in a few days. Some non-goals of this course are:

  • Learning how to develop macros: please see Chapter 19.5 in the Rust Book and Rust by Example instead.

  • Covering specialized topics: the full version of Comprehensive Rust includes “deep dives” on several topics.

Assumptions

The course assumes that you already know how to program. Rust is a statically-typed language and we will sometimes make comparisons with C and C++ to better explain or contrast the Rust approach.

If you know how to program in a dynamically-typed language such as Python or JavaScript, then you will be able to follow along just fine too.

This two-day version also assumes that you have set up a working Rust development environment and familiarized yourself with the basics of Rust before starting. This includes reviewing the first few chapters of The Rust Programming Language in advance. We will review this material, but rather quickly.

This is an example of a speaker note. We will use these to add additional information to the slides. This could be key points which the instructor should cover as well as answers to typical questions which come up in class.

Running the Course

This page is for the course instructor.

Here is a bit of background information about how we’ve been running the course internally at Google.

We typically run classes from 10:00 am to 4:00 pm, with a 1 hour lunch break in the middle. This leaves 2.5 hours for the morning class and 2.5 hours for the afternoon class. Note that this is just a recommendation: you can also spend 3 hours on the morning session to give people more time for exercises. The downside of longer sessions is that people can become very tired in the afternoon after 6 full hours of class.

Before you run the course, you will want to:

  1. Make yourself familiar with the course material. We’ve included speaker notes to help highlight the key points (please help us by contributing more speaker notes!). When presenting, you should make sure to open the speaker notes in a popup (click the link with a little arrow next to “Speaker Notes”). This way you have a clean screen to present to the class.

  2. Decide on the dates. Since the course takes two full days, we recommend that you schedule the days over two weeks. Course participants have said that they find it helpful to have a gap in the course since it helps them process all the information we give them.

  3. Find a room large enough for your in-person participants. We recommend a class size of 15-25 people. That’s small enough that people are comfortable asking questions — it’s also small enough that one instructor will have time to answer the questions. Make sure the room has desks for yourself and for the students: you will all need to be able to sit and work with your laptops. In particular, you will be doing a lot of live-coding as an instructor, so a lectern won’t be very helpful for you.

  4. On the day of your course, show up to the room a little early to set things up. We recommend presenting directly using mdbook serve running on your laptop (see the installation instructions). This ensures optimal performance with no lag as you change pages. Using your laptop will also allow you to fix typos as you or the course participants spot them.

  5. Let people solve the exercises by themselves or in small groups. We typically spend 30-45 minutes on exercises in the morning and in the afternoon (including time to review the solutions). Make sure to ask people if they’re stuck or if there is anything you can help with. When you see that several people have the same problem, call it out to the class and offer a solution, e.g., by showing people where to find the relevant information in the standard library.

That is all, good luck running the course! We hope it will be as much fun for you as it has been for us!

Please provide feedback afterwards so that we can keep improving the course. We would love to hear what worked well for you and what can be made better. Your students are also very welcome to send us feedback!

Course Structure

This page is for the course instructor.

Rust Fundamentals

The two days make up Rust Fundamentals. The days are fast paced and we cover a lot of ground:

  • Day 1: Basic Rust, syntax, control flow, creating and consuming values. Memory management, ownership, compound data types, and the standard library.

  • Day 2: Generics, traits, error handling, testing, and unsafe Rust.

Format

The course is meant to be very interactive and we recommend letting the questions drive the exploration of Rust!

Keyboard Shortcuts

There are several useful keyboard shortcuts in mdBook:

  • Arrow-Left: Navigate to the previous page.
  • Arrow-Right: Navigate to the next page.
  • Ctrl + Enter: Execute the code sample that has focus.
  • s: Activate the search bar.

Translations

The course has been translated into other languages by a set of wonderful volunteers:

Use the language picker in the top-right corner to switch between languages.

Incomplete Translations

There is a large number of in-progress translations. We link to the most recently updated translations:

If you want to help with this effort, please see our instructions for how to get going. Translations are coordinated on the issue tracker.

Using Cargo

When you start reading about Rust, you will soon meet Cargo, the standard tool used in the Rust ecosystem to build and run Rust applications. Here we want to give a brief overview of what Cargo is and how it fits into the wider ecosystem and how it fits into this training.

Installation

Please follow the instructions on https://rustup.rs/.

This will give you the Cargo build tool (cargo) and the Rust compiler (rustc). You will also get rustup, a command line utility that you can use to install different compiler versions.

After installing Rust, you should configure your editor or IDE to work with Rust. Most editors do this by talking to rust-analyzer, which provides auto-completion and jump-to-definition functionality for VS Code, Emacs, Vim/Neovim, and many others. There is also a different IDE available called RustRover.

  • On Debian/Ubuntu, you can also install Cargo, the Rust source and the Rust formatter via apt. However, this gets you an outdated Rust version and may lead to unexpected behavior. The command would be:

    sudo apt install cargo rust-src rustfmt
    

The Rust Ecosystem

The Rust ecosystem consists of a number of tools, of which the main ones are:

  • rustc: the Rust compiler which turns .rs files into binaries and other intermediate formats.

  • cargo: the Rust dependency manager and build tool. Cargo knows how to download dependencies, usually hosted on https://crates.io, and it will pass them to rustc when building your project. Cargo also comes with a built-in test runner which is used to execute unit tests.

  • rustup: the Rust toolchain installer and updater. This tool is used to install and update rustc and cargo when new versions of Rust are released. In addition, rustup can also download documentation for the standard library. You can have multiple versions of Rust installed at once and rustup will let you switch between them as needed.

Key points:

  • Rust has a rapid release schedule with a new release coming out every six weeks. New releases maintain backwards compatibility with old releases — plus they enable new functionality.

  • There are three release channels: “stable”, “beta”, and “nightly”.

  • New features are tested on “nightly”; “beta” is what becomes “stable” every six weeks.

  • Dependencies can also be resolved from alternative registries, git, folders, and more.

  • Rust also has editions: the current edition is Rust 2021. Previous editions were Rust 2015 and Rust 2018.

    • The editions are allowed to make backwards incompatible changes to the language.

    • To prevent breaking code, editions are opt-in: you select the edition for your crate via the Cargo.toml file.

    • To avoid splitting the ecosystem, Rust compilers can mix code written for different editions.

    • Mention that it is quite rare to ever use the compiler directly rather than through cargo (most users never do).

    • It might be worth alluding that Cargo itself is an extremely powerful and comprehensive tool. It is capable of many advanced features including but not limited to:

    • Read more from the official Cargo Book

Code Samples in This Training

For this training, we will mostly explore the Rust language through examples which can be executed through your browser. This makes the setup much easier and ensures a consistent experience for everyone.

Installing Cargo is still encouraged: it will make it easier for you to do the exercises. On the last day, we will do a larger exercise which shows you how to work with dependencies and for that you need Cargo.

The code blocks in this course are fully interactive:

fn main() {
    println!("Edit me!");
}

You can use Ctrl + Enter to execute the code when focus is in the text box.

Most code samples are editable as shown above. A few code samples are not editable for various reasons:

  • The embedded playgrounds cannot execute unit tests. Copy-paste the code and open it in the real Playground to demonstrate unit tests.

  • The embedded playgrounds lose their state the moment you navigate away from the page! This is the reason that the students should solve the exercises using a local Rust installation or via the Playground.

Running Code Locally with Cargo

If you want to experiment with the code on your own system, then you will need to first install Rust. Do this by following the instructions in the Rust Book. This should give you a working rustc and cargo. At the time of writing, the latest stable Rust release has these version numbers:

% rustc --version
rustc 1.69.0 (84c898d65 2023-04-16)
% cargo --version
cargo 1.69.0 (6e9a83356 2023-04-12)

You can use any later version too since Rust maintains backwards compatibility.

With this in place, follow these steps to build a Rust binary from one of the examples in this training:

  1. Click the “Copy to clipboard” button on the example you want to copy.

  2. Use cargo new exercise to create a new exercise/ directory for your code:

    $ cargo new exercise
         Created binary (application) `exercise` package
    
  3. Navigate into exercise/ and use cargo run to build and run your binary:

    $ cd exercise
    $ cargo run
       Compiling exercise v0.1.0 (/home/mgeisler/tmp/exercise)
        Finished dev [unoptimized + debuginfo] target(s) in 0.75s
         Running `target/debug/exercise`
    Hello, world!
    
  4. Replace the boiler-plate code in src/main.rs with your own code. For example, using the example on the previous page, make src/main.rs look like

    fn main() {
        println!("Edit me!");
    }
  5. Use cargo run to build and run your updated binary:

    $ cargo run
       Compiling exercise v0.1.0 (/home/mgeisler/tmp/exercise)
        Finished dev [unoptimized + debuginfo] target(s) in 0.24s
         Running `target/debug/exercise`
    Edit me!
    
  6. Use cargo check to quickly check your project for errors, and cargo build to compile it without running it. You will find the output in target/debug/ for a normal debug build. Use cargo build --release to produce an optimized release build in target/release/.

  7. You can add dependencies for your project by editing Cargo.toml. When you run cargo commands, it will automatically download and compile missing dependencies for you.

Try to encourage the class participants to install Cargo and use a local editor. It will make their life easier since they will have a normal development environment.

Welcome to Day 1

This is the first day of Rust Fundamentals. We will cover a lot of ground today:

  • Basic Rust syntax: variables, scalar and compound types, enums, structs, references, functions, and methods.

  • Control flow constructs: if, if let, while, while let, break, and continue.

  • Pattern matching: destructuring enums, structs, and arrays.

Please remind the students that:

  • They should ask questions when they have them rather than saving them for the end.
  • The class is meant to be interactive and discussions are very much encouraged!
    • As an instructor, you should try to keep the discussions relevant, i.e., keep the discussions related to how Rust does things vs some other language. It can be hard to find the right balance, but err on the side of allowing discussions since they engage people much more than one-way communication.
  • The questions will likely mean that we talk about things ahead of the slides.
    • This is perfectly okay! Repetition is an important part of learning. Remember that the slides are just a support and you are free to skip them as you like.

The idea for the first day is to show just enough of Rust to be able to speak about the famous borrow checker. The way Rust handles memory is a major feature and we should show students this right away.

If you’re teaching this in a classroom, this is a good place to go over the schedule. We suggest splitting the day into two parts (following the slides):

  • Morning: 9:00 to 12:00,
  • Afternoon: 13:00 to 16:00.

You can of course adjust this as necessary. Please make sure to include breaks; we recommend a break every hour!

What is Rust?

Rust is a new programming language which had its 1.0 release in 2015:

  • Rust is a statically compiled language in a similar role as C++
    • rustc uses LLVM as its backend.
  • Rust supports many platforms and architectures:
    • x86, ARM, WebAssembly, …
    • Linux, Mac, Windows, …

  • Rust is used for a wide range of devices:
    • firmware and boot loaders,
    • smart displays,
    • mobile phones,
    • desktops,
    • servers.

Rust fits in the same area as C++:

  • High flexibility.
  • High level of control.
  • Can be scaled down to very constrained devices such as microcontrollers.
  • Has no runtime or garbage collection.
  • Focuses on reliability and safety without sacrificing performance.

Hello World!

Let us jump into the simplest possible Rust program, a classic Hello World program:

fn main() {
    println!("Hello 🌍!");
}

What you see:

  • Functions are introduced with fn.
  • Blocks are delimited by curly braces like in C and C++.
  • The main function is the entry point of the program.
  • Rust has hygienic macros; println! is an example of this.
  • Rust strings are UTF-8 encoded and can contain any Unicode character.

This slide tries to make the students comfortable with Rust code. They will see a ton of it over the next two days so we start small with something familiar.

Key points:

  • Rust is very much like other languages in the C/C++/Java tradition. It is imperative and it doesn’t try to reinvent things unless absolutely necessary.

  • Rust is modern with full support for things like Unicode.

  • Rust uses macros for situations where you want to have a variable number of arguments (no function overloading).

  • Macros being ‘hygienic’ means they don’t accidentally capture identifiers from the scope they are used in. Rust macros are actually only partially hygienic.

  • Rust is multi-paradigm. For example, it has powerful object-oriented programming features, and, while it is not a functional language, it includes a range of functional concepts.
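To make the points about macros concrete, here is a minimal sketch. The macro add_ten! and the function hygiene_demo are hypothetical names made up for this illustration; they are not part of the course material or the standard library. The sketch shows that a local variable introduced inside a macro does not capture an identifier of the same name at the call site, and that println! accepts a varying number of arguments.

```rust
// Hypothetical macro for illustration: adds 10 to any expression.
macro_rules! add_ten {
    ($value:expr) => {{
        // This `tmp` belongs to the macro; the caller's code cannot see it.
        let tmp = 10;
        $value + tmp
    }};
}

fn hygiene_demo() -> i32 {
    let tmp = 1;
    // `tmp * 2` still refers to the caller's `tmp` (1), not the macro's (10).
    add_ten!(tmp * 2)
}

fn main() {
    // println! takes a variable number of arguments, which a plain function cannot.
    println!("no arguments");
    println!("{} and {}", "two", "arguments");
    println!("hygiene_demo() = {}", hygiene_demo());
}
```

Without hygiene, the macro's tmp would shadow the caller's and the result would be 30 instead of 12.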

Small Example

Here is a small example program in Rust:

fn main() {              // Program entry point
    let mut x: i32 = 6;  // Mutable variable binding
    print!("{x}");       // Macro for printing, like printf
    while x != 1 {       // No parenthesis around expression
        if x % 2 == 0 {  // Math like in other languages
            x = x / 2;
        } else {
            x = 3 * x + 1;
        }
        print!(" -> {x}");
    }
    println!();
}

The code iterates the Collatz sequence. The Collatz conjecture states that the loop will always end, but this has not yet been proved. Edit the code and play with different inputs.

Key points:

  • Explain that all variables are statically typed. Try removing i32 to trigger type inference. Try with i8 instead and trigger a runtime integer overflow (this needs a larger starting value than 6, for example 27).

  • Change let mut x to let x, discuss the compiler error.

  • Show how print! gives a compilation error if the arguments don’t match the format string.

  • Show how you need to use {} as a placeholder if you want to print an expression which is more complex than just a single variable.

  • Show the students the standard library, show them how to search for std::fmt which has the rules of the formatting mini-language. It’s important that the students become familiar with searching in the standard library.

    • In a shell, rustup doc std::fmt will open a browser on the local std::fmt documentation.
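As a possible follow-up to the overflow and format-string points above, the sketch below reworks the loop with i8 and checked arithmetic, so that an overflow is reported as None instead of a panic. The function name collatz_steps_i8 is an assumption made for this illustration. The main function also shows the {} placeholder being used for an expression next to an inlined variable name.

```rust
/// Counts the steps until the sequence reaches 1, using `i8` arithmetic.
/// Returns `None` for non-positive input or when a value does not fit in an `i8`.
fn collatz_steps_i8(start: i8) -> Option<u32> {
    if start < 1 {
        return None;
    }
    let mut x = start;
    let mut steps = 0;
    while x != 1 {
        x = if x % 2 == 0 {
            x / 2
        } else {
            // `checked_*` methods return `None` on overflow; `?` passes that on.
            x.checked_mul(3)?.checked_add(1)?
        };
        steps += 1;
    }
    Some(steps)
}

fn main() {
    for start in [6, 27] {
        // `{start}` inlines a plain variable; the call expression needs a `{:?}` placeholder.
        println!("{start}: {:?}", collatz_steps_i8(start));
    }
}
```

Starting from 27, the sequence reaches 47, and 3 * 47 no longer fits in an i8, so the function returns None.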

Why Rust?

Some unique selling points of Rust:

  • Compile time memory safety.
  • Lack of undefined runtime behavior.
  • Modern language features.

Make sure to ask the class which languages they have experience with. Depending on the answer you can highlight different features of Rust:

  • Experience with C or C++: Rust eliminates a whole class of runtime errors via the borrow checker. You get performance like in C and C++, but you don’t have the memory unsafety issues. In addition, you get a modern language with constructs like pattern matching and built-in dependency management.

  • Experience with Java, Go, Python, JavaScript, …: You get the same memory safety as in those languages, plus a similar high-level language feeling. In addition you get fast and predictable performance like C and C++ (no garbage collector) as well as access to low-level hardware (should you need it).

An Example in C

Let’s consider the following “minimum wrong example” program in C:

#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

int main(int argc, char* argv[]) {
	char *buf, *filename;
	FILE *fp;
	size_t bytes, len;
	struct stat st;

	switch (argc) {
		case 1:
			printf("Too few arguments!\n");
			return 1;

		case 2:
			filename = argv[argc];
			stat(filename, &st);
			len = st.st_size;
			
			buf = (char*)malloc(len);
			if (!buf)
				printf("malloc failed!\n", len);
				return 1;

			fp = fopen(filename, "rb");
			bytes = fread(buf, 1, len, fp);
			if (bytes = st.st_size)
				printf("%s", buf);
			else
				printf("fread failed!\n");

		case 3:
			printf("Too many arguments!\n");
			return 1;
	}

	return 0;
}

How many bugs do you spot?

Despite just 29 lines of code, this C example contains serious bugs in at least 11 of them:

  1. Assignment = instead of equality comparison == (line 28)
  2. Excess argument to printf (line 23)
  3. File descriptor leak (after line 26)
  4. Forgotten braces in multi-line if (line 22)
  5. Forgotten break in a switch statement (line 32)
  6. Forgotten NUL-termination of the buf string, leading to a buffer overflow (line 29)
  7. Memory leak by not freeing the malloc-allocated buffer (line 21)
  8. Out-of-bounds access (line 17)
  9. Unchecked cases in the switch statement (line 11)
  10. Unchecked return values of stat and fopen (lines 18 and 26)

Shouldn’t these bugs be obvious even for a C compiler?
No, surprisingly this code compiles warning-free at the default warning level, even in the latest GCC version (13.2 as of writing).

Isn’t this a highly unrealistic example?
Absolutely not, these kinds of bugs have led to serious security vulnerabilities in the past. Some examples:

How is Rust any better here?
Safe Rust makes all of these bugs impossible:

  1. Assignments inside an if clause are not supported.
  2. Format strings are checked at compile-time.
  3. Resources are freed at the end of scope via the Drop trait.
  4. All if clauses require braces.
  5. match (as the Rust equivalent to switch) does not fall-through, hence you can’t accidentally forget a break.
  6. Buffer slices carry their size and don’t rely on a NUL terminator.
  7. Heap-allocated memory is freed via the Drop trait when the corresponding Box leaves the scope.
  8. Out-of-bounds accesses cause a panic or can be checked via the get method of a slice.
  9. match mandates that all cases are handled.
  10. Fallible Rust functions return Result values that need to be unwrapped and thereby checked for success. Additionally, the compiler emits a warning if you fail to check the return value of a function marked with #[must_use].
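For comparison, here is a sketch of how the same task might look in safe Rust. It is one possible translation, not the course's reference solution, and the function name read_file_from_args is an assumption made for this illustration. It relies on match for the argument count, on Result for the fallible file read, and on automatic cleanup of the file handle and buffer.

```rust
use std::fs;

/// Reads the file named by the single command-line argument.
/// `args` holds the program name followed by the user's arguments.
fn read_file_from_args(args: &[String]) -> Result<String, String> {
    match args.len() {
        // `match` must cover every case, and there is no fall-through.
        0 | 1 => Err("Too few arguments!".to_string()),
        2 => {
            // An out-of-bounds index such as `args[2]` would panic here
            // instead of silently reading past the end of the array.
            let filename = &args[1];
            // The `Result` must be handled before the contents can be used.
            fs::read_to_string(filename)
                .map_err(|e| format!("cannot read {filename}: {e}"))
        }
        _ => Err("Too many arguments!".to_string()),
    }
}

fn main() {
    let args: Vec<String> = std::env::args().collect();
    match read_file_from_args(&args) {
        Ok(contents) => print!("{contents}"),
        Err(message) => eprintln!("{message}"),
    }
    // The file handle and the buffer are released automatically.
}
```

Note that read_to_string also rejects files that are not valid UTF-8, which the C version never checks.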

Compile Time Guarantees

Static memory management at compile time:

  • No uninitialized variables.
  • No memory leaks (mostly, see notes).
  • No double-frees.
  • No use-after-free.
  • No NULL pointers.
  • No forgotten locked mutexes.
  • No data races between threads.
  • No iterator invalidation.

It is possible to produce memory leaks in (safe) Rust. Some examples are:

  • You can use Box::leak to leak a pointer. A use of this could be to get runtime-initialized and runtime-sized static variables.
  • You can use std::mem::forget to make the compiler “forget” about a value (meaning the destructor is never run).
  • You can also accidentally create a reference cycle with Rc or Arc.
  • In fact, some will consider infinitely populating a collection a memory leak and Rust does not protect from those.

For the purpose of this course, “No memory leaks” should be understood as “Pretty much no accidental memory leaks”.
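Two of the leak mechanisms listed above can be sketched as follows. The type Noisy and the functions count_drops and leaked_greeting are hypothetical names invented for this illustration. The counter shows that the destructor of a forgotten value never runs, and Box::leak is used to obtain a 'static reference to a string built at runtime.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical helper type: bumps a shared counter when it is dropped.
struct Noisy {
    drops: Rc<Cell<usize>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Creates two values, drops one normally and forgets the other.
/// Returns how many destructors actually ran.
fn count_drops() -> usize {
    let drops = Rc::new(Cell::new(0));
    {
        let _dropped = Noisy { drops: Rc::clone(&drops) };
        let forgotten = Noisy { drops: Rc::clone(&drops) };
        std::mem::forget(forgotten); // the destructor of `forgotten` never runs
    } // `_dropped` leaves scope here and its destructor runs
    drops.get()
}

/// `Box::leak` gives up ownership in exchange for a `'static` reference.
fn leaked_greeting(name: &str) -> &'static str {
    Box::leak(format!("Hello, {name}!").into_boxed_str())
}

fn main() {
    println!("destructors run: {}", count_drops());
    println!("{}", leaked_greeting("Rust"));
}
```

Both functions are entirely safe Rust, which is why leaks are not counted as memory unsafety.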

Runtime Guarantees

No undefined behavior at runtime:

  • Array access is bounds checked.
  • Integer overflow is defined (panic or wrap-around).

Key points:

  • Integer overflow is defined via the overflow-checks compile-time flag. If enabled, the program will panic (a controlled crash of the program), otherwise you get wrap-around semantics. By default, you get panics in debug mode (cargo build) and wrap-around in release mode (cargo build --release).

  • Bounds checking cannot be disabled with a compiler flag. It can also not be disabled directly with the unsafe keyword. However, unsafe allows you to call functions such as slice::get_unchecked which does not do bounds checking.
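When a panic or a silent wrap-around is not what you want, the standard library lets you make the choice explicit. The sketch below uses slice::get, checked_add and wrapping_add, which are real standard library methods; the wrapper functions element_at, add_checked and add_wrapping are hypothetical names introduced only for this example.

```rust
/// Returns the element at `index`, or `None` when it is out of bounds.
fn element_at(values: &[i32], index: usize) -> Option<i32> {
    values.get(index).copied()
}

/// Adds two `u8` values and reports overflow as `None`,
/// regardless of whether overflow checks are enabled.
fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

/// Adds two `u8` values with explicit wrap-around semantics.
fn add_wrapping(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

fn main() {
    let values = [10, 20, 30];
    println!("{:?}", element_at(&values, 1)); // in bounds
    println!("{:?}", element_at(&values, 7)); // out of bounds, no panic
    println!("{:?}", add_checked(250, 10));   // overflow detected
    println!("{}", add_wrapping(250, 10));    // wraps modulo 256
}
```

Unlike plain + on integers, these methods behave the same in debug and release builds.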

Modern Features

Rust is built with all the experience gained in the last decades.

Language Features

  • Enums and pattern matching.
  • Generics.
  • No overhead FFI.
  • Zero-cost abstractions.

Tooling

  • Great compiler errors.
  • Built-in dependency manager.
  • Built-in support for testing.
  • Excellent Language Server Protocol support.

Key points:

  • Zero-cost abstractions, similar to C++, means that you don’t have to ‘pay’ for higher-level programming constructs with memory or CPU. For example, writing a loop using for should result in roughly the same low level instructions as using the .iter().fold() construct.

  • It may be worth mentioning that Rust enums are ‘Algebraic Data Types’, also known as ‘sum types’, which allow the type system to express things like Option<T> and Result<T, E>.

  • Remind people to read the errors: many developers have gotten used to ignoring lengthy compiler output. The Rust compiler is significantly more talkative than other compilers. It will often provide you with actionable feedback, ready to copy-paste into your code.

  • The Rust standard library is small compared to languages like Java, Python, and Go. Rust does not come with several things you might consider standard and essential:

    • a random number generator, but see rand.
    • support for SSL or TLS, but see rustls.
    • support for JSON, but see serde_json.

    The reasoning behind this is that functionality in the standard library cannot go away, so it has to be very stable. For the examples above, the Rust community is still working on finding the best solution, and perhaps there isn’t a single “best solution” for some of these things.

    Rust comes with a built-in package manager in the form of Cargo and this makes it trivial to download and compile third-party crates. A consequence of this is that the standard library can be smaller.

    Discovering good third-party crates can be a problem. Sites like https://lib.rs/ help with this by letting you compare health metrics for crates to find a good and trusted one.

  • rust-analyzer is a well supported LSP implementation used in major IDEs and text editors.
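The zero-cost abstraction and sum type points above can be illustrated with a small sketch. The names sum_of_squares_loop, sum_of_squares_fold, Shape and area are hypothetical and chosen only for this example. The two summing functions compute the same result in different styles; whether they compile to identical instructions depends on the optimizer, so treat that as an expectation rather than a guarantee.

```rust
/// Explicit loop with a mutable accumulator.
fn sum_of_squares_loop(values: &[u64]) -> u64 {
    let mut total = 0;
    for v in values {
        total += v * v;
    }
    total
}

/// The same computation written with an iterator and `fold`.
fn sum_of_squares_fold(values: &[u64]) -> u64 {
    values.iter().fold(0, |total, v| total + v * v)
}

/// Hypothetical sum type: a value is exactly one of these variants.
enum Shape {
    Circle { radius: f64 },
    Rectangle { width: f64, height: f64 },
}

/// `match` has to handle every variant of the enum.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rectangle { width, height } => width * height,
    }
}

fn main() {
    let values = [1, 2, 3];
    println!("loop: {}", sum_of_squares_loop(&values));
    println!("fold: {}", sum_of_squares_fold(&values));
    println!("circle: {:.2}", area(&Shape::Circle { radius: 1.0 }));
    println!("rectangle: {}", area(&Shape::Rectangle { width: 2.0, height: 3.0 }));
}
```

Option<T> and Result<T, E> are defined in the standard library as enums of the same kind.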

Basic Syntax

Much of the Rust syntax will be familiar to you from C, C++ or Java:

  • Blocks and scopes are delimited by curly braces.
  • Line comments are started with //, block comments are delimited by /* ... */.
  • Keywords like if and while work the same.
  • Variable assignment is done with =, comparison is done with ==.

Scalar Types

                        Types                            Literals
Signed integers         i8, i16, i32, i64, i128, isize   -10, 0, 1_000, 123_i64
Unsigned integers       u8, u16, u32, u64, u128, usize   0, 123, 10_u16
Floating point numbers  f32, f64                         3.14, -10.0e20, 2_f32
Strings                 &str                             "foo", "two\nlines"
Unicode scalar values   char                             'a', 'α', '∞'
Booleans                bool                             true, false

The types have widths as follows:

  • iN, uN, and fN are N bits wide,
  • isize and usize are the width of a pointer,
  • char is 32 bits wide,
  • bool is 8 bits wide.
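The widths listed above can be checked with std::mem::size_of, which reports the size of a type in bytes. The helper width_in_bits is a hypothetical function written for this sketch.

```rust
use std::mem::size_of;

/// Size of `T` in bits, derived from its size in bytes.
fn width_in_bits<T>() -> usize {
    size_of::<T>() * 8
}

fn main() {
    println!("i32:   {} bits", width_in_bits::<i32>());
    println!("f64:   {} bits", width_in_bits::<f64>());
    println!("char:  {} bits", width_in_bits::<char>());
    println!("bool:  {} bits", width_in_bits::<bool>());
    // The pointer width depends on the target platform.
    println!("usize: {} bits", width_in_bits::<usize>());
}
```

Only the usize and isize lines vary between targets; the other widths are fixed by the language.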

There are a few syntaxes which are not shown above:

  • Raw strings allow you to create a &str value with escapes disabled: r"\n" == "\\n". You can embed double-quotes by using an equal amount of # on either side of the quotes:

    fn main() {
        println!(r#"<a href="link.html">link</a>"#);
        println!("<a href=\"link.html\">link</a>");
    }
  • Byte strings allow you to create a &[u8] value directly:

    fn main() {
        println!("{:?}", b"abc");
        println!("{:?}", &[97, 98, 99]);
    }
  • All underscores in numbers can be left out, they are for legibility only. So 1_000 can be written as 1000 (or 10_00), and 123_i64 can be written as 123i64.

Compound Types

        Types                   Literals
Arrays  [T; N]                  [20, 30, 40], [0; 3]
Tuples  (), (T,), (T1, T2), …   (), ('x',), ('x', 1.2), …


Array assignment and access:

fn main() {
    let mut a: [i8; 10] = [42; 10];
    a[5] = 0;
    println!("a: {:?}", a);
}

Tuple assignment and access:

fn main() {
    let t: (i8, bool) = (7, true);
    println!("t.0: {}", t.0);
    println!("t.1: {}", t.1);
}

Key points:

Arrays:

  • A value of the array type [T; N] holds N (a compile-time constant) elements of the same type T. Note that the length of the array is part of its type, which means that [u8; 3] and [u8; 4] are considered two different types.

  • We can use literals to assign values to arrays.

  • In the main function, the print statement asks for the debug implementation with the ? format parameter: {} gives the default output, {:?} gives the debug output. We could also have used {a} and {a:?} without specifying the value after the format string.

  • Adding #, e.g., {a:#?}, invokes a “pretty printing” format, which can be easier to read.

Tuples:

  • Like arrays, tuples have a fixed length.

  • Tuples group together values of different types into a compound type.

  • Fields of a tuple can be accessed by the period and the index of the value, e.g. t.0, t.1.

  • The empty tuple () is also known as the “unit type”. It is both a type, and the only valid value of that type - that is to say both the type and its value are expressed as (). It is used to indicate, for example, that a function or expression has no return value, as we’ll see in a future slide.

    • You can think of it as the void that may be familiar to you from other programming languages.
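The key points on arrays, tuples and the unit type can be tried out with the following sketch. The functions ends, debug_string and do_nothing are hypothetical helpers introduced for illustration only.

```rust
/// Returns the first and last elements of a three-element array as a tuple.
fn ends(values: [i32; 3]) -> (i32, i32) {
    (values[0], values[2])
}

/// Formats an array with the debug formatter, as `{:?}` does in `println!`.
fn debug_string(values: [i32; 3]) -> String {
    format!("{values:?}")
}

/// A function without a `->` clause returns the unit value `()`.
fn do_nothing() {}

fn main() {
    let a: [i32; 3] = [20, 30, 40];
    // let b: [i32; 4] = a; // would not compile: the length is part of the type

    let (first, last) = ends(a); // tuples can be destructured in a `let`
    println!("first: {first}, last: {last}");

    println!("{}", debug_string(a));
    println!("{a:#?}"); // pretty-printed debug output, one element per line

    let unit: () = do_nothing();
    println!("{unit:?}");
}
```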

References

Like C++, Rust has references:

fn main() {
    let mut x: i32 = 10;
    let ref_x: &mut i32 = &mut x;
    *ref_x = 20;
    println!("x: {x}");
}

Some notes:

  • We must dereference ref_x when assigning to it, similar to C and C++ pointers.
  • Rust will auto-dereference in some cases, in particular when invoking methods (try ref_x.count_ones()).
  • References that are declared as mut can be bound to different values over their lifetime.

Key points:

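  • Be sure to note the difference between let mut ref_x: &i32 and let ref_x: &mut i32. The first one is a mutable binding holding a shared reference: it can be rebound to refer to different values, but the values cannot be changed through it. The second is an immutable binding holding a reference through which the referenced value can be modified.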
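The difference between the two declarations is easiest to see side by side. The sketch below uses hypothetical function names (rebind_shared_reference, write_through_exclusive_reference, ones_in); the commented-out lines mark what the compiler would reject.

```rust
/// A mutable binding holding a shared reference: it can be rebound,
/// but the target cannot be modified through it.
fn rebind_shared_reference() -> i32 {
    let a = 1;
    let b = 2;
    let mut ref_x: &i32 = &a;
    let first = *ref_x;
    ref_x = &b; // rebinding is allowed because the binding is `mut`
    // *ref_x = 5; // would not compile: `ref_x` is a `&` reference
    first + *ref_x
}

/// An immutable binding holding a `&mut` reference: the target can be
/// modified through it, but the binding cannot be rebound.
fn write_through_exclusive_reference() -> i32 {
    let mut x = 10;
    let ref_x: &mut i32 = &mut x;
    *ref_x += 5; // explicit dereference for assignment
    // ref_x = &mut x; // would not compile: the binding is not `mut`
    x
}

/// Method calls auto-dereference, so no `*` is needed here.
fn ones_in(value: &i32) -> u32 {
    value.count_ones()
}

fn main() {
    println!("{}", rebind_shared_reference());
    println!("{}", write_through_exclusive_reference());
    println!("{}", ones_in(&20));
}
```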

Dangling References

Rust will statically forbid dangling references:

fn main() {
    let ref_x: &i32;
    {
        let x: i32 = 10;
        ref_x = &x;
    }
    println!("ref_x: {ref_x}");
}
  • A reference is said to “borrow” the value it refers to.
  • Rust is tracking the lifetimes of all references to ensure they live long enough.
  • We will talk more about borrowing when we get to ownership.
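The example above is rejected by the compiler because x is dropped at the end of the inner block while ref_x is still used afterwards. One possible fix, sketched here under the assumption that x may simply live in the outer scope, is to declare x before the block. The function name read_through_reference is hypothetical.

```rust
fn read_through_reference() -> i32 {
    let x: i32 = 10; // `x` now lives until the end of the function
    let ref_x: &i32;
    {
        ref_x = &x; // the borrow starts in the inner scope...
    }
    *ref_x // ...and is still valid here, because `x` is still alive
}

fn main() {
    println!("ref_x: {}", read_through_reference());
}
```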

Slices

A slice gives you a view into a larger collection:

fn main() {
    let mut a: [i32; 6] = [10, 20, 30, 40, 50, 60];
    println!("a: {a:?}");

    let s: &[i32] = &a[2..4];

    println!("s: {s:?}");
}
  • Slices borrow data from the sliced type.
  • Question: What happens if you modify a[3] right before printing s?
  • We create a slice by borrowing a and specifying the starting and ending indexes in brackets.

  • If the slice starts at index 0, Rust’s range syntax allows us to drop the starting index, meaning that &a[0..a.len()] and &a[..a.len()] are identical.

  • The same is true for the last index, so &a[2..a.len()] and &a[2..] are identical.

  • To easily create a slice of the full array, we can therefore use &a[..].

  • s is a reference to a slice of i32s. Notice that the type of s (&[i32]) no longer mentions the array length. This allows us to perform computation on slices of different sizes.

  • Slices always borrow from another object. In this example, a has to remain ‘alive’ (in scope) for at least as long as our slice.

  • The question about modifying a[3] can spark an interesting discussion, but the answer is that for memory safety reasons you cannot do it through a at this point in the execution, but you can read the data from both a and s safely. It works before you created the slice, and again after the println, when the slice is no longer used. More details will be explained in the borrow checker section.
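The notes above can be explored with the sketch below. The function sum is a hypothetical helper; it accepts slices of any length because the length is not part of the type &[i32]. The commented-out assignment marks the modification that the borrow checker rejects.

```rust
/// Accepts a slice of any length: the length is not part of `&[i32]`.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let mut a: [i32; 6] = [10, 20, 30, 40, 50, 60];

    a[3] = 41; // fine: no slice borrows `a` yet
    let s: &[i32] = &a[2..4];
    // a[3] = 42; // would not compile: `a` is borrowed by `s`, which is used below
    println!("sum of s: {}", sum(s));

    a[3] = 40; // fine again: `s` is not used after this point
    println!("sum of a: {}", sum(&a));
    println!("sum of a[..2]: {}", sum(&a[..2]));
}
```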

String vs str

We can now understand the two string types in Rust:

fn main() {
    let s1: &str = "World";
    println!("s1: {s1}");

    let mut s2: String = String::from("Hello ");
    println!("s2: {s2}");
    s2.push_str(s1);
    println!("s2: {s2}");
    
    let s3: &str = &s2[6..];
    println!("s3: {s3}");
}

Rust terminology:

  • &str: an immutable reference to a string slice.
  • String: a mutable string buffer.
  • &str introduces a string slice, which is an immutable reference to UTF-8 encoded string data stored in a block of memory. String literals ("Hello") are stored in the program’s binary.

  • Rust’s String type is a wrapper around a vector of bytes. As with a Vec<T>, it is owned.

  • As with many other types, String::from() creates a string from a string literal; String::new() creates a new empty string, to which string data can be added using the push() and push_str() methods.

  • The format!() macro is a convenient way to generate an owned string from dynamic values. It accepts the same format specification as println!().

  • You can borrow &str slices from String via & and optionally range selection.

  • For C++ programmers: think of &str as const char* from C++, but the one that always points to a valid string in memory. Rust String is a rough equivalent of std::string from C++ (main difference: it can only contain UTF-8 encoded bytes and will never use a small-string optimization).
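The notes above mention String::new(), push(), push_str(), format!() and borrowing &str from a String. A minimal sketch that combines them follows; the functions greeting and first_word are hypothetical names used only here.

```rust
/// Builds an owned `String` step by step from borrowed pieces.
fn greeting(name: &str) -> String {
    let mut out = String::new();
    out.push_str("Hello, ");
    out.push_str(name);
    out.push('!'); // `push` appends a single `char`
    out
}

/// Returns a `&str` that borrows from the input; no text is copied.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    let owned: String = greeting("World");
    let whole: &str = &owned; // borrow the full string
    let tail: &str = &owned[7..]; // borrow a byte range
    println!("{whole} / {tail} / {}", first_word(whole));

    // `format!` accepts the same format specification as `println!`.
    let formatted = format!("{}-{:03}", "id", 7);
    println!("{formatted}");
}
```

Range indices on strings count bytes, not characters, so slicing in the middle of a multi-byte character panics.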

Functions

A Rust version of the famous FizzBuzz interview question:

fn main() {
    print_fizzbuzz_to(20);
}

fn is_divisible(n: u32, divisor: u32) -> bool {
    if divisor == 0 {
        return false;
    }
    n % divisor == 0
}

fn fizzbuzz(n: u32) -> String {
    let fizz = if is_divisible(n, 3) { "fizz" } else { "" };
    let buzz = if is_divisible(n, 5) { "buzz" } else { "" };
    if fizz.is_empty() && buzz.is_empty() {
        return format!("{n}");
    }
    format!("{fizz}{buzz}")
}

fn print_fizzbuzz_to(n: u32) {
    for i in 1..=n {
        println!("{}", fizzbuzz(i));
    }
}
  • We refer in main to a function written below. Neither forward declarations nor headers are necessary.
  • Declaration parameters are followed by a type (the reverse of some programming languages), then a return type.
  • The last expression in a function body (or any block) becomes the return value. Simply omit the ; at the end of the expression.
  • Some functions have no return value, and return the ‘unit type’, (). The compiler will infer this if the -> () return type is omitted.
  • The range expression in the for loop in print_fizzbuzz_to() contains =n, which causes it to include the upper bound.

Rustdoc

All language items in Rust can be documented using special /// syntax.

/// Determine whether the first argument is divisible by the second argument.
///
/// If the second argument is zero, the result is false.
///
/// # Example
/// ```
/// assert!(is_divisible_by(42, 2));
/// ```
fn is_divisible_by(lhs: u32, rhs: u32) -> bool {
    if rhs == 0 {
        return false;  // Corner case, early return
    }
    lhs % rhs == 0     // The last expression in a block is the return value
}

The contents are treated as Markdown. All published Rust library crates are automatically documented at docs.rs using the rustdoc tool. It is idiomatic to document all public items in an API using this pattern. Code snippets can document usage and will be used as unit tests.

  • Show students the generated docs for the rand crate at docs.rs/rand.

  • This course does not include rustdoc on slides, just to save space, but in real code they should be present.

  • Inner doc comments are discussed later (in the page on modules) and need not be addressed here.

  • Rustdoc comments can contain code snippets that we can run and test using cargo test. We will discuss these tests in the Testing section.

Methods

Methods are functions associated with a type. The self argument of a method is an instance of the type it is associated with:

struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    fn area(&self) -> u32 {
        self.width * self.height
    }

    fn inc_width(&mut self, delta: u32) {
        self.width += delta;
    }
}

fn main() {
    let mut rect = Rectangle { width: 10, height: 5 };
    println!("old area: {}", rect.area());
    rect.inc_width(5);
    println!("new area: {}", rect.area());
}
  • We will look much more at methods in today’s exercise and in tomorrow’s class.
  • Add a static method called Rectangle::new and call this from main:

    fn new(width: u32, height: u32) -> Rectangle {
        Rectangle { width, height }
    }
  • While Rust technically does not have custom constructors, static methods are commonly used to initialize structs (though they don’t have to be). The actual constructor, Rectangle { width, height }, could be called directly. See the Rustonomicon.

  • Add a Rectangle::square(width: u32) constructor to illustrate that such static methods can take arbitrary parameters.
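One possible result of the two exercises above, as a sketch rather than the only solution. Both new and square are ordinary associated functions without a self parameter; the choice to store the square's side in both fields is an assumption made for this example.

```rust
struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    // Associated function (no `self`): called as `Rectangle::new(..)`.
    fn new(width: u32, height: u32) -> Rectangle {
        Rectangle { width, height }
    }

    // A second constructor-like function taking different parameters.
    fn square(width: u32) -> Rectangle {
        Rectangle { width, height: width }
    }

    fn area(&self) -> u32 {
        self.width * self.height
    }
}

fn main() {
    let rect = Rectangle::new(10, 5);
    let sq = Rectangle::square(4);
    println!("rect area: {}", rect.area());
    println!("square area: {}", sq.area());
}
```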

Function Overloading

Overloading is not supported:

  • Each function has a single implementation:
    • Always takes a fixed number of parameters.
    • Always takes a single set of parameter types.
  • Default values are not supported:
    • All call sites have the same number of arguments.
    • Macros are sometimes used as an alternative.

However, function parameters can be generic:

fn pick_one<T>(a: T, b: T) -> T {
    if std::process::id() % 2 == 0 { a } else { b }
}

fn main() {
    println!("coin toss: {}", pick_one("heads", "tails"));
    println!("cash prize: {}", pick_one(500, 1000));
}
  • When using generics, the standard library’s Into<T> can provide a kind of limited polymorphism on argument types. We will see more details in a later section.
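As a rough illustration of the Into<T> note above, the sketch below accepts several argument types through a single function. The function name shout is hypothetical; the conversions used (&str, String and char into String) are provided by the standard library.

```rust
// Hypothetical function: accepts anything that can be converted into a String.
fn shout(text: impl Into<String>) -> String {
    let owned: String = text.into();
    owned.to_uppercase()
}

fn main() {
    println!("{}", shout("from a &str"));
    println!("{}", shout(String::from("from a String")));
    println!("{}", shout('c'));
}
```

There is still only one shout function; the caller's argument type selects which Into implementation runs.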

Variables

Rust provides type safety via static typing. Variable bindings are immutable by default:

fn main() {
    let x: i32 = 10;
    println!("x: {x}");
    // x = 20;
    // println!("x: {x}");
}
  • Due to type inference the i32 is optional. We will gradually show the types less and less as the course progresses.

Type Inference

Rust will look at how the variable is used to determine the type:

fn takes_u32(x: u32) {
    println!("u32: {x}");
}

fn takes_i8(y: i8) {
    println!("i8: {y}");
}

fn main() {
    let x = 10;
    let y = 20;

    takes_u32(x);
    takes_i8(y);
    // takes_u32(y);
}

This slide demonstrates how the Rust compiler infers types based on constraints given by variable declarations and usages.

It is very important to emphasize that variables declared like this are not of some sort of dynamic “any type” that can hold any data. The machine code generated by such declaration is identical to the explicit declaration of a type. The compiler does the job for us and helps us write more concise code.

The following code tells the compiler to copy into a certain generic container without the code ever explicitly specifying the contained type, using _ as a placeholder:

fn main() {
    let mut v = Vec::new();
    v.push((10, false));
    v.push((20, true));
    println!("v: {v:?}");

    let vv = v.iter().collect::<std::collections::HashSet<_>>();
    println!("vv: {vv:?}");
}

collect relies on FromIterator, which HashSet implements.

Static and Constant Variables

Static and constant variables are two different ways to create globally-scoped values that cannot be moved or reallocated during the execution of the program.

const

Constant variables are evaluated at compile time and their values are inlined wherever they are used:

const DIGEST_SIZE: usize = 3;
const ZERO: Option<u8> = Some(42);

fn compute_digest(text: &str) -> [u8; DIGEST_SIZE] {
    let mut digest = [ZERO.unwrap_or(0); DIGEST_SIZE];
    for (idx, &b) in text.as_bytes().iter().enumerate() {
        digest[idx % DIGEST_SIZE] = digest[idx % DIGEST_SIZE].wrapping_add(b);
    }
    digest
}

fn main() {
    let digest = compute_digest("Hello");
    println!("digest: {digest:?}");
}

According to the Rust RFC Book these are inlined upon use.

Only functions marked const can be called at compile time to generate const values. const functions can however be called at runtime.
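A small sketch of a const function used both ways. The names kib and BUFFER_SIZE are made up for this example.

```rust
// Hypothetical const function: converts KiB to bytes.
const fn kib(n: usize) -> usize {
    n * 1024
}

// Evaluated at compile time, because a const needs a constant initializer.
const BUFFER_SIZE: usize = kib(4);

fn main() {
    // Array lengths must be compile-time constants.
    let buffer = [0u8; BUFFER_SIZE];
    println!("buffer holds {} bytes", buffer.len());

    // The same function can also be called with a runtime value.
    let arg_count = std::env::args().count();
    println!("runtime result: {}", kib(arg_count));
}
```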

static

Static variables will live during the whole execution of the program, and therefore will not move:

static BANNER: &str = "Welcome to RustOS 3.14";

fn main() {
    println!("{BANNER}");
}

As noted in the Rust RFC Book, these are not inlined upon use and have an actual associated memory location. This is useful for unsafe and embedded code, and the variable lives through the entirety of the program execution. When a globally-scoped value does not have a reason to need object identity, const is generally preferred.

Because static variables are accessible from any thread, they must be Sync. Interior mutability is possible through a Mutex, atomic or similar. It is also possible to have mutable statics, but they require manual synchronisation so any access to them requires unsafe code. We will look at mutable statics in the chapter on Unsafe Rust.

  • Mention that const behaves semantically similarly to C++’s constexpr.
  • static, on the other hand, is much more similar to a const or mutable global variable in C++.
  • static provides object identity: an address in memory and state as required by types with interior mutability such as Mutex<T>.
  • It isn’t super common that one would need a runtime evaluated constant, but it is helpful and safer than using a static.
  • thread_local data can be created with the macro std::thread_local.

Properties table:

| Property | Static | Constant |
|---|---|---|
| Has an address in memory | Yes | No (inlined) |
| Lives for the entire duration of the program | Yes | No |
| Can be mutable | Yes (unsafe) | No |
| Evaluated at compile time | Yes (initialised at compile time) | Yes |
| Inlined wherever it is used | No | Yes |
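The interior mutability mentioned above can be sketched with an atomic and a Mutex, neither of which needs unsafe code. The statics and the record_request function are hypothetical; using Mutex::new in a static initializer assumes Rust 1.63 or later.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Mutex;

// Hypothetical globals, for illustration only. Both types are Sync.
static REQUEST_COUNT: AtomicU32 = AtomicU32::new(0);
static LAST_USER: Mutex<String> = Mutex::new(String::new());

fn record_request(user: &str) -> u32 {
    // Lock the mutex, overwrite the stored name, then release the lock.
    *LAST_USER.lock().unwrap() = user.to_string();
    // fetch_add returns the previous value, so add 1 to get the new total.
    REQUEST_COUNT.fetch_add(1, Ordering::SeqCst) + 1
}

fn main() {
    println!("total: {}", record_request("alice"));
    println!("total: {}", record_request("bob"));
    println!("last user: {}", LAST_USER.lock().unwrap());
}
```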

Scopes and Shadowing

You can shadow variables, both those from outer scopes and variables from the same scope:

fn main() {
    let a = 10;
    println!("before: {a}");

    {
        let a = "hello";
        println!("inner scope: {a}");

        let a = true;
        println!("shadowed in inner scope: {a}");
    }

    println!("after: {a}");
}
  • Definition: Shadowing is different from mutation, because after shadowing both variables’ memory locations exist at the same time. Both are available under the same name, depending on where you use it in the code.
  • A shadowing variable can have a different type.
  • Shadowing looks obscure at first, but is convenient for holding on to values after .unwrap().
  • The following code demonstrates why the compiler can’t simply reuse memory locations when shadowing an immutable variable in a scope, even if the type does not change.
fn main() {
    let a = 1;
    let b = &a;
    let a = a + 1;
    println!("{a} {b}");
}
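The point about shadowing being convenient when unwrapping can be sketched like this. The parse_port helper and its fallback value of 8080 are assumptions for the example, and unwrap_or is used in place of unwrap so that bad input does not panic.

```rust
// Hypothetical helper: each `let port` shadows the previous binding,
// so one name follows the value as its type changes.
fn parse_port(input: &str) -> u16 {
    let port = input.trim();         // &str
    let port = port.parse::<u16>();  // Result<u16, ParseIntError>
    let port = port.unwrap_or(8080); // u16, falling back to a default
    port
}

fn main() {
    println!("parsed: {}", parse_port(" 443 "));
    println!("fallback: {}", parse_port("not a number"));
}
```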

Control Flow

As we have seen, if is an expression in Rust. It is used to conditionally evaluate one of two blocks, but the blocks can have a value which then becomes the value of the if expression. Other control flow expressions work similarly in Rust.

Blocks

A block in Rust contains a sequence of expressions. Each block has a value and a type, which are those of the last expression of the block:

fn main() {
    let x = {
        let y = 10;
        println!("y: {y}");
        let z = {
            let w = {
                3 + 4
            };
            println!("w: {w}");
            y * w
        };
        println!("z: {z}");
        z - y
    };
    println!("x: {x}");
}

If the last expression ends with ;, then the resulting value and type is ().

The same rule is used for functions: the value of the function body is the return value:

fn double(x: i32) -> i32 {
    x + x
}

fn main() {
    println!("double: {}", double(7));
}

Key Points:

  • The point of this slide is to show that blocks have a type and value in Rust.
  • You can show how the value of the block changes by changing the last line in the block. For instance, adding/removing a semicolon or using a return.

if expressions

You use if expressions exactly like if statements in other languages:

fn main() {
    let mut x = 10;
    if x % 2 == 0 {
        x = x / 2;
    } else {
        x = 3 * x + 1;
    }
}

In addition, you can use if as an expression. The last expression of each block becomes the value of the if expression:

fn main() {
    let mut x = 10;
    x = if x % 2 == 0 {
        x / 2
    } else {
        3 * x + 1
    };
}

Because if is an expression and must have a particular type, both of its branch blocks must have the same type. Consider showing what happens if you add ; after x / 2 in the second example.

for loops

The for loop is closely related to the while let loop. It will automatically call into_iter() on the expression and then iterate over it:

fn main() {
    let v = vec![10, 20, 30];

    for x in v {
        println!("x: {x}");
    }
    
    for i in (0..10).step_by(2) {
        println!("i: {i}");
    }
}

You can use break and continue here as usual.

  • Index iteration is not a special syntax in Rust for just that case.
  • (0..10) is a range that implements an Iterator trait.
  • step_by is a method that returns another Iterator which yields every n-th element; with a step of 2, as here, it skips every other element.
  • Modify the elements in the vector and explain the compiler errors. Change vector v to be mutable and the for loop to for x in v.iter_mut().
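One way the last exercise might end up, shown as a sketch. The function name double_all is hypothetical. Iterating with iter_mut() yields mutable references, so the vector is modified in place and remains usable afterwards.

```rust
// Hypothetical helper: doubles every element in place.
fn double_all(mut v: Vec<i32>) -> Vec<i32> {
    for x in v.iter_mut() {
        *x *= 2; // `x` is a `&mut i32`, so dereference to modify the element
    }
    v
}

fn main() {
    let v = double_all(vec![10, 20, 30]);
    // Borrowing with `&v` iterates without consuming the vector.
    for x in &v {
        println!("x: {x}");
    }
    println!("still usable: {v:?}");
}
```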

while loops

The while keyword works very similarly to other languages:

fn main() {
    let mut x = 10;
    while x != 1 {
        x = if x % 2 == 0 {
            x / 2
        } else {
            3 * x + 1
        };
    }
    println!("x: {x}");
}

break and continue

  • If you want to exit a loop early, use break.
  • If you want to immediately start the next iteration, use continue.

Both continue and break can optionally take a label argument which is used to break out of nested loops:

fn main() {
    let v = vec![10, 20, 30];
    let mut iter = v.into_iter();
    'outer: while let Some(x) = iter.next() {
        println!("x: {x}");
        let mut i = 0;
        while i < x {
            println!("x: {x}, i: {i}");
            i += 1;
            if i == 3 {
                break 'outer;
            }
        }
    }
}

In this case we break the outer loop after 3 iterations of the inner loop.

loop expressions

Finally, there is a loop keyword which creates an endless loop.

Here you must either break or return to stop the loop:

fn main() {
    let mut x = 10;
    loop {
        x = if x % 2 == 0 {
            x / 2
        } else {
            3 * x + 1
        };
        if x == 1 {
            break;
        }
    }
    println!("x: {x}");
}
  • Break the loop with a value (e.g. break 8) and print it out.
  • Note that loop is the only looping construct which returns a non-trivial value. This is because it’s guaranteed to be entered at least once (unlike while and for loops).
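Breaking out of a loop with a value, as suggested above, might look like the following sketch. The collatz_steps helper is hypothetical and assumes start is at least 1, since the sequence starting at 0 never reaches 1.

```rust
// Hypothetical helper: counts the steps until the sequence reaches 1.
// Assumes `start >= 1`.
fn collatz_steps(start: u64) -> u32 {
    let mut x = start;
    let mut steps = 0;
    // The loop is the last expression of the function, so the value
    // passed to `break` becomes the return value.
    loop {
        if x == 1 {
            break steps;
        }
        x = if x % 2 == 0 { x / 2 } else { 3 * x + 1 };
        steps += 1;
    }
}

fn main() {
    println!("steps from 10: {}", collatz_steps(10));
}
```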

Novel Control Flow

Rust has a few control flow constructs which differ from other languages. They are used for pattern matching:

  • if let expressions
  • while let expressions
  • match expressions

if let expressions

The if let expression lets you execute different code depending on whether a value matches a pattern:

fn main() {
    let arg = std::env::args().next();
    if let Some(value) = arg {
        println!("Program name: {value}");
    } else {
        println!("Missing name?");
    }
}

See pattern matching for more details on patterns in Rust.

  • Unlike match, if let does not have to cover all branches. This can make it more concise than match.

  • A common usage is handling Some values when working with Option.

  • Unlike match, if let does not support guard clauses for pattern matching.

  • Since 1.65, a similar let-else construct allows you to do a destructuring assignment or, if it fails, execute a block which is required to abort normal control flow (with panic/return/break/continue):

    fn main() {
        println!("{:?}", second_word_to_upper("foo bar"));
    }
     
    fn second_word_to_upper(s: &str) -> Option<String> {
        let mut it = s.split(' ');
        let (Some(_), Some(item)) = (it.next(), it.next()) else {
            return None;
        };
        Some(item.to_uppercase())
    }
    

while let loops

As with if let, there is a while let variant which repeatedly tests a value against a pattern:

fn main() {
    let v = vec![10, 20, 30];
    let mut iter = v.into_iter();

    while let Some(x) = iter.next() {
        println!("x: {x}");
    }
}

Here the iterator returned by v.into_iter() will return an Option<i32> on every call to next(). It returns Some(x) until it is done, after which it will return None. The while let lets us keep iterating through all items.

See pattern matching for more details on patterns in Rust.

  • Point out that the while let loop will keep going as long as the value matches the pattern.
  • You could rewrite the while let loop as an infinite loop with an if statement that breaks when there is no value to unwrap for iter.next(). The while let provides syntactic sugar for the above scenario.
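The rewrite described in the last note can be sketched side by side. Both function names are hypothetical; the second version spells out the check and the unwrap that while let performs implicitly.

```rust
// Version using `while let`.
fn sum_with_while_let(v: Vec<i32>) -> i32 {
    let mut iter = v.into_iter();
    let mut total = 0;
    while let Some(x) = iter.next() {
        total += x;
    }
    total
}

// Equivalent version using an endless loop with an explicit break.
fn sum_with_loop(v: Vec<i32>) -> i32 {
    let mut iter = v.into_iter();
    let mut total = 0;
    loop {
        let next = iter.next();
        if next.is_none() {
            break; // nothing left to unwrap
        }
        total += next.unwrap();
    }
    total
}

fn main() {
    println!("while let: {}", sum_with_while_let(vec![10, 20, 30]));
    println!("loop:      {}", sum_with_loop(vec![10, 20, 30]));
}
```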

match expressions

The match keyword is used to match a value against one or more patterns. In that sense, it works like a series of if let expressions:

fn main() {
    match std::env::args().next().as_deref() {
        Some("cat") => println!("Will do cat things"),
        Some("ls")  => println!("Will ls some files"),
        Some("mv")  => println!("Let's move some files"),
        Some("rm")  => println!("Uh, dangerous!"),
        None        => println!("Hmm, no program name?"),
        _           => println!("Unknown program name!"),
    }
}

Like if let, each match arm must have the same type. The type is that of the last expression of the block, if any. In the example above, the type is ().

See pattern matching for more details on patterns in Rust.

  • Save the match expression to a variable and print it out.
  • Remove .as_deref() and explain the error.
    • std::env::args().next() returns an Option<String>, but we cannot match against String.
    • as_deref() transforms an Option<T> to Option<&T::Target>. In our case, this turns Option<String> into Option<&str>.
    • We can now use pattern matching to match against the &str inside Option.
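A sketch of the first exercise above: here every arm evaluates to a string instead of printing, so the match as a whole has a value that can be stored. The describe function is hypothetical and covers only some of the program names from the slide.

```rust
// Hypothetical helper. Every arm has the same type, so the whole
// `match` is an expression of that type. String literals live for the
// entire program, hence the `'static` in the return type.
fn describe(program: Option<&str>) -> &'static str {
    match program {
        Some("cat") => "Will do cat things",
        Some("ls") => "Will ls some files",
        None => "Hmm, no program name?",
        _ => "Unknown program name!",
    }
}

fn main() {
    let arg: Option<String> = std::env::args().next();
    // as_deref() turns Option<String> into Option<&str> for matching.
    let message = describe(arg.as_deref());
    println!("{message}");
}
```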

Standard Library

Rust comes with a standard library which helps establish a set of common types used by Rust libraries and programs. This way, two libraries can work together smoothly because they both use the same String type.

The common vocabulary types include:

  • Option and Result types: used for optional values and error handling.

  • String: the default string type used for owned data.

  • Vec: a standard extensible vector.

  • HashMap: a hash map type with a configurable hashing algorithm.

  • Box: an owned pointer for heap-allocated data.

  • Rc: a shared reference-counted pointer for heap-allocated data.

  • In fact, Rust contains several layers of the Standard Library: core, alloc and std.
  • core includes the most basic types and functions that don’t depend on libc, allocator or even the presence of an operating system.
  • alloc includes types which require a global heap allocator, such as Vec, Box and Arc.
  • Embedded Rust applications often only use core, and sometimes alloc.

String

String is the standard heap-allocated growable UTF-8 string buffer:

fn main() {
    let mut s1 = String::new();
    s1.push_str("Hello");
    println!("s1: len = {}, capacity = {}", s1.len(), s1.capacity());

    let mut s2 = String::with_capacity(s1.len() + 1);
    s2.push_str(&s1);
    s2.push('!');
    println!("s2: len = {}, capacity = {}", s2.len(), s2.capacity());

    let s3 = String::from("🇹🇭");
    println!("s3: len = {}, number of chars = {}", s3.len(),
             s3.chars().count());
}

String implements Deref<Target = str>, which means that you can call all str methods on a String.

  • String::new returns a new empty string. Use String::with_capacity when you know how much data you want to push to the string.
  • String::len returns the size of the String in bytes (which can be different from its length in characters).
  • String::chars returns an iterator over the actual characters. Note that a char can be different from what a human will consider a “character” due to grapheme clusters.
  • When people refer to strings they could either be talking about &str or String.
  • When a type implements Deref<Target = T>, the compiler will let you transparently call methods from T.
    • String implements Deref<Target = str> which transparently gives it access to str’s methods.
    • Write and compare let s3 = s1.deref(); and let s3 = &*s1;.
  • String is implemented as a wrapper around a vector of bytes. Many of the operations you see supported on vectors are also supported on String, but with some extra guarantees.
  • Compare the different ways to index a String:
    • To a character by using s3.chars().nth(i).unwrap(), both where i is in bounds and where it is out of bounds.
    • To a substring by using s3[0..4], both where that slice falls on character boundaries and where it does not.
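The indexing comparison above can be tried without panics by using the Option-returning variants. This is a sketch with hypothetical helper names. It uses str::get in place of the [] operator, which would panic on a range that splits a character.

```rust
// Hypothetical helpers that return None instead of panicking.
fn nth_char(s: &str, i: usize) -> Option<char> {
    s.chars().nth(i) // counts chars, not bytes
}

fn byte_range(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end) // None if a bound is not on a char boundary
}

fn main() {
    // "\u{e9}" is the letter e with an acute accent: one char, two bytes in UTF-8.
    let s = String::from("h\u{e9}llo");
    println!("bytes: {}, chars: {}", s.len(), s.chars().count());
    println!("char 1:      {:?}", nth_char(&s, 1));
    println!("char 10:     {:?}", nth_char(&s, 10));
    println!("bytes 0..3:  {:?}", byte_range(&s, 0, 3));
    println!("bytes 0..2:  {:?}", byte_range(&s, 0, 2)); // splits the accented char
}
```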

Vec

Vec is the standard resizable heap-allocated buffer:

fn main() {
    let mut v1 = Vec::new();
    v1.push(42);
    println!("v1: len = {}, capacity = {}", v1.len(), v1.capacity());

    let mut v2 = Vec::with_capacity(v1.len() + 1);
    v2.extend(v1.iter());
    v2.push(9999);
    println!("v2: len = {}, capacity = {}", v2.len(), v2.capacity());

    // Canonical macro to initialize a vector with elements.
    let mut v3 = vec![0, 0, 1, 2, 3, 4];

    // Retain only the even elements.
    v3.retain(|x| x % 2 == 0);
    println!("{v3:?}");

    // Remove consecutive duplicates.
    v3.dedup();
    println!("{v3:?}");
}

Vec implements Deref<Target = [T]>, which means that you can call slice methods on a Vec.

  • Vec is a type of collection, along with String and HashMap. The data it contains is stored on the heap. This means the amount of data doesn’t need to be known at compile time. It can grow or shrink at runtime.
  • Notice how Vec<T> is a generic type too, but you don’t have to specify T explicitly. As always with Rust type inference, the T was established during the first push call.
  • vec![...] is a canonical macro to use instead of Vec::new() and it supports adding initial elements to the vector.
  • To index the vector you use [ ], but indexing will panic if the index is out of bounds. Alternatively, using get will return an Option. The pop function will remove the last element.
  • Show iterating over a vector and mutating the value: for e in &mut v { *e += 50; }
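The last two notes can be combined into one short sketch. The element_or_default helper and its default of 0 are assumptions for this example. Passing &v where a slice is expected works because Vec dereferences to a slice.

```rust
// Hypothetical helper: reads an element without risking a panic.
fn element_or_default(v: &[i32], index: usize) -> i32 {
    match v.get(index) {
        Some(value) => *value,
        None => 0, // out of bounds: fall back to a default
    }
}

fn main() {
    let mut v = vec![1, 2, 3];
    // Iterate by mutable reference and update each element in place.
    for e in &mut v {
        *e += 50;
    }
    println!("in bounds:     {}", element_or_default(&v, 1));
    println!("out of bounds: {}", element_or_default(&v, 10));
    println!("popped:        {:?}", v.pop());
    println!("remaining:     {v:?}");
}
```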

HashMap

Standard hash map with protection against HashDoS attacks:

use std::collections::HashMap;

fn main() {
    let mut page_counts = HashMap::new();
    page_counts.insert("Adventures of Huckleberry Finn".to_string(), 207);
    page_counts.insert("Grimms' Fairy Tales".to_string(), 751);
    page_counts.insert("Pride and Prejudice".to_string(), 303);

    if !page_counts.contains_key("Les Misérables") {
        println!("We know about {} books, but not Les Misérables.",
                 page_counts.len());
    }

    for book in ["Pride and Prejudice", "Alice's Adventure in Wonderland"] {
        match page_counts.get(book) {
            Some(count) => println!("{book}: {count} pages"),
            None => println!("{book} is unknown.")
        }
    }

    // Use the .entry() method to insert a value if nothing is found.
    for book in ["Pride and Prejudice", "Alice's Adventure in Wonderland"] {
        let page_count: &mut i32 = page_counts.entry(book.to_string()).or_insert(0);
        *page_count += 1;
    }

    println!("{page_counts:#?}");
}
  • HashMap is not defined in the prelude and needs to be brought into scope.

  • Try the following lines of code. The first line will see if a book is in the hashmap and if not return an alternative value. The second line will insert the alternative value in the hashmap if the book is not found.

      let pc1 = page_counts
          .get("Harry Potter and the Sorcerer's Stone")
          .unwrap_or(&336);
      let pc2 = page_counts
          .entry("The Hunger Games".to_string())
          .or_insert(374);
  • Unlike vec!, there is unfortunately no standard hashmap! macro.

    • Although, since Rust 1.56, HashMap implements From<[(K, V); N]>, which allows us to easily initialize a hash map from a literal array:

        let page_counts = HashMap::from([
          ("Harry Potter and the Sorcerer's Stone".to_string(), 336),
          ("The Hunger Games".to_string(), 374),
        ]);
  • Alternatively HashMap can be built from any Iterator which yields key-value tuples.

  • We are showing HashMap<String, i32>, and avoid using &str as key to make examples easier. Using references in collections can, of course, be done, but it can lead to complications with the borrow checker.

    • Try removing to_string() from the example above and see if it still compiles. Where do you think we might run into issues?
  • This type has several “method-specific” return types, such as std::collections::hash_map::Keys. These types often appear in searches of the Rust docs. Show students the docs for this type, and the helpful link back to the keys method.
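Building a HashMap from an iterator of key-value tuples, as mentioned above, can be sketched with collect(). The word_lengths function is a made-up example.

```rust
use std::collections::HashMap;

// Hypothetical helper: maps each word to its length in bytes.
fn word_lengths(words: &[&str]) -> HashMap<String, usize> {
    words
        .iter()
        .map(|w| (w.to_string(), w.len())) // yields (key, value) tuples
        .collect() // the target type is inferred from the return type
}

fn main() {
    let lengths = word_lengths(&["Rust", "ownership"]);
    println!("{lengths:#?}");
}
```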

Box

Box is an owned pointer to data on the heap:

fn main() {
    let five = Box::new(5);
    println!("five: {}", *five);
}
[Diagram: five on the stack points to the value 5 on the heap.]

Box<T> implements Deref<Target = T>, which means that you can call methods from T directly on a Box<T>.

  • Box is like std::unique_ptr in C++, except that it’s guaranteed to be not null.
  • In the above example, you can even leave out the * in the println! statement thanks to Deref.
  • A Box can be useful when you:
    • have a type whose size can’t be known at compile time, but the Rust compiler wants to know an exact size.
    • want to transfer ownership of a large amount of data. To avoid copying large amounts of data on the stack, instead store the data on the heap in a Box so only the pointer is moved.

Box with Recursive Data Structures

Recursive data types or data types with dynamic sizes need to use a Box:

#[derive(Debug)]
enum List<T> {
    Cons(T, Box<List<T>>),
    Nil,
}

fn main() {
    let list: List<i32> = List::Cons(1, Box::new(List::Cons(2, Box::new(List::Nil))));
    println!("{list:?}");
}
[Diagram: list on the stack holds Cons(1, pointer); the heap holds Cons(2, pointer) and Nil.]
  • If Box was not used and we attempted to embed a List directly into the List, the compiler could not compute a fixed size for the type in memory (List would be of infinite size).

  • Box solves this problem as it has the same size as a regular pointer and just points at the next element of the List in the heap.

  • Remove the Box in the List definition and show the compiler error. “Recursive with indirection” is a hint you might want to use a Box or reference of some kind, instead of storing a value directly.
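To make the role of the Box more concrete, the sketch below walks the list recursively and prints the size of the pointer. The sum function is hypothetical and written for List<i32> only.

```rust
#[derive(Debug)]
enum List<T> {
    Cons(T, Box<List<T>>),
    Nil,
}

// Hypothetical helper: follows each Box to the next element.
fn sum(list: &List<i32>) -> i32 {
    match list {
        // `rest` is a `&Box<List<i32>>`, which coerces to `&List<i32>`.
        List::Cons(value, rest) => *value + sum(rest),
        List::Nil => 0,
    }
}

fn main() {
    let list = List::Cons(1, Box::new(List::Cons(2, Box::new(List::Nil))));
    println!("{list:?} sums to {}", sum(&list));
    // The Box itself is pointer-sized, which gives List a finite size.
    println!(
        "Box size: {} bytes, pointer size: {} bytes",
        std::mem::size_of::<Box<List<i32>>>(),
        std::mem::size_of::<usize>()
    );
}
```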

Memory Management

Traditionally, languages have fallen into two broad categories:

  • Full control via manual memory management: C, C++, Pascal, ...

  • Full safety via automatic memory management at runtime: Java, Python, Go, Haskell, ...


Rust offers a new mix:

Full control and safety via compile time enforcement of correct memory management.

It does this with an explicit ownership concept.

First, let’s refresh how memory management works.

The Stack vs The Heap

  • Stack: Continuous area of memory for local variables.

    • Values have fixed sizes known at compile time.
    • Extremely fast: just move a stack pointer.
    • Easy to manage: follows function calls.
    • Great memory locality.
  • Heap: Storage of values outside of function calls.

    • Values have dynamic sizes determined at runtime.
    • Slightly slower than the stack: some book-keeping needed.
    • No guarantee of memory locality.

Stack and Heap Example

Creating a String puts fixed-sized metadata on the stack and dynamically sized data, the actual string, on the heap:

fn main() {
    let s1 = String::from("Hello");
}
[Diagram: s1 on the stack holds ptr, len 5 and capacity 5; ptr points to “Hello” on the heap.]
  • Mention that a String is backed by a Vec, so it has a capacity and length and can grow if mutable via reallocation on the heap.

  • If students ask about it, you can mention that the underlying memory is heap allocated using the System Allocator and custom allocators can be implemented using the Allocator API.

More to Explore

We can inspect the memory layout with unsafe Rust. However, you should point out that this is rightfully unsafe!

fn main() {
    let mut s1 = String::from("Hello");
    s1.push(' ');
    s1.push_str("world");
    // DON'T DO THIS AT HOME! For educational purposes only.
    // String provides no guarantees about its layout, so this could lead to
    // undefined behavior.
    unsafe {
        let (ptr, capacity, len): (usize, usize, usize) = std::mem::transmute(s1);
        println!("ptr = {ptr:#x}, len = {len}, capacity = {capacity}");
    }
}

Manual Memory Management

You allocate and deallocate heap memory yourself.

If not done with care, this can lead to crashes, bugs, security vulnerabilities, and memory leaks.

C Example

You must call free on every pointer you allocate with malloc:

void foo(size_t n) {
    int* int_array = malloc(n * sizeof(int));
    //
    // ... lots of code
    //
    free(int_array);
}

Memory is leaked if the function returns early between malloc and free: the pointer is lost and we cannot deallocate the memory. Worse, freeing the pointer twice, or accessing a freed pointer can lead to exploitable security vulnerabilities.

Scope-Based Memory Management

Constructors and destructors let you hook into the lifetime of an object.

By wrapping a pointer in an object, you can free memory when the object is destroyed. The compiler guarantees that this happens, even if an exception is raised.

This is often called resource acquisition is initialization (RAII) and gives you smart pointers.

C++ Example

void say_hello(std::unique_ptr<Person> person) {
  std::cout << "Hello " << person->name << std::endl;
}
  • The std::unique_ptr object is allocated on the stack, and points to memory allocated on the heap.
  • At the end of say_hello, the std::unique_ptr destructor will run.
  • The destructor frees the Person object it points to.

Special move constructors are used when passing ownership to a function:

std::unique_ptr<Person> person = find_person("Carla");
say_hello(std::move(person));

Automatic Memory Management

An alternative to manual and scope-based memory management is automatic memory management:

  • The programmer never allocates or deallocates memory explicitly.
  • A garbage collector finds unused memory and deallocates it for the programmer.

Java Example

The person object is not deallocated after sayHello returns:

void sayHello(Person person) {
  System.out.println("Hello " + person.getName());
}

Memory Management in Rust

Memory management in Rust is a mix:

  • Safe and correct like Java, but without a garbage collector.
  • Scope-based like C++, but the compiler enforces full adherence.
  • A Rust user can choose the right abstraction for the situation. Some even have no cost at runtime, like C.

Rust achieves this by modeling ownership explicitly.

  • If asked how at this point, you can mention that in Rust this is usually handled by RAII wrapper types such as Box, Vec, Rc, or Arc. These encapsulate ownership and memory allocation via various means, and prevent the potential errors in C.

  • You may be asked about destructors here. The Drop trait is the Rust equivalent.
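If the Drop question comes up, a small sketch such as the following may help. The Resource type, the DROPS counter and use_resource are all hypothetical; the counter is there only to make the drop observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter recording how many resources have been released.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Resource {
    name: String,
}

// Drop::drop runs automatically when the owning variable goes out of scope.
impl Drop for Resource {
    fn drop(&mut self) {
        println!("releasing {}", self.name);
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

fn use_resource() {
    let r = Resource { name: String::from("temp file") };
    println!("using {}", r.name);
} // `r` goes out of scope here, so `drop` is called

fn main() {
    use_resource();
    println!("resources released: {}", DROPS.load(Ordering::SeqCst));
}
```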

Ownership

All variable bindings have a scope where they are valid and it is an error to use a variable outside its scope:

struct Point(i32, i32);

fn main() {
    {
        let p = Point(3, 4);
        println!("x: {}", p.0);
    }
    println!("y: {}", p.1);
}
  • At the end of the scope, the variable is dropped and the data is freed.
  • A destructor can run here to free up resources.
  • We say that the variable owns the value.

Move Semantics

An assignment will transfer ownership between variables:

fn main() {
    let s1: String = String::from("Hello!");
    let s2: String = s1;
    println!("s2: {s2}");
    // println!("s1: {s1}");
}
  • The assignment of s1 to s2 transfers ownership.
  • When s1 goes out of scope, nothing happens: it does not own anything.
  • When s2 goes out of scope, the string data is freed.
  • There is always exactly one variable binding which owns a value.
  • Mention that this is the opposite of the defaults in C++, which copies by value unless you use std::move (and the move constructor is defined!).

  • It is only the ownership that moves. Whether any machine code is generated to manipulate the data itself is a matter of optimization, and such copies are aggressively optimized away.

  • Simple values (such as integers) can be marked Copy (see later slides).

  • In Rust, clones are explicit (by using clone).
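As a contrast to the move shown above, an explicit clone leaves both variables usable, each owning its own data. The clone_is_independent helper is hypothetical and only checks that changing the clone does not affect the original.

```rust
// Hypothetical helper: returns true if modifying the clone leaves the
// original untouched.
fn clone_is_independent(text: &str) -> bool {
    let s1 = String::from(text);
    let mut s2 = s1.clone(); // explicit deep copy; s1 keeps ownership of its data
    s2.push_str(" (copy)");
    s1 == text && s2 != s1
}

fn main() {
    let s1 = String::from("Hello!");
    let s2 = s1.clone();
    // Both variables are valid: nothing was moved.
    println!("s1: {s1}, s2: {s2}");
    println!("independent: {}", clone_is_independent("Hello!"));
}
```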

Moved Strings in Rust

fn main() {
    let s1: String = String::from("Rust");
    let s2: String = s1;
}
  • The heap data from s1 is reused for s2.
  • When s1 goes out of scope, nothing happens (it has been moved from).

Before move to s2:

[Diagram: s1 on the stack holds ptr, len 4 and capacity 4; ptr points to “Rust” on the heap.]

After move to s2:

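[Diagram: s2 on the stack holds ptr, len 4 and capacity 4, and its ptr points to “Rust” on the heap; s1 (ptr, len 4, capacity 4) is still on the stack but inaccessible.]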

Defensive Copies in Modern C++

Modern C++ solves this differently:

std::string s1 = "Cpp";
std::string s2 = s1;  // Duplicate the data in s1.
  • The heap data from s1 is duplicated and s2 gets its own independent copy.
  • When s1 and s2 go out of scope, they each free their own memory.

Before copy-assignment:

[Diagram: on the stack, s1 holds ptr, len 3 and capacity 3; ptr points to the heap data "Cpp".]

After copy-assignment:

[Diagram: on the stack, s1 and s2 each hold ptr, len 3 and capacity 3; each ptr points to its own heap copy of "Cpp".]

Key points:

  • C++ has made a slightly different choice than Rust. Because = copies data, the string data has to be cloned. Otherwise we would get a double-free when either string goes out of scope.

  • C++ also has std::move, which is used to indicate when a value may be moved from. If the example had been s2 = std::move(s1), no heap allocation would take place. After the move, s1 would be in a valid but unspecified state. Unlike Rust, the programmer is allowed to keep using s1.

  • Unlike Rust, = in C++ can run arbitrary code as determined by the type which is being copied or moved.

Moves in Function Calls

When you pass a value to a function, the value is assigned to the function parameter. This transfers ownership:

fn say_hello(name: String) {
    println!("Hello {name}")
}

fn main() {
    let name = String::from("Alice");
    say_hello(name);
    // say_hello(name);
}
  • With the first call to say_hello, main gives up ownership of name. Afterwards, name cannot be used anymore within main.
  • The heap memory allocated for name will be freed at the end of the say_hello function.
  • main can retain ownership if it passes name as a reference (&name) and if say_hello accepts a reference as a parameter.
  • Alternatively, main can pass a clone of name in the first call (name.clone()).
  • Rust makes it harder than C++ to inadvertently create copies by making move semantics the default, and by forcing programmers to make clones explicit.
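The borrowing alternative mentioned in the notes can be sketched as follows. The function name `greeting` and the choice of `&str` as the parameter type are assumptions made for this illustration:

```rust
// Borrowing variant: the function only needs to read the name.
fn greeting(name: &str) -> String {
    format!("Hello {name}")
}

fn main() {
    let name = String::from("Alice");
    // `&name` lends the string; `main` keeps ownership, so both calls compile.
    println!("{}", greeting(&name));
    println!("{}", greeting(&name));
}
```

Passing `name.clone()` to the original `say_hello` would also compile, at the cost of one extra heap allocation per call.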

Copying and Cloning

While move semantics are the default, certain types are copied by default:

fn main() {
    let x = 42;
    let y = x;
    println!("x: {x}");
    println!("y: {y}");
}

These types implement the Copy trait.

You can opt-in your own types to use copy semantics:

#[derive(Copy, Clone, Debug)]
struct Point(i32, i32);

fn main() {
    let p1 = Point(3, 4);
    let p2 = p1;
    println!("p1: {p1:?}");
    println!("p2: {p2:?}");
}
  • After the assignment, both p1 and p2 own their own data.
  • We can also use p1.clone() to explicitly copy the data.

Copying and cloning are not the same thing:

  • Copying refers to bitwise copies of memory regions and does not work on arbitrary objects.
  • Copying does not allow for custom logic (unlike copy constructors in C++).
  • Cloning is a more general operation and also allows for custom behavior by implementing the Clone trait.
  • Copying does not work on types that implement the Drop trait.

In the above example, try the following:

  • Add a String field to struct Point. It will not compile because String is not a Copy type.
  • Remove Copy from the derive attribute. The compiler error is now in the println! for p1.
  • Show that it works if you clone p1 instead.

If students ask about derive, it is sufficient to say that this is a way to generate code in Rust at compile time. In this case the default implementations of Copy and Clone traits are generated.
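To illustrate that cloning allows custom logic, here is a small sketch with a hand-written `Clone` implementation. The type `Tracked` and its fields are hypothetical and exist only for this example:

```rust
// Hypothetical type: records how many clone steps separate it from the original.
#[derive(Debug)]
struct Tracked {
    label: String,
    clones: u32,
}

impl Clone for Tracked {
    // Custom logic: a derived `Clone` would copy `clones` unchanged.
    fn clone(&self) -> Self {
        Tracked { label: self.label.clone(), clones: self.clones + 1 }
    }
}

fn main() {
    let original = Tracked { label: String::from("a"), clones: 0 };
    let copy = original.clone();
    println!("{original:?} -> {copy:?}");
}
```

`Tracked` cannot be `Copy`: it contains a `String`, and a bitwise copy could not run the counter update either.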

Borrowing

Instead of transferring ownership when calling a function, you can let a function borrow the value:

#[derive(Debug)]
struct Point(i32, i32);

fn add(p1: &Point, p2: &Point) -> Point {
    Point(p1.0 + p2.0, p1.1 + p2.1)
}

fn main() {
    let p1 = Point(3, 4);
    let p2 = Point(10, 20);
    let p3 = add(&p1, &p2);
    println!("{p1:?} + {p2:?} = {p3:?}");
}
  • The add function borrows two points and returns a new point.
  • The caller retains ownership of the inputs.

Notes on stack returns:

  • Demonstrate that the return from add is cheap because the compiler can eliminate the copy operation. Change the above code to print stack addresses and run it on the Playground or look at the assembly in Godbolt. In the “DEBUG” optimization level, the addresses should change, while they stay the same when changing to the “RELEASE” setting:

    #[derive(Debug)]
    struct Point(i32, i32);
    
    fn add(p1: &Point, p2: &Point) -> Point {
        let p = Point(p1.0 + p2.0, p1.1 + p2.1);
        println!("&p.0: {:p}", &p.0);
        p
    }
    
    pub fn main() {
        let p1 = Point(3, 4);
        let p2 = Point(10, 20);
        let p3 = add(&p1, &p2);
        println!("&p3.0: {:p}", &p3.0);
        println!("{p1:?} + {p2:?} = {p3:?}");
    }
  • The Rust compiler can do return value optimization (RVO).

  • In C++, copy elision has to be defined in the language specification because constructors can have side effects. In Rust, this is not an issue at all. If RVO does not happen, Rust will always perform a simple and efficient memcpy copy.

Shared and Unique Borrows

Rust puts constraints on the ways you can borrow values:

  • You can have one or more &T values at any given time, or
  • You can have exactly one &mut T value.
fn main() {
    let mut a: i32 = 10;
    let b: &i32 = &a;

    {
        let c: &mut i32 = &mut a;
        *c = 20;
    }

    println!("a: {a}");
    println!("b: {b}");
}
  • The above code does not compile because a is borrowed as mutable (through c) and as immutable (through b) at the same time.
  • Move the println! statement for b before the scope that introduces c to make the code compile.
  • After that change, the compiler realizes that b is only ever used before the new mutable borrow of a through c. This is a feature of the borrow checker called “non-lexical lifetimes”.
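One possible version of the fixed code, wrapped in a hypothetical function `read_then_write` so that the observed values can be returned:

```rust
// Returns what `b` saw before the mutation, and the final value of `a`.
fn read_then_write() -> (i32, i32) {
    let mut a: i32 = 10;
    let b: &i32 = &a;
    let seen = *b; // Last use of `b`: the shared borrow ends here.

    {
        let c: &mut i32 = &mut a; // Now a unique borrow is allowed.
        *c = 20;
    }

    (seen, a)
}

fn main() {
    let (seen, a) = read_then_write();
    println!("b saw: {seen}");
    println!("a: {a}");
}
```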

Lifetimes

A borrowed value has a lifetime:

  • The lifetime can be implicit: add(p1: &Point, p2: &Point) -> Point.
  • Lifetimes can also be explicit: &'a Point, &'document str.
  • Read &'a Point as “a borrowed Point which is valid for at least the lifetime a”.
  • Lifetimes are always inferred by the compiler: you cannot assign a lifetime yourself.
    • Lifetime annotations create constraints; the compiler verifies that there is a valid solution.
  • Lifetimes for function arguments and return values must be fully specified, but Rust allows lifetimes to be elided in most cases with a few simple rules.
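A short sketch of lifetime elision. Both functions below are hypothetical examples with the same behavior; the second one spells out the lifetime that the compiler fills in for the first:

```rust
// Elided form: the compiler ties the output lifetime to the single input.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

// The same signature with the lifetime written out.
fn first_word_explicit<'a>(text: &'a str) -> &'a str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    let sentence = String::from("hello world");
    println!("{}", first_word(&sentence));
    println!("{}", first_word_explicit(&sentence));
}
```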

Lifetimes in Function Calls

In addition to borrowing its arguments, a function can return a borrowed value:

#[derive(Debug)]
struct Point(i32, i32);

fn left_most<'a>(p1: &'a Point, p2: &'a Point) -> &'a Point {
    if p1.0 < p2.0 { p1 } else { p2 }
}

fn main() {
    let p1: Point = Point(10, 10);
    let p2: Point = Point(20, 20);
    let p3: &Point = left_most(&p1, &p2);
    println!("p3: {p3:?}");
}
  • 'a is a generic parameter; it is inferred by the compiler.
  • Lifetimes start with ' and 'a is a typical default name.
  • Read &'a Point as “a borrowed Point which is valid for at least the lifetime a”.
    • The at least part is important when parameters are in different scopes.

In the above example, try the following:

  • Move the declaration of p2 and p3 into a new scope ({ ... }), resulting in the following code:

    #[derive(Debug)]
    struct Point(i32, i32);
    
    fn left_most<'a>(p1: &'a Point, p2: &'a Point) -> &'a Point {
        if p1.0 < p2.0 { p1 } else { p2 }
    }
    
    fn main() {
        let p1: Point = Point(10, 10);
        let p3: &Point;
        {
            let p2: Point = Point(20, 20);
            p3 = left_most(&p1, &p2);
        }
        println!("p3: {p3:?}");
    }

    Note how this does not compile since p3 outlives p2.

  • Reset the workspace and change the function signature to fn left_most<'a, 'b>(p1: &'a Point, p2: &'a Point) -> &'b Point. This will not compile because the relationship between the lifetimes 'a and 'b is unclear.

  • Another way to explain it:

    • Two references to two values are borrowed by a function and the function returns another reference.
    • It must have come from one of those two inputs (or from a global variable).
    • Which one is it? The compiler needs to know, so that at the call site it can check that the returned reference is not used for longer than the variable it came from.
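For contrast, here is a sketch of a signature where two lifetime parameters do compile. The function `first` is hypothetical: it always returns its first argument, so the result only needs to be tied to `'a`:

```rust
#[derive(Debug)]
struct Point(i32, i32);

// Hypothetical variant: the result is tied to `p1` only, so `p2` gets its
// own, unrelated lifetime `'b`.
fn first<'a, 'b>(p1: &'a Point, _p2: &'b Point) -> &'a Point {
    p1
}

fn main() {
    let p1 = Point(10, 10);
    let p3: &Point;
    {
        let p2 = Point(20, 20);
        p3 = first(&p1, &p2);
    }
    // Compiles: `p3` borrows from `p1`, which is still alive here.
    println!("p3: {p3:?} (x = {}, y = {})", p3.0, p3.1);
}
```

Because the signature states that the result never borrows from `p2`, the inner scope no longer limits how long `p3` may be used.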

Lifetimes in Data Structures

If a data type stores borrowed data, it must be annotated with a lifetime:

#[derive(Debug)]
struct Highlight<'doc>(&'doc str);

fn erase(text: String) {
    println!("Bye {text}!");
}

fn main() {
    let text = String::from("The quick brown fox jumps over the lazy dog.");
    let fox = Highlight(&text[4..19]);
    let dog = Highlight(&text[35..43]);
    // erase(text);
    println!("{fox:?}");
    println!("{dog:?}");
}
  • In the above example, the annotation on Highlight enforces that the data underlying the contained &str lives at least as long as any instance of Highlight that uses that data.
  • If text is consumed before the end of the lifetime of fox (or dog), the borrow checker reports an error.
  • Types with borrowed data force users to hold on to the original data. This can be useful for creating lightweight views, but it generally makes them somewhat harder to use.
  • When possible, make data structures own their data directly.
  • Some structs with multiple references inside can have more than one lifetime annotation. This can be necessary if there is a need to describe lifetime relationships between the references themselves, in addition to the lifetime of the struct itself. Those are very advanced use cases.
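A sketch of the owning alternative. `OwnedHighlight` and `highlight_fox` are hypothetical names introduced for this illustration; the byte range is the one used for `fox` above:

```rust
// Hypothetical owning counterpart to `Highlight<'doc>`: it stores a `String`.
#[derive(Debug)]
struct OwnedHighlight(String);

fn erase(text: String) {
    println!("Bye {text}!");
}

fn highlight_fox(text: &str) -> OwnedHighlight {
    // Copy the slice into a new heap allocation owned by the struct.
    OwnedHighlight(text[4..19].to_string())
}

fn main() {
    let text = String::from("The quick brown fox jumps over the lazy dog.");
    let fox = highlight_fox(&text);
    erase(text); // Fine: `fox` no longer borrows from `text`.
    println!("{fox:?}");
}
```

The trade-off is an extra allocation and copy per highlight in exchange for a type without a lifetime parameter.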

Modules

We have seen how impl blocks let us namespace functions to a type.

Similarly, mod lets us namespace types and functions:

mod foo {
    pub fn do_something() {
        println!("In the foo module");
    }
}

mod bar {
    pub fn do_something() {
        println!("In the bar module");
    }
}

fn main() {
    foo::do_something();
    bar::do_something();
}
  • Packages provide functionality and include a Cargo.toml file that describes how to build a bundle of 1+ crates.
  • Crates are a tree of modules, where a binary crate creates an executable and a library crate compiles to a library.
  • Modules define organization, scope, and are the focus of this section.

Visibility

Modules are a privacy boundary:

  • Module items are private by default (hides implementation details).
  • Parent and sibling items are always visible.
  • In other words, if an item is visible in module foo, it’s visible in all the descendants of foo.
mod outer {
    fn private() {
        println!("outer::private");
    }

    pub fn public() {
        println!("outer::public");
    }

    mod inner {
        fn private() {
            println!("outer::inner::private");
        }

        pub fn public() {
            println!("outer::inner::public");
            super::private();
        }
    }
}

fn main() {
    outer::public();
}
  • Use the pub keyword to make modules public.

Additionally, there are advanced pub(...) specifiers to restrict the scope of public visibility.

  • See the Rust Reference.
  • Configuring pub(crate) visibility is a common pattern.
  • Less commonly, you can give visibility to a specific path.
  • In any case, visibility must be granted to an ancestor module (and all of its descendants).
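The restricted visibility specifiers can be sketched like this. The module layout and function names are made up for the example:

```rust
mod outer {
    pub mod inner {
        // Visible only inside `outer` (the parent of `inner`).
        pub(super) fn for_parent() -> &'static str {
            "outer::inner::for_parent"
        }

        // Visible anywhere in the current crate, but not to other crates.
        pub(crate) fn for_crate() -> &'static str {
            "outer::inner::for_crate"
        }
    }

    pub fn call_inner() -> &'static str {
        inner::for_parent()
    }
}

fn main() {
    println!("{}", outer::call_inner());
    println!("{}", outer::inner::for_crate());
    // outer::inner::for_parent(); // Error: private outside of `outer`.
}
```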

Paths

Paths are resolved as follows:

  1. As a relative path:

    • foo or self::foo refers to foo in the current module,
    • super::foo refers to foo in the parent module.
  2. As an absolute path:

    • crate::foo refers to foo in the root of the current crate,
    • bar::foo refers to foo in the bar crate.

A module can bring symbols from another module into scope with use. You will typically see something like this at the top of each module:

use std::collections::HashSet;
use std::mem::transmute;
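A small sketch of relative paths and `use` inside a single file. The `garden` module and its functions are hypothetical and only serve to show `self::` and `super::` in action:

```rust
use std::collections::HashSet;

mod garden {
    pub fn name() -> &'static str {
        "garden"
    }

    pub mod vegetables {
        pub fn describe() -> String {
            // `super::name` is `name` in the parent module; `self::kind`
            // is `kind` in the current module.
            format!("{} in the {}", self::kind(), super::name())
        }

        fn kind() -> &'static str {
            "carrot"
        }
    }
}

fn main() {
    // `HashSet` was brought into scope by the `use` line at the top.
    let mut seen = HashSet::new();
    seen.insert(garden::vegetables::describe());
    println!("{seen:?}");
}
```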

Filesystem Hierarchy

Omitting the module content will tell Rust to look for it in another file:

mod garden;

This tells Rust that the garden module content is found at src/garden.rs. Similarly, a garden::vegetables module can be found at src/garden/vegetables.rs.

The crate root is in:

  • src/lib.rs (for a library crate)
  • src/main.rs (for a binary crate)

Modules defined in files can be documented, too, using “inner doc comments”. These document the item that contains them – in this case, a module.

//! This module implements the garden, including a highly performant germination
//! implementation.

// Re-export types from this module.
pub use seeds::SeedPacket;
pub use garden::Garden;

/// Sow the given seed packets.
pub fn sow(seeds: Vec<SeedPacket>) { todo!() }

/// Harvest the produce in the garden that is ready.
pub fn harvest(garden: &mut Garden) { todo!() }
  • Before Rust 2018, modules needed to be located at module/mod.rs instead of module.rs, and this is still a working alternative for editions after 2018.

  • The main reason to introduce filename.rs as an alternative to filename/mod.rs was that many files named mod.rs can be hard to distinguish in IDEs.

  • Deeper nesting can use folders, even if the main module is a file:

    src/
    ├── main.rs
    ├── top_module.rs
    └── top_module/
        └── sub_module.rs
    
  • The place Rust will look for modules can be changed with a compiler directive:

    #[path = "some/path.rs"]
    mod some_module;

    This is useful, for example, if you would like to place tests for a module in a file named some_module_test.rs, similar to the convention in Go.

Day 1 Exercises

In these exercises, we will explore

  • Arrays and for loops.

  • The Luhn algorithm.

After looking at the exercises, you can look at the solutions provided.

Arrays and for Loops

We saw that an array can be declared like this:

#![allow(unused)]
fn main() {
let array = [10, 20, 30];
}

You can print such an array by asking for its debug representation with {:?}:

fn main() {
    let array = [10, 20, 30];
    println!("array: {array:?}");
}

Rust lets you iterate over things like arrays and ranges using the for keyword:

fn main() {
    let array = [10, 20, 30];
    print!("Iterating over array:");
    for n in &array {
        print!(" {n}");
    }
    println!();

    print!("Iterating over range:");
    for i in 0..3 {
        print!(" {}", array[i]);
    }
    println!();
}

Use the above to write a function pretty_print which pretty-prints a matrix and a function transpose which will transpose a matrix (turn rows into columns):

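$$\mathrm{transpose}\left(\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}\right) = \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix}$$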

Hard-code both functions to operate on 3 × 3 matrices.

Copy the code below to https://play.rust-lang.org/ and implement the functions:

// TODO: remove this when you're done with your implementation.
#![allow(unused_variables, dead_code)]

fn transpose(matrix: [[i32; 3]; 3]) -> [[i32; 3]; 3] {
    unimplemented!()
}

fn pretty_print(matrix: &[[i32; 3]; 3]) {
    unimplemented!()
}

fn main() {
    let matrix = [
        [101, 102, 103], // <-- the comment makes rustfmt add a newline
        [201, 202, 203],
        [301, 302, 303],
    ];

    println!("matrix:");
    pretty_print(&matrix);

    let transposed = transpose(matrix);
    println!("transposed:");
    pretty_print(&transposed);
}

Bonus Question

Could you use &[i32] slices instead of hard-coded 3 × 3 matrices for your argument and return types? Something like &[&[i32]] for a two-dimensional slice-of-slices. Why or why not?

See the ndarray crate for a production quality implementation.

The solution and the answer to the bonus section are available in the Solution section.

The use of the reference &array within for n in &array is a subtle preview of issues of ownership that will come later in the afternoon.

Without the &:

  • The loop would have been one that consumes the array. This is a change introduced in the 2021 Edition.
  • An implicit array copy would have occurred. Since i32 is a Copy type, [i32; 3] is also a Copy type.
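The difference between the two loop forms can be sketched with two hypothetical helper functions, one taking the array by value and one by reference:

```rust
fn sum_by_value(array: [i32; 3]) -> i32 {
    let mut total = 0;
    // Iterates over the elements by value: `n` is an `i32`.
    for n in array {
        total += n;
    }
    total
}

fn sum_by_reference(array: &[i32; 3]) -> i32 {
    let mut total = 0;
    // `n` is a `&i32` here; `+=` accepts a reference on the right-hand side.
    for n in array {
        total += n;
    }
    total
}

fn main() {
    let array = [10, 20, 30];
    println!("by value: {}", sum_by_value(array));
    // `array` is still usable: `[i32; 3]` is `Copy`, so the call above copied it.
    println!("by reference: {}", sum_by_reference(&array));
}
```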

Luhn Algorithm

The Luhn algorithm is used to validate credit card numbers. The algorithm takes a string as input and does the following to validate the credit card number:

  • Ignore all spaces. Reject numbers with fewer than two digits.

  • Moving from right to left, double every second digit: for the number 1234, we double 3 and 1. For the number 98765, we double 6 and 8.

  • After doubling a digit, sum the digits if the result is greater than 9. So doubling 7 becomes 14 which becomes 1 + 4 = 5.

  • Sum all the undoubled and doubled digits.

  • The credit card number is valid if the sum ends with 0.
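As a warm-up, the doubling rule on its own can be sketched as below. The helper `double_digit` is a hypothetical name and covers only that one step, not the full validation:

```rust
// Hypothetical helper covering only the doubling step, not the full check.
fn double_digit(digit: u32) -> u32 {
    let doubled = digit * 2;
    if doubled > 9 {
        // Sum the two digits of the result, e.g. 14 -> 1 + 4 = 5.
        doubled / 10 + doubled % 10
    } else {
        doubled
    }
}

fn main() {
    for digit in 0..10 {
        println!("{digit} -> {}", double_digit(digit));
    }
}
```

Which digits get doubled, and how invalid input is rejected, is left to the exercise.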

Copy the code below to https://play.rust-lang.org/ and implement the function.

Try to solve the problem the “simple” way first, using for loops and integers. Then, revisit the solution and try to implement it with iterators.

// TODO: remove this when you're done with your implementation.
#![allow(unused_variables, dead_code)]

pub fn luhn(cc_number: &str) -> bool {
    unimplemented!()
}

#[test]
fn test_non_digit_cc_number() {
    assert!(!luhn("foo"));
    assert!(!luhn("foo 0 0"));
}

#[test]
fn test_empty_cc_number() {
    assert!(!luhn(""));
    assert!(!luhn(" "));
    assert!(!luhn("  "));
    assert!(!luhn("    "));
}

#[test]
fn test_single_digit_cc_number() {
    assert!(!luhn("0"));
}

#[test]
fn test_two_digit_cc_number() {
    assert!(luhn(" 0 0 "));
}

#[test]
fn test_valid_cc_number() {
    assert!(luhn("4263 9826 4026 9299"));
    assert!(luhn("4539 3195 0343 6467"));
    assert!(luhn("7992 7398 713"));
}

#[test]
fn test_invalid_cc_number() {
    assert!(!luhn("4223 9826 4026 9299"));
    assert!(!luhn("4539 3195 0343 6476"));
    assert!(!luhn("8273 1232 7352 0569"));
}

#[allow(dead_code)]
fn main() {}

Welcome to Day 2

Now that we have seen a fair amount of Rust, we will continue with:

  • Structs and methods.

  • Enums and pattern matching.

  • Details of error handling.

  • Generic types and traits.

Finally, we will take a very quick look at “unsafe” Rust.

Structs

Like C and C++, Rust has support for custom structs:

struct Person {
    name: String,
    age: u8,
}

fn main() {
    let mut peter = Person {
        name: String::from("Peter"),
        age: 27,
    };
    println!("{} is {} years old", peter.name, peter.age);
    
    peter.age = 28;
    println!("{} is {} years old", peter.name, peter.age);
    
    let jackie = Person {
        name: String::from("Jackie"),
        ..peter
    };
    println!("{} is {} years old", jackie.name, jackie.age);
}

Key Points:

  • Structs work like in C or C++.
    • Like in C++, and unlike in C, no typedef is needed to define a type.
    • Unlike in C++, there is no inheritance between structs.
  • Methods are defined in an impl block, which we will see in following slides.
  • This may be a good time to let people know there are different types of structs.
    • Zero-sized structs e.g., struct Foo; might be used when implementing a trait on some type but don’t have any data that you want to store in the value itself.
    • The next slide will introduce Tuple structs, used when the field names are not important.
  • The syntax ..peter allows us to copy the majority of the fields from the old struct without having to explicitly type it all out. It must always be the last element.

Tuple Structs

If the field names are unimportant, you can use a tuple struct:

struct Point(i32, i32);

fn main() {
    let p = Point(17, 23);
    println!("({}, {})", p.0, p.1);
}

This is often used for single-field wrappers (called newtypes):

struct PoundsOfForce(f64);
struct Newtons(f64);

fn compute_thruster_force() -> PoundsOfForce {
    todo!("Ask a rocket scientist at NASA")
}

fn set_thruster_force(force: Newtons) {
    // ...
}

fn main() {
    let force = compute_thruster_force();
    set_thruster_force(force);
}
  • Newtypes are a great way to encode additional information about the value in a primitive type, for example:
    • The number is measured in some units: Newtons in the example above.
    • The value passed some validation when it was created, so you no longer have to validate it again at every use: PhoneNumber(String) or OddNumber(u32).
  • Demonstrate how to add a f64 value to a Newtons type by accessing the single field in the newtype.
    • Rust generally doesn’t like implicit things, like automatic unwrapping or, for instance, using booleans as integers.
    • Operator overloading is discussed later in the course (generics).
  • The example is a subtle reference to the Mars Climate Orbiter failure.
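A sketch of how the mismatch could be resolved with an explicit conversion, and of adding an f64 through the single field. The functions `to_newtons` and `add_newtons` are hypothetical; the conversion factor is the standard definition of the pound-force in newtons:

```rust
struct PoundsOfForce(f64);
struct Newtons(f64);

// One pound-force is 4.4482216152605 newtons.
const NEWTONS_PER_POUND_FORCE: f64 = 4.4482216152605;

// Hypothetical explicit conversion between the two newtypes.
fn to_newtons(force: PoundsOfForce) -> Newtons {
    Newtons(force.0 * NEWTONS_PER_POUND_FORCE)
}

// Adding a plain `f64` means reaching into the single field with `.0`.
fn add_newtons(force: Newtons, extra: f64) -> Newtons {
    Newtons(force.0 + extra)
}

fn main() {
    let force = to_newtons(PoundsOfForce(10.0));
    let boosted = add_newtons(force, 0.5);
    println!("thruster force: {} N", boosted.0);
}
```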

Field Shorthand Syntax

If you already have variables with the right names, then you can create the struct using a shorthand:

#[derive(Debug)]
struct Person {
    name: String,
    age: u8,
}

impl Person {
    fn new(name: String, age: u8) -> Person {
        Person { name, age }
    }
}

fn main() {
    let peter = Person::new(String::from("Peter"), 27);
    println!("{peter:?}");
}
  • The new function could be written using Self as a type, as it is interchangeable with the struct type name:

    #[derive(Debug)]
    struct Person {
        name: String,
        age: u8,
    }
    impl Person {
        fn new(name: String, age: u8) -> Self {
            Self { name, age }
        }
    }
  • Implement the Default trait for the struct. Define some fields and use the default values for the other fields.

    #[derive(Debug)]
    struct Person {
        name: String,
        age: u8,
    }
    impl Default for Person {
        fn default() -> Person {
            Person {
                name: "Bot".to_string(),
                age: 0,
            }
        }
    }
    fn create_default() {
        let tmp = Person {
            ..Person::default()
        };
        let tmp = Person {
            name: "Sam".to_string(),
            ..Person::default()
        };
    }
  • Methods are defined in the impl block.

  • Use struct update syntax to define a new structure using peter. Note that peter can no longer be used as a whole afterwards if a non-Copy field such as name was moved out of it.

  • Use {:#?} when printing structs to request the pretty-printed Debug representation.
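The last two notes can be sketched together. The helper `older` is a hypothetical function that uses struct update syntax and moves the name field out of its argument:

```rust
#[derive(Debug)]
struct Person {
    name: String,
    age: u8,
}

// Hypothetical helper: replaces `age` and takes `name` from `person`.
fn older(person: Person) -> Person {
    // `person.age` is read first (a `u8` is `Copy`), then `..person`
    // moves the remaining field, `name`, into the new struct.
    Person { age: person.age + 1, ..person }
}

fn main() {
    let peter = Person { name: String::from("Peter"), age: 27 };
    let older_peter = older(peter);
    // `{:#?}` pretty-prints the Debug representation over several lines.
    println!("{older_peter:#?}");
}
```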

Methods

Rust allows you to associate functions with your new types. You do this with an impl block:

#[derive(Debug)]
struct Person {
    name: String,
    age: u8,
}

impl Person {
    fn say_hello(&self) {
        println!("Hello, my name is {}", self.name);
    }
}

fn main() {
    let peter = Person {
        name: String::from("Peter"),
        age: 27,
    };
    peter.say_hello();
}

Key Points:

  • It can be helpful to introduce methods by comparing them to functions.
    • Methods are called on an instance of a type (such as a struct or enum); the first parameter represents the instance as self.
    • Developers may choose to use methods to take advantage of method receiver syntax and to help keep them more organized. By using methods we can keep all the implementation code in one predictable place.
  • Point out the use of the keyword self, a method receiver.
    • Show that it is an abbreviated term for self: Self and perhaps show how the struct name could also be used.
    • Explain that Self is a type alias for the type the impl block is in and can be used elsewhere in the block.
    • Note how self is used like other structs and dot notation can be used to refer to individual fields.
    • This might be a good time to demonstrate how the &self differs from self by modifying the code and trying to run say_hello twice.
  • We describe the distinction between method receivers next.

Method Receiver

The &self above indicates that the method borrows the object immutably. There are other possible receivers for a method:

  • &self: borrows the object from the caller using a shared and immutable reference. The object can be used again afterwards.
  • &mut self: borrows the object from the caller using a unique and mutable reference. The object can be used again afterwards.
  • self: takes ownership of the object and moves it away from the caller. The method becomes the owner of the object. The object will be dropped (deallocated) when the method returns, unless its ownership is explicitly transferred. Complete ownership does not automatically mean mutability.
  • mut self: same as above, but the method can mutate the object.
  • No receiver: this becomes a static method on the struct. Typically used to create constructors which are called new by convention.

Beyond variants on self, there are also special wrapper types allowed to be receiver types, such as Box<Self>.

Consider emphasizing “shared and immutable” and “unique and mutable”. These constraints always come together in Rust due to borrow checker rules, and self is no exception. It isn’t possible to reference a struct from multiple locations and call a mutating (&mut self) method on it.

Example

#[derive(Debug)]
struct Race {
    name: String,
    laps: Vec<i32>,
}

impl Race {
    fn new(name: &str) -> Race {  // No receiver, a static method
        Race { name: String::from(name), laps: Vec::new() }
    }

    fn add_lap(&mut self, lap: i32) {  // Exclusive borrowed read-write access to self
        self.laps.push(lap);
    }

    fn print_laps(&self) {  // Shared and read-only borrowed access to self
        println!("Recorded {} laps for {}:", self.laps.len(), self.name);
        for (idx, lap) in self.laps.iter().enumerate() {
            println!("Lap {idx}: {lap} sec");
        }
    }

    fn finish(self) {  // Exclusive ownership of self
        let total = self.laps.iter().sum::<i32>();
        println!("Race {} is finished, total lap time: {}", self.name, total);
    }
}

fn main() {
    let mut race = Race::new("Monaco Grand Prix");
    race.add_lap(70);
    race.add_lap(68);
    race.print_laps();
    race.add_lap(71);
    race.print_laps();
    race.finish();
    // race.add_lap(42);
}

Key Points:

  • All four methods here use a different method receiver.
    • You can point out how that changes what the function can do with the variable values and if/how it can be used again in main.
    • You can showcase the error that appears when trying to call finish twice.
  • Note that although the method receivers are different, the non-static functions are called the same way in the main body. Rust enables automatic referencing and dereferencing when calling methods. Rust automatically adds in the &, *, or mut so that the object matches the method signature.
  • You might point out that print_laps is using a vector that is iterated over. We describe vectors in more detail in the afternoon.
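Automatic referencing can be made visible by writing the same calls in both styles. `Counter` is a hypothetical type used only for this sketch:

```rust
struct Counter {
    count: u32,
}

impl Counter {
    fn increment(&mut self) {
        self.count += 1;
    }

    fn get(&self) -> u32 {
        self.count
    }
}

fn main() {
    let mut counter = Counter { count: 0 };
    // Method syntax: Rust inserts `&mut` and `&` as needed.
    counter.increment();
    println!("count: {}", counter.get());

    // The same calls written as plain function calls with explicit borrows.
    Counter::increment(&mut counter);
    println!("count: {}", Counter::get(&counter));
}
```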

Enums

The enum keyword allows the creation of a type which has a few different variants:

fn generate_random_number() -> i32 {
    // Implementation based on https://xkcd.com/221/
    4  // Chosen by fair dice roll. Guaranteed to be random.
}

#[derive(Debug)]
enum CoinFlip {
    Heads,
    Tails,
}

fn flip_coin() -> CoinFlip {
    let random_number = generate_random_number();
    if random_number % 2 == 0 {
        return CoinFlip::Heads;
    } else {
        return CoinFlip::Tails;
    }
}

fn main() {
    println!("You got: {:?}", flip_coin());
}

Key Points:

  • Enumerations allow you to collect a set of values under one type
  • This page offers an enum type CoinFlip with two variants Heads and Tails. You might note the namespace when using variants.
  • This might be a good time to compare Structs and Enums:
    • In both, you can have a simple version without fields (unit struct) or one with different types of fields (variant payloads).
    • In both, associated functions are defined within an impl block.
    • You could even implement the different variants of an enum with separate structs but then they wouldn’t be the same type as they would if they were all defined in an enum.
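A sketch of an impl block on an enum. The functions `from_number` and `opposite` are hypothetical additions to the CoinFlip example, and PartialEq is derived here only so that values can be compared:

```rust
#[derive(Debug, PartialEq)]
enum CoinFlip {
    Heads,
    Tails,
}

impl CoinFlip {
    // Associated function without a receiver, like a constructor.
    fn from_number(n: i32) -> CoinFlip {
        if n % 2 == 0 { CoinFlip::Heads } else { CoinFlip::Tails }
    }

    // Method with a `&self` receiver.
    fn opposite(&self) -> CoinFlip {
        match self {
            CoinFlip::Heads => CoinFlip::Tails,
            CoinFlip::Tails => CoinFlip::Heads,
        }
    }
}

fn main() {
    let flip = CoinFlip::from_number(4);
    println!("You got: {:?}, the other side is {:?}", flip, flip.opposite());
}
```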

Variant Payloads

You can define richer enums where the variants carry data. You can then use the match statement to extract the data from each variant:

enum WebEvent {
    PageLoad,                 // Variant without payload
    KeyPress(char),           // Tuple struct variant
    Click { x: i64, y: i64 }, // Full struct variant
}

#[rustfmt::skip]
fn inspect(event: WebEvent) {
    match event {
        WebEvent::PageLoad       => println!("page loaded"),
        WebEvent::KeyPress(c)    => println!("pressed '{c}'"),
        WebEvent::Click { x, y } => println!("clicked at x={x}, y={y}"),
    }
}

fn main() {
    let load = WebEvent::PageLoad;
    let press = WebEvent::KeyPress('x');
    let click = WebEvent::Click { x: 20, y: 80 };

    inspect(load);
    inspect(press);
    inspect(click);
}
  • The values in the enum variants can only be accessed after being pattern matched. The pattern binds references to the fields in the “match arm” after the =>.
    • The expression is matched against the patterns from top to bottom. There is no fall-through like in C or C++.
    • The match expression has a value. The value is the last expression in the match arm which was executed.
    • Starting from the top we look for what pattern matches the value then run the code following the arrow. Once we find a match, we stop.
  • Demonstrate what happens when the match is non-exhaustive. Note the advantage the Rust compiler provides by confirming when all cases are handled.
  • match inspects a hidden discriminant field in the enum.
  • It is possible to retrieve the discriminant by calling std::mem::discriminant()
    • This is useful, for example, if implementing PartialEq for structs where comparing field values doesn’t affect equality.
  • WebEvent::Click { ... } is not exactly the same as WebEvent::Click(Click) with a top level struct Click { ... }. The inlined version cannot implement traits, for example.
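Two of the notes can be sketched in code: match used as an expression, and std::mem::discriminant used to compare variants while ignoring payloads. The functions `describe` and `same_variant` are hypothetical helpers:

```rust
use std::mem::discriminant;

enum WebEvent {
    PageLoad,
    KeyPress(char),
    Click { x: i64, y: i64 },
}

// `match` is an expression: each arm yields the function's return value.
fn describe(event: &WebEvent) -> String {
    match event {
        WebEvent::PageLoad => String::from("page loaded"),
        WebEvent::KeyPress(c) => format!("pressed '{c}'"),
        WebEvent::Click { x, y } => format!("clicked at x={x}, y={y}"),
    }
}

// Compares only the variant, ignoring any payload.
fn same_variant(a: &WebEvent, b: &WebEvent) -> bool {
    discriminant(a) == discriminant(b)
}

fn main() {
    let press_a = WebEvent::KeyPress('a');
    let press_b = WebEvent::KeyPress('b');
    let click = WebEvent::Click { x: 20, y: 80 };
    println!("{}", describe(&WebEvent::PageLoad));
    println!("{}", describe(&click));
    println!("same variant: {}", same_variant(&press_a, &press_b));
}
```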

Enum Sizes

Rust enums are packed tightly, taking constraints due to alignment into account:

use std::any::type_name;
use std::mem::{align_of, size_of};

fn dbg_size<T>() {
    println!("{}: size {} bytes, align: {} bytes",
        type_name::<T>(), size_of::<T>(), align_of::<T>());
}

enum Foo {
    A,
    B,
}

fn main() {
    dbg_size::<Foo>();
}

Key Points:

  • Internally Rust is using a field (discriminant) to keep track of the enum variant.

  • You can control the discriminant if needed (e.g., for compatibility with C):

    #[repr(u32)]
    enum Bar {
        A,  // 0
        B = 10000,
        C,  // 10001
    }
    
    fn main() {
        println!("A: {}", Bar::A as u32);
        println!("B: {}", Bar::B as u32);
        println!("C: {}", Bar::C as u32);
    }

    Without repr, the discriminant type takes 2 bytes, because 10001 fits in 2 bytes.

  • Try out other types such as

    • dbg_size::<bool>(): size 1 bytes, align: 1 bytes,
    • dbg_size::<Option<bool>>(): size 1 bytes, align: 1 bytes (niche optimization, see below),
    • dbg_size::<&i32>(): size 8 bytes, align: 8 bytes (on a 64-bit machine),
    • dbg_size::<Option<&i32>>(): size 8 bytes, align: 8 bytes (null pointer optimization, see below).
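If you prefer to check sizes programmatically rather than read printed output, a variant of the helper can return the numbers. `size_and_align` is a hypothetical function, not part of the slide code:

```rust
use std::mem::{align_of, size_of};

// Hypothetical helper returning (size, alignment) instead of printing them.
fn size_and_align<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

fn main() {
    println!("bool: {:?}", size_and_align::<bool>());
    println!("Option<bool>: {:?}", size_and_align::<Option<bool>>());
    println!("&i32: {:?}", size_and_align::<&i32>());
    println!("Option<&i32>: {:?}", size_and_align::<Option<&i32>>());
}
```

The size of `&i32` depends on the target's pointer width, so only the comparison between `&i32` and `Option<&i32>` is portable across machines.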

More to Explore

Rust has several optimizations it can employ to make enums take up less space.

  • Niche optimization: Rust will merge unused bit patterns for the enum discriminant.

  • Null pointer optimization: For some types, Rust guarantees that size_of::<T>() equals size_of::<Option<T>>().

    Example code if you want to show what the bitwise representation may look like in practice. It’s important to note that the compiler provides no guarantees regarding this representation, therefore this is totally unsafe.

    use std::mem::transmute;
    
    macro_rules! dbg_bits {
        ($e:expr, $bit_type:ty) => {
            println!("- {}: {:#x}", stringify!($e), transmute::<_, $bit_type>($e));
        };
    }
    
    fn main() {
        unsafe {
            println!("bool:");
            dbg_bits!(false, u8);
            dbg_bits!(true, u8);
    
            println!("Option<bool>:");
            dbg_bits!(None::<bool>, u8);
            dbg_bits!(Some(false), u8);
            dbg_bits!(Some(true), u8);
    
            println!("Option<Option<bool>>:");
            dbg_bits!(Some(Some(false)), u8);
            dbg_bits!(Some(Some(true)), u8);
            dbg_bits!(Some(None::<bool>), u8);
            dbg_bits!(None::<Option<bool>>, u8);
    
            println!("Option<&i32>:");
            dbg_bits!(None::<&i32>, usize);
            dbg_bits!(Some(&0i32), usize);
        }
    }

    More complex example if you want to discuss what happens when we chain more than 256 Options together.

    #![recursion_limit = "1000"]
    
    use std::mem::transmute;
    
    macro_rules! dbg_bits {
        ($e:expr, $bit_type:ty) => {
            println!("- {}: {:#x}", stringify!($e), transmute::<_, $bit_type>($e));
        };
    }
    
    // Macro to wrap a value in 2^n Some() where n is the number of "@" signs.
    // Increasing the recursion limit is required to evaluate this macro.
    macro_rules! many_options {
        ($value:expr) => { Some($value) };
        ($value:expr, @) => {
            Some(Some($value))
        };
        ($value:expr, @ $($more:tt)+) => {
            many_options!(many_options!($value, $($more)+), $($more)+)
        };
    }
    
    fn main() {
        // TOTALLY UNSAFE. Rust provides no guarantees about the bitwise
        // representation of types.
        unsafe {
            assert_eq!(many_options!(false), Some(false));
            assert_eq!(many_options!(false, @), Some(Some(false)));
            assert_eq!(many_options!(false, @@), Some(Some(Some(Some(false)))));
    
            println!("Bitwise representation of a chain of 128 Option's.");
            dbg_bits!(many_options!(false, @@@@@@@), u8);
            dbg_bits!(many_options!(true, @@@@@@@), u8);
    
            println!("Bitwise representation of a chain of 256 Option's.");
            dbg_bits!(many_options!(false, @@@@@@@@), u16);
            dbg_bits!(many_options!(true, @@@@@@@@), u16);
    
            println!("Bitwise representation of a chain of 257 Option's.");
            dbg_bits!(many_options!(Some(false), @@@@@@@@), u16);
            dbg_bits!(many_options!(Some(true), @@@@@@@@), u16);
            dbg_bits!(many_options!(None::<bool>, @@@@@@@@), u16);
        }
    }

Pattern Matching

The match keyword lets you match a value against one or more patterns. The comparisons are done from top to bottom and the first match wins.

The patterns can be simple values, similarly to switch in C and C++:

fn main() {
    let input = 'x';

    match input {
        'q'                   => println!("Quitting"),
        'a' | 's' | 'w' | 'd' => println!("Moving around"),
        '0'..='9'             => println!("Number input"),
        _                     => println!("Something else"),
    }
}

The _ pattern is a wildcard pattern which matches any value.

Key Points:

  • You might point out how some specific characters are being used when in a pattern
    • | as an or
    • .. can expand as much as it needs to
    • 1..=5 represents an inclusive range
    • _ is a wild card
  • It can be useful to show how binding works, by for instance replacing a wildcard character with a variable, or removing the quotes around q.
  • You can demonstrate matching on a reference.
  • This might be a good time to bring up the concept of irrefutable patterns, as the term can show up in error messages.
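
The points about binding, matching on a reference, and irrefutable patterns can be sketched as follows. This is only an illustration: the helper names classify and is_zero are made up for this note and do not appear in the course code.

```rust
/// Classifies a key press. `classify` is a hypothetical helper for this note.
fn classify(input: char) -> String {
    match input {
        'q' => String::from("quit"),
        '0'..='9' => String::from("digit"),
        // A bare identifier is a binding, not a comparison: it matches every
        // remaining value and makes that value available by name.
        other => format!("other: {other}"),
    }
}

/// Matching on a reference: `&0` matches a reference to the value 0.
fn is_zero(n: &i32) -> bool {
    match n {
        &0 => true,
        _ => false,
    }
}

fn main() {
    // An irrefutable pattern always matches, so `let` accepts it.
    let (a, b) = (1, 2);
    println!("a = {a}, b = {b}");
    println!("{}", classify('x'));
    println!("{}", is_zero(&0));
}
```

Writing q without quotes in the first arm would turn it into a catch-all binding, and the compiler would warn that the later arms are unreachable.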

Destructuring Enums

Patterns can also be used to bind variables to parts of your values. This is how you inspect the structure of your types. Let us start with a simple enum type:

enum Result {
    Ok(i32),
    Err(String),
}

fn divide_in_two(n: i32) -> Result {
    if n % 2 == 0 {
        Result::Ok(n / 2)
    } else {
        Result::Err(format!("cannot divide {n} into two equal parts"))
    }
}

fn main() {
    let n = 100;
    match divide_in_two(n) {
        Result::Ok(half) => println!("{n} divided in two is {half}"),
        Result::Err(msg) => println!("sorry, an error happened: {msg}"),
    }
}

Here we have used the arms to destructure the Result value. In the first arm, half is bound to the value inside the Ok variant. In the second arm, msg is bound to the error message.

Key points:

  • The if/else expression is returning an enum that is later unpacked with a match.
  • You can try adding a third variant to the enum definition and displaying the errors when running the code. Point out the places where your code is now non-exhaustive and how the compiler tries to give you hints.

Destructuring Structs

You can also destructure structs:

struct Foo {
    x: (u32, u32),
    y: u32,
}

#[rustfmt::skip]
fn main() {
    let foo = Foo { x: (1, 2), y: 3 };
    match foo {
        Foo { x: (1, b), y } => println!("x.0 = 1, b = {b}, y = {y}"),
        Foo { y: 2, x: i }   => println!("y = 2, x = {i:?}"),
        Foo { y, .. }        => println!("y = {y}, other fields were ignored"),
    }
}
  • Change the literal values in foo to match with the other patterns.
  • Add a new field to Foo and make changes to the pattern as needed.
  • The distinction between a capture and a constant expression can be hard to spot. Try changing the 2 in the second arm to a variable, and see that it subtly doesn’t work. Change it to a const and see it working again.
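
The difference between a capture and a constant can be sketched like this. The constant TWO and the helper describe are assumptions made for this note, not part of the slide.

```rust
struct Foo {
    x: (u32, u32),
    y: u32,
}

// A constant can be used as a pattern; a local variable cannot.
const TWO: u32 = 2;

/// `describe` is a hypothetical helper for this note.
fn describe(foo: &Foo) -> String {
    match foo {
        // `TWO` is compared against `y`, exactly like the literal 2.
        Foo { y: TWO, x } => format!("y is TWO, x = {x:?}"),
        // `other` is a fresh binding: it matches any value of `y`.
        Foo { y: other, .. } => format!("y = {other}"),
    }
}

fn main() {
    println!("{}", describe(&Foo { x: (1, 2), y: 2 }));
    println!("{}", describe(&Foo { x: (1, 2), y: 3 }));
}
```

If TWO were a let-bound variable named two, the first arm would bind a new variable called two and match every value, which is the subtle failure the note refers to.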

Destructuring Arrays

You can destructure arrays, tuples, and slices by matching on their elements:

#[rustfmt::skip]
fn main() {
    let triple = [0, -2, 3];
    println!("Tell me about {triple:?}");
    match triple {
        [0, y, z] => println!("First is 0, y = {y}, and z = {z}"),
        [1, ..]   => println!("First is 1 and the rest were ignored"),
        _         => println!("All elements were ignored"),
    }
}
  • Destructuring of slices of unknown length also works with patterns of fixed length.

    fn main() {
        inspect(&[0, -2, 3]);
        inspect(&[0, -2, 3, 4]);
    }
    
    #[rustfmt::skip]
    fn inspect(slice: &[i32]) {
        println!("Tell me about {slice:?}");
        match slice {
            &[0, y, z] => println!("First is 0, y = {y}, and z = {z}"),
            &[1, ..]   => println!("First is 1 and the rest were ignored"),
            _          => println!("All elements were ignored"),
        }
    }
  • Create a new pattern using _ to represent an element.

  • Add more values to the array.

  • Point out how .. will expand to account for a different number of elements.

  • Show matching against the tail with patterns [.., b] and [a@..,b]
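
A possible way to show the tail patterns mentioned above. The helper names split_tail and last_only are made up for this sketch.

```rust
/// Splits a slice into everything before the last element and the last
/// element itself. `split_tail` is a hypothetical helper for this note.
fn split_tail(slice: &[i32]) -> Option<(&[i32], i32)> {
    match slice {
        // `rest @ ..` binds the elements that `..` skipped over.
        [rest @ .., last] => Some((rest, *last)),
        [] => None,
    }
}

/// Returns only the last element, ignoring the rest with a plain `..`.
fn last_only(slice: &[i32]) -> Option<i32> {
    match slice {
        [.., last] => Some(*last),
        [] => None,
    }
}

fn main() {
    println!("{:?}", split_tail(&[0, -2, 3]));
    println!("{:?}", last_only(&[0, -2, 3, 4]));
    println!("{:?}", split_tail(&[]));
}
```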

Match Guards

When matching, you can add a guard to a pattern. This is an arbitrary Boolean expression which will be executed if the pattern matches:

#[rustfmt::skip]
fn main() {
    let pair = (2, -2);
    println!("Tell me about {pair:?}");
    match pair {
        (x, y) if x == y     => println!("These are twins"),
        (x, y) if x + y == 0 => println!("Antimatter, kaboom!"),
        (x, _) if x % 2 == 1 => println!("The first one is odd"),
        _                    => println!("No correlation..."),
    }
}

Key Points:

  • Match guards as a separate syntax feature are important and necessary when we wish to concisely express more complex ideas than patterns alone would allow.
  • They are not the same as a separate if expression inside of the match arm. An if expression inside of the branch block (after =>) happens after the match arm is selected. Failing the if condition inside of that block won’t result in other arms of the original match expression being considered.
  • You can use the variables defined in the pattern in your if expression.
  • The condition defined in the guard applies to every expression in a pattern with an |.
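
The difference between a guard and an if inside the arm can be sketched with two small functions. Both function names are invented for this note.

```rust
/// Uses a guard: when the guard fails, matching continues with later arms.
fn with_guard(n: i32) -> &'static str {
    match n {
        x if x < 0 => "negative",
        // The guard applies to both alternatives, as if written `(1 | 2) if ...`.
        1 | 2 if n % 2 == 0 => "small and even",
        _ => "something else",
    }
}

/// Uses an `if` inside the arm: the arm is already selected, so a failing
/// condition does not fall through to the other arms.
fn with_inner_if(n: i32) -> &'static str {
    match n {
        1 | 2 => {
            if n % 2 == 0 {
                "small and even"
            } else {
                "small and odd"
            }
        }
        _ => "something else",
    }
}

fn main() {
    println!("{} / {}", with_guard(1), with_inner_if(1));
    println!("{} / {}", with_guard(2), with_inner_if(2));
}
```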

Error Handling

Error handling in Rust is done using explicit control flow:

  • Functions that can have errors list this in their return type,
  • There are no exceptions.

Option and Result

The Option and Result types represent values that may be absent and operations that may fail:

fn main() {
    let numbers = vec![10, 20, 30];
    let first: Option<&i8> = numbers.first();
    println!("first: {first:?}");

    let arr: Result<[i8; 3], Vec<i8>> = numbers.try_into();
    println!("arr: {arr:?}");
}
  • Option and Result are widely used not just in the standard library.
  • Option<&T> has zero space overhead compared to &T.
  • Result is the standard type to implement error handling, as we will see later in the course.
  • try_into attempts to convert the vector into a fixed-sized array. This can fail:
    • If the vector has the right size, Result::Ok is returned with the array.
    • Otherwise, Result::Err is returned with the original vector.
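
A small sketch of the two claims above: the size of Option<&T> and the two outcomes of try_into. The wrapper to_triple is a hypothetical helper added for this note.

```rust
// Needed before the 2021 edition, where TryInto is not in the prelude.
use std::convert::TryInto;
use std::mem::size_of;

/// `to_triple` is a hypothetical helper wrapping the conversion from the slide.
fn to_triple(v: Vec<i8>) -> Result<[i8; 3], Vec<i8>> {
    v.try_into()
}

fn main() {
    // A reference is never null, so None can reuse the null value.
    assert_eq!(size_of::<Option<&i8>>(), size_of::<&i8>());

    // Right length: the array is returned in Ok.
    println!("{:?}", to_triple(vec![10, 20, 30]));
    // Wrong length: the original vector is handed back in Err.
    println!("{:?}", to_triple(vec![10, 20]));
}
```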

Panics

Rust will trigger a panic if a fatal error happens at runtime:

fn main() {
    let v = vec![10, 20, 30];
    println!("v[100]: {}", v[100]);
}
  • Panics are for unrecoverable and unexpected errors.
    • Panics are symptoms of bugs in the program.
  • Use non-panicking APIs (such as Vec::get) if crashing is not acceptable.
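
As a minimal sketch of the non-panicking alternative, Vec::get (available on slices too) returns an Option instead of panicking. The helper name lookup is made up for this note.

```rust
/// Returns the element at `index`, or `None` when the index is out of bounds.
/// `lookup` is a hypothetical helper for this note.
fn lookup(v: &[i32], index: usize) -> Option<i32> {
    v.get(index).copied()
}

fn main() {
    let v = vec![10, 20, 30];
    match lookup(&v, 100) {
        Some(value) => println!("v[100]: {value}"),
        None => println!("index 100 is out of bounds"),
    }
}
```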

Catching the Stack Unwinding

By default, a panic will cause the stack to unwind. The unwinding can be caught:

use std::panic;

fn main() {
    let result = panic::catch_unwind(|| {
        "No problem here!"
    });
    println!("{result:?}");

    let result = panic::catch_unwind(|| {
        panic!("oh no!");
    });
    println!("{result:?}");
}
  • This can be useful in servers which should keep running even if a single request crashes.
  • This does not work if panic = 'abort' is set in your Cargo.toml.

Structured Error Handling with Result

We have already seen the Result enum. This is used pervasively when errors are expected as part of normal operation:

use std::fs;
use std::io::Read;

fn main() {
    let file = fs::File::open("diary.txt");
    match file {
        Ok(mut file) => {
            let mut contents = String::new();
            file.read_to_string(&mut contents);
            println!("Dear diary: {contents}");
        },
        Err(err) => {
            println!("The diary could not be opened: {err}");
        }
    }
}
  • As with Option, the successful value sits inside of Result, forcing the developer to explicitly extract it. This encourages error checking. In the case where an error should never happen, unwrap() or expect() can be called, and this is a signal of the developer intent too.
  • Result documentation is a recommended read. Not during the course, but it is worth mentioning. It contains a lot of convenience methods and functions that help functional-style programming.

Propagating Errors with ?

The try-operator ? is used to return errors to the caller. It lets you turn the common

match some_expression {
    Ok(value) => value,
    Err(err) => return Err(err),
}

into the much simpler

some_expression?

We can use this to simplify our error handling code:

use std::{fs, io};
use std::io::Read;

fn read_username(path: &str) -> Result<String, io::Error> {
    let username_file_result = fs::File::open(path);
    let mut username_file = match username_file_result {
        Ok(file) => file,
        Err(err) => return Err(err),
    };

    let mut username = String::new();
    match username_file.read_to_string(&mut username) {
        Ok(_) => Ok(username),
        Err(err) => Err(err),
    }
}

fn main() {
    //fs::write("config.dat", "alice").unwrap();
    let username = read_username("config.dat");
    println!("username or error: {username:?}");
}

Key points:

  • The username variable can be either Ok(string) or Err(error).
  • Use the fs::write call to test out the different scenarios: no file, empty file, file with username.
  • The return type of the function has to be compatible with the nested functions it calls. For instance, a function returning a Result<T, Err> can only apply the ? operator on a function returning a Result<AnyT, Err>. It cannot apply the ? operator on a function returning an Option<AnyT>, nor on one returning a Result<T, OtherErr> unless Err implements From<OtherErr>. Reciprocally, a function returning an Option<T> can only apply the ? operator on a function returning an Option<AnyT>.
    • You can convert incompatible types into one another with the different Option and Result methods such as Option::ok_or, Result::ok, Result::err.
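
One possible simplification of read_username using ?, together with a sketch of Option::ok_or bridging an Option into a function that returns Result. The helper first_digit and its error strings are assumptions made for this note.

```rust
use std::fs;
use std::io::{self, Read};

/// The same function as on the slide, with each `match` replaced by `?`.
fn read_username(path: &str) -> Result<String, io::Error> {
    let mut username_file = fs::File::open(path)?;
    let mut username = String::new();
    username_file.read_to_string(&mut username)?;
    Ok(username)
}

/// Hypothetical helper: `Option::ok_or` turns an `Option` into a `Result`,
/// so `?` can be applied inside a function that returns `Result`.
fn first_digit(s: &str) -> Result<u32, String> {
    let c = s.chars().next().ok_or(String::from("empty input"))?;
    c.to_digit(10).ok_or(format!("{c} is not a digit"))
}

fn main() {
    println!("username or error: {:?}", read_username("config.dat"));
    println!("{:?}", first_digit("7 apples"));
    println!("{:?}", first_digit(""));
}
```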

Converting Error Types

The effective expansion of ? is a little more complicated than previously indicated:

expression?

works the same as

match expression {
    Ok(value) => value,
    Err(err)  => return Err(From::from(err)),
}

The From::from call here means we attempt to convert the error type to the type returned by the function:

use std::error::Error;
use std::fmt::{self, Display, Formatter};
use std::fs::{self, File};
use std::io::{self, Read};

#[derive(Debug)]
enum ReadUsernameError {
    IoError(io::Error),
    EmptyUsername(String),
}

impl Error for ReadUsernameError {}

impl Display for ReadUsernameError {
    fn fmt(&self, f: &mut Formatter) -> fmt::Result {
        match self {
            Self::IoError(e) => write!(f, "IO error: {e}"),
            Self::EmptyUsername(filename) => write!(f, "Found no username in {filename}"),
        }
    }
}

impl From<io::Error> for ReadUsernameError {
    fn from(err: io::Error) -> ReadUsernameError {
        ReadUsernameError::IoError(err)
    }
}

fn read_username(path: &str) -> Result<String, ReadUsernameError> {
    let mut username = String::with_capacity(100);
    File::open(path)?.read_to_string(&mut username)?;
    if username.is_empty() {
        return Err(ReadUsernameError::EmptyUsername(String::from(path)));
    }
    Ok(username)
}

fn main() {
    //fs::write("config.dat", "").unwrap();
    let username = read_username("config.dat");
    println!("username or error: {username:?}");
}

Key points:

  • The username variable can be either Ok(string) or Err(error).
  • Use the fs::write call to test out the different scenarios: no file, empty file, file with username.

It is good practice for all error types that don’t need to be no_std to implement std::error::Error, which requires Debug and Display. The Error trait was only added to core (as core::error::Error) in Rust 1.81; on older toolchains it is nightly-only there, so error types cannot be fully no_std compatible.

It’s generally helpful for them to implement Clone and Eq too where possible, to make life easier for tests and consumers of your library. In this case we can’t easily do so, because io::Error doesn’t implement them.
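
For comparison, here is a sketch of an error enum that can derive Clone and Eq because its wrapped error, ParseIntError, implements them. The type ParseAgeError, the function parse_age, and the limit of 150 are all invented for this illustration.

```rust
use std::error::Error;
use std::fmt::{self, Display, Formatter};
use std::num::ParseIntError;

// ParseIntError implements Clone and Eq, so this enum can derive them too.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ParseAgeError {
    NotANumber(ParseIntError),
    TooOld(u32),
}

impl Display for ParseAgeError {
    fn fmt(&self, f: &mut Formatter) -> fmt::Result {
        match self {
            Self::NotANumber(e) => write!(f, "not a number: {e}"),
            Self::TooOld(age) => write!(f, "implausible age: {age}"),
        }
    }
}

impl Error for ParseAgeError {}

impl From<ParseIntError> for ParseAgeError {
    fn from(err: ParseIntError) -> Self {
        Self::NotANumber(err)
    }
}

fn parse_age(s: &str) -> Result<u32, ParseAgeError> {
    // `?` calls From::from to turn ParseIntError into ParseAgeError.
    let age: u32 = s.trim().parse()?;
    if age > 150 {
        return Err(ParseAgeError::TooOld(age));
    }
    Ok(age)
}

fn main() {
    for input in ["42", "200", "abc"] {
        match parse_age(input) {
            Ok(age) => println!("age: {age}"),
            Err(err) => println!("error: {err}"),
        }
    }
}
```

Deriving PartialEq lets tests compare results directly with assert_eq!, which is the convenience the paragraph above describes.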

Deriving Error Enums

The thiserror crate is a popular way to create an error enum like we did on the previous page:

use std::{fs, io};
use std::io::Read;
use thiserror::Error;

#[derive(Debug, Error)]
enum ReadUsernameError {
    #[error("Could not read: {0}")]
    IoError(#[from] io::Error),
    #[error("Found no username in {0}")]
    EmptyUsername(String),
}

fn read_username(path: &str) -> Result<String, ReadUsernameError> {
    let mut username = String::new();
    fs::File::open(path)?.read_to_string(&mut username)?;
    if username.is_empty() {
        return Err(ReadUsernameError::EmptyUsername(String::from(path)));
    }
    Ok(username)
}

fn main() {
    //fs::write("config.dat", "").unwrap();
    match read_username("config.dat") {
        Ok(username) => println!("Username: {username}"),
        Err(err)     => println!("Error: {err}"),
    }
}

thiserror’s derive macro automatically implements std::error::Error, and optionally Display (if the #[error(...)] attributes are provided) and From (if the #[from] attribute is added). It also works for structs.

It doesn’t affect your public API, which makes it good for libraries.

Dynamic Error Types

Sometimes we want to allow any type of error to be returned without writing our own enum covering all the different possibilities. std::error::Error makes this easy.

use std::fs;
use std::io::Read;
use thiserror::Error;
use std::error::Error;

#[derive(Clone, Debug, Eq, Error, PartialEq)]
#[error("Found no username in {0}")]
struct EmptyUsernameError(String);

fn read_username(path: &str) -> Result<String, Box<dyn Error>> {
    let mut username = String::new();
    fs::File::open(path)?.read_to_string(&mut username)?;
    if username.is_empty() {
        return Err(EmptyUsernameError(String::from(path)).into());
    }
    Ok(username)
}

fn main() {
    //fs::write("config.dat", "").unwrap();
    match read_username("config.dat") {
        Ok(username) => println!("Username: {username}"),
        Err(err)     => println!("Error: {err}"),
    }
}

This saves on code, but gives up the ability to cleanly handle different error cases differently in the program. As such it’s generally not a good idea to use Box<dyn Error> in the public API of a library, but it can be a good option in a program where you just want to display the error message somewhere.
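
To show what handling different cases looks like with Box<dyn Error>, here is a sketch that uses only the standard library and recovers the concrete type with downcast_ref. The names EmptyInputError, parse_count, and describe are hypothetical.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

#[derive(Debug)]
struct EmptyInputError;

impl fmt::Display for EmptyInputError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "input was empty")
    }
}

impl Error for EmptyInputError {}

fn parse_count(s: &str) -> Result<u32, Box<dyn Error>> {
    if s.is_empty() {
        return Err(Box::new(EmptyInputError));
    }
    // `?` boxes the ParseIntError into a Box<dyn Error>.
    Ok(s.parse::<u32>()?)
}

/// Telling the cases apart requires a runtime downcast for each candidate type.
fn describe(err: &(dyn Error + 'static)) -> &'static str {
    if err.downcast_ref::<EmptyInputError>().is_some() {
        "empty"
    } else if err.downcast_ref::<ParseIntError>().is_some() {
        "not a number"
    } else {
        "unknown"
    }
}

fn main() {
    for input in ["42", "", "abc"] {
        match parse_count(input) {
            Ok(n) => println!("count: {n}"),
            Err(err) => println!("error: {err} ({})", describe(&*err)),
        }
    }
}
```

Unlike a match on an enum, the compiler cannot check that the chain of downcasts covers every error the function may return.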

Adding Context to Errors

The widely used anyhow crate can help you add contextual information to your errors and allows you to have fewer custom error types:

use std::fs;
use std::io::Read;
use anyhow::{Context, Result, bail};

fn read_username(path: &str) -> Result<String> {
    let mut username = String::with_capacity(100);
    fs::File::open(path)
        .with_context(|| format!("Failed to open {path}"))?
        .read_to_string(&mut username)
        .context("Failed to read")?;
    if username.is_empty() {
        bail!("Found no username in {path}");
    }
    Ok(username)
}

fn main() {
    //fs::write("config.dat", "").unwrap();
    match read_username("config.dat") {
        Ok(username) => println!("Username: {username}"),
        Err(err)     => println!("Error: {err:?}"),
    }
}
  • anyhow::Result<V> is a type alias for Result<V, anyhow::Error>.
  • anyhow::Error is essentially a wrapper around Box<dyn Error>. As such it’s again generally not a good choice for the public API of a library, but is widely used in applications.
  • The actual error type inside of it can be extracted for examination if necessary.
  • Functionality provided by anyhow::Result<T> may be familiar to Go developers, as it provides similar usage patterns and ergonomics to (T, error) from Go.

Generics

Rust supports generics, which let you abstract algorithms or data structures (such as sorting or a binary tree) over the types used or stored.

Generic Data Types

You can use generics to abstract over the concrete field type:

#[derive(Debug)]
struct Point<T> {
    x: T,
    y: T,
}

fn main() {
    let integer = Point { x: 5, y: 10 };
    let float = Point { x: 1.0, y: 4.0 };
    println!("{integer:?} and {float:?}");
}
  • Try declaring a new variable let p = Point { x: 5, y: 10.0 };.

  • Fix the code to allow points that have elements of different types.
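
One possible fix for the exercise above is a second type parameter, so that x and y no longer have to share a type:

```rust
#[derive(Debug)]
struct Point<T, U> {
    x: T,
    y: U,
}

fn main() {
    // Both fields may still have the same type...
    let same = Point { x: 5, y: 10 };
    // ...but they no longer have to.
    let mixed = Point { x: 5, y: 10.0 };
    println!("{same:?} and {mixed:?}");
}
```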

Generic Methods

You can declare a generic type on your impl block:

#[derive(Debug)]
struct Point<T>(T, T);

impl<T> Point<T> {
    fn x(&self) -> &T {
        &self.0  // + 10
    }

    // fn set_x(&mut self, x: T)
}

fn main() {
    let p = Point(5, 10);
    println!("p.x = {}", p.x());
}
  • Q: Why is T specified twice in impl<T> Point<T> {}? Isn’t that redundant?
    • This is because it is a generic implementation section for a generic type. They are independently generic.
    • It means these methods are defined for any T.
    • It is possible to write impl Point<u32> { .. }.
      • Point is still generic and you can use Point<f64>, but methods in this block will only be available for Point<u32>.
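
A sketch of the two kinds of impl block side by side. The method distance_from_origin is invented for this note and uses f64 rather than u32, so that sqrt is available.

```rust
#[derive(Debug)]
struct Point<T>(T, T);

// Available for every T.
impl<T> Point<T> {
    fn x(&self) -> &T {
        &self.0
    }
}

// Available only when T is f64.
impl Point<f64> {
    fn distance_from_origin(&self) -> f64 {
        (self.0 * self.0 + self.1 * self.1).sqrt()
    }
}

fn main() {
    let p: Point<f64> = Point(3.0, 4.0);
    println!("p.x = {}, distance = {}", p.x(), p.distance_from_origin());

    let q = Point(5, 10);
    println!("q.x = {}", q.x());
    // q.distance_from_origin(); // error: no such method for Point<i32>
}
```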

Monomorphization

Generic code is turned into non-generic code based on the call sites:

fn main() {
    let integer = Some(5);
    let float = Some(5.0);
}

behaves as if you wrote

enum Option_i32 {
    Some(i32),
    None,
}

enum Option_f64 {
    Some(f64),
    None,
}

fn main() {
    let integer = Option_i32::Some(5);
    let float = Option_f64::Some(5.0);
}

This is a zero-cost abstraction: you get exactly the same result as if you had hand-coded the data structures without the abstraction.

Traits

Rust lets you abstract over types with traits. They’re similar to interfaces:

struct Dog { name: String, age: i8 }
struct Cat { lives: i8 } // No name needed, cats won't respond anyway.

trait Pet {
    fn talk(&self) -> String;
}

impl Pet for Dog {
    fn talk(&self) -> String { format!("Woof, my name is {}!", self.name) }
}

impl Pet for Cat {
    fn talk(&self) -> String { String::from("Miau!") }
}

fn greet<P: Pet>(pet: &P) {
    println!("Oh you're a cutie! What's your name? {}", pet.talk());
}

fn main() {
    let captain_floof = Cat { lives: 9 };
    let fido = Dog { name: String::from("Fido"), age: 5 };

    greet(&captain_floof);
    greet(&fido);
}

Trait Objects

Trait objects allow for values of different types, for instance in a collection:

struct Dog { name: String, age: i8 }
struct Cat { lives: i8 } // No name needed, cats won't respond anyway.

trait Pet {
    fn talk(&self) -> String;
}

impl Pet for Dog {
    fn talk(&self) -> String { format!("Woof, my name is {}!", self.name) }
}

impl Pet for Cat {
    fn talk(&self) -> String { String::from("Miau!") }
}

fn main() {
    let pets: Vec<Box<dyn Pet>> = vec![
        Box::new(Cat { lives: 9 }),
        Box::new(Dog { name: String::from("Fido"), age: 5 }),
    ];
    for pet in pets {
        println!("Hello, who are you? {}", pet.talk());
    }
}

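Memory layout after allocating pets:

[Diagram: on the stack, pets holds ptr, len 2, and capacity 2. On the heap, each vector element holds a data pointer and a vtable pointer. The Dog data holds name (Fido, 4, 4) and age 5; the Cat data holds lives 9. The vtables refer to <Dog as Pet>::talk and <Cat as Pet>::talk.]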
  • Types that implement a given trait may be of different sizes. This makes it impossible to have things like Vec<dyn Pet> in the example above.
  • dyn Pet is a way to tell the compiler about a dynamically sized type that implements Pet.
  • In the example, pets is allocated on the stack and the vector data is on the heap. The two vector elements are fat pointers:
    • A fat pointer is a double-width pointer. It has two components: a pointer to the actual object and a pointer to the virtual method table (vtable) for the Pet implementation of that particular object.
    • The data for the Dog named Fido is the name and age fields. The Cat has a lives field.
  • Compare these outputs in the above example:
        println!("{} {}", std::mem::size_of::<Dog>(), std::mem::size_of::<Cat>());
        println!("{} {}", std::mem::size_of::<&Dog>(), std::mem::size_of::<&Cat>());
        println!("{}", std::mem::size_of::<&dyn Pet>());
        println!("{}", std::mem::size_of::<Box<dyn Pet>>());
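
The size comparison can also be written as assertions. This sketch reflects the layout used by current compilers, where a pointer to a trait object is two machine words wide; it repeats the types from the slide so that it runs on its own.

```rust
use std::mem::size_of;

struct Dog { name: String, age: i8 }
struct Cat { lives: i8 }

trait Pet {
    fn talk(&self) -> String;
}

impl Pet for Dog {
    fn talk(&self) -> String { format!("Woof, my name is {}!", self.name) }
}

impl Pet for Cat {
    fn talk(&self) -> String { String::from("Miau!") }
}

fn main() {
    // References to sized types are one pointer wide.
    assert_eq!(size_of::<&Dog>(), size_of::<usize>());
    assert_eq!(size_of::<&Cat>(), size_of::<usize>());

    // Pointers to trait objects carry a data pointer and a vtable pointer.
    assert_eq!(size_of::<&dyn Pet>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<Box<dyn Pet>>(), 2 * size_of::<usize>());

    let pets: Vec<Box<dyn Pet>> = vec![
        Box::new(Cat { lives: 9 }),
        Box::new(Dog { name: String::from("Fido"), age: 5 }),
    ];
    for pet in &pets {
        println!("{}", pet.talk());
    }
}
```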

Deriving Traits

Rust derive macros work by automatically generating code that implements the specified traits for a data structure.

You can let the compiler derive a number of traits as follows:

#[derive(Debug, Clone, PartialEq, Eq, Default)]
struct Player {
    name: String,
    strength: u8,
    hit_points: u8,
}

fn main() {
    let p1 = Player::default();
    let p2 = p1.clone();
    println!("Is {:?}\nequal to {:?}?\nThe answer is {}!", &p1, &p2,
             if p1 == p2 { "yes" } else { "no" });
}

Default Methods

Traits can implement behavior in terms of other trait methods:

trait Equals {
    fn equals(&self, other: &Self) -> bool;
    fn not_equals(&self, other: &Self) -> bool {
        !self.equals(other)
    }
}

#[derive(Debug)]
struct Centimeter(i16);

impl Equals for Centimeter {
    fn equals(&self, other: &Centimeter) -> bool {
        self.0 == other.0
    }
}

fn main() {
    let a = Centimeter(10);
    let b = Centimeter(20);
    println!("{a:?} equals {b:?}: {}", a.equals(&b));
    println!("{a:?} not_equals {b:?}: {}", a.not_equals(&b));
}
  • Traits may specify pre-implemented (default) methods and methods that users are required to implement themselves. Methods with default implementations can rely on required methods.

  • Move method not_equals to a new trait NotEquals.

  • Make Equals a super trait for NotEquals.

    trait NotEquals: Equals {
        fn not_equals(&self, other: &Self) -> bool {
            !self.equals(other)
        }
    }
  • Provide a blanket implementation of NotEquals for Equals.

    trait NotEquals {
        fn not_equals(&self, other: &Self) -> bool;
    }
    
    impl<T> NotEquals for T where T: Equals {
        fn not_equals(&self, other: &Self) -> bool {
            !self.equals(other)
        }
    }
    • With the blanket implementation, you no longer need Equals as a super trait for NotEquals.

Trait Bounds

When working with generics, you often want to require the types to implement some trait, so that you can call this trait’s methods.

You can do this with T: Trait or impl Trait:

fn duplicate<T: Clone>(a: T) -> (T, T) {
    (a.clone(), a.clone())
}

// Syntactic sugar for:
//   fn add_42_millions<T: Into<i32>>(x: T) -> i32 {
fn add_42_millions(x: impl Into<i32>) -> i32 {
    x.into() + 42_000_000
}

// struct NotClonable;

fn main() {
    let foo = String::from("foo");
    let pair = duplicate(foo);
    println!("{pair:?}");

    let many = add_42_millions(42_i8);
    println!("{many}");
    let many_more = add_42_millions(10_000_000);
    println!("{many_more}");
}

Show a where clause; students will encounter it when reading code.

fn duplicate<T>(a: T) -> (T, T)
where
    T: Clone,
{
    (a.clone(), a.clone())
}
  • It declutters the function signature if you have many parameters.
  • It has additional features making it more powerful.
    • If someone asks, the extra feature is that the type on the left of “:” can be arbitrary, like Option<T>.
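
If the question about the extra feature comes up, a sketch like the following may help. The function show is a made-up example; its bound is placed on Option<T> rather than on T, which the <T: Trait> position cannot express.

```rust
use std::fmt::Debug;

/// `show` is a hypothetical helper. The bound is on `Option<T>`, not on `T`.
fn show<T>(value: Option<T>) -> String
where
    Option<T>: Debug,
{
    format!("{value:?}")
}

fn main() {
    println!("{}", show(Some(3)));
    println!("{}", show::<i32>(None));
}
```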

impl Trait

Similar to trait bounds, an impl Trait syntax can be used in function arguments and return values:

use std::fmt::Display;

fn get_x(name: impl Display) -> impl Display {
    format!("Hello {name}")
}

fn main() {
    let x = get_x("foo");
    println!("{x}");
}
  • impl Trait allows you to work with types which you cannot name.

The meaning of impl Trait is a bit different in the different positions.

  • For a parameter, impl Trait is like an anonymous generic parameter with a trait bound.

  • For a return type, it means that the return type is some concrete type that implements the trait, without naming the type. This can be useful when you don’t want to expose the concrete type in a public API.

    Inference is hard in return position. A function returning impl Foo picks the concrete type it returns, without writing it out in the source. A function returning a generic type like collect<B>() -> B can return any type satisfying B, and the caller may need to choose one, such as with let x: Vec<_> = foo.collect() or with the turbofish, foo.collect::<Vec<_>>().

This example is great, because it uses impl Display twice. It helps to explain that nothing here enforces that it is the same impl Display type. If we used a single T: Display, it would enforce the constraint that input T and return T type are the same type. It would not work for this particular function, as the type we expect as input is likely not what format! returns. If we wanted to do the same via : Display syntax, we’d need two independent generic parameters.
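
A short sketch of the two positions. Here get_x is rewritten with a named generic parameter for the argument, and evens is a hypothetical function whose return type contains a closure and therefore cannot be written out.

```rust
use std::fmt::Display;

// The argument uses a named generic; the return type stays opaque.
// Nothing ties the returned type to T.
fn get_x<T: Display>(name: T) -> impl Display {
    format!("Hello {name}")
}

// The closure passed to `filter` has a type that cannot be named, so the
// full iterator type cannot be spelled out; `impl Iterator` hides it.
fn evens(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0)
}

fn main() {
    println!("{}", get_x("foo"));
    println!("{:?}", evens(7).collect::<Vec<_>>());
}
```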

Important Traits

We will now look at some of the most common traits of the Rust standard library:

Iterators

You can implement the Iterator trait on your own types:

struct Fibonacci {
    curr: u32,
    next: u32,
}

impl Iterator for Fibonacci {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        let new_next = self.curr + self.next;
        self.curr = self.next;
        self.next = new_next;
        Some(self.curr)
    }
}

fn main() {
    let fib = Fibonacci { curr: 0, next: 1 };
    for (i, n) in fib.enumerate().take(5) {
        println!("fib({i}): {n}");
    }
}
  • The Iterator trait implements many common functional programming operations over collections (e.g. map, filter, reduce, etc). This is the trait where you can find all the documentation about them. In Rust these functions should produce code as efficient as equivalent imperative implementations.

  • IntoIterator is the trait that makes for loops work. It is implemented by collection types such as Vec<T> and references to them such as &Vec<T> and &[T]. Ranges also implement it. This is why you can iterate over a vector with for i in some_vec { .. } but some_vec.next() doesn’t exist.

FromIterator

FromIterator lets you build a collection from an Iterator.

fn main() {
    let primes = vec![2, 3, 5, 7];
    let prime_squares = primes
        .into_iter()
        .map(|prime| prime * prime)
        .collect::<Vec<_>>();
    println!("prime_squares: {prime_squares:?}");
}

Iterator implements fn collect<B>(self) -> B where B: FromIterator<Self::Item>, Self: Sized

There are also implementations which let you do cool things like convert an Iterator<Item = Result<V, E>> into a Result<Vec<V>, E>.
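
Collecting into a Result might look like the sketch below. The helper parse_all is invented for this note; collect stops at the first Err and returns it.

```rust
use std::num::ParseIntError;

/// Parses every input, or returns the first parse error encountered.
/// `parse_all` is a hypothetical helper for this note.
fn parse_all(inputs: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    inputs.iter().map(|s| s.parse::<i32>()).collect()
}

fn main() {
    println!("{:?}", parse_all(&["2", "3", "5"]));
    println!("{:?}", parse_all(&["2", "three", "5"]));
}
```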

From and Into

Types implement From and Into to facilitate type conversions:

fn main() {
    let s = String::from("hello");
    let addr = std::net::Ipv4Addr::from([127, 0, 0, 1]);
    let one = i16::from(true);
    let bigger = i32::from(123i16);
    println!("{s}, {addr}, {one}, {bigger}");
}

Into is automatically implemented when From is implemented:

fn main() {
    let s: String = "hello".into();
    let addr: std::net::Ipv4Addr = [127, 0, 0, 1].into();
    let one: i16 = true.into();
    let bigger: i32 = 123i16.into();
    println!("{s}, {addr}, {one}, {bigger}");
}
  • That’s why it is common to only implement From, as your type will get Into implementation too.
  • When declaring a function argument input type like “anything that can be converted into a String”, the rule is opposite, you should use Into. Your function will accept types that implement From and those that only implement Into.
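
Both rules can be sketched together. The types Celsius and Fahrenheit and the function greet are assumptions made for this illustration.

```rust
struct Celsius(f64);
struct Fahrenheit(f64);

// Only From is implemented; Into<Fahrenheit> for Celsius comes for free.
impl From<Celsius> for Fahrenheit {
    fn from(c: Celsius) -> Self {
        Fahrenheit(c.0 * 9.0 / 5.0 + 32.0)
    }
}

// Accepts &str, String, and anything else convertible into a String.
fn greet(name: impl Into<String>) -> String {
    format!("Hello, {}!", name.into())
}

fn main() {
    let f: Fahrenheit = Celsius(100.0).into();
    println!("{} degrees Fahrenheit", f.0);
    println!("{}", greet("Ferris"));
    println!("{}", greet(String::from("Ferris")));
}
```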

Read and Write

Using Read and BufRead, you can abstract over u8 sources:

use std::io::{BufRead, BufReader, Read, Result};

fn count_lines<R: Read>(reader: R) -> usize {
    let buf_reader = BufReader::new(reader);
    buf_reader.lines().count()
}

fn main() -> Result<()> {
    let slice: &[u8] = b"foo\nbar\nbaz\n";
    println!("lines in slice: {}", count_lines(slice));

    let file = std::fs::File::open(std::env::current_exe()?)?;
    println!("lines in file: {}", count_lines(file));
    Ok(())
}

Similarly, Write lets you abstract over u8 sinks:

use std::io::{Result, Write};

fn log<W: Write>(writer: &mut W, msg: &str) -> Result<()> {
    writer.write_all(msg.as_bytes())?;
    writer.write_all("\n".as_bytes())
}

fn main() -> Result<()> {
    let mut buffer = Vec::new();
    log(&mut buffer, "Hello")?;
    log(&mut buffer, "World")?;
    println!("Logged: {:?}", buffer);
    Ok(())
}

The Drop Trait

Values which implement Drop can specify code to run when they go out of scope:

struct Droppable {
    name: &'static str,
}

impl Drop for Droppable {
    fn drop(&mut self) {
        println!("Dropping {}", self.name);
    }
}

fn main() {
    let a = Droppable { name: "a" };
    {
        let b = Droppable { name: "b" };
        {
            let c = Droppable { name: "c" };
            let d = Droppable { name: "d" };
            println!("Exiting block B");
        }
        println!("Exiting block A");
    }
    drop(a);
    println!("Exiting main");
}
  • Note that std::mem::drop is not the same as std::ops::Drop::drop.
  • Values are automatically dropped when they go out of scope.
  • When a value is dropped, if it implements std::ops::Drop then its Drop::drop implementation will be called.
  • All its fields will then be dropped too, whether or not it implements Drop.
  • std::mem::drop is just an empty function that takes any value. The significance is that it takes ownership of the value, so at the end of its scope it gets dropped. This makes it a convenient way to explicitly drop values earlier than they would otherwise go out of scope.
    • This can be useful for objects that do some work on drop: releasing locks, closing files, etc.

Discussion points:

  • Why doesn’t Drop::drop take self?
    • Short-answer: If it did, std::mem::drop would be called at the end of the block, resulting in another call to Drop::drop, and a stack overflow!
  • Try replacing drop(a) with a.drop().

The Default Trait

The Default trait produces a default value for a type.

#[derive(Debug, Default)]
struct Derived {
    x: u32,
    y: String,
    z: Implemented,
}

#[derive(Debug)]
struct Implemented(String);

impl Default for Implemented {
    fn default() -> Self {
        Self("John Smith".into())
    }
}

fn main() {
    let default_struct = Derived::default();
    println!("{default_struct:#?}");

    let almost_default_struct = Derived {
        y: "Y is set!".into(),
        ..Derived::default()
    };
    println!("{almost_default_struct:#?}");

    let nothing: Option<Derived> = None;
    println!("{:#?}", nothing.unwrap_or_default());
}
  • It can be implemented directly or it can be derived via #[derive(Default)].
  • A derived implementation will produce a value where all fields are set to their default values.
    • This means all types in the struct must implement Default too.
  • Standard Rust types often implement Default with reasonable values (e.g. 0, "", etc).
  ‱ The partial struct copy works nicely with Default.
  ‱ The Rust standard library is aware that types can implement Default and provides convenience methods that use it.
  ‱ The .. syntax is called struct update syntax.
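
As a sketch of the convenience methods mentioned above, the following uses two standard library functions that rely on Default: HashMap's Entry::or_default and std::mem::take. The function names count_words and take_buffer are made up for this example:

```rust
use std::collections::HashMap;

// Counts words; `or_default()` inserts `u32::default()` (0) for unseen keys.
fn count_words(text: &str) -> HashMap<&str, u32> {
    let mut counts: HashMap<&str, u32> = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word).or_default() += 1;
    }
    counts
}

// Moves the contents out of `buffer`, leaving `String::default()` behind.
fn take_buffer(buffer: &mut String) -> String {
    std::mem::take(buffer)
}

fn main() {
    println!("{:?}", count_words("a b a"));

    let mut buffer = String::from("pending");
    let taken = take_buffer(&mut buffer);
    println!("taken: {taken:?}, left behind: {buffer:?}");
}
```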

Add, Mul, ...

Operator overloading is implemented via traits in std::ops:

#[derive(Debug, Copy, Clone)]
struct Point { x: i32, y: i32 }

impl std::ops::Add for Point {
    type Output = Self;

    fn add(self, other: Self) -> Self {
        Self {x: self.x + other.x, y: self.y + other.y}
    }
}

fn main() {
    let p1 = Point { x: 10, y: 20 };
    let p2 = Point { x: 100, y: 200 };
    println!("{:?} + {:?} = {:?}", p1, p2, p1 + p2);
}

Discussion points:

  • You could implement Add for &Point. In which situations is that useful?
    ‱ Answer: Add::add consumes self. If the type T for which you are overloading the operator is not Copy, you should consider overloading the operator for &T as well. This avoids unnecessary cloning at the call site.
  • Why is Output an associated type? Could it be made a type parameter of the method?
    • Short answer: Function type parameters are controlled by the caller, but associated types (like Output) are controlled by the implementor of a trait.
  • You could implement Add for two different types, e.g. impl Add<(i32, i32)> for Point would add a tuple to a Point.
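
The two variations from the discussion points can be sketched as follows. This version of Point deliberately does not derive Copy, so that the difference between adding values and adding references is visible; it is an illustration, not part of the course material:

```rust
use std::ops::Add;

// Deliberately not Copy, so adding by value would move the operands.
#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Add for references: `&p1 + &p2` leaves both operands usable afterwards.
impl Add for &Point {
    type Output = Point;

    fn add(self, other: Self) -> Point {
        Point { x: self.x + other.x, y: self.y + other.y }
    }
}

// Add with a different right-hand side type: `point + (dx, dy)`.
impl Add<(i32, i32)> for Point {
    type Output = Point;

    fn add(self, (dx, dy): (i32, i32)) -> Point {
        Point { x: self.x + dx, y: self.y + dy }
    }
}

fn main() {
    let p1 = Point { x: 10, y: 20 };
    let p2 = Point { x: 100, y: 200 };
    let sum = &p1 + &p2;
    println!("{p1:?} + {p2:?} = {sum:?}");
    println!("{:?}", sum + (1, 2));
}
```

Note that Output is Point in both impls even though the operand types differ, which is one reason it is an associated type chosen by the implementor.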

Closures

Closures or lambda expressions have types which cannot be named. However, they implement special Fn, FnMut, and FnOnce traits:

fn apply_with_log(func: impl FnOnce(i32) -> i32, input: i32) -> i32 {
    println!("Calling function on {input}");
    func(input)
}

fn main() {
    let add_3 = |x| x + 3;
    println!("add_3: {}", apply_with_log(add_3, 10));
    println!("add_3: {}", apply_with_log(add_3, 20));

    let mut v = Vec::new();
    let mut accumulate = |x: i32| {
        v.push(x);
        v.iter().sum::<i32>()
    };
    println!("accumulate: {}", apply_with_log(&mut accumulate, 4));
    println!("accumulate: {}", apply_with_log(&mut accumulate, 5));

    let multiply_sum = |x| x * v.into_iter().sum::<i32>();
    println!("multiply_sum: {}", apply_with_log(multiply_sum, 3));
}

An Fn (e.g. add_3) neither consumes nor mutates captured values, or perhaps captures nothing at all. It can be called multiple times concurrently.

An FnMut (e.g. accumulate) might mutate captured values. You can call it multiple times, but not concurrently.

If you have an FnOnce (e.g. multiply_sum), you may only call it once. It might consume captured values.

FnMut is a subtype of FnOnce. Fn is a subtype of FnMut and FnOnce. I.e. you can use an FnMut wherever an FnOnce is called for, and you can use an Fn wherever an FnMut or FnOnce is called for.

The compiler also infers Copy (e.g. for add_3) and Clone (e.g. multiply_sum), depending on what the closure captures.

By default, closures will capture by reference if they can. The move keyword makes them capture by value.

fn make_greeter(prefix: String) -> impl Fn(&str) {
    move |name| println!("{} {}", prefix, name)
}

fn main() {
    let hi = make_greeter("Hi".to_string());
    hi("there");
}
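
The relationship between the three closure traits, and the effect of move, can be sketched with a few small helpers. The names make_counter, call_twice and call_once are hypothetical and exist only for this example:

```rust
// Returns a closure that owns `count`; `move` is required because the
// closure outlives this function's stack frame.
fn make_counter(start: u32) -> impl FnMut() -> u32 {
    let mut count = start;
    move || {
        count += 1;
        count
    }
}

// Accepts any FnMut, which includes every Fn as well.
fn call_twice(mut f: impl FnMut() -> u32) -> (u32, u32) {
    let first = f();
    let second = f();
    (first, second)
}

// Accepts any FnOnce, which includes every FnMut and Fn as well.
fn call_once(f: impl FnOnce() -> String) -> String {
    f()
}

fn main() {
    println!("{:?}", call_twice(make_counter(10)));
    println!("{:?}", call_twice(|| 7)); // an Fn used where FnMut is expected

    let greeting = String::from("hello");
    // `move` transfers `greeting` into the closure, which then gives it away,
    // so this closure can only be FnOnce.
    println!("{}", call_once(move || greeting));
}
```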

Unsafe Rust

The Rust language has two parts:

  • Safe Rust: memory safe, no undefined behavior possible.
  • Unsafe Rust: can trigger undefined behavior if preconditions are violated.

We will be seeing mostly safe Rust in this course, but it’s important to know what Unsafe Rust is.

Unsafe code is usually small and isolated, and its correctness should be carefully documented. It is usually wrapped in a safe abstraction layer.

Unsafe Rust gives you access to five new capabilities:

  • Dereference raw pointers.
  • Access or modify mutable static variables.
  • Access union fields.
  • Call unsafe functions, including extern functions.
  • Implement unsafe traits.

We will briefly cover unsafe capabilities next. For full details, please see Chapter 19.1 in the Rust Book and the Rustonomicon.

Unsafe Rust does not mean the code is incorrect. It means that the compiler no longer enforces all of Rust’s memory-safety rules for the operations marked unsafe, so developers have to uphold those rules by themselves.
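
To illustrate what a safe abstraction layer over a small unsafe block can look like, here is a minimal sketch. The helper first_and_last_mut is hypothetical; it hands out two mutable references into one slice, which plain indexing would not allow, while callers never write unsafe themselves:

```rust
// Hypothetical helper: returns mutable references to the first and last
// elements of a slice, or None if the slice has fewer than two elements.
fn first_and_last_mut(values: &mut [i32]) -> Option<(&mut i32, &mut i32)> {
    let len = values.len();
    if len < 2 {
        return None;
    }
    let ptr = values.as_mut_ptr();
    // SAFETY: `len >= 2`, so `ptr` and `ptr.add(len - 1)` both point inside
    // the slice and refer to different elements. The slice is exclusively
    // borrowed for the lifetime of the returned references.
    unsafe { Some((&mut *ptr, &mut *ptr.add(len - 1))) }
}

fn main() {
    let mut values = [1, 2, 3];
    if let Some((first, last)) = first_and_last_mut(&mut values) {
        std::mem::swap(first, last);
    }
    println!("{values:?}");
}
```

The length check is what makes the function safe to call with any slice; without it, the two returned references could alias the same element.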

Dereferencing Raw Pointers

Creating pointers is safe, but dereferencing them requires unsafe:

fn main() {
    let mut num = 5;

    let r1 = &mut num as *mut i32;
    let r2 = r1 as *const i32;

    // Safe because r1 and r2 were obtained from references and so are
    // guaranteed to be non-null and properly aligned, the objects underlying
    // the references from which they were obtained are live throughout the
    // whole unsafe block, and they are not accessed either through the
    // references or concurrently through any other pointers.
    unsafe {
        println!("r1 is: {}", *r1);
        *r1 = 10;
        println!("r2 is: {}", *r2);
    }
}

It is good practice (and required by the Android Rust style guide) to write a comment for each unsafe block explaining how the code inside it satisfies the safety requirements of the unsafe operations it is doing.

In the case of pointer dereferences, this means that the pointers must be valid, i.e.:

  • The pointer must be non-null.
  • The pointer must be dereferenceable (within the bounds of a single allocated object).
  • The object must not have been deallocated.
  • There must not be concurrent accesses to the same location.
  • If the pointer was obtained by casting a reference, the underlying object must be live and no reference may be used to access the memory.

In most cases the pointer must also be properly aligned.
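
Some of these conditions can be checked at run time and others cannot. The sketch below, using a hypothetical function read_if_non_null, checks for null with the standard as_ref method on raw pointers and documents the remaining conditions as obligations of the caller:

```rust
/// Reads through `ptr` if it is non-null.
///
/// # Safety
///
/// If `ptr` is non-null, it must be aligned and point to a live, initialized
/// `i32` that is not being written concurrently.
unsafe fn read_if_non_null(ptr: *const i32) -> Option<i32> {
    // SAFETY: `as_ref` performs the null check; the caller guarantees the
    // remaining validity conditions.
    unsafe { ptr.as_ref().copied() }
}

fn main() {
    let value = 5;
    let valid: *const i32 = &value;
    let null: *const i32 = std::ptr::null();

    // SAFETY: `valid` comes from a reference to `value`, which is live and
    // not modified; `null` is null, so it is never dereferenced.
    unsafe {
        println!("{:?}", read_if_non_null(valid));
        println!("{:?}", read_if_non_null(null));
    }
}
```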

Mutable Static Variables

It is safe to read an immutable static variable:

static HELLO_WORLD: &str = "Hello, world!";

fn main() {
    println!("HELLO_WORLD: {HELLO_WORLD}");
}

However, since data races can occur, it is unsafe to read and write mutable static variables:

static mut COUNTER: u32 = 0;

fn add_to_counter(inc: u32) {
    unsafe { COUNTER += inc; }  // Potential data race!
}

fn main() {
    add_to_counter(42);

    unsafe { println!("COUNTER: {COUNTER}"); }  // Potential data race!
}
  • The program here is safe because it is single-threaded. However, the Rust compiler is conservative and will assume the worst. Try removing the unsafe and see how the compiler explains that it is undefined behavior to mutate a static from multiple threads.

  • Using a mutable static is generally a bad idea, but there are some cases where it might make sense in low-level no_std code, such as implementing a heap allocator or working with some C APIs.
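
When a global counter is really needed, one common safe alternative (not shown in the course example above) is an immutable static holding an atomic type. A minimal sketch, reusing the COUNTER and add_to_counter names from the example:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// An immutable static with interior mutability; no `unsafe` required.
static COUNTER: AtomicU32 = AtomicU32::new(0);

// Adds `inc` atomically and returns the new value.
fn add_to_counter(inc: u32) -> u32 {
    COUNTER.fetch_add(inc, Ordering::Relaxed).wrapping_add(inc)
}

fn main() {
    add_to_counter(42);
    println!("COUNTER: {}", COUNTER.load(Ordering::Relaxed));
}
```

Because every access goes through an atomic operation, there is no data race even if add_to_counter is called from several threads.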

Unions

Unions are like enums, but you need to track the active field yourself:

#[repr(C)]
union MyUnion {
    i: u8,
    b: bool,
}

fn main() {
    let u = MyUnion { i: 42 };
    println!("int: {}", unsafe { u.i });
    println!("bool: {}", unsafe { u.b });  // Undefined behavior!
}

Unions are very rarely needed in Rust as you can usually use an enum. They are occasionally needed for interacting with C library APIs.

If you just want to reinterpret bytes as a different type, you probably want std::mem::transmute or a safe wrapper such as the zerocopy crate.
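
Tracking the active field yourself usually means storing a tag next to the union. The following sketch shows the idea; Tag, Payload and Tagged are hypothetical names, and the result is essentially a hand-written version of what an enum gives you for free:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tag {
    Int,
    Bool,
}

#[repr(C)]
union Payload {
    i: u8,
    b: bool,
}

// Hypothetical wrapper: the tag records which union field is active.
struct Tagged {
    tag: Tag,
    payload: Payload,
}

impl Tagged {
    fn from_int(i: u8) -> Self {
        Tagged { tag: Tag::Int, payload: Payload { i } }
    }

    fn from_bool(b: bool) -> Self {
        Tagged { tag: Tag::Bool, payload: Payload { b } }
    }

    fn as_int(&self) -> Option<u8> {
        match self.tag {
            // SAFETY: the tag says `i` was the field that was written.
            Tag::Int => Some(unsafe { self.payload.i }),
            Tag::Bool => None,
        }
    }

    fn as_bool(&self) -> Option<bool> {
        match self.tag {
            // SAFETY: the tag says `b` was the field that was written.
            Tag::Bool => Some(unsafe { self.payload.b }),
            Tag::Int => None,
        }
    }
}

fn main() {
    let value = Tagged::from_int(42);
    println!("int: {:?}, bool: {:?}", value.as_int(), value.as_bool());
}
```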

Calling Unsafe Functions

A function or method can be marked unsafe if it has extra preconditions you must uphold to avoid undefined behaviour:

fn main() {
    let emojis = "đŸ—»âˆˆđŸŒ";

    // Safe because the indices are in the correct order, within the bounds of
    // the string slice, and lie on UTF-8 sequence boundaries.
    unsafe {
        println!("emoji: {}", emojis.get_unchecked(0..4));
        println!("emoji: {}", emojis.get_unchecked(4..7));
        println!("emoji: {}", emojis.get_unchecked(7..11));
    }

    println!("char count: {}", count_chars(unsafe { emojis.get_unchecked(0..7) }));

    // Not upholding the UTF-8 encoding requirement breaks memory safety!
    // println!("emoji: {}", unsafe { emojis.get_unchecked(0..3) });
    // println!("char count: {}", count_chars(unsafe { emojis.get_unchecked(0..3) }));
}

fn count_chars(s: &str) -> usize {
    s.chars().map(|_| 1).sum()
}
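
For comparison, the checked counterpart of get_unchecked is str::get, which verifies the same preconditions at run time and returns None when they do not hold. A small sketch, where slice_or_placeholder is a made-up helper and the string is written with escapes for the same three characters as above:

```rust
// Returns the byte range `start..end` of `s`, or a placeholder when the
// range is out of bounds or does not lie on UTF-8 boundaries.
fn slice_or_placeholder(s: &str, start: usize, end: usize) -> &str {
    s.get(start..end).unwrap_or("<invalid range>")
}

fn main() {
    // Escapes for a 4-byte, a 3-byte, and another 4-byte character.
    let emojis = "\u{1F5FB}\u{2208}\u{1F30F}";
    println!("{}", slice_or_placeholder(emojis, 0, 4));
    println!("{}", slice_or_placeholder(emojis, 0, 3)); // splits a character
}
```

The unchecked version skips these checks, which is why the caller has to promise that they would have passed.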

Writing Unsafe Functions

You can mark your own functions as unsafe if they require particular conditions to avoid undefined behaviour.

/// Swaps the values pointed to by the given pointers.
///
/// # Safety
///
/// The pointers must be valid and properly aligned.
unsafe fn swap(a: *mut u8, b: *mut u8) {
    let temp = *a;
    *a = *b;
    *b = temp;
}

fn main() {
    let mut a = 42;
    let mut b = 66;

    // Safe because ...
    unsafe {
        swap(&mut a, &mut b);
    }

    println!("a = {}, b = {}", a, b);
}

We wouldn’t actually use pointers for this because it can be done safely with references.

Note that unsafe code is allowed within an unsafe function without an unsafe block. We can prohibit this with #[deny(unsafe_op_in_unsafe_fn)]. Try adding it and see what happens.
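
As a sketch of both remarks, the version below applies the lint to the function (renamed swap_raw here to avoid confusion with std::mem::swap), which forces an explicit unsafe block with its own justification, and adds the safe, reference-based alternative next to it:

```rust
/// Swaps the values pointed to by the given pointers.
///
/// # Safety
///
/// The pointers must be valid and properly aligned.
#[deny(unsafe_op_in_unsafe_fn)]
unsafe fn swap_raw(a: *mut u8, b: *mut u8) {
    // SAFETY: the caller guarantees that both pointers are valid and aligned.
    unsafe {
        let temp = *a;
        *a = *b;
        *b = temp;
    }
}

// The safe alternative: references already guarantee validity.
fn swap_safe(a: &mut u8, b: &mut u8) {
    std::mem::swap(a, b);
}

fn main() {
    let mut a = 42;
    let mut b = 66;

    // SAFETY: both pointers come from live, distinct mutable references.
    unsafe {
        swap_raw(&mut a, &mut b);
    }
    swap_safe(&mut a, &mut b);
    println!("a = {a}, b = {b}");
}
```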

Calling External Code

Functions from other languages might violate the guarantees of Rust. Calling them is thus unsafe:

extern "C" {
    fn abs(input: i32) -> i32;
}

fn main() {
    unsafe {
        // Undefined behavior if abs misbehaves.
        println!("Absolute value of -3 according to C: {}", abs(-3));
    }
}

This is usually only a problem for extern functions that do things with pointers that might violate Rust’s memory model, but in general any C function might have undefined behaviour under arbitrary circumstances.

The "C" in this example is the ABI; other ABIs are available too.
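
A pointer-taking C function is a more typical case. The sketch below wraps the C library's strlen in a safe function; c_string_length is a hypothetical name. It assumes Rust 1.82 or later, where the extern block can be written as unsafe extern (on older toolchains the plain extern "C" form from the example above is used instead), and it assumes the platform's C library is linked, as it is by default for std programs:

```rust
use std::ffi::{c_char, CStr};

unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

// Safe wrapper: `CStr` guarantees a valid, NUL-terminated string.
fn c_string_length(s: &CStr) -> usize {
    // SAFETY: `s.as_ptr()` points to a NUL-terminated string that stays
    // alive for the duration of the call, which is all `strlen` requires.
    unsafe { strlen(s.as_ptr()) }
}

fn main() {
    let hello = CStr::from_bytes_with_nul(b"hello\0").unwrap();
    println!("strlen according to C: {}", c_string_length(hello));
}
```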

Implementing Unsafe Traits

Like with functions, you can mark a trait as unsafe if the implementation must guarantee particular conditions to avoid undefined behaviour.

For example, the zerocopy crate has an unsafe trait that looks something like this:

use std::mem::size_of_val;
use std::slice;

/// ...
/// # Safety
/// The type must have a defined representation and no padding.
pub unsafe trait AsBytes {
    fn as_bytes(&self) -> &[u8] {
        unsafe {
            slice::from_raw_parts(self as *const Self as *const u8, size_of_val(self))
        }
    }
}

// Safe because u32 has a defined representation and no padding.
unsafe impl AsBytes for u32 {}

There should be a # Safety section on the Rustdoc for the trait explaining the requirements for the trait to be safely implemented.

The actual safety section for AsBytes is rather longer and more complicated.

The built-in Send and Sync traits are unsafe.
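
To see the simplified AsBytes trait from above in use, here is a self-contained sketch. It repeats the trait definition and adds a second, assumed implementation for [u8; 4]; the real zerocopy trait differs in its details:

```rust
use std::mem::size_of_val;
use std::slice;

/// # Safety
/// The type must have a defined representation and no padding.
unsafe trait AsBytes {
    fn as_bytes(&self) -> &[u8] {
        // SAFETY: implementors promise a defined representation without
        // padding, so every byte of `self` is initialized.
        unsafe { slice::from_raw_parts(self as *const Self as *const u8, size_of_val(self)) }
    }
}

// SAFETY: u32 and [u8; 4] have defined representations and no padding.
unsafe impl AsBytes for u32 {}
unsafe impl AsBytes for [u8; 4] {}

fn main() {
    let n: u32 = 0x0102_0304;
    // The bytes appear in the platform's native byte order.
    println!("{:?}", n.as_bytes());
    println!("{:?}", [1u8, 2, 3, 4].as_bytes());
}
```

Calling as_bytes needs no unsafe at all: the burden of proof sits entirely on the unsafe impl lines.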

Day 2: Exercises

We will look at implementing methods in two contexts:

  • Storing books and querying the collection

  • Implementing point and polygon types

After looking at the exercises, you can look at the solutions provided.

Storing Books

We will learn much more about structs and the Vec<T> type tomorrow. For now, you just need to know part of its API:

fn main() {
    let mut vec = vec![10, 20];
    vec.push(30);
    let midpoint = vec.len() / 2;
    println!("middle value: {}", vec[midpoint]);
    for item in &vec {
        println!("item: {item}");
    }
}

Use this to model a library’s book collection. Copy the code below to https://play.rust-lang.org/ and update the types to make it compile:

struct Library {
    books: Vec<Book>,
}

struct Book {
    title: String,
    year: u16,
}

impl Book {
    // This is a constructor, used below.
    fn new(title: &str, year: u16) -> Book {
        Book {
            title: String::from(title),
            year,
        }
    }
}

// Implement the methods below. Notice how the `self` parameter
// changes type to indicate the method's required level of ownership
// over the object:
//
// - `&self` for shared read-only access,
// - `&mut self` for unique and mutable access,
// - `self` for unique access by value.
impl Library {
    fn new() -> Library {
        todo!("Initialize and return a `Library` value")
    }

    fn len(&self) -> usize {
        todo!("Return the length of `self.books`")
    }

    fn is_empty(&self) -> bool {
        todo!("Return `true` if `self.books` is empty")
    }

    fn add_book(&mut self, book: Book) {
        todo!("Add a new book to `self.books`")
    }

    fn print_books(&self) {
        todo!("Iterate over `self.books` and print each book's title and year")
    }

    fn oldest_book(&self) -> Option<&Book> {
        todo!("Return a reference to the oldest book (if any)")
    }
}

fn main() {
    let mut library = Library::new();

    println!(
        "The library is empty: library.is_empty() -> {}",
        library.is_empty()
    );

    library.add_book(Book::new("Lord of the Rings", 1954));
    library.add_book(Book::new("Alice's Adventures in Wonderland", 1865));

    println!(
        "The library is no longer empty: library.is_empty() -> {}",
        library.is_empty()
    );

    library.print_books();

    match library.oldest_book() {
        Some(book) => println!("The oldest book is {}", book.title),
        None => println!("The library is empty!"),
    }

    println!("The library has {} books", library.len());
    library.print_books();
}

Polygon Struct

We will create a Polygon struct which contains some points. Copy the code below to https://play.rust-lang.org/ and fill in the missing methods to make the tests pass:

// TODO: remove this when you're done with your implementation.
#![allow(unused_variables, dead_code)]

pub struct Point {
    // add fields
}

impl Point {
    // add methods
}

pub struct Polygon {
    // add fields
}

impl Polygon {
    // add methods
}

pub struct Circle {
    // add fields
}

impl Circle {
    // add methods
}

pub enum Shape {
    Polygon(Polygon),
    Circle(Circle),
}

#[cfg(test)]
mod tests {
    use super::*;

    fn round_two_digits(x: f64) -> f64 {
        (x * 100.0).round() / 100.0
    }

    #[test]
    fn test_point_magnitude() {
        let p1 = Point::new(12, 13);
        assert_eq!(round_two_digits(p1.magnitude()), 17.69);
    }

    #[test]
    fn test_point_dist() {
        let p1 = Point::new(10, 10);
        let p2 = Point::new(14, 13);
        assert_eq!(round_two_digits(p1.dist(p2)), 5.00);
    }

    #[test]
    fn test_point_add() {
        let p1 = Point::new(16, 16);
        let p2 = p1 + Point::new(-4, 3);
        assert_eq!(p2, Point::new(12, 19));
    }

    #[test]
    fn test_polygon_left_most_point() {
        let p1 = Point::new(12, 13);
        let p2 = Point::new(16, 16);

        let mut poly = Polygon::new();
        poly.add_point(p1);
        poly.add_point(p2);
        assert_eq!(poly.left_most_point(), Some(p1));
    }

    #[test]
    fn test_polygon_iter() {
        let p1 = Point::new(12, 13);
        let p2 = Point::new(16, 16);

        let mut poly = Polygon::new();
        poly.add_point(p1);
        poly.add_point(p2);

        let points = poly.iter().cloned().collect::<Vec<_>>();
        assert_eq!(points, vec![Point::new(12, 13), Point::new(16, 16)]);
    }

    #[test]
    fn test_shape_perimeters() {
        let mut poly = Polygon::new();
        poly.add_point(Point::new(12, 13));
        poly.add_point(Point::new(17, 11));
        poly.add_point(Point::new(16, 16));
        let shapes = vec![
            Shape::from(poly),
            Shape::from(Circle::new(Point::new(10, 20), 5)),
        ];
        let perimeters = shapes
            .iter()
            .map(Shape::perimeter)
            .map(round_two_digits)
            .collect::<Vec<_>>();
        assert_eq!(perimeters, vec![15.48, 31.42]);
    }
}

#[allow(dead_code)]
fn main() {}

Since the method signatures are missing from the problem statements, the key part of the exercise is to specify those correctly. You don’t have to modify the tests.

Other interesting parts of the exercise:

  • Derive a Copy trait for some structs, as in tests the methods sometimes don’t borrow their arguments.
  ‱ Discover that the Add trait must be implemented for two objects to be addable via “+”. Note that we do not discuss generics until Day 3.

Thanks!

Thank you for taking Comprehensive Rust Two-Day 🩀! We hope you enjoyed it and that it was useful.

If you spotted any mistakes or have ideas for improvements, please get in contact with us on GitHub or on the Rust-Edu Zulip. We would love to hear from you.

Glossary

The following is a glossary which aims to give a short definition of many Rust terms. For translations, this also serves to connect the term back to the English original.

  • allocate:
    Dynamic memory allocation on the heap.
  • argument:
  • Bare-metal Rust:
    Low-level Rust development, often deployed to a system without an operating system. See Bare-metal Rust.
  • block:
    See Blocks and scope.
  • borrow:
    See Borrowing.
  • borrow checker:
    The part of the Rust compiler which checks that all borrows are valid.
  • brace:
    { and }. Also called curly brace, they delimit blocks.
  • build:
  • call:
  • channel:
    Used to safely pass messages between threads.
  ‱ Comprehensive Rust 🩀:
    The courses here are jointly called Comprehensive Rust 🩀.
  • concurrency:
  • Concurrency in Rust:
    See Concurrency in Rust.
  • constant:
  • control flow:
  • crash:
  • enumeration:
  • error:
  • error handling:
  • exercise:
  • function:
  • garbage collector:
  • generics:
  • immutable:
  • integration test:
  • keyword:
  • library:
  • macro:
  • main function:
  • match:
  • memory leak:
  • method:
  • module:
  • move:
  • mutable:
  • ownership:
  • panic:
  • parameter:
  • pattern:
  • payload:
  • program:
  • programming language:
  • receiver:
  • reference counting:
  • return:
  • Rust:
  • Rust Fundamentals:
    Days 1 to 3 of this course.
  • Rust in Android:
    See Rust in Android.
  • safe:
  • scope:
  • standard library:
  • static:
  • string:
  • struct:
  • test:
  • thread:
  • thread safety:
  • trait:
  • type:
  • type inference:
  • undefined behavior:
  • union:
  • unit test:
  • unsafe:
  ‱ variable:

Other Rust Resources

The Rust community has created a wealth of high-quality and free resources online.

Official Documentation

The Rust project hosts many resources. These cover Rust in general:

  • The Rust Programming Language: the canonical free book about Rust. Covers the language in detail and includes a few projects for people to build.
  • Rust By Example: covers the Rust syntax via a series of examples which showcase different constructs. Sometimes includes small exercises where you are asked to expand on the code in the examples.
  • Rust Standard Library: full documentation of the standard library for Rust.
  • The Rust Reference: an incomplete book which describes the Rust grammar and memory model.

More specialized guides hosted on the official Rust site:

  • The Rustonomicon: covers unsafe Rust, including working with raw pointers and interfacing with other languages (FFI).
  • Asynchronous Programming in Rust: covers the new asynchronous programming model which was introduced after the Rust Book was written.
  • The Embedded Rust Book: an introduction to using Rust on embedded devices without an operating system.

Unofficial Learning Material

A small selection of other guides and tutorials for Rust:

Please see the Little Book of Rust Books for even more Rust books.

Credits

The material here builds on top of the many great sources of Rust documentation. See the page on other resources for a full list of useful resources.

The material of Comprehensive Rust is licensed under the terms of the Apache 2.0 license. Please see LICENSE for details.

Rust by Example

Some examples and exercises have been copied and adapted from Rust by Example. Please see the third_party/rust-by-example/ directory for details, including the license terms.

Rust on Exercism

Some exercises have been copied and adapted from Rust on Exercism. Please see the third_party/rust-on-exercism/ directory for details, including the license terms.

CXX

The Interoperability with C++ section uses an image from CXX. Please see the third_party/cxx/ directory for details, including the license terms.

An Example in C

The Why Rust? - An Example in C section has been taken from the presentation slides of Colin Finck’s Master Thesis. It has been relicensed under the terms of the Apache 2.0 license for this course by the author.

Solutions

You will find solutions to the exercises on the following pages.

Feel free to ask questions about the solutions on GitHub. Let us know if you have a different or better solution than what is presented here.

Day 1 Exercises

Arrays and for Loops

fn transpose(matrix: [[i32; 3]; 3]) -> [[i32; 3]; 3] {
    let mut result = [[0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            result[j][i] = matrix[i][j];
        }
    }
    result
}

fn pretty_print(matrix: &[[i32; 3]; 3]) {
    for row in matrix {
        println!("{row:?}");
    }
}

#[test]
fn test_transpose() {
    let matrix = [
        [101, 102, 103], //
        [201, 202, 203],
        [301, 302, 303],
    ];
    let transposed = transpose(matrix);
    assert_eq!(
        transposed,
        [
            [101, 201, 301], //
            [102, 202, 302],
            [103, 203, 303],
        ]
    );
}

fn main() {
    let matrix = [
        [101, 102, 103], // <-- the comment makes rustfmt add a newline
        [201, 202, 203],
        [301, 302, 303],
    ];

    println!("matrix:");
    pretty_print(&matrix);

    let transposed = transpose(matrix);
    println!("transposed:");
    pretty_print(&transposed);
}

Bonus question

The bonus question requires more advanced concepts. It might seem that we could use a slice-of-slices (&[&[i32]]) as the input type to transpose and thus make our function handle any size of matrix. However, this quickly breaks down: the return type cannot be &[&[i32]] since it needs to own the data you return.

You can attempt to use something like Vec<Vec<i32>>, but this doesn’t work out-of-the-box either: it’s hard to convert from Vec<Vec<i32>> to &[&[i32]] so now you cannot easily use pretty_print either.

Once we get to traits and generics, we’ll be able to use the std::convert::AsRef trait to abstract over anything that can be referenced as a slice.

use std::convert::AsRef;
use std::fmt::Debug;

fn pretty_print<T, Line, Matrix>(matrix: Matrix)
where
    T: Debug,
    // A line references a slice of items
    Line: AsRef<[T]>,
    // A matrix references a slice of lines
    Matrix: AsRef<[Line]>
{
    for row in matrix.as_ref() {
        println!("{:?}", row.as_ref());
    }
}

fn main() {
    // &[&[i32]]
    pretty_print(&[&[1, 2, 3], &[4, 5, 6], &[7, 8, 9]]);
    // [[&str; 2]; 2]
    pretty_print([["a", "b"], ["c", "d"]]);
    // Vec<Vec<i32>>
    pretty_print(vec![vec![1, 2], vec![3, 4]]);
}

In addition, the type itself would not enforce that the child slices are of the same length, so such a variable could contain an invalid matrix.
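
One way to keep the row length in the type while still supporting different matrix sizes is const generics, a feature this course does not cover. The following is only a sketch of that idea, not the intended solution to the exercise:

```rust
// Rows (R) and columns (C) are part of the type, so ragged matrices
// cannot be represented at all.
fn transpose<const R: usize, const C: usize>(matrix: [[i32; C]; R]) -> [[i32; R]; C] {
    let mut result = [[0; R]; C];
    for (i, row) in matrix.iter().enumerate() {
        for (j, value) in row.iter().enumerate() {
            result[j][i] = *value;
        }
    }
    result
}

fn main() {
    let matrix = [
        [1, 2, 3], //
        [4, 5, 6],
    ];
    for row in transpose(matrix) {
        println!("{row:?}");
    }
}
```

The compiler infers R and C from the argument, so a 2x3 input yields a 3x2 result without any run-time size checks.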

Luhn Algorithm

pub fn luhn(cc_number: &str) -> bool {
    let mut sum = 0;
    let mut double = false;
    let mut digit_seen = 0;

    for c in cc_number.chars().filter(|&f| f != ' ').rev() {
        if let Some(digit) = c.to_digit(10) {
            if double {
                let double_digit = digit * 2;
                sum += if double_digit > 9 {
                    double_digit - 9
                } else {
                    double_digit
                };
            } else {
                sum += digit;
            }
            double = !double;
            digit_seen += 1;
        } else {
            return false;
        }
    }

    if digit_seen < 2 {
        return false;
    }

    sum % 10 == 0
}

fn main() {
    let cc_number = "1234 5678 1234 5670";
    println!(
        "Is {cc_number} a valid credit card number? {}",
        if luhn(cc_number) { "yes" } else { "no" }
    );
}

#[test]
fn test_non_digit_cc_number() {
    assert!(!luhn("foo"));
    assert!(!luhn("foo 0 0"));
}

#[test]
fn test_empty_cc_number() {
    assert!(!luhn(""));
    assert!(!luhn(" "));
    assert!(!luhn("  "));
    assert!(!luhn("    "));
}

#[test]
fn test_single_digit_cc_number() {
    assert!(!luhn("0"));
}

#[test]
fn test_two_digit_cc_number() {
    assert!(luhn(" 0 0 "));
}

#[test]
fn test_valid_cc_number() {
    assert!(luhn("4263 9826 4026 9299"));
    assert!(luhn("4539 3195 0343 6467"));
    assert!(luhn("7992 7398 713"));
}

#[test]
fn test_invalid_cc_number() {
    assert!(!luhn("4223 9826 4026 9299"));
    assert!(!luhn("4539 3195 0343 6476"));
    assert!(!luhn("8273 1232 7352 0569"));
}

Day 2 Exercises

Designing a Library

struct Library {
    books: Vec<Book>,
}

struct Book {
    title: String,
    year: u16,
}

impl Book {
    // This is a constructor, used below.
    fn new(title: &str, year: u16) -> Book {
        Book {
            title: String::from(title),
            year,
        }
    }
}

// Implement the methods below. Notice how the `self` parameter
// changes type to indicate the method's required level of ownership
// over the object:
//
// - `&self` for shared read-only access,
// - `&mut self` for unique and mutable access,
// - `self` for unique access by value.
impl Library {

    fn new() -> Library {
        Library { books: Vec::new() }
    }

    fn len(&self) -> usize {
        self.books.len()
    }

    fn is_empty(&self) -> bool {
        self.books.is_empty()
    }

    fn add_book(&mut self, book: Book) {
        self.books.push(book)
    }

    fn print_books(&self) {
        for book in &self.books {
            println!("{}, published in {}", book.title, book.year);
        }
    }

    fn oldest_book(&self) -> Option<&Book> {
        // Using a closure and a built-in method:
        // self.books.iter().min_by_key(|book| book.year)

        // Longer hand-written solution:
        let mut oldest: Option<&Book> = None;
        for book in self.books.iter() {
            if oldest.is_none() || book.year < oldest.unwrap().year {
                oldest = Some(book);
            }
        }

        oldest
    }
}

fn main() {
    let mut library = Library::new();

    println!(
        "The library is empty: library.is_empty() -> {}",
        library.is_empty()
    );

    library.add_book(Book::new("Lord of the Rings", 1954));
    library.add_book(Book::new("Alice's Adventures in Wonderland", 1865));

    println!(
        "The library is no longer empty: library.is_empty() -> {}",
        library.is_empty()
    );

    library.print_books();

    match library.oldest_book() {
        Some(book) => println!("The oldest book is {}", book.title),
        None => println!("The library is empty!"),
    }

    println!("The library has {} books", library.len());
    library.print_books();
}

#[test]
fn test_library_len() {
    let mut library = Library::new();
    assert_eq!(library.len(), 0);
    assert!(library.is_empty());

    library.add_book(Book::new("Lord of the Rings", 1954));
    library.add_book(Book::new("Alice's Adventures in Wonderland", 1865));
    assert_eq!(library.len(), 2);
    assert!(!library.is_empty());
}

#[test]
fn test_library_is_empty() {
    let mut library = Library::new();
    assert!(library.is_empty());

    library.add_book(Book::new("Lord of the Rings", 1954));
    assert!(!library.is_empty());
}

#[test]
fn test_library_print_books() {
    let mut library = Library::new();
    library.add_book(Book::new("Lord of the Rings", 1954));
    library.add_book(Book::new("Alice's Adventures in Wonderland", 1865));
    // We could try and capture stdout, but let us just call the
    // method to start with.
    library.print_books();
}

#[test]
fn test_library_oldest_book() {
    let mut library = Library::new();
    assert!(library.oldest_book().is_none());

    library.add_book(Book::new("Lord of the Rings", 1954));
    assert_eq!(
        library.oldest_book().map(|b| b.title.as_str()),
        Some("Lord of the Rings")
    );

    library.add_book(Book::new("Alice's Adventures in Wonderland", 1865));
    assert_eq!(
        library.oldest_book().map(|b| b.title.as_str()),
        Some("Alice's Adventures in Wonderland")
    );
}

Points and Polygons

#[derive(Debug, Copy, Clone, PartialEq, Eq)]
pub struct Point {
    x: i32,
    y: i32,
}

impl Point {
    pub fn new(x: i32, y: i32) -> Point {
        Point { x, y }
    }

    pub fn magnitude(self) -> f64 {
        f64::from(self.x.pow(2) + self.y.pow(2)).sqrt()
    }

    pub fn dist(self, other: Point) -> f64 {
        (self - other).magnitude()
    }
}

impl std::ops::Add for Point {
    type Output = Self;

    fn add(self, other: Self) -> Self::Output {
        Self {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

impl std::ops::Sub for Point {
    type Output = Self;

    fn sub(self, other: Self) -> Self::Output {
        Self {
            x: self.x - other.x,
            y: self.y - other.y,
        }
    }
}

pub struct Polygon {
    points: Vec<Point>,
}

impl Polygon {
    pub fn new() -> Polygon {
        Polygon { points: Vec::new() }
    }

    pub fn add_point(&mut self, point: Point) {
        self.points.push(point);
    }

    pub fn left_most_point(&self) -> Option<Point> {
        self.points.iter().min_by_key(|p| p.x).copied()
    }

    pub fn iter(&self) -> impl Iterator<Item = &Point> {
        self.points.iter()
    }

    pub fn length(&self) -> f64 {
        if self.points.is_empty() {
            return 0.0;
        }

        let mut result = 0.0;
        let mut last_point = self.points[0];
        for point in &self.points[1..] {
            result += last_point.dist(*point);
            last_point = *point;
        }
        result += last_point.dist(self.points[0]);
        result
        // Alternatively, Iterator::zip() lets us iterate over the points as pairs,
        // but we need to pair each point with the next one, and the last point
        // with the first point. The zip() iterator finishes as soon as one of
        // the source iterators finishes. A neat trick is to combine Iterator::cycle
        // with Iterator::skip to create the second iterator for the zip, then use
        // map and sum to calculate the total length.
    }
}

pub struct Circle {
    center: Point,
    radius: i32,
}

impl Circle {
    pub fn new(center: Point, radius: i32) -> Circle {
        Circle { center, radius }
    }

    pub fn circumference(&self) -> f64 {
        2.0 * std::f64::consts::PI * f64::from(self.radius)
    }

    pub fn dist(&self, other: &Self) -> f64 {
        self.center.dist(other.center)
    }
}

pub enum Shape {
    Polygon(Polygon),
    Circle(Circle),
}

impl From<Polygon> for Shape {
    fn from(poly: Polygon) -> Self {
        Shape::Polygon(poly)
    }
}

impl From<Circle> for Shape {
    fn from(circle: Circle) -> Self {
        Shape::Circle(circle)
    }
}

impl Shape {
    pub fn perimeter(&self) -> f64 {
        match self {
            Shape::Polygon(poly) => poly.length(),
            Shape::Circle(circle) => circle.circumference(),
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    fn round_two_digits(x: f64) -> f64 {
        (x * 100.0).round() / 100.0
    }

    #[test]
    fn test_point_magnitude() {
        let p1 = Point::new(12, 13);
        assert_eq!(round_two_digits(p1.magnitude()), 17.69);
    }

    #[test]
    fn test_point_dist() {
        let p1 = Point::new(10, 10);
        let p2 = Point::new(14, 13);
        assert_eq!(round_two_digits(p1.dist(p2)), 5.00);
    }

    #[test]
    fn test_point_add() {
        let p1 = Point::new(16, 16);
        let p2 = p1 + Point::new(-4, 3);
        assert_eq!(p2, Point::new(12, 19));
    }

    #[test]
    fn test_polygon_left_most_point() {
        let p1 = Point::new(12, 13);
        let p2 = Point::new(16, 16);

        let mut poly = Polygon::new();
        poly.add_point(p1);
        poly.add_point(p2);
        assert_eq!(poly.left_most_point(), Some(p1));
    }

    #[test]
    fn test_polygon_iter() {
        let p1 = Point::new(12, 13);
        let p2 = Point::new(16, 16);

        let mut poly = Polygon::new();
        poly.add_point(p1);
        poly.add_point(p2);

        let points = poly.iter().cloned().collect::<Vec<_>>();
        assert_eq!(points, vec![Point::new(12, 13), Point::new(16, 16)]);
    }

    #[test]
    fn test_shape_perimeters() {
        let mut poly = Polygon::new();
        poly.add_point(Point::new(12, 13));
        poly.add_point(Point::new(17, 11));
        poly.add_point(Point::new(16, 16));
        let shapes = vec![
            Shape::from(poly),
            Shape::from(Circle::new(Point::new(10, 20), 5)),
        ];
        let perimeters = shapes
            .iter()
            .map(Shape::perimeter)
            .map(round_two_digits)
            .collect::<Vec<_>>();
        assert_eq!(perimeters, vec![15.48, 31.42]);
    }
}

fn main() {}
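
The iterator-based alternative described in the comment inside Polygon::length can be sketched as a standalone function. The name closed_path_length and the reduced Point type are assumptions made to keep the example self-contained:

```rust
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
struct Point {
    x: i32,
    y: i32,
}

impl Point {
    fn dist(self, other: Point) -> f64 {
        let dx = f64::from(self.x - other.x);
        let dy = f64::from(self.y - other.y);
        dx.hypot(dy)
    }
}

// Pairs each point with its successor; cycle() wraps the last point around
// to the first, and zip() stops once the first iterator is exhausted.
fn closed_path_length(points: &[Point]) -> f64 {
    points
        .iter()
        .zip(points.iter().cycle().skip(1))
        .map(|(a, b)| a.dist(*b))
        .sum()
}

fn main() {
    let triangle = [
        Point { x: 12, y: 13 },
        Point { x: 17, y: 11 },
        Point { x: 16, y: 16 },
    ];
    println!("{:.2}", closed_path_length(&triangle));
}
```

An empty slice needs no special case here: both iterators are empty, so the sum is zero.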