Rust Handbook
Rust is a very well-thought-out programming language. It seeks to unite a variety of existing concepts from other languages, industry and academia in order to be a solid language for developing reliable and efficient solutions in a productive manner.
On the other hand, it has gained a reputation for having a steep learning curve. I'm not going to fool you: Rust has a lot of resources to help you, but first you need to understand how they work. What I can tell you is that, in general, every choice made in this language has a purpose. Once you understand the reasons, something which seemed complicated may become understandable.
If you intend to develop an application with a high degree of concurrent processing to serve a lot of HTTP requests, you'll probably need to learn about async programming (`async`/`await`). If you deal directly with hardware in embedded programming, you may need to use `unsafe` to interact with the native resources of the target device. While developing a library, it may be interesting to implement macros to create a more convenient interface for your users.
Know that you don't need to learn all of this at once, nor be a specialist in all of the language's resources. Each person can focus on those which are relevant for the tool they want to develop. Do not think that you need to absorb everything at the same time, because nobody should.
What I propose here is an understanding of the language. There are core concepts which every developer must know to be productive, no matter what kind of application they build. Those are the ones I aim to teach in this material.
Besides that, I'll present these concepts in a specific order. Each topic is carefully designed to be based on the previous ones, forming learning layers. You shouldn't try to build the second floor of a building without building the first one, right?
Any concept or detail that I consider out of the scope of this handbook will have references for high-quality content available elsewhere, in order to complement this guide.
Introduction
Start here. The idea is to prepare you for the remainder of the book.
Pre-requisites
In general, I assume that the reader knows how to program, at least at a basic level. If I were to write a guide to teach someone to program from scratch (and I intend to do it someday!) I would explain computer fundamentals and programming paradigms to them, and that would take a while to write well. If you don't know how to program and want to follow the guide anyway, remember to take it easy and search for additional information on the web when you feel the need.
Another important requisite is alignment with my proposal. My focus is on the understanding of the concepts. In software, understanding what you're doing can be the difference between solving a problem in minutes or in days. There'll be examples and exercises throughout the guide, but my focus is always conceptual.
Installation
The recommended way to install Rust is by using `rustup`, which takes care of installing and updating the language's official tools, such as `rustc`, `cargo`, and `clippy`.
If you decide to install the language on your machine, just follow the instructions on the rustup website and you'll probably be good to go. If you use Windows, I recommend taking a look at the Setup Rust for Windows page, as you may need to install additional dependencies.
In case you want to do small experiments without having to install the language, a good alternative is to use the playground, which is an online editor that allows you to compile and run simple code. It's also a good way to share code snippets with other people in case you need help on specific questions.
Hello, World
It's a tradition, when starting to learn a new programming language, to present a program which shows a `hello, world!` message on the terminal.
Here we go:
fn main() {
    println!("Hello, World!");
}
You can run this code straight from the book. Just click on the "play" button in the top-right corner of the box which contains the code.
With just this little piece of code it's possible to get in touch with a lot of elements of the language, which we'll see in detail later. Don't worry about understanding them deeply now.
- The `main` function defines the execution's entry point. That's where your application's code starts to run.
- Blocks are defined with a pair of `{` and `}`. Some languages, such as Python, use indentation. Others use keywords such as `begin` and `end`.
- `println!` is a macro. I can say that because of the `!` at the end. When your program is compiled, macros are processed for code generation, and the generated code is inserted where they've been called. Don't worry about understanding macros now, just know that we'll use some of them before a deeper explanation.
- `"Hello, World!"`, in a simplified view, can be seen as a text literal. In Rust, there are a few different types of text (e.g. `String`, `str`), each with its use case. It turns out that strings are more complicated than most programmers think, and Rust chose to be explicit about it.
- `println!("Hello, World!");` is a statement which calls a macro, passing a text as argument, and has the effect of showing content in the terminal (with a line break at the end). Statements in Rust end with `;`. There are expressions too, which are evaluated at runtime and reduced to a value. For example: `2 + 2` is reduced to the integer value `4`.
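To make the last two items a bit more concrete, here's a minimal sketch of the difference between statements and expressions, and of the two text types mentioned above. The functions `sum` and `greeting` are names I made up for illustration.

```rust
// Hypothetical helper: the body ends with an expression (no `;`),
// so its value becomes the function's return value.
fn sum() -> i32 {
    2 + 2
}

// Hypothetical helper: builds an owned `String` out of a `&str` literal.
fn greeting() -> String {
    let name: &str = "World"; // a string literal has type `&str`
    let text: String = format!("Hello, {name}!"); // `format!` produces a `String`
    text // expression: returned to the caller
}

fn main() {
    println!("{}", greeting()); // statement: ends with `;`
    println!("2 + 2 = {}", sum());
}
```

Notice that the lines ending with `;` don't produce a value to the surrounding code, while the last line of each helper does.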
I think that this information is a good start for a "hello, world!". Now we're starting on the right foot!
Tools
Next, there's a list of a few tools which are very important for the language. Each section title has a link to that tool's documentation, in case you want more specific information.
Rustup
Rust's installer. With it you can install and update many language tools. Some of them are described below.
Rustc
The language's compiler. Usually, it's not used directly, because `cargo` simplifies some development-related tasks and deals with the compiler for us.
Cargo
Rust's build tool and package manager. It allows you to:
- Create a package with a standardized structure: `cargo new <project_name>`
- Perform some development activities:
  - Check if the code compiles: `cargo check`
  - Run tests: `cargo test`
  - Compile the code: `cargo build`
  - Compile and run the code: `cargo run`
- Publish a package to be used by other people: `cargo publish`
There are many other things which can be done using `cargo`, but that's a good start.
Rustfmt
Rust's code formatter. Virtually every project uses it. This standardization reduces repetitive work while coding, eases onboarding onto new code bases, ends discussions about style... The advantages far outweigh the reduced freedom in code style choices.
To use it, just run `cargo fmt` and your whole project will be formatted.
Clippy
The language's linter, which is basically a tool that analyzes your code and offers suggestions to try to make it better. Especially in the beginning, `cargo clippy` can show you more idiomatic ways to write code. Besides that, it can also identify some common programming mistakes.
Rustdoc
Rust is known for its ecosystem having above-average documentation, and this is due to the excellent documentation generation support we have from `rustdoc`. By running `cargo doc`, the documentation for your project and its dependencies is built and made available to everyone in a standardized way, reducing the burden of this activity and improving the productivity of those who write or read documentation, which is basically everyone.
IDE Support
A modern language needs to have good support to interact with source code. Although Rust is a new language, there are very good projects to offer this support to code editors.
Rust-Analyzer
It's the Language Server which powers many text editors, like VS Code and Vim. On VS Code, you just need to install the `rust-lang.rust-analyzer` extension to have great support.
IntelliJ Rust
For IntelliJ-based IDEs, there's a plugin openly developed by JetBrains which brings Rust support to their editors. Just go to the plugins manager and install it to use Rust on IntelliJ.
Basic Elements
The elements below, although simple, are an important starting point to code. Most of these concepts are present in other languages.
Primitive Types
This section is non-exhaustive for the sake of simplicity. There are other primitive types beyond these, and you may find information about them in the primitive types section of the standard library's documentation.
Integers
- Signed: `i8`, `i16`, `i32`, `i64`, `i128` and `isize` (`i` from integer).
  - e.g.: `0`, `-5`, `1_000_000i32`
- Unsigned: `u8`, `u16`, `u32`, `u64`, `u128` and `usize`.
  - It needs to be non-negative. e.g.: `0`, `10`, `42`, `1_000_000usize`
The number in the type indicates how many bits are used by the integer. This is directly related to the range of values which can be represented by that type. For example, `u8` can only represent 2^8 = 256 values (from 0 to 255).
If the type of an integer literal is not explicitly annotated and cannot be inferred, it will be `i32` by default.
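If you want to check these numbers yourself, here's a small sketch. `distinct_values` is a hypothetical helper I wrote just for this example; the constants (`u8::MAX`, `u8::BITS`, etc.) come from the standard library.

```rust
// Hypothetical helper: how many distinct values fit in `bits` bits (2^bits).
// Only meant for bits < 128.
fn distinct_values(bits: u32) -> u128 {
    1u128 << bits
}

fn main() {
    // u8 uses 8 bits: 2^8 = 256 values, from 0 to 255.
    assert_eq!(distinct_values(u8::BITS), 256);
    assert_eq!(u8::MIN, 0);
    assert_eq!(u8::MAX, 255);

    // i8 also has 256 values, but half of the range is negative.
    assert_eq!(i8::MIN, -128);
    assert_eq!(i8::MAX, 127);

    // No annotation and nothing to infer from: the literal defaults to i32.
    let x = 1_000_000;
    assert_eq!(std::mem::size_of_val(&x), 4); // 4 bytes = 32 bits

    // A suffix picks the type explicitly.
    let y = 1_000_000usize;
    println!("x = {x}, y = {y}");
}
```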
Floats
`f32` and `f64`. Examples: `1.0`, `-15.50f32`
Floating point numbers can represent decimal numbers (with an imprecision inherent to the floating point representation in the hardware). Floats, as defined by IEEE-754, can be a bit more painful to work with than what you're used to in other languages. Rust makes explicit some footguns that could otherwise pass unnoticed.
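Here's a minimal sketch of a couple of these footguns. `approx_eq` is a hypothetical helper of mine (comparing with a tolerance is one common approach, not the only one), and the tolerance value is an arbitrary choice for the example.

```rust
// Hypothetical helper: compares two floats within a tolerance instead of
// using `==`, which is often too strict for computed values.
fn approx_eq(a: f64, b: f64, tolerance: f64) -> bool {
    (a - b).abs() <= tolerance
}

fn main() {
    let sum = 0.1 + 0.2;
    // The result is not exactly 0.3 because of binary rounding.
    assert!(sum != 0.3);
    assert!(approx_eq(sum, 0.3, 1e-9));

    // NaN is not equal to anything, not even to itself.
    let nan = f64::NAN;
    assert!(nan != nan);

    // Floats can't always be ordered, so `partial_cmp` returns an `Option`.
    assert_eq!(1.0_f64.partial_cmp(&2.0), Some(std::cmp::Ordering::Less));
    assert_eq!(nan.partial_cmp(&1.0), None);
}
```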
bool
Can be `true` or `false`.
Unit
This is the type of the empty tuple `()`, which is the only value of this type. It is used in places where there's no meaningful value defined. For example: functions which "return nothing" actually return `()`.
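A quick sketch to show it, using two hypothetical functions (`log` and `log_explicit`) that differ only in whether the unit type is written out:

```rust
// No return type written: the function returns `()`.
fn log(message: &str) {
    println!("[log] {message}");
}

// The same signature, with the unit type spelled out.
// (In real code you'd normally omit it.)
fn log_explicit(message: &str) -> () {
    println!("[log] {message}");
}

fn main() {
    let a: () = log("hello");
    let b: () = log_explicit("hello again");
    assert_eq!(a, b); // there's only one value of type `()`
}
```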
Variables
A variable stores a value and can be manipulated through its identifier. In Rust, we use `let` to define variables. `let` works with pattern matching, but for now we're not going to take advantage of it, to keep things simple.
Below there's the skeleton of declaration and binding of variables.
// Declaration
let <identifier>: <type>;
// Declaration with value binding
let <identifier>[: <type>] = <expression>;
Rust variables are immutable by default. If you try to change an immutable variable's content, the compiler will complain:
fn main() {
    // If you try to run it, you'll see that this code doesn't compile.
    let x = 5;
    x = 10;
}
As our dear compiler suggests, just add `mut` before the identifier to make the variable mutable.
fn main() {
    // It'll work this time!
    let mut x = 5;
    x = 10;
    assert_eq!(x, 10);
}
It's important to note that Rust has type inference. This means that it tries to identify undeclared types by using the types of the expressions present in the code. For example:
let is_raining = true;
It's not necessary to annotate the type. As `true` is of type `bool`, `is_raining` will also be.
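Putting the declaration skeleton and type inference together, a sketch could look like this. The `describe` function and the variable names are made up for the example.

```rust
// Hypothetical helper, used only to give the compiler something to infer from.
fn describe(raining: bool) -> &'static str {
    if raining {
        "take an umbrella"
    } else {
        "enjoy the sun"
    }
}

fn main() {
    // Inferred from the literal: bool.
    let is_raining = true;

    // Declaration first, binding later. The type is annotated here,
    // but it could also be inferred from the assignment below.
    let advice: &str;
    advice = describe(is_raining);

    // Annotated and bound at once.
    let temperature: f32 = 21.5;

    println!("{advice} ({temperature} degrees)");
}
```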
Operators
A complete list can be found on this Rust Book appendix.
Arithmetic
`+`, `-`, `/`, `*`, `%`
In case you don't know, `%` is the remainder of dividing the first number by the second one.
These operators can be used together with assignment. For example:
fn main() {
    let mut x = 2 + 2;
    x += 1;
    assert_eq!(x, 5);
}
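Division and remainder usually walk together, so here's a small sketch of how they behave with integers and floats. `split_seconds` is a hypothetical helper for illustration.

```rust
// Hypothetical helper: splits a number of seconds into (minutes, seconds).
fn split_seconds(total: u32) -> (u32, u32) {
    (total / 60, total % 60)
}

fn main() {
    // Integer division truncates; `%` gives what is left over.
    assert_eq!(7 / 2, 3);
    assert_eq!(7 % 2, 1);

    // With floats, `/` keeps the fractional part.
    assert_eq!(7.0 / 2.0, 3.5);

    // The remainder takes the sign of the first operand.
    assert_eq!(-7 % 2, -1);

    assert_eq!(split_seconds(125), (2, 5));
}
```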
Logic
- Negation: `!`
- Logic AND: `&&`
- Logic OR: `||`
fn main() {
    let x = true && !false;
    assert_eq!(x, true);
}
Relational
`<`, `<=`, `==`, `!=`, `>=`, `>`
fn main() {
    let x = 5 <= 10;
    assert_eq!(x, true);
    let y = 3 < 1;
    assert_eq!(y, false);
}
Ownership and Borrowing
If you ask me: "what is Rust's most innovative feature?", my answer would be ownership. This concept allows the language to be memory-safe without a garbage collector. In practice, you have the performance of a systems language without giving up reliability.
If you come from languages other than C or C++, it means that you now have a systems language that feels like a high-level one. Anyone can write efficient code without the fear of accidentally corrupting memory.
If by chance you know C or C++, you've probably needed to debug a series of errors caused by manual memory management. You'll be happy to know that `rustc` simply rejects programs which can cause these errors, with helpful, instructive messages explaining what could go wrong.
The Concept
The ownership rules are pretty simple:
- Every value has a variable which is its owner.
- The value's ownership can be moved.
- When the owner goes out of the scope, the value is released.
fn main() {
    let x = String::from("hello"); // x is the owner of the string
    // ...
    println!("Initially, x is owner of: {x}");
    // ...
    let y = x; // The string was moved to `y`. `x` stopped being the owner.
    // From this point, trying to access the string through `x` is an invalid operation, since the string was moved.
    // ...
    // println!("x is owner of: {x}"); // invalid!
    // ...
    println!("Now, y is owner of: {y}");
    // ...
    // At the end of this function, `y` goes out of scope and the string will be released.
}
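The same three rules also apply to scopes smaller than a function and to function calls. Here's a sketch of that, with two hypothetical helpers (`consume` and `add_exclamation`) that I wrote only to show values moving in and out of functions.

```rust
// Hypothetical helper: takes ownership of the string and returns its length.
// `text` is released when this function returns.
fn consume(text: String) -> usize {
    text.len()
}

// Hypothetical helper: takes ownership and gives it back to the caller.
fn add_exclamation(mut text: String) -> String {
    text.push('!');
    text
}

fn main() {
    let outer = String::from("outer");
    {
        let inner = String::from("inner");
        println!("{outer} and {inner} are both alive here");
    } // `inner` goes out of scope: its string is released here.

    // `outer` is moved into the function, and the result is moved out to `moved_back`.
    let moved_back = add_exclamation(outer);
    println!("{moved_back}");

    // `moved_back` is moved into `consume` and released inside of it.
    let length = consume(moved_back);
    println!("length: {length}");
    // println!("{moved_back}"); // invalid! The value has been moved.
}
```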
Besides ownership, we have borrowing:
- The value's owner can lend it through shared references (immutable) or unique references (mutable).
- At any moment, there can only be one unique reference or any number of shared references.
- References are always valid. They can't be used after the value they reference is moved or released.
That's it. By following these rules, we are free from any trouble involving allocation and release of resources. And the best part: the compiler automatically checks if we're following them.
Illustrating with an example:
struct Book {
    title: String,
}

fn shared(book: &Book) {
    println!("[shared] The book's title is: {}\n", book.title);
}

fn unique(book: &mut Book) {
    println!("[unique] The book's title was: {}", book.title);
    book.title.push_str(" part 2");
    println!("[unique] Now the title is: {}\n", book.title);
}

fn be_owner(book: Book) {
    println!("[be_owner] I'm the owner of the book: {}\n", book.title);
}

fn main() {
    let mut first_owner = Book {
        title: String::from("How to make friends and influence people"), // recommended
    };

    // Here, we are lending a shared borrow of the book to the variable `first_borrow`
    let first_borrow = &first_owner;

    // We create a second shared reference to pass to the function `shared`
    shared(&first_owner);

    // It's okay to lend a mutable borrow...
    unique(&mut first_owner);

    // But you cannot access `first_borrow` anymore!
    // println!("title: {}", first_borrow.title);

    // There's no problem in creating a new shared reference, though.
    let second_borrow = &first_owner;

    // The idea is that if you have a shared reference, you cannot access it after the value
    // could've been changed (e.g. by a mutable reference or when the value is moved to another owner)

    // Everything okay!
    shared(second_borrow);

    // Ownership moved to another owner.
    let second_owner = first_owner;

    // All the preceding references stopped being valid because of the move.
    // The old owner stopped being accessible too!
    // shared(second_borrow);
    // unique(&mut first_owner);

    println!("[main] second owner: {}\n", second_owner.title);

    be_owner(second_owner);

    // The book is not accessible here, since its ownership was passed to the `be_owner` function.
    // At this point, the value has been released. `second_owner` is not accessible either.
    // let invalid_borrow = &second_owner;
}
Feel free to play with this example. Notice the error messages. They are very descriptive and were made to help you get a better understanding of why a given piece of code is invalid.
All you have seen here affects the whole language. With time, these concepts become natural and you'll start to agree with the borrow checker (the part of `rustc` which checks if you're following the rules) instead of fighting it. I even miss these concepts when I go back to other languages, because with them it's usually easier to understand what's happening in the source code.
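If the `Book` example feels like too much at once, here's a smaller sketch focused only on the "one unique or many shared" rule. The functions `length` and `shout` are hypothetical names for this example.

```rust
// Shared reference: read-only access.
// (A `&String` is accepted where a `&str` is expected.)
fn length(text: &str) -> usize {
    text.len()
}

// Unique reference: allowed to change the value.
fn shout(text: &mut String) {
    text.push('!');
}

fn main() {
    let mut message = String::from("hello");

    // Any number of shared references may coexist.
    let first = &message;
    let second = &message;
    assert_eq!(length(first) + length(second), 10);

    // `first` and `second` are not used from here on,
    // so a unique reference can be created now.
    let unique = &mut message;
    shout(unique);

    // Using `first` here would be rejected by the compiler:
    // println!("{first}");

    assert_eq!(message, "hello!");
}
```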
Maybe some people will notice that certain things which cause errors in the `Book` example are possible with integers and floats. The reason is that these types are cheap to copy. So, instead of being moved, the value is copied. It happens because they implement the trait `Copy`. We'll talk about traits ahead, but they're similar to interfaces in other languages.
Besides that, I omitted the concept of lifetimes on purpose. They are necessary for the compiler to guarantee that the references are always valid. However, it seemed more appropriate to get into this concept after explaining generic code, as a lifetime is a sort of generalization over how long a reference is valid.
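A short sketch of the difference, assuming two hypothetical functions (`double` and `shout`) that take their arguments by value:

```rust
// Hypothetical helper: `i32` implements `Copy`, so the argument is copied.
fn double(n: i32) -> i32 {
    n * 2
}

// Hypothetical helper: `String` is not `Copy`, so the argument is moved.
fn shout(text: String) -> String {
    text.to_uppercase()
}

fn main() {
    let number = 21;
    let doubled = double(number); // `number` is copied into the function
    assert_eq!(number + doubled, 63); // `number` is still usable

    let text = String::from("ferris");
    let loud = shout(text.clone()); // an explicit copy is moved; `text` stays with us
    assert_eq!(text, "ferris");
    assert_eq!(loud, "FERRIS");

    let moved = shout(text); // now `text` itself is moved
    // println!("{text}"); // invalid! The value has been moved.
    assert_eq!(moved, "FERRIS");
}
```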
What `rustc` protects you from
The following examples are written in C, because it illustrates well how you could cause undefined behavior (UB) in your program, which is basically an elegant way of saying that, on any given execution, your application may:
- Abort execution
- Run normally
- Run, but process invalid values
When you have UB, the best that can happen is for the execution to be aborted as soon as the corruption occurs. This way, you can try to identify where the improper memory use occurred. Nonetheless, it may work on your machine, but not in production. Yeah, very fun.
Personally, I only understood all the memory problems below after learning Rust. Before that, I just stumbled on them without fully understanding the kind of error that was happening in my program. While I was learning C, no reference I used taught me about those classes of errors in a clear way.
use after free
It consists of using memory which you have already released to the system. When a memory region is no longer yours, there's no guarantee that you're accessing valid information.
#include <stdlib.h>

int main(void) {
    int *p = malloc(sizeof(int));
    // ...
    free(p);
    *p = 123; // Writing to a released memory region! UB!
}
In the best case, you'll get a segmentation fault. In the worst, the content of `*p` will be overwritten and you'll process garbage, making the execution fail in another part of the code, or even worse: your program may generate an invalid output. I lost two nights of sleep on this kind of error once.
In Rust, it doesn't even compile:
fn main() {
    let x = String::from("ferris!");
    drop(x); // The string's ownership has been moved to `drop`
    // ...
    println!("{x}");
}
double free
The same memory address is released twice.
#include <stdlib.h>

int main(void) {
    void *p = malloc(500);
    // ...
    free(p);
    // ...
    free(p); // Same address released more than once! UB!
}
It looks improbable to occur, but it can be easy to introduce on a refactor.
In Rust, it doesn't even compile:
fn main() {
    let x = String::from("ferris!");
    drop(x); // The string's ownership has been moved to `drop`
    // ...
    drop(x); // It isn't possible to pass to `drop` (or to any other fn) a value which has been moved.
}
Dereferencing a null pointer
The billion-dollar mistake. Even high-level languages with garbage collection allow this memory error.
#include <stdlib.h>

int main(void) {
    int *p = malloc(500); // this function may return NULL in case memory hasn't been allocated successfully
    // ...
    *p = 123; // Possibly following a NULL pointer! UB!
}
The good side of this error is that it normally aborts the execution instead of processing invalid data. The bad side... This is an error class which is totally avoidable, even without ownership. The sooner errors are fixed, the lower the costs. If your tools stop certain kinds of defects from being introduced, that's even better.
In Rust... There's no null reference. We use `Option<T>` to represent that a value of type `T` can be present (`Some(value)`) or not (`None`).
fn main() {
    let x: Option<String> = Some(String::from("ferris!"));
    // ...
    // I can't access the string directly without first checking if `x`, which is an `Option<String>`, is `Some(...)` or `None`.
    match x {
        Some(value) => println!("I found this: {value}"),
        None => println!("Nothing here!"),
    }
}
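`match` isn't the only way to deal with an `Option`. As a sketch, here are a few other tools from the standard library (`if let`, `unwrap_or` and `map`). The function `first_word` is a hypothetical lookup I made up to have something that may or may not return a value.

```rust
// Hypothetical lookup: returns the first word of a text, if there is one.
fn first_word(text: &str) -> Option<&str> {
    text.split_whitespace().next()
}

fn main() {
    // `if let` handles only the `Some` case.
    if let Some(word) = first_word("hello ferris") {
        println!("first word: {word}");
    }

    // `unwrap_or` supplies a fallback for `None`.
    let word = first_word("   ").unwrap_or("<empty>");
    assert_eq!(word, "<empty>");

    // `map` transforms the value only when it is present.
    let length = first_word("hello ferris").map(|w| w.len());
    assert_eq!(length, Some(5));
}
```

In all of them, the absence of a value has to be handled somehow before the value can be used.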
Memory leak
Technically, leaking memory isn't considered UB. However, it consumes system resources unnecessarily.
#include <stdlib.h>

int main(void) {
    void *p = malloc(500000000);
    // Using p...
    // ...
    // I don't need `p` anymore, but I forgot to `free` it...
    // ...
}
In Rust, memory will be released automatically once the owner goes out of scope. Even though leaking memory is something safe to do in Rust (i.e. someone could leak memory explicitly, if they want or need to, without using `unsafe`), most of the time it'll be released automatically due to the ownership concept.
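Just to show that leaking really is possible in safe code, here's a sketch using two standard library functions made for that: `Box::leak` and `std::mem::forget`. The `leak_number` helper is a hypothetical name of mine.

```rust
// Hypothetical helper: leaks a heap allocation on purpose and returns a
// reference that stays valid for the rest of the program.
fn leak_number(n: u32) -> &'static mut u32 {
    Box::leak(Box::new(n))
}

fn main() {
    let leaked = leak_number(42);
    *leaked += 1;
    assert_eq!(*leaked, 43);

    // `std::mem::forget` takes ownership and never runs the destructor,
    // so the string's buffer is never released.
    let text = String::from("never freed");
    std::mem::forget(text);
}
```

Notice that you have to ask for the leak explicitly. It doesn't happen just because you forgot something.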
Accessing an invalid memory region
Reading or writing information beyond the allocated memory. The memory you accessed isn't yours, so anything can happen.
#include <stdlib.h>

int main(void) {
    int *p = malloc(10 * sizeof(int));
    p[10] = 42; // I wrote on the 11th position! UB!
}
Reasonably easy to happen when using arrays.
In Rust, indexing is bounds-checked: the program panics instead of touching memory it doesn't own, and the message indicates which code tried to perform the invalid access:
fn main() {
    let x = vec![1, 2, 3];
    // ...
    let v = x[100_000]; // For sure our `vec` doesn't have all that size
    println!("An integer from x: {v}");
}
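If aborting isn't what you want, the standard library also offers `get`, which returns an `Option` instead of panicking. A minimal sketch, where `get_or` is a hypothetical helper:

```rust
// Hypothetical helper: reads an index, falling back to a default
// when the index is out of bounds.
fn get_or(values: &[i32], index: usize, default: i32) -> i32 {
    values.get(index).copied().unwrap_or(default)
}

fn main() {
    let x = vec![1, 2, 3];

    // `get` never panics: it returns `None` for an invalid index.
    assert_eq!(x.get(1), Some(&2));
    assert_eq!(x.get(100_000), None);

    assert_eq!(get_or(&x, 100_000, 0), 0);
}
```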
Manipulate uninitialized memory
Uninitialized memory may contain anything. In the best case, the memory will be zeroed. In the worst, you'll read whatever was written there before.
int main(void) {
    int *p;
    // ...
    int x = *p; // Dereferencing `p` without initializing it first! UB!
}
In Rust, it doesn't even compile:
fn main() {
    let x: String; // Oops, we forgot to initialize...
    println!("{x}");
}
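Declaring first and initializing later is fine, as long as the compiler can see that the variable receives a value on every path before it's read. A sketch with a hypothetical `describe` function:

```rust
// Hypothetical helper: `label` is declared without a value and
// initialized later, on both branches.
fn describe(temperature: i32) -> String {
    let label: &str; // declared, not initialized yet
    if temperature < 15 {
        label = "cold";
    } else {
        label = "warm";
    }
    // Every path assigned `label`, so reading it is allowed.
    format!("{temperature} degrees is {label}")
}

fn main() {
    println!("{}", describe(10));
    println!("{}", describe(20));
}
```

If you remove one of the assignments, the compiler rejects the code again.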
Conclusion
There are probably other memory problems that I didn't cover. Fortunately, you'll hardly ever find them. Yes, it's possible to have UB in Rust if you (or one of your dependencies) use `unsafe` incorrectly, but it's very likely that you won't need this language resource to develop your software.
Don't worry about memorizing the rules. What is important is to understand what each of them means. It's the compiler's job to ensure that they are being followed.