Overview
Ownership is one of Rust’s defining features: it is the mechanism behind the language’s memory-safety guarantee.
What is Ownership?
It’s the set of rules that defines how Rust manages a program’s memory. Instead of relying on a garbage collector, Rust has the compiler enforce the Ownership model. When these rules are violated the compiler refuses to compile the program, so every executable it produces is memory-safe.
Stack and Heap
We can’t talk about the Ownership model without addressing the Stack and the Heap, because where a value lives affects how the language treats it.
To read about stacks and heaps check out Stacks and Heaps.
- All data stored on the stack must have a fixed size at compile time.
- Data whose size is unknown at compile time, or may change, is stored on the heap instead. To store data on the heap, the program asks the allocator for a block of the required size and gets back a pointer. The pointer itself is stored on the stack, since pointers have a known, fixed size.
“Think of being seated at a restaurant. When you enter, you state the number of people in your group, and the host finds an empty table that fits everyone and leads you there. If someone in your group comes late, they can ask where you’ve been seated to find you.”
Pushing to the stack is faster because there’s no searching for space in memory; new data always goes on top of the stack. Allocating on the heap is more laborious: the allocator has to find a big enough free block and then do bookkeeping to prepare for the next allocation. Accessing heap data is also slower, as the pointer needs to be followed to where the data is stored. Processors can do their job faster when the data they work on is close together, i.e. on the Stack.
What happens during a function call?
- The values are passed into the function, including any Heap pointers.
- The function’s variables are pushed to the Stack.
- When it finishes the variables are popped from the Stack. // BTW, this is how it goes for many languages, we're just building context here.
The Rust Ownership model keeps track of which parts of the code are using which data on the Heap, minimizes duplicate data on the Heap, and cleans up Heap data that is no longer used.
Ownership Rules
The rules are as follows:
- Each value in Rust has an owner.
- There can ONLY be one owner at a time.
- When the owner goes out of scope, the value will be dropped.
Variable Scope
Variables are valid only within their scope. This is nothing new; most languages work this way. A variable is valid from the point it’s declared until it goes out of scope.
fn main(){ // s is not valid here, it’s not yet declared
let s = "hello"; // s is valid from this point forward
// do stuff with s
} // this scope is now over, and s is no longer valid
The String type
Unlike the types introduced in Learning Rust - Ch.3, the contents of a String don’t have a size known at compile time, so they’re stored on the Heap, which makes String an ideal candidate for studying the RustOwnershipModel.
Don’t confuse the String type with string literals (&str), which are hard-coded like the variable s above.
A key difference is that string literals are immutable. Additionally, not every string value in our program is known at compile time…
The String type is allocated on the heap and can store an amount of text that is unknown at compile time.
Create one using the from function like so:
let name = String::from("Geralt");
Unlike a literal, this can be mutated.
let mut s = String::from("hello");
s.push_str(", world!"); // push_str() appends a literal to a String
println!("{s}"); // This will print `hello, world!`
Memory and Allocation
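String literals are fast and efficient because their contents are known at compile time and hard-coded into the executable, which is only possible because they’re immutable. That approach can’t work for text whose size is unknown at compile time or may change while the program runs.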
With the String type, in order to support a mutable, growable piece of text, we need to allocate an amount of memory on the heap, unknown at compile time, to hold the contents. This means:
- The memory must be requested from the memory allocator at runtime.
- We need a way of returning this memory to the allocator when we’re done with our String.
The allocation request is made by the programmer when creating a String using String::from. // no surprises here
Clearing and returning the memory is the tricky part: each allocate must be paired with exactly one free. Forgetting to free leaks memory, freeing too early leaves an invalid variable behind, and freeing twice is a bug as well.
Rust’s approach
Rust's sauce
When a variable that owns some data in memory goes out of scope, Rust cleans it up and returns the memory automatically.
There’s a “natural” point at which memory is no longer needed and can be returned: when the variable that owns it goes out of scope. At that point Rust calls the drop function and returns the memory.
It calls the drop function automatically at the closing curly brace of the owner’s scope.
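To watch drop being called at the closing braces, here is a minimal sketch. The Noisy type and the drop_order function are hypothetical names made up for this note, and the RefCell is only there so the drops can be recorded and inspected; nothing about it is required by the Ownership model.

```rust
use std::cell::RefCell;

// Hypothetical type for illustration: records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the order in which the two values were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Noisy { name: "outer", log: &log };
        {
            let _inner = Noisy { name: "inner", log: &log };
        } // `_inner` goes out of scope here, so it is dropped first
    } // `_outer` goes out of scope here
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order()); // ["inner", "outer"]
}
```

Each value is dropped at the closing brace of the scope that owns it, which is why the inner one shows up first.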
Interacting with Move
It’s possible for multiple variables to interact with the same data in memory. For example:
let x = 9;
let y = x;
The example binds 9 to x, then makes a copy of the value in x and binds that copy to y. Both values are pushed onto the Stack, because integers have a known, fixed size.
However, the following example is different.
let s1 = String::from("hello");
let s2 = s1;
A String is made of 3 pieces of information: its ptr, length, and capacity. These pieces of information are stored on the Stack, but the contents, the actual data the ptr is pointing at, are on the Heap.
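The three pieces can be inspected directly. This is a small sketch; string_parts is a hypothetical helper written for this note, while as_ptr, len, and capacity are the standard String methods.

```rust
// Hypothetical helper: returns the three pieces that describe a String.
fn string_parts(s: &String) -> (*const u8, usize, usize) {
    (s.as_ptr(), s.len(), s.capacity())
}

fn main() {
    let s = String::from("hello");
    let (ptr, len, capacity) = string_parts(&s);

    // `ptr` is the address of the heap buffer; `len` is the number of bytes in use;
    // `capacity` is the number of bytes the allocator handed out (at least `len`).
    println!("ptr = {ptr:p}, len = {len}, capacity = {capacity}");

    // Size of the String value itself, i.e. the part that lives on the stack.
    println!("stack size = {} bytes", std::mem::size_of::<String>());
}
```

On current implementations the last line prints the size of three machine words (24 bytes on a 64-bit target), but that is an implementation detail, so treat it as an observation rather than a guarantee.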
When we assign s1 to s2, the String data is copied, meaning we copy the pointer, the length, and the capacity that are on the stack. We do not copy the data on the heap that the pointer refers to. In other words, the data representation in memory looks like Figure 4-2.
An issue arises from this situation: if both s1 and s2 were still valid when they go out of scope, the drop function would be called on both, which would lead to a double-free error.
Double Frees
Freeing memory twice can lead to memory corruption, which can potentially lead to security vulnerabilities.
Rust circumvents this by considering s1 no longer valid after the line let s2 = s1. Nothing needs to be freed when s1 goes out of scope; only s2 frees the memory.
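Wonderfully demonstrated by this snippet, which the compiler rejects:
let s1 = String::from("hello");
let s2 = s1;
println!("{s1}, world!"); // compile error: s1 was moved into s2 and is no longer valid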
If you’ve heard the terms shallow copy and deep copy while working with other languages, the concept of copying the pointer, length, and capacity without copying the data probably sounds like making a shallow copy. But because Rust also invalidates the first variable, instead of being called a shallow copy, it’s known as a move. In this example, we would say that s1 was moved into s2. So, what actually happens is shown in Figure 4-4.
This implies that Rust never deep copies data automatically, so any automatic copying can be assumed to be inexpensive performance-wise.
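One way to convince yourself that a move doesn’t touch the Heap data is to compare the buffer address before and after. A minimal sketch; heap_address_survives_move is a hypothetical name used only here.

```rust
// Hypothetical helper: checks that a move leaves the heap buffer where it was.
fn heap_address_survives_move() -> bool {
    let s1 = String::from("hello");
    let before = s1.as_ptr(); // address of the heap buffer owned by s1

    let s2 = s1; // move: only ptr, length, and capacity are copied to s2
    let after = s2.as_ptr(); // same heap buffer, now owned by s2

    before == after
}

fn main() {
    println!(
        "same heap buffer after the move: {}",
        heap_address_survives_move()
    );
}
```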
Scope and Assignment
When assigning a totally new value to an existing variable, Rust will call drop on the original value and free its memory immediately.
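A sketch of that timing, using a hypothetical Noisy type that logs when it is dropped (the type, the log, and reassignment_events are all made up for this note; the RefCell is only there to collect the events).

```rust
use std::cell::RefCell;

// Hypothetical type for illustration: logs a line when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Returns the events in the order they happened.
fn reassignment_events() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let mut value = Noisy { name: "first", log: &log };
        log.borrow_mut().push(format!("holding {}", value.name));

        value = Noisy { name: "second", log: &log }; // "first" is dropped right here
        log.borrow_mut().push(format!("holding {}", value.name));
    } // "second" is dropped here, at the end of the scope
    log.into_inner()
}

fn main() {
    for event in reassignment_events() {
        println!("{event}");
    }
}
```

The "drop first" event lands between the two "holding" events, so the old value is released at the assignment, not at the end of the scope.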
Interacting with Clone
Should we want to deep copy the heap data of a String, not just the Stack stuff, we use the common method clone.
Example:
let s1 = String::from("hello");
let s2 = s1.clone();
println!("s1 = {s1}, s2 = {s2}");
This does exactly what it appears to do: it clones the Heap data into another block of memory, and the Stack data describing that new block (ptr, length, and capacity) is assigned to s2.
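To see that the clone really is independent, the sketch below compares the two buffer addresses and then mutates only the clone. clone_is_independent is a hypothetical helper name.

```rust
// Hypothetical helper: returns both strings and whether they use separate heap buffers.
fn clone_is_independent() -> (String, String, bool) {
    let s1 = String::from("hello");
    let mut s2 = s1.clone(); // deep copy: a second heap buffer is allocated

    let separate_buffers = s1.as_ptr() != s2.as_ptr();
    s2.push_str(", world"); // only the clone changes

    (s1, s2, separate_buffers)
}

fn main() {
    let (s1, s2, separate_buffers) = clone_is_independent();
    println!("s1 = {s1}, s2 = {s2}, separate buffers: {separate_buffers}");
}
```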
Stack-Only Data copying
Remember the integer example in Interacting with Move? It doesn’t contradict what we just discussed, because integers have a known size at compile time and are therefore stored entirely on the Stack. So there’s no reason for the first var to stop being valid after the second is made.
With fixed-size types there’s really no difference between a shallow and a deep copy. Because:
Rust has a special annotation called the Copy trait that we can place on types that are stored on the stack, as integers are (we’ll talk more about traits in Chapter 10). If a type implements the Copy trait, variables that use it do not move, but rather are trivially copied, making them still valid after assignment to another variable.
The Copy trait can’t be implemented for a type if it, or any of its parts, implements the Drop trait, because such a type needs special cleanup when it goes out of scope (typically because it owns data on the Heap).
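As a sketch of placing the annotation on our own type: Point below is a hypothetical struct invented for this note. Copy requires Clone, so both are derived.

```rust
// Hypothetical type for illustration: every field is Copy, so the struct can be Copy too.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let a = Point { x: 1, y: 2 };
    let b = a; // copied, not moved

    // `a` is still valid after the assignment.
    println!("a = {a:?}, b = {b:?}");
    println!("sum of a's fields: {}", a.x + a.y);

    // Adding a String field to Point, or writing `impl Drop for Point`,
    // would make the compiler reject the `Copy` derive.
}
```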
Types implementing Copy Trait
As a rule, any group of simple scalar values can implement Copy, and nothing that requires allocation or is some form of resource can.
Common types that implement Copy:
- All the integer types, such as u32.
- The Boolean type, bool, with values true and false.
- All the floating-point types, such as f64.
- The character type, char.
- Tuples, if they only contain types that also implement Copy. For example, (i32, i32) implements Copy, but (i32, String) does not.
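The list above can be checked by the compiler itself. In this sketch, assert_copy is a hypothetical helper that does nothing at runtime; it simply refuses to compile for a type that isn’t Copy.

```rust
// Hypothetical helper: only compiles when `T` implements `Copy`.
fn assert_copy<T: Copy>() {}

fn main() {
    assert_copy::<u32>();
    assert_copy::<bool>();
    assert_copy::<f64>();
    assert_copy::<char>();
    assert_copy::<(i32, i32)>();

    // Uncommenting either line is a compile error, because String is not Copy:
    // assert_copy::<String>();
    // assert_copy::<(i32, String)>();

    let pair = (1, 2);
    let other = pair; // copied, so `pair` stays valid
    println!("{pair:?} {other:?}");
}
```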
Ownership and Functions
This is the juicy part, and the one you’ll likely fight the compiler about the most for the first couple of weeks. Passing a variable to a function is similar to assigning it to another variable: it will either move or copy the data.
To demonstrate how different data types (implementing the Copy trait or not) are affected:
fn main() {
let s = String::from("hello"); // s comes into scope
takes_ownership(s); // s's value moves into the function...
// ... and so is no longer valid here
let x = 5; // x comes into scope
makes_copy(x); // because i32 implements the Copy trait,
// x does NOT move into the function,
println!("{}", x); // so it's okay to use x afterward
} // Here, x goes out of scope, then s. But because s's value was moved, nothing
// special happens.
fn takes_ownership(some_string: String) { // some_string comes into scope
println!("{some_string}");
} // Here, some_string goes out of scope and `drop` is called. The backing
// memory is freed.
fn makes_copy(some_integer: i32) { // some_integer comes into scope
println!("{some_integer}");
} // Here, some_integer goes out of scope. Nothing special happens.
Return values and Scope
Returning values also transfers ownership. Consider the following example:
fn main() {
let s1 = gives_ownership(); // gives_ownership moves its return
// value into s1
let s2 = String::from("hello"); // s2 comes into scope
let s3 = takes_and_gives_back(s2); // s2 is moved into
// takes_and_gives_back, which also
// moves its return value into s3
} // Here, s3 goes out of scope and is dropped. s2 was moved, so nothing
// happens. s1 goes out of scope and is dropped.
fn gives_ownership() -> String { // gives_ownership will move its
// return value into the function
// that calls it
let some_string = String::from("yours"); // some_string comes into scope
some_string // some_string is returned and
// moves out to the calling
// function
}
// This function takes a String and returns a String.
fn takes_and_gives_back(a_string: String) -> String {
// a_string comes into
// scope
a_string // a_string is returned and moves out to the calling function
}
Assigning a value to another var moves it, and when a variable that owns Heap data goes out of scope that data is dropped, unless ownership has been moved to another variable… While this works, taking ownership and then handing it back from EVERY function is too much; it’s tedious and ugly. So how do we let a function use a value without taking ownership? Do I have to return that initial value? No.
You could be a bore and return a tuple (Rust does allow that) but that’s also gross. See for yourself:
fn main() {
let s1 = String::from("hello");
let (s2, len) = calculate_length(s1);
println!("The length of '{s2}' is {len}.");
}
fn calculate_length(s: String) -> (String, usize) {
let length = s.len(); // len() returns the length of a String
(s, length)
}
That’s where References come in.
References and Borrowing
THE BREAD AND BUTTER OF RustOwnershipModel! We can reference values without moving them when passing them to functions.
References
References are like pointers: they’re addresses that can be followed to where the data is stored in order to access it, while that data is still owned by another variable. Unlike a pointer, a reference is guaranteed to point to a valid value of a particular type for the life of that reference.
fn main() {
let s1 = String::from("hello");
let len = calculate_length(&s1);
println!("The length of '{s1}' is {len}.");
}
fn calculate_length(s: &String) -> usize {
s.len()
}
Notice the & before the argument in the function call AND in the parameter within the function’s signature. These ampersands (&) indicate a reference. They allow borrowing a value instead of taking ownership.
Note: The opposite of referencing by using & is dereferencing, which is accomplished with the dereference operator, *. We’ll see some uses of the dereference operator in Chapter 8 and discuss details of dereferencing in Chapter 15.
When the var s goes out of scope at the end of the function calculate_length, the value it references isn’t dropped, because s never had ownership anyway.
This act of referencing is called Borrowing.
A Reference's scope is from the moment it's borrowed to the last time it's used.
Mutability
Just like how variables are immutable by default, borrowed values (references) are also immutable. Unless…
Mutable References
To make a reference mutable, i.e. to be able to change the borrowed value after borrowing it, we need to add the mut keyword in three places:
- Where the value is initially declared.
- When it’s referenced (where it’s called or passed to a function).
- In the function signature.
Example:
fn main() {
let mut s = String::from("hello");
change(&mut s);
}
fn change(some_string: &mut String) {
some_string.push_str(", world");
}
Single borrow mutable references
A value can only have one mutable reference at a time, meaning no two mutable references to the same value can be live at once. This avoids the mess of data races, which happen when all three of these occur:
- Two or more pointers access the same data at the same time.
- At least one of the pointers is being used to write to the data.
- There’s no mechanism being used to synchronize access to the data.
Pretty obvious why that’s the case... However, we can use curly brackets to create a new scope, allowing for multiple mutable references, just not simultaneous ones.
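A minimal sketch of the curly-bracket trick; two_mutable_borrows_in_turn is a hypothetical function name used only for this note.

```rust
// Hypothetical helper: takes two mutable references to the same String, one after the other.
fn two_mutable_borrows_in_turn() -> String {
    let mut s = String::from("hello");

    {
        let r1 = &mut s;
        r1.push_str(", world");
    } // r1 goes out of scope here, so a new mutable reference can be made

    let r2 = &mut s;
    r2.push('!');

    s
}

fn main() {
    println!("{}", two_mutable_borrows_in_turn()); // hello, world!
}
```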
Mutable and Immutable references at the same time.
We can’t have a mutable reference while there’s an immutable one to the same value. This protects from sudden and unexpected behavior.
This example causes an error:
let mut s = String::from("hello");
let r1 = &s; // no problem
let r2 = &s; // no problem
let r3 = &mut s; // BIG PROBLEM
println!("{}, {}, and {}", r1, r2, r3);
However, multiple immutable references are allowed, because they’re read-only…
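Because a reference’s scope ends at its last use, the failing example can be made to compile by finishing with the immutable references before creating the mutable one. A sketch, with borrow_then_mutate as a hypothetical function name:

```rust
// Hypothetical helper: immutable borrows first, then a mutable borrow once they are done.
fn borrow_then_mutate() -> String {
    let mut s = String::from("hello");

    let r1 = &s; // no problem
    let r2 = &s; // no problem
    println!("{r1} and {r2}"); // last use of r1 and r2, so their borrows end here

    let r3 = &mut s; // no problem now: no immutable reference is still in use
    r3.push_str(", world");

    s
}

fn main() {
    println!("{}", borrow_then_mutate()); // hello, world
}
```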
Dangling References
The compiler guarantees that references never dangle: it ensures the data will not go out of scope before the reference to it does. This is unlike languages with raw pointers, where it’s easy to end up with a dangling pointer.
Dangling Pointer
A DanglingPointer is a pointer to a location in memory that may already have been freed and given to another owner/variable.
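A sketch of what the compiler stops and the usual way around it. The function names dangle and no_dangle are illustrative; the rejected version is left commented out so the block still compiles.

```rust
// Does not compile: `s` is dropped at the end of the function, so the
// returned reference would point at freed memory.
//
// fn dangle() -> &String {
//     let s = String::from("hello");
//     &s
// }

// Returning the String itself moves ownership out to the caller instead,
// so nothing is dropped and nothing dangles.
fn no_dangle() -> String {
    let s = String::from("hello");
    s
}

fn main() {
    let s = no_dangle();
    println!("{s}");
}
```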
Rules of References
To summarize the rules:
- At any given time, you can have either one mutable reference or any number of immutable references.
- References must always be valid.
The Slice type
A Slice references a contiguous sequence of elements in a collection instead of the entire thing. A slice is a kind of reference, so it has no ownership.
String Slices
It’s a reference to a part (a contiguous subsequence) of a string.
The syntax is as follows: let slice = &my_string[start..end];
// Note: ranges are upper-bound exclusive, so end is one past the last index in the slice.
For example:
let s = String::from("hello world");
let hello = &s[0..5];
let world = &s[6..11];
Under the hood a slice stores a pointer to its starting position and a length, where the length is end - start.
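Those two pieces can be observed from safe code. In this sketch slice_parts is a hypothetical helper; it recovers the slice’s starting offset by comparing addresses, which is only meant as a demonstration.

```rust
// Hypothetical helper: returns (offset of the slice's start within `s`, slice length).
fn slice_parts(s: &str, start: usize, end: usize) -> (usize, usize) {
    let slice = &s[start..end];
    let offset = slice.as_ptr() as usize - s.as_ptr() as usize;
    (offset, slice.len())
}

fn main() {
    let s = String::from("hello world");
    let (offset, len) = slice_parts(&s, 6, 11);
    println!("offset = {offset}, len = {len}"); // offset = 6, len = 5
}
```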
The .. is Rust’s range syntax. If the range starts at index 0, the start can be dropped; likewise, the end can be dropped if the range runs to the end of the string.
let s = String::from("hello");
let slice = &s[0..2];
let slice = &s[..2];
let len = s.len();
let slice = &s[3..len];
let slice = &s[3..];
// or even
let slice = &s[0..len];
let slice = &s[..]; // the entire string as a slice
Multibyte characters
“Note: String slice range indices must occur at valid UTF-8 character boundaries. If you attempt to create a string slice in the middle of a multibyte character, your program will exit with an error. For the purposes of introducing string slices, we are assuming ASCII only in this section; a more thorough discussion of UTF-8 handling is in the “Storing UTF-8 Encoded Text with Strings” section of Chapter 8.”
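To make the boundary rule concrete, here is a sketch with non-ASCII text. safe_slice is a hypothetical helper; str::get and is_char_boundary are standard methods. Unlike indexing with &s[a..b], get returns None instead of ending the program when a range cuts through a character.

```rust
// Hypothetical helper: like `&s[start..end]`, but returns None instead of panicking.
fn safe_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

fn main() {
    let s = "Здравствуйте"; // each of these Cyrillic letters takes 2 bytes in UTF-8

    println!("{}", s.is_char_boundary(1)); // false: byte 1 is in the middle of 'З'
    println!("{}", s.is_char_boundary(2)); // true

    println!("{:?}", safe_slice(s, 0, 2)); // Some("З")
    println!("{:?}", safe_slice(s, 0, 1)); // None

    // let broken = &s[0..1]; // would panic at runtime
}
```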
Find the first word in a String
Given a string containing words separated by spaces, return the first word. If there is no space, the whole string is one word.
fn first_word(input: &String) -> &str {
    let bytes = input.as_bytes(); // view the String as a slice of bytes
    for (i, &byte) in bytes.iter().enumerate() {
        if byte == b' ' {
            return &input[0..i]; // everything before the first space
        }
    }
    &input[..] // no space found: the whole string is one word
}
String Literals as Slices
In let s = "Hello, world!"; the type of s is &str, i.e. a slice pointing to that specific point in the binary. That’s also why string literals are immutable: &str is an immutable reference.
String Slices as parameters
Since we can take slices of both string literals and String values, a function that takes a &str works for both.
We can rewrite the signature of the previous example to accommodate both literals and Strings: fn first_word(s: &str) -> &str.
This lets us use the same function on both &String values and &str values: we can pass a slice of a String or a reference to the String.
// This change makes the function more modular, reusable, and generally more useful.
Check it out:
fn main() {
let my_string = String::from("hello world");
// `first_word` works on slices of `String`s, whether partial or whole.
let word = first_word(&my_string[0..6]);
let word = first_word(&my_string[..]);
// `first_word` also works on references to `String`s, which are equivalent
// to whole slices of `String`s.
let word = first_word(&my_string);
let my_string_literal = "hello world";
// `first_word` works on slices of string literals, whether partial or
// whole.
let word = first_word(&my_string_literal[0..6]);
let word = first_word(&my_string_literal[..]);
// Because string literals *are* string slices already,
// this works too, without the slice syntax!
let word = first_word(my_string_literal);
}
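For completeness, here is a self-contained sketch of first_word with the &str signature, since the main above only shows the call sites. The body follows the earlier byte-scanning version; the sample inputs are made up.

```rust
// Returns the first space-separated word of `s`, or all of `s` if it has no space.
fn first_word(s: &str) -> &str {
    for (i, &byte) in s.as_bytes().iter().enumerate() {
        if byte == b' ' {
            return &s[..i];
        }
    }
    s
}

fn main() {
    let owned = String::from("hello world");
    let literal = "hello world";

    println!("{}", first_word(&owned)); // a &String is accepted where a &str is expected
    println!("{}", first_word(&owned[6..])); // a partial slice of a String
    println!("{}", first_word(literal)); // a string literal is already a &str
    println!("{}", first_word("single")); // no space: the whole input comes back
}
```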
Other slices
As you can imagine we can slice other collections, such as arrays.
let arr = [1,2,3,4,5];
let arr_slice = &arr[2..4];
or
let arr = [1,2,3,4,5];
println!("{}",arr[1]);
let arr_slice = &arr[2..4];
for &i in arr_slice.iter(){
println!("{}", i);
}
This is a slice of type &[i32], but it works the same way as string slices do.
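Just as &str lets one function accept both literals and Strings, a &[i32] parameter accepts a whole array or any part of it. A small sketch; sum is a hypothetical function written for this note.

```rust
// Hypothetical helper: adds up whatever run of i32 values it is given.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let arr = [1, 2, 3, 4, 5];
    let middle = &arr[2..4]; // the elements at index 2 and 3

    assert_eq!(middle, &[3, 4]);
    println!("{}", sum(middle)); // 7
    println!("{}", sum(&arr)); // 15: a reference to the whole array is accepted as a slice
}
```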
Summary
Ownership, Borrowing, and Slices ensure memory safety in Rust at compile time, despite there being no Garbage Collector. If a Rust program compiles, it’s memory-safe (logic errors aside), and there’s no need to manually write extra code to manage and clean up memory.