Intro
Rust has many functional programming features, including:
- Closures: something like a function that can be stored in a variable
- Iterators: a way of processing a series of elements
In the previous chapters, the book went over some of Rust’s other functional features such as pattern matching and enums.
Closures
Closures in Rust are anonymous functions that can be saved in a variable and passed around like any other value, including as arguments to other functions.
A closure can be created in one spot and called in another, and unlike a function, it can capture values from the scope in which it’s defined.
Closures are defined with a simple syntax:
|arg_1, arg_2, ..., arg_n| expression_here
If the closure doesn’t take any arguments, there’s nothing between the two pipes.
Here’s an example:
#[derive(Debug, PartialEq, Copy, Clone)]
enum ShirtColor {
    Red,
    Blue,
}
struct Inventory {
    shirts: Vec<ShirtColor>,
}
impl Inventory {
    fn giveaway(&self, user_preference: Option<ShirtColor>) -> ShirtColor {
        user_preference.unwrap_or_else(|| self.most_stocked())
    }
    fn most_stocked(&self) -> ShirtColor {
        let mut num_red = 0;
        let mut num_blue = 0;
        for color in &self.shirts {
            match color {
                ShirtColor::Red => num_red += 1,
                ShirtColor::Blue => num_blue += 1,
            }
        }
        if num_red > num_blue {
            ShirtColor::Red
        } else {
            ShirtColor::Blue
        }
    }
}
The giveaway method passes a closure that takes no arguments to unwrap_or_else. The closure is defined here, but unwrap_or_else only evaluates it later, and only if user_preference is None.
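To make the syntax concrete, here’s a minimal sketch of closures with zero, one, and two arguments. The greeting function and its default_name parameter are hypothetical names made up for this illustration; only unwrap_or_else comes from the example above.

```rust
// Hypothetical helper for illustration: builds a greeting, falling back to a
// closure that captures `default_name` from the enclosing scope.
fn greeting(name: Option<String>, default_name: &str) -> String {
    // Zero-argument closure: nothing between the pipes.
    let fallback = || default_name.to_string();
    // One-argument closure with a single-expression body.
    let greet = |n: String| format!("Hello, {n}!");
    // `fallback` is only evaluated if `name` is `None`.
    greet(name.unwrap_or_else(fallback))
}

fn main() {
    // Two-argument closure stored in a variable and called like a function.
    let add = |a: i32, b: i32| a + b;
    println!("{}", add(2, 3));
    println!("{}", greeting(None, "stranger"));
    println!("{}", greeting(Some(String::from("Ferris")), "stranger"));
}
```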
Inferring & Annotating Closure Types
Closures don’t usually require type annotations for their parameters or return value the way function signatures do.
Functions require them because their signatures are part of an explicit interface exposed to other users of the code.
Closures, on the other hand, are typically short and only relevant within a narrow context, so the compiler can infer their types most of the time.
However, we can be explicit and add type annotations if we want.
use std::thread;
use std::time::Duration;
let expensive_closure = |num: u32| -> u32 {
    println!("calculating slowly...");
    thread::sleep(Duration::from_secs(2));
    num
};
This otherwise pointless closure just sleeps for 2 seconds and returns its own parameter, num: u32, but it shows where the annotations go :)
“This illustrates how closure syntax is similar to function syntax except for the use of pipes and the amount of syntax that is optional:”
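fn  add_one_v1   (x: u32) -> u32 { x + 1 }
let add_one_v2 = |x: u32| -> u32 { x + 1 };
let add_one_v3 = |x|             { x + 1 };
let add_one_v4 = |x|               x + 1  ;
So closures can be annotated or not, and the braces are optional when the body is a single expression, as in the giveaway example. Note that add_one_v3 and add_one_v4 only compile if they’re actually called somewhere, because the compiler infers their types from that usage.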
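One consequence of inference worth seeing in code: the first call fixes the closure’s types for good. The sketch below uses a hypothetical function name, first_call_fixes_type, and keeps the failing line commented out so the block still compiles.

```rust
// Hypothetical helper: shows that an unannotated closure gets its types
// from the first place it is called.
fn first_call_fixes_type() -> String {
    // No annotations: `x` could be anything at this point.
    let identity = |x| x;

    // This call makes the compiler infer `String` for both the parameter
    // and the return value.
    let s = identity(String::from("hello"));

    // Uncommenting the next line fails with a mismatched-types error,
    // because `identity` is already known to take a `String`.
    // let n = identity(5);

    s
}

fn main() {
    println!("{}", first_call_fixes_type());
}
```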
Capturing References and Moving Ownership
We mentioned that closures capture values from their environment. They can do this in three ways:
- Borrowing immutably
- Borrowing mutably
- Taking ownership
These map directly to the three ways a function can take a parameter. The difference is that a closure picks the mode automatically, based on what its body does with the captured values.
Here’s an example of how a closure captures an immutable reference to a vector, because it only needs that in order to print.
fn main() {
    let list = vec![1, 2, 3];
    println!("printing 1st, {list:?}");
    let clos = || println!("printing from closure, {list:?}");
    println!("before calling clos");
    clos();
    println!("after clos {list:?}");
}
Output:
❯ ./func_scrap
printing 1st, [1, 2, 3]
before calling clos
printing from closure, [1, 2, 3]
after clos [1, 2, 3]
The compiler determined that the closure only needed to capture an immutable reference from its environment.
However, if the closure adds an element, it needs a mutable reference. Let’s see what happens:
fn main() {
    let mut list = vec![1, 2, 3];
    println!("printing 1st, {list:?}");
    let mut clos = || list.push(55);
    println!("before calling clos");
    clos();
    println!("after clos {list:?}");
}
Output:
❯ ./func_scrap
printing 1st, [1, 2, 3]
before calling clos
after clos [1, 2, 3, 55]
When clos is defined, it captures a mutable reference to list. Since the closure isn’t used again after the call, the mutable borrow ends there.
Also, we can’t print list between the closure’s definition and its call, since that would require an immutable borrow while the mutable borrow held by clos is still active. Review borrowing here.
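Here’s a small sketch of where the mutable borrow starts and ends. The push_twice function is a hypothetical name for this illustration, and the line that would break the borrow rules is left commented out.

```rust
// Hypothetical helper: mutates a vector through a closure, then hands it back.
fn push_twice(mut list: Vec<i32>) -> Vec<i32> {
    // `add` captures `list` by mutable reference because its body calls `push`.
    let mut add = |n| list.push(n);

    // println!("{list:?}"); // would not compile: `list` is mutably borrowed by `add`
    add(55);
    add(56);

    // `add` is never used again, so the mutable borrow is over and
    // `list` can be read or moved once more.
    list
}

fn main() {
    println!("{:?}", push_twice(vec![1, 2, 3]));
}
```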
Forcing a Closure to Take Ownership
We can force the closure to take ownership of the values it uses, even when its body doesn’t strictly need ownership, with the move keyword. This is great when we want to move the data to a new thread and want the thread to own it.
Here’s a quick example (multi-threading will be covered down the line):
use std::thread;
fn main() {
    let list = vec![1, 2, 3];
    println!("Before defining closure: {list:?}");
    thread::spawn(move || println!("From thread: {list:?}"))
        .join()
        .unwrap();
}
Breakdown:
- A new thread is spawned
- The closure only needs an immutable reference to list in order to print it, but move forces it to take ownership of list instead
- If the main thread kept ownership and did more work before calling join, either thread could finish first
- If the main thread finished first and dropped list, a reference held by the spawned thread would no longer be valid
- Which is why the compiler requires that list be moved into the closure, so it remains valid throughout the new thread’s runtime
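The breakdown above can be sketched with a thread that also returns a result. The sum_in_thread function is a hypothetical name for this illustration; thread::spawn and join are the same standard library calls used in the example.

```rust
use std::thread;

// Hypothetical helper: sums a vector on a separate thread.
fn sum_in_thread(list: Vec<i32>) -> i32 {
    // `move` transfers ownership of `list` into the closure, so the spawned
    // thread owns the data for as long as it runs.
    let handle = thread::spawn(move || list.iter().sum::<i32>());

    // `list` is no longer usable here; it now lives inside the closure.
    // `join` waits for the thread to finish and returns the closure's result.
    handle.join().unwrap()
}

fn main() {
    println!("{}", sum_in_thread(vec![1, 2, 3]));
}
```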
Moving Captured Values Out
Once a closure has captured a reference or taken ownership of a value from its environment (which determines what, if anything, is moved into the closure), the code in its body determines what happens to those references or values when the closure is evaluated later (which determines what, if anything, is moved out of the closure).
A closure body can do any of the following:
- Move a captured value out
- Mutate a captured value
- Neither move nor mutate the value
- Capture nothing from the environment to begin with
The traits a closure implements are influenced by how it captures and handles values.
Closures automatically implement one, two, or all three of the following traits, additively, depending on how the body handles the values.
Traits:
- FnOnce applies to closures that can be called once. All closures implement at least this trait because all closures can be called. A closure that moves captured values out of its body will only implement FnOnce and none of the other Fn traits because it can only be called once.
- FnMut applies to closures that don’t move captured values out of their body but might mutate the captured values. These closures can be called more than once.
- Fn applies to closures that don’t move captured values out of their body and don’t mutate captured values, as well as closures that capture nothing from their environment. These closures can be called more than once without mutating their environment, which is important in cases such as calling a closure multiple times concurrently.
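A minimal sketch of the three traits side by side. The helper functions call_once, call_twice_mut, call_twice, and fn_traits_demo are hypothetical names for this illustration; each one asks for the least restrictive bound it needs.

```rust
// Calls `f` exactly once, so `FnOnce` is enough.
fn call_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

// Calls `f` twice and allows it to mutate captured state: `FnMut`.
fn call_twice_mut<F: FnMut()>(mut f: F) {
    f();
    f();
}

// Calls `f` twice without letting it mutate anything: `Fn`.
fn call_twice<F: Fn() -> usize>(f: F) -> usize {
    f() + f()
}

fn fn_traits_demo() -> (String, i32, usize) {
    let owned = String::from("moved out");
    // Returns the captured `String` by value, moving it out of the closure,
    // so this closure implements `FnOnce` only.
    let moved = call_once(move || owned);

    let mut counter = 0;
    // Mutates a captured variable without moving it out: `FnMut`.
    call_twice_mut(|| counter += 1);

    let text = String::from("abc");
    // Only reads the captured value: `Fn` (and therefore `FnMut` and `FnOnce` too).
    let total = call_twice(|| text.len());

    (moved, counter, total)
}

fn main() {
    println!("{:?}", fn_traits_demo());
}
```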
Let’s examine the definition of the unwrap_or_else method used earlier:
impl<T> Option<T> {
    pub fn unwrap_or_else<F>(self, f: F) -> T
    where
        F: FnOnce() -> T
    {
        match self {
            Some(x) => x,
            None => f(),
        }
    }
}
Along with the generic type T, there’s also F, which is the type of the closure we pass to the method.
The where clause puts the trait bound FnOnce() -> T on F: the closure takes no arguments, returns a T, and must be callable at least once. The body then uses match to call f only in the None case.
This makes sense: unwrap_or_else calls f at most once, so FnOnce is all it needs, and because every closure implements FnOnce, the method accepts the widest possible range of closures.
“Note: If what we want to do doesn’t require capturing a value from the environment, we can use the name of a function rather than a closure where we need something that implements one of the Fn traits. For example, on an Option<Vec<T>> value, we could call unwrap_or_else(Vec::new) to get a new, empty vector if the value is None. The compiler automatically implements whichever of the Fn traits is applicable for a function definition.”
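The note above can be sketched in a few lines. The items_or_empty function is a hypothetical wrapper for this illustration; the unwrap_or_else(Vec::new) call is the one the note describes.

```rust
// Hypothetical helper: returns the vector if present, or an empty one.
fn items_or_empty(maybe_items: Option<Vec<i32>>) -> Vec<i32> {
    // `Vec::new` takes no arguments and returns a `Vec<T>`, so it satisfies
    // the `FnOnce() -> T` bound without being wrapped in `|| Vec::new()`.
    maybe_items.unwrap_or_else(Vec::new)
}

fn main() {
    println!("{:?}", items_or_empty(None));
    println!("{:?}", items_or_empty(Some(vec![1, 2])));
}
```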
The sort_by_key method
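We’re going to use this method to demonstrate the difference between FnOnce and FnMut, and how moving a captured value out of a closure limits where the closure can be used.
The sort_by_key method is defined on slices, so it also works on arrays and vectors. It takes a closure that receives a reference to an element and returns a key that can be ordered, then sorts the elements by that key.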
Basic example:
#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}
fn main() {
    let mut list = [
        Rectangle { width: 10, height: 1 },
        Rectangle { width: 3, height: 5 },
        Rectangle { width: 7, height: 12 },
    ];
    list.sort_by_key(|r| r.width);
    println!("{list:#?}");
}
Since the method is expected to call the closure multiple times, it’s defined to take an FnMut closure. The closure |r| r.width doesn’t capture, mutate, or move any values from the environment, thus meeting the trait bound.
However, if we give sort_by_key a closure that implements only FnOnce, because it moves a value out of the environment, the compiler won’t be happy.
#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}
fn main() {
    let mut list = [
        Rectangle { width: 10, height: 1 },
        Rectangle { width: 3, height: 5 },
        Rectangle { width: 7, height: 12 },
    ];
    let mut sort_operations = vec![];
    let value = String::from("closure called");
    list.sort_by_key(|r| {
        sort_operations.push(value);
        r.width
    });
    println!("{list:#?}");
}
This example tries to “count” the number of times the closure’s called, and it doesn’t compile. The first call moves value out of the environment by pushing it into sort_operations, transferring ownership to that vector. The closure can’t be called a second time, because value would no longer be in the environment to be pushed again! That makes the closure FnOnce only, which doesn’t satisfy the FnMut bound.
To fix this silly count, we can change the closure to:
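let mut num_sort_operations = 0;
list.sort_by_key(|r| {
    num_sort_operations += 1;
    r.width
});
This version only captures a mutable reference to num_sort_operations and moves nothing out, so it implements FnMut and can be called as many times as needed.
Iterators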
Iterators allow the execution of a task on a sequence of elements in turn. They handle iterating over the elements in the sequence for us, but we need to define the task they perform on the elements, often in the form of a closure.
Iterators are lazily evaluated, which means they have no effect until a method that consumes them is called.
Defining an iterator on a vector is very easy:
let my_vec = vec![1, 2, 3, 4];
let iter = my_vec.iter();
Now the iterator stored in iter won’t evaluate and do anything until it’s used.
A for-loop example is fitting:
let my_vec = vec![1, 2, 3, 4];
let iter = my_vec.iter();
for v in iter {
    println!("{v}");
}
In languages that don’t have iterators in their standard library, we’d typically write this by starting a variable at index 0, using it to index into the vector, and incrementing it until it reaches len - 1.
The beauty of iterators is that they work on many different kinds of sequences, not just indexable data structures.
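As a rough sketch of that flexibility, the same loop can run over a vector, a range, and the values of a map. The total function is a hypothetical helper for this illustration; it accepts anything that can be turned into an iterator of i32 values.

```rust
use std::collections::BTreeMap;

// Hypothetical helper: sums the items of any sequence that yields `i32`.
fn total<I: IntoIterator<Item = i32>>(items: I) -> i32 {
    let mut sum = 0;
    for item in items {
        sum += item;
    }
    sum
}

fn main() {
    // A vector, which could also be walked by index.
    println!("{}", total(vec![1, 2, 3]));

    // A range, which has no backing storage to index into.
    println!("{}", total(1..=3));

    // The values of a map, which can't be indexed by position.
    let mut stock = BTreeMap::new();
    stock.insert("red", 4);
    stock.insert("blue", 2);
    println!("{}", total(stock.into_values()));
}
```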
Iterator Trait & next method
All iterators implement the Iterator trait from the standard library, which requires a next method that returns the next element in the sequence.
The trait’s definition:
pub trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
    // methods with default implementations elided
}
“Notice that this definition uses some new syntax: type Item and Self::Item, which are defining an associated type with this trait. We’ll talk about associated types in depth in Chapter 20.”
The next method returns the Item type wrapped in an Option. Each call returns Some(item) while elements remain, and None once the sequence is exhausted.
If needed, we can call next directly, like so: let next_val = iter.next(); which gives us either Some or None.
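Here’s a short sketch of calling next by hand. The collect_with_next function is a hypothetical helper for this illustration; the assertions in main spell out what each call returns for a three-element vector.

```rust
// Hypothetical helper: walks a slice by calling `next` manually.
fn collect_with_next(values: &[i32]) -> Vec<i32> {
    // `mut` is required because `next` advances the iterator's position.
    let mut iter = values.iter();
    let mut seen = Vec::new();
    // `next` returns `Some(&value)` until the sequence runs out, then `None`.
    while let Some(value) = iter.next() {
        seen.push(*value);
    }
    seen
}

fn main() {
    let v1 = vec![1, 2, 3];
    let mut v1_iter = v1.iter();

    assert_eq!(v1_iter.next(), Some(&1));
    assert_eq!(v1_iter.next(), Some(&2));
    assert_eq!(v1_iter.next(), Some(&3));
    assert_eq!(v1_iter.next(), None);

    println!("{:?}", collect_with_next(&v1));
}
```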
Iterator mutability and the next method
- Calling next on an iterator changes the internal state it uses to track its position in the sequence. Therefore, the variable that holds the iterator must be defined with mut. For-loops don’t require this, since they take ownership of the iterator.
- Values returned from the next method of an iterator created with iter are immutable references to the values in the sequence
Alternatives:
- Calling into_iter instead of iter creates an iterator that takes ownership of the collection and returns owned values
- Calling iter_mut allows iterating over mutable references
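The three flavours can be compared in one place. This is a minimal sketch; iter_kinds_demo is a hypothetical function name, while iter, iter_mut, and into_iter are the standard Vec methods discussed above.

```rust
// Hypothetical helper: exercises `iter`, `iter_mut`, and `into_iter` in turn.
fn iter_kinds_demo() -> (i32, Vec<i32>, Vec<String>) {
    let mut numbers = vec![1, 2, 3];

    // `iter` yields immutable references (`&i32`); `numbers` stays usable.
    let sum: i32 = numbers.iter().sum();

    // `iter_mut` yields mutable references (`&mut i32`), so elements can be
    // changed in place.
    for n in numbers.iter_mut() {
        *n *= 10;
    }

    // `into_iter` takes ownership of the vector and yields owned `String`s.
    let names = vec![String::from("a"), String::from("b")];
    let shouted: Vec<String> = names.into_iter().map(|s| s.to_uppercase()).collect();
    // `names` can no longer be used from here on.

    (sum, numbers, shouted)
}

fn main() {
    println!("{:?}", iter_kinds_demo());
}
```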
Methods
The Iterator trait has many methods with default implementations. Some of them call next in their definitions, which is why next is the one method we must implement for the trait.
Methods that call next are referred to as consuming adapters, because calling them uses up the iterator.
For example, the sum method takes ownership of the iterator and sums up its elements.
#[test]
fn iterator_sum() {
    let v1 = vec![1, 2, 3];
    let v1_iter = v1.iter();
    let total: i32 = v1_iter.sum();
    assert_eq!(total, 6);
}
v1_iter can no longer be used after the call to sum, because sum takes ownership of the iterator.
Methods Producing Other Iterators
Some methods don’t consume iterators; they produce new ones instead. These are called iterator adapters.
A good example is the very useful map method, which takes a closure and returns a new iterator that applies the closure to each element as it’s iterated over.
Let’s examine the following:
let v1: Vec<i32> = vec![1, 2, 3, 54];
v1.iter().map(|x| x + 1);
This will produce a compiler warning about an unused Map, because the iterator is never consumed. Remember that iterators are lazy, so the closure is never called until something consumes the iterator. One way to do that is the collect method.
Like so:
let v1: Vec<i32> = vec![1, 2, 3];
let v2: Vec<_> = v1.iter().map(|x| x + 1).collect();
assert_eq!(v2, vec![2, 3, 4]);
It collects the resulting values into a collection data type.
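To see the laziness directly, the sketch below counts how often the closure passed to map actually runs, before and after collect. The lazy_demo function and the calls counter are hypothetical names for this illustration; Cell is only used so the closure can update the counter through a shared reference.

```rust
use std::cell::Cell;

// Hypothetical helper: returns (closure calls before collect,
// closure calls after collect, collected values).
fn lazy_demo() -> (usize, usize, Vec<i32>) {
    let calls = Cell::new(0);
    let v1 = vec![1, 2, 3, 4];

    // Building the chain of adapters runs none of the closures yet.
    let adapted = v1
        .iter()
        .filter(|x| **x % 2 == 0)
        .map(|x| {
            calls.set(calls.get() + 1);
            x * 10
        });
    let calls_before = calls.get();

    // `collect` is a consuming adapter: it drives the whole chain.
    let result: Vec<i32> = adapted.collect();

    (calls_before, calls.get(), result)
}

fn main() {
    println!("{:?}", lazy_demo());
}
```

The map closure runs only for the two even numbers that make it past filter, and only once collect starts pulling items through.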
"You can chain multiple calls to iterator adapters to perform complex actions in a readable way. But because all iterators are lazy, you have to call one of the consuming adapter methods to get results from calls to iterator adapters."
Closures Capturing Their Environment
Many iterator adapters take closures as arguments, and those closures often capture their environment.
Take the filter method, which takes a closure. The closure gets an item from the iterator and returns a bool. If it returns true, the item is included in the filtered results; otherwise it’s left out.
Example, filter shoe inventory for a specific shoe size:
#[derive(PartialEq, Debug)]
struct Shoe {
    size: u32,
    style: String,
}
fn shoes_in_size(shoes: Vec<Shoe>, shoe_size: u32) -> Vec<Shoe> {
    shoes.into_iter().filter(|s| s.size == shoe_size).collect()
}
#[cfg(test)]
mod tests {
    use super::*;
    #[test]
    fn filters_by_size() {
        let shoes = vec![
            Shoe {
                size: 10,
                style: String::from("sneaker"),
            },
            Shoe {
                size: 13,
                style: String::from("sandal"),
            },
            Shoe {
                size: 10,
                style: String::from("boot"),
            },
        ];
        let in_my_size = shoes_in_size(shoes, 10);
        assert_eq!(
            in_my_size,
            vec![
                Shoe {
                    size: 10,
                    style: String::from("sneaker")
                },
                Shoe {
                    size: 10,
                    style: String::from("boot")
                },
            ]
        );
    }
}
The function shoes_in_size takes ownership of a vector of shoes and a shoe size as parameters, and returns a vector containing only the shoes of that size.
Its body calls into_iter in order to create an iterator that takes ownership of the vector.
The closure captures the shoe_size parameter from the environment and compares the value with each shoe’s size, keeping only shoes of the size specified.
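If the caller needs to keep the inventory, a borrowing variant is one option. This is only a sketch of an alternative, not part of the original example: shoes_in_size_ref is a hypothetical function that uses iter instead of into_iter and returns references to the matching shoes.

```rust
#[derive(PartialEq, Debug)]
struct Shoe {
    size: u32,
    style: String,
}

// Hypothetical variant: borrows the inventory instead of consuming it.
fn shoes_in_size_ref(shoes: &[Shoe], shoe_size: u32) -> Vec<&Shoe> {
    // The closure still captures `shoe_size` from the environment.
    shoes.iter().filter(|s| s.size == shoe_size).collect()
}

fn main() {
    let shoes = vec![
        Shoe { size: 10, style: String::from("sneaker") },
        Shoe { size: 13, style: String::from("sandal") },
    ];
    let found = shoes_in_size_ref(&shoes, 10);
    println!("{found:?}");
    // `shoes` is still available because it was only borrowed.
    println!("{} shoes in stock", shoes.len());
}
```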
Improving the I/O Project from Chapter 12
Refactoring expensive clone calls
We’ll get rid of the clone calls in the Config constructor, build.
“We needed clone here because we have a slice with String elements in the parameter args, but the build function doesn’t own args”
Instead of borrowing a slice and cloning from it, our function will take ownership of an iterator as its argument.
We’ve updated the function to accept the iterator parameter, which can be any type that implements the Iterator trait and returns String items.
impl Config {
    fn build(
        mut args: impl Iterator<Item = String>,
    ) -> Result<Config, &'static str> {
        args.next();
        // first expected is the expression to search for
        let query = match args.next() {
            Some(arg) => arg,
            None => return Err("Didn't get a query string"),
        };
        // second is obvs the file path
        let file_path = match args.next() {
            Some(arg) => arg,
            None => return Err("Didn't get a file path"),
        };
        let ignore_case = env::var("IGNORE_CASE").is_ok();
        Ok(Config {
            query,
            file_path,
            ignore_case,
        })
    }
}
We call next at the very start of the constructor to skip the program name, which is always the first value returned by env::args.
// This implementation doesn't account for the user passing the file path first....
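A side benefit of accepting any iterator of String items is that build can be fed hand-made arguments, which is handy for testing. The sketch below is self-contained and makes a few assumptions: the Config struct is reconstructed from the fields used above, the match blocks are condensed into ok_or with the ? operator (equivalent behaviour, just shorter), and fake_args is a made-up stand-in for env::args().

```rust
use std::env;

// Assumed shape of Config, reconstructed from the fields used in `build`.
#[derive(Debug)]
struct Config {
    query: String,
    file_path: String,
    ignore_case: bool,
}

impl Config {
    fn build(mut args: impl Iterator<Item = String>) -> Result<Config, &'static str> {
        // Skip the program name.
        args.next();
        // `ok_or` turns `None` into the error message; `?` returns it early.
        let query = args.next().ok_or("Didn't get a query string")?;
        let file_path = args.next().ok_or("Didn't get a file path")?;
        let ignore_case = env::var("IGNORE_CASE").is_ok();
        Ok(Config { query, file_path, ignore_case })
    }
}

fn main() {
    // In the real program this would be `Config::build(env::args())`.
    let fake_args = vec!["mingrep", "needle", "poem.txt"]
        .into_iter()
        .map(String::from);
    match Config::build(fake_args) {
        Ok(config) => println!("{config:?}"),
        Err(message) => println!("Problem parsing arguments: {message}"),
    }
}
```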
Iterator Adapters
Using iterator adapters for the search function makes the code cleaner and removes the intermediate mutable results vector. Minimizing mutable state is something functional programming aims for, and it especially pays off with concurrency, since there’s no shared mutable vector to manage.
Previous implementation:
pub fn search<'a>(query: &str, content: &'a str) -> Vec<&'a str> {
    let mut results: Vec<&str> = Vec::new();
    for line in content.lines() {
        if line.contains(query) {
            results.push(line);
        }
    }
    results
}
Let’s refactor using iterators:
pub fn search<'a>(query: &str, content: &'a str) -> Vec<&'a str> {
    content
        .lines()
        .filter(|line| line.contains(query))
        .collect()
}
Now, let’s go a step further and have the function return the iterator itself.
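pub fn search<'a>(query: &'a str, content: &'a str) -> impl Iterator<Item = &'a str> {
    content.lines().filter(move |line| line.contains(query))
}
Breakdown:
- Change the return type to impl Iterator<Item = &'a str>, meaning some type that implements Iterator and yields string slices
- Return the iterator directly, with no collect
- Give query the lifetime 'a as well, since the returned iterator now holds on to it
- Needed to move query into the closure, since the returned closure outlives the function call. Without move, the closure would borrow the function’s local query variable instead of holding its own copy of the &str.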
Without the move, we get the following compiler error:
[abdu@abdu mingrep]$ cargo test
Compiling mingrep v0.1.0 (/home/abdu/Documents/Rust-Projects/mingrep)
error[E0373]: closure may outlive the current function, but it borrows `query`, which is owned by the current function
 --> src/lib.rs:2:28
  |
1 | pub fn search<'a>(query: &'a str, content: &'a str) -> impl Iterator<Item = &'a str> {
  |               -- lifetime `'a` defined here
2 |     content.lines().filter(|line| line.contains(query))
  |                            ^^^^^^               ----- `query` is borrowed here
  |                            |
  |                            may outlive borrowed value `query`
  |
note: function requires argument type to outlive `'a`
 --> src/lib.rs:2:5
  |
2 |     content.lines().filter(|line| line.contains(query))
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
help: to force the closure to take ownership of `query` (and any other referenced variables), use the `move` keyword
  |
2 |     content.lines().filter(move |line| line.contains(query))
  |                            ++++
For more information about this error, try `rustc --explain E0373`.
error: could not compile `mingrep` (lib) due to 1 previous error
warning: build failed, waiting for other jobs to finish...
error: could not compile `mingrep` (lib test) due to 1 previous error
Noticeable difference between returning a Vector and an Iterator
Previously, the program printed nothing until search had found every match and returned the whole vector, and then printed the results all at once.
In contrast, using iterators allows printing a match as soon as it’s found, because the for-loop in the run function in main.rs takes advantage of the iterator’s laziness!
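A small sketch of what that laziness buys us. The search function is the iterator-returning version from above; the sample content string and the main function are made up for this illustration and stand in for the file contents and the run function.

```rust
pub fn search<'a>(query: &'a str, content: &'a str) -> impl Iterator<Item = &'a str> {
    content.lines().filter(move |line| line.contains(query))
}

fn main() {
    // Made-up sample text standing in for the file contents.
    let content = "Rust:\nsafe, fast, productive.\nPick three.\nTrust me.";

    // Each matching line is printed as soon as the iterator produces it;
    // no intermediate vector is built.
    for line in search("ust", content) {
        println!("{line}");
    }

    // Asking for just the first match stops scanning at that line.
    let first = search("ust", content).next();
    println!("{first:?}");
}
```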
Loops vs Iterators
Most Rustaceans prefer iterators over hand-written loops. They take a little getting used to, but they’re very useful once understood.
They allow devs to focus on the high-level objective of the loop and abstract away the act of looping itself.
Performance
The performance is essentially the same, because under the hood iterators compile down to roughly the same code as the equivalent hand-written loop.
“Iterators are one of Rust’s zero-cost abstractions, by which we mean that using the abstraction imposes no additional runtime overhead. This is analogous to how Bjarne Stroustrup, the original designer and implementor of C++, defines zero-overhead in his 2012 ETAPS keynote presentation “Foundations of C++”:”
In general, C++ implementations obey the zero-overhead principle: What you don’t use, you don’t pay for. And further: What you do use, you couldn’t hand code any better.
In most cases, Rust code using iterators compiles to the same assembly you’d get by writing the logic by hand. Optimizations such as loop unrolling and eliminating bounds checks on array accesses make the resulting code extremely efficient.
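To make the comparison concrete, here’s the same computation written both ways. This is only an illustration of equivalent logic, not a benchmark, and both function names are hypothetical.

```rust
// Hand-written loop with manual indexing.
fn sum_of_even_squares_loop(values: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..values.len() {
        if values[i] % 2 == 0 {
            total += values[i] * values[i];
        }
    }
    total
}

// The same logic expressed with iterator adapters.
fn sum_of_even_squares_iter(values: &[u64]) -> u64 {
    values
        .iter()
        .filter(|v| **v % 2 == 0)
        .map(|v| v * v)
        .sum()
}

fn main() {
    let values = [1, 2, 3, 4];
    println!("{}", sum_of_even_squares_loop(&values));
    println!("{}", sum_of_even_squares_iter(&values));
}
```

The iterator version states what is being computed (keep the even values, square them, add them up) and leaves the indexing to the library.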
Summary
- Closures are like functions that we can pass around and assign to variables
- Closures capture values from their environment
- Iterators allow us to iterate over a sequence of elements while abstracting the looping code
- Many iterator methods accept closures to accomplish tasks
- The performance of iterators compared to regular loops is basically the same