Intro
The main purpose of generics is to act as abstract placeholders for types that share similar behavior and interact with one another similarly, without actually knowing what might end up being inside of them.
We’ve already encountered generics before and have used them with things like Option, Result, and even HashMap. These are all generic in their implementation and don’t know what they’ll contain until they do…
This chapter of the Rust Book will go over many concepts that’ll help us reduce code duplication and make our code more flexible as it accepts and works with generic types instead of concrete types.
Generics combined with Traits
When generics and traits are combined, we can constrain a generic to accept only types that have a certain behavior, instead of just any type.
Function Extraction - Reducing Duplicate Code
Instead of writing duplicate code and functions, a generic function enables the function to work with multiple types.
The book goes over an example that finds the largest integer in a list. Of course, instead of writing the logic every time we need it, we write an abstraction called a function.
This function has a parameter for a list that it’ll iterate through…etc.
Similar to how we write a function to abstract the logic and not write it multiple times, we can do the same for the type of data the function will operate on!
It all boils down to two steps:
- Identify the duplicate logic (code).
- Extract it into a function, abstract the actual data/variables with parameters, specify the returns.
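The two steps above can be sketched roughly as follows. This is a minimal version modeled on the book's "largest number" example; the name `largest_i32` and the sample lists are only illustrative, and the function assumes the slice is non-empty.

```rust
// The duplicated "find the largest" logic, extracted into one function.
// Assumption: the slice has at least one element (indexing [0] panics otherwise).
fn largest_i32(list: &[i32]) -> &i32 {
    let mut largest = &list[0];
    for item in list {
        if item > largest {
            largest = item;
        }
    }
    largest
}

fn main() {
    let first = vec![34, 50, 25, 100, 65];
    let second = vec![102, 34, 6000, 89, 54, 2, 43, 8];

    // The same logic now serves both lists without being written twice.
    println!("The largest number is {}", largest_i32(&first));
    println!("The largest number is {}", largest_i32(&second));
}
```

The function still only works for `i32`, which is exactly the limitation generics remove.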
Generic Data Types
Getting into the meat of the pie.
Generics are used to design the signatures for functions, enums, and structs.
Function Definitions
When defining a generic function, the generics go in the function’s signature, right where the data types for the parameters would go, and the same for the return type of course.
Think of it like naming the type of the parameter in addition to the value.
Start by providing a name for the type parameter; this is the T in something like Option<T>. // It can be named anything, `T` is just convention.
Declare the parameter name in the function signature. For example:
fn largest<T>(list: &[T]) -> &T {
“We read this definition as: the function largest is generic over some type T. This function has one parameter named list, which is a slice of values of type T. The largest function will return a reference to a value of the same type T.”
Refactor the function to use the generic data type instead:
fn largest<T>(list: &[T]) -> &T {
let mut largest = &list[0];
for item in list {
if item > largest {
largest = item;
}
}
largest
}
But this won’t compile, because the type T isn’t guaranteed to implement the necessary trait, PartialOrd.
$ cargo run
Compiling chapter10 v0.1.0 (file:///projects/chapter10)
error[E0369]: binary operation `>` cannot be applied to type `&T`
--> src/main.rs:5:17
|
5 | if item > largest {
| ---- ^ ------- &T
| |
| &T
|
help: consider restricting type parameter `T` with trait `PartialOrd`
|
1 | fn largest<T: std::cmp::PartialOrd>(list: &[T]) -> &T {
| ++++++++++++++++++++++
For more information about this error, try `rustc --explain E0369`.
error: could not compile `chapter10` (bin "chapter10") due to 1 previous error
In short, this means our function body won’t work for all data types, specifically, ones that don’t implement this trait.
Fix this by doing what the compiler error recommends: restricting T with the PartialOrd trait bound, so the function only accepts types whose values can be compared.
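With the bound the compiler suggests, the generic version looks roughly like this. It is a sketch that still assumes a non-empty slice; the sample values in `main` are only illustrative.

```rust
// `T: PartialOrd` restricts T to types whose values can be compared with `>`.
fn largest<T: PartialOrd>(list: &[T]) -> &T {
    let mut largest = &list[0]; // assumes the slice is non-empty
    for item in list {
        if item > largest {
            largest = item;
        }
    }
    largest
}

fn main() {
    let numbers = vec![34, 50, 25, 100, 65];
    println!("The largest number is {}", largest(&numbers));

    // The same function works for any other comparable type.
    let chars = vec!['y', 'm', 'a', 'q'];
    println!("The largest char is {}", largest(&chars));
}
```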
Struct Definitions
Similar to functions, Structs can also be defined with Generics, also with <>.
Same procedure:
- Declare and name the type parameter.
- Use the generic type inside the Struct’s definition where you would a concrete data type.
For example, this struct represents a point.
struct Point<T> {
x: T,
y: T,
}
fn main() {
let integer = Point { x: 5, y: 10 };
let float = Point { x: 1.0, y: 4.0 };
}
The struct Point is generic over the type T, and T is used for both x and y, so both fields must have the same type.
A Struct may have different data types, same goes for a generic struct.
The signature becomes something like: struct Point<T, U> if x and y need to have different data types for some reason…
This means we can write something like:
struct Point<T, U> {
x: T,
y: U,
}
fn main() {
let integer = Point { x: 5, y: 10 };
let float = Point { x: 1.0, y: 4.0 };
let integer_and_float = Point { x: 5, y: 4.0 };
let float_and_integer = Point { x: 3.2, y: 8 };
}
Enum Definitions
As Structs, so are Enums.
We’ve actually seen this with the Option type in chapter 6.
enum Option<T>{
Some(T),
None,
}
Also, similar to Structs, Enums can also have multiple generic types, and we’ve seen this with the Result type.
enum Result<T, E>{
Ok(T),
Err(E),
}
"When you recognize situations in your code with multiple struct or enum definitions that differ only in the types of the values they hold, you can avoid duplication by using generic types instead."
Methods
By now you’ve got the gist, so this is more to clear house…
Methods can also be generic.
struct Point<T> {
x: T,
y: T,
}
impl<T> Point<T> {
fn x(&self) -> &T {
&self.x
}
}
fn main() {
let p = Point { x: 5, y: 10 };
println!("p.x = {}", p.x());
}
Note the impl<T> before Point<T>: this tells Rust that the T in Point<T> is a generic type parameter rather than a concrete type, the same generic used in the Struct definition.
Something interesting happens when we declare some methods using generics and others using concrete types.
Specific Type Methods
For example, within the same Struct, we can define a method that only works on f32 values. Meaning it’s only available when the Struct’s <T> ends up being f32. // That's pretty cool!
impl Point<f32> {
fn distance_from_origin(&self) -> f32 {
(self.x.powi(2) + self.y.powi(2)).sqrt()
}
}
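Put together as a runnable sketch: `x()` is available for every `Point<T>`, while `distance_from_origin` exists only on `Point<f32>`. The sample values are just illustrative.

```rust
struct Point<T> {
    x: T,
    y: T,
}

// Available for every Point<T>.
impl<T> Point<T> {
    fn x(&self) -> &T {
        &self.x
    }
}

// Available only when T is f32.
impl Point<f32> {
    fn distance_from_origin(&self) -> f32 {
        (self.x.powi(2) + self.y.powi(2)).sqrt()
    }
}

fn main() {
    let float_point = Point { x: 3.0_f32, y: 4.0_f32 };
    println!("distance = {}", float_point.distance_from_origin());

    let int_point = Point { x: 5, y: 10 };
    println!("x = {}", int_point.x());
    // int_point.distance_from_origin(); // would not compile: no such method on Point<i32>
}
```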
Different Generic Params
Not all methods must use the same generic type parameters as the Struct definition. A method can declare additional generic parameters of its own, so it can accept (and return) a struct instantiated with different types than self.
This example makes it clear:
struct Point<X1, Y1> {
x: X1,
y: Y1,
}
impl<X1, Y1> Point<X1, Y1> {
// See how mixup takes different types
fn mixup<X2, Y2>(self, other: Point<X2, Y2>) -> Point<X1, Y2> {
Point {
x: self.x,
y: other.y,
}
}
}
fn main() {
let p1 = Point { x: 5, y: 10.4 };
let p2 = Point { x: "Hello", y: 'c' };
let p3 = p1.mixup(p2);
// p3.x = 5, p3.y = c
println!("p3.x = {}, p3.y = {}", p3.x, p3.y);
}
Generics and Performance
Using generic type parameters has no runtime performance hit compared with using concrete types.
This is because of monomorphization at compile time.
Monomorphization
“Is the process of turning generic code into specific code by filling in the concrete types that are used when compiled.”
The compiler does the opposite of the steps we took to create the generic code: it looks at all the places where the generic code is called and generates code for the concrete types it’s called with.
For more details, see the “Performance of Code Using Generics” section of the Rust Book.
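As a rough mental model of monomorphization, here is a generic function next to the kind of specialized copies the compiler conceptually produces. The names `double`, `double_i32`, and `double_f64` are made up for this sketch; the real generated names are internal to the compiler.

```rust
use std::ops::Add;

// The generic version, as we write it.
fn double<T: Add<Output = T> + Copy>(value: T) -> T {
    value + value
}

// Roughly what the compiler generates for the two concrete types used below.
// These hand-written copies are only an illustration of the idea.
fn double_i32(value: i32) -> i32 {
    value + value
}

fn double_f64(value: f64) -> f64 {
    value + value
}

fn main() {
    // Each call to the generic function is resolved to a concrete copy at compile time.
    println!("{} {}", double(21), double_i32(21));
    println!("{} {}", double(1.5), double_f64(1.5));
}
```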
Traits and Defining Shared Behavior
Traits define functionality that a particular type has and can share with other types. They define behavior in an abstract way, and trait bounds let us specify that a generic can be any type that has certain behavior.
// That seems a bit confusing...
Traits are like Interfaces from other languages but with some differences.
Definition
The behavior of a type is all within the methods called on it. Some different types share similar behavior and so we can use the same methods, and that’s where traits come in.
Traits group method signatures together to define sets of behavior.
Example of defining a trait for any summarizeable collection of text be it newspaper, blog, post, etc.
pub trait Summary {
fn summarize(&self) -> String;
}
The trait is declared using the trait keyword along with its name and, optionally, a visibility modifier (pub here). Inside it are the method signatures (one per line, each ending with ;) that describe the behavior for any type that implements the trait. Any type implementing this trait must provide its own body for each method.
Implementing Traits on Type
Implementing a trait is similar to implementing regular methods, but with the addition of the trait’s name after impl.
The trait name is followed by the for keyword and then the name of the type we’re implementing it for.
General syntax: impl Trait for Type {}
Here’s an example
pub struct NewsArticle {
pub headline: String,
pub location: String,
pub author: String,
pub content: String,
}
impl Summary for NewsArticle {
fn summarize(&self) -> String {
format!("{}, by {} ({})", self.headline, self.author, self.location)
}
}
pub struct SocialPost {
pub username: String,
pub content: String,
pub reply: bool,
pub repost: bool,
}
impl Summary for SocialPost {
fn summarize(&self) -> String {
format!("{}: {}", self.username, self.content)
}
}
Now these types can use the summarize behavior. For example:
use aggregator::{SocialPost, Summary};
fn main() {
let post = SocialPost {
username: String::from("horse_ebooks"),
content: String::from(
"of course, as you probably already know, people",
),
reply: false,
repost: false,
};
println!("1 new post: {}", post.summarize());
}
When using trait methods, remember to bring the trait into scope as well, not just the type.
Coherence and the Orphan rule
We can implement a trait on a type only if the trait or the type (or both) is local to our crate. We can’t implement external traits on external types, like both being defined in the standard library.
This rule ensures that code belonging to others can’t break yours and vice versa; without it, two separate crates could implement the same trait for the same type, and Rust wouldn’t know which implementation to use…
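A small sketch of what the rule allows and forbids. The trait `Describe` and the type `Wrapper` are hypothetical names invented for this example.

```rust
use std::fmt;

// A local trait can be implemented on an external type (Vec<i32> here).
trait Describe {
    fn describe(&self) -> String;
}

impl Describe for Vec<i32> {
    fn describe(&self) -> String {
        format!("a vector with {} items", self.len())
    }
}

// A local type can implement an external trait (Display here).
struct Wrapper(Vec<String>);

impl fmt::Display for Wrapper {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "[{}]", self.0.join(", "))
    }
}

// Not allowed: both the trait and the type are external to this crate.
// impl fmt::Display for Vec<String> { /* ... */ }

fn main() {
    let numbers: Vec<i32> = vec![1, 2, 3];
    println!("{}", numbers.describe());

    let w = Wrapper(vec![String::from("hello"), String::from("world")]);
    println!("w = {w}");
}
```

Wrapping the external type in a local struct, as `Wrapper` does, is one common way to stay within the rule.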
Using Default Implementations
There are times when we don’t want to implement custom logic and the default is just fine.
To do this we define the default logic/behavior for a trait (in the definition) and that would then give the coder the option to keep or override the behavior.
For example the Summary trait can return a default string
pub trait Summary{
fn summarize(&self) -> String{
String::from("Read more here")
}
}
So if we have a RustitPost type that needs to be summarized using the default implementation, we simply implement the trait with an empty implementation block like so:
impl Summary for RustitPost {}
"Default implementations can call other methods in the same trait, even if those other methods don’t have a default implementation. In this way, a trait can provide a lot of useful functionality and only require implementors to specify a small part of it."
pub trait Summary {
fn summarize_author(&self) -> String;
fn summarize(&self) -> String {
format!("(Read more from {}...)", self.summarize_author())
}
}
To use this version of Summary, we only need to define summarize_author when we implement the trait on a type:
impl Summary for SocialPost {
fn summarize_author(&self) -> String {
format!("@{}", self.username)
}
}
However, once we override a method for a type, the default implementation of that same method can’t be called from the overriding implementation...
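Here is a self-contained sketch contrasting a type that keeps the default `summarize` with one that overrides it. The struct fields are trimmed down and the sample values are invented for illustration.

```rust
trait Summary {
    // Required: every implementor must provide this.
    fn summarize_author(&self) -> String;

    // Default: used unless the implementor overrides it.
    fn summarize(&self) -> String {
        format!("(Read more from {}...)", self.summarize_author())
    }
}

struct SocialPost {
    username: String,
}

// Keeps the default summarize.
impl Summary for SocialPost {
    fn summarize_author(&self) -> String {
        format!("@{}", self.username)
    }
}

struct NewsArticle {
    headline: String,
    author: String,
}

// Overrides summarize with its own behavior.
impl Summary for NewsArticle {
    fn summarize_author(&self) -> String {
        self.author.clone()
    }

    fn summarize(&self) -> String {
        format!("{}, by {}", self.headline, self.summarize_author())
    }
}

fn main() {
    let post = SocialPost { username: String::from("horse_ebooks") };
    let article = NewsArticle {
        headline: String::from("Penguins win"),
        author: String::from("Iceburgh"),
    };
    println!("{}", post.summarize());
    println!("{}", article.summarize());
}
```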
Using Traits as Parameters
We can define functions that accept parameters that implement a specific trait.
// Kinda like a parameter that extends a class in oop.
pub fn super_do(item: &impl MyTrait){
// logic here ...
item.trait_do()
}
Explanation:
The function super_do accepts an item parameter that implements the MyTrait trait, and calls the methods on item from that trait in the function body!
Trait Bound Syntax
The previous syntax used to specify the trait parameters is syntactic sugar for the following:
pub fn super_do<T: MyTrait>(item: &T){
// logic here ...
item.trait_do()
}
The trait bound goes with the declaration of the generic type parameter, after a colon and inside the angle brackets.
While trait bound syntax is more verbose than impl Trait, it serves a purpose when dealing with complex cases…
For example, when a function takes two or more parameters and we want to force them to be the same type:
pub fn notify<T: Summary>(item1: &T, item2: &T) {
whereas with impl Trait, each parameter could be a different type (as long as both implement the trait)…
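The difference can be sketched like this. The types `Post` and `Article` and the function names `notify_any` and `notify_same` are hypothetical, chosen only to make the contrast visible.

```rust
trait Summary {
    fn summarize(&self) -> String;
}

struct Post(String);
struct Article(String);

impl Summary for Post {
    fn summarize(&self) -> String {
        format!("post: {}", self.0)
    }
}

impl Summary for Article {
    fn summarize(&self) -> String {
        format!("article: {}", self.0)
    }
}

// Each `impl Summary` is its own hidden type parameter,
// so the two arguments may have different types.
fn notify_any(item1: &impl Summary, item2: &impl Summary) -> String {
    format!("{} | {}", item1.summarize(), item2.summarize())
}

// A single `T` forces both arguments to be the same type.
fn notify_same<T: Summary>(item1: &T, item2: &T) -> String {
    format!("{} | {}", item1.summarize(), item2.summarize())
}

fn main() {
    let post = Post(String::from("hello"));
    let other_post = Post(String::from("again"));
    let article = Article(String::from("news"));

    println!("{}", notify_any(&post, &article));
    println!("{}", notify_same(&post, &other_post));
    // notify_same(&post, &article); // would not compile: mismatched types
}
```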
Multiple Trait Bounds with +
To specify that a type must implement multiple traits, join the traits with +.
pub fn notify(item: &(impl Summary + Display)) {
The + syntax is also valid with trait bounds on generic types:
pub fn notify<T: Summary + Display>(item: &T) {}
With the two trait bounds specified, the body of notify can call summarize and use {} to format item.
Clearer Trait Bounds with where Clauses
Using too many trait bounds has its downsides.
“Each generic type has its own trait bounds, so functions with multiple generic type parameters can contain lots of trait bound information between the function’s name and its parameter list…”
The function signature becomes hard to read, so we can instead move the bounds into a where clause after the function signature.
Instead of this:
fn some_function<T: Display + Clone, U: Clone + Debug>(t: &T, u: &U) -> i32 {}
Write this:
fn some_function<T, U>(t: &T, u: &U) -> i32
where
T: Display + Clone,
U: Clone + Debug,
{
Cleaner, much more readable and the basic signature info is where it’s expected and the trait bounds are specified after!
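To see the where clause in a function that actually compiles, here is a sketch with a made-up body. The original signature has no body, so the logic below (formatting both values and returning the text length) is purely an assumption for illustration.

```rust
use std::fmt::{Debug, Display};

fn some_function<T, U>(t: &T, u: &U) -> i32
where
    T: Display + Clone,
    U: Clone + Debug,
{
    // Clone is available on both, Display on T, Debug on U.
    let t_copy = t.clone();
    let u_copy = u.clone();
    let text = format!("{t_copy} {u_copy:?}");
    text.len() as i32
}

fn main() {
    let name = String::from("hi");
    let values = vec![1, 2];
    println!("{}", some_function(&name, &values));
}
```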
Returning Types implementing Traits
We can use the impl Trait syntax in a function’s return type as well, meaning we can make functions that return instances of types implementing those traits.
fn returns_summarizable() -> impl Summary {
SocialPost {
username: String::from("horse_ebooks"),
content: String::from(
"of course, as you probably already know, people",
),
reply: false,
repost: false,
}
}
Thus the function doesn’t need to name the concrete type in its signature; it simply returns one!
But it only works if we’re returning a single type.
// Kinda like a trait-implementing type factory
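A runnable sketch of the single-type restriction. The trait and struct are trimmed-down versions of the ones above, and the commented-out function is only there to show the shape of code the compiler would reject.

```rust
trait Summary {
    fn summarize(&self) -> String;
}

struct SocialPost {
    username: String,
    content: String,
}

impl Summary for SocialPost {
    fn summarize(&self) -> String {
        format!("{}: {}", self.username, self.content)
    }
}

// The caller only knows it gets "something that implements Summary".
fn returns_summarizable() -> impl Summary {
    SocialPost {
        username: String::from("horse_ebooks"),
        content: String::from("of course, as you probably already know, people"),
    }
}

// This shape would NOT compile: the branches return two different concrete types.
// fn returns_either(switch: bool) -> impl Summary {
//     if switch { SocialPost { .. } } else { NewsArticle { .. } }
// }

fn main() {
    let item = returns_summarizable();
    println!("{}", item.summarize());
}
```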
Trait Bounds and Conditional Method Implementation
Like the header says, based on the trait bound, we can implement methods in a specific way.
Example:
use std::fmt::Display;
struct Pair<T> {
x: T,
y: T,
}
impl<T> Pair<T> {
fn new(x: T, y: T) -> Self {
Self { x, y }
}
}
impl<T: Display + PartialOrd> Pair<T> {
fn cmp_display(&self) {
if self.x >= self.y {
println!("The largest member is x = {}", self.x);
} else {
println!("The largest member is y = {}", self.y);
}
}
}
“Implementations of a trait on any type that satisfies the trait bounds are called blanket implementations and are used extensively in the Rust standard library. For example, the standard library implements the ToString trait on any type that implements the Display trait.”
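A short sketch of both sides of a blanket implementation: using the standard library's `ToString` one, and writing our own. `Celsius` and `Shout` are hypothetical names for this example.

```rust
use std::fmt;

struct Celsius(f64);

// Implementing Display is enough to get to_string() for free,
// through the standard library's blanket impl of ToString.
impl fmt::Display for Celsius {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} degrees", self.0)
    }
}

// Our own blanket implementation: every type that implements Display gets shout().
trait Shout {
    fn shout(&self) -> String;
}

impl<T: fmt::Display> Shout for T {
    fn shout(&self) -> String {
        self.to_string().to_uppercase()
    }
}

fn main() {
    let temp = Celsius(21.5);
    println!("{}", temp.to_string());
    println!("{}", temp.shout());
    println!("{}", String::from("hello").shout());
}
```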
Summarizing Generics and Traits
Traits and trait bounds allow us to take advantage of the power of generic types, thus reducing code duplication and informing the compiler what behavior is expected from a generic type. So the compiler can check all the concrete types for correct behavior.
How Lifetimes Validate References
Surprise surprise, lifetimes are also a kind of generic.
Lifetimes ensure that a reference is valid for as long as we need it to be; a reference’s lifetime is the scope within which it’s valid. // Aptly named...
Annotation
Often, lifetimes are implicit and inferred, just like types. However, just as we sometimes need to annotate types, we sometimes also need to annotate lifetimes using generic lifetime parameters, so that the compiler can verify that the references will be valid at runtime.
// This is uncharted territory for me, something new!
Dangling References
These are references that point to data that has already been cleaned up, so the program would end up referencing data other than what it intended. The main aim of lifetimes is to prevent dangling references.
Here’s an example:
fn main() {
let r;
{
let x = 5;
r = &x;
}
println!("r: {r}");
}
Breakdown:
- The variable r is declared in the outer scope without an initial value (that’s allowed; the compiler only complains if we use it before giving it one).
- We declare x with the value 5 in the inner scope.
- We assign r a reference to x.
- Then we attempt to use r after the inner scope has ended, i.e. after x has gone out of scope.
The code won’t compile, and the compiler will inform us that x does not live long enough, i.e. it is out of scope by the time r is used.
The Borrow Checker
This infamous feature of Rust compares scopes to determine whether the borrows in a program are valid.
It checks Ownership and Lifetimes.
Taking the previous example, the Borrow Checker finds that x (lifetime 'b below) lives for a shorter time than r (lifetime 'a), which refers to it. The value goes out of scope while r still points to it, and the flags are raised!
fn main() {
    let r;                // ---------+-- 'a
                          //          |
    {                     //          |
        let x = 5;        // -+-- 'b  |
        r = &x;           //  |       |
    }                     // -+       |
                          //          |
    println!("r: {r}");   //          |
}                         // ---------+
“The subject of the reference doesn’t live as long as the reference.”
To fix this, we can declare x in the same scope as r so that it lives long enough:
fn main() {
    let x = 5;
    let r = &x;
    println!("The value of r is: {r}");
}
Now x outlives r, so the reference in r is always valid while it’s in use.
Generic Lifetimes in Functions
Before we get into this, ponder the following example:
fn main() {
let string1 = String::from("abcd");
let string2 = "xyz";
let result = longest(string1.as_str(), string2);
println!("The longest string is {result}");
}
fn longest(x: &str, y: &str) -> &str {
if x.len() > y.len() { x } else { y }
}
This won’t compile and gives the following error:
error[E0106]: missing lifetime specifier
--> scrap.rs:13:33
|
13 | fn longest(x: &str, y: &str) -> &str{
| ---- ---- ^ expected named lifetime parameter
|
= help: this function's return type contains a borrowed value, but the signature does not say whether it is borrowed from `x` or `y`
help: consider introducing a named lifetime parameter
|
13 | fn longest<'a>(x: &'a str, y: &'a str) -> &'a str{
| ++++ ++ ++ ++
error: aborting due to 1 previous error
This happens because longest takes references in the form of string slices, since we don’t want to give it ownership, and the compiler cannot tell whether the returned reference borrows from x or from y (due to the branches of the if-else expression). It recommends adding a named lifetime parameter, i.e. annotating the parameters and the return value.
Lifetime Annotation Syntax
The annotations don’t affect the length of the lifetimes, rather help describe their relationships to other references. It helps the compiler and you make sense of the lifetimes.
“…functions can accept references with any lifetime by specifying a generic lifetime parameter.”
Syntax
- Names start with an apostrophe '.
- Lowercase and very short.
- Placed after the reference’s &.
Traditionally the first lifetime parameter is named 'a, and so on.
Examples:
&i32 // a reference
&'a i32 // a reference with an explicit lifetime
&'a mut i32 // a mutable reference with an explicit lifetime
- The first is a reference to an i32.
- The second is the same reference but with a lifetime parameter 'a.
- The third is a mutable reference, also with the lifetime parameter 'a.
“One lifetime annotation by itself doesn’t have much meaning, because the annotations are meant to tell Rust how generic lifetime parameters of multiple references relate to each other.”
Annotations in Function Signatures
To use lifetime parameters we declare them in angle brackets just like generic type parameters.
Revisiting the longest function as an example,
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str{
if x.len() > y.len() { x } else { y }
}
The goal is for the signature to express the following constraint:
The returned reference will be valid as long as both of the parameters are valid. This is the relationship between lifetimes of the parameters and the return value.
Now Rust knows that the function takes two string slices that both live at least as long as 'a, and that the returned slice also lives at least as long as 'a. In practice, the lifetime of the reference returned by longest is the same as the smaller of the lifetimes of the values referred to by the function’s arguments.
// Remember, lifetimes aren't being altered, we're simply informing Rust how it should constrain the lifetimes.
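To see the constraint in action, here is a sketch along the lines of the book's example, where the two arguments live in different scopes. The string contents are just sample data.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

fn main() {
    let string1 = String::from("long string is long");
    {
        let string2 = String::from("xyz");
        // 'a becomes the overlap of both lifetimes: the inner scope.
        let result = longest(string1.as_str(), string2.as_str());
        println!("The longest string is {result}");
    }
    // Using `result` out here would be rejected by the borrow checker,
    // because string2 (one of the possible sources) is already gone.
}
```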
Relationships
Annotations depend on what the function does, which is why we don’t always need to annotate when a function accepts references.
When a function returns a reference, the lifetime parameter of the return type must match the lifetime parameter of one of the function’s parameters. If it doesn’t, then it must refer to a value created within the function.
However, that would be a dangling reference, since the value the return type refers to will go out of scope at the function’s end…
Look:
fn longest<'a>(x: &str, y: &str) -> &'a str {
let result = String::from("really long string");
result.as_str()
}
Even though a lifetime parameter has been specified, this won’t compile: the return value refers to result, which goes out of scope and is cleaned up at the end of the function, so it would be a dangling reference.
To fix this, return an owned type instead of a reference, that way the function caller keeps the value even after longest is done.
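A minimal sketch of that fix: the function hands back an owned `String`, so no lifetime annotation is needed. The parameters are kept to mirror the original signature and are prefixed with underscores because this sketch doesn't use them.

```rust
// Returning an owned String moves the value out to the caller,
// so nothing is left pointing at cleaned-up data.
fn longest(_x: &str, _y: &str) -> String {
    let result = String::from("really long string");
    result
}

fn main() {
    let owned = longest("abcd", "xyz");
    // `owned` is still valid here, long after the function has returned.
    println!("{owned}");
}
```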
Syntax summarized
It’s about connecting the lifetimes of various parameters and return values, in a way that Rust can manage the info to ensure memory-safe operations.
Annotations in Structs
We can define structs that hold references instead of owned values, but this requires lifetime annotation on every reference in the struct’s definition.
Example
struct SystemInfo<'a> {
    os: &'a str,
    cpu: &'a str,
    username: &'a str,
}
fn main() {
    let os = String::from("CachyOs");
    let cpu = "Ryzen 7800X3D";
    let username = "Dev";
    let sys = SystemInfo {
        os: os.as_str(), // borrow the String as a &str
        cpu,
        username,
    };
    println!("{} / {} / {}", sys.os, sys.cpu, sys.username);
}
The annotation is declared after the struct’s name, like a generic type parameter, and is then used on each reference field. It means that an instance of SystemInfo can’t outlive the data that its reference fields point to.
Lifetime Elision and Rust's History
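Certain lifetime annotation patterns were extremely common in Rust’s early days, so they were incorporated into the compiler as the lifetime elision rules; in those cases the compiler infers the lifetimes and no annotations are needed. Interesting stuff; the Rust Book’s section on lifetime elision has more.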
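As an illustration of elision, here is a sketch using the book's `first_word` function: once written without annotations, and once with the annotations the compiler fills in for us. The name `first_word_explicit` is made up for the comparison.

```rust
// No annotations needed: with exactly one input reference,
// the compiler assigns its lifetime to the returned reference.
fn first_word(s: &str) -> &str {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }
    &s[..]
}

// The same signature with the elided lifetime written out by hand.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    first_word(s)
}

fn main() {
    let sentence = String::from("hello world");
    println!("{}", first_word(&sentence));
    println!("{}", first_word_explicit(&sentence));
}
```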
Method Definitions
Similar syntax as before, depending on the return types.
- Annotations are declared after the impl keyword.
The elision rules often make it so that annotations aren’t necessary in method signatures.
impl<'a> ImportantExcerpt<'a> {
fn level(&self) -> i32 {
3
}
}
The lifetime parameter declaration after impl and its use after the type name are required, but because of the first elision rule, we’re not required to annotate the lifetime of the reference to self.
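For a complete picture, here is a runnable sketch. The struct definition with its `part` field is assumed from the Rust Book, since it isn't shown in these notes, and `announce_and_return_part` is the book's companion method where the returned reference takes its lifetime from `&self`.

```rust
// Assumed definition (from the Rust Book): a struct holding a borrowed string slice.
struct ImportantExcerpt<'a> {
    part: &'a str,
}

impl<'a> ImportantExcerpt<'a> {
    // No reference is returned, so no lifetime needs to be annotated.
    fn level(&self) -> i32 {
        3
    }

    // Two input references, but because one of them is &self,
    // the returned reference gets the lifetime of &self.
    fn announce_and_return_part(&self, announcement: &str) -> &str {
        println!("Attention please: {announcement}");
        self.part
    }
}

fn main() {
    let novel = String::from("Call me Ishmael. Some years ago...");
    let first_sentence = novel.split('.').next().unwrap();
    let excerpt = ImportantExcerpt { part: first_sentence };

    println!("level = {}", excerpt.level());
    println!("part = {}", excerpt.announce_and_return_part("new excerpt"));
}
```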
Static Lifetime
The 'static lifetime is special and denotes that the affected reference can live for the entire duration of the program. For example, all string literals have the 'static lifetime,
which is annotated as follows:
let s: &'static str = "I have a static lifetime.";
The text of the string is stored directly in the program’s binary, so it’s always available.
Use 'static with caution, as it's not something to just slap around... Compiler error messages may suggest it, but most of the time the real problem is a dangling reference or a mismatch of lifetimes; if you end up using 'static for a bug fix, then go back and fix the underlying issue…
All Together
An example with generic type parameters, trait bounds, and lifetimes!
Revisiting the longest function from earlier:
use std::fmt::Display;
fn longest_with_an_announcement<'a, T>(
x: &'a str,
y: &'a str,
ann: T,
) -> &'a str
where
T: Display,
{
println!("Announcement! {ann}");
if x.len() > y.len() { x } else { y }
}
- Notice how the lifetime and the type parameter are declared together between the angle brackets.
- The generic type parameter T is bound to a type that implements the Display trait.
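A usage sketch for the function above, repeated here so the block is self-contained. The arguments passed in `main` are sample values; anything that implements `Display` works for `ann`.

```rust
use std::fmt::Display;

fn longest_with_an_announcement<'a, T>(
    x: &'a str,
    y: &'a str,
    ann: T,
) -> &'a str
where
    T: Display,
{
    println!("Announcement! {ann}");
    if x.len() > y.len() { x } else { y }
}

fn main() {
    // A string slice as the announcement...
    let result = longest_with_an_announcement("abcd", "xyz", "Today is someone's birthday!");
    println!("The longest string is {result}");

    // ...or a number, since integers implement Display too.
    let result = longest_with_an_announcement("ab", "xyz", 42);
    println!("The longest string is {result}");
}
```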