Lifetimes in Rust
In the world of high-level languages like Java, Python, JavaScript, Ruby and Go, the developer mindset is not really tuned to one important thing: memory. The reason is the garbage collector (GC), which keeps track of all the objects in memory for you. But this comes at the cost of the GC running alongside your program. On the other end of the spectrum, with C and C++, you can extract maximum performance out of your system, but this comes at the cost of having to manage the memory yourself. Sure, you get solid, granular control, but then you have to ensure that your code is safe in terms of memory and concurrency.
Rust changes this by bringing the best of both worlds. Rust puts a strong emphasis on the concepts of Ownership & Borrowing. There is no garbage collector, yet you can rest assured that your program is memory safe. (Rust does have an “unsafe” offering, but that’s off-topic.)
Before we dive into lifetimes, I assume we have a good handle on the basics of Rust and a decent understanding of Mutability, Traits, Ownership and Borrowing/References.
Let’s start!
So what is a lifetime? Here’s a simple piece of code.
        fn main() {
            let x = 10;
A:|         let y = &x;
  |     }
(Note: I have annotated the borrows in the code with markers that help us visualize the state of the borrows: “|” (bold) for alive, “|” (light) for dead and “X” for invalid. You will have to remove the annotations if you wish to compile the code, as they are not part of the Rust syntax. This is how the Rust community generally visualizes lifetimes.)
As you see, y is a reference, and it is only valid as long as the value it borrows lives at least as long as y itself, which is true in the above example. Now let’s have a look at the following code:
        fn main() {
            let x;
            {
                let y = 1;
A:|             x = &y;
  |         }
  X         assert_eq!(*x, 1);
  X     }
The compiler will not accept this code, because x, which is a reference, lives longer than the value it borrows. y only lives within the inner scope, but the reference x is used outside that scope. By then y has been dropped, which would leave x dangling. Rust does not allow dangling references.
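One way to satisfy the compiler here is to let the value outlive the reference. The sketch below is only an illustration of that idea; the helper name read_through_reference is made up for this example.

```rust
// Hypothetical helper (not from the original example): reads a value
// through a reference whose referent outlives the reference.
fn read_through_reference() -> i32 {
    let y = 1; // the value is declared first, in the outer scope
    let x;
    {
        x = &y; // the borrow starts inside the inner scope...
    }
    *x // ...but `y` is still alive here, so `x` is not dangling
}

fn main() {
    assert_eq!(read_through_reference(), 1);
    println!("x still points at a live value");
}
```

Because y is declared before x, it is dropped after x, so the reference never outlives what it points to.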
Non Lexical Lifetimes
In the previous example, if we remove the assert_eq!() line, the code miraculously compiles! How?
Well, lifetimes are not necessarily tied to a scope. The compiler is smart enough to figure out that x is not used any further. So it says: big deal, why nitpick? Better to consider the borrow “dead”.
        fn main() {
            let x;
            {
                let y = 1;
A:|             x = &y;
  |         }
  |
  |     }
Pretty cool stuff, isn’t it? :)
This is called NLL (Non-Lexical Lifetimes), a powerful lifetime feature that shipped with the Rust 2018 edition (Rust 1.31) and was later enabled for the 2015 edition as well. The same piece of code gives an error on compilers that predate NLL.
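A more everyday situation where NLL helps is mutating a collection after the last use of a shared borrow. This is a minimal sketch; the function name push_after_last_use and the values are assumptions chosen for illustration.

```rust
// Hypothetical helper: a shared borrow that ends at its last use,
// not at the end of the enclosing scope.
fn push_after_last_use() -> Vec<i32> {
    let mut scores = vec![1, 2, 3];
    let first = &scores[0]; // shared borrow of `scores` starts here
    let copy = *first;      // last use of `first`: the borrow is dead after this line
    scores.push(copy + 3);  // mutation is fine, no live borrow remains
    scores
}

fn main() {
    assert_eq!(push_after_last_use(), vec![1, 2, 3, 4]);
}
```

With purely scope-based lifetimes, first would be considered alive until the closing brace, and the push would be rejected.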
The following code will also compile, as all those references are not used after the respective borrows.
        fn main() {
            let mut x = 1;
A:|         let y = &mut x;
  | B:|     let z = &y;
  |   |
  |   |     x = 2;
  |   | }
And so does the following code:
        fn main() {
            let mut x = 1;
A:|         let y = &mut x;
  | B:|     let z = &y;
  |   |
  |   |     *y = 3;
  |   |     x = 2;
  |   | }
When x is being mutated, the borrow held by y can be considered “dead”, as it is not used any further. However, if you swap the last two statements of the above example, the compiler will throw the following error:
error[E0506]: cannot assign to `x` because it is borrowed
 --> src/main.rs:6:5
  |
3 |     let y = &mut x;
  |             ------ borrow of `x` occurs here
...
6 |     x = 2;
  |     ^^^^^ assignment to borrowed `x` occurs here
7 |     *y = 3;
  |     ------ borrow later used here
As you can see below (with our made-up annotations), while x is being mutated, the borrow held by y is still “alive”, as it is used later. x would be modified while a mutable borrow of it is still in use, hence the compiler cries foul.
        fn main() {
            let mut x = 1;
A:|         let y = &mut x;
  | B:|     let z = &y;
  |   |
⟶ |   |     x = 2;
  |   |     *y = 3;
  |   | }
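For contrast, here is the accepted ordering as a small runnable sketch with the final value checked. The function name mutate_in_order is hypothetical, and the unused z binding is left out to keep the example free of warnings.

```rust
// Hypothetical helper: the mutable borrow is used for the last time
// before `x` is assigned directly, so the two never overlap.
fn mutate_in_order() -> i32 {
    let mut x = 1;
    let y = &mut x;
    *y = 3; // last use of `y`: the mutable borrow ends here
    x = 2;  // `x` is free to be assigned again
    x
}

fn main() {
    assert_eq!(mutate_in_order(), 2);
}
```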
Using Lifetimes in Code
So, how does one use lifetimes in code? All this while we were looking at lifetimes conceptually (with our made-up annotations). The compiler was able to infer them for us. We didn’t have to do anything special in the code.
It turns out Rust does have an annotation for lifetimes that is part of its syntax. Lifetimes are denoted by a ' followed by an identifier, e.g. 'a.
Lifetimes can be considered in the same category (sort of) as traits, or more precisely as the generic parameters you would bound with a trait. I know this sounds confusing. Let’s look at some examples of where we explicitly need to use them.
Function Arguments:
fn some_function<'a, 'b>(x: &'a i32, y: &'b i32) -> &'b i32 {
    // blah blah
    y
}
In the above example, the two input arguments are references, and the output is also a reference. Here, we are saying that for some lifetime 'a (i.e. the lifetime of the borrow that the reference x is holding onto) and some lifetime 'b (the one that y is holding onto), the function returns a reference that is valid for as long as the input reference y. Consider it a contract that we make with the function. The compiler would throw an error if the function returned x instead.
'a and 'b are both “some lifetimes” that live longer than the stack frame of the function and are considered disjoint (i.e. they have no declared relation to each other), hence the compiler will throw an error if you were to return x. Functions are like execution boundaries; they don’t have a view of the whole code. The function has no idea how long 'a and 'b really are. All it knows is that they are unrelated and will last at least as long as the function execution.
I hope you now get why I said they are somewhat similar to traits: like generic type parameters, 'a and 'b are not concrete here. They are resolved at the time of the function invocation, i.e. at the call site.
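To make the call-site idea concrete, the sketch below calls one function with two different choices for 'b. The names pick_second and GLOBAL are assumptions made up for this illustration; the signature mirrors the one discussed above.

```rust
// Same shape as the function discussed above; the name is hypothetical.
fn pick_second<'a, 'b>(_x: &'a i32, y: &'b i32) -> &'b i32 {
    y
}

// A static lives for the whole program, so borrowing it yields a &'static i32.
static GLOBAL: i32 = 7;

fn main() {
    let local = 1;

    // Call site 1: 'b is resolved to 'static because `y` borrows GLOBAL.
    let from_static: &'static i32 = pick_second(&local, &GLOBAL);

    // Call site 2: 'b is resolved to the shorter lifetime of the borrow of `local`.
    let from_local: &i32 = pick_second(&GLOBAL, &local);

    assert_eq!(*from_static, 7);
    assert_eq!(*from_local, 1);
}
```

The function body never changes; only the lifetimes plugged in by each caller do.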
One might wonder what we achieve by using different lifetimes in the function arguments in the above example. Would it hurt if we used just one lifetime?
fn some_function<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    // blah blah
    y
}
Well, from only the function’s perspective, no. There’s nothing wrong with the above function. To understand this better, let’s write an example where we use this function.
fn some_function<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    // blah blah
    y
}
fn main() {
    let y = 2;
    let v = {
        let x = 1;
        some_function(&x, &y)
    };
}
... and here’s the compiler output:
error[E0597]: `x` does not live long enough
  --> src/main.rs:9:23
   |
7  |     let v = {
   |         - borrow later stored here
8  |         let x = 1;
9  |         some_function(&x, &y)
   |                       ^^ borrowed value does not live long enough
10 |     };
   |     - `x` dropped here while still borrowed
What happened? If you observe carefully, &y is passed into the function and you get &y back as the output; &x is just tagging along. y lives for the entirety of main(), so why is the compiler giving an error here? The culprit is the single lifetime shared by all arguments of the function. Since there is just one lifetime 'a, the compiler has to pick a single lifetime that fits &x, &y and the returned reference. The returned reference escapes the inner block (into v), so 'a has to extend beyond the block, and &x is forced to be valid for that long too. The longer lifetime gets picked and the shorter one is “forcefully dragged” along, which results in an error in this scenario, because x is dropped at the end of the block.
If you make the inner block not return any value (by turning the expression into a statement, let z = some_function(&x, &y);), then the error vanishes. The function still unifies the lifetimes, but the unified lifetime does not need to extend past the inner block, since the returned reference stays within it.
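A runnable variation on this idea is sketched below. It differs slightly from the statement-only version: the block hands back a copy of the integer rather than nothing, so there is a value to check, but no reference leaves the block. The wrapper name copy_inside_block is made up for the example.

```rust
// Single-lifetime version of the function, as in the failing example.
fn some_function<'a>(_x: &'a i32, y: &'a i32) -> &'a i32 {
    y
}

// Hypothetical wrapper: the returned reference never leaves the inner block.
fn copy_inside_block() -> i32 {
    let y = 2;
    let v = {
        let x = 1;
        let z = some_function(&x, &y); // `z` stays inside the block
        *z // copy the i32 out; no reference escapes
    };
    v
}

fn main() {
    assert_eq!(copy_inside_block(), 2);
}
```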
We want the returned reference to come out of the inner block. So what’s the solution? The solution is to use different lifetimes:
fn some_function<'a, 'b>(x: &'a i32, y: &'b i32) -> &'b i32 {
    // blah blah
    y
}
If you substitute the same-lifetime function with the above function, the compiler is happy! The lifetimes here are unrelated, so they are not “unified”: only 'b has to outlive the inner block.
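Put together as a complete program, the two-lifetime version looks roughly like this. It is the same main() as in the failing example, with an assertion added for illustration.

```rust
// Two unrelated lifetimes: the output is tied to `y` only.
fn some_function<'a, 'b>(_x: &'a i32, y: &'b i32) -> &'b i32 {
    y
}

fn main() {
    let y = 2;
    let v = {
        let x = 1;
        some_function(&x, &y) // only 'b flows into `v`; the borrow of `x` ends here
    };
    assert_eq!(*v, 2);
}
```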
So whether to use a single lifetime or multiple lifetimes on function arguments depends largely on how the function interacts with the surrounding code.
Type Constructors
User-defined types like structs and enums have to be annotated with lifetime parameters whenever they hold references; for tuples and vectors, the lifetime shows up in the element type (e.g. Vec<&'a i32>).
struct A<'a> {
    i: &'a i32,
}
This is similar to how we parameterize a struct over a generic type T. The above definition states that struct A holds a reference that is valid for some lifetime 'a, and that struct A cannot outlive the value it is borrowing, i.e. 'a has to last at least as long as the A value is around.
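Here is a small sketch of the struct in use. The wrap function is a hypothetical constructor added for illustration; it is not part of the original definition.

```rust
struct A<'a> {
    i: &'a i32,
}

// Hypothetical constructor: the returned `A` borrows from `value`,
// so it is only usable while the referent of `value` is alive.
fn wrap<'a>(value: &'a i32) -> A<'a> {
    A { i: value }
}

fn main() {
    let value = 5;
    let a = wrap(&value); // `a` must not outlive `value`
    assert_eq!(*a.i, 5);

    // Declaring `value` in an inner block that ends before `a` is
    // last used would be rejected with E0597.
}
```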
Inferred Lifetimes
The compiler is able to infer lifetimes in the following cases, where there is one input reference argument and a return value that is, or contains, a reference.
fn some_function(x: &i32) -> &i32 {
    // blah blah
    x
}
or
fn some_function(x: &i32) -> A {
    // blah blah
    A {
        i: x,
    }
}
This is called lifetime elision. The programmer need not explicitly state the lifetimes; the compiler will deduce them on its own, as if the two signatures had been written like this:
fn some_function<'a>(x: &'a i32) -> &'a i32
fn some_function<'a>(x: &'a i32) -> A<'a>
If there is only one input lifetime, then that same lifetime will be applied to the output.
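The sketch below puts an elided signature next to its explicit equivalent, and adds a two-input function where elision no longer applies. All three function names are made up for this illustration.

```rust
// Elided form: one input reference, so the output borrows from it.
fn first_elided(items: &[i32]) -> &i32 {
    &items[0]
}

// The same signature with the lifetime written out.
fn first_explicit<'a>(items: &'a [i32]) -> &'a i32 {
    &items[0]
}

// With two input references the output lifetime is ambiguous, so the
// compiler asks for an explicit annotation instead of guessing.
fn larger<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if x > y { x } else { y }
}

fn main() {
    let items = [4, 5, 6];
    assert_eq!(first_elided(&items), first_explicit(&items));
    assert_eq!(*larger(&3, &8), 8);
}
```

Removing <'a> and the 'a annotations from larger makes the compiler report a missing lifetime specifier, since it cannot tell which input the output borrows from.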
Hopefully we now know more about what a lifetime is. Although lifetimes seem a little hard to grasp at the beginning, they make us rethink and introspect the way we write code. Remember, there’s no garbage collector, yet the compiler does its best at inferring lifetimes and helping us out in many cases.