This is the second part of the Learn You a Rust for Great Good! tutorial explainer series. If you’re coming to the series for the first time, I recommend starting at the first post linked above.
Last time, we covered the topic of ownership, and how Rust uses it to make really cool things happen such as complete memory
safety. However, there’s a small problem with ownership, and I’ll demonstrate it with the same code snippet as last time:
fn do_stuff_with_data(
    a: ImportantData,
    b: ImportantData,
    c: ImportantData,
) -> (ImportantData, ImportantData, ImportantData, ImportantData) {
    /* does magic - let's say, adds them up and returns the result */
}

fn main() {
    let (a, b, c) = (ImportantData(0, 1), ImportantData(1, 2), ImportantData(2, 3));
    let (result, d, e, f) = do_stuff_with_data(a, b, c);
}
Here, we want to have a do_stuff_with_data function that takes three ImportantDatas and returns one. However, with Rust’s
ownership system, we end up having to return four ImportantDatas: our three inputs, and one output - else the do_stuff_with_data
function will consume our input values. Remember last time, where we moved a value, making it no longer accessible by its original
name? The same thing happens here, except if we don’t return the ImportantDatas we put in and move them back, they just die
when do_stuff_with_data finishes doing its thing.
We need a way to have a reference to data. To look at it, perhaps to edit it, but not to own it.
Introducing borrows
Thankfully, Rust’s got our back. Rust provides two sorts of borrows that let you take a reference to a piece of data, whilst
still making sure that nothing bad happens to it. (How? Stay tuned, and we’ll tell you later on in the series in the lifetimes section.)
These two types are &T and &mut T. I’m using the letter T here to mean “any type” - an integer, string, ImportantData, you name it.
(You’ll encounter this pattern later with generics a lot.) The & operator means “reference to”, and the mut part means “mutable” -
that is, editable. A borrow is created by using one or both of these operators - for example, borrowing variable as &variable.
Let’s have a look at how we can use these to improve our code.
fn do_stuff_with_data(a: &ImportantData, b: &ImportantData, c: &mut ImportantData) -> ImportantData {
    /* does magic - let's say, adds them up and returns the result */
}

fn main() {
    let (a, b) = (ImportantData(0, 1), ImportantData(1, 2));
    let mut c = ImportantData(2, 3);
    let result = do_stuff_with_data(&a, &b, &mut c);
}
Ah, much nicer - now do_stuff_with_data only has one output. It borrows a and b, and mutably borrows c, does computations
with them, and returns an owned value - the ImportantData - that is moved back into main(). (It might accomplish this
by returning something like ImportantData(a.0 + b.0, c.0).)
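To make that concrete, here is a minimal runnable sketch of the borrowing version. The post never shows what ImportantData contains, so the tuple struct of two integers, the derives, and the `c.0 += 1` edit are all assumptions made purely for illustration.

```rust
// Hypothetical definition: a tuple struct of two integers is assumed here.
#[derive(Debug, PartialEq)]
struct ImportantData(i32, i32);

// Borrows `a` and `b` immutably and `c` mutably, then returns a new owned value.
fn do_stuff_with_data(a: &ImportantData, b: &ImportantData, c: &mut ImportantData) -> ImportantData {
    c.0 += 1; // allowed, because `c` is a mutable borrow (an assumed edit, for illustration)
    ImportantData(a.0 + b.0, c.0)
}

fn main() {
    let (a, b) = (ImportantData(0, 1), ImportantData(1, 2));
    let mut c = ImportantData(2, 3);
    let result = do_stuff_with_data(&a, &b, &mut c);
    // `a`, `b` and `c` are all still usable here, since they were only borrowed.
    println!("{:?} {:?} {:?} {:?}", a, b, c, result);
}
```

Note that main() never has to take its inputs back: they were never given away in the first place.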
It’s important to note the difference between a &T and a &mut T. Consider the following:
fn add_4_to_vec(vec: &Vec<i32>) {
    /* `Vec` is an 'array' type, short for 'vector' */
    vec.push(4);
}

fn main() {
    let mut my_little_vector: Vec<i32> = vec![1, 2, 3];
    add_4_to_vec(&my_little_vector);
}
Here, Rust is telling us that what we are trying to do is strictly verboten. We’ve only been handed an immutable borrow -
a &T - yet we somehow are trying to edit the thing being borrowed. If you could do this, all sorts of Bad Things™ would happen.
(Also note that Rust has told us about one new piece of syntax, the magic dereferencing asterisk, used to mean “thing behind
this borrow”.)
Changing the code to use a &mut T - a mutable borrow - works.
fn add_4_to_vec(vec: &mut Vec<i32>) {
    /* `Vec` is an 'array' type, short for 'vector' */
    vec.push(4);
}

fn main() {
    let mut my_little_vector: Vec<i32> = vec![1, 2, 3];
    add_4_to_vec(&mut my_little_vector);
}
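The dereferencing asterisk mentioned earlier is easiest to see with a plain integer. The sketch below is only an illustration: add_one is a hypothetical helper that does not appear anywhere else in this series.

```rust
// Hypothetical helper: adds one to the integer behind a mutable borrow.
fn add_one(n: &mut i32) {
    *n += 1; // `*n` means "the i32 behind the borrow `n`"
}

fn add_4_to_vec(vec: &mut Vec<i32>) {
    vec.push(4); // method calls look through the borrow for us, so no `*` is needed
}

fn main() {
    let mut number = 41;
    add_one(&mut number);

    let mut my_little_vector = vec![1, 2, 3];
    add_4_to_vec(&mut my_little_vector);

    println!("{} {:?}", number, my_little_vector);
}
```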
Sound good? You should also note that moving stuff out of a borrow is not allowed.
fail.rs:6:13: 6:15 error: cannot move out of borrowed content [E0507]
fail.rs:6     let z = *y;
                      ^~
fail.rs:6:9: 6:10 note: attempting to move value to here
fail.rs:6     let z = *y;
                  ^
What, did you want to leave poor x with no data? You’ve only borrowed x - not
taken full ownership of it. Thus, moving as we would with ownership is strictly
verboten.
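Here is a small sketch of the same situation, with the offending line commented out so that the rest compiles. The Data type and the peek function are hypothetical stand-ins, not the code that produced the error above.

```rust
// Hypothetical type, assumed for illustration.
#[derive(Debug, PartialEq)]
struct Data(i32);

// Hypothetical helper: looks at the data behind a borrow without taking it.
fn peek(y: &Data) -> i32 {
    // let z = *y; // rejected: this would move the `Data` out from behind the borrow
    let z = y; // fine: `z` is just another immutable borrow of the same `Data`
    z.0 // reading through the borrow is allowed
}

fn main() {
    let x = Data(7);
    let value = peek(&x);
    println!("{} {:?}", value, x); // `x` still has its data
}
```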
A small note about Copy
There’s one small detail about some types in Rust that we didn’t cover last time. Ever try to do this…
fn add_two(a: i32, b: i32) -> i32 {
    a + b
    /* note: Rust's expression-based nature means you can leave
       an expression on the last line of a function (WITHOUT semicolon)
       to be a return value */
}

fn main() {
    let a = 3;
    let b = 4;
    let c = add_two(a, b);
    println!("{}, {}, {}", a, b, c); /* prints "3, 4, 7" */
}
…and wonder why add_two didn’t gobble up a and b? No? Well, it’s rather interesting, so let me tell you about it.
To prevent development with ownership being a massive pain, some types are labeled with a trait called Copy. (You don’t have
to worry about traits for now, but here’s the relevant section of the Rust Book if you’re interested.) What does this mean? Well, let’s go to the excellent standard library documentation to find out:
Types that can be copied by simply copying bits (i.e. memcpy).
By default, variable bindings have 'move semantics.' In other words:
#[derive(Debug)]
struct Foo;

let x = Foo;

let y = x;

// `x` has moved into `y`, and so cannot be used

// println!("{:?}", x); // error: use of moved value
However, if a type implements Copy, it instead has 'copy semantics':
// we can just derive a `Copy` implementation
#[derive(Debug, Copy, Clone)]
struct Foo;

let x = Foo;

let y = x;

// `y` is a copy of `x`

println!("{:?}", x); // A-OK!
It's important to note that in these two examples, the only difference is if you are allowed to access x after the assignment: a move is also a bitwise copy under the hood.
Basically, what this is saying is that Copy makes variables imbued with it have special copy semantics - that is, instead of
moving them about everywhere and worrying about ownership, we simply just copy the 1s and 0s that make up the variable
to a different place, where another variable can use them. Sharing is caring!
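The same contrast shows up when calling functions. In this sketch, Point and Label are hypothetical types invented for illustration: one opts in to Copy, the other cannot, because it holds a String.

```rust
// Hypothetical types, assumed for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point(i32, i32); // opts in to copy semantics

#[derive(Debug, PartialEq)]
struct Label(String); // `String` is not `Copy`, so `Label` cannot be either

fn take_point(p: Point) -> i32 {
    p.0 + p.1
}

fn take_label(l: Label) -> usize {
    l.0.len()
}

fn main() {
    let p = Point(3, 4);
    let sum = take_point(p);
    println!("{:?} {}", p, sum); // `p` is still usable: the function got a copy

    let l = Label(String::from("hello"));
    let len = take_label(l);
    // println!("{:?}", l); // rejected: `l` was moved into `take_label`
    println!("{}", len);
}
```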
An even smaller note about mutability
We’ve now encountered two ways something can be mutable - editable - in Rust: a mut variable binding (variable binding is a fancy term for “name we use for a variable”), as in let mut x = 1,
and a &mut borrow, as in &mut x. It’s important to distinguish between the two.
fn main() {
    let a = 1; /* immutable */
    let b = &mut a; /* impossible, as `a` is immutable */
    let b2 = &a; /* immutable variable binding to immutable borrow */
    let mut c = 2; /* mutable variable binding */
    let mut c2 = 4; /* another mutable variable binding */
    let d = &mut c; /* immutable variable binding to mutable borrow */
    let mut e = &mut c; /* mutable variable binding to mutable borrow */
    *e = 3; /* works, as c - the thing borrowed by e - is mutable */
    e = &mut c2; /* works, as e was mutable (the new value must also be a `&mut`,
                    to match the type of `e`) */
    d = &mut c2; /* doesn't work, even though we can mutably borrow `c2`,
                    as `d` is an immutable variable binding */
}
Your head probably hurts after reading that. Don’t worry, it’s not as hard as it looks here.
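If it helps, here is a version that actually compiles, with the one forbidden line commented out. It is only a sketch: the demo function and the variable named other are made up for this illustration.

```rust
// Hypothetical demo function: returns the final values of both variables.
fn demo() -> (i32, i32) {
    let mut c = 2; // mutable variable binding
    let mut other = 10; // a second mutable variable binding

    {
        let d = &mut c; // immutable binding to a mutable borrow
        *d += 1; // fine: we edit `c` through `d`
        // d = &mut other; // rejected: `d` itself cannot be pointed elsewhere
    }

    {
        let mut e = &mut c; // mutable binding to a mutable borrow
        *e += 1; // fine: edits `c` again
        e = &mut other; // fine: `e` itself can be pointed at something else
        *e += 1; // this now edits `other`, not `c`
    }

    (c, other)
}

fn main() {
    println!("{:?}", demo());
}
```

The mut on the binding controls whether the name can be pointed at something else; the mut in &mut controls whether the thing behind the borrow can be edited.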
The problem with borrowing
Okay. I’ll admit something to you: when I initially told you about mutable references (&mut), I made them seem simpler than they actually are.
With ownership, we were always sure that data was exclusively owned by one
variable, and it could only be edited through use of that variable. To edit it, you’d either need to change it through the original variable or move the data elsewhere.
However, with borrows, we can have many references to some data. What if one is
reading whilst another is writing? We now have the possibility of a data race
occurring. Here’s the definition:
There is a ‘data race’ when two or more pointers access the same memory location at the same time, where at least one of them is writing, and the operations are not synchronized.
The bad things that happen under the umbrella term data race include: two bits
of code editing data at the same time, resulting in mangled data; something reading
a piece of data being written, resulting in the reader getting garbage - and various other nefarious events. So how do we fix it?
Enter the borrowing rules
Rust is clever, and it has a set of rules to deal with this exact problem. Here they are, straight out of the Rust Book:
Rule one: any borrow must last for a scope no greater than that of the owner
Basically, “a borrow can’t outlive the thing it’s borrowing”. It’s hopefully obvious why:
a borrow to nonexistent content doesn’t make any sense, and trying to read or write
it will do something - it’ll try and read from where the thing it borrowed once
was - but you have no idea what will happen.
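A sketch of rule one in action, under the assumption that a plain integer stands in for "the thing being borrowed". The working half compiles as written; the forbidden half is left in comments, because Rust refuses to compile it.

```rust
// Hypothetical demo function, invented for illustration.
fn borrow_within_owner_scope() -> i32 {
    let owner = 5; // the owner lives until the end of the function
    let result;
    {
        let borrow = &owner; // the borrow lives only inside this block: fine
        result = *borrow + 1;
    }

    // The reverse arrangement is rejected:
    // let dangling;
    // {
    //     let short_lived = 5;
    //     dangling = &short_lived; // `short_lived` dies at the closing brace...
    // }
    // println!("{}", dangling); // ...but the borrow would still be used here

    result
}

fn main() {
    println!("{}", borrow_within_owner_scope());
}
```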
Rule two: you may have EITHER 1+ immutable borrows OR EXACTLY 1 mutable borrow
This one prevents data races. It dictates that, whilst you have any immutable (&T)
borrows, you’re not allowed any mutable (&mut T) borrows - to prevent the problem
of something reading while something else is writing, or code getting lost
because a piece of data it depended on changed in an unexpected way.
It also says that you can’t have two or more mutable borrows, to prevent the problem
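of two things overwriting the same piece of data.
Take this code, for example:
fn main() {
    let mut x = 5;
    let y = &mut x;
    *y += 1;
    println!("{}", x);
}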
error: cannot borrow `x` as immutable because it is also borrowed as mutable
    println!("{}", x);
                   ^
Let’s walk through how this is bad:
When y was created, we gave it a &mut x - a mutable borrow of x.
println!(), a macro to print stuff out to the console, needs to read x. Since
it needs to read x only, it takes an immutable borrow (as most things in Rust do).
Rust looks through this code, and sees println!() trying to immutably borrow x. Since y’s still around - a mutable borrow - it gets annoyed and blows up in our face.
You’ll actually notice that Rust will be helpful to you if you encounter such an issue, and tell you where the borrow ends. In this case, it’s right at the end of main(). Dang.
note: previous borrow ends here
fn main() {
}
^
The solution? Again, pilfered from the very hands of the Rust Book’s author:
fn main() {
    let mut x = 5;

    { /* ooh look, braces! */
        let y = &mut x; // -+ &mut borrow starts here
        *y += 1;        //  |
    }                   // -+ ... and ends here

    println!("{}", x);  // <- try to borrow x here
}
The braces create a new scope - a new block of code for stuff to occur in.
These help us express the notion that we only want y to live so we can add 1 to it, and after that we’re done.
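Here is a runnable sketch of rule two built around the same idea. The bump function is hypothetical. One caveat: newer Rust compilers end a borrow at its last use rather than at the closing brace, so they are more lenient than the error above suggests. For that reason, the rejected read in this sketch sits before the last use of y, which is refused either way.

```rust
// Hypothetical demo function: adds one to `x` through a scoped mutable borrow.
fn bump(mut x: i32) -> i32 {
    {
        let y = &mut x; // the one and only mutable borrow
        // println!("{}", x); // rejected: `x` would be read while `y` is still in use below
        *y += 1;
    } // `y` is gone, so the mutable borrow has ended

    let r1 = &x; // any number of immutable borrows may now coexist
    let r2 = &x;
    assert_eq!(r1, r2);

    x
}

fn main() {
    println!("{}", bump(5));
}
```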
Here’s a more sneaky example - one that I’ve encountered myself, and one that
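took me a while to figure out. I post it here, so that you may not be as stupid.
struct Data(i32);

fn main() {
    let mut vec1: Vec<&Data> = Vec::new();
    let mut vec2: Vec<Data> = vec![Data(0), Data(1), Data(2)];

    for data in &vec2 {
        vec1.push(data);
    }
}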
What’s wrong here? This code is totally fine! I wrote it, dammit, I know what’s
going on!
Nope.
fail.rs:8:18: 8:22 error: `vec2` does not live long enough
fail.rs:8     for data in &vec2 {
                           ^~~~
What’s going wrong here is more subtle. To solve it, you have to know this one fact:
Rust gets rid of data in reverse order to when it was created. This is called
last in, first out (LIFO).
Let’s break the problem down.
We have vec1, a vector of borrows to some piece of Data. (You might be asking
yourself: where does the Data come from?)
We then have vec2, a vector of Data. So this is where it comes from.
We borrow every piece of Data in vec2 with a for loop, and store these
borrows in vec1.
We’re now at the end of main(). Rust comes along and starts destroying stuff.
Its first target is vec2, because it was created last. Bonk RIP vec2.
Now, we have a problem. vec1 is full of borrows to stuff in vec2 - which
just got bonked on the head. This breaks rule one, as a borrow is outliving
something it borrowed.
Rust even told us that at the start. Sigh. The solution is simple: reverse the order!
struct Data(i32);

fn main() {
    let mut vec2: Vec<Data> = vec![Data(0), Data(1), Data(2)];
    let mut vec1: Vec<&Data> = Vec::new();

    for data in &vec2 {
        vec1.push(data);
    }
    /* vec1 dies first */
}
Everything works, and all is well with the world.
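If you’d like to see the reverse-order destruction for yourself, here is a small sketch. Noisy is a hypothetical helper type that writes its name into a shared log when it is destroyed. It leans on a few things this series hasn’t covered yet (the Drop trait, RefCell, and a lifetime annotation), so treat it as a demonstration rather than something you need to understand line by line.

```rust
use std::cell::RefCell;

// Hypothetical helper type: records its name in a shared log when it is destroyed.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Creates two values in order and returns the order in which they were destroyed.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _first = Noisy { name: "first", log: &log };
        let _second = Noisy { name: "second", log: &log };
    } // `_second` is destroyed here, then `_first`
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order());
}
```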
Sidenote: iterating over vectors
This stymied me when working on this part.
struct Data(i32);

fn main() {
    let mut vec2: Vec<Data> = vec![Data(0), Data(1), Data(2)];
    let mut vec1: Vec<&Data> = Vec::new();
    /*
     ??? what goes here ???
     */
}
What I would like to do, as above, is fill vec1 with a bunch of &Data. This
was my first instinct:
for ref data in vec2 {
    vec1.push(data);
}
The ref keyword, along with its friend ref mut, is used to borrow a value when
used as part of a pattern. It
desugars to:
This won’t work, as we move the data out of the vector. The reference we create
is thus only usable for the scope that the data is alive in - which is the curly
braces of our for loop. We get this error:
fail.rs:7:9: 7:17 error: borrowed value does not live long enough
fail.rs:7     for ref data in vec2 {
                  ^~~~~~~~
Note how this is different from the error we got in the last section, and indicates
the problem we just talked about instead of the last section’s problem.
…It’s all very subtle.
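To see what ref does on its own, here is a sketch with hypothetical helper functions. The first shows that a ref pattern and a & on the right-hand side produce the same borrow. The second shows that for ref is perfectly usable, as long as the borrow never has to leave the loop body.

```rust
// Hypothetical type, assumed for illustration.
struct Data(i32);

// Returns true if both ways of borrowing point at the same value.
fn ref_matches_ampersand() -> bool {
    let value = Data(1);
    let ref a = value; // `ref` in a pattern borrows...
    let b = &value; // ...just like `&` on the right-hand side
    std::ptr::eq(a, b) // both are `&Data` pointing at the same place
}

// Consumes the vector, borrowing each element only for one iteration.
fn sum_with_ref(items: Vec<Data>) -> i32 {
    let mut total = 0;
    for ref data in items {
        // `data` is a `&Data`, but it borrows an element that the loop owns,
        // so it cannot be stored anywhere that outlives this iteration.
        total += data.0;
    }
    total
}

fn main() {
    println!("{}", ref_matches_ampersand());
    println!("{}", sum_with_ref(vec![Data(1), Data(2), Data(3)]));
}
```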
The proper way to do it is written in the last section:
for data in &vec2 {
    vec1.push(data);
}
Here, we immutably borrow the vec2. Therefore, the most it can do is give us &Ts - immutable
borrows - because it itself is immutably borrowed; moving stuff out of it would be
impossible, because it’s a borrow.
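Wrapped up as a function, the working approach might look like the sketch below. The borrow_all name and the derives on Data are assumptions of mine, added for illustration.

```rust
// Same shape as the `Data` above; the derives are added so it can be printed and compared.
#[derive(Debug, PartialEq)]
struct Data(i32);

// Hypothetical helper: collects an immutable borrow of every element.
fn borrow_all(source: &Vec<Data>) -> Vec<&Data> {
    let mut borrows = Vec::new();
    for data in source {
        // `source` is a `&Vec<Data>`, so each `data` is a `&Data`
        borrows.push(data);
    }
    borrows
}

fn main() {
    let vec2 = vec![Data(0), Data(1), Data(2)];
    let vec1 = borrow_all(&vec2);
    println!("{:?}", vec1);
    println!("{:?}", vec2); // still usable: nothing was moved out of it
}
```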
Thanks for reading this second post in the series! I initially wanted to get to
explaining lifetimes, but since explaining all the nuances of borrowing in the detail
I wanted to give took so much time, I’ll have to get to it next time.
You may have noticed the excessive references to the Rust Book - I highly recommend you also read that, if you haven’t already.