Effectively Using Iterators In Rust
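iter() iterates over the items by reference. into_iter() iterates over the items, moving them into the new scope. iter_mut() iterates over the items, giving a mutable reference to each one. So for x in my_vec { ... } is essentially equivalent to my_vec.into_iter().for_each(|x| ... ): both move the elements of my_vec into the ... scope.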
fn main() {
    // Vec example
    let vec1 = vec![1, 2, 3];
    let vec2 = vec![4, 5, 6];

    // `iter()` for vecs yields `&i32`. Destructure to `i32`.
    println!("2 in vec1: {}", vec1.iter().any(|&x| x == 2));
    // `into_iter()` for vecs yields `i32`. No destructuring required.
    println!("2 in vec2: {}", vec2.into_iter().any(|x| x == 2));

    // Array example
    let array1 = [1, 2, 3];
    let array2 = [4, 5, 6];

    // `iter()` for arrays yields `&i32`.
    println!("2 in array1: {}", array1.iter().any(|&x| x == 2));
    // `into_iter()` for arrays yields `i32` in the 2021 edition and later
    // (older editions unusually yielded `&i32`).
    println!("2 in array2: {}", array2.into_iter().any(|x| x == 2));

    let args = vec![
        "binary-name",
        "--exec-file",
        "foo",
        "--api-sock",
        "bar",
        "--id",
        "foobar",
        "--seccomp-level",
        "0",
        "--",
        "--extra-flag",
    ]
    .into_iter()
    .map(String::from)
    .collect::<Vec<String>>();
}
2 in vec1: true
2 in vec2: false
2 in array1: true
2 in array2: false
In Rust, you quickly learn that vector and slice types are not iterable themselves. Depending on which tutorial or example you see first, you call .iter() or .into_iter(). If you do not realize both of these functions exist or that they do different things, you may find yourself fighting with the compiler to get your code to work. Let us take a journey through the world of iterators and figure out the differences between iter() and into_iter() in Rust.
Iter
Most examples I have found use .iter(). We can call v.iter() on something like a vector or slice. This creates an Iter<'a, T> type and it is this Iter<'a, T> type that implements the Iterator trait and allows us to call functions like .map(). It is important to note that this Iter<'a, T> type only has a reference to T. This means that calling v.iter() will create a struct that borrows from v. Use the iter() function if you want to iterate over the values by reference.
Let us write a simple map/reduce example:
fn use_names_for_something_else(_names: Vec<&str>) {}

fn main() {
    let names = vec!["Jane", "Jill", "Jack", "John"];
    let total_bytes = names
        .iter()
        .map(|name: &&str| name.len())
        .fold(0, |acc, len| acc + len);
    assert_eq!(total_bytes, 16);
    use_names_for_something_else(names);
}
In this example, we are using .map() and .fold() to count the number of bytes (not characters! Rust strings are UTF-8) for all strings in the names vector. We know that the len() function can use an immutable reference. As such, we prefer iter() instead of iter_mut() or into_iter(). This allows us to move the names vector later if we want. I put a bogus use_names_for_something_else() function in the example just to prove this. If we had used into_iter() instead, the compiler would have rejected the later call with a "use of moved value: names" error.
The closure used in map() does not require the name parameter to have a type, but I specified the type to show how it is being passed as a reference. Notice that the type of name is &&str and not &str. The string "Jane" is of type &str. The iter() function creates an iterator that has a reference to each element in the names vector. Thus, we have a reference to a reference of a string slice. This can get a little unwieldy and I generally do not worry about the type. However, if we are destructuring the type, we do need to specify the reference:
fn main() {
    let player_scores = [
        ("Jack", 20), ("Jane", 23), ("Jill", 18), ("John", 19),
    ];
    let players = player_scores
        .iter()
        .map(|(player, _score)| {
            player
        })
        .collect::<Vec<_>>();
    assert_eq!(players, ["Jack", "Jane", "Jill", "John"]);
}
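In the above example, older compilers complain that we are specifying the pattern (_, _) instead of &(_, _). Since Rust 1.26 the pattern itself is accepted, but player is then bound as &&str, so the assert_eq! comparison against &str values fails to type-check instead. Either way, changing the pattern to &(player, _score) will satisfy the compiler.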
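As a minimal sketch of that fix, the version below destructures through the reference with &(player, _score), so player is a plain &str. The helper name player_names is hypothetical and exists only to make the example easy to call.

```rust
// Hypothetical helper: collects the player names out of (name, score) pairs.
fn player_names<'a>(scores: &[(&'a str, i32)]) -> Vec<&'a str> {
    scores
        .iter()
        // `iter()` yields `&(&str, i32)`; the leading `&` in the pattern
        // destructures through that reference, so `player` is `&str`.
        .map(|&(player, _score)| player)
        .collect()
}

fn main() {
    let player_scores = [("Jack", 20), ("Jane", 23), ("Jill", 18), ("John", 19)];
    let players = player_names(&player_scores);
    assert_eq!(players, ["Jack", "Jane", "Jill", "John"]);
}
```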
Rust is immutable by default and iterators make it easy to manipulate data without needing mutability. If you do find yourself wanting to mutate some data, you can use the iter_mut() method to get a mutable reference to the values. Example use of iter_mut():
fn main() {
    let mut teams = [
        [ ("Jack", 20), ("Jane", 23), ("Jill", 18), ("John", 19), ],
        [ ("Bill", 17), ("Brenda", 16), ("Brad", 18), ("Barbara", 17), ]
    ];
    let teams_in_score_order = teams
        .iter_mut()
        .map(|team| {
            team.sort_by(|&a, &b| a.1.cmp(&b.1).reverse());
            team
        })
        .collect::<Vec<_>>();
    println!("Teams: {:?}", teams_in_score_order);
}
Here we are using a mutable reference to sort the list of players on each team by highest score. The sort_by() function performs the sorting of the vector or slice in place. This means we need the ability to mutate team in order to sort. I do not use .iter_mut() often, but sometimes functions like .sort_by() provide no immutable alternative.
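For a smaller illustration of the same idea, here is a sketch that mutates values in place through iter_mut(). The function name add_bonus and the bonus value are made up for this example.

```rust
// Hypothetical helper: adds `bonus` to every score in place.
fn add_bonus(scores: &mut [(&str, i32)], bonus: i32) {
    // `iter_mut()` yields `&mut (&str, i32)`, so `score` binds as `&mut i32`
    // and we can write through it.
    for (_player, score) in scores.iter_mut() {
        *score += bonus;
    }
}

fn main() {
    let mut scores = [("Jack", 20), ("Jane", 23)];
    add_bonus(&mut scores, 5);
    assert_eq!(scores, [("Jack", 25), ("Jane", 28)]);
}
```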
I tend to use .iter() most. I try to be very conscious and deliberate about when I move resources and default to borrowing (or referencing) first. The reference created by .iter() is short-lived, so we can move or use our original value afterwards. If you find yourself running into "does not live long enough" errors, move errors, or calls to the .clone() function, this is a sign that you probably want to use .into_iter() instead.
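To make that last point concrete, here is a sketch of the kind of code where reaching for .clone() hints that into_iter() is the better fit. Both function names are hypothetical. They produce the same result, but the second one moves each String instead of copying it.

```rust
// Borrowing version: every `String` has to be cloned to get an owned value.
fn shout_by_cloning(names: &[String]) -> Vec<String> {
    names
        .iter()
        .map(|name| {
            let mut owned = name.clone();
            owned.push('!');
            owned
        })
        .collect()
}

// Owning version: `into_iter()` moves each `String` out of the vector,
// so nothing needs to be cloned.
fn shout_by_moving(names: Vec<String>) -> Vec<String> {
    names
        .into_iter()
        .map(|mut name| {
            name.push('!');
            name
        })
        .collect()
}

fn main() {
    let names = vec!["Jane".to_string(), "Jill".to_string()];
    assert_eq!(shout_by_cloning(&names), ["Jane!", "Jill!"]);
    // `names` is moved here and cannot be used afterwards.
    assert_eq!(shout_by_moving(names), ["Jane!", "Jill!"]);
}
```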
IntoIter
Use the into_iter() function when you want to move, instead of borrow, your value. The .into_iter() function creates an IntoIter<T> type that now has ownership of the original value. Like Iter<'a, T>, it is this IntoIter<T> type that actually implements the Iterator trait. The word into is commonly used in Rust to signal that T is being moved. The docs also use the words owned or consumed interchangeably with moved. I normally find myself using .into_iter() when I have a function that is transforming some values:
fn get_names(v: Vec<(String, usize)>) -> Vec<String> {
    v.into_iter()
        .map(|(name, _score)| name)
        .collect()
}

fn main() {
    let v = vec![("Herman".to_string(), 5)];
    let names = get_names(v);
    assert_eq!(names, ["Herman"]);
}
The get_names function is plucking out the name from a list of tuples. I chose .into_iter() here because we are transforming the tuple into a String type.
The concept behind .into_iter() is similar to the core::convert::Into trait we discussed when accepting &str and String in a function. In fact, every type that implements std::iter::Iterator implements std::iter::IntoIterator too. That means we can do something like vec![1, 2, 3, 4].into_iter().into_iter().into_iter(). Each subsequent call to .into_iter() just returns the iterator itself. This is an example of the identity function. I mention that only because I find it interesting to identify functional concepts that I see being used in the wild.
How for Loops Actually Work
One of the first errors a new Rustacean will run into is the move error after using a for loop:
fn main() {
    let values = vec![1, 2, 3, 4];
    for x in values {
        println!("{}", x);
    }
    let y = values; // move error
}
The question we immediately ask ourselves is "How do I create a for loop that uses a reference?". A for loop in Rust is really just syntactic sugar around .into_iter(). From the manual:
// Rough translation of the iteration without a `for` iterator.
let mut it = values.into_iter();
loop {
    match it.next() {
        Some(x) => println!("{}", x),
        None => break,
    }
}
Now that we know .into_iter() creates a type IntoIter<T> that moves T, this behavior makes perfect sense. If we want to use values after the for loop, we just need to use a reference instead:
fn main() {
    let values = vec![1, 2, 3, 4];
    for x in &values {
        println!("{}", x);
    }
    let y = values; // perfectly valid
}
Instead of moving values, which is type Vec<i32>, we are moving &values, which is type &Vec<i32>. The for loop only borrows values for the duration of the loop and we are able to move values as soon as the for loop is done.
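The same desugaring applies to the other loop forms. As a sketch: for x in &v goes through the IntoIterator impl for &Vec<T> (equivalent to v.iter()), for x in &mut v goes through the one for &mut Vec<T> (equivalent to v.iter_mut()), and for x in v consumes the vector. The function name below is hypothetical.

```rust
// Hypothetical helper that walks through all three loop forms.
fn double_then_sum(mut values: Vec<i32>) -> (i32, Vec<i32>) {
    // `&mut values` -> `values.iter_mut()`: `x` is `&mut i32`.
    for x in &mut values {
        *x *= 2;
    }

    // `&values` -> `values.iter()`: `x` is `&i32`, `values` is only borrowed.
    let mut total = 0;
    for x in &values {
        total += *x;
    }

    // `values` -> `values.into_iter()`: `x` is `i32`, `values` is moved.
    let mut owned = Vec::new();
    for x in values {
        owned.push(x);
    }
    (total, owned)
}

fn main() {
    let (total, doubled) = double_then_sum(vec![1, 2, 3, 4]);
    assert_eq!(total, 20);
    assert_eq!(doubled, [2, 4, 6, 8]);
}
```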
core::iter::Cloned
There are times when you want to create a new value when iterating over your original value. You might first try something like:
fn main() {
    let x = vec!["Jill", "Jack", "Jane", "John"];
    let _ = x
        .clone()
        .into_iter()
        .collect::<Vec<_>>();
}
Exercise for the reader: Why would .iter() not work in this example?
While this is valid, we want to give Rust every chance to optimize our code. What if we only wanted the first two names from that list?
fn main() {
    let x = vec!["Jill", "Jack", "Jane", "John"];
    let _ = x
        .clone()
        .into_iter()
        .take(2)
        .collect::<Vec<_>>();
}
If we clone all of x, then we are cloning all four elements, but we only need two of them. We can do better by using .map() to clone the elements of the underlying iterator:
fn main() {
    let x = vec!["Jill", "Jack", "Jane", "John"];
    let y = x
        .iter()
        .map(|i| i.clone())
        .take(2)
        .collect::<Vec<_>>();
}
Because iterator adaptors are lazy, this version clones only two of the four elements of x. This pattern is used so often that Rust core now has a special function that does this for us called cloned(). It has been stable since Rust 1.1. Our code now looks something like:
fn main() {
    let x = vec!["Jill", "Jack", "Jane", "John"];
    let y = x
        .iter()
        .cloned()
        .take(2)
        .collect::<Vec<_>>();
}
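To check the claim that only two elements get cloned, one option is a type whose Clone impl counts how often it runs. This is only a sketch: the Counted type, the CLONES counter, and the count_clones helper are all made up for the experiment, and a thread-local counter is used so the count is not shared between threads.

```rust
use std::cell::Cell;

thread_local! {
    // Counts how many times `Counted::clone` has run on this thread.
    static CLONES: Cell<usize> = Cell::new(0);
}

// Hypothetical wrapper type whose `Clone` impl records each call.
#[derive(Debug, PartialEq)]
struct Counted(&'static str);

impl Clone for Counted {
    fn clone(&self) -> Self {
        CLONES.with(|c| c.set(c.get() + 1));
        Counted(self.0)
    }
}

// Resets the counter, runs `f`, and returns its result with the clone count.
fn count_clones<T>(f: impl FnOnce() -> T) -> (T, usize) {
    CLONES.with(|c| c.set(0));
    let out = f();
    (out, CLONES.with(|c| c.get()))
}

fn main() {
    let x = vec![Counted("Jill"), Counted("Jack"), Counted("Jane"), Counted("John")];

    // Cloning the whole vector first clones all four elements.
    let (eager, n) = count_clones(|| x.clone().into_iter().take(2).collect::<Vec<_>>());
    assert_eq!((eager.len(), n), (2, 4));

    // `cloned()` is lazy, so `take(2)` stops it after two clones.
    let (lazy, n) = count_clones(|| x.iter().cloned().take(2).collect::<Vec<_>>());
    assert_eq!(lazy, [Counted("Jill"), Counted("Jack")]);
    assert_eq!(n, 2);
}
```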
Iterators Outside of Core
There is a really great crate, called itertools, that provides extra iterator adaptors, iterator methods and macros. If you are looking for some iterator functionality in the Rust docs and do not see it, there is a good chance it is part of itertools. I recently added an itertools::Itertools::sort_by() function (available as sorted_by() in current releases) so we can sort collections without needing to use a mutable iterator. One of the nice things about working with Rust is that the documentation looks the same across all these crates. The documentation for itertools looks the same as the documentation for the Rust std library.