push_back vs. emplace_back
What are the differences between push_back and emplace_back?
Intro
Let's see an example.
push_back
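Let us start with C++98.
Suppose there is a class A, and we want to use a vector to store some instances of class A.
#include <cstdio>   // puts
#include <cstring>  // memcpy
#include <vector>
using namespace std;
class A {
protected:
    int *ptr;
    int size;
public:
    A(int n = 16) {
        ptr = new int[n];
        size = n;
        puts("A(int)");
    }
    A(const A &a) {
        size = a.size;
        ptr = new int[size];
        memcpy(ptr, a.ptr, size * sizeof(int));
        puts("A(const A&)");
    }
    virtual ~A() {
        delete[] ptr;  // deleting a null pointer is a no-op; nullptr itself does not exist in C++98
    }
};
int main() {
    vector<A> vec;
    vec.push_back(A(10));  // driver assumed from the output below: a temporary A is pushed
}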
Compile this code with the command:
clang++ -std=c++98 push_back.cpp
And it will output:
A(int)
A(const A&)
We can see that storing an instance of A in the vector constructs at least two instances: a temporary, and the copy that lives in the vector's heap-allocated storage.
emplace_back is meant to optimize this case by avoiding the temporary instance and the copy made from it.
emplace_back
In the class template vector, push_back is declared as:
void push_back (const value_type& val);  // since C++98
void push_back (value_type&& val);       // since C++11, where && denotes an rvalue reference
However, emplace_back is declared as:
template <class... Args>
void emplace_back (Args&&... args);       // C++11 and C++14, where && denotes a universal reference, explained later
template <class... Args>
reference emplace_back (Args&&... args);  // since C++17, returns a reference to the new element
The parameters of emplace_back are variadic, which is loosely similar to printf: any number of arguments can be passed. What's more, each Args&& is a universal reference, not a plain rvalue reference. The difference between the two is explained later.
Since C++11, the C++ standard has included "move semantics" and "perfect forwarding", along with a new kind of constructor called the "move constructor".
#include <cstdio>   // puts
#include <cstring>  // memcpy
#include <vector>
using namespace std;
class A {
protected:
int *ptr;
int size;
public:
A(int n = 16) {
ptr = new int[n];
size = n;
puts("A(int)");
}
A(const A &a) {
size = a.size;
ptr = new int[size];
memcpy(ptr, a.ptr, size * sizeof(int));
puts("A(const A&)");
}
A(A &&a) {
size = a.size;
ptr = a.ptr;
a.ptr = nullptr;
puts("A(const A&&)");
}
virtual ~A() {
if (ptr != nullptr)
delete[] ptr;
}
};
int main() {
vector<A> vec;
vec.emplace_back(10);
}
Compile it with clang++ -std=c++17 push.cpp. The program will then output:
A(int)
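Now we can see the difference between push_back and emplace_back: emplace_back(10) forwards its argument to the constructor of A and builds the element directly in the vector's storage, so no temporary is created and no copy constructor runs.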
What will happen if we call emplace_back(A(10))? Actually, it will output:
A(int)
A(const A&&)
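So a temporary A is still constructed here, but it is moved into the vector instead of being copied: the move constructor only takes over the pointer, so no second buffer is allocated. (The label "A(const A&&)" is what the move constructor A(A &&a) prints.) Only passing the constructor arguments directly, as in emplace_back(10), avoids the temporary altogether.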
In the next sections, we will explain what a "universal reference" is, and introduce the differences among lvalue references, rvalue references and universal references.
lvalue, rvalue and xvalue
Please refer to
for more details.
Generally speaking,
- lvalue - Traditionally, a value that can appear on the left-hand side of an assignment expression. An lvalue has an identity, usually a name.
  - Please note that "assignment" here does not mean declaration or initialization.
  - For example, int x = 1; declares an lvalue x and initializes it with 1. int &y = x; declares an lvalue reference y and initializes it with the lvalue x.
- rvalue - Traditionally, a value that appears on the right-hand side of an assignment expression. An rvalue is usually a temporary object.
  - e.g. string s = string("hello"), where string("hello") is an rvalue.
  - A temporary like this has no name.
- xvalue - "eXpiring value", it usually refers to an object near the end of its lifetime (so that its resources may be moved).
  - e.g. given a string s, the expression std::move(s) is an xvalue (and therefore also an rvalue).
  - By contrast, given a function auto f() { return string("hello"); }, the call f() in str += f() is a prvalue: a temporary, which is an rvalue as well.
Universal Reference
In C++, there are two common reference types: lvalue reference and rvalue reference. In addition,
- a non-const lvalue reference must be bound to an lvalue,
- a const lvalue reference can be bound to either an lvalue (const or not) or an rvalue,
  - e.g. if we have a function void f(const string &str);, then f(string("ABC")) is valid.
- an rvalue reference must be bound to an rvalue.
void f1(vector<int>& vec) {}
void f2(vector<int>&& vec) {}
In the above code, vector<int>& means vec is an lvalue reference, and vector<int>&& means vec is an rvalue reference.
Actually, there is a third kind of reference declaration, called a "universal reference" (the C++ standard now uses the term "forwarding reference"). A universal reference is a reference that may resolve to either an lvalue reference or an rvalue reference.
Now, let us see another example, this time with templates.
template<class T> void f1(T &val); // lvalue reference
template<class T> void f2(T &&val); // universal reference
template<class T> void f3(vector<T> &&val); // rvalue reference
template<class T> void f4(const T&& param); // rvalue reference
- T & is the most common reference type, the lvalue reference, which must be bound to an lvalue.
- T && is actually a universal reference.
- vector<T> && and const T && are rvalue references.
So it is easy to recognize an lvalue reference: it has only one &.
But how can we distinguish an rvalue reference from a universal reference, when both are written with &&?
Refer to this blog: Universal References in C++11
- "Universal references can only occur in the form T&&!"
- More specifically, universal references always have the form T&& for some deduced type T.
Let's revisit push_back and emplace_back.
template <class T, class Allocator = allocator<T> >
class vector {
public:
...
void push_back(T&& x); // fully specified parameter type => no type deduction;
... // && is rvalue reference
};
Written outside the class, the declaration of push_back is:
template <class T>
void vector<T>::push_back(T&& x);
push_back can't exist without the class std::vector<T> that contains it. But if we have a class std::vector<T>, we already know what T is, so there's no need to deduce it. Hence T is not a deduced type here, and T && is an rvalue reference.
The case is different for emplace_back.
template <class T, class Allocator = allocator<T> >
class vector {
public:
...
template <class... Args>
void emplace_back(Args&&... args); // deduced parameter types => type deduction;
... // && is universal references
};
And the declaration of emplace_back, written outside the class, is:
template<class T>
template<class... Args>
void std::vector<T>::emplace_back(Args&&... args);
Here Args is a deduced type: it is deduced from the arguments of each call, independently of T. Hence Args && is a universal reference.
move
std::move is used to "cast an lvalue to an rvalue".
std::move is used to indicate that an object t may be "moved from", i.e. allowing the efficient transfer of resources from t to another object.
In particular, std::move produces an xvalue expression that identifies its argument t. It is exactly equivalent to a static_cast to an rvalue reference type.
std::move is declared as:
template<class T>
constexpr std::remove_reference_t<T>&& move( T&& t ) noexcept; // since C++14
Here T &&t is a universal reference, since T is a deduced type.
The implementation of move is very simple: all it does is a type cast via static_cast.
template<class T>
constexpr std::remove_reference_t<T>&& move( T&& t ) noexcept {
return static_cast<typename std::remove_reference<T>::type&&>(t);
}
remove_reference removes the reference qualifier from a type T.
template<class T> struct remove_reference {typedef T type;};
template<class T> struct remove_reference<T&> {typedef T type;};
template<class T> struct remove_reference<T&&> {typedef T type;};
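It is tempting to drop remove_reference and shorten the code as follows, but the shortened version is not equivalent:
template<class T>
constexpr T&& move(T&& t) noexcept {
    return static_cast<T &&>(t); // for an lvalue argument, T&& collapses to an lvalue reference
}
When t is bound to an lvalue of type string, T is deduced as string&, so T&& collapses to string& and the function returns an lvalue. remove_reference is what guarantees that the result is always an rvalue reference.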
forward
std::forward is defined as:
template< class T >
constexpr T&& forward( std::remove_reference_t<T>& t ) noexcept {
return static_cast<T&&>(t);
}
template< class T >
constexpr T&& forward(std::remove_reference_t<T>&& t) noexcept {
static_assert(!std::is_lvalue_reference<T>::value,
"can not forward an rvalue as an lvalue");
return static_cast<T&&>(t);
}
- For the 1st one, it forwards an lvalue t as either an lvalue or an rvalue, depending on T.
  - std::forward<string &>(str) will produce an lvalue reference. (Actually, it does nothing here.)
  - std::forward<string &&>(str) will produce an rvalue reference. It can forward str (an lvalue) as an rvalue. Here we can see that this version of forward can replace move. See Usage of std::forward vs std::move.
- For the 2nd one, it forwards an rvalue t as an rvalue and prohibits forwarding of rvalues as lvalues.
  - e.g. std::forward<string &>("") will cause a compiler error, since it attempts to forward an rvalue (the temporary string created from "") as an lvalue.
std::forward makes it possible to forward a result of an expression (such as function call), which may be rvalue or lvalue, as the original value category of a forwarding reference argument.
The forward operation preserves the value category (lvalue or rvalue) of the original argument while forwarding t, hence it is called "perfect forwarding".
Implementation of emplace_back
Based on std::forward<>() and std::move(), one possible implementation of push_back and emplace_back (since C++11) is:
template<class T>
class Vector {
protected:
using value_type = T;
using pointer_type = T*;
using reference_type = T&;
pointer_type start;
std::size_t size;
std::size_t capacity;
// ...
public:
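void push_back(value_type &&val) { this->emplace_back(std::move(val)); } // val is a named variable, hence an lvalue; std::move turns it back into an rvalue
void push_back(const value_type &val) { this->emplace_back(val); } // a const lvalue is forwarded, so the copy constructor is used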
template <class... Args>
reference_type emplace_back (Args&&... args) {
if (size == capacity) {
// make vector grow via some strategies
}
// placement new: construct the element in the raw storage at start + size
return *new(start + (size++)) T(std::forward<Args>(args)...);
}
};
In C++98 (before C++11), the implementation of push_back may be:
void push_back(const value_type &val) {
if (size == capacity) {
// ...
}
new (start + size) value_type(val); // placement new calls the copy constructor; plain assignment would call operator= on uninitialized memory
++size;
}