UPDATE April 8, 2016 This post has been edited since publication to reflect my evolving understanding. As a result of the issues raised in this post, it’s possible that the committee decides to strengthen the post-conditions on move, so the recommendations made here may evolve further. Stay tuned.
TL;DR: In addition to the usual rule about move operations leaving the source object in a valid but unspecified state, we can add an additional rule:
Self-move assignment should “work” and at the very least leave the object in a valid but unspecified state.
Discussion
What do you think the following code should do?
X x = {/*something*/};
x = std::move(x);
Yes, it’s dumb, but with our alias-happy language, it can happen. So what does the standard say about this? For that we turn to [res.on.arguments]/p1.3 taken from the library introduction (emphasis mine):
If a function argument binds to an rvalue reference parameter, the implementation may assume that this parameter is a unique reference to this argument. […] If a program casts an lvalue to an xvalue while passing that lvalue to a library function (e.g. by calling the function with the argument std::move(x)), the program is effectively asking that function to treat that lvalue as a temporary. The implementation is free to optimize away aliasing checks which might be needed if the argument were an lvalue.
(I fixed the grammar mistake because I am a Huge Dork.) The above seems to say that std::swap(x, x) is playing with fire, because std::swap is implemented as follows:
template <class T>
void swap(T& a, T& b) {
    auto x(std::move(a));
    a = std::move(b); // Here be dragons
    b = std::move(x);
}
If a and b refer to the same object, the second line of std::swap does a self-move assign. Blamo! Undefined behavior, right?
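To see that self-move assignment for yourself, here is a minimal sketch. Tracer is a hypothetical type invented for this illustration (it is not from the standard or from Folly); its move assignment operator only records whether it was ever handed itself as the source.

```cpp
#include <cassert>
#include <utility>

// Hypothetical tracing type: remembers whether its move assignment
// operator was ever called with itself as the source.
struct Tracer {
    int value = 0;
    bool saw_self_move = false;

    Tracer() = default;
    explicit Tracer(int v) : value(v) {}

    // Move construction copies the value only; the flag starts out false.
    Tracer(Tracer&& other) noexcept : value(other.value) {}

    Tracer& operator=(Tracer&& other) noexcept {
        if (this == &other) {
            saw_self_move = true;  // just record it; value is untouched
        } else {
            value = other.value;   // the flag is deliberately not copied
        }
        return *this;
    }
};

// Swaps an object with itself through std::swap and reports whether
// a self-move assignment was observed along the way.
inline bool self_swap_triggers_self_move() {
    Tracer t(42);
    std::swap(t, t);
    return t.saw_self_move;
}
```

With the generic std::swap, the helper reports that a self-move assignment did take place, and the Tracer still holds its value afterwards.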
Such was what I thought when I first wrote this post until Howard Hinnant drew my attention to the requirements table for the MoveAssignable concept, which says that for the expression t = rv (emphasis mine):
If t and rv do not refer to the same object, t is equivalent to the value of rv before the assignment […] rv’s state is unspecified. [ Note: rv must still meet the requirements of the library component that is using it, whether or not t and rv refer to the same object. […] –end note]
Ah, ha! So here we have it. After a self-move, the object is required to be in a valid-but-unspecified state.
My attention was drawn to this issue during a code review of a change I wanted to make to Folly’s Function class template. I wanted to change this:
Function& operator=(Function&& that) noexcept {
    if (this != &that) {
        // do the move
    }
    return *this;
}
to this:
Function& operator=(Function&& that) noexcept {
    assert(this != &that);
    // do the move
    return *this;
}
The reason: let’s make moves as fast as possible and take advantage of the fact that Self-Moves Shouldn’t Happen. We assert, fix up the places that get it wrong, and make our programs an iota faster. Right?
Not so fast, said one clued-in reviewer. Self-swaps can happen quite easily in generic algorithms, and they shouldn’t trash the state of the object or the state of the program. This rang true, and so began my investigation.
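As an illustration of how easily this happens, consider the familiar "swap with the last element and pop" idiom. The helper below is a hypothetical sketch of my own (unordered_erase is not a standard or Folly function); when asked to erase the last element, it swaps that element with itself.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: erases element i in constant time by swapping it
// with the last element and popping. Element order is not preserved.
// Precondition: i < v.size().
template <class T>
void unordered_erase(std::vector<T>& v, std::size_t i) {
    using std::swap;
    swap(v[i], v.back());  // a self-swap whenever i == v.size() - 1
    v.pop_back();
}
```

Nothing in the calling code looks suspicious, yet for an element type that falls back to the generic std::swap, erasing the last element performs a self-move assignment. A move assignment operator that asserts on self-move would fire here.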
A few Google searches later, I turned up this StackOverflow gem from Howard Hinnant. C++ wonks know Howard Hinnant. He’s the author of libc++, and an old-time C++ library developer. (Remember Metrowerks CodeWarrior? No? Get off my lawn.) He also happens to be the person who wrote the proposal to add rvalue references to the language, so you know, Howard’s given this some thought. First Howard says this:
Some will argue that swap(x, x) is a good idea, or just a necessary evil. And this, if the swap goes to the default swap, can cause a self-move-assignment.
I disagree that swap(x, x) is ever a good idea. If found in my own code, I will consider it a performance bug and fix it.
But then in an Update, he backtracks:
I’ve given this issue some more thought, and changed my position somewhat. I now believe that assignment should be tolerant of self assignment, but that the post conditions on copy assignment and move assignment are different:
For copy assignment: x = y; one should have a post-condition that the value of y should not be altered. When &x == &y then this postcondition translates into: self copy assignment should have no impact on the value of x.
For move assignment: x = std::move(y); one should have a post-condition that y has a valid but unspecified state. When &x == &y then this postcondition translates into: x has a valid but unspecified state. I.e. self move assignment does not have to be a no-op. But it should not crash. This post-condition is consistent with allowing swap(x, x) to just work […]
When Howard Hinnant changes his mind about something having to do with library design, I sit up and take note, because it means that something very deep and subtle is going on. In this case, it means I’ve been writing bad move assignment operators for years.
By Howard’s yardstick — and by the requirements for the MoveAssignable concept in the standard, thanks Howard! — this move assignment operator is wrong:
Function& operator=(Function&& that) noexcept {
    assert(this != &that); // No! Bad C++ programmer!
    // do the move
    return *this;
}
Move assignment operators should accept self-moves and do no evil; indeed, for std::swap(f, f) to work they must.
That’s not the same as saying it needs to preserve the object’s value, though, and not preserving the object’s value can be a performance win. It can save a branch, for instance. Here is how I reformulated folly::Function’s move assignment operator[*]:
Function& operator=(Function&& that) noexcept {
    clear_();         // Free all of the resources owned by *this
    moveFrom_(that);  // Move that's guts into *this.
    return *this;
}
[*] Well, not exactly, but that’s the gist.
Of note is that clear_() leaves *this in a state such that it is still OK to moveFrom_(*this), which is what happens when that and *this are the same object. In the case of Function, it just so happens that the effect of this code is to put the Function object back into the default-constructed state, obliterating the previous value. The particular final state of the object isn’t important though, so long as it is still valid.
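To make the pattern concrete outside of Folly, here is a minimal sketch using a hypothetical Buffer class that owns a heap array. It is not folly::Function; only the names clear_ and moveFrom_ are borrowed from the snippet above, and everything else is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical resource-owning type, used only to illustrate the
// branch-free "clear, then move" assignment pattern.
class Buffer {
    int* data_ = nullptr;
    std::size_t size_ = 0;

    // Frees everything owned by *this and leaves it empty but valid.
    void clear_() noexcept {
        delete[] data_;
        data_ = nullptr;
        size_ = 0;
    }

    // Steals that's guts. Safe even when &that == this, provided
    // clear_() has already run and *this is empty.
    void moveFrom_(Buffer& that) noexcept {
        data_ = std::exchange(that.data_, nullptr);
        size_ = std::exchange(that.size_, 0);
    }

public:
    Buffer() = default;
    explicit Buffer(std::size_t n) : data_(new int[n]()), size_(n) {}
    Buffer(Buffer&& that) noexcept { moveFrom_(that); }

    Buffer& operator=(Buffer&& that) noexcept {
        clear_();         // no self-check, so no branch
        moveFrom_(that);  // on self-move, moves an empty object into itself
        return *this;
    }

    ~Buffer() { clear_(); }

    std::size_t size() const noexcept { return size_; }
    bool empty() const noexcept { return size_ == 0; }
};
```

In this sketch a self-move leaves the Buffer empty, which is valid but not its old value. std::swap(b, b) nevertheless preserves the value: the temporary inside swap holds the contents while the self-move happens, and the final move assignment puts them back.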
Summary
So, as always we have the rule about moves:
Move operations should leave the source object in a valid but unspecified state.
And to that we can add an additional rule:
Self-moves should do no evil and leave the object in a valid but unspecified state.
If you want to go further and leave the object unmodified, that’s not wrong per se, but it’s not required by the standard as it is today. Changing the value is perfectly OK (Howard and the standard say so!), and doing that might save you some cycles.