Re: RFC: Allow moved-from strings to be non-empty

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Allow moved-from strings to be non-empty

Marc Glisse-6
On Fri, 26 Oct 2018, Ville Voutilainen wrote:

> I would rather not introduce a behavioral difference between us and
> libc++.

Why not? There are already several, and it helps find bugs. Maybe you
could convince libc++ to change as well if you want to keep the behavior
the same?

> It does slightly concern me that some users might
> actually semantically expect a moved-from string to be empty, even
> though that's not guaranteed, although for non-SSO cases
> it *is* guaranteed.

Is it? In debug mode, I'd be tempted to leave the string as "moved" (size
5, short string so there is no allocation).

--
Marc Glisse
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Allow moved-from strings to be non-empty

Ville Voutilainen
On Fri, 26 Oct 2018 at 01:42, Marc Glisse <[hidden email]> wrote:
>
> On Fri, 26 Oct 2018, Ville Voutilainen wrote:
>
> > I would rather not introduce a behavioral difference between us and
> > libc++.
>
> Why not? There are already several, and it helps find bugs. Maybe you
> could convince libc++ to change as well if you want to keep the behavior
> the same?

What bugs?

> > It does slightly concern me that some users might
> > actually semantically expect a moved-from string to be empty, even
> > though that's not guaranteed, although for non-SSO cases
> > it *is* guaranteed.
>
> Is it? In debug mode, I'd be tempted to leave the string as "moved" (size
> 5, short string so there is no allocation).

Sigh. Apparently it isn't, because the standard doesn't bother placing
complexity
requirements on string constructors. Even so, I'd prefer string acting
like vector,
so that it will leave the source of a move in an empty state, rather
than an unspecified
state. Despite the standard not requiring that, it's more useful
programmatically
to have the empty state than the unspecified state, especially when the state
is empty in some cases anyway.
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Allow moved-from strings to be non-empty

Jonathan Wakely-3
On 26/10/18 12:16 +0300, Ville Voutilainen wrote:

>On Fri, 26 Oct 2018 at 01:42, Marc Glisse <[hidden email]> wrote:
>>
>> On Fri, 26 Oct 2018, Ville Voutilainen wrote:
>>
>> > I would rather not introduce a behavioral difference between us and
>> > libc++.
>>
>> Why not? There are already several, and it helps find bugs. Maybe you
>> could convince libc++ to change as well if you want to keep the behavior
>> the same?
>
>What bugs?

Assuming the string is empty after a move and appending to it without
calling clear() first.


>> > It does slightly concern me that some users might
>> > actually semantically expect a moved-from string to be empty, even
>> > though that's not guaranteed, although for non-SSO cases
>> > it *is* guaranteed.
>>
>> Is it? In debug mode, I'd be tempted to leave the string as "moved" (size
>> 5, short string so there is no allocation).
>
>Sigh. Apparently it isn't, because the standard doesn't bother placing
>complexity
>requirements on string constructors.

Writing 5 bytes into the SSO buffer would be constant complexity :-P

>Even so, I'd prefer string acting
>like vector,
>so that it will leave the source of a move in an empty state, rather
>than an unspecified
>state.

It's not guaranteed for vector either. An allocator with POCMA==false
doesn't propagate on move assignment and if the source object's
allocator isn't equal to the target's it will copy, and there's no
guarantee the source will be empty.

This would be a conforming change to our vector:

--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -1793,7 +1793,6 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
            // so we need to individually move each element.
            this->assign(std::__make_move_if_noexcept_iterator(__x.begin()),
                         std::__make_move_if_noexcept_iterator(__x.end()));
-           __x.clear();
          }
       }
 #endif

That might even have a bigger performance benefit, because clearing a
vector runs destructors, it doesn't just set the length and write a
null terminator.

If you're using a vector as a buffer of objects and the first thing
you do after v1=std::move(v2) is v2.resize(n) then it's a
pessimization to have cleared it.

>Despite the standard not requiring that, it's more useful
>programmatically
>to have the empty state than the unspecified state, especially when the state
>is empty in some cases anyway.
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Allow moved-from strings to be non-empty

Marc Glisse-6
On Fri, 26 Oct 2018, Ville Voutilainen wrote:

> On Fri, 26 Oct 2018 at 12:55, Jonathan Wakely <[hidden email]> wrote:
>>>> Why not? There are already several, and it helps find bugs. Maybe you
>>>> could convince libc++ to change as well if you want to keep the behavior
>>>> the same?
>>>
>>> What bugs?
>>
>> Assuming the string is empty after a move and appending to it without
>> calling clear() first.
>
> I find it bloody unfortunate that the standard considers that a bug. It
> seems quite convenient to me that it's possible to have a bag of
> strings, move some of them out to be processed, and be able to tell the
> processed ones from the unprocessed ones without having to separately
> clear them when moving.

We all seem to want different things from move:

1) copy, possibly with a speed up because I don't care what the source
looks like afterwards. That's what was standardized, and for a small
string that means that moving it should just copy.

2) move and clear

3) move and destroy

(most likely many others)

The convenience does not seem that great to me, especially if it has a
performance cost.

>>>>> It does slightly concern me that some users might
>>>>> actually semantically expect a moved-from string to be empty, even
>>>>> though that's not guaranteed, although for non-SSO cases
>>>>> it *is* guaranteed.
>>>>
>>>> Is it? In debug mode, I'd be tempted to leave the string as "moved" (size
>>>> 5, short string so there is no allocation).
>>>
>>> Sigh. Apparently it isn't, because the standard doesn't bother placing
>>> complexity
>>> requirements on string constructors.
>>
>> Writing 5 bytes into the SSO buffer would be constant complexity :-P
>
> Indeed it would, but the standard doesn't seem to have complexity
> requirements on string constructors at all. If it did, moving a non-sso
> string would *have to* steal from the source, and the sso case would not
> have to do that, but could.

I am not sure what you are saying exactly, but I think "noexcept" is (kind
of) playing the role of a complexity requirement here.

--
Marc Glisse
Reply | Threaded
Open this post in threaded view
|

RE: RFC: Allow moved-from strings to be non-empty

Joe Buck
The reason move constructors were introduced was to speed up code in cases where an object
Is copied and the copy is no longer needed.  It is unfortunate that there may now be code out
there that relies on accidental properties of library implementations.  It would be best if the
Implementation is not constrained.  Unless the standard mandates that, after a string is moved,
the string is empty, the user should only be able to assume that it is in some consistent but
unspecified state.  Otherwise we pay a performance penalty forever.

If the standard explicitly states that the argument to the move constructor is defined to be
empty after the call, we're stuck.

Reply | Threaded
Open this post in threaded view
|

Re: RFC: Allow moved-from strings to be non-empty

Ville Voutilainen
On Sat, 27 Oct 2018 at 01:27, Joe Buck <[hidden email]> wrote:

>
> The reason move constructors were introduced was to speed up code in cases where an object
> Is copied and the copy is no longer needed.  It is unfortunate that there may now be code out
> there that relies on accidental properties of library implementations.  It would be best if the
> Implementation is not constrained.  Unless the standard mandates that, after a string is moved,
> the string is empty, the user should only be able to assume that it is in some consistent but
> unspecified state.  Otherwise we pay a performance penalty forever.
>
> If the standard explicitly states that the argument to the move constructor is defined to be
> empty after the call, we're stuck.

It certainly doesn't specify so for an SSO string, so we're not stuck.
On the other hand, we already get
a speed-up, it's just not as much of a speed-up as it can be. What I
really loathe is the potential implementation
divergence; it's all good for the priesthood to refer to the standard
and say "you shouldn't have done that", but
that's, as a good friend of mine provided as a phrasing on a different
manner, spectacularly unhelpful.
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Allow moved-from strings to be non-empty

François Dumont-2
On 10/27/18 12:33 AM, Ville Voutilainen wrote:

> On Sat, 27 Oct 2018 at 01:27, Joe Buck <[hidden email]> wrote:
>> The reason move constructors were introduced was to speed up code in cases where an object
>> Is copied and the copy is no longer needed.  It is unfortunate that there may now be code out
>> there that relies on accidental properties of library implementations.  It would be best if the
>> Implementation is not constrained.  Unless the standard mandates that, after a string is moved,
>> the string is empty, the user should only be able to assume that it is in some consistent but
>> unspecified state.  Otherwise we pay a performance penalty forever.
>>
>> If the standard explicitly states that the argument to the move constructor is defined to be
>> empty after the call, we're stuck.
> It certainly doesn't specify so for an SSO string, so we're not stuck.
> On the other hand, we already get
> a speed-up, it's just not as much of a speed-up as it can be. What I
> really loathe is the potential implementation
> divergence; it's all good for the priesthood to refer to the standard
> and say "you shouldn't have done that", but
> that's, as a good friend of mine provided as a phrasing on a different
> manner, spectacularly unhelpful.
>
+1 to allow moved-from strings to be non-empty

In many cases move will take place on temporary instances which future
is just to be destroyed.

It's already too bad that in such situation we sometimes do some
allocation because current implementation do not support none (std::deque)

Note that on vector we already have this undefined behavior within
libstdc++ itself. On std::vector move constructor empty the moved
instance but allocator extended one with equal allocator instance don't,
it just swap content.

I'll add to my TODO to detect this kind of invalid usage in Debug
pendantic mode.

François