Re: RFC: Allow moved-from strings to be non-empty

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Allow moved-from strings to be non-empty

Marc Glisse-6
On Fri, 26 Oct 2018, Ville Voutilainen wrote:

> I would rather not introduce a behavioral difference between us and
> libc++.

Why not? There are already several, and it helps find bugs. Maybe you
could convince libc++ to change as well if you want to keep the behavior
the same?

> It does slightly concern me that some users might
> actually semantically expect a moved-from string to be empty, even
> though that's not guaranteed, although for non-SSO cases
> it *is* guaranteed.

Is it? In debug mode, I'd be tempted to leave the string as "moved" (size
5, short string so there is no allocation).

--
Marc Glisse
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Allow moved-from strings to be non-empty

Ville Voutilainen
On Fri, 26 Oct 2018 at 01:42, Marc Glisse <[hidden email]> wrote:
>
> On Fri, 26 Oct 2018, Ville Voutilainen wrote:
>
> > I would rather not introduce a behavioral difference between us and
> > libc++.
>
> Why not? There are already several, and it helps find bugs. Maybe you
> could convince libc++ to change as well if you want to keep the behavior
> the same?

What bugs?

> > It does slightly concern me that some users might
> > actually semantically expect a moved-from string to be empty, even
> > though that's not guaranteed, although for non-SSO cases
> > it *is* guaranteed.
>
> Is it? In debug mode, I'd be tempted to leave the string as "moved" (size
> 5, short string so there is no allocation).

Sigh. Apparently it isn't, because the standard doesn't bother placing
complexity
requirements on string constructors. Even so, I'd prefer string acting
like vector,
so that it will leave the source of a move in an empty state, rather
than an unspecified
state. Despite the standard not requiring that, it's more useful
programmatically
to have the empty state than the unspecified state, especially when the state
is empty in some cases anyway.
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Allow moved-from strings to be non-empty

Jonathan Wakely-3
On 26/10/18 12:16 +0300, Ville Voutilainen wrote:

>On Fri, 26 Oct 2018 at 01:42, Marc Glisse <[hidden email]> wrote:
>>
>> On Fri, 26 Oct 2018, Ville Voutilainen wrote:
>>
>> > I would rather not introduce a behavioral difference between us and
>> > libc++.
>>
>> Why not? There are already several, and it helps find bugs. Maybe you
>> could convince libc++ to change as well if you want to keep the behavior
>> the same?
>
>What bugs?

Assuming the string is empty after a move and appending to it without
calling clear() first.


>> > It does slightly concern me that some users might
>> > actually semantically expect a moved-from string to be empty, even
>> > though that's not guaranteed, although for non-SSO cases
>> > it *is* guaranteed.
>>
>> Is it? In debug mode, I'd be tempted to leave the string as "moved" (size
>> 5, short string so there is no allocation).
>
>Sigh. Apparently it isn't, because the standard doesn't bother placing
>complexity
>requirements on string constructors.

Writing 5 bytes into the SSO buffer would be constant complexity :-P

>Even so, I'd prefer string acting
>like vector,
>so that it will leave the source of a move in an empty state, rather
>than an unspecified
>state.

It's not guaranteed for vector either. An allocator with POCMA==false
doesn't propagate on move assignment and if the source object's
allocator isn't equal to the target's it will copy, and there's no
guarantee the source will be empty.

This would be a conforming change to our vector:

--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -1793,7 +1793,6 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
            // so we need to individually move each element.
            this->assign(std::__make_move_if_noexcept_iterator(__x.begin()),
                         std::__make_move_if_noexcept_iterator(__x.end()));
-           __x.clear();
          }
       }
 #endif

That might even have a bigger performance benefit, because clearing a
vector runs destructors, it doesn't just set the length and write a
null terminator.

If you're using a vector as a buffer of objects and the first thing
you do after v1=std::move(v2) is v2.resize(n) then it's a
pessimization to have cleared it.

>Despite the standard not requiring that, it's more useful
>programmatically
>to have the empty state than the unspecified state, especially when the state
>is empty in some cases anyway.
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Allow moved-from strings to be non-empty

Marc Glisse-6
On Fri, 26 Oct 2018, Ville Voutilainen wrote:

> On Fri, 26 Oct 2018 at 12:55, Jonathan Wakely <[hidden email]> wrote:
>>>> Why not? There are already several, and it helps find bugs. Maybe you
>>>> could convince libc++ to change as well if you want to keep the behavior
>>>> the same?
>>>
>>> What bugs?
>>
>> Assuming the string is empty after a move and appending to it without
>> calling clear() first.
>
> I find it bloody unfortunate that the standard considers that a bug. It
> seems quite convenient to me that it's possible to have a bag of
> strings, move some of them out to be processed, and be able to tell the
> processed ones from the unprocessed ones without having to separately
> clear them when moving.

We all seem to want different things from move:

1) copy, possibly with a speed up because I don't care what the source
looks like afterwards. That's what was standardized, and for a small
string that means that moving it should just copy.

2) move and clear

3) move and destroy

(most likely many others)

The convenience does not seem that great to me, especially if it has a
performance cost.

>>>>> It does slightly concern me that some users might
>>>>> actually semantically expect a moved-from string to be empty, even
>>>>> though that's not guaranteed, although for non-SSO cases
>>>>> it *is* guaranteed.
>>>>
>>>> Is it? In debug mode, I'd be tempted to leave the string as "moved" (size
>>>> 5, short string so there is no allocation).
>>>
>>> Sigh. Apparently it isn't, because the standard doesn't bother placing
>>> complexity
>>> requirements on string constructors.
>>
>> Writing 5 bytes into the SSO buffer would be constant complexity :-P
>
> Indeed it would, but the standard doesn't seem to have complexity
> requirements on string constructors at all. If it did, moving a non-sso
> string would *have to* steal from the source, and the sso case would not
> have to do that, but could.

I am not sure what you are saying exactly, but I think "noexcept" is (kind
of) playing the role of a complexity requirement here.

--
Marc Glisse
Reply | Threaded
Open this post in threaded view
|

RE: RFC: Allow moved-from strings to be non-empty

Joe Buck
The reason move constructors were introduced was to speed up code in cases where an object
Is copied and the copy is no longer needed.  It is unfortunate that there may now be code out
there that relies on accidental properties of library implementations.  It would be best if the
Implementation is not constrained.  Unless the standard mandates that, after a string is moved,
the string is empty, the user should only be able to assume that it is in some consistent but
unspecified state.  Otherwise we pay a performance penalty forever.

If the standard explicitly states that the argument to the move constructor is defined to be
empty after the call, we're stuck.

Reply | Threaded
Open this post in threaded view
|

Re: RFC: Allow moved-from strings to be non-empty

Ville Voutilainen
On Sat, 27 Oct 2018 at 01:27, Joe Buck <[hidden email]> wrote:

>
> The reason move constructors were introduced was to speed up code in cases where an object
> Is copied and the copy is no longer needed.  It is unfortunate that there may now be code out
> there that relies on accidental properties of library implementations.  It would be best if the
> Implementation is not constrained.  Unless the standard mandates that, after a string is moved,
> the string is empty, the user should only be able to assume that it is in some consistent but
> unspecified state.  Otherwise we pay a performance penalty forever.
>
> If the standard explicitly states that the argument to the move constructor is defined to be
> empty after the call, we're stuck.

It certainly doesn't specify so for an SSO string, so we're not stuck.
On the other hand, we already get
a speed-up, it's just not as much of a speed-up as it can be. What I
really loathe is the potential implementation
divergence; it's all good for the priesthood to refer to the standard
and say "you shouldn't have done that", but
that's, as a good friend of mine provided as a phrasing on a different
manner, spectacularly unhelpful.
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Allow moved-from strings to be non-empty

François Dumont-2
On 10/27/18 12:33 AM, Ville Voutilainen wrote:

> On Sat, 27 Oct 2018 at 01:27, Joe Buck <[hidden email]> wrote:
>> The reason move constructors were introduced was to speed up code in cases where an object
>> Is copied and the copy is no longer needed.  It is unfortunate that there may now be code out
>> there that relies on accidental properties of library implementations.  It would be best if the
>> Implementation is not constrained.  Unless the standard mandates that, after a string is moved,
>> the string is empty, the user should only be able to assume that it is in some consistent but
>> unspecified state.  Otherwise we pay a performance penalty forever.
>>
>> If the standard explicitly states that the argument to the move constructor is defined to be
>> empty after the call, we're stuck.
> It certainly doesn't specify so for an SSO string, so we're not stuck.
> On the other hand, we already get
> a speed-up, it's just not as much of a speed-up as it can be. What I
> really loathe is the potential implementation
> divergence; it's all good for the priesthood to refer to the standard
> and say "you shouldn't have done that", but
> that's, as a good friend of mine provided as a phrasing on a different
> manner, spectacularly unhelpful.
>
+1 to allow moved-from strings to be non-empty

In many cases move will take place on temporary instances which future
is just to be destroyed.

It's already too bad that in such situation we sometimes do some
allocation because current implementation do not support none (std::deque)

Note that on vector we already have this undefined behavior within
libstdc++ itself. On std::vector move constructor empty the moved
instance but allocator extended one with equal allocator instance don't,
it just swap content.

I'll add to my TODO to detect this kind of invalid usage in Debug
pendantic mode.

François

Reply | Threaded
Open this post in threaded view
|

Re: RFC: Allow moved-from strings to be non-empty

Jonathan Wakely-3
On 30/10/18 07:48 +0100, François Dumont wrote:

>On 10/27/18 12:33 AM, Ville Voutilainen wrote:
>>On Sat, 27 Oct 2018 at 01:27, Joe Buck <[hidden email]> wrote:
>>>The reason move constructors were introduced was to speed up code in cases where an object
>>>Is copied and the copy is no longer needed.  It is unfortunate that there may now be code out
>>>there that relies on accidental properties of library implementations.  It would be best if the
>>>Implementation is not constrained.  Unless the standard mandates that, after a string is moved,
>>>the string is empty, the user should only be able to assume that it is in some consistent but
>>>unspecified state.  Otherwise we pay a performance penalty forever.
>>>
>>>If the standard explicitly states that the argument to the move constructor is defined to be
>>>empty after the call, we're stuck.
>>It certainly doesn't specify so for an SSO string, so we're not stuck.
>>On the other hand, we already get
>>a speed-up, it's just not as much of a speed-up as it can be. What I
>>really loathe is the potential implementation
>>divergence; it's all good for the priesthood to refer to the standard
>>and say "you shouldn't have done that", but
>>that's, as a good friend of mine provided as a phrasing on a different
>>manner, spectacularly unhelpful.
>>
>+1 to allow moved-from strings to be non-empty
>
>In many cases move will take place on temporary instances which future
>is just to be destroyed.
>
>It's already too bad that in such situation we sometimes do some
>allocation because current implementation do not support none
>(std::deque)
>
>Note that on vector we already have this undefined behavior within
>libstdc++ itself. On std::vector move constructor empty the moved
>instance but allocator extended one with equal allocator instance
>don't, it just swap content.

But the content of the new object being constructed is empty, because
it's a new object being constructed. It doesn't have any elements. So
if you swap the content, the moved instance becomes empty. Or have I
misunderstood what you're saying?

>I'll add to my TODO to detect this kind of invalid usage in Debug
>pendantic mode.

I'm not sure we want to do that; code relying on it might be
non-portable, but it's not undefined.


Reply | Threaded
Open this post in threaded view
|

Re: RFC: Allow moved-from strings to be non-empty

Ville Voutilainen
On Fri, 21 Dec 2018 at 22:42, Jonathan Wakely <[hidden email]> wrote:

> >I'll add to my TODO to detect this kind of invalid usage in Debug
> >pendantic mode.
>
> I'm not sure we want to do that; code relying on it might be
> non-portable, but it's not undefined.

In pedantic mode, that's maybe fine. But let's please allow users who
want the definition of unspecified
that they get if they insist on getting it to get what they ask for,
rather than forcibly suggest to them that what they want is
unsound. In other words, fully agreed on "non-portable but not
undefined". Portability problems are not
something we should blindly shout about; however, I'll repeat that I'm
not against shouting about them in pedantic
mode.
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Allow moved-from strings to be non-empty

François Dumont-2
In reply to this post by Jonathan Wakely-3
On 12/21/18 9:42 PM, Jonathan Wakely wrote:

> On 30/10/18 07:48 +0100, François Dumont wrote:
>> On 10/27/18 12:33 AM, Ville Voutilainen wrote:
>>> On Sat, 27 Oct 2018 at 01:27, Joe Buck <[hidden email]> wrote:
>>>> The reason move constructors were introduced was to speed up code
>>>> in cases where an object
>>>> Is copied and the copy is no longer needed.  It is unfortunate that
>>>> there may now be code out
>>>> there that relies on accidental properties of library
>>>> implementations.  It would be best if the
>>>> Implementation is not constrained.  Unless the standard mandates
>>>> that, after a string is moved,
>>>> the string is empty, the user should only be able to assume that it
>>>> is in some consistent but
>>>> unspecified state.  Otherwise we pay a performance penalty forever.
>>>>
>>>> If the standard explicitly states that the argument to the move
>>>> constructor is defined to be
>>>> empty after the call, we're stuck.
>>> It certainly doesn't specify so for an SSO string, so we're not stuck.
>>> On the other hand, we already get
>>> a speed-up, it's just not as much of a speed-up as it can be. What I
>>> really loathe is the potential implementation
>>> divergence; it's all good for the priesthood to refer to the standard
>>> and say "you shouldn't have done that", but
>>> that's, as a good friend of mine provided as a phrasing on a different
>>> manner, spectacularly unhelpful.
>>>
>> +1 to allow moved-from strings to be non-empty
>>
>> In many cases move will take place on temporary instances which
>> future is just to be destroyed.
>>
>> It's already too bad that in such situation we sometimes do some
>> allocation because current implementation do not support none
>> (std::deque)
>>
>> Note that on vector we already have this undefined behavior within
>> libstdc++ itself. On std::vector move constructor empty the moved
>> instance but allocator extended one with equal allocator instance
>> don't, it just swap content.
>
> But the content of the new object being constructed is empty, because
> it's a new object being constructed. It doesn't have any elements. So
> if you swap the content, the moved instance becomes empty. Or have I
> misunderstood what you're saying?

I am even misunderstanding myself ! Sorry, wrong analogy indeed.

>
>> I'll add to my TODO to detect this kind of invalid usage in Debug
>> pendantic mode.
>
> I'm not sure we want to do that; code relying on it might be
> non-portable, but it's not undefined.
>
Yes, pedantic mode is not meant for that. We would need an option to
detect portability issue for behaviors that are not described by the
Standard but if it is limited to this situation it might not worth it.
This is why I consider using pedantic mode for it in the first place I
think.

François