[PATCH] let hash-based containers work with non-trivial types (PR 90923)

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

[PATCH] let hash-based containers work with non-trivial types (PR 90923)

Martin Sebor-2
Bug 90923 shows that even though GCC hash-table based containers
like hash_map can be instantiated on types with user-defined ctors
and dtors they invoke the dtors of such types without invoking
the corresponding ctors.

It was thanks to this bug that I spent a day debugging "interesting"
miscompilations during GCC bootstrap (in fairness, it was that and
bug 90904 about auto_vec copy assignment/construction also being
hosed even for POD types).

The attached patch corrects the hash_map and hash_set templates
to invoke the ctors of the elements they insert and makes them
(hopefully) safe to use with non-trivial user-defined types.

Tested on x86_64-linux.

Martin

gcc-90923.diff (9K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)

Richard Biener-2
On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor <[hidden email]> wrote:

>
> Bug 90923 shows that even though GCC hash-table based containers
> like hash_map can be instantiated on types with user-defined ctors
> and dtors they invoke the dtors of such types without invoking
> the corresponding ctors.
>
> It was thanks to this bug that I spent a day debugging "interesting"
> miscompilations during GCC bootstrap (in fairness, it was that and
> bug 90904 about auto_vec copy assignment/construction also being
> hosed even for POD types).
>
> The attached patch corrects the hash_map and hash_set templates
> to invoke the ctors of the elements they insert and makes them
> (hopefully) safe to use with non-trivial user-defined types.

Hum.  I am worried about the difference of assignment vs. construction
in ::put()

+      bool ins = hash_entry::is_empty (*e);
+      if (ins)
+       {
+         e->m_key = k;
+         new ((void *) &e->m_value) Value (v);
+       }
+      else
+       e->m_value = v;

why not invoke the dtor on the old value and then the ctor again?

How is an empty hash_entry constructed?  ::remove() doesn't
seem to invoke the dtor either, instead it relies on the
traits::remove function?

> Tested on x86_64-linux.
>
> Martin
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)

Martin Sebor-2
On 6/21/19 6:06 AM, Richard Biener wrote:

> On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor <[hidden email]> wrote:
>>
>> Bug 90923 shows that even though GCC hash-table based containers
>> like hash_map can be instantiated on types with user-defined ctors
>> and dtors they invoke the dtors of such types without invoking
>> the corresponding ctors.
>>
>> It was thanks to this bug that I spent a day debugging "interesting"
>> miscompilations during GCC bootstrap (in fairness, it was that and
>> bug 90904 about auto_vec copy assignment/construction also being
>> hosed even for POD types).
>>
>> The attached patch corrects the hash_map and hash_set templates
>> to invoke the ctors of the elements they insert and makes them
>> (hopefully) safe to use with non-trivial user-defined types.
>
> Hum.  I am worried about the difference of assignment vs. construction
> in ::put()
>
> +      bool ins = hash_entry::is_empty (*e);
> +      if (ins)
> +       {
> +         e->m_key = k;
> +         new ((void *) &e->m_value) Value (v);
> +       }
> +      else
> +       e->m_value = v;
>
> why not invoke the dtor on the old value and then the ctor again?
It wouldn't work for self-assignment:

   Value &y = m.get_or_insert (key);
   m.put (key, y);

The usual rule of thumb for writes into containers is to use
construction when creating a new element and assignment when
replacing the value of an existing element.

Which reminds me that the hash containers, despite being copy-
constructible (at least for POD types, they aren't for user-
defined types), also aren't safe for assignment even for PODs.
I opened bug 90959 for this.  Until the assignment is fixed
I made it inaccessibe in the patch (I have fixed the copy
ctor to DTRT for non-PODs).

> How is an empty hash_entry constructed?

In hash_table::find_slot_with_hash simply by finding an empty
slot and returning a pointer to it.  The memory for the slot
is marked "empty" by calling the Traits::mark_empty() function.

The full type of hash_map<void*, Value> is actually

   hash_map<void*,
            Value,
            simple_hashmap_traits<default_hash_traits<void*>,
                                  Value>

and simple_hashmap_traits delegates it to default_hash_traits
whose mark_empty() just clears the void*, leaving the Value
part uninitialized.  That makes sense because we don't want
to call ctors for empty entries.  I think the questions one
might ask if one were to extend the design are: a) what class
should invoke the ctor/assignment and b) should it do it
directly or via the traits?

> ::remove() doesn't
> seem to invoke the dtor either, instead it relies on the
> traits::remove function?

Yes.  There is no Traits::construct or assign or copy.  We
could add them but I'm not sure I see to what end (there could
be use cases, I just don't know enough about these classes to
think of any).

Attached is an updated patch with the additional minor fixes
mentioned above.

Martin

PS I think we would get much better results by adopting
the properly designed and tested standard library containers
than by spending time trying to improve the design of these
legacy classes.  For simple uses that don't need to integrate
with the GC machinery the standard containers should be fine
(plus, it'd provide us with greater motivation to improve
them and the code GCC emits for their uses).  Unfortunately,
to be able to use the hash-based containers we would need to
upgrade to C++ 11.  Isn't it time yet?

gcc-90923.diff (11K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)

Richard Biener-2
On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor <[hidden email]> wrote:

>
> On 6/21/19 6:06 AM, Richard Biener wrote:
> > On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor <[hidden email]> wrote:
> >>
> >> Bug 90923 shows that even though GCC hash-table based containers
> >> like hash_map can be instantiated on types with user-defined ctors
> >> and dtors they invoke the dtors of such types without invoking
> >> the corresponding ctors.
> >>
> >> It was thanks to this bug that I spent a day debugging "interesting"
> >> miscompilations during GCC bootstrap (in fairness, it was that and
> >> bug 90904 about auto_vec copy assignment/construction also being
> >> hosed even for POD types).
> >>
> >> The attached patch corrects the hash_map and hash_set templates
> >> to invoke the ctors of the elements they insert and makes them
> >> (hopefully) safe to use with non-trivial user-defined types.
> >
> > Hum.  I am worried about the difference of assignment vs. construction
> > in ::put()
> >
> > +      bool ins = hash_entry::is_empty (*e);
> > +      if (ins)
> > +       {
> > +         e->m_key = k;
> > +         new ((void *) &e->m_value) Value (v);
> > +       }
> > +      else
> > +       e->m_value = v;
> >
> > why not invoke the dtor on the old value and then the ctor again?
>
> It wouldn't work for self-assignment:
>
>    Value &y = m.get_or_insert (key);
>    m.put (key, y);
>
> The usual rule of thumb for writes into containers is to use
> construction when creating a new element and assignment when
> replacing the value of an existing element.
>
> Which reminds me that the hash containers, despite being copy-
> constructible (at least for POD types, they aren't for user-
> defined types), also aren't safe for assignment even for PODs.
> I opened bug 90959 for this.  Until the assignment is fixed
> I made it inaccessibe in the patch (I have fixed the copy
> ctor to DTRT for non-PODs).
>
> > How is an empty hash_entry constructed?
>
> In hash_table::find_slot_with_hash simply by finding an empty
> slot and returning a pointer to it.  The memory for the slot
> is marked "empty" by calling the Traits::mark_empty() function.
>
> The full type of hash_map<void*, Value> is actually
>
>    hash_map<void*,
>             Value,
>             simple_hashmap_traits<default_hash_traits<void*>,
>                                   Value>
>
> and simple_hashmap_traits delegates it to default_hash_traits
> whose mark_empty() just clears the void*, leaving the Value
> part uninitialized.  That makes sense because we don't want
> to call ctors for empty entries.  I think the questions one
> might ask if one were to extend the design are: a) what class
> should invoke the ctor/assignment and b) should it do it
> directly or via the traits?
>
> > ::remove() doesn't
> > seem to invoke the dtor either, instead it relies on the
> > traits::remove function?
>
> Yes.  There is no Traits::construct or assign or copy.  We
> could add them but I'm not sure I see to what end (there could
> be use cases, I just don't know enough about these classes to
> think of any).
>
> Attached is an updated patch with the additional minor fixes
> mentioned above.
>
> Martin
>
> PS I think we would get much better results by adopting
> the properly designed and tested standard library containers
> than by spending time trying to improve the design of these
> legacy classes.  For simple uses that don't need to integrate
> with the GC machinery the standard containers should be fine
> (plus, it'd provide us with greater motivation to improve
> them and the code GCC emits for their uses).  Unfortunately,
> to be able to use the hash-based containers we would need to
> upgrade to C++ 11.  Isn't it time yet?

I don't think so.  I'm also not sure if C++11 on its own is desirable
or if it should be C++14 or later at that point.  SLES 12 has GCC 4.8
as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7.
SLES 11 already struggles currently (GCC 4.3) but I'd no longer
consider that important enough.

Note any such change to libstdc++ containers should be complete
and come with both code-size and compile-time and memory-usage
measurements (both of GCC and other apps of course).

Richard.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)

Martin Sebor-2
On 6/24/19 6:11 AM, Richard Biener wrote:

> On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor <[hidden email]> wrote:
>>
>> On 6/21/19 6:06 AM, Richard Biener wrote:
>>> On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor <[hidden email]> wrote:
>>>>
>>>> Bug 90923 shows that even though GCC hash-table based containers
>>>> like hash_map can be instantiated on types with user-defined ctors
>>>> and dtors they invoke the dtors of such types without invoking
>>>> the corresponding ctors.
>>>>
>>>> It was thanks to this bug that I spent a day debugging "interesting"
>>>> miscompilations during GCC bootstrap (in fairness, it was that and
>>>> bug 90904 about auto_vec copy assignment/construction also being
>>>> hosed even for POD types).
>>>>
>>>> The attached patch corrects the hash_map and hash_set templates
>>>> to invoke the ctors of the elements they insert and makes them
>>>> (hopefully) safe to use with non-trivial user-defined types.
>>>
>>> Hum.  I am worried about the difference of assignment vs. construction
>>> in ::put()
>>>
>>> +      bool ins = hash_entry::is_empty (*e);
>>> +      if (ins)
>>> +       {
>>> +         e->m_key = k;
>>> +         new ((void *) &e->m_value) Value (v);
>>> +       }
>>> +      else
>>> +       e->m_value = v;
>>>
>>> why not invoke the dtor on the old value and then the ctor again?
>>
>> It wouldn't work for self-assignment:
>>
>>     Value &y = m.get_or_insert (key);
>>     m.put (key, y);
>>
>> The usual rule of thumb for writes into containers is to use
>> construction when creating a new element and assignment when
>> replacing the value of an existing element.
>>
>> Which reminds me that the hash containers, despite being copy-
>> constructible (at least for POD types, they aren't for user-
>> defined types), also aren't safe for assignment even for PODs.
>> I opened bug 90959 for this.  Until the assignment is fixed
>> I made it inaccessibe in the patch (I have fixed the copy
>> ctor to DTRT for non-PODs).
>>
>>> How is an empty hash_entry constructed?
>>
>> In hash_table::find_slot_with_hash simply by finding an empty
>> slot and returning a pointer to it.  The memory for the slot
>> is marked "empty" by calling the Traits::mark_empty() function.
>>
>> The full type of hash_map<void*, Value> is actually
>>
>>     hash_map<void*,
>>              Value,
>>              simple_hashmap_traits<default_hash_traits<void*>,
>>                                    Value>
>>
>> and simple_hashmap_traits delegates it to default_hash_traits
>> whose mark_empty() just clears the void*, leaving the Value
>> part uninitialized.  That makes sense because we don't want
>> to call ctors for empty entries.  I think the questions one
>> might ask if one were to extend the design are: a) what class
>> should invoke the ctor/assignment and b) should it do it
>> directly or via the traits?
>>
>>> ::remove() doesn't
>>> seem to invoke the dtor either, instead it relies on the
>>> traits::remove function?
>>
>> Yes.  There is no Traits::construct or assign or copy.  We
>> could add them but I'm not sure I see to what end (there could
>> be use cases, I just don't know enough about these classes to
>> think of any).
>>
>> Attached is an updated patch with the additional minor fixes
>> mentioned above.
>>
>> Martin
>>
>> PS I think we would get much better results by adopting
>> the properly designed and tested standard library containers
>> than by spending time trying to improve the design of these
>> legacy classes.  For simple uses that don't need to integrate
>> with the GC machinery the standard containers should be fine
>> (plus, it'd provide us with greater motivation to improve
>> them and the code GCC emits for their uses).  Unfortunately,
>> to be able to use the hash-based containers we would need to
>> upgrade to C++ 11.  Isn't it time yet?
>
> I don't think so.  I'm also not sure if C++11 on its own is desirable
> or if it should be C++14 or later at that point.  SLES 12 has GCC 4.8
> as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7.
> SLES 11 already struggles currently (GCC 4.3) but I'd no longer
> consider that important enough.
>
> Note any such change to libstdc++ containers should be complete
> and come with both code-size and compile-time and memory-usage
> measurements (both of GCC and other apps of course).

Can I go ahead and commit the patch?

Martin
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)

Richard Biener-2
On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor <[hidden email]> wrote:

>
> On 6/24/19 6:11 AM, Richard Biener wrote:
> > On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor <[hidden email]> wrote:
> >>
> >> On 6/21/19 6:06 AM, Richard Biener wrote:
> >>> On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor <[hidden email]> wrote:
> >>>>
> >>>> Bug 90923 shows that even though GCC hash-table based containers
> >>>> like hash_map can be instantiated on types with user-defined ctors
> >>>> and dtors they invoke the dtors of such types without invoking
> >>>> the corresponding ctors.
> >>>>
> >>>> It was thanks to this bug that I spent a day debugging "interesting"
> >>>> miscompilations during GCC bootstrap (in fairness, it was that and
> >>>> bug 90904 about auto_vec copy assignment/construction also being
> >>>> hosed even for POD types).
> >>>>
> >>>> The attached patch corrects the hash_map and hash_set templates
> >>>> to invoke the ctors of the elements they insert and makes them
> >>>> (hopefully) safe to use with non-trivial user-defined types.
> >>>
> >>> Hum.  I am worried about the difference of assignment vs. construction
> >>> in ::put()
> >>>
> >>> +      bool ins = hash_entry::is_empty (*e);
> >>> +      if (ins)
> >>> +       {
> >>> +         e->m_key = k;
> >>> +         new ((void *) &e->m_value) Value (v);
> >>> +       }
> >>> +      else
> >>> +       e->m_value = v;
> >>>
> >>> why not invoke the dtor on the old value and then the ctor again?
> >>
> >> It wouldn't work for self-assignment:
> >>
> >>     Value &y = m.get_or_insert (key);
> >>     m.put (key, y);
> >>
> >> The usual rule of thumb for writes into containers is to use
> >> construction when creating a new element and assignment when
> >> replacing the value of an existing element.
> >>
> >> Which reminds me that the hash containers, despite being copy-
> >> constructible (at least for POD types, they aren't for user-
> >> defined types), also aren't safe for assignment even for PODs.
> >> I opened bug 90959 for this.  Until the assignment is fixed
> >> I made it inaccessibe in the patch (I have fixed the copy
> >> ctor to DTRT for non-PODs).
> >>
> >>> How is an empty hash_entry constructed?
> >>
> >> In hash_table::find_slot_with_hash simply by finding an empty
> >> slot and returning a pointer to it.  The memory for the slot
> >> is marked "empty" by calling the Traits::mark_empty() function.
> >>
> >> The full type of hash_map<void*, Value> is actually
> >>
> >>     hash_map<void*,
> >>              Value,
> >>              simple_hashmap_traits<default_hash_traits<void*>,
> >>                                    Value>
> >>
> >> and simple_hashmap_traits delegates it to default_hash_traits
> >> whose mark_empty() just clears the void*, leaving the Value
> >> part uninitialized.  That makes sense because we don't want
> >> to call ctors for empty entries.  I think the questions one
> >> might ask if one were to extend the design are: a) what class
> >> should invoke the ctor/assignment and b) should it do it
> >> directly or via the traits?
> >>
> >>> ::remove() doesn't
> >>> seem to invoke the dtor either, instead it relies on the
> >>> traits::remove function?
> >>
> >> Yes.  There is no Traits::construct or assign or copy.  We
> >> could add them but I'm not sure I see to what end (there could
> >> be use cases, I just don't know enough about these classes to
> >> think of any).
> >>
> >> Attached is an updated patch with the additional minor fixes
> >> mentioned above.
> >>
> >> Martin
> >>
> >> PS I think we would get much better results by adopting
> >> the properly designed and tested standard library containers
> >> than by spending time trying to improve the design of these
> >> legacy classes.  For simple uses that don't need to integrate
> >> with the GC machinery the standard containers should be fine
> >> (plus, it'd provide us with greater motivation to improve
> >> them and the code GCC emits for their uses).  Unfortunately,
> >> to be able to use the hash-based containers we would need to
> >> upgrade to C++ 11.  Isn't it time yet?
> >
> > I don't think so.  I'm also not sure if C++11 on its own is desirable
> > or if it should be C++14 or later at that point.  SLES 12 has GCC 4.8
> > as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7.
> > SLES 11 already struggles currently (GCC 4.3) but I'd no longer
> > consider that important enough.
> >
> > Note any such change to libstdc++ containers should be complete
> > and come with both code-size and compile-time and memory-usage
> > measurements (both of GCC and other apps of course).
>
> Can I go ahead and commit the patch?

I think we need to document the requirements on Value classes better.

@@ -177,7 +185,10 @@ public:
                                                   INSERT);
       bool ins = Traits::is_empty (*e);
       if (ins)
-       e->m_key = k;
+       {
+         e->m_key = k;
+         new ((void *)&e->m_value) Value ();
+       }

this now requires a default constructor and I always forget about
differences between the different form of initializations -- for a POD,
does this zero the entry?

Otherwise looks OK to me - I was hoping Jonathan would chime in here.

Richard.

> Martin
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)

Martin Sebor-2
On 6/24/19 11:42 AM, Richard Biener wrote:

> On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor <[hidden email]> wrote:
>>
>> On 6/24/19 6:11 AM, Richard Biener wrote:
>>> On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor <[hidden email]> wrote:
>>>>
>>>> On 6/21/19 6:06 AM, Richard Biener wrote:
>>>>> On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor <[hidden email]> wrote:
>>>>>>
>>>>>> Bug 90923 shows that even though GCC hash-table based containers
>>>>>> like hash_map can be instantiated on types with user-defined ctors
>>>>>> and dtors they invoke the dtors of such types without invoking
>>>>>> the corresponding ctors.
>>>>>>
>>>>>> It was thanks to this bug that I spent a day debugging "interesting"
>>>>>> miscompilations during GCC bootstrap (in fairness, it was that and
>>>>>> bug 90904 about auto_vec copy assignment/construction also being
>>>>>> hosed even for POD types).
>>>>>>
>>>>>> The attached patch corrects the hash_map and hash_set templates
>>>>>> to invoke the ctors of the elements they insert and makes them
>>>>>> (hopefully) safe to use with non-trivial user-defined types.
>>>>>
>>>>> Hum.  I am worried about the difference of assignment vs. construction
>>>>> in ::put()
>>>>>
>>>>> +      bool ins = hash_entry::is_empty (*e);
>>>>> +      if (ins)
>>>>> +       {
>>>>> +         e->m_key = k;
>>>>> +         new ((void *) &e->m_value) Value (v);
>>>>> +       }
>>>>> +      else
>>>>> +       e->m_value = v;
>>>>>
>>>>> why not invoke the dtor on the old value and then the ctor again?
>>>>
>>>> It wouldn't work for self-assignment:
>>>>
>>>>      Value &y = m.get_or_insert (key);
>>>>      m.put (key, y);
>>>>
>>>> The usual rule of thumb for writes into containers is to use
>>>> construction when creating a new element and assignment when
>>>> replacing the value of an existing element.
>>>>
>>>> Which reminds me that the hash containers, despite being copy-
>>>> constructible (at least for POD types, they aren't for user-
>>>> defined types), also aren't safe for assignment even for PODs.
>>>> I opened bug 90959 for this.  Until the assignment is fixed
>>>> I made it inaccessibe in the patch (I have fixed the copy
>>>> ctor to DTRT for non-PODs).
>>>>
>>>>> How is an empty hash_entry constructed?
>>>>
>>>> In hash_table::find_slot_with_hash simply by finding an empty
>>>> slot and returning a pointer to it.  The memory for the slot
>>>> is marked "empty" by calling the Traits::mark_empty() function.
>>>>
>>>> The full type of hash_map<void*, Value> is actually
>>>>
>>>>      hash_map<void*,
>>>>               Value,
>>>>               simple_hashmap_traits<default_hash_traits<void*>,
>>>>                                     Value>
>>>>
>>>> and simple_hashmap_traits delegates it to default_hash_traits
>>>> whose mark_empty() just clears the void*, leaving the Value
>>>> part uninitialized.  That makes sense because we don't want
>>>> to call ctors for empty entries.  I think the questions one
>>>> might ask if one were to extend the design are: a) what class
>>>> should invoke the ctor/assignment and b) should it do it
>>>> directly or via the traits?
>>>>
>>>>> ::remove() doesn't
>>>>> seem to invoke the dtor either, instead it relies on the
>>>>> traits::remove function?
>>>>
>>>> Yes.  There is no Traits::construct or assign or copy.  We
>>>> could add them but I'm not sure I see to what end (there could
>>>> be use cases, I just don't know enough about these classes to
>>>> think of any).
>>>>
>>>> Attached is an updated patch with the additional minor fixes
>>>> mentioned above.
>>>>
>>>> Martin
>>>>
>>>> PS I think we would get much better results by adopting
>>>> the properly designed and tested standard library containers
>>>> than by spending time trying to improve the design of these
>>>> legacy classes.  For simple uses that don't need to integrate
>>>> with the GC machinery the standard containers should be fine
>>>> (plus, it'd provide us with greater motivation to improve
>>>> them and the code GCC emits for their uses).  Unfortunately,
>>>> to be able to use the hash-based containers we would need to
>>>> upgrade to C++ 11.  Isn't it time yet?
>>>
>>> I don't think so.  I'm also not sure if C++11 on its own is desirable
>>> or if it should be C++14 or later at that point.  SLES 12 has GCC 4.8
>>> as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7.
>>> SLES 11 already struggles currently (GCC 4.3) but I'd no longer
>>> consider that important enough.
>>>
>>> Note any such change to libstdc++ containers should be complete
>>> and come with both code-size and compile-time and memory-usage
>>> measurements (both of GCC and other apps of course).
>>
>> Can I go ahead and commit the patch?
>
> I think we need to document the requirements on Value classes better.
>
> @@ -177,7 +185,10 @@ public:
>                                                     INSERT);
>         bool ins = Traits::is_empty (*e);
>         if (ins)
> -       e->m_key = k;
> +       {
> +         e->m_key = k;
> +         new ((void *)&e->m_value) Value ();
> +       }
>
> this now requires a default constructor and I always forget about
> differences between the different form of initializations -- for a POD,
> does this zero the entry?

Yes, the /value initialization/ above (via Value ()) zeroes out
PODs, and it doesn't need a ctor.  The placement new syntax should
not require any changes to existing code.

Does this updated comment for hash_map make it clearer?

   /* KeyId must be a trivial (POD) type.  Value may be non-trivial
      (non-POD).  Inserted elements are value-initialized either to
      zero for POD types or by invoking their default ctor.  Removed
      elements are destroyed by invoking their dtor.   On hash_map
      destruction all elements are removed.  */

Martin

> Otherwise looks OK to me - I was hoping Jonathan would chime in here.
>
> Richard.
>
>> Martin

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)

Jonathan Wakely-3
In reply to this post by Richard Biener-2
On 24/06/19 19:42 +0200, Richard Biener wrote:

>On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor <[hidden email]> wrote:
>>
>> On 6/24/19 6:11 AM, Richard Biener wrote:
>> > On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor <[hidden email]> wrote:
>> >>
>> >> On 6/21/19 6:06 AM, Richard Biener wrote:
>> >>> On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor <[hidden email]> wrote:
>> >>>>
>> >>>> Bug 90923 shows that even though GCC hash-table based containers
>> >>>> like hash_map can be instantiated on types with user-defined ctors
>> >>>> and dtors they invoke the dtors of such types without invoking
>> >>>> the corresponding ctors.
>> >>>>
>> >>>> It was thanks to this bug that I spent a day debugging "interesting"
>> >>>> miscompilations during GCC bootstrap (in fairness, it was that and
>> >>>> bug 90904 about auto_vec copy assignment/construction also being
>> >>>> hosed even for POD types).
>> >>>>
>> >>>> The attached patch corrects the hash_map and hash_set templates
>> >>>> to invoke the ctors of the elements they insert and makes them
>> >>>> (hopefully) safe to use with non-trivial user-defined types.
>> >>>
>> >>> Hum.  I am worried about the difference of assignment vs. construction
>> >>> in ::put()
>> >>>
>> >>> +      bool ins = hash_entry::is_empty (*e);
>> >>> +      if (ins)
>> >>> +       {
>> >>> +         e->m_key = k;
>> >>> +         new ((void *) &e->m_value) Value (v);
>> >>> +       }
>> >>> +      else
>> >>> +       e->m_value = v;
>> >>>
>> >>> why not invoke the dtor on the old value and then the ctor again?
>> >>
>> >> It wouldn't work for self-assignment:
>> >>
>> >>     Value &y = m.get_or_insert (key);
>> >>     m.put (key, y);
>> >>
>> >> The usual rule of thumb for writes into containers is to use
>> >> construction when creating a new element and assignment when
>> >> replacing the value of an existing element.
>> >>
>> >> Which reminds me that the hash containers, despite being copy-
>> >> constructible (at least for POD types, they aren't for user-
>> >> defined types), also aren't safe for assignment even for PODs.
>> >> I opened bug 90959 for this.  Until the assignment is fixed
>> >> I made it inaccessibe in the patch (I have fixed the copy
>> >> ctor to DTRT for non-PODs).
>> >>
>> >>> How is an empty hash_entry constructed?
>> >>
>> >> In hash_table::find_slot_with_hash simply by finding an empty
>> >> slot and returning a pointer to it.  The memory for the slot
>> >> is marked "empty" by calling the Traits::mark_empty() function.
>> >>
>> >> The full type of hash_map<void*, Value> is actually
>> >>
>> >>     hash_map<void*,
>> >>              Value,
>> >>              simple_hashmap_traits<default_hash_traits<void*>,
>> >>                                    Value>
>> >>
>> >> and simple_hashmap_traits delegates it to default_hash_traits
>> >> whose mark_empty() just clears the void*, leaving the Value
>> >> part uninitialized.  That makes sense because we don't want
>> >> to call ctors for empty entries.  I think the questions one
>> >> might ask if one were to extend the design are: a) what class
>> >> should invoke the ctor/assignment and b) should it do it
>> >> directly or via the traits?
>> >>
>> >>> ::remove() doesn't
>> >>> seem to invoke the dtor either, instead it relies on the
>> >>> traits::remove function?
>> >>
>> >> Yes.  There is no Traits::construct or assign or copy.  We
>> >> could add them but I'm not sure I see to what end (there could
>> >> be use cases, I just don't know enough about these classes to
>> >> think of any).
>> >>
>> >> Attached is an updated patch with the additional minor fixes
>> >> mentioned above.
>> >>
>> >> Martin
>> >>
>> >> PS I think we would get much better results by adopting
>> >> the properly designed and tested standard library containers
>> >> than by spending time trying to improve the design of these
>> >> legacy classes.  For simple uses that don't need to integrate
>> >> with the GC machinery the standard containers should be fine
>> >> (plus, it'd provide us with greater motivation to improve
>> >> them and the code GCC emits for their uses).  Unfortunately,
>> >> to be able to use the hash-based containers we would need to
>> >> upgrade to C++ 11.  Isn't it time yet?
>> >
>> > I don't think so.  I'm also not sure if C++11 on its own is desirable
>> > or if it should be C++14 or later at that point.  SLES 12 has GCC 4.8
>> > as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7.
>> > SLES 11 already struggles currently (GCC 4.3) but I'd no longer
>> > consider that important enough.
>> >
>> > Note any such change to libstdc++ containers should be complete
>> > and come with both code-size and compile-time and memory-usage
>> > measurements (both of GCC and other apps of course).
>>
>> Can I go ahead and commit the patch?
>
>I think we need to document the requirements on Value classes better.
>
>@@ -177,7 +185,10 @@ public:
>                                                   INSERT);
>       bool ins = Traits::is_empty (*e);
>       if (ins)
>-       e->m_key = k;
>+       {
>+         e->m_key = k;
>+         new ((void *)&e->m_value) Value ();
>+       }
>
>this now requires a default constructor and I always forget about
>differences between the different form of initializations -- for a POD,
>does this zero the entry?
>
>Otherwise looks OK to me - I was hoping Jonathan would chime in here.

The patch looks good to me. I 100% agree with Martin that put() should
not destroy an existing element and recreate a new one. Assignment is
the right way to update the value.

And Value() is the correct initialization.

The only change I'd suggest is something to enforce the "KeyId must be
a trivial (POD) type" requirement:

#if __GNUC__ >= 6 && __cplusplus >= 201103L
static_assert(__is_pod(KeyId), "KeyId must be a trivial (POD) type");
#endif

This could actually be added for 4.7 and up (__is_pod is available
earlier, but __cplusplus isn't set correctly before that) but GCC 6 is
when we started to default to C++14. If the check only happens for
people bootstrapping with new versions of GCC it's still going to
catch misuses with non-POD types.


Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)

Martin Sebor-2
On 6/25/19 3:53 AM, Jonathan Wakely wrote:

> On 24/06/19 19:42 +0200, Richard Biener wrote:
>> On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor <[hidden email]> wrote:
>>>
>>> On 6/24/19 6:11 AM, Richard Biener wrote:
>>> > On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor <[hidden email]> wrote:
>>> >>
>>> >> On 6/21/19 6:06 AM, Richard Biener wrote:
>>> >>> On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor <[hidden email]>
>>> wrote:
>>> >>>>
>>> >>>> Bug 90923 shows that even though GCC hash-table based containers
>>> >>>> like hash_map can be instantiated on types with user-defined ctors
>>> >>>> and dtors they invoke the dtors of such types without invoking
>>> >>>> the corresponding ctors.
>>> >>>>
>>> >>>> It was thanks to this bug that I spent a day debugging
>>> "interesting"
>>> >>>> miscompilations during GCC bootstrap (in fairness, it was that and
>>> >>>> bug 90904 about auto_vec copy assignment/construction also being
>>> >>>> hosed even for POD types).
>>> >>>>
>>> >>>> The attached patch corrects the hash_map and hash_set templates
>>> >>>> to invoke the ctors of the elements they insert and makes them
>>> >>>> (hopefully) safe to use with non-trivial user-defined types.
>>> >>>
>>> >>> Hum.  I am worried about the difference of assignment vs.
>>> construction
>>> >>> in ::put()
>>> >>>
>>> >>> +      bool ins = hash_entry::is_empty (*e);
>>> >>> +      if (ins)
>>> >>> +       {
>>> >>> +         e->m_key = k;
>>> >>> +         new ((void *) &e->m_value) Value (v);
>>> >>> +       }
>>> >>> +      else
>>> >>> +       e->m_value = v;
>>> >>>
>>> >>> why not invoke the dtor on the old value and then the ctor again?
>>> >>
>>> >> It wouldn't work for self-assignment:
>>> >>
>>> >>     Value &y = m.get_or_insert (key);
>>> >>     m.put (key, y);
>>> >>
>>> >> The usual rule of thumb for writes into containers is to use
>>> >> construction when creating a new element and assignment when
>>> >> replacing the value of an existing element.
>>> >>
>>> >> Which reminds me that the hash containers, despite being copy-
>>> >> constructible (at least for POD types, they aren't for user-
>>> >> defined types), also aren't safe for assignment even for PODs.
>>> >> I opened bug 90959 for this.  Until the assignment is fixed
>>> >> I made it inaccessibe in the patch (I have fixed the copy
>>> >> ctor to DTRT for non-PODs).
>>> >>
>>> >>> How is an empty hash_entry constructed?
>>> >>
>>> >> In hash_table::find_slot_with_hash simply by finding an empty
>>> >> slot and returning a pointer to it.  The memory for the slot
>>> >> is marked "empty" by calling the Traits::mark_empty() function.
>>> >>
>>> >> The full type of hash_map<void*, Value> is actually
>>> >>
>>> >>     hash_map<void*,
>>> >>              Value,
>>> >>              simple_hashmap_traits<default_hash_traits<void*>,
>>> >>                                    Value>
>>> >>
>>> >> and simple_hashmap_traits delegates it to default_hash_traits
>>> >> whose mark_empty() just clears the void*, leaving the Value
>>> >> part uninitialized.  That makes sense because we don't want
>>> >> to call ctors for empty entries.  I think the questions one
>>> >> might ask if one were to extend the design are: a) what class
>>> >> should invoke the ctor/assignment and b) should it do it
>>> >> directly or via the traits?
>>> >>
>>> >>> ::remove() doesn't
>>> >>> seem to invoke the dtor either, instead it relies on the
>>> >>> traits::remove function?
>>> >>
>>> >> Yes.  There is no Traits::construct or assign or copy.  We
>>> >> could add them but I'm not sure I see to what end (there could
>>> >> be use cases, I just don't know enough about these classes to
>>> >> think of any).
>>> >>
>>> >> Attached is an updated patch with the additional minor fixes
>>> >> mentioned above.
>>> >>
>>> >> Martin
>>> >>
>>> >> PS I think we would get much better results by adopting
>>> >> the properly designed and tested standard library containers
>>> >> than by spending time trying to improve the design of these
>>> >> legacy classes.  For simple uses that don't need to integrate
>>> >> with the GC machinery the standard containers should be fine
>>> >> (plus, it'd provide us with greater motivation to improve
>>> >> them and the code GCC emits for their uses).  Unfortunately,
>>> >> to be able to use the hash-based containers we would need to
>>> >> upgrade to C++ 11.  Isn't it time yet?
>>> >
>>> > I don't think so.  I'm also not sure if C++11 on its own is desirable
>>> > or if it should be C++14 or later at that point.  SLES 12 has GCC 4.8
>>> > as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7.
>>> > SLES 11 already struggles currently (GCC 4.3) but I'd no longer
>>> > consider that important enough.
>>> >
>>> > Note any such change to libstdc++ containers should be complete
>>> > and come with both code-size and compile-time and memory-usage
>>> > measurements (both of GCC and other apps of course).
>>>
>>> Can I go ahead and commit the patch?
>>
>> I think we need to document the requirements on Value classes better.
>>
>> @@ -177,7 +185,10 @@ public:
>>                                                   INSERT);
>>       bool ins = Traits::is_empty (*e);
>>       if (ins)
>> -       e->m_key = k;
>> +       {
>> +         e->m_key = k;
>> +         new ((void *)&e->m_value) Value ();
>> +       }
>>
>> this now requires a default constructor and I always forget about
>> differences between the different form of initializations -- for a POD,
>> does this zero the entry?
>>
>> Otherwise looks OK to me - I was hoping Jonathan would chime in here.
>
> The patch looks good to me. I 100% agree with Martin that put() should
> not destroy an existing element and recreate a new one. Assignment is
> the right way to update the value.
>
> And Value() is the correct initialization.
>
> The only change I'd suggest is something to enforce the "KeyId must be
> a trivial (POD) type" requirement:
>
> #if __GNUC__ >= 6 && __cplusplus >= 201103L
> static_assert(__is_pod(KeyId), "KeyId must be a trivial (POD) type");
> #endif
>
> This could actually be added for 4.7 and up (__is_pod is available
> earlier, but __cplusplus isn't set correctly before that) but GCC 6 is
> when we started to default to C++14. If the check only happens for
> people bootstrapping with new versions of GCC it's still going to
> catch misuses with non-POD types.
I've updated the comments explaining the constraints in more detail
and added the static assert to hash_table where it covers both
has_map and hash_set.  FWIW, I don't imagine anyone instantiating
these containers on a non-trivial key types in GCC but having
the assert there doesn't hurt anything.

Martin


gcc-90923.diff (11K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)

Richard Biener-2
On Mon, Jul 1, 2019 at 4:55 PM Martin Sebor <[hidden email]> wrote:>
> [Adding gcc-patches]
>
> Richard, do you have any further comments or is the revised patch
> good to commit?

No further comments from my side - it's good to commit.

Richard.

> Martin
>
> On 6/25/19 2:30 PM, Martin Sebor wrote:
> > On 6/25/19 3:53 AM, Jonathan Wakely wrote:
> >> On 24/06/19 19:42 +0200, Richard Biener wrote:
> >>> On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor <[hidden email]> wrote:
> >>>>
> >>>> On 6/24/19 6:11 AM, Richard Biener wrote:
> >>>> > On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor <[hidden email]>
> >>>> wrote:
> >>>> >>
> >>>> >> On 6/21/19 6:06 AM, Richard Biener wrote:
> >>>> >>> On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor <[hidden email]>
> >>>> wrote:
> >>>> >>>>
> >>>> >>>> Bug 90923 shows that even though GCC hash-table based containers
> >>>> >>>> like hash_map can be instantiated on types with user-defined ctors
> >>>> >>>> and dtors they invoke the dtors of such types without invoking
> >>>> >>>> the corresponding ctors.
> >>>> >>>>
> >>>> >>>> It was thanks to this bug that I spent a day debugging
> >>>> "interesting"
> >>>> >>>> miscompilations during GCC bootstrap (in fairness, it was that and
> >>>> >>>> bug 90904 about auto_vec copy assignment/construction also being
> >>>> >>>> hosed even for POD types).
> >>>> >>>>
> >>>> >>>> The attached patch corrects the hash_map and hash_set templates
> >>>> >>>> to invoke the ctors of the elements they insert and makes them
> >>>> >>>> (hopefully) safe to use with non-trivial user-defined types.
> >>>> >>>
> >>>> >>> Hum.  I am worried about the difference of assignment vs.
> >>>> construction
> >>>> >>> in ::put()
> >>>> >>>
> >>>> >>> +      bool ins = hash_entry::is_empty (*e);
> >>>> >>> +      if (ins)
> >>>> >>> +       {
> >>>> >>> +         e->m_key = k;
> >>>> >>> +         new ((void *) &e->m_value) Value (v);
> >>>> >>> +       }
> >>>> >>> +      else
> >>>> >>> +       e->m_value = v;
> >>>> >>>
> >>>> >>> why not invoke the dtor on the old value and then the ctor again?
> >>>> >>
> >>>> >> It wouldn't work for self-assignment:
> >>>> >>
> >>>> >>     Value &y = m.get_or_insert (key);
> >>>> >>     m.put (key, y);
> >>>> >>
> >>>> >> The usual rule of thumb for writes into containers is to use
> >>>> >> construction when creating a new element and assignment when
> >>>> >> replacing the value of an existing element.
> >>>> >>
> >>>> >> Which reminds me that the hash containers, despite being copy-
> >>>> >> constructible (at least for POD types, they aren't for user-
> >>>> >> defined types), also aren't safe for assignment even for PODs.
> >>>> >> I opened bug 90959 for this.  Until the assignment is fixed
> >>>> >> I made it inaccessibe in the patch (I have fixed the copy
> >>>> >> ctor to DTRT for non-PODs).
> >>>> >>
> >>>> >>> How is an empty hash_entry constructed?
> >>>> >>
> >>>> >> In hash_table::find_slot_with_hash simply by finding an empty
> >>>> >> slot and returning a pointer to it.  The memory for the slot
> >>>> >> is marked "empty" by calling the Traits::mark_empty() function.
> >>>> >>
> >>>> >> The full type of hash_map<void*, Value> is actually
> >>>> >>
> >>>> >>     hash_map<void*,
> >>>> >>              Value,
> >>>> >>              simple_hashmap_traits<default_hash_traits<void*>,
> >>>> >>                                    Value>
> >>>> >>
> >>>> >> and simple_hashmap_traits delegates it to default_hash_traits
> >>>> >> whose mark_empty() just clears the void*, leaving the Value
> >>>> >> part uninitialized.  That makes sense because we don't want
> >>>> >> to call ctors for empty entries.  I think the questions one
> >>>> >> might ask if one were to extend the design are: a) what class
> >>>> >> should invoke the ctor/assignment and b) should it do it
> >>>> >> directly or via the traits?
> >>>> >>
> >>>> >>> ::remove() doesn't
> >>>> >>> seem to invoke the dtor either, instead it relies on the
> >>>> >>> traits::remove function?
> >>>> >>
> >>>> >> Yes.  There is no Traits::construct or assign or copy.  We
> >>>> >> could add them but I'm not sure I see to what end (there could
> >>>> >> be use cases, I just don't know enough about these classes to
> >>>> >> think of any).
> >>>> >>
> >>>> >> Attached is an updated patch with the additional minor fixes
> >>>> >> mentioned above.
> >>>> >>
> >>>> >> Martin
> >>>> >>
> >>>> >> PS I think we would get much better results by adopting
> >>>> >> the properly designed and tested standard library containers
> >>>> >> than by spending time trying to improve the design of these
> >>>> >> legacy classes.  For simple uses that don't need to integrate
> >>>> >> with the GC machinery the standard containers should be fine
> >>>> >> (plus, it'd provide us with greater motivation to improve
> >>>> >> them and the code GCC emits for their uses).  Unfortunately,
> >>>> >> to be able to use the hash-based containers we would need to
> >>>> >> upgrade to C++ 11.  Isn't it time yet?
> >>>> >
> >>>> > I don't think so.  I'm also not sure if C++11 on its own is desirable
> >>>> > or if it should be C++14 or later at that point.  SLES 12 has GCC 4.8
> >>>> > as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7.
> >>>> > SLES 11 already struggles currently (GCC 4.3) but I'd no longer
> >>>> > consider that important enough.
> >>>> >
> >>>> > Note any such change to libstdc++ containers should be complete
> >>>> > and come with both code-size and compile-time and memory-usage
> >>>> > measurements (both of GCC and other apps of course).
> >>>>
> >>>> Can I go ahead and commit the patch?
> >>>
> >>> I think we need to document the requirements on Value classes better.
> >>>
> >>> @@ -177,7 +185,10 @@ public:
> >>>                                                   INSERT);
> >>>       bool ins = Traits::is_empty (*e);
> >>>       if (ins)
> >>> -       e->m_key = k;
> >>> +       {
> >>> +         e->m_key = k;
> >>> +         new ((void *)&e->m_value) Value ();
> >>> +       }
> >>>
> >>> this now requires a default constructor and I always forget about
> >>> differences between the different form of initializations -- for a POD,
> >>> does this zero the entry?
> >>>
> >>> Otherwise looks OK to me - I was hoping Jonathan would chime in here.
> >>
> >> The patch looks good to me. I 100% agree with Martin that put() should
> >> not destroy an existing element and recreate a new one. Assignment is
> >> the right way to update the value.
> >>
> >> And Value() is the correct initialization.
> >>
> >> The only change I'd suggest is something to enforce the "KeyId must be
> >> a trivial (POD) type" requirement:
> >>
> >> #if __GNUC__ >= 6 && __cplusplus >= 201103L
> >> static_assert(__is_pod(KeyId), "KeyId must be a trivial (POD) type");
> >> #endif
> >>
> >> This could actually be added for 4.7 and up (__is_pod is available
> >> earlier, but __cplusplus isn't set correctly before that) but GCC 6 is
> >> when we started to default to C++14. If the check only happens for
> >> people bootstrapping with new versions of GCC it's still going to
> >> catch misuses with non-POD types.
> >
> > I've updated the comments explaining the constraints in more detail
> > and added the static assert to hash_table where it covers both
> > has_map and hash_set.  FWIW, I don't imagine anyone instantiating
> > these containers on a non-trivial key types in GCC but having
> > the assert there doesn't hurt anything.
> >
> > Martin
> >
>