copy/copy_backward/fill/fill_n/equal rework

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

copy/copy_backward/fill/fill_n/equal rework

François Dumont-2
Hi

     This patch improves stl_algobase.h
copy/copy_backward/fill/fill_n/equal implementations. The improvements are:

- activation of algo specialization for __gnu_debug::_Safe_iterator (w/o
_GLIBCXX_DEBUG mode)

- activation of algo specialization for _Deque_iterator even if mixed
with another kind of iterator.

- activation of algo specializations __copy_move_a2 for something else
than pointers. For example this code:

std::vector<char> v { 'a', 'b', .... };

ostreambuf_iterator out(std::cout);

std::copy(v.begin(), v.end(), out);

is not calling the specialization __copy_move_a2(const char*, const
char*, ostreambuf_iterator<>);

It also fix a _GLIBCXX_DEBUG issue where the __niter_base specialization
was wrongly removing the _Safe_iterator<> layer. The
testsuite/25_algorithms/copy/debug/1_neg.cc test case was failing on a
debug assertion because _after_ the copy we were trying to increment the
vector iterator after past-the-end. Of course the problem is the
_after_, Debug mode should detect this _before_ it takes place which it
does now.

Note that std::fill_n is now making use of std::fill for some
optimizations dealing with random access iterators.

Performances are very good:

Before:

copy_backward_deque_iterators.cc    deque 2 deque 1084r 1084u   
0s         0mem    0pf
copy_backward_deque_iterators.cc    deque 2 vector 3373r 3372u   
0s         0mem    0pf
copy_backward_deque_iterators.cc    vector 2 deque 3316r 3316u   
0s         0mem    0pf
copy_backward_deque_iterators.cc    int deque 2 char vector 3610r
3609u    0s         0mem    0pf
copy_backward_deque_iterators.cc    char vector 2 int deque 3552r
3552u    0s         0mem    0pf
copy_backward_deque_iterators.cc    deque 2 list 10528r 10528u   
0s         0mem    0pf
copy_backward_deque_iterators.cc    list 2 deque 2161r 2162u   
0s         0mem    0pf
copy_deque_iterators.cc      deque 2 deque                 752r 751u   
0s         0mem    0pf
copy_deque_iterators.cc      deque 2 vector               3300r 3299u   
0s         0mem    0pf
copy_deque_iterators.cc      vector 2 deque               3144r 3140u   
0s         0mem    0pf
copy_deque_iterators.cc      int deque 2 char vector      3340r 3338u   
1s         0mem    0pf
copy_deque_iterators.cc      char vector 2 int deque      3132r 3132u   
0s         0mem    0pf
copy_deque_iterators.cc      deque 2 list                 10013r
10012u    0s         0mem    0pf
copy_deque_iterators.cc      list 2 deque                 2274r 2275u   
0s         0mem    0pf
equal_deque_iterators.cc     deque vs deque               8676r 8675u   
0s         0mem    0pf
equal_deque_iterators.cc     deque vs vector              5870r 5870u   
0s         0mem    0pf
equal_deque_iterators.cc     vector vs deque              3163r 3163u   
0s         0mem    0pf
equal_deque_iterators.cc     int deque vs char vector     5845r 5845u   
0s         0mem    0pf
equal_deque_iterators.cc     char vector vs int deque     3307r 3307u   
0s         0mem    0pf

After:

copy_backward_deque_iterators.cc    deque 2 deque  697r  697u   
0s         0mem    0pf
copy_backward_deque_iterators.cc    deque 2 vector  219r  218u   
0s         0mem    0pf
copy_backward_deque_iterators.cc    vector 2 deque  453r  453u   
0s         0mem    0pf
copy_backward_deque_iterators.cc    int deque 2 char vector 1914r
1915u    0s         0mem    0pf
copy_backward_deque_iterators.cc    char vector 2 int deque 2112r
2111u    0s         0mem    0pf
copy_backward_deque_iterators.cc    deque 2 list 7770r 7771u   
0s         0mem    0pf
copy_backward_deque_iterators.cc    list 2 deque 2194r 2193u   
0s         0mem    0pf
copy_deque_iterators.cc      deque 2 deque                 505r 504u   
0s         0mem    0pf
copy_deque_iterators.cc      deque 2 vector                221r 221u   
0s         0mem    0pf
copy_deque_iterators.cc      vector 2 deque                398r 397u   
0s         0mem    0pf
copy_deque_iterators.cc      int deque 2 char vector      1770r 1767u   
0s         0mem    0pf
copy_deque_iterators.cc      char vector 2 int deque      1995r 1993u   
0s         0mem    0pf
copy_deque_iterators.cc      deque 2 list                 7650r 7641u   
2s         0mem    0pf
copy_deque_iterators.cc      list 2 deque                 2270r 2270u   
0s         0mem    0pf
equal_deque_iterators.cc     deque vs deque                769r 768u   
0s         0mem    0pf
equal_deque_iterators.cc     deque vs vector               231r 230u   
0s         0mem    0pf
equal_deque_iterators.cc     vector vs deque               397r 397u   
0s         0mem    0pf
equal_deque_iterators.cc     int deque vs char vector     1541r 1541u   
0s         0mem    0pf
equal_deque_iterators.cc     char vector vs int deque     1623r 1623u   
0s         0mem    0pf

In Debug Mode it is of course even better. I haven't had the patience to
run the benches before the patch, it just takes hours to run. So here is
just the After part:

copy_backward_deque_iterators.cc    deque 2 deque 1128r 1128u   
0s         0mem    0pf
copy_backward_deque_iterators.cc    deque 2 vector  616r  616u   
0s         0mem    0pf
copy_backward_deque_iterators.cc    vector 2 deque  856r  855u   
0s         0mem    0pf
copy_backward_deque_iterators.cc    int deque 2 char vector 2277r
2277u    0s         0mem    0pf
copy_backward_deque_iterators.cc    char vector 2 int deque 2518r
2519u    0s         0mem    0pf
copy_backward_deque_iterators.cc    deque 2 list 8029r 8028u   
0s         0mem    0pf
copy_backward_deque_iterators.cc    list 2 deque 10418r 10416u   
0s         0mem    0pf
copy_deque_iterators.cc      deque 2 deque                 931r 930u   
0s         0mem    0pf
copy_deque_iterators.cc      deque 2 vector                613r 613u   
0s         0mem    0pf
copy_deque_iterators.cc      vector 2 deque                794r 795u   
0s         0mem    0pf
copy_deque_iterators.cc      int deque 2 char vector      2192r 2192u   
0s         0mem    0pf
copy_deque_iterators.cc      char vector 2 int deque      2365r 2364u   
0s         0mem    0pf
copy_deque_iterators.cc      deque 2 list                 8009r 8010u   
0s         0mem    0pf
copy_deque_iterators.cc      list 2 deque                 10979r
10970u    1s         0mem    0pf
equal_deque_iterators.cc     deque vs deque               1034r 1032u   
0s         0mem    0pf
equal_deque_iterators.cc     deque vs vector               481r 482u   
0s         0mem    0pf
equal_deque_iterators.cc     vector vs deque               646r 646u   
0s         0mem    0pf
equal_deque_iterators.cc     int deque vs char vector     1802r 1802u   
0s         0mem    0pf
equal_deque_iterators.cc     char vector vs int deque     1867r 1865u   
0s         0mem    0pf

Note that copy/copy_backward 'list 2 deque' is still much slower because
for the moment we can't remove the debug layer. I plan to do a
refinement to also cover this use case.

In Debug implementations I have duplicated the Debug checks because
those algo specialization will also be used when users are directly
using <debug/vector> for instance without defining _GLIBCXX_DEBUG.

All algos tests passed except the constexpr ones in Debug mode, I've put
this on my todo list.

Ok to commit once I completed all the other tests ?

François


ChangeLog.entry (6K) Download Attachment
algos.patch (87K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: copy/copy_backward/fill/fill_n/equal rework

François Dumont-2
Ping ?

On 9/9/19 8:34 PM, François Dumont wrote:

> Hi
>
>     This patch improves stl_algobase.h
> copy/copy_backward/fill/fill_n/equal implementations. The improvements
> are:
>
> - activation of algo specialization for __gnu_debug::_Safe_iterator
> (w/o _GLIBCXX_DEBUG mode)
>
> - activation of algo specialization for _Deque_iterator even if mixed
> with another kind of iterator.
>
> - activation of algo specializations __copy_move_a2 for something else
> than pointers. For example this code:
>
> std::vector<char> v { 'a', 'b', .... };
>
> ostreambuf_iterator out(std::cout);
>
> std::copy(v.begin(), v.end(), out);
>
> is not calling the specialization __copy_move_a2(const char*, const
> char*, ostreambuf_iterator<>);
>
> It also fix a _GLIBCXX_DEBUG issue where the __niter_base
> specialization was wrongly removing the _Safe_iterator<> layer. The
> testsuite/25_algorithms/copy/debug/1_neg.cc test case was failing on a
> debug assertion because _after_ the copy we were trying to increment
> the vector iterator after past-the-end. Of course the problem is the
> _after_, Debug mode should detect this _before_ it takes place which
> it does now.
>
> Note that std::fill_n is now making use of std::fill for some
> optimizations dealing with random access iterators.
>
> Performances are very good:
>
> Before:
>
> copy_backward_deque_iterators.cc    deque 2 deque 1084r 1084u
> 0s         0mem    0pf
> copy_backward_deque_iterators.cc    deque 2 vector 3373r 3372u
> 0s         0mem    0pf
> copy_backward_deque_iterators.cc    vector 2 deque 3316r 3316u
> 0s         0mem    0pf
> copy_backward_deque_iterators.cc    int deque 2 char vector 3610r
> 3609u    0s         0mem    0pf
> copy_backward_deque_iterators.cc    char vector 2 int deque 3552r
> 3552u    0s         0mem    0pf
> copy_backward_deque_iterators.cc    deque 2 list 10528r 10528u
> 0s         0mem    0pf
> copy_backward_deque_iterators.cc    list 2 deque 2161r 2162u
> 0s         0mem    0pf
> copy_deque_iterators.cc      deque 2 deque                 752r
> 751u    0s         0mem    0pf
> copy_deque_iterators.cc      deque 2 vector               3300r
> 3299u    0s         0mem    0pf
> copy_deque_iterators.cc      vector 2 deque               3144r
> 3140u    0s         0mem    0pf
> copy_deque_iterators.cc      int deque 2 char vector      3340r
> 3338u    1s         0mem    0pf
> copy_deque_iterators.cc      char vector 2 int deque      3132r
> 3132u    0s         0mem    0pf
> copy_deque_iterators.cc      deque 2 list                 10013r
> 10012u    0s         0mem    0pf
> copy_deque_iterators.cc      list 2 deque                 2274r
> 2275u    0s         0mem    0pf
> equal_deque_iterators.cc     deque vs deque               8676r
> 8675u    0s         0mem    0pf
> equal_deque_iterators.cc     deque vs vector              5870r
> 5870u    0s         0mem    0pf
> equal_deque_iterators.cc     vector vs deque              3163r
> 3163u    0s         0mem    0pf
> equal_deque_iterators.cc     int deque vs char vector     5845r
> 5845u    0s         0mem    0pf
> equal_deque_iterators.cc     char vector vs int deque     3307r
> 3307u    0s         0mem    0pf
>
> After:
>
> copy_backward_deque_iterators.cc    deque 2 deque  697r  697u
> 0s         0mem    0pf
> copy_backward_deque_iterators.cc    deque 2 vector  219r  218u
> 0s         0mem    0pf
> copy_backward_deque_iterators.cc    vector 2 deque  453r  453u
> 0s         0mem    0pf
> copy_backward_deque_iterators.cc    int deque 2 char vector 1914r
> 1915u    0s         0mem    0pf
> copy_backward_deque_iterators.cc    char vector 2 int deque 2112r
> 2111u    0s         0mem    0pf
> copy_backward_deque_iterators.cc    deque 2 list 7770r 7771u
> 0s         0mem    0pf
> copy_backward_deque_iterators.cc    list 2 deque 2194r 2193u
> 0s         0mem    0pf
> copy_deque_iterators.cc      deque 2 deque                 505r
> 504u    0s         0mem    0pf
> copy_deque_iterators.cc      deque 2 vector                221r
> 221u    0s         0mem    0pf
> copy_deque_iterators.cc      vector 2 deque                398r
> 397u    0s         0mem    0pf
> copy_deque_iterators.cc      int deque 2 char vector      1770r
> 1767u    0s         0mem    0pf
> copy_deque_iterators.cc      char vector 2 int deque      1995r
> 1993u    0s         0mem    0pf
> copy_deque_iterators.cc      deque 2 list                 7650r
> 7641u    2s         0mem    0pf
> copy_deque_iterators.cc      list 2 deque                 2270r
> 2270u    0s         0mem    0pf
> equal_deque_iterators.cc     deque vs deque                769r
> 768u    0s         0mem    0pf
> equal_deque_iterators.cc     deque vs vector               231r
> 230u    0s         0mem    0pf
> equal_deque_iterators.cc     vector vs deque               397r
> 397u    0s         0mem    0pf
> equal_deque_iterators.cc     int deque vs char vector     1541r
> 1541u    0s         0mem    0pf
> equal_deque_iterators.cc     char vector vs int deque     1623r
> 1623u    0s         0mem    0pf
>
> In Debug Mode it is of course even better. I haven't had the patience
> to run the benches before the patch, it just takes hours to run. So
> here is just the After part:
>
> copy_backward_deque_iterators.cc    deque 2 deque 1128r 1128u
> 0s         0mem    0pf
> copy_backward_deque_iterators.cc    deque 2 vector  616r  616u
> 0s         0mem    0pf
> copy_backward_deque_iterators.cc    vector 2 deque  856r  855u
> 0s         0mem    0pf
> copy_backward_deque_iterators.cc    int deque 2 char vector 2277r
> 2277u    0s         0mem    0pf
> copy_backward_deque_iterators.cc    char vector 2 int deque 2518r
> 2519u    0s         0mem    0pf
> copy_backward_deque_iterators.cc    deque 2 list 8029r 8028u
> 0s         0mem    0pf
> copy_backward_deque_iterators.cc    list 2 deque 10418r 10416u
> 0s         0mem    0pf
> copy_deque_iterators.cc      deque 2 deque                 931r
> 930u    0s         0mem    0pf
> copy_deque_iterators.cc      deque 2 vector                613r
> 613u    0s         0mem    0pf
> copy_deque_iterators.cc      vector 2 deque                794r
> 795u    0s         0mem    0pf
> copy_deque_iterators.cc      int deque 2 char vector      2192r
> 2192u    0s         0mem    0pf
> copy_deque_iterators.cc      char vector 2 int deque      2365r
> 2364u    0s         0mem    0pf
> copy_deque_iterators.cc      deque 2 list                 8009r
> 8010u    0s         0mem    0pf
> copy_deque_iterators.cc      list 2 deque                 10979r
> 10970u    1s         0mem    0pf
> equal_deque_iterators.cc     deque vs deque               1034r
> 1032u    0s         0mem    0pf
> equal_deque_iterators.cc     deque vs vector               481r
> 482u    0s         0mem    0pf
> equal_deque_iterators.cc     vector vs deque               646r
> 646u    0s         0mem    0pf
> equal_deque_iterators.cc     int deque vs char vector     1802r
> 1802u    0s         0mem    0pf
> equal_deque_iterators.cc     char vector vs int deque     1867r
> 1865u    0s         0mem    0pf
>
> Note that copy/copy_backward 'list 2 deque' is still much slower
> because for the moment we can't remove the debug layer. I plan to do a
> refinement to also cover this use case.
>
> In Debug implementations I have duplicated the Debug checks because
> those algo specialization will also be used when users are directly
> using <debug/vector> for instance without defining _GLIBCXX_DEBUG.
>
> All algos tests passed except the constexpr ones in Debug mode, I've
> put this on my todo list.
>
> Ok to commit once I completed all the other tests ?
>
> François
>

Reply | Threaded
Open this post in threaded view
|

Re: copy/copy_backward/fill/fill_n/equal rework

François Dumont-2
In reply to this post by François Dumont-2
Following recently committed patches some changes that couldn't be
committed are now part of this patch.

Moreover testing istreambuf_iterator std::copy changes I realized that
this specialization was broken because order of function declarations in
stl_algobase.h was wrong. I'll check if I can find a way to confirm that
a given overload is indeed being called.

So here is this patch again.

François

On 9/27/19 11:14 PM, François Dumont wrote:

> On 9/27/19 2:28 PM, Jonathan Wakely wrote:
>> On 09/09/19 20:34 +0200, François Dumont wrote:
>>> Hi
>>>
>>>     This patch improves stl_algobase.h
>>> copy/copy_backward/fill/fill_n/equal implementations. The
>>> improvements are:
>>>
>>> - activation of algo specialization for __gnu_debug::_Safe_iterator
>>> (w/o _GLIBCXX_DEBUG mode)
>>>
>>> - activation of algo specialization for _Deque_iterator even if
>>> mixed with another kind of iterator.
>>>
>>> - activation of algo specializations __copy_move_a2 for something
>>> else than pointers. For example this code:
>>>
>>> std::vector<char> v { 'a', 'b', .... };
>>>
>>> ostreambuf_iterator out(std::cout);
>>>
>>> std::copy(v.begin(), v.end(), out);
>>>
>>> is not calling the specialization __copy_move_a2(const char*, const
>>> char*, ostreambuf_iterator<>);
>>>
>>> It also fix a _GLIBCXX_DEBUG issue where the __niter_base
>>> specialization was wrongly removing the _Safe_iterator<> layer. The
>>> testsuite/25_algorithms/copy/debug/1_neg.cc test case was failing on
>>> a debug assertion because _after_ the copy we were trying to
>>> increment the vector iterator after past-the-end. Of course the
>>> problem is the _after_, Debug mode should detect this _before_ it
>>> takes place which it does now.
>>>
>>> Note that std::fill_n is now making use of std::fill for some
>>> optimizations dealing with random access iterators.
>>>
>>> Performances are very good:
>>
>> This looks good, but I'm unable to apply the patch:
>>
>>
>> error: patch failed: libstdc++-v3/include/bits/deque.tcc:967
>> error: libstdc++-v3/include/bits/deque.tcc: patch does not apply
>> error: patch failed: libstdc++-v3/include/bits/stl_algobase.h:499
>> error: libstdc++-v3/include/bits/stl_algobase.h: patch does not apply
>>
>> Could you regenerate the patch (against a clean master tree) and
>> resend? Thanks.
>>
> Here it is, thanks.
>


algos.patch (89K) Download Attachment