Re: Doubts regarding the _Dependent_ptr keyword

classic Classic list List threaded Threaded
32 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Akshat Garg
On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
[hidden email]> wrote:

> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <[hidden email]> wrote:
> >
> > As we have some working front-end code for _Dependent_ptr, What should
> we do next? What I understand, we can start adding the library for
> dependent_ptr and its functions for C corresponding to the ones we created
> as C++ template library. Then, after that, we can move on to generating the
> assembly code part.
> >
>
>
> I think the next step is figuring out how to model the Dependent
> pointer information in the IR and figuring out what optimizations to
> allow or not with that information. At this point , I suspect we need
> a plan on record and have the conversation upstream on the lists.
>
> I think we need to put down a plan on record.
>
> Ramana

[CCing gcc mailing list]

So, shall I start looking over the pointer optimizations only and see what
information we may be needed on the same examples in the IR itself?

- Akshat
Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Akshat Garg
On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <[hidden email]> wrote:

> On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
> [hidden email]> wrote:
>
>> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <[hidden email]> wrote:
>> >
>> > As we have some working front-end code for _Dependent_ptr, What should
>> we do next? What I understand, we can start adding the library for
>> dependent_ptr and its functions for C corresponding to the ones we created
>> as C++ template library. Then, after that, we can move on to generating the
>> assembly code part.
>> >
>>
>>
>> I think the next step is figuring out how to model the Dependent
>> pointer information in the IR and figuring out what optimizations to
>> allow or not with that information. At this point , I suspect we need
>> a plan on record and have the conversation upstream on the lists.
>>
>> I think we need to put down a plan on record.
>>
>> Ramana
>
> [CCing gcc mailing list]
>
> So, shall I start looking over the pointer optimizations only and see what
> information we may be needed on the same examples in the IR itself?
>
> - Akshat
>
I have coded an example where equality comparison kills dependency from the
document P0190R4 as shown below :

1. struct rcutest rt = {1, 2, 3};
2. void thread0 ()
3. {
4.     rt.a = -42;
5.     rt.b = -43;
6.     rt.c = -44;
7.     rcu_assign_pointer(gp, &rt);
8. }
9.
10. void thread1 ()
11. {
12.    int i = -1;
13.    int j = -1;
14.    _Dependent_ptr struct rcutest *p;
15.
16.    p = rcu_dereference(gp);
17.    j = p->a;
18.   if (p == &rt)
19.        i = p->b;  /*Dependency breaking point*/
20.   else if(p)
21.       i = p->c;
22.   assert(i<0);
23.   assert(j<0);
24. }
The gimple unoptimized code produced for lines 17-24 is shown below

1. if (p_16 == &rt)
2.     goto <bb 3>; [INV]
3.   else
4.    goto <bb 4>; [INV]
5.
6.  <bb 3> :
7.  i_19 = p_16->b;
8.  goto <bb 6>; [INV]
9.
10.  <bb 4> :
11.  if (p_16 != 0B)
12.    goto <bb 5>; [INV]
13.  else
14.    goto <bb 6>; [INV]
15.
16.  <bb 5> :
17.  i_18 = p_16->c;
18.
19.  <bb 6> :
20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
21.  _3 = i_7 < 0;
22.  _4 = (int) _3;
23.  assert (_4);
24.  _5 = j_17 < 0;
25.  _6 = (int) _5;
26.  assert (_6);
27.  return;

The optimized code after -O1 is applied for the same lines is hown below :

1. if (_2 == &rt)
2.    goto <bb 3>; [30.00%]
3. else
4.    goto <bb 4>; [70.00%]
5.
6.  <bb 3> [local count: 322122547]:
7.   i_12 = rt.b;
8.   goto <bb 6>; [100.00%]
9.
10.  <bb 4> [local count: 751619277]:
11.   if (_1 != 0)
12.   goto <bb 5>; [50.00%]
13.   else
14.    goto <bb 6>; [50.00%]
15.
16.  <bb 5> [local count: 375809638]:
17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
18.
19.   <bb 6> [local count: 1073741824]:
20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
21.   _3 = i_7 < 0;
22.   _4 = (int) _3;
23.   assert (_4);
24.  _5 = j_10 < 0;
25.  _6 = (int) _5;
26.   assert (_6);
27.   return;

Statement 19 in the program gets converted from  i_19 = p_16->b; in line 7
in unoptimized code to i_12 = rt.b; in line 7 in optimized code which
breaks the dependency chain. We need to figure out the pass that does that
and put some handling code in there for the _dependent_ptr qualified
pointers. Passing simply -fipa-pure-const, -fguess-branch-probability or
any other option alone does not produce the optimized code that breaks the
dependency. But applying -O1, i.e., allowing all the optimizations does so.
As passes are applied in a certain order, we need to figure out upto what
passes, the code remains same and after what pass the dependency does not
holds. So, we need to check the translated code after every pass.

Does this sounds like a workable plan for ? Let me know your thoughts. If
this sounds good then, we can do this for all the optimizations that may
kill the dependencies at somepoint.

-Akshat
Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Paul E. McKenney
On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:

> On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <[hidden email]> wrote:
>
> > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
> > [hidden email]> wrote:
> >
> >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <[hidden email]> wrote:
> >> >
> >> > As we have some working front-end code for _Dependent_ptr, What should
> >> we do next? What I understand, we can start adding the library for
> >> dependent_ptr and its functions for C corresponding to the ones we created
> >> as C++ template library. Then, after that, we can move on to generating the
> >> assembly code part.
> >> >
> >>
> >>
> >> I think the next step is figuring out how to model the Dependent
> >> pointer information in the IR and figuring out what optimizations to
> >> allow or not with that information. At this point , I suspect we need
> >> a plan on record and have the conversation upstream on the lists.
> >>
> >> I think we need to put down a plan on record.
> >>
> >> Ramana
> >
> > [CCing gcc mailing list]
> >
> > So, shall I start looking over the pointer optimizations only and see what
> > information we may be needed on the same examples in the IR itself?
> >
> > - Akshat
> >
> I have coded an example where equality comparison kills dependency from the
> document P0190R4 as shown below :
>
> 1. struct rcutest rt = {1, 2, 3};
> 2. void thread0 ()
> 3. {
> 4.     rt.a = -42;
> 5.     rt.b = -43;
> 6.     rt.c = -44;
> 7.     rcu_assign_pointer(gp, &rt);
> 8. }
> 9.
> 10. void thread1 ()
> 11. {
> 12.    int i = -1;
> 13.    int j = -1;
> 14.    _Dependent_ptr struct rcutest *p;
> 15.
> 16.    p = rcu_dereference(gp);
> 17.    j = p->a;
> 18.   if (p == &rt)
> 19.        i = p->b;  /*Dependency breaking point*/
> 20.   else if(p)
> 21.       i = p->c;
> 22.   assert(i<0);
> 23.   assert(j<0);
> 24. }
> The gimple unoptimized code produced for lines 17-24 is shown below
>
> 1. if (p_16 == &rt)
> 2.     goto <bb 3>; [INV]
> 3.   else
> 4.    goto <bb 4>; [INV]
> 5.
> 6.  <bb 3> :
> 7.  i_19 = p_16->b;
> 8.  goto <bb 6>; [INV]
> 9.
> 10.  <bb 4> :
> 11.  if (p_16 != 0B)
> 12.    goto <bb 5>; [INV]
> 13.  else
> 14.    goto <bb 6>; [INV]
> 15.
> 16.  <bb 5> :
> 17.  i_18 = p_16->c;
> 18.
> 19.  <bb 6> :
> 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
> 21.  _3 = i_7 < 0;
> 22.  _4 = (int) _3;
> 23.  assert (_4);
> 24.  _5 = j_17 < 0;
> 25.  _6 = (int) _5;
> 26.  assert (_6);
> 27.  return;
>
> The optimized code after -O1 is applied for the same lines is hown below :
>
> 1. if (_2 == &rt)
> 2.    goto <bb 3>; [30.00%]
> 3. else
> 4.    goto <bb 4>; [70.00%]
> 5.
> 6.  <bb 3> [local count: 322122547]:
> 7.   i_12 = rt.b;
> 8.   goto <bb 6>; [100.00%]
> 9.
> 10.  <bb 4> [local count: 751619277]:
> 11.   if (_1 != 0)
> 12.   goto <bb 5>; [50.00%]
> 13.   else
> 14.    goto <bb 6>; [50.00%]
> 15.
> 16.  <bb 5> [local count: 375809638]:
> 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
> 18.
> 19.   <bb 6> [local count: 1073741824]:
> 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
> 21.   _3 = i_7 < 0;
> 22.   _4 = (int) _3;
> 23.   assert (_4);
> 24.  _5 = j_10 < 0;
> 25.  _6 = (int) _5;
> 26.   assert (_6);
> 27.   return;

Good show on tracing this through!

> Statement 19 in the program gets converted from  i_19 = p_16->b; in line 7
> in unoptimized code to i_12 = rt.b; in line 7 in optimized code which
> breaks the dependency chain. We need to figure out the pass that does that
> and put some handling code in there for the _dependent_ptr qualified
> pointers. Passing simply -fipa-pure-const, -fguess-branch-probability or
> any other option alone does not produce the optimized code that breaks the
> dependency. But applying -O1, i.e., allowing all the optimizations does so.
> As passes are applied in a certain order, we need to figure out upto what
> passes, the code remains same and after what pass the dependency does not
> holds. So, we need to check the translated code after every pass.
>
> Does this sounds like a workable plan for ? Let me know your thoughts. If
> this sounds good then, we can do this for all the optimizations that may
> kill the dependencies at somepoint.

I don't know of a better plan.

My usual question...  Is there some way to script the checking of the
translated code at the end of each pass?

                                                        Thanx, Paul

Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Nicholas Krause


On 2019-07-01 8:59 p.m., Paul E. McKenney wrote:

> On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
>> On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <[hidden email]> wrote:
>>
>>> On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
>>> [hidden email]> wrote:
>>>
>>>> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <[hidden email]> wrote:
>>>>>
>>>>> As we have some working front-end code for _Dependent_ptr, What should
>>>> we do next? What I understand, we can start adding the library for
>>>> dependent_ptr and its functions for C corresponding to the ones we created
>>>> as C++ template library. Then, after that, we can move on to generating the
>>>> assembly code part.
>>>>>
>>>>
>>>>
>>>> I think the next step is figuring out how to model the Dependent
>>>> pointer information in the IR and figuring out what optimizations to
>>>> allow or not with that information. At this point , I suspect we need
>>>> a plan on record and have the conversation upstream on the lists.
>>>>
>>>> I think we need to put down a plan on record.
>>>>
>>>> Ramana
>>>
>>> [CCing gcc mailing list]
>>>
>>> So, shall I start looking over the pointer optimizations only and see what
>>> information we may be needed on the same examples in the IR itself?
>>>
>>> - Akshat
>>>
>> I have coded an example where equality comparison kills dependency from the
>> document P0190R4 as shown below :
>>
>> 1. struct rcutest rt = {1, 2, 3};
>> 2. void thread0 ()
>> 3. {
>> 4.     rt.a = -42;
>> 5.     rt.b = -43;
>> 6.     rt.c = -44;
>> 7.     rcu_assign_pointer(gp, &rt);
>> 8. }
>> 9.
>> 10. void thread1 ()
>> 11. {
>> 12.    int i = -1;
>> 13.    int j = -1;
>> 14.    _Dependent_ptr struct rcutest *p;
>> 15.
>> 16.    p = rcu_dereference(gp);
>> 17.    j = p->a;
>> 18.   if (p == &rt)
>> 19.        i = p->b;  /*Dependency breaking point*/
>> 20.   else if(p)
>> 21.       i = p->c;
>> 22.   assert(i<0);
>> 23.   assert(j<0);
>> 24. }
>> The gimple unoptimized code produced for lines 17-24 is shown below
>>
>> 1. if (p_16 == &rt)
>> 2.     goto <bb 3>; [INV]
>> 3.   else
>> 4.    goto <bb 4>; [INV]
>> 5.
>> 6.  <bb 3> :
>> 7.  i_19 = p_16->b;
>> 8.  goto <bb 6>; [INV]
>> 9.
>> 10.  <bb 4> :
>> 11.  if (p_16 != 0B)
>> 12.    goto <bb 5>; [INV]
>> 13.  else
>> 14.    goto <bb 6>; [INV]
>> 15.
>> 16.  <bb 5> :
>> 17.  i_18 = p_16->c;
>> 18.
>> 19.  <bb 6> :
>> 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
>> 21.  _3 = i_7 < 0;
>> 22.  _4 = (int) _3;
>> 23.  assert (_4);
>> 24.  _5 = j_17 < 0;
>> 25.  _6 = (int) _5;
>> 26.  assert (_6);
>> 27.  return;
>>
>> The optimized code after -O1 is applied for the same lines is hown below :
>>
>> 1. if (_2 == &rt)
>> 2.    goto <bb 3>; [30.00%]
>> 3. else
>> 4.    goto <bb 4>; [70.00%]
>> 5.
>> 6.  <bb 3> [local count: 322122547]:
>> 7.   i_12 = rt.b;
>> 8.   goto <bb 6>; [100.00%]
>> 9.
>> 10.  <bb 4> [local count: 751619277]:
>> 11.   if (_1 != 0)
>> 12.   goto <bb 5>; [50.00%]
>> 13.   else
>> 14.    goto <bb 6>; [50.00%]
>> 15.
>> 16.  <bb 5> [local count: 375809638]:
>> 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
>> 18.
>> 19.   <bb 6> [local count: 1073741824]:
>> 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
>> 21.   _3 = i_7 < 0;
>> 22.   _4 = (int) _3;
>> 23.   assert (_4);
>> 24.  _5 = j_10 < 0;
>> 25.  _6 = (int) _5;
>> 26.   assert (_6);
>> 27.   return;
>
> Good show on tracing this through!
>
>> Statement 19 in the program gets converted from  i_19 = p_16->b; in line 7
>> in unoptimized code to i_12 = rt.b; in line 7 in optimized code which
>> breaks the dependency chain. We need to figure out the pass that does that
>> and put some handling code in there for the _dependent_ptr qualified
>> pointers. Passing simply -fipa-pure-const, -fguess-branch-probability or
>> any other option alone does not produce the optimized code that breaks the
>> dependency. But applying -O1, i.e., allowing all the optimizations does so.
>> As passes are applied in a certain order, we need to figure out upto what
>> passes, the code remains same and after what pass the dependency does not
>> holds. So, we need to check the translated code after every pass.
>>
>> Does this sounds like a workable plan for ? Let me know your thoughts. If
>> this sounds good then, we can do this for all the optimizations that may
>> kill the dependencies at somepoint.
>
> I don't know of a better plan.
>
> My usual question...  Is there some way to script the checking of the
> translated code at the end of each pass?
>
> Thanx, Paul
>

I don't off the top of my head where the documentation is but writing a gcc
tool to inspect passes if one doesn't exist is the best way forward. A gcc
tool would be exposed to those internals but not sure if it's easy to do
that in the time frame due to the effort required by you or Akshat.

Cheers,

Nick
Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Ramana Radhakrishnan [on liliput]
In reply to this post by Akshat Garg
On Tue, Jul 2, 2019 at 1:29 AM Akshat Garg <[hidden email]> wrote:

>
> On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <[hidden email]> wrote:
>>
>> On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <[hidden email]> wrote:
>>>
>> [CCing gcc mailing list]
>>
>> So, shall I start looking over the pointer optimizations only and see what information we may be needed on the same examples in the IR itself?
>>
>> - Akshat
>
> I have coded an example where equality comparison kills dependency from the document P0190R4 as shown below :
>
> 1. struct rcutest rt = {1, 2, 3};
> 2. void thread0 ()
> 3. {
> 4.     rt.a = -42;
> 5.     rt.b = -43;
> 6.     rt.c = -44;
> 7.     rcu_assign_pointer(gp, &rt);
> 8. }
> 9.
> 10. void thread1 ()
> 11. {
> 12.    int i = -1;
> 13.    int j = -1;
> 14.    _Dependent_ptr struct rcutest *p;
> 15.
> 16.    p = rcu_dereference(gp);
> 17.    j = p->a;
> 18.   if (p == &rt)
> 19.        i = p->b;  /*Dependency breaking point*/
> 20.   else if(p)
> 21.       i = p->c;
> 22.   assert(i<0);
> 23.   assert(j<0);
> 24. }
> The gimple unoptimized code produced for lines 17-24 is shown below
>
> 1. if (p_16 == &rt)
> 2.     goto <bb 3>; [INV]
> 3.   else
> 4.    goto <bb 4>; [INV]
> 5.
> 6.  <bb 3> :
> 7.  i_19 = p_16->b;
> 8.  goto <bb 6>; [INV]
> 9.
> 10.  <bb 4> :
> 11.  if (p_16 != 0B)
> 12.    goto <bb 5>; [INV]
> 13.  else
> 14.    goto <bb 6>; [INV]
> 15.
> 16.  <bb 5> :
> 17.  i_18 = p_16->c;
> 18.
> 19.  <bb 6> :
> 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
> 21.  _3 = i_7 < 0;
> 22.  _4 = (int) _3;
> 23.  assert (_4);
> 24.  _5 = j_17 < 0;
> 25.  _6 = (int) _5;
> 26.  assert (_6);
> 27.  return;
>
> The optimized code after -O1 is applied for the same lines is hown below :
>
> 1. if (_2 == &rt)
> 2.    goto <bb 3>; [30.00%]
> 3. else
> 4.    goto <bb 4>; [70.00%]
> 5.
> 6.  <bb 3> [local count: 322122547]:
> 7.   i_12 = rt.b;
> 8.   goto <bb 6>; [100.00%]
> 9.
> 10.  <bb 4> [local count: 751619277]:
> 11.   if (_1 != 0)
> 12.   goto <bb 5>; [50.00%]
> 13.   else
> 14.    goto <bb 6>; [50.00%]
> 15.
> 16.  <bb 5> [local count: 375809638]:
> 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
> 18.
> 19.   <bb 6> [local count: 1073741824]:
> 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
> 21.   _3 = i_7 < 0;
> 22.   _4 = (int) _3;
> 23.   assert (_4);
> 24.  _5 = j_10 < 0;
> 25.  _6 = (int) _5;
> 26.   assert (_6);
> 27.   return;
>
> Statement 19 in the program gets converted from  i_19 = p_16->b; in line 7 in unoptimized code to i_12 = rt.b; in line 7 in optimized code which breaks the dependency chain. We need to figure out the pass that does that and put some handling code in there for the _dependent_ptr qualified pointers. Passing simply -fipa-pure-const, -fguess-branch-probability or any other option alone does not produce the optimized code that breaks the dependency. But applying -O1, i.e., allowing all the optimizations does so. As passes are applied in a certain order, we need to figure out upto what passes, the code remains same and after what pass the dependency does not holds. So, we need to check the translated code after every pass.
>

It's worth figuring out what passes are doing this - however the worry
I have is that every pass now needs to be handling this case with
respect to pointer attributes. Is there some place that you are
storing said information and what is the transitive nature of
assignments with these attributes ?

regards
Ramana


> Does this sounds like a workable plan for ? Let me know your thoughts. If this sounds good then, we can do this for all the optimizations that may kill the dependencies at somepoint.
>
> -Akshat
Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Akshat Garg
On Tue, 2 Jul, 2019, 3:52 PM Ramana Radhakrishnan, <
[hidden email]> wrote:

> On Tue, Jul 2, 2019 at 1:29 AM Akshat Garg <[hidden email]> wrote:
> >
> > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <[hidden email]> wrote:
> >>
> >> On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
> [hidden email]> wrote:
> >>>
> >> [CCing gcc mailing list]
> >>
> >> So, shall I start looking over the pointer optimizations only and see
> what information we may be needed on the same examples in the IR itself?
> >>
> >> - Akshat
> >
> > I have coded an example where equality comparison kills dependency from
> the document P0190R4 as shown below :
> >
> > 1. struct rcutest rt = {1, 2, 3};
> > 2. void thread0 ()
> > 3. {
> > 4.     rt.a = -42;
> > 5.     rt.b = -43;
> > 6.     rt.c = -44;
> > 7.     rcu_assign_pointer(gp, &rt);
> > 8. }
> > 9.
> > 10. void thread1 ()
> > 11. {
> > 12.    int i = -1;
> > 13.    int j = -1;
> > 14.    _Dependent_ptr struct rcutest *p;
> > 15.
> > 16.    p = rcu_dereference(gp);
> > 17.    j = p->a;
> > 18.   if (p == &rt)
> > 19.        i = p->b;  /*Dependency breaking point*/
> > 20.   else if(p)
> > 21.       i = p->c;
> > 22.   assert(i<0);
> > 23.   assert(j<0);
> > 24. }
> > The gimple unoptimized code produced for lines 17-24 is shown below
> >
> > 1. if (p_16 == &rt)
> > 2.     goto <bb 3>; [INV]
> > 3.   else
> > 4.    goto <bb 4>; [INV]
> > 5.
> > 6.  <bb 3> :
> > 7.  i_19 = p_16->b;
> > 8.  goto <bb 6>; [INV]
> > 9.
> > 10.  <bb 4> :
> > 11.  if (p_16 != 0B)
> > 12.    goto <bb 5>; [INV]
> > 13.  else
> > 14.    goto <bb 6>; [INV]
> > 15.
> > 16.  <bb 5> :
> > 17.  i_18 = p_16->c;
> > 18.
> > 19.  <bb 6> :
> > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
> > 21.  _3 = i_7 < 0;
> > 22.  _4 = (int) _3;
> > 23.  assert (_4);
> > 24.  _5 = j_17 < 0;
> > 25.  _6 = (int) _5;
> > 26.  assert (_6);
> > 27.  return;
> >
> > The optimized code after -O1 is applied for the same lines is hown below
> :
> >
> > 1. if (_2 == &rt)
> > 2.    goto <bb 3>; [30.00%]
> > 3. else
> > 4.    goto <bb 4>; [70.00%]
> > 5.
> > 6.  <bb 3> [local count: 322122547]:
> > 7.   i_12 = rt.b;
> > 8.   goto <bb 6>; [100.00%]
> > 9.
> > 10.  <bb 4> [local count: 751619277]:
> > 11.   if (_1 != 0)
> > 12.   goto <bb 5>; [50.00%]
> > 13.   else
> > 14.    goto <bb 6>; [50.00%]
> > 15.
> > 16.  <bb 5> [local count: 375809638]:
> > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
> > 18.
> > 19.   <bb 6> [local count: 1073741824]:
> > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
> > 21.   _3 = i_7 < 0;
> > 22.   _4 = (int) _3;
> > 23.   assert (_4);
> > 24.  _5 = j_10 < 0;
> > 25.  _6 = (int) _5;
> > 26.   assert (_6);
> > 27.   return;
> >
> > Statement 19 in the program gets converted from  i_19 = p_16->b; in line
> 7 in unoptimized code to i_12 = rt.b; in line 7 in optimized code which
> breaks the dependency chain. We need to figure out the pass that does that
> and put some handling code in there for the _dependent_ptr qualified
> pointers. Passing simply -fipa-pure-const, -fguess-branch-probability or
> any other option alone does not produce the optimized code that breaks the
> dependency. But applying -O1, i.e., allowing all the optimizations does so.
> As passes are applied in a certain order, we need to figure out upto what
> passes, the code remains same and after what pass the dependency does not
> holds. So, we need to check the translated code after every pass.
> >
>
> It's worth figuring out what passes are doing this - however the worry
> I have is that every pass now needs to be handling this case with
> respect to pointer attributes. Is there some place that you are
> storing said information and what is the transitive nature of
> assignments with these attributes ?
>
> regards
> Ramana
>
I don't understand what information to store, can you explain? I was
thinking of putting some code inside the pass, which breaks the dependency
chains, which will avoid this type of conversions when it finds out  that
the pointer is _Dependent_ptr qualified otherwise don't do anything. Can
you let me know what I am missing on?

In transitive nature, are you talking about the dependent pointer being
assigned to some non-dependent pointer and after further assigned to some
other pointer?

>
> > Does this sounds like a workable plan for ? Let me know your thoughts.
> If this sounds good then, we can do this for all the optimizations that may
> kill the dependencies at somepoint.
> >
> > -Akshat
>
Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Ramana Radhakrishnan [on liliput]
>>
>> It's worth figuring out what passes are doing this - however the worry
>> I have is that every pass now needs to be handling this case with
>> respect to pointer attributes. Is there some place that you are
>> storing said information and what is the transitive nature of
>> assignments with these attributes ?
>>
>> regards
>> Ramana
>
> I don't understand what information to store, can you explain? I was thinking of putting some code inside the pass, which breaks the dependency chains, which will avoid this type of conversions when it finds out  that the pointer is _Dependent_ptr qualified otherwise don't do anything. Can you let me know what I am missing on?
>

That the pointer has an attribute that it's a dependent pointer . How
do you expect the later passes to detect it has a dependent_ptr
attribute attached to it ?

> In transitive nature, are you talking about the dependent pointer being assigned to some non-dependent pointer and after further assigned to some other pointer?


Yes. What's the expected behaviour as per the document ?

Ramana
>>
>>
>> > Does this sounds like a workable plan for ? Let me know your thoughts. If this sounds good then, we can do this for all the optimizations that may kill the dependencies at somepoint.
>> >
>> > -Akshat
Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Paul E. McKenney
On Tue, Jul 02, 2019 at 12:01:00PM +0100, Ramana Radhakrishnan wrote:

> >>
> >> It's worth figuring out what passes are doing this - however the worry
> >> I have is that every pass now needs to be handling this case with
> >> respect to pointer attributes. Is there some place that you are
> >> storing said information and what is the transitive nature of
> >> assignments with these attributes ?
> >>
> >> regards
> >> Ramana
> >
> > I don't understand what information to store, can you explain? I was thinking of putting some code inside the pass, which breaks the dependency chains, which will avoid this type of conversions when it finds out  that the pointer is _Dependent_ptr qualified otherwise don't do anything. Can you let me know what I am missing on?
> >
>
> That the pointer has an attribute that it's a dependent pointer . How
> do you expect the later passes to detect it has a dependent_ptr
> attribute attached to it ?
>
> > In transitive nature, are you talking about the dependent pointer being assigned to some non-dependent pointer and after further assigned to some other pointer?
>
> Yes. What's the expected behaviour as per the document ?

Once a user-created non-dependent pointer is assigned to, it is OK to
break the dependency.

Or am I missing the point here?

                                                        Thanx, Paul

> Ramana
> >>
> >>
> >> > Does this sounds like a workable plan for ? Let me know your thoughts. If this sounds good then, we can do this for all the optimizations that may kill the dependencies at somepoint.
> >> >
> >> > -Akshat
>

Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Ramana Radhakrishnan [on liliput]
On Tue, Jul 2, 2019 at 1:38 PM Paul E. McKenney <[hidden email]> wrote:

>
> Once a user-created non-dependent pointer is assigned to, it is OK to
> break the dependency.

Ok, that's good.
>
> Or am I missing the point here?

I was just trying to make sure we were on the same page. I wonder if
marking this volatile would be sufficient for prototyping. I suspect
we would need another flag somewhere which someone with gimple
knowledge might be able to help us with.

regards
Ramana

>
>                                                         Thanx, Paul
>
> > Ramana
> > >>
> > >>
> > >> > Does this sounds like a workable plan for ? Let me know your thoughts. If this sounds good then, we can do this for all the optimizations that may kill the dependencies at somepoint.
> > >> >
> > >> > -Akshat
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Paul E. McKenney
On Tue, Jul 02, 2019 at 02:15:55PM +0100, Ramana Radhakrishnan wrote:

> On Tue, Jul 2, 2019 at 1:38 PM Paul E. McKenney <[hidden email]> wrote:
>
> >
> > Once a user-created non-dependent pointer is assigned to, it is OK to
> > break the dependency.
>
> Ok, that's good.
> >
> > Or am I missing the point here?
>
> I was just trying to make sure we were on the same page. I wonder if
> marking this volatile would be sufficient for prototyping. I suspect
> we would need another flag somewhere which someone with gimple
> knowledge might be able to help us with.

I expect that marking it as volatile would do the trick.  ;-)

                                                        Thanx, Paul

> regards
> Ramana
>
> >
> >                                                         Thanx, Paul
> >
> > > Ramana
> > > >>
> > > >>
> > > >> > Does this sounds like a workable plan for ? Let me know your thoughts. If this sounds good then, we can do this for all the optimizations that may kill the dependencies at somepoint.
> > > >> >
> > > >> > -Akshat
> > >
> >
>

Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Jason Merrill
In reply to this post by Paul E. McKenney
On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney <[hidden email]> wrote:

>
> On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
> > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <[hidden email]> wrote:
> >
> > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
> > > [hidden email]> wrote:
> > >
> > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <[hidden email]> wrote:
> > >> >
> > >> > As we have some working front-end code for _Dependent_ptr, What should
> > >> we do next? What I understand, we can start adding the library for
> > >> dependent_ptr and its functions for C corresponding to the ones we created
> > >> as C++ template library. Then, after that, we can move on to generating the
> > >> assembly code part.
> > >> >
> > >>
> > >>
> > >> I think the next step is figuring out how to model the Dependent
> > >> pointer information in the IR and figuring out what optimizations to
> > >> allow or not with that information. At this point , I suspect we need
> > >> a plan on record and have the conversation upstream on the lists.
> > >>
> > >> I think we need to put down a plan on record.
> > >>
> > >> Ramana
> > >
> > > [CCing gcc mailing list]
> > >
> > > So, shall I start looking over the pointer optimizations only and see what
> > > information we may be needed on the same examples in the IR itself?
> > >
> > > - Akshat
> > >
> > I have coded an example where equality comparison kills dependency from the
> > document P0190R4 as shown below :
> >
> > 1. struct rcutest rt = {1, 2, 3};
> > 2. void thread0 ()
> > 3. {
> > 4.     rt.a = -42;
> > 5.     rt.b = -43;
> > 6.     rt.c = -44;
> > 7.     rcu_assign_pointer(gp, &rt);
> > 8. }
> > 9.
> > 10. void thread1 ()
> > 11. {
> > 12.    int i = -1;
> > 13.    int j = -1;
> > 14.    _Dependent_ptr struct rcutest *p;
> > 15.
> > 16.    p = rcu_dereference(gp);
> > 17.    j = p->a;
> > 18.   if (p == &rt)
> > 19.        i = p->b;  /*Dependency breaking point*/
> > 20.   else if(p)
> > 21.       i = p->c;
> > 22.   assert(i<0);
> > 23.   assert(j<0);
> > 24. }
> > The gimple unoptimized code produced for lines 17-24 is shown below
> >
> > 1. if (p_16 == &rt)
> > 2.     goto <bb 3>; [INV]
> > 3.   else
> > 4.    goto <bb 4>; [INV]
> > 5.
> > 6.  <bb 3> :
> > 7.  i_19 = p_16->b;
> > 8.  goto <bb 6>; [INV]
> > 9.
> > 10.  <bb 4> :
> > 11.  if (p_16 != 0B)
> > 12.    goto <bb 5>; [INV]
> > 13.  else
> > 14.    goto <bb 6>; [INV]
> > 15.
> > 16.  <bb 5> :
> > 17.  i_18 = p_16->c;
> > 18.
> > 19.  <bb 6> :
> > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
> > 21.  _3 = i_7 < 0;
> > 22.  _4 = (int) _3;
> > 23.  assert (_4);
> > 24.  _5 = j_17 < 0;
> > 25.  _6 = (int) _5;
> > 26.  assert (_6);
> > 27.  return;
> >
> > The optimized code after -O1 is applied for the same lines is hown below :
> >
> > 1. if (_2 == &rt)
> > 2.    goto <bb 3>; [30.00%]
> > 3. else
> > 4.    goto <bb 4>; [70.00%]
> > 5.
> > 6.  <bb 3> [local count: 322122547]:
> > 7.   i_12 = rt.b;
> > 8.   goto <bb 6>; [100.00%]
> > 9.
> > 10.  <bb 4> [local count: 751619277]:
> > 11.   if (_1 != 0)
> > 12.   goto <bb 5>; [50.00%]
> > 13.   else
> > 14.    goto <bb 6>; [50.00%]
> > 15.
> > 16.  <bb 5> [local count: 375809638]:
> > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
> > 18.
> > 19.   <bb 6> [local count: 1073741824]:
> > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
> > 21.   _3 = i_7 < 0;
> > 22.   _4 = (int) _3;
> > 23.   assert (_4);
> > 24.  _5 = j_10 < 0;
> > 25.  _6 = (int) _5;
> > 26.   assert (_6);
> > 27.   return;
>
> Good show on tracing this through!
>
> > Statement 19 in the program gets converted from  i_19 = p_16->b; in line 7
> > in unoptimized code to i_12 = rt.b; in line 7 in optimized code which
> > breaks the dependency chain. We need to figure out the pass that does that
> > and put some handling code in there for the _dependent_ptr qualified
> > pointers. Passing simply -fipa-pure-const, -fguess-branch-probability or
> > any other option alone does not produce the optimized code that breaks the
> > dependency. But applying -O1, i.e., allowing all the optimizations does so.
> > As passes are applied in a certain order, we need to figure out up to what
> > passes, the code remains same and after what pass the dependency does not
> > holds. So, we need to check the translated code after every pass.
> >
> > Does this sounds like a workable plan for ? Let me know your thoughts. If
> > this sounds good then, we can do this for all the optimizations that may
> > kill the dependencies at somepoint.
>
> I don't know of a better plan.
>
> My usual question...  Is there some way to script the checking of the
> translated code at the end of each pass?

The usual way to check the output of an optimization pass is by
dumping the intermediate code at that point and matching the dump
against a regexp, as in the tree-ssa directories in the testsuite.
-fdump-tree-all will dump after all the gimple optimization passes,
and you can look through them until you find the breakage.

Jason
Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Richard Biener-2
On July 2, 2019 5:36:08 PM GMT+02:00, Jason Merrill <[hidden email]> wrote:

>On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney <[hidden email]>
>wrote:
>>
>> On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
>> > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <[hidden email]>
>wrote:
>> >
>> > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
>> > > [hidden email]> wrote:
>> > >
>> > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <[hidden email]>
>wrote:
>> > >> >
>> > >> > As we have some working front-end code for _Dependent_ptr,
>What should
>> > >> we do next? What I understand, we can start adding the library
>for
>> > >> dependent_ptr and its functions for C corresponding to the ones
>we created
>> > >> as C++ template library. Then, after that, we can move on to
>generating the
>> > >> assembly code part.
>> > >> >
>> > >>
>> > >>
>> > >> I think the next step is figuring out how to model the Dependent
>> > >> pointer information in the IR and figuring out what
>optimizations to
>> > >> allow or not with that information. At this point , I suspect we
>need
>> > >> a plan on record and have the conversation upstream on the
>lists.
>> > >>
>> > >> I think we need to put down a plan on record.
>> > >>
>> > >> Ramana
>> > >
>> > > [CCing gcc mailing list]
>> > >
>> > > So, shall I start looking over the pointer optimizations only and
>see what
>> > > information we may be needed on the same examples in the IR
>itself?
>> > >
>> > > - Akshat
>> > >
>> > I have coded an example where equality comparison kills dependency
>from the
>> > document P0190R4 as shown below :
>> >
>> > 1. struct rcutest rt = {1, 2, 3};
>> > 2. void thread0 ()
>> > 3. {
>> > 4.     rt.a = -42;
>> > 5.     rt.b = -43;
>> > 6.     rt.c = -44;
>> > 7.     rcu_assign_pointer(gp, &rt);
>> > 8. }
>> > 9.
>> > 10. void thread1 ()
>> > 11. {
>> > 12.    int i = -1;
>> > 13.    int j = -1;
>> > 14.    _Dependent_ptr struct rcutest *p;
>> > 15.
>> > 16.    p = rcu_dereference(gp);
>> > 17.    j = p->a;
>> > 18.   if (p == &rt)
>> > 19.        i = p->b;  /*Dependency breaking point*/
>> > 20.   else if(p)
>> > 21.       i = p->c;
>> > 22.   assert(i<0);
>> > 23.   assert(j<0);
>> > 24. }
>> > The gimple unoptimized code produced for lines 17-24 is shown below
>> >
>> > 1. if (p_16 == &rt)
>> > 2.     goto <bb 3>; [INV]
>> > 3.   else
>> > 4.    goto <bb 4>; [INV]
>> > 5.
>> > 6.  <bb 3> :
>> > 7.  i_19 = p_16->b;
>> > 8.  goto <bb 6>; [INV]
>> > 9.
>> > 10.  <bb 4> :
>> > 11.  if (p_16 != 0B)
>> > 12.    goto <bb 5>; [INV]
>> > 13.  else
>> > 14.    goto <bb 6>; [INV]
>> > 15.
>> > 16.  <bb 5> :
>> > 17.  i_18 = p_16->c;
>> > 18.
>> > 19.  <bb 6> :
>> > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
>> > 21.  _3 = i_7 < 0;
>> > 22.  _4 = (int) _3;
>> > 23.  assert (_4);
>> > 24.  _5 = j_17 < 0;
>> > 25.  _6 = (int) _5;
>> > 26.  assert (_6);
>> > 27.  return;
>> >
>> > The optimized code after -O1 is applied for the same lines is hown
>below :
>> >
>> > 1. if (_2 == &rt)
>> > 2.    goto <bb 3>; [30.00%]
>> > 3. else
>> > 4.    goto <bb 4>; [70.00%]
>> > 5.
>> > 6.  <bb 3> [local count: 322122547]:
>> > 7.   i_12 = rt.b;
>> > 8.   goto <bb 6>; [100.00%]
>> > 9.
>> > 10.  <bb 4> [local count: 751619277]:
>> > 11.   if (_1 != 0)
>> > 12.   goto <bb 5>; [50.00%]
>> > 13.   else
>> > 14.    goto <bb 6>; [50.00%]
>> > 15.
>> > 16.  <bb 5> [local count: 375809638]:
>> > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
>> > 18.
>> > 19.   <bb 6> [local count: 1073741824]:
>> > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
>> > 21.   _3 = i_7 < 0;
>> > 22.   _4 = (int) _3;
>> > 23.   assert (_4);
>> > 24.  _5 = j_10 < 0;
>> > 25.  _6 = (int) _5;
>> > 26.   assert (_6);
>> > 27.   return;
>>
>> Good show on tracing this through!
>>
>> > Statement 19 in the program gets converted from  i_19 = p_16->b; in
>line 7
>> > in unoptimized code to i_12 = rt.b; in line 7 in optimized code
>which
>> > breaks the dependency chain. We need to figure out the pass that
>does that
>> > and put some handling code in there for the _dependent_ptr
>qualified
>> > pointers.

Wtf should this be for?  A type qualifier is certainly not going to work.

Richard.


 Passing simply -fipa-pure-const,

>-fguess-branch-probability or
>> > any other option alone does not produce the optimized code that
>breaks the
>> > dependency. But applying -O1, i.e., allowing all the optimizations
>does so.
>> > As passes are applied in a certain order, we need to figure out up
>to what
>> > passes, the code remains same and after what pass the dependency
>does not
>> > holds. So, we need to check the translated code after every pass.
>> >
>> > Does this sounds like a workable plan for ? Let me know your
>thoughts. If
>> > this sounds good then, we can do this for all the optimizations
>that may
>> > kill the dependencies at somepoint.
>>
>> I don't know of a better plan.
>>
>> My usual question...  Is there some way to script the checking of the
>> translated code at the end of each pass?
>
>The usual way to check the output of an optimization pass is by
>dumping the intermediate code at that point and matching the dump
>against a regexp, as in the tree-ssa directories in the testsuite.
>-fdump-tree-all will dump after all the gimple optimization passes,
>and you can look through them until you find the breakage.
>
>Jason

Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Akshat Garg
In reply to this post by Paul E. McKenney
On Tue, Jul 2, 2019 at 8:40 PM Paul E. McKenney <[hidden email]>
wrote:

> On Tue, Jul 02, 2019 at 02:15:55PM +0100, Ramana Radhakrishnan wrote:
> > On Tue, Jul 2, 2019 at 1:38 PM Paul E. McKenney <[hidden email]>
> wrote:
> >
> > >
> > > Once a user-created non-dependent pointer is assigned to, it is OK to
> > > break the dependency.
> >
> > Ok, that's good.
> > >
> > > Or am I missing the point here?
> >
> > I was just trying to make sure we were on the same page. I wonder if
> > marking this volatile would be sufficient for prototyping. I suspect
> > we would need another flag somewhere which someone with gimple
> > knowledge might be able to help us with.
>
> I expect that marking it as volatile would do the trick.  ;-)
>
>                                                         Thanx, Paul
>
So, marking this pointer as volatile will not allow the compiler to
modify/optimize the statements, the pointer is appearing in. And we don't
need to push any other code inside any of the passes. Due to this, we want
to automatically say those dependent pointers are volatile and introduce a
new flag for this. Am I getting you guys correctly? Kindly, let me know?

Akshat

>
> > regards
> > Ramana
> >
> > >
> > >                                                         Thanx, Paul
> > >
> > > > Ramana
> > > > >>
> > > > >>
> > > > >> > Does this sounds like a workable plan for ? Let me know your
> thoughts. If this sounds good then, we can do this for all the
> optimizations that may kill the dependencies at somepoint.
> > > > >> >
> > > > >> > -Akshat
> > > >
> > >
> >
>
>
Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Akshat Garg
In reply to this post by Jason Merrill
On Tue, Jul 2, 2019 at 9:06 PM Jason Merrill <[hidden email]> wrote:

> On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney <[hidden email]>
> wrote:
> >
> > On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
> > > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <[hidden email]> wrote:
> > >
> > > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
> > > > [hidden email]> wrote:
> > > >
> > > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <[hidden email]>
> wrote:
> > > >> >
> > > >> > As we have some working front-end code for _Dependent_ptr, What
> should
> > > >> we do next? What I understand, we can start adding the library for
> > > >> dependent_ptr and its functions for C corresponding to the ones we
> created
> > > >> as C++ template library. Then, after that, we can move on to
> generating the
> > > >> assembly code part.
> > > >> >
> > > >>
> > > >>
> > > >> I think the next step is figuring out how to model the Dependent
> > > >> pointer information in the IR and figuring out what optimizations to
> > > >> allow or not with that information. At this point , I suspect we
> need
> > > >> a plan on record and have the conversation upstream on the lists.
> > > >>
> > > >> I think we need to put down a plan on record.
> > > >>
> > > >> Ramana
> > > >
> > > > [CCing gcc mailing list]
> > > >
> > > > So, shall I start looking over the pointer optimizations only and
> see what
> > > > information we may be needed on the same examples in the IR itself?
> > > >
> > > > - Akshat
> > > >
> > > I have coded an example where equality comparison kills dependency
> from the
> > > document P0190R4 as shown below :
> > >
> > > 1. struct rcutest rt = {1, 2, 3};
> > > 2. void thread0 ()
> > > 3. {
> > > 4.     rt.a = -42;
> > > 5.     rt.b = -43;
> > > 6.     rt.c = -44;
> > > 7.     rcu_assign_pointer(gp, &rt);
> > > 8. }
> > > 9.
> > > 10. void thread1 ()
> > > 11. {
> > > 12.    int i = -1;
> > > 13.    int j = -1;
> > > 14.    _Dependent_ptr struct rcutest *p;
> > > 15.
> > > 16.    p = rcu_dereference(gp);
> > > 17.    j = p->a;
> > > 18.   if (p == &rt)
> > > 19.        i = p->b;  /*Dependency breaking point*/
> > > 20.   else if(p)
> > > 21.       i = p->c;
> > > 22.   assert(i<0);
> > > 23.   assert(j<0);
> > > 24. }
> > > The gimple unoptimized code produced for lines 17-24 is shown below
> > >
> > > 1. if (p_16 == &rt)
> > > 2.     goto <bb 3>; [INV]
> > > 3.   else
> > > 4.    goto <bb 4>; [INV]
> > > 5.
> > > 6.  <bb 3> :
> > > 7.  i_19 = p_16->b;
> > > 8.  goto <bb 6>; [INV]
> > > 9.
> > > 10.  <bb 4> :
> > > 11.  if (p_16 != 0B)
> > > 12.    goto <bb 5>; [INV]
> > > 13.  else
> > > 14.    goto <bb 6>; [INV]
> > > 15.
> > > 16.  <bb 5> :
> > > 17.  i_18 = p_16->c;
> > > 18.
> > > 19.  <bb 6> :
> > > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
> > > 21.  _3 = i_7 < 0;
> > > 22.  _4 = (int) _3;
> > > 23.  assert (_4);
> > > 24.  _5 = j_17 < 0;
> > > 25.  _6 = (int) _5;
> > > 26.  assert (_6);
> > > 27.  return;
> > >
> > > The optimized code after -O1 is applied for the same lines is hown
> below :
> > >
> > > 1. if (_2 == &rt)
> > > 2.    goto <bb 3>; [30.00%]
> > > 3. else
> > > 4.    goto <bb 4>; [70.00%]
> > > 5.
> > > 6.  <bb 3> [local count: 322122547]:
> > > 7.   i_12 = rt.b;
> > > 8.   goto <bb 6>; [100.00%]
> > > 9.
> > > 10.  <bb 4> [local count: 751619277]:
> > > 11.   if (_1 != 0)
> > > 12.   goto <bb 5>; [50.00%]
> > > 13.   else
> > > 14.    goto <bb 6>; [50.00%]
> > > 15.
> > > 16.  <bb 5> [local count: 375809638]:
> > > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
> > > 18.
> > > 19.   <bb 6> [local count: 1073741824]:
> > > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
> > > 21.   _3 = i_7 < 0;
> > > 22.   _4 = (int) _3;
> > > 23.   assert (_4);
> > > 24.  _5 = j_10 < 0;
> > > 25.  _6 = (int) _5;
> > > 26.   assert (_6);
> > > 27.   return;
> >
> > Good show on tracing this through!
> >
> > > Statement 19 in the program gets converted from  i_19 = p_16->b; in
> line 7
> > > in unoptimized code to i_12 = rt.b; in line 7 in optimized code which
> > > breaks the dependency chain. We need to figure out the pass that does
> that
> > > and put some handling code in there for the _dependent_ptr qualified
> > > pointers. Passing simply -fipa-pure-const, -fguess-branch-probability
> or
> > > any other option alone does not produce the optimized code that breaks
> the
> > > dependency. But applying -O1, i.e., allowing all the optimizations
> does so.
> > > As passes are applied in a certain order, we need to figure out up to
> what
> > > passes, the code remains same and after what pass the dependency does
> not
> > > holds. So, we need to check the translated code after every pass.
> > >
> > > Does this sounds like a workable plan for ? Let me know your thoughts.
> If
> > > this sounds good then, we can do this for all the optimizations that
> may
> > > kill the dependencies at somepoint.
> >
> > I don't know of a better plan.
> >
> > My usual question...  Is there some way to script the checking of the
> > translated code at the end of each pass?
>
> The usual way to check the output of an optimization pass is by
> dumping the intermediate code at that point and matching the dump
> against a regexp, as in the tree-ssa directories in the testsuite.
> -fdump-tree-all will dump after all the gimple optimization passes,
> and you can look through them until you find the breakage.
>
> Jason
>
I tried out your method. Up to the temp.c.114t.sra, the line 19 from
program is
  i_12 = MEM[(dependent_ptr struct rcutest *)_2].b;
After this in the next file, temp.c.116t.dom2, line 19 becomes
i_12 = rt.b;
Then, I believe, we need to check the pass doing this, but this will take a
lot of effort I think to handle dependent_ptr in a pass and after that
passes. Although, the approach of introducing new flag seems an easy
workaround to me. Anyway thanks for your time.
Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Paul E. McKenney
In reply to this post by Richard Biener-2
On Tue, Jul 02, 2019 at 07:53:20PM +0200, Richard Biener wrote:

> On July 2, 2019 5:36:08 PM GMT+02:00, Jason Merrill <[hidden email]> wrote:
> >On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney <[hidden email]>
> >wrote:
> >>
> >> On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
> >> > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <[hidden email]>
> >wrote:
> >> >
> >> > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
> >> > > [hidden email]> wrote:
> >> > >
> >> > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <[hidden email]>
> >wrote:
> >> > >> >
> >> > >> > As we have some working front-end code for _Dependent_ptr,
> >What should
> >> > >> we do next? What I understand, we can start adding the library
> >for
> >> > >> dependent_ptr and its functions for C corresponding to the ones
> >we created
> >> > >> as C++ template library. Then, after that, we can move on to
> >generating the
> >> > >> assembly code part.
> >> > >> >
> >> > >>
> >> > >>
> >> > >> I think the next step is figuring out how to model the Dependent
> >> > >> pointer information in the IR and figuring out what
> >optimizations to
> >> > >> allow or not with that information. At this point , I suspect we
> >need
> >> > >> a plan on record and have the conversation upstream on the
> >lists.
> >> > >>
> >> > >> I think we need to put down a plan on record.
> >> > >>
> >> > >> Ramana
> >> > >
> >> > > [CCing gcc mailing list]
> >> > >
> >> > > So, shall I start looking over the pointer optimizations only and
> >see what
> >> > > information we may be needed on the same examples in the IR
> >itself?
> >> > >
> >> > > - Akshat
> >> > >
> >> > I have coded an example where equality comparison kills dependency
> >from the
> >> > document P0190R4 as shown below :
> >> >
> >> > 1. struct rcutest rt = {1, 2, 3};
> >> > 2. void thread0 ()
> >> > 3. {
> >> > 4.     rt.a = -42;
> >> > 5.     rt.b = -43;
> >> > 6.     rt.c = -44;
> >> > 7.     rcu_assign_pointer(gp, &rt);
> >> > 8. }
> >> > 9.
> >> > 10. void thread1 ()
> >> > 11. {
> >> > 12.    int i = -1;
> >> > 13.    int j = -1;
> >> > 14.    _Dependent_ptr struct rcutest *p;
> >> > 15.
> >> > 16.    p = rcu_dereference(gp);
> >> > 17.    j = p->a;
> >> > 18.   if (p == &rt)
> >> > 19.        i = p->b;  /*Dependency breaking point*/
> >> > 20.   else if(p)
> >> > 21.       i = p->c;
> >> > 22.   assert(i<0);
> >> > 23.   assert(j<0);
> >> > 24. }
> >> > The gimple unoptimized code produced for lines 17-24 is shown below
> >> >
> >> > 1. if (p_16 == &rt)
> >> > 2.     goto <bb 3>; [INV]
> >> > 3.   else
> >> > 4.    goto <bb 4>; [INV]
> >> > 5.
> >> > 6.  <bb 3> :
> >> > 7.  i_19 = p_16->b;
> >> > 8.  goto <bb 6>; [INV]
> >> > 9.
> >> > 10.  <bb 4> :
> >> > 11.  if (p_16 != 0B)
> >> > 12.    goto <bb 5>; [INV]
> >> > 13.  else
> >> > 14.    goto <bb 6>; [INV]
> >> > 15.
> >> > 16.  <bb 5> :
> >> > 17.  i_18 = p_16->c;
> >> > 18.
> >> > 19.  <bb 6> :
> >> > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
> >> > 21.  _3 = i_7 < 0;
> >> > 22.  _4 = (int) _3;
> >> > 23.  assert (_4);
> >> > 24.  _5 = j_17 < 0;
> >> > 25.  _6 = (int) _5;
> >> > 26.  assert (_6);
> >> > 27.  return;
> >> >
> >> > The optimized code after -O1 is applied for the same lines is hown
> >below :
> >> >
> >> > 1. if (_2 == &rt)
> >> > 2.    goto <bb 3>; [30.00%]
> >> > 3. else
> >> > 4.    goto <bb 4>; [70.00%]
> >> > 5.
> >> > 6.  <bb 3> [local count: 322122547]:
> >> > 7.   i_12 = rt.b;
> >> > 8.   goto <bb 6>; [100.00%]
> >> > 9.
> >> > 10.  <bb 4> [local count: 751619277]:
> >> > 11.   if (_1 != 0)
> >> > 12.   goto <bb 5>; [50.00%]
> >> > 13.   else
> >> > 14.    goto <bb 6>; [50.00%]
> >> > 15.
> >> > 16.  <bb 5> [local count: 375809638]:
> >> > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
> >> > 18.
> >> > 19.   <bb 6> [local count: 1073741824]:
> >> > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
> >> > 21.   _3 = i_7 < 0;
> >> > 22.   _4 = (int) _3;
> >> > 23.   assert (_4);
> >> > 24.  _5 = j_10 < 0;
> >> > 25.  _6 = (int) _5;
> >> > 26.   assert (_6);
> >> > 27.   return;
> >>
> >> Good show on tracing this through!
> >>
> >> > Statement 19 in the program gets converted from  i_19 = p_16->b; in
> >line 7
> >> > in unoptimized code to i_12 = rt.b; in line 7 in optimized code
> >which
> >> > breaks the dependency chain. We need to figure out the pass that
> >does that
> >> > and put some handling code in there for the _dependent_ptr
> >qualified
> >> > pointers.
>
> Wtf should this be for?  A type qualifier is certainly not going to work.

I might be wrong, but I don't believe that Akshat is claiming that this
is already a complete solution.

But please tell us more.  Given what Akshat is trying to do, what else
is missing or otherwise in need of fixing?

                                                        Thanx, Paul

> Richard.
>
>
>  Passing simply -fipa-pure-const,
> >-fguess-branch-probability or
> >> > any other option alone does not produce the optimized code that
> >breaks the
> >> > dependency. But applying -O1, i.e., allowing all the optimizations
> >does so.
> >> > As passes are applied in a certain order, we need to figure out up
> >to what
> >> > passes, the code remains same and after what pass the dependency
> >does not
> >> > holds. So, we need to check the translated code after every pass.
> >> >
> >> > Does this sounds like a workable plan for ? Let me know your
> >thoughts. If
> >> > this sounds good then, we can do this for all the optimizations
> >that may
> >> > kill the dependencies at somepoint.
> >>
> >> I don't know of a better plan.
> >>
> >> My usual question...  Is there some way to script the checking of the
> >> translated code at the end of each pass?
> >
> >The usual way to check the output of an optimization pass is by
> >dumping the intermediate code at that point and matching the dump
> >against a regexp, as in the tree-ssa directories in the testsuite.
> >-fdump-tree-all will dump after all the gimple optimization passes,
> >and you can look through them until you find the breakage.
> >
> >Jason
>

Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Paul E. McKenney
In reply to this post by Akshat Garg
On Wed, Jul 03, 2019 at 12:39:41AM +0530, Akshat Garg wrote:

> On Tue, Jul 2, 2019 at 8:40 PM Paul E. McKenney <[hidden email]>
> wrote:
>
> > On Tue, Jul 02, 2019 at 02:15:55PM +0100, Ramana Radhakrishnan wrote:
> > > On Tue, Jul 2, 2019 at 1:38 PM Paul E. McKenney <[hidden email]>
> > wrote:
> > >
> > > >
> > > > Once a user-created non-dependent pointer is assigned to, it is OK to
> > > > break the dependency.
> > >
> > > Ok, that's good.
> > > >
> > > > Or am I missing the point here?
> > >
> > > I was just trying to make sure we were on the same page. I wonder if
> > > marking this volatile would be sufficient for prototyping. I suspect
> > > we would need another flag somewhere which someone with gimple
> > > knowledge might be able to help us with.
> >
> > I expect that marking it as volatile would do the trick.  ;-)
> >
> >                                                         Thanx, Paul
> >
> So, marking this pointer as volatile will not allow the compiler to
> modify/optimize the statements, the pointer is appearing in. And we don't
> need to push any other code inside any of the passes. Due to this, we want
> to automatically say those dependent pointers are volatile and introduce a
> new flag for this. Am I getting you guys correctly? Kindly, let me know?

While I suspect that this might work, it would suppress way more
optimizations than would be good.  For but one example, consider:

        _Dependent_ptr int *p;

        p = atomic_load_explicit(gp, memory_order_consume);
        a = p->a;
        b = p->b;

If "p" is volatile, then the compiler will be prevented from keeping
it in a register, which would not make people coding fastpaths at
all happy.  ;-)

Still, use of volatile might be a good technique for prototyping and
analysis of _Dependent_ptr.

                                                        Thanx, Paul

> Akshat
>
> >
> > > regards
> > > Ramana
> > >
> > > >
> > > >                                                         Thanx, Paul
> > > >
> > > > > Ramana
> > > > > >>
> > > > > >>
> > > > > >> > Does this sounds like a workable plan for ? Let me know your
> > thoughts. If this sounds good then, we can do this for all the
> > optimizations that may kill the dependencies at somepoint.
> > > > > >> >
> > > > > >> > -Akshat
> > > > >
> > > >
> > >
> >
> >

Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Richard Biener-2
On July 3, 2019 5:14:58 PM GMT+02:00, "Paul E. McKenney" <[hidden email]> wrote:

>On Wed, Jul 03, 2019 at 12:39:41AM +0530, Akshat Garg wrote:
>> On Tue, Jul 2, 2019 at 8:40 PM Paul E. McKenney
><[hidden email]>
>> wrote:
>>
>> > On Tue, Jul 02, 2019 at 02:15:55PM +0100, Ramana Radhakrishnan
>wrote:
>> > > On Tue, Jul 2, 2019 at 1:38 PM Paul E. McKenney
><[hidden email]>
>> > wrote:
>> > >
>> > > >
>> > > > Once a user-created non-dependent pointer is assigned to, it is
>OK to
>> > > > break the dependency.
>> > >
>> > > Ok, that's good.
>> > > >
>> > > > Or am I missing the point here?
>> > >
>> > > I was just trying to make sure we were on the same page. I wonder
>if
>> > > marking this volatile would be sufficient for prototyping. I
>suspect
>> > > we would need another flag somewhere which someone with gimple
>> > > knowledge might be able to help us with.
>> >
>> > I expect that marking it as volatile would do the trick.  ;-)
>> >
>> >                                                         Thanx, Paul
>> >
>> So, marking this pointer as volatile will not allow the compiler to
>> modify/optimize the statements, the pointer is appearing in. And we
>don't
>> need to push any other code inside any of the passes. Due to this, we
>want
>> to automatically say those dependent pointers are volatile and
>introduce a
>> new flag for this. Am I getting you guys correctly? Kindly, let me
>know?
>
>While I suspect that this might work, it would suppress way more
>optimizations than would be good.  For but one example, consider:
>
> _Dependent_ptr int *p;
>
> p = atomic_load_explicit(gp, memory_order_consume);
> a = p->a;
> b = p->b;
>
>If "p" is volatile, then the compiler will be prevented from keeping
>it in a register, which would not make people coding fastpaths at
>all happy.  ;-)
>
>Still, use of volatile might be a good technique for prototyping and
>analysis of _Dependent_ptr.

With this example can you quickly summarize what kind of guarantees _Dependent_ptr gives and how a compiler
Could possibly break those?

Richard.

>
> Thanx, Paul
>
>> Akshat
>>
>> >
>> > > regards
>> > > Ramana
>> > >
>> > > >
>> > > >                                                         Thanx,
>Paul
>> > > >
>> > > > > Ramana
>> > > > > >>
>> > > > > >>
>> > > > > >> > Does this sounds like a workable plan for ? Let me know
>your
>> > thoughts. If this sounds good then, we can do this for all the
>> > optimizations that may kill the dependencies at somepoint.
>> > > > > >> >
>> > > > > >> > -Akshat
>> > > > >
>> > > >
>> > >
>> >
>> >

Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Paul E. McKenney
On Wed, Jul 03, 2019 at 05:47:56PM +0200, Richard Biener wrote:

> On July 3, 2019 5:14:58 PM GMT+02:00, "Paul E. McKenney" <[hidden email]> wrote:
> >On Wed, Jul 03, 2019 at 12:39:41AM +0530, Akshat Garg wrote:
> >> On Tue, Jul 2, 2019 at 8:40 PM Paul E. McKenney
> ><[hidden email]>
> >> wrote:
> >>
> >> > On Tue, Jul 02, 2019 at 02:15:55PM +0100, Ramana Radhakrishnan
> >wrote:
> >> > > On Tue, Jul 2, 2019 at 1:38 PM Paul E. McKenney
> ><[hidden email]>
> >> > wrote:
> >> > >
> >> > > >
> >> > > > Once a user-created non-dependent pointer is assigned to, it is
> >OK to
> >> > > > break the dependency.
> >> > >
> >> > > Ok, that's good.
> >> > > >
> >> > > > Or am I missing the point here?
> >> > >
> >> > > I was just trying to make sure we were on the same page. I wonder
> >if
> >> > > marking this volatile would be sufficient for prototyping. I
> >suspect
> >> > > we would need another flag somewhere which someone with gimple
> >> > > knowledge might be able to help us with.
> >> >
> >> > I expect that marking it as volatile would do the trick.  ;-)
> >> >
> >> >                                                         Thanx, Paul
> >> >
> >> So, marking this pointer as volatile will not allow the compiler to
> >> modify/optimize the statements, the pointer is appearing in. And we
> >don't
> >> need to push any other code inside any of the passes. Due to this, we
> >want
> >> to automatically say those dependent pointers are volatile and
> >introduce a
> >> new flag for this. Am I getting you guys correctly? Kindly, let me
> >know?
> >
> >While I suspect that this might work, it would suppress way more
> >optimizations than would be good.  For but one example, consider:
> >
> > _Dependent_ptr int *p;
> >
> > p = atomic_load_explicit(gp, memory_order_consume);
> > a = p->a;
> > b = p->b;
> >
> >If "p" is volatile, then the compiler will be prevented from keeping
> >it in a register, which would not make people coding fastpaths at
> >all happy.  ;-)
> >
> >Still, use of volatile might be a good technique for prototyping and
> >analysis of _Dependent_ptr.
>
> With this example can you quickly summarize what kind of guarantees _Dependent_ptr gives and how a compiler
> Could possibly break those?

First I suppose I should fix the bug in the above code.  Or one of the
bugs, at least.  :-/

        struct foo {
                int a;
                int b;
        };

        _Dependent_ptr struct foo *p;

        p = atomic_load_explicit(gp, memory_order_consume);
        a = p->a;
        b = p->b;

And then let me tweak the example a bit.  For the first tweak:

        struct foo {
                int a;
                int b;
        };

        struct foo default_foo = { .a = 42, .b = 43 };
        int *gp = &default_foo;

        ...

        _Dependent_ptr int *p;

        p = atomic_load_explicit(gp, memory_order_consume);
        a = p->a;
        b = p->b;

Suppose that the compiler used feedback-driven optimization, and noticed
that the value of gp was almost always &default_foo.  The compiler might
decide to transform the last three lines as follows:

        p = atomic_load_explicit(gp, memory_order_consume);
        if (p == &default_foo) {
                a = default_foo.a;
                b = default_foo.b;
        } else {
                a = p->a;
                b = p->b;
        }

Now, as long as the value of gp had remained &default_foo for the full
duration of execution, no harm done.  But suppose the following code
was executing concurrently with the above transformed code:

        struct foo *q;

        q = malloc(sizeof(*q));
        assert(q);
        q->a = 1729;
        q->b = 1730;
        atomic_store_explicit(gp, q, memory_order_release);
        do_something();
        default_foo.a = 1;
        default_foo.b = 2;
        atomic_store_explicit(gp, &default_foo, memory_order_release);

In this case, if the memory_order_consume() came just after the pointer
was reset to &default_foo, it is possible that the transformed code
would set "a" to 42 and "b" to 43, which might not be what the guy
writing the code wanted to happen.

One of the purposes of _Dependent_ptr is to prevent this transformation.

This transformation can also happen if the developer's code contained a
comparison to &default_foo -- an ARM or PowerPC compiler backend, upon
seeing two pointers containing the same bits, would likely consider the
two pointers as being interchangeable, and thus might do the dereferences
using the copy that was not tagged with the hardware dependencies.

There are quite a few other examples.  The C++ standards committee
working papers shown below go through a number of them, in case the
above example is not convincing.  Or you could tell me what you would
like to see, and I would attempt to find/create a suitable example.

Does that help, or am I missing your point?

                                                        Thanx, Paul

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0098r0.pdf
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0462r1.pdf

> Richard.
>
> >
> > Thanx, Paul
> >
> >> Akshat
> >>
> >> >
> >> > > regards
> >> > > Ramana
> >> > >
> >> > > >
> >> > > >                                                         Thanx,
> >Paul
> >> > > >
> >> > > > > Ramana
> >> > > > > >>
> >> > > > > >>
> >> > > > > >> > Does this sounds like a workable plan for ? Let me know
> >your
> >> > thoughts. If this sounds good then, we can do this for all the
> >> > optimizations that may kill the dependencies at somepoint.
> >> > > > > >> >
> >> > > > > >> > -Akshat
> >> > > > >
> >> > > >
> >> > >
> >> >
> >> >
>

Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Richard Biener-2
On Wed, Jul 3, 2019 at 6:33 PM Paul E. McKenney <[hidden email]> wrote:

>
> On Wed, Jul 03, 2019 at 05:47:56PM +0200, Richard Biener wrote:
> > On July 3, 2019 5:14:58 PM GMT+02:00, "Paul E. McKenney" <[hidden email]> wrote:
> > >On Wed, Jul 03, 2019 at 12:39:41AM +0530, Akshat Garg wrote:
> > >> On Tue, Jul 2, 2019 at 8:40 PM Paul E. McKenney
> > ><[hidden email]>
> > >> wrote:
> > >>
> > >> > On Tue, Jul 02, 2019 at 02:15:55PM +0100, Ramana Radhakrishnan
> > >wrote:
> > >> > > On Tue, Jul 2, 2019 at 1:38 PM Paul E. McKenney
> > ><[hidden email]>
> > >> > wrote:
> > >> > >
> > >> > > >
> > >> > > > Once a user-created non-dependent pointer is assigned to, it is
> > >OK to
> > >> > > > break the dependency.
> > >> > >
> > >> > > Ok, that's good.
> > >> > > >
> > >> > > > Or am I missing the point here?
> > >> > >
> > >> > > I was just trying to make sure we were on the same page. I wonder
> > >if
> > >> > > marking this volatile would be sufficient for prototyping. I
> > >suspect
> > >> > > we would need another flag somewhere which someone with gimple
> > >> > > knowledge might be able to help us with.
> > >> >
> > >> > I expect that marking it as volatile would do the trick.  ;-)
> > >> >
> > >> >                                                         Thanx, Paul
> > >> >
> > >> So, marking this pointer as volatile will not allow the compiler to
> > >> modify/optimize the statements, the pointer is appearing in. And we
> > >don't
> > >> need to push any other code inside any of the passes. Due to this, we
> > >want
> > >> to automatically say those dependent pointers are volatile and
> > >introduce a
> > >> new flag for this. Am I getting you guys correctly? Kindly, let me
> > >know?
> > >
> > >While I suspect that this might work, it would suppress way more
> > >optimizations than would be good.  For but one example, consider:
> > >
> > >     _Dependent_ptr int *p;
> > >
> > >     p = atomic_load_explicit(gp, memory_order_consume);
> > >     a = p->a;
> > >     b = p->b;
> > >
> > >If "p" is volatile, then the compiler will be prevented from keeping
> > >it in a register, which would not make people coding fastpaths at
> > >all happy.  ;-)
> > >
> > >Still, use of volatile might be a good technique for prototyping and
> > >analysis of _Dependent_ptr.
> >
> > With this example can you quickly summarize what kind of guarantees _Dependent_ptr gives and how a compiler
> > Could possibly break those?
>
> First I suppose I should fix the bug in the above code.  Or one of the
> bugs, at least.  :-/
>
>         struct foo {
>                 int a;
>                 int b;
>         };
>
>         _Dependent_ptr struct foo *p;
>
>         p = atomic_load_explicit(gp, memory_order_consume);
>         a = p->a;
>         b = p->b;
>
> And then let me tweak the example a bit.  For the first tweak:
>
>         struct foo {
>                 int a;
>                 int b;
>         };
>
>         struct foo default_foo = { .a = 42, .b = 43 };
>         int *gp = &default_foo;
>
>         ...
>
>         _Dependent_ptr int *p;
>
>         p = atomic_load_explicit(gp, memory_order_consume);
>         a = p->a;
>         b = p->b;
>
> Suppose that the compiler used feedback-driven optimization, and noticed
> that the value of gp was almost always &default_foo.  The compiler might
> decide to transform the last three lines as follows:
>
>         p = atomic_load_explicit(gp, memory_order_consume);
>         if (p == &default_foo) {
>                 a = default_foo.a;
>                 b = default_foo.b;
>         } else {
>                 a = p->a;
>                 b = p->b;
>         }
>
> Now, as long as the value of gp had remained &default_foo for the full
> duration of execution, no harm done.  But suppose the following code
> was executing concurrently with the above transformed code:
>
>         struct foo *q;
>
>         q = malloc(sizeof(*q));
>         assert(q);
>         q->a = 1729;
>         q->b = 1730;
>         atomic_store_explicit(gp, q, memory_order_release);
>         do_something();
>         default_foo.a = 1;
>         default_foo.b = 2;
>         atomic_store_explicit(gp, &default_foo, memory_order_release);
>
> In this case, if the memory_order_consume() came just after the pointer
> was reset to &default_foo, it is possible that the transformed code
> would set "a" to 42 and "b" to 43, which might not be what the guy
> writing the code wanted to happen.
>
> One of the purposes of _Dependent_ptr is to prevent this transformation.
>
> This transformation can also happen if the developer's code contained a
> comparison to &default_foo -- an ARM or PowerPC compiler backend, upon
> seeing two pointers containing the same bits, would likely consider the
> two pointers as being interchangeable, and thus might do the dereferences
> using the copy that was not tagged with the hardware dependencies.
>
> There are quite a few other examples.  The C++ standards committee
> working papers shown below go through a number of them, in case the
> above example is not convincing.  Or you could tell me what you would
> like to see, and I would attempt to find/create a suitable example.
>
> Does that help, or am I missing your point?

Browsed through that resources, so yes.  So the important thing
is that data dependences in the source are not to be removed or
replaced by control dependences because that enables out-of-order
execution on the CPU (or by the compiler).  This is only needed
for data typed with the _Dependent_ptr qualifier.

I think fully guaranteeing this is hard (besides when you use
volatile), we have the very same issue when dealing with
pointer provenance rules, known for years and not fixed
(and I don't see a good way to fix these issues without
sacrifying performance everywhere).

Good luck ;)

Maybe performance isn't so much of an issue for _Dependent_ptr
(compared to when all pointers are affected).

Richard.

>
>                                                         Thanx, Paul
>
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0098r0.pdf
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0462r1.pdf
>
> > Richard.
> >
> > >
> > >                                                     Thanx, Paul
> > >
> > >> Akshat
> > >>
> > >> >
> > >> > > regards
> > >> > > Ramana
> > >> > >
> > >> > > >
> > >> > > >                                                         Thanx,
> > >Paul
> > >> > > >
> > >> > > > > Ramana
> > >> > > > > >>
> > >> > > > > >>
> > >> > > > > >> > Does this sounds like a workable plan for ? Let me know
> > >your
> > >> > thoughts. If this sounds good then, we can do this for all the
> > >> > optimizations that may kill the dependencies at somepoint.
> > >> > > > > >> >
> > >> > > > > >> > -Akshat
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >> >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: Doubts regarding the _Dependent_ptr keyword

Paul E. McKenney
On Thu, Jul 04, 2019 at 01:00:18PM +0200, Richard Biener wrote:

> On Wed, Jul 3, 2019 at 6:33 PM Paul E. McKenney <[hidden email]> wrote:
> >
> > On Wed, Jul 03, 2019 at 05:47:56PM +0200, Richard Biener wrote:
> > > On July 3, 2019 5:14:58 PM GMT+02:00, "Paul E. McKenney" <[hidden email]> wrote:
> > > >On Wed, Jul 03, 2019 at 12:39:41AM +0530, Akshat Garg wrote:
> > > >> On Tue, Jul 2, 2019 at 8:40 PM Paul E. McKenney
> > > ><[hidden email]>
> > > >> wrote:
> > > >>
> > > >> > On Tue, Jul 02, 2019 at 02:15:55PM +0100, Ramana Radhakrishnan
> > > >wrote:
> > > >> > > On Tue, Jul 2, 2019 at 1:38 PM Paul E. McKenney
> > > ><[hidden email]>
> > > >> > wrote:
> > > >> > >
> > > >> > > >
> > > >> > > > Once a user-created non-dependent pointer is assigned to, it is
> > > >OK to
> > > >> > > > break the dependency.
> > > >> > >
> > > >> > > Ok, that's good.
> > > >> > > >
> > > >> > > > Or am I missing the point here?
> > > >> > >
> > > >> > > I was just trying to make sure we were on the same page. I wonder
> > > >if
> > > >> > > marking this volatile would be sufficient for prototyping. I
> > > >suspect
> > > >> > > we would need another flag somewhere which someone with gimple
> > > >> > > knowledge might be able to help us with.
> > > >> >
> > > >> > I expect that marking it as volatile would do the trick.  ;-)
> > > >> >
> > > >> >                                                         Thanx, Paul
> > > >> >
> > > >> So, marking this pointer as volatile will not allow the compiler to
> > > >> modify/optimize the statements, the pointer is appearing in. And we
> > > >don't
> > > >> need to push any other code inside any of the passes. Due to this, we
> > > >want
> > > >> to automatically say those dependent pointers are volatile and
> > > >introduce a
> > > >> new flag for this. Am I getting you guys correctly? Kindly, let me
> > > >know?
> > > >
> > > >While I suspect that this might work, it would suppress way more
> > > >optimizations than would be good.  For but one example, consider:
> > > >
> > > >     _Dependent_ptr int *p;
> > > >
> > > >     p = atomic_load_explicit(gp, memory_order_consume);
> > > >     a = p->a;
> > > >     b = p->b;
> > > >
> > > >If "p" is volatile, then the compiler will be prevented from keeping
> > > >it in a register, which would not make people coding fastpaths at
> > > >all happy.  ;-)
> > > >
> > > >Still, use of volatile might be a good technique for prototyping and
> > > >analysis of _Dependent_ptr.
> > >
> > > With this example can you quickly summarize what kind of guarantees _Dependent_ptr gives and how a compiler
> > > Could possibly break those?
> >
> > First I suppose I should fix the bug in the above code.  Or one of the
> > bugs, at least.  :-/
> >
> >         struct foo {
> >                 int a;
> >                 int b;
> >         };
> >
> >         _Dependent_ptr struct foo *p;
> >
> >         p = atomic_load_explicit(gp, memory_order_consume);
> >         a = p->a;
> >         b = p->b;
> >
> > And then let me tweak the example a bit.  For the first tweak:
> >
> >         struct foo {
> >                 int a;
> >                 int b;
> >         };
> >
> >         struct foo default_foo = { .a = 42, .b = 43 };
> >         int *gp = &default_foo;
> >
> >         ...
> >
> >         _Dependent_ptr int *p;
> >
> >         p = atomic_load_explicit(gp, memory_order_consume);
> >         a = p->a;
> >         b = p->b;
> >
> > Suppose that the compiler used feedback-driven optimization, and noticed
> > that the value of gp was almost always &default_foo.  The compiler might
> > decide to transform the last three lines as follows:
> >
> >         p = atomic_load_explicit(gp, memory_order_consume);
> >         if (p == &default_foo) {
> >                 a = default_foo.a;
> >                 b = default_foo.b;
> >         } else {
> >                 a = p->a;
> >                 b = p->b;
> >         }
> >
> > Now, as long as the value of gp had remained &default_foo for the full
> > duration of execution, no harm done.  But suppose the following code
> > was executing concurrently with the above transformed code:
> >
> >         struct foo *q;
> >
> >         q = malloc(sizeof(*q));
> >         assert(q);
> >         q->a = 1729;
> >         q->b = 1730;
> >         atomic_store_explicit(gp, q, memory_order_release);
> >         do_something();
> >         default_foo.a = 1;
> >         default_foo.b = 2;
> >         atomic_store_explicit(gp, &default_foo, memory_order_release);
> >
> > In this case, if the memory_order_consume() came just after the pointer
> > was reset to &default_foo, it is possible that the transformed code
> > would set "a" to 42 and "b" to 43, which might not be what the guy
> > writing the code wanted to happen.
> >
> > One of the purposes of _Dependent_ptr is to prevent this transformation.
> >
> > This transformation can also happen if the developer's code contained a
> > comparison to &default_foo -- an ARM or PowerPC compiler backend, upon
> > seeing two pointers containing the same bits, would likely consider the
> > two pointers as being interchangeable, and thus might do the dereferences
> > using the copy that was not tagged with the hardware dependencies.
> >
> > There are quite a few other examples.  The C++ standards committee
> > working papers shown below go through a number of them, in case the
> > above example is not convincing.  Or you could tell me what you would
> > like to see, and I would attempt to find/create a suitable example.
> >
> > Does that help, or am I missing your point?
>
> Browsed through that resources, so yes.  So the important thing
> is that data dependences in the source are not to be removed or
> replaced by control dependences because that enables out-of-order
> execution on the CPU (or by the compiler).  This is only needed
> for data typed with the _Dependent_ptr qualifier.
>
> I think fully guaranteeing this is hard (besides when you use
> volatile), we have the very same issue when dealing with
> pointer provenance rules, known for years and not fixed
> (and I don't see a good way to fix these issues without
> sacrifying performance everywhere).
>
> Good luck ;)

Thank you, we will need it!  ;-)

> Maybe performance isn't so much of an issue for _Dependent_ptr
> (compared to when all pointers are affected).

Here is hoping!

                                                        Thanx, Paul

> Richard.
>
> >
> >                                                         Thanx, Paul
> >
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0098r0.pdf
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0462r1.pdf
> >
> > > Richard.
> > >
> > > >
> > > >                                                     Thanx, Paul
> > > >
> > > >> Akshat
> > > >>
> > > >> >
> > > >> > > regards
> > > >> > > Ramana
> > > >> > >
> > > >> > > >
> > > >> > > >                                                         Thanx,
> > > >Paul
> > > >> > > >
> > > >> > > > > Ramana
> > > >> > > > > >>
> > > >> > > > > >>
> > > >> > > > > >> > Does this sounds like a workable plan for ? Let me know
> > > >your
> > > >> > thoughts. If this sounds good then, we can do this for all the
> > > >> > optimizations that may kill the dependencies at somepoint.
> > > >> > > > > >> >
> > > >> > > > > >> > -Akshat
> > > >> > > > >
> > > >> > > >
> > > >> > >
> > > >> >
> > > >> >
> > >
> >

12