Questions about initialization data during LTO

Questions about initialization data during LTO

Gary Oblock
I'm trying to do a set of optimizations that drastically transform the
layout of arrays of structures. For obvious reasons they will need to
run at LTO time. I'm running into some difficulties comprehending how
the initialization data is stored. Also, I'm seeing DECL_INITIALs
being set to NULL and that is worrisome since it would throw a monkey
wrench into what I'm doing. That is, because for my optimizations to
work they will need to either disqualify an array with initialization
data or transform said data.

So, is the initialization data being hidden at LTO time?
If not what's its format and how do I best manipulate it?
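
To make "transform said data" concrete, here is a minimal sketch of rewriting an initializer alongside a layout change. It assumes the transformation is an array-of-structures to structure-of-arrays split, which the thread does not spell out. `Point`, `PointsSoA` and `to_soa` are hypothetical names; this is ordinary C++, not GCC internals.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type of the original array of structures.
struct Point {
    int   id;
    float weight;
};

// Hypothetical transformed layout: one array per field.
struct PointsSoA {
    std::vector<int>   id;
    std::vector<float> weight;
};

// Rewrite the initializer data of an array of structures into per-field
// arrays. A real pass would do the equivalent reshuffling on the
// compiler's representation of the initializer.
PointsSoA to_soa(const std::vector<Point>& aos) {
    PointsSoA soa;
    soa.id.reserve(aos.size());
    soa.weight.reserve(aos.size());
    for (const Point& p : aos) {
        soa.id.push_back(p.id);
        soa.weight.push_back(p.weight);
    }
    // Every field array must end up with one entry per original element.
    assert(soa.id.size() == aos.size() && soa.weight.size() == aos.size());
    return soa;
}
```

If the initializer cannot be seen at all, no such rewrite is possible, which is why the alternative is to disqualify the array.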

Any insight into how to deal with these problems would be most helpful.
These are some really interesting optimizations and will greatly speed
up code that uses large arrays of structures.


Thanks,

Gary Oblock

Re: Questions about initialization data during LTO

Richard Biener-2
On Thu, Sep 12, 2019 at 1:28 AM Gary Oblock <[hidden email]> wrote:

>
> I'm trying to do a set of optimizations that drastically transform the
> layout of arrays of structures. For obvious reasons they will need to
> run at LTO time. I'm running into some difficulties comprehending how
> the initialization data is stored. Also, I'm seeing DECL_INITIALs
> being set to NULL and that is worrisome since it would throw a monkey
> wrench into what I'm doing. That is, because for my optimizations to
> work they will need to either disqualify an array with initialization
> data or transform said data.
>
> So, is the initialization data being hidden at LTO time?
> If not what's its format and how do I best manipulate it?

You probably have to do at least part of the work at WPA time where
DECL_INITIAL should be appropriately set.  Initializers are generally only
shipped to / output in one LTRANS unit unless they can be used for
constant folding.

> Any insight into how to deal with these problems would be most helpful.
> These are some really interesting optimizations and will greatly speed
> up code that uses large arrays of structures.

Usually hinting the programmer is way easier here since a compiler has
to give up too easily for data layout optimizations.  Unless you only
target specific benchmarks...

Richard.

>
> Thanks,
>
> Gary Oblock

Re: [EXT] Re: Questions about initialization data during LTO

Gary Oblock
On 9/12/19 3:12 AM, Richard Biener wrote:

> On Thu, Sep 12, 2019 at 1:28 AM Gary Oblock <[hidden email]> wrote:
>> I'm trying to do a set of optimizations that drastically transform the
>> layout of arrays of structures. For obvious reasons they will need to
>> run at LTO time. I'm running into some difficulties comprehending how
>> the initialization data is stored. Also, I'm seeing DECL_INITIALs
>> being set to NULL and that is worrisome since it would throw a monkey
>> wrench into what I'm doing. That is, because for my optimizations to
>> work they will need to either disqualify an array with initialization
>> data or transform said data.
>>
>> So, is the initialization data being hidden at LTO time?
>> If not what's its format and how do I best manipulate it?
> You probably have to do at least part of the work at WPA time where
> DECL_INITIAL should be appropriately set.  Initializers are generally only
> shipped to / output in one LTRANS unit unless they can be used for
> constant folding.
First, I thought WHOPR was deprecated?? Second, if the initializations
are stripped is there any indication that it's occurred? Third, is it
possible to selectively disable the stripping for arrays of structures
and structures?

Note, this optimization is a lot harder to do as bits and pieces here and
there and perhaps totally infeasible. I have to be able to examine all the
uses of all the structures and all the variables of these types in order
to qualify the optimization. Then I have to do the same thing all over
again to apply the transformations.
>> Any insight into how to deal with these problems would be most helpful.
>> These are some really interesting optimizations and will greatly speed
>> up code that uses large arrays of structures.
> Usually hinting the programmer is way easier here since a compiler has
> to give up too easily for data layout optimizations.  Unless you only
> target specific benchmarks...
I don't follow you about hinting... Are you saying the programmer
would need to supply hints in the code enabling the optimizations?
I want to be as general as possible and improve "dusty decks" too.
Admittedly the first cut of these optimizations will be quite limited.
My plan is to get the framework to function and then extend it.

Gary
>
> Richard.
>
>> Thanks,
>>
>> Gary Oblock


Re: Questions about initialization data during LTO

Martin Liška-2
In reply to this post by Gary Oblock
On 9/11/19 7:27 PM, Gary Oblock wrote:
> I'm trying to do a set of optimizations that drastically transform the
> layout of arrays of structures.

You're probably talking about the struct-reorg pass that we used to have.
The last note I have about the optimization comes from Cauldron 2015:
https://www.youtube.com/watch?v=vhV75sys0Nw

You may be interested. You can also contact Olga; she can probably share details of her work.

Martin

Re: [EXT] Re: Questions about initialization data during LTO

Richard Biener-2
In reply to this post by Gary Oblock
On Thu, Sep 12, 2019 at 9:09 PM Gary Oblock <[hidden email]> wrote:

>
> On 9/12/19 3:12 AM, Richard Biener wrote:
> > On Thu, Sep 12, 2019 at 1:28 AM Gary Oblock <[hidden email]> wrote:
> >> I'm trying to do a set of optimizations that drastically transform the
> >> layout of arrays of structures. For obvious reasons they will need to
> >> run at LTO time. I'm running into some difficulties comprehending how
> >> the initialization data is stored. Also, I'm seeing DECL_INITIALs
> >> being set to NULL and that is worrisome since it would throw a monkey
> >> wrench into what I'm doing. That is, because for my optimizations to
> >> work they will need to either disqualify an array with initialization
> >> data or transform said data.
> >>
> >> So, is the initialization data being hidden at LTO time?
> >> If not what's its format and how do I best manipulate it?
> > You probably have to do at least part of the work at WPA time where
> > DECL_INITIAL should be appropriately set.  Initializers are generally only
> > shipped to / output in one LTRANS unit unless they can be used for
> > constant folding.
> First, I thought WHOPR was deprecated??

On the contrary, it's the default.

> Second, if the initializations
> are stripped is there any indication that it's occurred? Third, is it
> possible to selectively disable the stripping for arrays of structures
> and structures?

As said, you need to view the whole program anyways so work at WPA
time.  That means all initializers are still present.

> Note, this optimization is a lot harder to do as bits and pieces here and
> there and perhaps totally infeasible. I have to be able to examine all the
> uses of all the structures and all the variables of these types in order
> to qualify the optimization. Then I have to do the same thing all over
> again to apply the transformations.

Yes, indeed you have to.  Ideally you'd do function-level analysis
at pre-WPA compile-time, then combine and decide how to transform
at WPA time and actually apply the transform to the function bodies
(and initializers) at LTRANS.
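
The division of labour described above can be sketched as a toy model. This is plain C++, not GCC's IPA pass interface: `UnitSummary`, `analyze_unit`, `decide_wpa` and `apply_ltrans` are hypothetical names, and the legality rule (a type is disqualified as soon as any unit reports it as escaping) is an assumption chosen only for illustration.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Per-translation-unit summary produced at compile time, before WPA.
struct UnitSummary {
    std::set<std::string> types_used;      // struct types this unit touches
    std::set<std::string> types_escaping;  // types this unit cannot prove safe
};

// Stage 1: function-level analysis, run once per translation unit.
// Here the "analysis" simply records the facts handed to it.
UnitSummary analyze_unit(const std::vector<std::string>& used,
                         const std::vector<std::string>& escaping) {
    UnitSummary s;
    s.types_used.insert(used.begin(), used.end());
    s.types_used.insert(escaping.begin(), escaping.end());
    s.types_escaping.insert(escaping.begin(), escaping.end());
    return s;
}

// Stage 2: WPA combines all summaries and decides what to transform.
std::set<std::string> decide_wpa(const std::vector<UnitSummary>& summaries) {
    std::set<std::string> candidates, disqualified;
    for (const UnitSummary& s : summaries) {
        candidates.insert(s.types_used.begin(), s.types_used.end());
        disqualified.insert(s.types_escaping.begin(), s.types_escaping.end());
    }
    std::set<std::string> decided;
    for (const std::string& t : candidates)
        if (disqualified.count(t) == 0)
            decided.insert(t);
    return decided;
}

// Stage 3: each LTRANS unit applies the global decision to the types it
// uses. Returns the names of the types rewritten in this unit.
std::vector<std::string> apply_ltrans(const UnitSummary& unit,
                                      const std::set<std::string>& decided) {
    std::vector<std::string> rewritten;
    for (const std::string& t : unit.types_used) {
        if (decided.count(t)) {
            // A decided type is never one this unit reported as escaping.
            assert(unit.types_escaping.count(t) == 0);
            rewritten.push_back(t);
        }
    }
    return rewritten;
}
```

The point of the split is that only stage 2 needs the whole-program view; stages 1 and 3 work on one unit at a time.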

> >> Any insight into how to deal with these problems would be most helpful.
> >> These are some really interesting optimizations and will greatly speed
> >> up code that uses large arrays of structures.
> > Usually hinting the programmer is way easier here since a compiler has
> > to give up too easily for data layout optimizations.  Unless you only
> > target specific benchmarks...
> I don't follow you about hinting... Are you saying the programmer
> would need to supply hints in the code enabling the optimizations?
> I want to be as general as possible and improve "dusty decks" too.
> Admittedly the first cut of these optimizations will be quite limited.
> My plan is to get the framework to function and then extend it.

No, the compiler should hint the programmer at "hey, if you'd reorder
this performance would go up!" and have the programmer do the
transform.

Richard.

> Gary
> >
> > Richard.
> >
> >> Thanks,
> >>
> >> Gary Oblock
>

Re: [EXT] Re: Questions about initialization data during LTO

Gary Oblock
In reply to this post by Martin Liška-2
On 9/13/19 5:20 AM, Martin Liška wrote:

> On 9/11/19 7:27 PM, Gary Oblock wrote:
>> I'm trying to do a set of optimizations that drastically transform the
>> layout of arrays of structures.
> You're probably talking about struct-reorg pass that we used to have.
> Last note about the optimization I have comes from Cauldron 2015:
> https://www.youtube.com/watch?v=vhV75sys0Nw
>
> You may be interested. You can also contact Olga, she can probably share details of her work.
>
> Martin
>
Martin --

I've got what I feel is a good approach of my own and it's similar to
some of the other proposals floating around on the Internet and to ideas
from other sources.

I actually work for the people that employed Olga to do the structure
reorg work and her approach, which I'm trying not to duplicate, was never
completed.

So, back to my questions, any ideas about how to get initialization
information? This is going to be a very powerful optimization for code
with structures of arrays and I just need a little help getting around a
few obstacles in my path.

Thanks,

-- Gary

Re: [EXT] Re: Questions about initialization data during LTO

Martin Liška-2
On 9/13/19 3:01 PM, Gary Oblock wrote:
> So, back to my questions, any ideas about how to get initialization
> information? This is going to be a very powerful optimization for code
> with structures of arrays and I just need a little help getting around a
> few obstacles in my path.

Sure. So I would point you to the IPA ICF pass, which merges
variables in the WPA phase of LTO. Let's take a look at:

ipa-icf.c:1839 (sem_variable::equals) where
we do:
   if (DECL_INITIAL (decl) == error_mark_node && in_lto_p)
     dyn_cast <varpool_node *>(node)->get_constructor ();

That's how to load the DECL_INITIAL of variables.
Does it help you?
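
The pattern in that snippet is a sentinel check followed by a lazy load: in LTO mode a DECL_INITIAL equal to error_mark_node appears to mark an initializer that exists but has not been read in yet, and get_constructor () brings it in. A toy model of the same idea follows. It is plain C++ rather than GCC code, and `Variable`, `InitState` and `get_initializer` are hypothetical stand-ins.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// The three states an initializer slot can be in, in this toy model.
enum class InitState {
    None,       // the variable has no initializer
    NotLoaded,  // an initializer exists but has not been read in yet
    Loaded      // the initializer is available in `init`
};

struct Variable {
    std::string name;
    InitState state = InitState::None;
    std::vector<int> init;                     // meaningful only when Loaded
    std::function<std::vector<int>()> loader;  // stands in for reading the data
};

// Return the initializer, loading it on first use. Returns nullptr only
// when the variable has no initializer at all.
const std::vector<int>* get_initializer(Variable& v) {
    if (v.state == InitState::NotLoaded) {
        assert(v.loader);  // a pending initializer must have a source
        v.init = v.loader();
        v.state = InitState::Loaded;
    }
    return v.state == InitState::Loaded ? &v.init : nullptr;
}
```

The useful property for a layout pass is that "not loaded yet" stays distinguishable from "no initializer", so a pending initializer is never mistaken for an absent one.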

Martin

Re: [EXT] Re: Questions about initialization data during LTO

Gary Oblock
On 9/14/19 8:39 AM, Martin Liška wrote:

> On 9/13/19 3:01 PM, Gary Oblock wrote:
>> So, back to my questions, any ideas about how to get initialization
>> information? This is going to be a very powerful optimization for code
>> with structures of arrays and I just need a little help getting around a
>> few obstacles in my path.
>
> Sure. So I would point you to the IPA ICF pass, which merges
> variables in the WPA phase of LTO. Let's take a look at:
>
> ipa-icf.c:1839 (sem_variable::equals) where
> we do:
>    if (DECL_INITIAL (decl) == error_mark_node && in_lto_p)
>      dyn_cast <varpool_node *>(node)->get_constructor ();
>
> That's how to load the DECL_INITIAL of variables.
> Does it help you?
>
> Martin



So Martin, let me get this straight, all of the initialization information
can be fetched here? I ask this because I was under the impression that
some of it was deleted and could not be recovered. The worst case scenario
for my optimization (making it illegal) is if the user specified an initialization
and the initialization disappeared without leaving a trace in the IR that it ever
existed.

Many thanks,

Gary

Re: [EXT] Re: Questions about initialization data during LTO

Martin Liška-2
On 9/16/19 8:28 PM, Gary Oblock wrote:
> So Martin, let me get this straight, all of the initialization information
> can be fetched here?

Yes.

> I ask this because I was under the impression that
> some of it was deleted and could not be recovered.

No, that should not happen.

Martin

> The worst case scenario
> for my optimization (making it illegal) is if the user specified an initialization
> and the initialization disappeared without leaving a trace in the IR that it ever
> existed.