[PATCH] Add Optimization for various IPA parameters.

classic Classic list List threaded Threaded
22 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[PATCH] Add Optimization for various IPA parameters.

Martin Liška-2
Hi.

This is similar transformation for IPA passes. This time,
one needs to use opt_for_fn in order to get the right
parameter values.

@Martin, Honza:
There are last few remaining parameters which should use
opt_for_fn:

param_ipa_max_agg_items
param_ipa_cp_unit_growth
param_ipa_sra_max_replacements
param_max_speculative_devirt_maydefs

Can you please help me with these as it's in your code?

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2020-01-03  Martin Liska  <[hidden email]>

        * auto-profile.c (auto_profile): Use opt_for_fn
        for a parameter.
        * ipa-cp.c (ipcp_lattice::add_value): Likewise.
        (propagate_vals_across_arith_jfunc): Likewise.
        (hint_time_bonus): Likewise.
        (incorporate_penalties): Likewise.
        (good_cloning_opportunity_p): Likewise.
        (perform_estimation_of_a_value): Likewise.
        (estimate_local_effects): Likewise.
        (ipcp_propagate_stage): Likewise.
        * ipa-fnsummary.c (decompose_param_expr): Likewise.
        (set_switch_stmt_execution_predicate): Likewise.
        (analyze_function_body): Likewise.
        * ipa-inline-analysis.c (offline_size): Likewise.
        * ipa-inline.c (early_inliner): Likewise.
        * ipa-prop.c (ipa_analyze_node): Likewise.
        (ipcp_transform_function): Likewise.
        * ipa-sra.c (process_scan_results): Likewise.
        (ipa_sra_summarize_function): Likewise.
        * params.opt: Rename ipcp-unit-growth to
        ipa-cp-unit-growth.  Add Optimization for various
        IPA-related parameters.
---
  gcc/auto-profile.c        |  3 ++-
  gcc/ipa-cp.c              | 44 +++++++++++++++++++++++----------------
  gcc/ipa-fnsummary.c       |  7 ++++---
  gcc/ipa-inline-analysis.c |  7 ++++---
  gcc/ipa-inline.c          |  6 ++++--
  gcc/ipa-prop.c            |  4 ++--
  gcc/ipa-sra.c             |  6 ++++--
  gcc/params.opt            | 34 +++++++++++++++---------------
  8 files changed, 63 insertions(+), 48 deletions(-)



0001-Add-Optimization-for-various-IPA-parameters.patch (17K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Martin Jambor-3
Hi,

On Fri, Jan 03 2020, Martin Liška wrote:

> Hi.
>
> This is similar transformation for IPA passes. This time,
> one needs to use opt_for_fn in order to get the right
> parameter values.
>
> @Martin, Honza:
> There are last few remaining parameters which should use
> opt_for_fn:
>
> param_ipa_max_agg_items


IPA-CP: Always access param_ipa_max_agg_items through opt_for_fn

2020-01-07  Martin Jambor  <[hidden email]>

        * params.opt (param_ipa_max_agg_items): Mark as Optimization
        * ipa-cp.c (merge_agg_lats_step): New parameter max_agg_items, use
        instead of param_ipa_max_agg_items.
        (merge_aggregate_lattices): Extract param_ipa_max_agg_items from
        optimization info for the callee.
---
 gcc/ipa-cp.c   | 14 +++++++++-----
 gcc/ipa-prop.c |  8 +++++---
 gcc/params.opt |  2 +-
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 4381b35a809..9e20e278eff 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -2458,13 +2458,13 @@ set_check_aggs_by_ref (class ipcp_param_lattices *dest_plats,
    unless there are too many already.  If there are two many, return false.  If
    there are overlaps turn whole DEST_PLATS to bottom and return false.  If any
    skipped lattices were newly marked as containing variable, set *CHANGE to
-   true.  */
+   true.  MAX_AGG_ITEMS is the maximum number of lattices.  */
 
 static bool
 merge_agg_lats_step (class ipcp_param_lattices *dest_plats,
      HOST_WIDE_INT offset, HOST_WIDE_INT val_size,
      struct ipcp_agg_lattice ***aglat,
-     bool pre_existing, bool *change)
+     bool pre_existing, bool *change, int max_agg_items)
 {
   gcc_checking_assert (offset >= 0);
 
@@ -2499,7 +2499,7 @@ merge_agg_lats_step (class ipcp_param_lattices *dest_plats,
   set_agg_lats_to_bottom (dest_plats);
   return false;
  }
-      if (dest_plats->aggs_count == param_ipa_max_agg_items)
+      if (dest_plats->aggs_count == max_agg_items)
  return false;
       dest_plats->aggs_count++;
       new_al = ipcp_agg_lattice_pool.allocate ();
@@ -2553,6 +2553,8 @@ merge_aggregate_lattices (struct cgraph_edge *cs,
     ret |= set_agg_lats_contain_variable (dest_plats);
   dst_aglat = &dest_plats->aggs;
 
+  int max_agg_items = opt_for_fn (cs->callee->function_symbol ()->decl,
+  param_ipa_max_agg_items);
   for (struct ipcp_agg_lattice *src_aglat = src_plats->aggs;
        src_aglat;
        src_aglat = src_aglat->next)
@@ -2562,7 +2564,7 @@ merge_aggregate_lattices (struct cgraph_edge *cs,
       if (new_offset < 0)
  continue;
       if (merge_agg_lats_step (dest_plats, new_offset, src_aglat->size,
-       &dst_aglat, pre_existing, &ret))
+       &dst_aglat, pre_existing, &ret, max_agg_items))
  {
   struct ipcp_agg_lattice *new_al = *dst_aglat;
 
@@ -2742,6 +2744,8 @@ propagate_aggs_across_jump_function (struct cgraph_edge *cs,
       if (set_check_aggs_by_ref (dest_plats, jfunc->agg.by_ref))
  return true;
 
+      int max_agg_items = opt_for_fn (cs->callee->function_symbol ()->decl,
+      param_ipa_max_agg_items);
       FOR_EACH_VEC_ELT (*jfunc->agg.items, i, item)
  {
   HOST_WIDE_INT val_size;
@@ -2751,7 +2755,7 @@ propagate_aggs_across_jump_function (struct cgraph_edge *cs,
   val_size = tree_to_shwi (TYPE_SIZE (item->type));
 
   if (merge_agg_lats_step (dest_plats, item->offset, val_size,
-   &aglat, pre_existing, &ret))
+   &aglat, pre_existing, &ret, max_agg_items))
     {
       ret |= propagate_aggregate_lattice (cs, item, *aglat);
       aglat = &(*aglat)->next;
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index fcb13dfbac4..1ccdbbfb8c0 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -1852,8 +1852,10 @@ determine_known_aggregate_parts (struct ipa_func_body_info *fbi,
   tree arg_base;
   bool check_ref, by_ref;
   ao_ref r;
+  unsigned max_agg_items = opt_for_fn (fbi->node->decl,
+       param_ipa_max_agg_items);
 
-  if (param_ipa_max_agg_items == 0)
+  if (max_agg_items == 0)
     return;
 
   /* The function operates in three stages.  First, we prepare check_ref, r,
@@ -1951,14 +1953,14 @@ determine_known_aggregate_parts (struct ipa_func_body_info *fbi,
  operands, whose definitions can finally reach the call.  */
       add_to_agg_contents_list (&list, (*copy = *content, copy));
 
-      if (++value_count == param_ipa_max_agg_items)
+      if (++value_count == max_agg_items)
  break;
     }
 
   /* Add to the list consisting of all dominating virtual operands.  */
   add_to_agg_contents_list (&all_list, content);
 
-  if (++item_count == 2 * param_ipa_max_agg_items)
+  if (++item_count == 2 * max_agg_items)
     break;
  }
       dom_vuse = gimple_vuse (stmt);
diff --git a/gcc/params.opt b/gcc/params.opt
index 6f05b29a929..d5f0d9a69ce 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -223,7 +223,7 @@ Common Joined UInteger Var(param_ipa_max_aa_steps) Init(25000) Param
 Maximum number of statements that will be visited by IPA formal parameter analysis based on alias analysis in any given function.
 
 -param=ipa-max-agg-items=
-Common Joined UInteger Var(param_ipa_max_agg_items) Init(16) Param
+Common Joined UInteger Var(param_ipa_max_agg_items) Init(16) Param Optimization
 Maximum number of aggregate content items for a parameter in jump functions and lattices.
 
 -param=ipa-max-param-expr-ops=
--
2.24.1

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Jan Hubicka-2
> Hi,
>
> On Fri, Jan 03 2020, Martin Liška wrote:
> > Hi.
> >
> > This is similar transformation for IPA passes. This time,
> > one needs to use opt_for_fn in order to get the right
> > parameter values.
> >
> > @Martin, Honza:
> > There are last few remaining parameters which should use
> > opt_for_fn:
> >
> > param_ipa_max_agg_items
>
>
> IPA-CP: Always access param_ipa_max_agg_items through opt_for_fn
>
> 2020-01-07  Martin Jambor  <[hidden email]>
>
> * params.opt (param_ipa_max_agg_items): Mark as Optimization
> * ipa-cp.c (merge_agg_lats_step): New parameter max_agg_items, use
> instead of param_ipa_max_agg_items.
> (merge_aggregate_lattices): Extract param_ipa_max_agg_items from
> optimization info for the callee.
OK, thanks (to both Martins) for working on this

Honza

> ---
>  gcc/ipa-cp.c   | 14 +++++++++-----
>  gcc/ipa-prop.c |  8 +++++---
>  gcc/params.opt |  2 +-
>  3 files changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 4381b35a809..9e20e278eff 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -2458,13 +2458,13 @@ set_check_aggs_by_ref (class ipcp_param_lattices *dest_plats,
>     unless there are too many already.  If there are two many, return false.  If
>     there are overlaps turn whole DEST_PLATS to bottom and return false.  If any
>     skipped lattices were newly marked as containing variable, set *CHANGE to
> -   true.  */
> +   true.  MAX_AGG_ITEMS is the maximum number of lattices.  */
>  
>  static bool
>  merge_agg_lats_step (class ipcp_param_lattices *dest_plats,
>       HOST_WIDE_INT offset, HOST_WIDE_INT val_size,
>       struct ipcp_agg_lattice ***aglat,
> -     bool pre_existing, bool *change)
> +     bool pre_existing, bool *change, int max_agg_items)
>  {
>    gcc_checking_assert (offset >= 0);
>  
> @@ -2499,7 +2499,7 @@ merge_agg_lats_step (class ipcp_param_lattices *dest_plats,
>    set_agg_lats_to_bottom (dest_plats);
>    return false;
>   }
> -      if (dest_plats->aggs_count == param_ipa_max_agg_items)
> +      if (dest_plats->aggs_count == max_agg_items)
>   return false;
>        dest_plats->aggs_count++;
>        new_al = ipcp_agg_lattice_pool.allocate ();
> @@ -2553,6 +2553,8 @@ merge_aggregate_lattices (struct cgraph_edge *cs,
>      ret |= set_agg_lats_contain_variable (dest_plats);
>    dst_aglat = &dest_plats->aggs;
>  
> +  int max_agg_items = opt_for_fn (cs->callee->function_symbol ()->decl,
> +  param_ipa_max_agg_items);
>    for (struct ipcp_agg_lattice *src_aglat = src_plats->aggs;
>         src_aglat;
>         src_aglat = src_aglat->next)
> @@ -2562,7 +2564,7 @@ merge_aggregate_lattices (struct cgraph_edge *cs,
>        if (new_offset < 0)
>   continue;
>        if (merge_agg_lats_step (dest_plats, new_offset, src_aglat->size,
> -       &dst_aglat, pre_existing, &ret))
> +       &dst_aglat, pre_existing, &ret, max_agg_items))
>   {
>    struct ipcp_agg_lattice *new_al = *dst_aglat;
>  
> @@ -2742,6 +2744,8 @@ propagate_aggs_across_jump_function (struct cgraph_edge *cs,
>        if (set_check_aggs_by_ref (dest_plats, jfunc->agg.by_ref))
>   return true;
>  
> +      int max_agg_items = opt_for_fn (cs->callee->function_symbol ()->decl,
> +      param_ipa_max_agg_items);
>        FOR_EACH_VEC_ELT (*jfunc->agg.items, i, item)
>   {
>    HOST_WIDE_INT val_size;
> @@ -2751,7 +2755,7 @@ propagate_aggs_across_jump_function (struct cgraph_edge *cs,
>    val_size = tree_to_shwi (TYPE_SIZE (item->type));
>  
>    if (merge_agg_lats_step (dest_plats, item->offset, val_size,
> -   &aglat, pre_existing, &ret))
> +   &aglat, pre_existing, &ret, max_agg_items))
>      {
>        ret |= propagate_aggregate_lattice (cs, item, *aglat);
>        aglat = &(*aglat)->next;
> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
> index fcb13dfbac4..1ccdbbfb8c0 100644
> --- a/gcc/ipa-prop.c
> +++ b/gcc/ipa-prop.c
> @@ -1852,8 +1852,10 @@ determine_known_aggregate_parts (struct ipa_func_body_info *fbi,
>    tree arg_base;
>    bool check_ref, by_ref;
>    ao_ref r;
> +  unsigned max_agg_items = opt_for_fn (fbi->node->decl,
> +       param_ipa_max_agg_items);
>  
> -  if (param_ipa_max_agg_items == 0)
> +  if (max_agg_items == 0)
>      return;
>  
>    /* The function operates in three stages.  First, we prepare check_ref, r,
> @@ -1951,14 +1953,14 @@ determine_known_aggregate_parts (struct ipa_func_body_info *fbi,
>   operands, whose definitions can finally reach the call.  */
>        add_to_agg_contents_list (&list, (*copy = *content, copy));
>  
> -      if (++value_count == param_ipa_max_agg_items)
> +      if (++value_count == max_agg_items)
>   break;
>      }
>  
>    /* Add to the list consisting of all dominating virtual operands.  */
>    add_to_agg_contents_list (&all_list, content);
>  
> -  if (++item_count == 2 * param_ipa_max_agg_items)
> +  if (++item_count == 2 * max_agg_items)
>      break;
>   }
>        dom_vuse = gimple_vuse (stmt);
> diff --git a/gcc/params.opt b/gcc/params.opt
> index 6f05b29a929..d5f0d9a69ce 100644
> --- a/gcc/params.opt
> +++ b/gcc/params.opt
> @@ -223,7 +223,7 @@ Common Joined UInteger Var(param_ipa_max_aa_steps) Init(25000) Param
>  Maximum number of statements that will be visited by IPA formal parameter analysis based on alias analysis in any given function.
>  
>  -param=ipa-max-agg-items=
> -Common Joined UInteger Var(param_ipa_max_agg_items) Init(16) Param
> +Common Joined UInteger Var(param_ipa_max_agg_items) Init(16) Param Optimization
>  Maximum number of aggregate content items for a parameter in jump functions and lattices.
>  
>  -param=ipa-max-param-expr-ops=
> --
> 2.24.1
>
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Martin Jambor-3
In reply to this post by Martin Liška-2
Hi,

On Fri, Jan 03 2020, Martin Liška wrote:

> Hi.
>
> This is similar transformation for IPA passes. This time,
> one needs to use opt_for_fn in order to get the right
> parameter values.
>
> @Martin, Honza:
> There are last few remaining parameters which should use
> opt_for_fn:
>
> param_ipa_cp_unit_growth

So as we discussed, picking this one from one particular node is not
what one would expect to happen, but inlining does it too and so anyway:


IPA-CP: Always access param_ipcp_unit_growth through opt_for_fn

2020-01-07  Martin Jambor  <[hidden email]>

        * params.opt (param_ipcp_unit_growth): Mark as Optimization.
        * ipa-cp.c (max_new_size): Removed.
        (orig_overall_size): New variable.
        (get_max_overall_size): New function.
        (estimate_local_effects): Use it.  Adjust dump.
        (decide_about_value): Likewise.
        (ipcp_propagate_stage): Do not calculate max_new_size, just store
        orig_overall_size.  Adjust dump.
        (ipa_cp_c_finalize): Clear orig_overall_size instead of max_new_size.
---
 gcc/ipa-cp.c   | 39 ++++++++++++++++++++++++++-------------
 gcc/params.opt |  2 +-
 2 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 9e20e278eff..c2572e3e0e8 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -375,7 +375,7 @@ static profile_count max_count;
 
 /* Original overall size of the program.  */
 
-static long overall_size, max_new_size;
+static long overall_size, orig_overall_size;
 
 /* Node name to unique clone suffix number map.  */
 static hash_map<const char *, unsigned> *clone_num_suffixes;
@@ -3395,6 +3395,23 @@ perform_estimation_of_a_value (cgraph_node *node, vec<tree> known_csts,
   val->local_size_cost = size;
 }
 
+/* Get the overall limit oof growth based on parameters extracted from growth.
+   it does not really make sense to mix functions with different overall growth
+   limits but it is possible and if it happens, we do not want to select one
+   limit at random.  */
+
+static long
+get_max_overall_size (cgraph_node *node)
+{
+  long max_new_size = orig_overall_size;
+  long large_unit = opt_for_fn (node->decl, param_large_unit_insns);
+  if (max_new_size < large_unit)
+    max_new_size = large_unit;
+  int unit_growth = opt_for_fn (node->decl, param_ipcp_unit_growth);
+  max_new_size += max_new_size * unit_growth / 100 + 1;
+  return max_new_size;
+}
+
 /* Iterate over known values of parameters of NODE and estimate the local
    effects in terms of time and size they have.  */
 
@@ -3457,7 +3474,7 @@ estimate_local_effects (struct cgraph_node *node)
    stats.freq_sum, stats.count_sum,
    size))
  {
-  if (size + overall_size <= max_new_size)
+  if (size + overall_size <= get_max_overall_size (node))
     {
       info->do_clone_for_all_contexts = true;
       overall_size += size;
@@ -3467,8 +3484,8 @@ estimate_local_effects (struct cgraph_node *node)
  "known contexts, growth deemed beneficial.\n");
     }
   else if (dump_file && (dump_flags & TDF_DETAILS))
-    fprintf (dump_file, "   Not cloning for all contexts because "
-     "max_new_size would be reached with %li.\n",
+    fprintf (dump_file, "  Not cloning for all contexts because "
+     "maximum unit size would be reached with %li.\n",
      size + overall_size);
  }
       else if (dump_file && (dump_flags & TDF_DETAILS))
@@ -3860,14 +3877,10 @@ ipcp_propagate_stage (class ipa_topo_info *topo)
     max_count = max_count.max (node->count.ipa ());
   }
 
-  max_new_size = overall_size;
-  if (max_new_size < param_large_unit_insns)
-    max_new_size = param_large_unit_insns;
-  max_new_size += max_new_size * param_ipcp_unit_growth / 100 + 1;
+  orig_overall_size = overall_size;
 
   if (dump_file)
-    fprintf (dump_file, "\noverall_size: %li, max_new_size: %li\n",
-     overall_size, max_new_size);
+    fprintf (dump_file, "\noverall_size: %li\n", overall_size);
 
   propagate_constants_topo (topo);
   if (flag_checking)
@@ -5380,11 +5393,11 @@ decide_about_value (struct cgraph_node *node, int index, HOST_WIDE_INT offset,
       perhaps_add_new_callers (node, val);
       return false;
     }
-  else if (val->local_size_cost + overall_size > max_new_size)
+  else if (val->local_size_cost + overall_size > get_max_overall_size (node))
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
  fprintf (dump_file, "   Ignoring candidate value because "
- "max_new_size would be reached with %li.\n",
+ "maximum unit size would be reached with %li.\n",
  val->local_size_cost + overall_size);
       return false;
     }
@@ -5928,6 +5941,6 @@ ipa_cp_c_finalize (void)
 {
   max_count = profile_count::uninitialized ();
   overall_size = 0;
-  max_new_size = 0;
+  orig_overall_size = 0;
   ipcp_free_transformation_sum ();
 }
diff --git a/gcc/params.opt b/gcc/params.opt
index d5f0d9a69ce..3cd6d7d481d 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -243,7 +243,7 @@ Common Joined UInteger Var(param_ipa_sra_ptr_growth_factor) Init(2) Param
 Maximum allowed growth of number and total size of new parameters that ipa-sra replaces a pointer to an aggregate with.
 
 -param=ipcp-unit-growth=
-Common Joined UInteger Var(param_ipcp_unit_growth) Init(10) Param
+Common Joined UInteger Var(param_ipcp_unit_growth) Optimization Init(10) Param
 How much can given compilation unit grow because of the interprocedural constant propagation (in percent).
 
 -param=ira-loop-reserved-regs=
--
2.24.1

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Jan Hubicka-2
In reply to this post by Martin Liška-2
> Hi.
>
> This is similar transformation for IPA passes. This time,
> one needs to use opt_for_fn in order to get the right
> parameter values.
>
> @Martin, Honza:
> There are last few remaining parameters which should use
> opt_for_fn:
>
> param_ipa_max_agg_items
> param_ipa_cp_unit_growth
> param_ipa_sra_max_replacements
> param_max_speculative_devirt_maydefs
max_speculative_devirt_maydefs is used only on analysis stage when cfun
is set up properly, so it is safe to set it PerFunction.

Honza

>
> Can you please help me with these as it's in your code?
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> 2020-01-03  Martin Liska  <[hidden email]>
>
> * auto-profile.c (auto_profile): Use opt_for_fn
> for a parameter.
> * ipa-cp.c (ipcp_lattice::add_value): Likewise.
> (propagate_vals_across_arith_jfunc): Likewise.
> (hint_time_bonus): Likewise.
> (incorporate_penalties): Likewise.
> (good_cloning_opportunity_p): Likewise.
> (perform_estimation_of_a_value): Likewise.
> (estimate_local_effects): Likewise.
> (ipcp_propagate_stage): Likewise.
> * ipa-fnsummary.c (decompose_param_expr): Likewise.
> (set_switch_stmt_execution_predicate): Likewise.
> (analyze_function_body): Likewise.
> * ipa-inline-analysis.c (offline_size): Likewise.
> * ipa-inline.c (early_inliner): Likewise.
> * ipa-prop.c (ipa_analyze_node): Likewise.
> (ipcp_transform_function): Likewise.
> * ipa-sra.c (process_scan_results): Likewise.
> (ipa_sra_summarize_function): Likewise.
> * params.opt: Rename ipcp-unit-growth to
> ipa-cp-unit-growth.  Add Optimization for various
> IPA-related parameters.
> ---
>  gcc/auto-profile.c        |  3 ++-
>  gcc/ipa-cp.c              | 44 +++++++++++++++++++++++----------------
>  gcc/ipa-fnsummary.c       |  7 ++++---
>  gcc/ipa-inline-analysis.c |  7 ++++---
>  gcc/ipa-inline.c          |  6 ++++--
>  gcc/ipa-prop.c            |  4 ++--
>  gcc/ipa-sra.c             |  6 ++++--
>  gcc/params.opt            | 34 +++++++++++++++---------------
>  8 files changed, 63 insertions(+), 48 deletions(-)
>
>

> diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
> index 6aca2f29022..c5e9f1336a7 100644
> --- a/gcc/auto-profile.c
> +++ b/gcc/auto-profile.c
> @@ -1628,7 +1628,8 @@ auto_profile (void)
>         function before annotation, so the profile inside bar@loc_foo2
>         will be useful.  */
>      autofdo::stmt_set promoted_stmts;
> -    for (int i = 0; i < param_early_inliner_max_iterations; i++)
> +    for (int i = 0; i < opt_for_fn (node->decl,
> +    param_early_inliner_max_iterations); i++)
>        {
>          if (!flag_value_profile_transformations
>              || !autofdo::afdo_vpt_for_early_inline (&promoted_stmts))
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 43c0d5a6706..def89471abb 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -1859,7 +1859,8 @@ ipcp_lattice<valtype>::add_value (valtype newval, cgraph_edge *cs,
>   return false;
>        }
>  
> -  if (!unlimited && values_count == param_ipa_cp_value_list_size)
> +  if (!unlimited && values_count == opt_for_fn (cs->caller->decl,
> + param_ipa_cp_value_list_size))
>      {
>        /* We can only free sources, not the values themselves, because sources
>   of other values in this SCC might point to them.   */
> @@ -1986,12 +1987,15 @@ propagate_vals_across_arith_jfunc (cgraph_edge *cs,
>      {
>        int i;
>  
> -      if (src_lat != dest_lat || param_ipa_cp_max_recursive_depth < 1)
> +      int max_recursive_depth = opt_for_fn(cs->caller->decl,
> +   param_ipa_cp_max_recursive_depth);
> +      if (src_lat != dest_lat || max_recursive_depth < 1)
>   return dest_lat->set_contains_variable ();
>  
>        /* No benefit if recursive execution is in low probability.  */
>        if (cs->sreal_frequency () * 100
> -  <= ((sreal) 1) * param_ipa_cp_min_recursive_probability)
> +  <= ((sreal) 1) * opt_for_fn (cs->caller->decl,
> +       param_ipa_cp_min_recursive_probability))
>   return dest_lat->set_contains_variable ();
>  
>        auto_vec<ipcp_value<tree> *, 8> val_seeds;
> @@ -2019,7 +2023,7 @@ propagate_vals_across_arith_jfunc (cgraph_edge *cs,
>        /* Recursively generate lattice values with a limited count.  */
>        FOR_EACH_VEC_ELT (val_seeds, i, src_val)
>   {
> -  for (int j = 1; j < param_ipa_cp_max_recursive_depth; j++)
> +  for (int j = 1; j < max_recursive_depth; j++)
>      {
>        tree cstval = get_val_across_arith_op (opcode, opnd1_type, opnd2,
>       src_val, res_type);
> @@ -3155,11 +3159,11 @@ devirtualization_time_bonus (struct cgraph_node *node,
>  /* Return time bonus incurred because of HINTS.  */
>  
>  static int
> -hint_time_bonus (ipa_hints hints)
> +hint_time_bonus (cgraph_node *node, ipa_hints hints)
>  {
>    int result = 0;
>    if (hints & (INLINE_HINT_loop_iterations | INLINE_HINT_loop_stride))
> -    result += param_ipa_cp_loop_hint_bonus;
> +    result += opt_for_fn (node->decl, param_ipa_cp_loop_hint_bonus);
>    return result;
>  }
>  
> @@ -3167,15 +3171,18 @@ hint_time_bonus (ipa_hints hints)
>     cloning goodness evaluation, do so.  */
>  
>  static inline int64_t
> -incorporate_penalties (ipa_node_params *info, int64_t evaluation)
> +incorporate_penalties (cgraph_node *node, ipa_node_params *info,
> +       int64_t evaluation)
>  {
>    if (info->node_within_scc && !info->node_is_self_scc)
>      evaluation = (evaluation
> -  * (100 - param_ipa_cp_recursion_penalty)) / 100;
> +  * (100 - opt_for_fn (node->decl,
> +       param_ipa_cp_recursion_penalty))) / 100;
>  
>    if (info->node_calling_single_call)
>      evaluation = (evaluation
> -  * (100 - param_ipa_cp_single_call_penalty))
> +  * (100 - opt_for_fn (node->decl,
> +       param_ipa_cp_single_call_penalty)))
>        / 100;
>  
>    return evaluation;
> @@ -3197,6 +3204,7 @@ good_cloning_opportunity_p (struct cgraph_node *node, int time_benefit,
>    gcc_assert (size_cost > 0);
>  
>    class ipa_node_params *info = IPA_NODE_REF (node);
> +  int eval_threshold = opt_for_fn (node->decl, param_ipa_cp_eval_threshold);
>    if (max_count > profile_count::zero ())
>      {
>        int factor = RDIV (count_sum.probability_in
> @@ -3204,7 +3212,7 @@ good_cloning_opportunity_p (struct cgraph_node *node, int time_benefit,
>           * 1000, REG_BR_PROB_BASE);
>        int64_t evaluation = (((int64_t) time_benefit * factor)
>      / size_cost);
> -      evaluation = incorporate_penalties (info, evaluation);
> +      evaluation = incorporate_penalties (node, info, evaluation);
>  
>        if (dump_file && (dump_flags & TDF_DETAILS))
>   {
> @@ -3216,16 +3224,16 @@ good_cloning_opportunity_p (struct cgraph_node *node, int time_benefit,
>   info->node_within_scc
>     ? (info->node_is_self_scc ? ", self_scc" : ", scc") : "",
>   info->node_calling_single_call ? ", single_call" : "",
> - evaluation, param_ipa_cp_eval_threshold);
> + evaluation, eval_threshold);
>   }
>  
> -      return evaluation >= param_ipa_cp_eval_threshold;
> +      return evaluation >= eval_threshold;
>      }
>    else
>      {
>        int64_t evaluation = (((int64_t) time_benefit * freq_sum)
>      / size_cost);
> -      evaluation = incorporate_penalties (info, evaluation);
> +      evaluation = incorporate_penalties (node, info, evaluation);
>  
>        if (dump_file && (dump_flags & TDF_DETAILS))
>   fprintf (dump_file, "     good_cloning_opportunity_p (time: %i, "
> @@ -3235,9 +3243,9 @@ good_cloning_opportunity_p (struct cgraph_node *node, int time_benefit,
>   info->node_within_scc
>     ? (info->node_is_self_scc ? ", self_scc" : ", scc") : "",
>   info->node_calling_single_call ? ", single_call" : "",
> - evaluation, param_ipa_cp_eval_threshold);
> + evaluation, eval_threshold);
>  
> -      return evaluation >= param_ipa_cp_eval_threshold;
> +      return evaluation >= eval_threshold;
>      }
>  }
>  
> @@ -3375,7 +3383,7 @@ perform_estimation_of_a_value (cgraph_node *node, vec<tree> known_csts,
>      time_benefit = base_time.to_int ()
>        + devirtualization_time_bonus (node, known_csts, known_contexts,
>       known_aggs)
> -      + hint_time_bonus (hints)
> +      + hint_time_bonus (node, hints)
>        + removable_params_cost + est_move_cost;
>  
>    gcc_checking_assert (size >=0);
> @@ -3430,7 +3438,7 @@ estimate_local_effects (struct cgraph_node *node)
>   known_aggs, &size, &time,
>   &base_time, &hints);
>        time -= devirt_bonus;
> -      time -= hint_time_bonus (hints);
> +      time -= hint_time_bonus (node, hints);
>        time -= removable_params_cost;
>        size -= stats.n_calls * removable_params_cost;
>  
> @@ -3858,7 +3866,7 @@ ipcp_propagate_stage (class ipa_topo_info *topo)
>    max_new_size = overall_size;
>    if (max_new_size < param_large_unit_insns)
>      max_new_size = param_large_unit_insns;
> -  max_new_size += max_new_size * param_ipcp_unit_growth / 100 + 1;
> +  max_new_size += max_new_size * param_ipa_cp_unit_growth / 100 + 1;
>  
>    if (dump_file)
>      fprintf (dump_file, "\noverall_size: %li, max_new_size: %li\n",
> diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
> index fa01cb6c083..7364f823efd 100644
> --- a/gcc/ipa-fnsummary.c
> +++ b/gcc/ipa-fnsummary.c
> @@ -1324,7 +1324,7 @@ decompose_param_expr (struct ipa_func_body_info *fbi,
>        struct agg_position_info *aggpos,
>        expr_eval_ops *param_ops_p = NULL)
>  {
> -  int op_limit = param_ipa_max_param_expr_ops;
> +  int op_limit = opt_for_fn (fbi->node->decl, param_ipa_max_param_expr_ops);
>    int op_count = 0;
>  
>    if (param_ops_p)
> @@ -1555,7 +1555,8 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi,
>  
>    auto_vec<std::pair<tree, tree> > ranges;
>    tree type = TREE_TYPE (op);
> -  int bound_limit = param_ipa_max_switch_predicate_bounds;
> +  int bound_limit = opt_for_fn (fbi->node->decl,
> + param_ipa_max_switch_predicate_bounds);
>    int bound_count = 0;
>    wide_int vr_wmin, vr_wmax;
>    value_range_kind vr_type = get_range_info (op, &vr_wmin, &vr_wmax);
> @@ -2451,7 +2452,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
>    fbi.bb_infos = vNULL;
>    fbi.bb_infos.safe_grow_cleared (last_basic_block_for_fn (cfun));
>    fbi.param_count = count_formal_params (node->decl);
> -  fbi.aa_walk_budget = param_ipa_max_aa_steps;
> +  fbi.aa_walk_budget = opt_for_fn (node->decl, param_ipa_max_aa_steps);
>  
>    nonconstant_names.safe_grow_cleared
>      (SSANAMES (my_function)->length ());
> diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
> index eb42caeb7ae..39be56c8f76 100644
> --- a/gcc/ipa-inline-analysis.c
> +++ b/gcc/ipa-inline-analysis.c
> @@ -455,9 +455,10 @@ offline_size (struct cgraph_node *node, ipa_size_summary *info)
>           Take this into account.  */
>        else if (DECL_COMDAT (node->decl)
>         && node->can_remove_if_no_direct_calls_p ())
> - return (info->size
> - * (100 - param_comdat_sharing_probability)
> - + 50) / 100;
> + {
> +  int prob = opt_for_fn (node->decl, param_comdat_sharing_probability);
> +  return info->size * (100 - prob + 50) / 100;
> + }
>      }
>    return 0;
>  }
> diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
> index 3b68fc47d01..96ce67d71a5 100644
> --- a/gcc/ipa-inline.c
> +++ b/gcc/ipa-inline.c
> @@ -2967,7 +2967,8 @@ early_inliner (function *fun)
>   }
>        /* We iterate incremental inlining to get trivial cases of indirect
>   inlining.  */
> -      while (iterations < param_early_inliner_max_iterations
> +      while (iterations < opt_for_fn (node->decl,
> +      param_early_inliner_max_iterations)
>       && early_inline_small_functions (node))
>   {
>    timevar_push (TV_INTEGRATION);
> @@ -2986,7 +2987,8 @@ early_inliner (function *fun)
>        es->call_stmt_time
>   = estimate_num_insns (edge->call_stmt, &eni_time_weights);
>      }
> -  if (iterations < param_early_inliner_max_iterations - 1)
> +  if (iterations < opt_for_fn (node->decl,
> +       param_early_inliner_max_iterations) - 1)
>      ipa_update_overall_fn_summary (node);
>    timevar_pop (TV_INTEGRATION);
>    iterations++;
> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
> index 035730d180d..9f80dd1f6a8 100644
> --- a/gcc/ipa-prop.c
> +++ b/gcc/ipa-prop.c
> @@ -2876,7 +2876,7 @@ ipa_analyze_node (struct cgraph_node *node)
>    fbi.bb_infos = vNULL;
>    fbi.bb_infos.safe_grow_cleared (last_basic_block_for_fn (cfun));
>    fbi.param_count = ipa_get_param_count (info);
> -  fbi.aa_walk_budget = param_ipa_max_aa_steps;
> +  fbi.aa_walk_budget = opt_for_fn (node->decl, param_ipa_max_aa_steps);
>  
>    for (struct cgraph_edge *cs = node->callees; cs; cs = cs->next_callee)
>      {
> @@ -5756,7 +5756,7 @@ ipcp_transform_function (struct cgraph_node *node)
>    fbi.bb_infos = vNULL;
>    fbi.bb_infos.safe_grow_cleared (last_basic_block_for_fn (cfun));
>    fbi.param_count = param_count;
> -  fbi.aa_walk_budget = param_ipa_max_aa_steps;
> +  fbi.aa_walk_budget = opt_for_fn (node->decl, param_ipa_max_aa_steps);
>  
>    vec_safe_grow_cleared (descriptors, param_count);
>    ipa_populate_param_decls (node, *descriptors);
> diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
> index a051a9f2154..79b543fa934 100644
> --- a/gcc/ipa-sra.c
> +++ b/gcc/ipa-sra.c
> @@ -2279,7 +2279,9 @@ process_scan_results (cgraph_node *node, struct function *fun,
>        if (!desc->by_ref || optimize_function_for_size_p (fun))
>   param_size_limit = cur_param_size;
>        else
> - param_size_limit = param_ipa_sra_ptr_growth_factor * cur_param_size;
> +  param_size_limit
> +    = opt_for_fn (node->decl,
> +  param_ipa_sra_ptr_growth_factor) * cur_param_size;
>        if (nonarg_acc_size > param_size_limit
>    || (!desc->by_ref && nonarg_acc_size == param_size_limit))
>   {
> @@ -2499,7 +2501,7 @@ ipa_sra_summarize_function (cgraph_node *node)
>    bb_dereferences = XCNEWVEC (HOST_WIDE_INT,
>        by_ref_count
>        * last_basic_block_for_fn (fun));
> -  aa_walking_limit = param_ipa_max_aa_steps;
> +  aa_walking_limit = opt_for_fn (node->decl, param_ipa_max_aa_steps);
>    scan_function (node, fun);
>  
>    if (dump_file)
> diff --git a/gcc/params.opt b/gcc/params.opt
> index 5d39244761a..094e04aae5e 100644
> --- a/gcc/params.opt
> +++ b/gcc/params.opt
> @@ -83,7 +83,7 @@ Common Joined UInteger Var(param_case_values_threshold) Param Optimization
>  The smallest number of different values for which it is best to use a jump-table instead of a tree of conditional branches, if 0, use the default for the machine.
>  
>  -param=comdat-sharing-probability=
> -Common Joined UInteger Var(param_comdat_sharing_probability) Init(20) Param
> +Common Joined UInteger Var(param_comdat_sharing_probability) Init(20) Param Optimization
>  Probability that COMDAT function will be shared with different compilation unit.
>  
>  -param=cxx-max-namespaces-for-diagnostic-help=
> @@ -191,35 +191,39 @@ Common Joined UInteger Var(param_integer_share_limit) Init(251) IntegerRange(2,
>  The upper bound for sharing integer constants.
>  
>  -param=ipa-cp-eval-threshold=
> -Common Joined UInteger Var(param_ipa_cp_eval_threshold) Init(500) Param
> +Common Joined UInteger Var(param_ipa_cp_eval_threshold) Init(500) Param Optimization
>  Threshold ipa-cp opportunity evaluation that is still considered beneficial to clone.
>  
>  -param=ipa-cp-loop-hint-bonus=
> -Common Joined UInteger Var(param_ipa_cp_loop_hint_bonus) Init(64) Param
> +Common Joined UInteger Var(param_ipa_cp_loop_hint_bonus) Init(64) Param Optimization
>  Compile-time bonus IPA-CP assigns to candidates which make loop bounds or strides known.
>  
>  -param=ipa-cp-max-recursive-depth=
> -Common Joined UInteger Var(param_ipa_cp_max_recursive_depth) Init(8) Param
> +Common Joined UInteger Var(param_ipa_cp_max_recursive_depth) Init(8) Param Optimization
>  Maximum depth of recursive cloning for self-recursive function.
>  
>  -param=ipa-cp-min-recursive-probability=
> -Common Joined UInteger Var(param_ipa_cp_min_recursive_probability) Init(2) Param
> +Common Joined UInteger Var(param_ipa_cp_min_recursive_probability) Init(2) Param Optimization
>  Recursive cloning only when the probability of call being executed exceeds the parameter.
>  
>  -param=ipa-cp-recursion-penalty=
> -Common Joined UInteger Var(param_ipa_cp_recursion_penalty) Init(40) IntegerRange(0, 100) Param
> +Common Joined UInteger Var(param_ipa_cp_recursion_penalty) Init(40) IntegerRange(0, 100) Param Optimization
>  Percentage penalty the recursive functions will receive when they are evaluated for cloning.
>  
>  -param=ipa-cp-single-call-penalty=
> -Common Joined UInteger Var(param_ipa_cp_single_call_penalty) Init(15) IntegerRange(0, 100) Param
> +Common Joined UInteger Var(param_ipa_cp_single_call_penalty) Init(15) IntegerRange(0, 100) Param Optimization
>  Percentage penalty functions containing a single call to another function will receive when they are evaluated for cloning.
>  
> +-param=ipa-cp-unit-growth=
> +Common Joined UInteger Var(param_ipa_cp_unit_growth) Init(10) Param
> +How much can given compilation unit grow because of the interprocedural constant propagation (in percent).
> +
>  -param=ipa-cp-value-list-size=
> -Common Joined UInteger Var(param_ipa_cp_value_list_size) Init(8) Param
> +Common Joined UInteger Var(param_ipa_cp_value_list_size) Init(8) Param Optimization
>  Maximum size of a list of values associated with each parameter for interprocedural constant propagation.
>  
>  -param=ipa-max-aa-steps=
> -Common Joined UInteger Var(param_ipa_max_aa_steps) Init(25000) Param
> +Common Joined UInteger Var(param_ipa_max_aa_steps) Init(25000) Param Optimization
>  Maximum number of statements that will be visited by IPA formal parameter analysis based on alias analysis in any given function.
>  
>  -param=ipa-max-agg-items=
> @@ -227,11 +231,11 @@ Common Joined UInteger Var(param_ipa_max_agg_items) Init(16) Param
>  Maximum number of aggregate content items for a parameter in jump functions and lattices.
>  
>  -param=ipa-max-param-expr-ops=
> -Common Joined UInteger Var(param_ipa_max_param_expr_ops) Init(10) Param
> +Common Joined UInteger Var(param_ipa_max_param_expr_ops) Init(10) Param Optimization
>  Maximum number of operations in a parameter expression that can be handled by IPA analysis.
>  
>  -param=ipa-max-switch-predicate-bounds=
> -Common Joined UInteger Var(param_ipa_max_switch_predicate_bounds) Init(5) Param
> +Common Joined UInteger Var(param_ipa_max_switch_predicate_bounds) Init(5) Param Optimization
>  Maximal number of boundary endpoints of case ranges of switch statement used during IPA functoin summary generation.
>  
>  -param=ipa-sra-max-replacements=
> @@ -239,13 +243,9 @@ Common Joined UInteger Var(param_ipa_sra_max_replacements) Init(8) IntegerRange(
>  Maximum pieces that IPA-SRA tracks per formal parameter, as a consequence, also the maximum number of replacements of a formal parameter.
>  
>  -param=ipa-sra-ptr-growth-factor=
> -Common Joined UInteger Var(param_ipa_sra_ptr_growth_factor) Init(2) Param
> +Common Joined UInteger Var(param_ipa_sra_ptr_growth_factor) Init(2) Param Optimization
>  Maximum allowed growth of number and total size of new parameters that ipa-sra replaces a pointer to an aggregate with.
>  
> --param=ipcp-unit-growth=
> -Common Joined UInteger Var(param_ipcp_unit_growth) Init(10) Param
> -How much can given compilation unit grow because of the interprocedural constant propagation (in percent).
> -
>  -param=ira-loop-reserved-regs=
>  Common Joined UInteger Var(param_ira_loop_reserved_regs) Init(2) Param Optimization
>  The number of registers in each class kept unused by loop invariant motion.
> @@ -423,7 +423,7 @@ Common Joined UInteger Var(param_max_dse_active_local_stores) Init(5000) Param O
>  Maximum number of active local stores in RTL dead store elimination.
>  
>  -param=max-early-inliner-iterations=
> -Common Joined UInteger Var(param_early_inliner_max_iterations) Init(1) Param
> +Common Joined UInteger Var(param_early_inliner_max_iterations) Init(1) Param Optimization
>  The maximum number of nested indirect inlining performed by early inliner.
>  
>  -param=max-fields-for-field-sensitive=
>

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Jan Hubicka-2
In reply to this post by Martin Jambor-3
> Hi,
>
> On Fri, Jan 03 2020, Martin Liška wrote:
> > Hi.
> >
> > This is similar transformation for IPA passes. This time,
> > one needs to use opt_for_fn in order to get the right
> > parameter values.
> >
> > @Martin, Honza:
> > There are last few remaining parameters which should use
> > opt_for_fn:
> >
> > param_ipa_cp_unit_growth
>
> So as we discussed, picking this one from one particular node is not
> what one would expect to happen, but inlining does it too and so anyway:

Inlining does not do that.  For each inlining decision it calculcates
the growth accroding to function it inlines into. So if you set
unit-growth more for -O3 than for -O2 (as I am just finishing
benchmarking of) and combine both settings, the -O3 code will be allowed
to grow unit when -O2 code would not.  I think ipa-cp can do the same.

Honza
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Martin Jambor-3
In reply to this post by Martin Liška-2
Hi,

On Fri, Jan 03 2020, Martin Liška wrote:

> Hi.
>
> This is similar transformation for IPA passes. This time,
> one needs to use opt_for_fn in order to get the right
> parameter values.
>
> @Martin, Honza:
> There are last few remaining parameters which should use
> opt_for_fn:
>
> param_ipa_sra_max_replacements

IPA-CP: Always access param_ipa_sra_max_replacements through opt_for_fn

2020-01-07  Martin Jambor  <[hidden email]>

        * params.opt (param_ipa_sra_max_replacements): Mark as Optimization.
        * ipa-sra.c (scanned_node): New variable.
        (allocate_access): Use it to get param_ipa_sra_max_replacements.
        (ipa_sra_summarize_function): Set up scanned_node.
        (pull_accesses_from_callee): New parameter caller, use it to get
        param_ipa_sra_max_replacements.
        (param_splitting_across_edge): Pass the caller to
        pull_accesses_from_callee.
---
 gcc/ipa-sra.c  | 33 +++++++++++++++++++++------------
 gcc/params.opt |  2 +-
 2 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
index a051a9f2154..133ed687509 100644
--- a/gcc/ipa-sra.c
+++ b/gcc/ipa-sra.c
@@ -521,6 +521,10 @@ ipa_sra_call_summaries::duplicate (cgraph_edge *, cgraph_edge *,
 
 /* With all GTY stuff done, we can move to anonymous namespace.  */
 namespace {
+/* Functions which currently has its body analyzed.  */
+
+cgraph_node *scanned_node;
+
 /* Quick mapping from a decl to its param descriptor.  */
 
 hash_map<tree, gensum_param_desc *> *decl2desc;
@@ -1265,7 +1269,8 @@ allocate_access (gensum_param_desc *desc,
  HOST_WIDE_INT offset, HOST_WIDE_INT size)
 {
   if (desc->access_count
-      == (unsigned) param_ipa_sra_max_replacements)
+      == (unsigned) opt_for_fn (scanned_node->decl,
+ param_ipa_sra_max_replacements))
     {
       disqualify_split_candidate (desc, "Too many replacement candidates");
       return NULL;
@@ -2472,6 +2477,7 @@ ipa_sra_summarize_function (cgraph_node *node)
      node->order);
   if (!ipa_sra_preliminary_function_checks (node))
     return;
+  scanned_node = node;
   gcc_obstack_init (&gensum_obstack);
   isra_func_summary *ifs = func_sums->get_create (node);
   ifs->m_candidate = true;
@@ -2526,6 +2532,7 @@ ipa_sra_summarize_function (cgraph_node *node)
   delete decl2desc;
   decl2desc = NULL;
   obstack_free (&gensum_obstack, NULL);
+  scanned_node = NULL;
   if (dump_file)
     fprintf (dump_file, "\n\n");
   if (flag_checking)
@@ -3246,16 +3253,17 @@ all_callee_accesses_present_p (isra_param_desc *param_desc,
 enum acc_prop_kind {ACC_PROP_DONT, ACC_PROP_COPY, ACC_PROP_CERTAIN};
 
 
-/* Attempt to propagate all definite accesses from ARG_DESC to PARAM_DESC, if
-   they would not violate some constraint there.  If successful, return NULL,
-   otherwise return the string reason for failure (which can be written to the
-   dump file).  DELTA_OFFSET is the known offset of the actual argument withing
-   the formal parameter (so of ARG_DESCS within PARAM_DESCS), ARG_SIZE is the
-   size of the actual argument or zero, if not known. In case of success, set
-   *CHANGE_P to true if propagation actually changed anything.  */
+/* Attempt to propagate all definite accesses from ARG_DESC to PARAM_DESC,
+   (which belongs to CALLER) if they would not violate some constraint there.
+   If successful, return NULL, otherwise return the string reason for failure
+   (which can be written to the dump file).  DELTA_OFFSET is the known offset
+   of the actual argument withing the formal parameter (so of ARG_DESCS within
+   PARAM_DESCS), ARG_SIZE is the size of the actual argument or zero, if not
+   known. In case of success, set *CHANGE_P to true if propagation actually
+   changed anything.  */
 
 static const char *
-pull_accesses_from_callee (isra_param_desc *param_desc,
+pull_accesses_from_callee (cgraph_node *caller, isra_param_desc *param_desc,
    isra_param_desc *arg_desc,
    unsigned delta_offset, unsigned arg_size,
    bool *change_p)
@@ -3335,7 +3343,7 @@ pull_accesses_from_callee (isra_param_desc *param_desc,
       return NULL;
 
     if ((prop_count + pclen
- > (unsigned) param_ipa_sra_max_replacements)
+ > (unsigned) opt_for_fn (caller->decl, param_ipa_sra_max_replacements))
  || size_would_violate_limit_p (param_desc,
        param_desc->size_reached + prop_size))
       return "propagating accesses would violate the count or size limit";
@@ -3455,7 +3463,8 @@ param_splitting_across_edge (cgraph_edge *cs)
   else
     {
       const char *pull_failure
- = pull_accesses_from_callee (param_desc, arg_desc, 0, 0, &res);
+ = pull_accesses_from_callee (cs->caller, param_desc, arg_desc,
+     0, 0, &res);
       if (pull_failure)
  {
   if (dump_file && (dump_flags & TDF_DETAILS))
@@ -3516,7 +3525,7 @@ param_splitting_across_edge (cgraph_edge *cs)
   else
     {
       const char *pull_failure
- = pull_accesses_from_callee (param_desc, arg_desc,
+ = pull_accesses_from_callee (cs->caller, param_desc, arg_desc,
      ipf->unit_offset,
      ipf->unit_size, &res);
       if (pull_failure)
diff --git a/gcc/params.opt b/gcc/params.opt
index 3cd6d7d481d..02584740640 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -235,7 +235,7 @@ Common Joined UInteger Var(param_ipa_max_switch_predicate_bounds) Init(5) Param
 Maximal number of boundary endpoints of case ranges of switch statement used during IPA functoin summary generation.
 
 -param=ipa-sra-max-replacements=
-Common Joined UInteger Var(param_ipa_sra_max_replacements) Init(8) IntegerRange(0, 16) Param
+Common Joined UInteger Var(param_ipa_sra_max_replacements) Optimization Init(8) IntegerRange(0, 16) Param
 Maximum pieces that IPA-SRA tracks per formal parameter, as a consequence, also the maximum number of replacements of a formal parameter.
 
 -param=ipa-sra-ptr-growth-factor=
--
2.24.1

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Martin Jambor-3
In reply to this post by Martin Liška-2
Hi,

On Fri, Jan 03 2020, Martin Liška wrote:

> Hi.
>
> This is similar transformation for IPA passes. This time,
> one needs to use opt_for_fn in order to get the right
> parameter values.
>
> @Martin, Honza:
> There are last few remaining parameters which should use
> opt_for_fn:
>
> param_ipa_max_agg_items
> param_ipa_cp_unit_growth
> param_ipa_sra_max_replacements

sent in previous separate email messages.  All of those patches passed
bootstrap and testing and LTO bootstrap on an x86_64-linux.  Feel free
to commit them with your other param-related patches after they are
reviewed.

> param_max_speculative_devirt_maydefs
>

This one is Honza's but I am quite sure you can just use cfun->decl in
the one place where this is queried.

Martin
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Martin Jambor-3
In reply to this post by Jan Hubicka-2
Hi,

On Wed, Jan 08 2020, Jan Hubicka wrote:

>> Hi,
>>
>> On Fri, Jan 03 2020, Martin Liška wrote:
>> > Hi.
>> >
>> > This is similar transformation for IPA passes. This time,
>> > one needs to use opt_for_fn in order to get the right
>> > parameter values.
>> >
>> > @Martin, Honza:
>> > There are last few remaining parameters which should use
>> > opt_for_fn:
>> >
>> > param_ipa_cp_unit_growth
>>
>> So as we discussed, picking this one from one particular node is not

all right, the above was perhaps confusing, the patch does not pick one
value but keeps picking the maximum growth ratio from each and every
node as it considers cloning opportunities...

>> what one would expect to happen, but inlining does it too and so anyway:
>
> Inlining does not do that.  For each inlining decision it calculcates
> the growth accroding to function it inlines into. So if you set
> unit-growth more for -O3 than for -O2 (as I am just finishing
> benchmarking of) and combine both settings, the -O3 code will be allowed
> to grow unit when -O2 code would not.  I think ipa-cp can do the same.
>

...and thus I believe the patch actually the patch does the same.

Martin

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Jan Hubicka-2
In reply to this post by Martin Jambor-3
> Hi,
>
> On Fri, Jan 03 2020, Martin Liška wrote:
> > Hi.
> >
> > This is similar transformation for IPA passes. This time,
> > one needs to use opt_for_fn in order to get the right
> > parameter values.
> >
> > @Martin, Honza:
> > There are last few remaining parameters which should use
> > opt_for_fn:
> >
> > param_ipa_sra_max_replacements
>
> IPA-CP: Always access param_ipa_sra_max_replacements through opt_for_fn
>
> 2020-01-07  Martin Jambor  <[hidden email]>
>
> * params.opt (param_ipa_sra_max_replacements): Mark as Optimization.
> * ipa-sra.c (scanned_node): New variable.
> (allocate_access): Use it to get param_ipa_sra_max_replacements.
> (ipa_sra_summarize_function): Set up scanned_node.
> (pull_accesses_from_callee): New parameter caller, use it to get
> param_ipa_sra_max_replacements.
> (param_splitting_across_edge): Pass the caller to
> pull_accesses_from_callee.
> ---
>  gcc/ipa-sra.c  | 33 +++++++++++++++++++++------------
>  gcc/params.opt |  2 +-
>  2 files changed, 22 insertions(+), 13 deletions(-)
>
> diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
> index a051a9f2154..133ed687509 100644
> --- a/gcc/ipa-sra.c
> +++ b/gcc/ipa-sra.c
> @@ -521,6 +521,10 @@ ipa_sra_call_summaries::duplicate (cgraph_edge *, cgraph_edge *,
>  
>  /* With all GTY stuff done, we can move to anonymous namespace.  */
>  namespace {
> +/* Functions which currently has its body analyzed.  */
> +
> +cgraph_node *scanned_node;
> +
>  /* Quick mapping from a decl to its param descriptor.  */
>  
>  hash_map<tree, gensum_param_desc *> *decl2desc;
> @@ -1265,7 +1269,8 @@ allocate_access (gensum_param_desc *desc,
>   HOST_WIDE_INT offset, HOST_WIDE_INT size)
>  {
>    if (desc->access_count
> -      == (unsigned) param_ipa_sra_max_replacements)
> +      == (unsigned) opt_for_fn (scanned_node->decl,
> + param_ipa_sra_max_replacements))
>      {
>        disqualify_split_candidate (desc, "Too many replacement candidates");
>        return NULL;
> @@ -2472,6 +2477,7 @@ ipa_sra_summarize_function (cgraph_node *node)
>       node->order);
>    if (!ipa_sra_preliminary_function_checks (node))
>      return;
> +  scanned_node = node;
>    gcc_obstack_init (&gensum_obstack);
>    isra_func_summary *ifs = func_sums->get_create (node);
>    ifs->m_candidate = true;
> @@ -2526,6 +2532,7 @@ ipa_sra_summarize_function (cgraph_node *node)
>    delete decl2desc;
>    decl2desc = NULL;
>    obstack_free (&gensum_obstack, NULL);
> +  scanned_node = NULL;

It is your code.  having static var to track currently analyzed function
is bit ugly, and I am not sure if you don't have current_function_decl
set to that function in all cases.  But I will leave this to your
decision.

Patch is OK either in this form or with tracking updated.
Honza
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Jan Hubicka-2
In reply to this post by Martin Jambor-3
> Hi,
>
> On Wed, Jan 08 2020, Jan Hubicka wrote:
> >> Hi,
> >>
> >> On Fri, Jan 03 2020, Martin Liška wrote:
> >> > Hi.
> >> >
> >> > This is similar transformation for IPA passes. This time,
> >> > one needs to use opt_for_fn in order to get the right
> >> > parameter values.
> >> >
> >> > @Martin, Honza:
> >> > There are last few remaining parameters which should use
> >> > opt_for_fn:
> >> >
> >> > param_ipa_cp_unit_growth
> >>
> >> So as we discussed, picking this one from one particular node is not
>
> all right, the above was perhaps confusing, the patch does not pick one
> value but keeps picking the maximum growth ratio from each and every
> node as it considers cloning opportunities...
>
> >> what one would expect to happen, but inlining does it too and so anyway:
> >
> > Inlining does not do that.  For each inlining decision it calculcates
> > the growth accroding to function it inlines into. So if you set
> > unit-growth more for -O3 than for -O2 (as I am just finishing
> > benchmarking of) and combine both settings, the -O3 code will be allowed
> > to grow unit when -O2 code would not.  I think ipa-cp can do the same.
> >
>
> ...and thus I believe the patch actually the patch does the same.

I guess it won't cut cloning -O2 nodes before cloning -O3 nodes?
But if that is hard to set up (it may be since it is not done in greedy
way as in inline) the patch is OK - it is definitly improvement over
what we have right now. Having one param that matters at linktime
and rest tha tmatters at compile time would definitly be very confusing.

Honza
>
> Martin
>
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Jan Hubicka-2
In reply to this post by Martin Jambor-3
> Hi,
>
> On Fri, Jan 03 2020, Martin Liška wrote:
> > Hi.
> >
> > This is similar transformation for IPA passes. This time,
> > one needs to use opt_for_fn in order to get the right
> > parameter values.
> >
> > @Martin, Honza:
> > There are last few remaining parameters which should use
> > opt_for_fn:
> >
> > param_ipa_max_agg_items
> > param_ipa_cp_unit_growth
> > param_ipa_sra_max_replacements
>
> sent in previous separate email messages.  All of those patches passed
> bootstrap and testing and LTO bootstrap on an x86_64-linux.  Feel free
> to commit them with your other param-related patches after they are
> reviewed.
>
> > param_max_speculative_devirt_maydefs
> >
>
> This one is Honza's but I am quite sure you can just use cfun->decl in
> the one place where this is queried.
You do not need that. param_* already does that.  I think it is
safe to simply declare it Optimizatoin/PerFunction.

Honza
>
> Martin
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Martin Jambor-3
In reply to this post by Martin Jambor-3
Hi,

On Wed, Jan 08 2020, Martin Jambor wrote:

> Hi,
>
> On Fri, Jan 03 2020, Martin Liška wrote:
>> Hi.
>>
>> This is similar transformation for IPA passes. This time,
>> one needs to use opt_for_fn in order to get the right
>> parameter values.
>>
>> @Martin, Honza:
>> There are last few remaining parameters which should use
>> opt_for_fn:
>>
>> param_ipa_max_agg_items
>
>

Sorry, I sent out an earlier version of the patch that fails bootstrap
because of a Werror.  This one is the one I meant (the only difference
is that I changed one unsigned int into a signed one).

IPA-CP: Always access param_ipa_max_agg_items through opt_for_fn

2020-01-07  Martin Jambor  <[hidden email]>

        * params.opt (param_ipa_max_agg_items): Mark as Optimization
        * ipa-cp.c (merge_agg_lats_step): New parameter max_agg_items, use
        instead of param_ipa_max_agg_items.
        (merge_aggregate_lattices): Extract param_ipa_max_agg_items from
        optimization info for the callee.
---
 gcc/ipa-cp.c   | 14 +++++++++-----
 gcc/ipa-prop.c |  7 ++++---
 gcc/params.opt |  2 +-
 3 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 4381b35a809..9e20e278eff 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -2458,13 +2458,13 @@ set_check_aggs_by_ref (class ipcp_param_lattices *dest_plats,
    unless there are too many already.  If there are two many, return false.  If
    there are overlaps turn whole DEST_PLATS to bottom and return false.  If any
    skipped lattices were newly marked as containing variable, set *CHANGE to
-   true.  */
+   true.  MAX_AGG_ITEMS is the maximum number of lattices.  */
 
 static bool
 merge_agg_lats_step (class ipcp_param_lattices *dest_plats,
      HOST_WIDE_INT offset, HOST_WIDE_INT val_size,
      struct ipcp_agg_lattice ***aglat,
-     bool pre_existing, bool *change)
+     bool pre_existing, bool *change, int max_agg_items)
 {
   gcc_checking_assert (offset >= 0);
 
@@ -2499,7 +2499,7 @@ merge_agg_lats_step (class ipcp_param_lattices *dest_plats,
   set_agg_lats_to_bottom (dest_plats);
   return false;
  }
-      if (dest_plats->aggs_count == param_ipa_max_agg_items)
+      if (dest_plats->aggs_count == max_agg_items)
  return false;
       dest_plats->aggs_count++;
       new_al = ipcp_agg_lattice_pool.allocate ();
@@ -2553,6 +2553,8 @@ merge_aggregate_lattices (struct cgraph_edge *cs,
     ret |= set_agg_lats_contain_variable (dest_plats);
   dst_aglat = &dest_plats->aggs;
 
+  int max_agg_items = opt_for_fn (cs->callee->function_symbol ()->decl,
+  param_ipa_max_agg_items);
   for (struct ipcp_agg_lattice *src_aglat = src_plats->aggs;
        src_aglat;
        src_aglat = src_aglat->next)
@@ -2562,7 +2564,7 @@ merge_aggregate_lattices (struct cgraph_edge *cs,
       if (new_offset < 0)
  continue;
       if (merge_agg_lats_step (dest_plats, new_offset, src_aglat->size,
-       &dst_aglat, pre_existing, &ret))
+       &dst_aglat, pre_existing, &ret, max_agg_items))
  {
   struct ipcp_agg_lattice *new_al = *dst_aglat;
 
@@ -2742,6 +2744,8 @@ propagate_aggs_across_jump_function (struct cgraph_edge *cs,
       if (set_check_aggs_by_ref (dest_plats, jfunc->agg.by_ref))
  return true;
 
+      int max_agg_items = opt_for_fn (cs->callee->function_symbol ()->decl,
+      param_ipa_max_agg_items);
       FOR_EACH_VEC_ELT (*jfunc->agg.items, i, item)
  {
   HOST_WIDE_INT val_size;
@@ -2751,7 +2755,7 @@ propagate_aggs_across_jump_function (struct cgraph_edge *cs,
   val_size = tree_to_shwi (TYPE_SIZE (item->type));
 
   if (merge_agg_lats_step (dest_plats, item->offset, val_size,
-   &aglat, pre_existing, &ret))
+   &aglat, pre_existing, &ret, max_agg_items))
     {
       ret |= propagate_aggregate_lattice (cs, item, *aglat);
       aglat = &(*aglat)->next;
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index fcb13dfbac4..3488674760f 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -1852,8 +1852,9 @@ determine_known_aggregate_parts (struct ipa_func_body_info *fbi,
   tree arg_base;
   bool check_ref, by_ref;
   ao_ref r;
+  int max_agg_items = opt_for_fn (fbi->node->decl, param_ipa_max_agg_items);
 
-  if (param_ipa_max_agg_items == 0)
+  if (max_agg_items == 0)
     return;
 
   /* The function operates in three stages.  First, we prepare check_ref, r,
@@ -1951,14 +1952,14 @@ determine_known_aggregate_parts (struct ipa_func_body_info *fbi,
  operands, whose definitions can finally reach the call.  */
       add_to_agg_contents_list (&list, (*copy = *content, copy));
 
-      if (++value_count == param_ipa_max_agg_items)
+      if (++value_count == max_agg_items)
  break;
     }
 
   /* Add to the list consisting of all dominating virtual operands.  */
   add_to_agg_contents_list (&all_list, content);
 
-  if (++item_count == 2 * param_ipa_max_agg_items)
+  if (++item_count == 2 * max_agg_items)
     break;
  }
       dom_vuse = gimple_vuse (stmt);
diff --git a/gcc/params.opt b/gcc/params.opt
index 6f05b29a929..d5f0d9a69ce 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -223,7 +223,7 @@ Common Joined UInteger Var(param_ipa_max_aa_steps) Init(25000) Param
 Maximum number of statements that will be visited by IPA formal parameter analysis based on alias analysis in any given function.
 
 -param=ipa-max-agg-items=
-Common Joined UInteger Var(param_ipa_max_agg_items) Init(16) Param
+Common Joined UInteger Var(param_ipa_max_agg_items) Init(16) Param Optimization
 Maximum number of aggregate content items for a parameter in jump functions and lattices.
 
 -param=ipa-max-param-expr-ops=
--
2.24.1

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Martin Jambor-3
In reply to this post by Jan Hubicka-2
Hi,

On Wed, Jan 08 2020, Jan Hubicka wrote:

>> Hi,
>>
>> On Fri, Jan 03 2020, Martin Liška wrote:
>> > Hi.
>> >
>> > This is similar transformation for IPA passes. This time,
>> > one needs to use opt_for_fn in order to get the right
>> > parameter values.
>> >
>> > @Martin, Honza:
>> > There are last few remaining parameters which should use
>> > opt_for_fn:
>> >
>> > param_ipa_sra_max_replacements
>>
>> IPA-CP: Always access param_ipa_sra_max_replacements through opt_for_fn
>>
>> 2020-01-07  Martin Jambor  <[hidden email]>
>>
>> * params.opt (param_ipa_sra_max_replacements): Mark as Optimization.
>> * ipa-sra.c (scanned_node): New variable.
>> (allocate_access): Use it to get param_ipa_sra_max_replacements.
>> (ipa_sra_summarize_function): Set up scanned_node.
>> (pull_accesses_from_callee): New parameter caller, use it to get
>> param_ipa_sra_max_replacements.
>> (param_splitting_across_edge): Pass the caller to
>> pull_accesses_from_callee.
>> ---
>>  gcc/ipa-sra.c  | 33 +++++++++++++++++++++------------
>>  gcc/params.opt |  2 +-
>>  2 files changed, 22 insertions(+), 13 deletions(-)
>>
>> diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
>> index a051a9f2154..133ed687509 100644
>> --- a/gcc/ipa-sra.c
>> +++ b/gcc/ipa-sra.c
>> @@ -521,6 +521,10 @@ ipa_sra_call_summaries::duplicate (cgraph_edge *, cgraph_edge *,
>>  
>>  /* With all GTY stuff done, we can move to anonymous namespace.  */
>>  namespace {
>> +/* Functions which currently has its body analyzed.  */
>> +
>> +cgraph_node *scanned_node;
>> +
>>  /* Quick mapping from a decl to its param descriptor.  */
>>  
>>  hash_map<tree, gensum_param_desc *> *decl2desc;
>> @@ -1265,7 +1269,8 @@ allocate_access (gensum_param_desc *desc,
>>   HOST_WIDE_INT offset, HOST_WIDE_INT size)
>>  {
>>    if (desc->access_count
>> -      == (unsigned) param_ipa_sra_max_replacements)
>> +      == (unsigned) opt_for_fn (scanned_node->decl,
>> + param_ipa_sra_max_replacements))
>>      {
>>        disqualify_split_candidate (desc, "Too many replacement candidates");
>>        return NULL;
>> @@ -2472,6 +2477,7 @@ ipa_sra_summarize_function (cgraph_node *node)
>>       node->order);
>>    if (!ipa_sra_preliminary_function_checks (node))
>>      return;
>> +  scanned_node = node;
>>    gcc_obstack_init (&gensum_obstack);
>>    isra_func_summary *ifs = func_sums->get_create (node);
>>    ifs->m_candidate = true;
>> @@ -2526,6 +2532,7 @@ ipa_sra_summarize_function (cgraph_node *node)
>>    delete decl2desc;
>>    decl2desc = NULL;
>>    obstack_free (&gensum_obstack, NULL);
>> +  scanned_node = NULL;
>
> It is your code.  having static var to track currently analyzed function
> is bit ugly, and I am not sure if you don't have current_function_decl
> set to that function in all cases.  But I will leave this to your
> decision.

It is my code but apparently I already forgot some changes to it.
Originally I only pushed/popped cfun when processing (some) scan results
(I originally hoped not to do it at all) but since May I actually push
it whenever the body of the function is scanned, so using cfun is indeed
OK.

So please use the following instead:

IPA-CP: Access param_ipa_sra_max_replacements through opt_for_fn

2020-01-08  Martin Jambor  <[hidden email]>

        * params.opt (param_ipa_sra_max_replacements): Mark as Optimization.
        * ipa-sra.c (pull_accesses_from_callee): New parameter caller, use it
        to get param_ipa_sra_max_replacements.
        (param_splitting_across_edge): Pass the caller to
        pull_accesses_from_callee.
---
 gcc/ipa-sra.c  | 24 +++++++++++++-----------
 gcc/params.opt |  2 +-
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
index a051a9f2154..51d225ed772 100644
--- a/gcc/ipa-sra.c
+++ b/gcc/ipa-sra.c
@@ -3246,16 +3246,17 @@ all_callee_accesses_present_p (isra_param_desc *param_desc,
 enum acc_prop_kind {ACC_PROP_DONT, ACC_PROP_COPY, ACC_PROP_CERTAIN};
 
 
-/* Attempt to propagate all definite accesses from ARG_DESC to PARAM_DESC, if
-   they would not violate some constraint there.  If successful, return NULL,
-   otherwise return the string reason for failure (which can be written to the
-   dump file).  DELTA_OFFSET is the known offset of the actual argument withing
-   the formal parameter (so of ARG_DESCS within PARAM_DESCS), ARG_SIZE is the
-   size of the actual argument or zero, if not known. In case of success, set
-   *CHANGE_P to true if propagation actually changed anything.  */
+/* Attempt to propagate all definite accesses from ARG_DESC to PARAM_DESC,
+   (which belongs to CALLER) if they would not violate some constraint there.
+   If successful, return NULL, otherwise return the string reason for failure
+   (which can be written to the dump file).  DELTA_OFFSET is the known offset
+   of the actual argument withing the formal parameter (so of ARG_DESCS within
+   PARAM_DESCS), ARG_SIZE is the size of the actual argument or zero, if not
+   known. In case of success, set *CHANGE_P to true if propagation actually
+   changed anything.  */
 
 static const char *
-pull_accesses_from_callee (isra_param_desc *param_desc,
+pull_accesses_from_callee (cgraph_node *caller, isra_param_desc *param_desc,
    isra_param_desc *arg_desc,
    unsigned delta_offset, unsigned arg_size,
    bool *change_p)
@@ -3335,7 +3336,7 @@ pull_accesses_from_callee (isra_param_desc *param_desc,
       return NULL;
 
     if ((prop_count + pclen
- > (unsigned) param_ipa_sra_max_replacements)
+ > (unsigned) opt_for_fn (caller->decl, param_ipa_sra_max_replacements))
  || size_would_violate_limit_p (param_desc,
        param_desc->size_reached + prop_size))
       return "propagating accesses would violate the count or size limit";
@@ -3455,7 +3456,8 @@ param_splitting_across_edge (cgraph_edge *cs)
   else
     {
       const char *pull_failure
- = pull_accesses_from_callee (param_desc, arg_desc, 0, 0, &res);
+ = pull_accesses_from_callee (cs->caller, param_desc, arg_desc,
+     0, 0, &res);
       if (pull_failure)
  {
   if (dump_file && (dump_flags & TDF_DETAILS))
@@ -3516,7 +3518,7 @@ param_splitting_across_edge (cgraph_edge *cs)
   else
     {
       const char *pull_failure
- = pull_accesses_from_callee (param_desc, arg_desc,
+ = pull_accesses_from_callee (cs->caller, param_desc, arg_desc,
      ipf->unit_offset,
      ipf->unit_size, &res);
       if (pull_failure)
diff --git a/gcc/params.opt b/gcc/params.opt
index 3cd6d7d481d..02584740640 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -235,7 +235,7 @@ Common Joined UInteger Var(param_ipa_max_switch_predicate_bounds) Init(5) Param
 Maximal number of boundary endpoints of case ranges of switch statement used during IPA functoin summary generation.
 
 -param=ipa-sra-max-replacements=
-Common Joined UInteger Var(param_ipa_sra_max_replacements) Init(8) IntegerRange(0, 16) Param
+Common Joined UInteger Var(param_ipa_sra_max_replacements) Optimization Init(8) IntegerRange(0, 16) Param
 Maximum pieces that IPA-SRA tracks per formal parameter, as a consequence, also the maximum number of replacements of a formal parameter.
 
 -param=ipa-sra-ptr-growth-factor=
--
2.24.1

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Jan Hubicka-2
In reply to this post by Martin Liška-2
> 2020-01-03  Martin Liska  <[hidden email]>
>
> * auto-profile.c (auto_profile): Use opt_for_fn
> for a parameter.
> * ipa-cp.c (ipcp_lattice::add_value): Likewise.
> (propagate_vals_across_arith_jfunc): Likewise.
> (hint_time_bonus): Likewise.
> (incorporate_penalties): Likewise.
> (good_cloning_opportunity_p): Likewise.
> (perform_estimation_of_a_value): Likewise.
> (estimate_local_effects): Likewise.
> (ipcp_propagate_stage): Likewise.
> * ipa-fnsummary.c (decompose_param_expr): Likewise.
> (set_switch_stmt_execution_predicate): Likewise.
> (analyze_function_body): Likewise.
> * ipa-inline-analysis.c (offline_size): Likewise.
> * ipa-inline.c (early_inliner): Likewise.
> * ipa-prop.c (ipa_analyze_node): Likewise.
> (ipcp_transform_function): Likewise.
> * ipa-sra.c (process_scan_results): Likewise.
> (ipa_sra_summarize_function): Likewise.
> * params.opt: Rename ipcp-unit-growth to
> ipa-cp-unit-growth.  Add Optimization for various
> IPA-related parameters.
OK (possibly with s/Optimization/Perfunction/ if that is settled before
patch goes in.

Also add Optimization to -param=max-speculative-devirt-maydefs=
It is safe as discussed elsewhere.

Honza
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Martin Liška-2
In reply to this post by Martin Jambor-3
On 1/8/20 5:14 PM, Martin Jambor wrote:
> + int unit_growth = opt_for_fn (node->decl, param_ipcp_unit_growth);

Sorry, but I renamed you the param to param_ipa_cp_unit_growth.
Which is a name that matches name conventions of other IPA CP
related parameters.

Please rebase the patch.

Thanks,
Martin
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Martin Liška-2
In reply to this post by Jan Hubicka-2
On 1/9/20 12:06 PM, Jan Hubicka wrote:

>> 2020-01-03  Martin Liska  <[hidden email]>
>>
>> * auto-profile.c (auto_profile): Use opt_for_fn
>> for a parameter.
>> * ipa-cp.c (ipcp_lattice::add_value): Likewise.
>> (propagate_vals_across_arith_jfunc): Likewise.
>> (hint_time_bonus): Likewise.
>> (incorporate_penalties): Likewise.
>> (good_cloning_opportunity_p): Likewise.
>> (perform_estimation_of_a_value): Likewise.
>> (estimate_local_effects): Likewise.
>> (ipcp_propagate_stage): Likewise.
>> * ipa-fnsummary.c (decompose_param_expr): Likewise.
>> (set_switch_stmt_execution_predicate): Likewise.
>> (analyze_function_body): Likewise.
>> * ipa-inline-analysis.c (offline_size): Likewise.
>> * ipa-inline.c (early_inliner): Likewise.
>> * ipa-prop.c (ipa_analyze_node): Likewise.
>> (ipcp_transform_function): Likewise.
>> * ipa-sra.c (process_scan_results): Likewise.
>> (ipa_sra_summarize_function): Likewise.
>> * params.opt: Rename ipcp-unit-growth to
>> ipa-cp-unit-growth.  Add Optimization for various
>> IPA-related parameters.
> OK (possibly with s/Optimization/Perfunction/ if that is settled before
> patch goes in.
>
> Also add Optimization to -param=max-speculative-devirt-maydefs=
> It is safe as discussed elsewhere.

Yes, I've just tested this param change and installed it
as r280042.

Martin

>
> Honza
>

Reply | Threaded
Open this post in threaded view
|

RE: [PATCH] Add Optimization for various IPA parameters.

Tamar Christina
In reply to this post by Martin Jambor-3
Hi Martin,

This change (r280099) is causing a major performance regression on exchange2 in SPEC2017 dropping the benchmark by more than 30%.

It seems the parameters no longer do anything. i.e. -flto --param ipa-cp-eval-threshold=1 --param ipa-cp-unit-growth=80 doesn't have any effect anymore.

Thanks,
Tamar

> -----Original Message-----
> From: [hidden email] <[hidden email]>
> On Behalf Of Martin Jambor
> Sent: Wednesday, January 8, 2020 4:14 PM
> To: Martin Liška <[hidden email]>; [hidden email]
> Cc: [hidden email]; Richard Biener <[hidden email]>
> Subject: Re: [PATCH] Add Optimization for various IPA parameters.
>
> Hi,
>
> On Fri, Jan 03 2020, Martin Liška wrote:
> > Hi.
> >
> > This is similar transformation for IPA passes. This time, one needs to
> > use opt_for_fn in order to get the right parameter values.
> >
> > @Martin, Honza:
> > There are last few remaining parameters which should use
> > opt_for_fn:
> >
> > param_ipa_cp_unit_growth
>
> So as we discussed, picking this one from one particular node is not what one
> would expect to happen, but inlining does it too and so anyway:
>
>
> IPA-CP: Always access param_ipcp_unit_growth through opt_for_fn
>
> 2020-01-07  Martin Jambor  <[hidden email]>
>
> * params.opt (param_ipcp_unit_growth): Mark as Optimization.
> * ipa-cp.c (max_new_size): Removed.
> (orig_overall_size): New variable.
> (get_max_overall_size): New function.
> (estimate_local_effects): Use it.  Adjust dump.
> (decide_about_value): Likewise.
> (ipcp_propagate_stage): Do not calculate max_new_size, just store
> orig_overall_size.  Adjust dump.
> (ipa_cp_c_finalize): Clear orig_overall_size instead of max_new_size.
> ---
>  gcc/ipa-cp.c   | 39 ++++++++++++++++++++++++++-------------
>  gcc/params.opt |  2 +-
>  2 files changed, 27 insertions(+), 14 deletions(-)
>
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c index 9e20e278eff..c2572e3e0e8
> 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -375,7 +375,7 @@ static profile_count max_count;
>
>  /* Original overall size of the program.  */
>
> -static long overall_size, max_new_size;
> +static long overall_size, orig_overall_size;
>
>  /* Node name to unique clone suffix number map.  */  static
> hash_map<const char *, unsigned> *clone_num_suffixes; @@ -3395,6
> +3395,23 @@ perform_estimation_of_a_value (cgraph_node *node,
> vec<tree> known_csts,
>    val->local_size_cost = size;
>  }
>
> +/* Get the overall limit oof growth based on parameters extracted from
> growth.
> +   it does not really make sense to mix functions with different overall
> growth
> +   limits but it is possible and if it happens, we do not want to select one
> +   limit at random.  */
> +
> +static long
> +get_max_overall_size (cgraph_node *node) {
> +  long max_new_size = orig_overall_size;
> +  long large_unit = opt_for_fn (node->decl, param_large_unit_insns);
> +  if (max_new_size < large_unit)
> +    max_new_size = large_unit;
> +  int unit_growth = opt_for_fn (node->decl, param_ipcp_unit_growth);
> +  max_new_size += max_new_size * unit_growth / 100 + 1;
> +  return max_new_size;
> +}
> +
>  /* Iterate over known values of parameters of NODE and estimate the local
>     effects in terms of time and size they have.  */
>
> @@ -3457,7 +3474,7 @@ estimate_local_effects (struct cgraph_node *node)
>     stats.freq_sum, stats.count_sum,
>     size))
>   {
> -  if (size + overall_size <= max_new_size)
> +  if (size + overall_size <= get_max_overall_size (node))
>      {
>        info->do_clone_for_all_contexts = true;
>        overall_size += size;
> @@ -3467,8 +3484,8 @@ estimate_local_effects (struct cgraph_node *node)
>   "known contexts, growth deemed beneficial.\n");
>      }
>    else if (dump_file && (dump_flags & TDF_DETAILS))
> -    fprintf (dump_file, "   Not cloning for all contexts because "
> -     "max_new_size would be reached with %li.\n",
> +    fprintf (dump_file, "  Not cloning for all contexts because "
> +     "maximum unit size would be reached with %li.\n",
>       size + overall_size);
>   }
>        else if (dump_file && (dump_flags & TDF_DETAILS)) @@ -3860,14
> +3877,10 @@ ipcp_propagate_stage (class ipa_topo_info *topo)
>      max_count = max_count.max (node->count.ipa ());
>    }
>
> -  max_new_size = overall_size;
> -  if (max_new_size < param_large_unit_insns)
> -    max_new_size = param_large_unit_insns;
> -  max_new_size += max_new_size * param_ipcp_unit_growth / 100 + 1;
> +  orig_overall_size = overall_size;
>
>    if (dump_file)
> -    fprintf (dump_file, "\noverall_size: %li, max_new_size: %li\n",
> -     overall_size, max_new_size);
> +    fprintf (dump_file, "\noverall_size: %li\n", overall_size);
>
>    propagate_constants_topo (topo);
>    if (flag_checking)
> @@ -5380,11 +5393,11 @@ decide_about_value (struct cgraph_node *node,
> int index, HOST_WIDE_INT offset,
>        perhaps_add_new_callers (node, val);
>        return false;
>      }
> -  else if (val->local_size_cost + overall_size > max_new_size)
> +  else if (val->local_size_cost + overall_size > get_max_overall_size
> + (node))
>      {
>        if (dump_file && (dump_flags & TDF_DETAILS))
>   fprintf (dump_file, "   Ignoring candidate value because "
> - "max_new_size would be reached with %li.\n",
> + "maximum unit size would be reached with %li.\n",
>   val->local_size_cost + overall_size);
>        return false;
>      }
> @@ -5928,6 +5941,6 @@ ipa_cp_c_finalize (void)  {
>    max_count = profile_count::uninitialized ();
>    overall_size = 0;
> -  max_new_size = 0;
> +  orig_overall_size = 0;
>    ipcp_free_transformation_sum ();
>  }
> diff --git a/gcc/params.opt b/gcc/params.opt index
> d5f0d9a69ce..3cd6d7d481d 100644
> --- a/gcc/params.opt
> +++ b/gcc/params.opt
> @@ -243,7 +243,7 @@ Common Joined UInteger
> Var(param_ipa_sra_ptr_growth_factor) Init(2) Param  Maximum allowed
> growth of number and total size of new parameters that ipa-sra replaces a
> pointer to an aggregate with.
>
>  -param=ipcp-unit-growth=
> -Common Joined UInteger Var(param_ipcp_unit_growth) Init(10) Param
> +Common Joined UInteger Var(param_ipcp_unit_growth) Optimization
> +Init(10) Param
>  How much can given compilation unit grow because of the interprocedural
> constant propagation (in percent).
>
>  -param=ira-loop-reserved-regs=
> --
> 2.24.1

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Martin Liška-2
On 1/11/20 1:20 PM, Tamar Christina wrote:
> It seems the parameters no longer do anything. i.e. -flto --param ipa-cp-eval-threshold=1 --param ipa-cp-unit-growth=80 doesn't have any effect anymore.

Hi.

You are right, the param '--param ipa-cp-unit-growth' is really unused.
It's cause by Martin's commit f7725a488382, where we again reinvented
renamed param name. It will be fixed with the following obvious commit.

I'm going to install it once git will be opened for committing.

Martin

0001-Remove-usage-of-legacy-param_ipa_cp_unit_growth.patch (2K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Add Optimization for various IPA parameters.

Martin Jambor-3
Hi,

On Mon, Jan 13 2020, Martin Liška wrote:

> On 1/11/20 1:20 PM, Tamar Christina wrote:
>> It seems the parameters no longer do anything. i.e. -flto --param ipa-cp-eval-threshold=1 --param ipa-cp-unit-growth=80 doesn't have any effect anymore.
>
> Hi.
>
> You are right, the param '--param ipa-cp-unit-growth' is really unused.
> It's cause by Martin's commit f7725a488382, where we again reinvented
> renamed param name. It will be fixed with the following obvious commit.
>
> I'm going to install it once git will be opened for committing.

Oh, sorry, somehow the conflict marks tricked me into thinking you
canonicialized the ipa cp parameters the other way round.

Thanks,

Martin

>
> Martin
> From 72f42e55f3a2cbfd73d142de0b3799b6d787d05f Mon Sep 17 00:00:00 2001
> From: Martin Liska <[hidden email]>
> Date: Mon, 13 Jan 2020 10:00:35 +0100
> Subject: [PATCH] Remove usage of legacy param_ipa_cp_unit_growth.
>
> gcc/ChangeLog:
>
> 2020-01-13  Martin Liska  <[hidden email]>
>
> * ipa-cp.c (get_max_overall_size): Use newly
> renamed param param_ipa_cp_unit_growth.
> * params.opt: Remove legacy param name.
12