[PATCH] Fix PR92537

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

[PATCH] Fix PR92537

Richard Biener

There's a bit left in this PR which the following fixes.  The root
only became external late and the check more naturally belongs to
the new place.

Boostrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2019-11-20  Richard Biener  <[hidden email]>

        PR tree-optimization/92537
        * tree-vect-slp.c (vect_analyze_slp_instance): Move CTOR
        vectorization validity check...
        (vect_slp_analyze_operations): ... here.

        * gfortran.dg/pr92537.f90: New testcase.

Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c (revision 278477)
+++ gcc/tree-vect-slp.c (working copy)
@@ -2253,18 +2264,6 @@ vect_analyze_slp_instance (vec_info *vin
   matches[group_size / const_max_nunits * const_max_nunits] = false;
   vect_free_slp_tree (node, false);
  }
-      else if (constructor
-       && SLP_TREE_DEF_TYPE (node) != vect_internal_def)
- {
-  /* CONSTRUCTOR vectorization relies on a vector stmt being
-     generated, that doesn't work for fully external ones.  */
-  if (dump_enabled_p ())
-    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-     "Build SLP failed: CONSTRUCTOR of external "
-     "or constant elements\n");
-  vect_free_slp_tree (node, false);
-  return false;
- }
       else
  {
   /* Create a new SLP instance.  */
@@ -2939,7 +2939,12 @@ vect_slp_analyze_operations (vec_info *v
       if (!vect_slp_analyze_node_operations (vinfo,
      SLP_INSTANCE_TREE (instance),
      instance, visited, lvisited,
-     &cost_vec))
+     &cost_vec)
+  /* Instances with a root stmt require vectorized defs for the
+     SLP tree root.  */
+  || (SLP_INSTANCE_ROOT_STMT (instance)
+      && (SLP_TREE_DEF_TYPE (SLP_INSTANCE_TREE (instance))
+  != vect_internal_def)))
         {
   slp_tree node = SLP_INSTANCE_TREE (instance);
   stmt_vec_info stmt_info = SLP_TREE_SCALAR_STMTS (node)[0];
Index: gcc/testsuite/gfortran.dg/pr92537.f90
===================================================================
--- gcc/testsuite/gfortran.dg/pr92537.f90 (nonexistent)
+++ gcc/testsuite/gfortran.dg/pr92537.f90 (working copy)
@@ -0,0 +1,32 @@
+! { dg-do compile }
+! { dg-options "-O2 -ftree-vectorize -fno-inline" }
+! { dg-additional-options "-march=skylake" { target x86_64-*-* i?86-*-* } }
+MODULE pr93527
+  implicit none
+  integer, parameter :: wp = kind (1.d0)
+  interface p_min
+     module procedure p_min_wp
+  end interface
+contains
+  subroutine foo (pr)
+    real(wp), pointer     :: pr(:)
+    integer  ::  nzd
+    real(wp) ::  pmin
+    real(wp) ::  pmin_diag
+    integer  ::  i
+    nzd  = 15
+    allocate (pr(nzd))
+    pmin_diag = 4000._wp
+    pmin = p_min(pmin_diag)
+    pmin = min (pmin,pmin_diag)
+    pr(1) = log(pmin)
+    do i=1,nzd-1
+       pr(i+1) = log(pmin) + i
+    end do
+  end subroutine foo
+  function p_min_wp (x) result (p_min)
+    real(wp), intent(in) :: x
+    real(wp)             :: p_min
+    p_min = x
+  end function p_min_wp
+end module pr93527
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Fix PR92537

Richard Sandiford-9
Richard Biener <[hidden email]> writes:
> There's a bit left in this PR which the following fixes.  The root
> only became external late and the check more naturally belongs to
> the new place.
>
> Boostrap and regtest running on x86_64-unknown-linux-gnu.

FWIW, I was just about to post the patch below before seeing your
message. :-)  Thought I might as well post it anyway just in case.

I guess the slight advantage to keeping the vect_analyze_slp_instance
test is that we can still free child nodes at that point, so it might
be more efficient to catch it "early".  It probably doesn't make much
differnce in practice though.

Tested on aarch64-linux-gnu.

Thanks,
Richard


2019-11-20  Richard Sandiford  <[hidden email]>

gcc/
        PR tree-optimization/92537
        * tree-vect-slp.c (vect_slp_analyze_node_operations): Don't
        allow the root node to be external.

gcc/testsuite/
        PR tree-optimization/92537
        * gcc.dg/vect/pr92537.c: New test.
        * gfortran.dg/vect/pr92537.f90: Likewise.

Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c 2019-11-18 15:21:12.919993766 +0000
+++ gcc/tree-vect-slp.c 2019-11-20 09:48:57.145101295 +0000
@@ -2848,7 +2848,7 @@ vect_slp_analyze_node_operations (vec_in
   slp_tree child;
 
   if (SLP_TREE_DEF_TYPE (node) != vect_internal_def)
-    return true;
+    return node != SLP_INSTANCE_TREE (node_instance);
 
   /* If we already analyzed the exact same set of scalar stmts we're done.
      We share the generated vector stmts for those.
Index: gcc/testsuite/gcc.dg/vect/pr92537.c
===================================================================
--- /dev/null 2019-09-17 11:41:18.176664108 +0100
+++ gcc/testsuite/gcc.dg/vect/pr92537.c 2019-11-20 09:48:57.145101295 +0000
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+int a[64];
+int b, c, e;
+short d;
+short f[64];
+void g() {
+  b = 0;
+  c = d >> 3;
+  for (; b < 64 - 1; b++) {
+    e = 0;
+    e -= a[b] * c;
+    f[b] = e;
+  }
+}
Index: gcc/testsuite/gfortran.dg/vect/pr92537.f90
===================================================================
--- /dev/null 2019-09-17 11:41:18.176664108 +0100
+++ gcc/testsuite/gfortran.dg/vect/pr92537.f90 2019-11-20 09:48:57.145101295 +0000
@@ -0,0 +1,32 @@
+! { dg-do compile } */
+! { dg-additional-options "-fno-inline -march=skylake" }
+
+MODULE pr93527
+  implicit none
+  integer, parameter :: wp = kind (1.d0)
+  interface p_min
+     module procedure p_min_wp
+  end interface
+contains
+  subroutine foo (pr)
+    real(wp), pointer     :: pr(:)
+    integer  ::  nzd
+    real(wp) ::  pmin
+    real(wp) ::  pmin_diag
+    integer  ::  i
+    nzd  = 15
+    allocate (pr(nzd))
+    pmin_diag = 4000._wp
+    pmin = p_min(pmin_diag)
+    pmin = min (pmin,pmin_diag)
+    pr(1) = log(pmin)
+    do i=1,nzd-1
+       pr(i+1) = log(pmin) + i
+    end do
+  end subroutine foo
+  function p_min_wp (x) result (p_min)
+    real(wp), intent(in) :: x
+    real(wp)             :: p_min
+    p_min = x
+  end function p_min_wp
+end module pr93527
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Fix PR92537

Richard Biener
On Wed, 20 Nov 2019, Richard Sandiford wrote:

> Richard Biener <[hidden email]> writes:
> > There's a bit left in this PR which the following fixes.  The root
> > only became external late and the check more naturally belongs to
> > the new place.
> >
> > Boostrap and regtest running on x86_64-unknown-linux-gnu.
>
> FWIW, I was just about to post the patch below before seeing your
> message. :-)  Thought I might as well post it anyway just in case.

I was thinking whether it makes sense to have external root nodes
and they probably can only appear with the CTORs right now (otherwise
they'll be stores).  So possibly generalizing the check like you
do makes sense, still the check belongs in vect_slp_analyze_operations ;)

> I guess the slight advantage to keeping the vect_analyze_slp_instance
> test is that we can still free child nodes at that point, so it might
> be more efficient to catch it "early".  It probably doesn't make much
> differnce in practice though.

But if the root is external there are no child nodes (well, besides
that root).  But yeah, having the check twice looked odd to me.

> Tested on aarch64-linux-gnu.

I've installed my variant before seeing you mail, so...

I also have a prototype for finding "random" roots for SLP
vectorization (just a bunch of same stmts) which runs into
the same issue (but also very many others ;))

Thanks,
Richard.

>
> Thanks,
> Richard
>
>
> 2019-11-20  Richard Sandiford  <[hidden email]>
>
> gcc/
> PR tree-optimization/92537
> * tree-vect-slp.c (vect_slp_analyze_node_operations): Don't
> allow the root node to be external.
>
> gcc/testsuite/
> PR tree-optimization/92537
> * gcc.dg/vect/pr92537.c: New test.
> * gfortran.dg/vect/pr92537.f90: Likewise.
>
> Index: gcc/tree-vect-slp.c
> ===================================================================
> --- gcc/tree-vect-slp.c 2019-11-18 15:21:12.919993766 +0000
> +++ gcc/tree-vect-slp.c 2019-11-20 09:48:57.145101295 +0000
> @@ -2848,7 +2848,7 @@ vect_slp_analyze_node_operations (vec_in
>    slp_tree child;
>  
>    if (SLP_TREE_DEF_TYPE (node) != vect_internal_def)
> -    return true;
> +    return node != SLP_INSTANCE_TREE (node_instance);
>  
>    /* If we already analyzed the exact same set of scalar stmts we're done.
>       We share the generated vector stmts for those.
> Index: gcc/testsuite/gcc.dg/vect/pr92537.c
> ===================================================================
> --- /dev/null 2019-09-17 11:41:18.176664108 +0100
> +++ gcc/testsuite/gcc.dg/vect/pr92537.c 2019-11-20 09:48:57.145101295 +0000
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O3" } */
> +
> +int a[64];
> +int b, c, e;
> +short d;
> +short f[64];
> +void g() {
> +  b = 0;
> +  c = d >> 3;
> +  for (; b < 64 - 1; b++) {
> +    e = 0;
> +    e -= a[b] * c;
> +    f[b] = e;
> +  }
> +}
> Index: gcc/testsuite/gfortran.dg/vect/pr92537.f90
> ===================================================================
> --- /dev/null 2019-09-17 11:41:18.176664108 +0100
> +++ gcc/testsuite/gfortran.dg/vect/pr92537.f90 2019-11-20 09:48:57.145101295 +0000
> @@ -0,0 +1,32 @@
> +! { dg-do compile } */
> +! { dg-additional-options "-fno-inline -march=skylake" }
> +
> +MODULE pr93527
> +  implicit none
> +  integer, parameter :: wp = kind (1.d0)
> +  interface p_min
> +     module procedure p_min_wp
> +  end interface
> +contains
> +  subroutine foo (pr)
> +    real(wp), pointer     :: pr(:)
> +    integer  ::  nzd
> +    real(wp) ::  pmin
> +    real(wp) ::  pmin_diag
> +    integer  ::  i
> +    nzd  = 15
> +    allocate (pr(nzd))
> +    pmin_diag = 4000._wp
> +    pmin = p_min(pmin_diag)
> +    pmin = min (pmin,pmin_diag)
> +    pr(1) = log(pmin)
> +    do i=1,nzd-1
> +       pr(i+1) = log(pmin) + i
> +    end do
> +  end subroutine foo
> +  function p_min_wp (x) result (p_min)
> +    real(wp), intent(in) :: x
> +    real(wp)             :: p_min
> +    p_min = x
> +  end function p_min_wp
> +end module pr93527
>
--
Richard Biener <[hidden email]>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)