[Bug tree-optimization/87621] New: auto-vectorization fails for exponentiation code

[Bug tree-optimization/87621] New: auto-vectorization fails for exponentiation code

asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87621

            Bug ID: 87621
           Summary: auto-vectorization fails for exponentiation code
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hoganmeier at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/bgieBT

template <typename T>
T pow(T x, unsigned int n)
{
        if (!n)
                return 1;

        T y = 1;
        while (n > 1)
        {
                if (n%2)
                        y *= x;
                x = x*x; // unsupported use in stmt
                n /= 2;
        }
        return x*y;
}

void testVec(int* x)
{
        // loop nest containing two or more consecutive inner loops cannot be vectorized
        for (int i = 0; i < 8; ++i)
                x[i] = pow(x[i], 10);
}

[Bug tree-optimization/87621] auto-vectorization fails for exponentiation code

asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87621

--- Comment #1 from krux <hoganmeier at gmail dot com> ---
Interestingly it happily unrolls the loop even with -fno-unroll-loops.

[Bug tree-optimization/87621] outer loop auto-vectorization fails for exponentiation code

asolokha at gmx dot com
In reply to this post by asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87621

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-10-16
                 CC|                            |rguenth at gcc dot gnu.org
             Blocks|                            |53947
            Summary|auto-vectorization fails    |outer loop
                   |for exponentiation code     |auto-vectorization fails
                   |                            |for exponentiation code
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The issue is the unsupported reduction.  We can't vectorize a

 x = x*x;

reduction.  And I don't see how we could.
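
One way to read this, as a minimal sketch rather than a description of GCC internals (the function names below are hypothetical and appear neither in the report nor in GCC): a conventional reduction folds a fresh element into an accumulator on every iteration, so independent partial accumulators can be kept per vector lane and combined at the end. The squaring statement has no fresh operand; each value is computed only from the previous one, so the iterations of that loop form a serial chain.

```cpp
// Conventional reduction: acc = acc * a[i].
// Each iteration brings in a new element, and integer multiplication is
// associative, so partial products can be accumulated per lane.
constexpr int product(const int* a, int n) {
    int acc = 1;
    for (int i = 0; i < n; ++i)
        acc *= a[i];
    return acc;
}

// Repeated squaring: x = x * x.
// The only input of iteration i is the output of iteration i - 1,
// so there is nothing to distribute across lanes within this loop.
constexpr int square_k_times(int x, int k) {
    for (int i = 0; i < k; ++i)
        x = x * x;
    return x;
}
```

Here `square_k_times(3, 2)` computes (3*3)*(3*3), while `product` over {2, 3, 4} multiplies three independent inputs.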

We could eventually vectorize the outer loop but outer loop vectorization
is "confused" by the if-conversion we need to do to the inner loop.

Fixing that (y *= n%2 ? x : 1) yields outer loop vectorization failure like

t.ii:20:20: note:   vect_is_simple_use: operand y_36 = PHI <1(3),
prephitmp_27(10)>, type of def: unknown
t.ii:20:20: missed:   Unsupported pattern.
t.ii:17:6: missed:   not vectorized: unsupported use in stmt.
t.ii:20:20: missed:  unexpected pattern.
t.ii:20:20: missed: couldn't vectorize loop

that is because we "simplified" the multiplication by 1 and thus the
reduction op becomes

 y = n%2 ? new_y : y;

and apparently we do not like this (not sure why the reduction structure
is relevant for outer loop vectorization).  We do not actually detect this
as reduction, but we could simply identify inner loop reductions by
looking for the loop-closed PHIs.
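
For reference, the manual if-conversion mentioned above can be written at source level roughly as follows. This is a sketch under assumptions: the name `pow_ifconv` is hypothetical, and GCC performs the equivalent transformation on its internal representation rather than on the source.

```cpp
// Source-level rendering of the reporter's function with the inner-loop
// branch replaced by a select, as in "y *= n%2 ? x : 1".
template <typename T>
constexpr T pow_ifconv(T x, unsigned int n) {
    if (!n)
        return 1;

    T y = 1;
    while (n > 1) {
        // Multiply by x on odd n, by 1 otherwise. Once the multiplication
        // by 1 is simplified away, this is the same as
        //     y = (n % 2) ? y * x : y;
        // which is the form of the reduction op quoted in the comment.
        y *= (n % 2) ? x : T(1);
        x = x * x;
        n /= 2;
    }
    return x * y;
}
```

The loop body is now branch-free, but the update of `y` is a conditional select between the new and the old value rather than a plain multiplication.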


So - were you expecting outer loop vectorization to happen?


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug tree-optimization/87621] outer loop auto-vectorization fails for exponentiation code

asolokha at gmx dot com
In reply to this post by asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87621

--- Comment #3 from krux <hoganmeier at gmail dot com> ---
Yes, see the godbolt link.
clang compiles it down to a few vpmulld's.
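
To make the expected result concrete: with the constant exponent 10 from the test case, tracing the loop by hand leaves four multiplications per element. The sketch below shows that arithmetic only. It is not clang's actual output, and the names `pow10_straight` and `testVecStraight` are hypothetical.

```cpp
// Straight-line form of pow(x, 10) after the loop is fully evaluated
// for n = 10: x^10 = x^8 * x^2.
constexpr int pow10_straight(int x) {
    int x2 = x * x;    // n = 10 -> 5
    int x4 = x2 * x2;  // n = 5 (odd, so y picks up x^2) -> 2
    int x8 = x4 * x4;  // n = 2 -> 1
    return x8 * x2;    // final x * y
}

// The same eight-element loop as in the report, with the call replaced by
// the straight-line form. Every element goes through identical
// multiplications, which is what makes lane-wise vector multiplies possible.
inline void testVecStraight(int* x) {
    for (int i = 0; i < 8; ++i)
        x[i] = pow10_straight(x[i]);
}
```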

[Bug tree-optimization/87621] outer loop auto-vectorization fails for exponentiation code

asolokha at gmx dot com
In reply to this post by asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87621

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed on trunk.

[Bug tree-optimization/87621] outer loop auto-vectorization fails for exponentiation code

asolokha at gmx dot com
In reply to this post by asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87621

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
Author: rguenth
Date: Fri Nov  9 10:53:31 2018
New Revision: 265959

URL: https://gcc.gnu.org/viewcvs?rev=265959&root=gcc&view=rev
Log:
2018-11-09  Richard Biener  <[hidden email]>

        PR tree-optimization/87621
        * tree-vect-loop.c (vectorizable_reduction): Handle reduction
        op with only phi inputs.
        * tree-ssa-loop-ch.c: Include tree-ssa-sccvn.h.
        (ch_base::copy_headers): Run CSE on copied loop headers.
        (pass_ch_vect::process_loop_p): Simplify.

        * g++.dg/vect/pr87621.cc: New testcase.

Added:
    trunk/gcc/testsuite/g++.dg/vect/pr87621.cc
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-loop-ch.c
    trunk/gcc/tree-vect-loop.c