[Bug middle-end/88784] New: Middle end is missing some optimizations about unsigned

classic Classic list List threaded Threaded
31 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] New: Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

            Bug ID: 88784
           Summary: Middle end is missing some optimizations about
                    unsigned
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: helijia at gcc dot gnu.org
  Target Milestone: ---

For both operands are unsigned, the following optimizations are valid, and
missing:
1. X > Y && X != 0 --> X > Y
2. X > Y || X != 0 --> X != 0
3. X <= Y || X != 0 --> true
4. X <= Y || X == 0 --> X <= Y
5. X > Y && X == 0 --> false

unsigned foo(unsigned x, unsigned y) { return x > y && x != 0; }
should fold to x > y, but I found we haven't done it right now.
I compile the code with the following command.
g++ unsigned.cpp -Ofast -c -S -o unsigned.s -fdump-tree-all
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-01-10
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  Similar for singed and X > Y && X != INT_MIN, etc.

ifcombine is one place to fix (maybe_fold_and_comparisons and friends),
match.pd in case this gets lowered to BIT_AND/IOR.
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

Qi Feng <ffengqi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ffengqi at gcc dot gnu.org

--- Comment #2 from Qi Feng <ffengqi at gcc dot gnu.org> ---
I tried some modifications in and_comparisons_1 and or_comparisons_1. It seems
that `||' is transformed to `&&' somewhere. And that makes the changes in
or_comparisons_1 noneffective. Should I find the place where the transformation
happens and make changes before that?
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #3 from Qi Feng <ffengqi at gcc dot gnu.org> ---
I have extended the transformations as following, the first five are the
original ones:

* unsigned
  Use UINT_MAX for demonstration, similar for UCHAR_MAX, USHRT_MAX, UINT_MAX,
  ULONG_MAX, ULLONG_MAX

  x >  y   &&   x != 0           -->   x > y
  x >  y   ||   x != 0           -->   x != 0
  x <= y   ||   x != 0           -->   true
  x <= y   ||   x == 0           -->   x <= y
  x >  y   &&   x == 0           -->   false

  x <  y   &&   x != UINT_MAX    -->   x < y
  x <  y   ||   x != UINT_MAX    -->   x != UINT_MAX
  x >= y   ||   x != UINT_MAX    -->   true
  x >= y   ||   x == UINT_MAX    -->   x >= y
  x <  y   &&   x == UINT_MAX    -->   false

* signed
  Use INT_MIN, INT_MAX for demonstration, similar for SCHAR_MIN, SHRT_MIN,
  INT_MIN, LONG_MIN, LLONG_MIN, SCHAR_MAX, SHRT_MAX, INT_MAX, LONG_MAX,
  LLONG_MAX

  x >  y   &&   x != INT_MIN     -->   x > y
  x >  y   ||   x != INT_MIN     -->   x != INT_MIN
  x <= y   ||   x != INT_MIN     -->   true
  x <= y   ||   x == INT_MIN     -->   x <= y
  x >  y   &&   x == INT_MIN     -->   false

  x <  y   &&   x != INT_MAX     -->   x < y
  x <  y   ||   x != INT_MAX     -->   x != UINT_MAX
  x >= y   ||   x != INT_MAX     -->   true
  x >= y   ||   x == INT_MAX     -->   x >= y
  x <  y   &&   x == INT_MAX     -->   false


Want to know if I missed something.
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #4 from Qi Feng <ffengqi at gcc dot gnu.org> ---
The fourth to the last should be:

  x <  y   ||   x != INT_MAX     -->   x != UINT_MAX

sorry for the typo.
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #5 from Qi Feng <ffengqi at gcc dot gnu.org> ---
(In reply to Qi Feng from comment #4)
> The fourth to the last should be:
>
>   x <  y   ||   x != INT_MAX     -->   x != UINT_MAX
>
> sorry for the typo.

   x <  y   ||   x != INT_MAX     -->   x != INT_MAX

typo again...
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #6 from Fredrik Hederstierna <[hidden email]> ---
Created attachment 46397
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46397&action=edit
Some more patterns

Looking into this I found some more places where it seems to be non-optimal
code, maybe separate issue, but these are also example of equal evaluation for
unsigned types ?

Test 1
  (x > y) || (x > (y / N))   equal to
  (x > (y / N))

Test 2
  (x > y) || (x > (y >> N))  equal to
  (x > (y >> N))

Test 3
  (x > y) && (x > (y / N))   equal to
  (x > y)

Test 4
  (x > y) && (x > (y >> N))  equal to
  (x > y)

One thing to consider here maybe that depending on optimizing for size or
speed, then the order of evaluation can be changed, so like if some operation
is costy, then it could be avoided to obtain higher speed possibly assuming it
will accept arguments prior in list I guess. But when optimizing for size, then
I think always the more simplified expression would apply?

Example for arm using above expressions,
(code attached)

00000000 <test_xy_1_org>:
   0:   b510            push    {r4, lr}
   2:   000b            movs    r3, r1
   4:   0004            movs    r4, r0
   6:   2001            movs    r0, #1
   8:   428c            cmp     r4, r1
   a:   d807            bhi.n   1c <test_xy_1_org+0x1c>
   c:   2103            movs    r1, #3
   e:   0018            movs    r0, r3
  10:   f7ff fffe       bl      0 <__aeabi_uidiv>
  14:   b2c0            uxtb    r0, r0
  16:   42a0            cmp     r0, r4
  18:   4180            sbcs    r0, r0
  1a:   4240            negs    r0, r0
  1c:   bd10            pop     {r4, pc}

0000001e <test_xy_1_equ>:
  1e:   b510            push    {r4, lr}
  20:   0004            movs    r4, r0
  22:   0008            movs    r0, r1
  24:   2103            movs    r1, #3
  26:   f7ff fffe       bl      0 <__aeabi_uidiv>
  2a:   b2c0            uxtb    r0, r0
  2c:   42a0            cmp     r0, r4
  2e:   4180            sbcs    r0, r0
  30:   4240            negs    r0, r0
  32:   bd10            pop     {r4, pc}

00000034 <test_xy_2_org>:
  34:   0003            movs    r3, r0
  36:   2001            movs    r0, #1
  38:   428b            cmp     r3, r1
  3a:   d803            bhi.n   44 <test_xy_2_org+0x10>
  3c:   08c9            lsrs    r1, r1, #3
  3e:   4299            cmp     r1, r3
  40:   4189            sbcs    r1, r1
  42:   4248            negs    r0, r1
  44:   4770            bx      lr

00000046 <test_xy_2_equ>:
  46:   08c9            lsrs    r1, r1, #3
  48:   4281            cmp     r1, r0
  4a:   4180            sbcs    r0, r0
  4c:   4240            negs    r0, r0
  4e:   4770            bx      lr

00000050 <test_xy_3_org>:
  50:   b510            push    {r4, lr}
  52:   000b            movs    r3, r1
  54:   0004            movs    r4, r0
  56:   2000            movs    r0, #0
  58:   428c            cmp     r4, r1
  5a:   d907            bls.n   6c <test_xy_3_org+0x1c>
  5c:   2103            movs    r1, #3
  5e:   0018            movs    r0, r3
  60:   f7ff fffe       bl      0 <__aeabi_uidiv>
  64:   b2c0            uxtb    r0, r0
  66:   42a0            cmp     r0, r4
  68:   4180            sbcs    r0, r0
  6a:   4240            negs    r0, r0
  6c:   bd10            pop     {r4, pc}

0000006e <test_xy_3_equ>:
  6e:   4281            cmp     r1, r0
  70:   4180            sbcs    r0, r0
  72:   4240            negs    r0, r0
  74:   4770            bx      lr

00000076 <test_xy_4_org>:
  76:   0003            movs    r3, r0
  78:   2000            movs    r0, #0
  7a:   428b            cmp     r3, r1
  7c:   d903            bls.n   86 <test_xy_4_org+0x10>
  7e:   08c9            lsrs    r1, r1, #3
  80:   4299            cmp     r1, r3
  82:   4189            sbcs    r1, r1
  84:   4248            negs    r0, r1
  86:   4770            bx      lr

00000088 <test_xy_4_equ>:
  88:   4281            cmp     r1, r0
  8a:   4180            sbcs    r0, r0
  8c:   4240            negs    r0, r0
  8e:   4770            bx      lr
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #7 from Qi Feng <ffengqi at gcc dot gnu.org> ---
I add some patterns in match.pd which handles the original 5 transformations.
But I don't the language used in match.pd well, the patterns I wrote are very
similar.

And I haven't found predicates for constant values other than zero (INT_MIN,
LONG_MIN, LLONG_MIN etc.).
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #8 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 22 May 2019, ffengqi at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784
>
> --- Comment #7 from Qi Feng <ffengqi at gcc dot gnu.org> ---
> I add some patterns in match.pd which handles the original 5 transformations.
> But I don't the language used in match.pd well, the patterns I wrote are very
> similar.

Sometimes iteration helps but sometimes it obfuscates things too much.
match.pd also runs through the preprocessor (OK, I didn't really suggest
to use macros to merge things ;))

> And I haven't found predicates for constant values other than zero (INT_MIN,
> LONG_MIN, LLONG_MIN etc.).

wi:eq_p (wi::to_wide (@1), wi::max_value (TYPE_PRECISION (TREE_TYPE (@1)),
SIGNED))

(ok, a bit long...)
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #9 from Qi Feng <ffengqi at gcc dot gnu.org> ---
And there's another problem. Take `x >  y  &&  x != 0   -->   x > y' for
example, I would also like to do

   x <  y  &&  y != 0  -->  x < y
   x != 0  &&  x >  y  -->  x > y
   y != 0  &&  x <  y  -->  x < y

If the constant always comes in as the second operand is incorrect, these would
have to be doubled.

I tried to add :c to truth_andif, but got the `operation is not commutative'
error.  I also tried to make truth_andif commutative by modifying genmatch.c,
but again, I don't know it well, afraid that I would break something.

The patterns I wrote looks like:

    /* x >  y  &&  x != 0     -->     x > y
       Only for unsigned x and y.  */
    (simplify
     (truth_andif:c (gt@2 @0 @1) (ne @0 integer_zerop))
     (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@0))
          && INTEGRAL_TYPE_P (TREE_TYPE(@1)) && TYPE_UNSIGNED (TREE_TYPE(@1)))
       @2))

I have to wrote 4 of this with minor modification for a single transformation.
If there's better way to do it, please do leave a comment.
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Qi Feng from comment #9)

> And there's another problem. Take `x >  y  &&  x != 0   -->   x > y' for
> example, I would also like to do
>
>    x <  y  &&  y != 0  -->  x < y
>    x != 0  &&  x >  y  -->  x > y
>    y != 0  &&  x <  y  -->  x < y
>
> If the constant always comes in as the second operand is incorrect, these
> would have to be doubled.
>
> I tried to add :c to truth_andif, but got the `operation is not commutative'
> error.  I also tried to make truth_andif commutative by modifying
> genmatch.c, but again, I don't know it well, afraid that I would break
> something.
>
> The patterns I wrote looks like:
>
>     /* x >  y  &&  x != 0     -->     x > y
>        Only for unsigned x and y.  */
>     (simplify
>      (truth_andif:c (gt@2 @0 @1) (ne @0 integer_zerop))
>      (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@0))
>           && INTEGRAL_TYPE_P (TREE_TYPE(@1)) && TYPE_UNSIGNED
> (TREE_TYPE(@1)))
>        @2))
>
> I have to wrote 4 of this with minor modification for a single
> transformation. If there's better way to do it, please do leave a comment.

I think first of all you do _not_ want to use truth_andif since that
will only prevail iff x or y have side-effects.  To match on GIMPLE
you want bit_and instead/as well since all truth_ stuff doesn't prevail there.

And obviously truth_andif is _not_ commutative.  You can use :C if you
want to force it though.  Both truth_and and bit_and are commutative.
So sth like

(for and (truth_and bit_and)
 (for ltgtop (lt le)
  (simplify
   (and:c (ltgtop:c@2 @0 @1) (ne @0 integer_zerop))
   (if (...)
    @2)))

should cover all of

   x <  y  &&  y != 0  -->  x < y
   x != 0  &&  x >  y  -->  x > y
   y != 0  &&  x <  y  -->  x < y
   x <  y  &&  y != 0  -->  x < y

note that from

(and (lt:c@2 @0 @1) (ne @0 integer_zerop))

we generate

(and (lt@2 @0 @1) (ne @0 integer_zerop))
(and (gt@2 @1 @0) (ne @0 integer_zerop))

so :c will ensure the semantically same operation will be present with
swapped operands.  As opposed to :C which would do lt@2 @1 @0.
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #11 from Qi Feng <ffengqi at gcc dot gnu.org> ---
I tried 2 patterns for the following test

  /* 1. x >  y  &&  x != 0     -->     x > y  */
  /* 2. y <  x  &&  x != 0     -->     y < x  */
  /* 3. x != 0  &&  x >  y     -->     x > y  */
  /* 4. x != 0  &&  y <  x     -->     y < x  */

  _Bool f1 (unsigned x, unsigned y)
  {
    return x >  y  &&  x != 0;
  }

  _Bool f2 (unsigned x, unsigned y)
  {
    return y <  x  &&  x != 0;
  }

  _Bool f3 (unsigned x, unsigned y)
  {
    return x != 0  &&  x >  y;
  }

  _Bool f4 (unsigned x, unsigned y)
  {
    return x != 0  &&  y <  x;
  }

The first one is

  (for and (truth_and bit_and)
    (simplify
     (and:c (gt:c@2 @0 @1) (ne @0 integer_zerop))
      (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@0))
           && INTEGRAL_TYPE_P (TREE_TYPE(@1)) && TYPE_UNSIGNED (TREE_TYPE(@1)))
       @2)))

The transformations did not happen as I checked the dump files of
-fdump-tree-{original,optimized}.

And the second one is

  (simplify
   (truth_andif:C (gt:c@2 @0 @1) (ne @0 integer_zerop))
    (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@0))
         && INTEGRAL_TYPE_P (TREE_TYPE(@1)) && TYPE_UNSIGNED (TREE_TYPE(@1)))
     @2))

The transformations did happen for all the 4 functions. And the dump of
-fdump-tree-original showes they already happened, so I guess the pattern is
matched before the process get to GIMPLE.
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #12 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to Qi Feng from comment #11)

> I tried 2 patterns for the following test
>
>   /* 1. x >  y  &&  x != 0     -->     x > y  */
>   /* 2. y <  x  &&  x != 0     -->     y < x  */
>   /* 3. x != 0  &&  x >  y     -->     x > y  */
>   /* 4. x != 0  &&  y <  x     -->     y < x  */
>
>   _Bool f1 (unsigned x, unsigned y)
>   {
>     return x >  y  &&  x != 0;
>   }
>
>   _Bool f2 (unsigned x, unsigned y)
>   {
>     return y <  x  &&  x != 0;
>   }
>
>   _Bool f3 (unsigned x, unsigned y)
>   {
>     return x != 0  &&  x >  y;
>   }
>
>   _Bool f4 (unsigned x, unsigned y)
>   {
>     return x != 0  &&  y <  x;
>   }
>
> The first one is
>
>   (for and (truth_and bit_and)
>     (simplify
>      (and:c (gt:c@2 @0 @1) (ne @0 integer_zerop))
>       (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@0))
>            && INTEGRAL_TYPE_P (TREE_TYPE(@1)) && TYPE_UNSIGNED
> (TREE_TYPE(@1)))
>        @2)))
>
> The transformations did not happen as I checked the dump files of
> -fdump-tree-{original,optimized}.

It isn't supposed to be done in original anyway. It does work in optimized
(even forwprop1) for me. Did you forget to pass -O? Did you look at some old
dump file?

(I think you could use ANY_INTEGRAL_TYPE_P, this seems valid for vectors)
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #13 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 24 May 2019, glisse at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784
>
> --- Comment #12 from Marc Glisse <glisse at gcc dot gnu.org> ---
> (In reply to Qi Feng from comment #11)
> > I tried 2 patterns for the following test
> >
> >   /* 1. x >  y  &&  x != 0     -->     x > y  */
> >   /* 2. y <  x  &&  x != 0     -->     y < x  */
> >   /* 3. x != 0  &&  x >  y     -->     x > y  */
> >   /* 4. x != 0  &&  y <  x     -->     y < x  */
> >
> >   _Bool f1 (unsigned x, unsigned y)
> >   {
> >     return x >  y  &&  x != 0;
> >   }
> >
> >   _Bool f2 (unsigned x, unsigned y)
> >   {
> >     return y <  x  &&  x != 0;
> >   }
> >
> >   _Bool f3 (unsigned x, unsigned y)
> >   {
> >     return x != 0  &&  x >  y;
> >   }
> >
> >   _Bool f4 (unsigned x, unsigned y)
> >   {
> >     return x != 0  &&  y <  x;
> >   }
> >
> > The first one is
> >
> >   (for and (truth_and bit_and)
> >     (simplify
> >      (and:c (gt:c@2 @0 @1) (ne @0 integer_zerop))
> >       (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@0))
> >            && INTEGRAL_TYPE_P (TREE_TYPE(@1)) && TYPE_UNSIGNED
> > (TREE_TYPE(@1)))
> >        @2)))
> >
> > The transformations did not happen as I checked the dump files of
> > -fdump-tree-{original,optimized}.
>
> It isn't supposed to be done in original anyway. It does work in optimized
> (even forwprop1) for me. Did you forget to pass -O? Did you look at some old
> dump file?
>
> (I think you could use ANY_INTEGRAL_TYPE_P, this seems valid for vectors)

I would have expected fold to first change the TRUTH_ANDIF to a
TRUTH_AND and then your pattern match.  But maybe I misremember
that we have such transformation for cases where the 2nd operand
doesn't have side-effects.

While genmatch inserts checks for and rejects operands with side-effects
I still wouldn't use truth_andif here.

As Marc says, expecting the transform in .original is probably
premature.  OTOH whether it is applicable on GIMPLE in the end might
depend on LOGICAL_OP_NON_SHORT_CIRCUIT and BRANCH_COST.
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #14 from Qi Feng <ffengqi at gcc dot gnu.org> ---
Checking .original and .optimized file is just a quick method I use to check
whether an optimization happened (if not happen in first and last pass,
probably it didn't happen).  I didn't mean or think the transform should happen
in these passes.

Using the second pattern, the result of transform already showed in .original
file, so I think it maybe happened earlier (like in GENERIC).

And I tried the first pattern again using a totally clean debug build
(bootstrap disabled), still didn't see the transformation happen.  I don't know
what went wrong. :(

(I tested it on a ppc64le machine, don't know if that matters)
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #15 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 24 May 2019, ffengqi at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784
>
> --- Comment #14 from Qi Feng <ffengqi at gcc dot gnu.org> ---
> Checking .original and .optimized file is just a quick method I use to check
> whether an optimization happened (if not happen in first and last pass,
> probably it didn't happen).  I didn't mean or think the transform should happen
> in these passes.
>
> Using the second pattern, the result of transform already showed in .original
> file, so I think it maybe happened earlier (like in GENERIC).
>
> And I tried the first pattern again using a totally clean debug build
> (bootstrap disabled), still didn't see the transformation happen.  I don't know
> what went wrong. :(
>
> (I tested it on a ppc64le machine, don't know if that matters)

It can matter because we might have gimplified the && to
control flow.
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #16 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to [hidden email] from comment #15)
> It can matter because we might have gimplified the && to
> control flow.

Indeed. You want to look at the forwprop1 dump to see what gimple-match would
need to match (and does match on x86_64).

When X > Y && X != 0 is represented with control flow, in theory, VRP could
deduce a range [1,inf] for X from X>Y, but in practice it uses a symbolic range
[y_3(D) + 1, +INF] from which we fail to deduce X!=0. I think the new ranger
thing that doesn't try to do symbolic ranges would handle that better ;-)
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #17 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 24 May 2019, glisse at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784
>
> --- Comment #16 from Marc Glisse <glisse at gcc dot gnu.org> ---
> (In reply to [hidden email] from comment #15)
> > It can matter because we might have gimplified the && to
> > control flow.
>
> Indeed. You want to look at the forwprop1 dump to see what gimple-match would
> need to match (and does match on x86_64).
>
> When X > Y && X != 0 is represented with control flow, in theory, VRP could
> deduce a range [1,inf] for X from X>Y, but in practice it uses a symbolic range
> [y_3(D) + 1, +INF] from which we fail to deduce X!=0. I think the new ranger
> thing that doesn't try to do symbolic ranges would handle that better ;-)

Note that in theory the ifcombine pass should pick up the
opportunity to simplify the comparison via maybe_fold_and_comparisons.
It looks like that code is quite old and could need a rewrite
to match-and-simplify.  But of course it exists to avoid creating
trees for the combination parts which still has a point in
match-and-simplify times.  An opportunity for an alternate
code-generator from match.pd since the patterns are still valid
sources for the transform just the input is bigger exploded
forms.
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #18 from Qi Feng <ffengqi at gcc dot gnu.org> ---
I only learned gcc for about 2 months, and I have to say that I don't fully
understand what you guys were saying. Is match.pd the right place to fix this
issue? Am I in the wrong direction?
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

seurer at gcc dot gnu.org
In reply to this post by seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #19 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 27 May 2019, ffengqi at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784
>
> --- Comment #18 from Qi Feng <ffengqi at gcc dot gnu.org> ---
> I only learned gcc for about 2 months, and I have to say that I don't fully
> understand what you guys were saying. Is match.pd the right place to fix this
> issue? Am I in the wrong direction?

Yes, match.pd is the correct place to add this kind of simplifications.
No, please don't use *_{and,or}if codes there if possible.  Yes, as
I said upthread it might be necessary to rewrite some of GCCs
infrastructure to get the pattern applied in all situations.
12