[PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

classic Classic list List threaded Threaded
56 messages Options
123
Reply | Threaded
Open this post in threaded view
|

[PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

Li Jia He-2
Hi,

According to the optimizable case described by Qi Feng on
issue 88784, we can combine the cases into the following:

1. x >  y  &&  x != XXX_MIN  -->   x > y
2. x >  y  &&  x == XXX_MIN  -->   false
3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN

4. x <  y  &&  x != XXX_MAX  -->   x < y
5. x <  y  &&  x == XXX_MAX  -->   false
6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX

7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
8. x <= y  ||  x != XXX_MIN  -->   true
9. x <= y  ||  x == XXX_MIN  -->   x <= y

10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
11. x >= y  ||  x != XXX_MAX  -->   true
12. x >= y  ||  x == XXX_MAX  -->   x >= y

Note: XXX_MIN represents the minimum value of type x.
      XXX_MAX represents the maximum value of type x.

Here we don't need to care about whether the operation is
signed or unsigned.  For example, in the below equation:

'x >  y  &&  x != XXX_MIN  -->   x > y'

If the x type is signed int and XXX_MIN is INT_MIN, we can
optimize it to 'x > y'.  However, if the type of x is unsigned
int and XXX_MIN is 0, we can still optimize it to 'x > y'.

The regression testing for the patch was done on GCC mainline on

    powerpc64le-unknown-linux-gnu (Power 9 LE)

with no regressions.  Is it OK for trunk ?

Thanks,
Lijia He

gcc/ChangeLog

2019-06-27  Li Jia He  <[hidden email]>
            Qi Feng  <[hidden email]>

        PR middle-end/88784
        * gimple-fold.c (and_comparisons_contain_equal_operands): New function.
        (and_comparisons_1): Use and_comparisons_contain_equal_operands.
        (or_comparisons_contain_equal_operands): New function.
        (or_comparisons_1): Use or_comparisons_contain_equal_operands.

gcc/testsuite/ChangeLog

2019-06-27  Li Jia He  <[hidden email]>
            Qi Feng  <[hidden email]>

        PR middle-end/88784
        * gcc.dg/pr88784-1.c: New testcase.
        * gcc.dg/pr88784-2.c: New testcase.
        * gcc.dg/pr88784-3.c: New testcase.
        * gcc.dg/pr88784-4.c: New testcase.
        * gcc.dg/pr88784-5.c: New testcase.
        * gcc.dg/pr88784-6.c: New testcase.
        * gcc.dg/pr88784-7.c: New testcase.
        * gcc.dg/pr88784-8.c: New testcase.
        * gcc.dg/pr88784-9.c: New testcase.
        * gcc.dg/pr88784-10.c: New testcase.
        * gcc.dg/pr88784-11.c: New testcase.
        * gcc.dg/pr88784-12.c: New testcase.
---
 gcc/gimple-fold.c                 | 124 ++++++++++++++++++++++++++++++
 gcc/testsuite/gcc.dg/pr88784-1.c  |  30 ++++++++
 gcc/testsuite/gcc.dg/pr88784-10.c |  32 ++++++++
 gcc/testsuite/gcc.dg/pr88784-11.c |  30 ++++++++
 gcc/testsuite/gcc.dg/pr88784-12.c |  30 ++++++++
 gcc/testsuite/gcc.dg/pr88784-2.c  |  30 ++++++++
 gcc/testsuite/gcc.dg/pr88784-3.c  |  32 ++++++++
 gcc/testsuite/gcc.dg/pr88784-4.c  |  32 ++++++++
 gcc/testsuite/gcc.dg/pr88784-5.c  |  31 ++++++++
 gcc/testsuite/gcc.dg/pr88784-6.c  |  31 ++++++++
 gcc/testsuite/gcc.dg/pr88784-7.c  |  31 ++++++++
 gcc/testsuite/gcc.dg/pr88784-8.c  |  31 ++++++++
 gcc/testsuite/gcc.dg/pr88784-9.c  |  32 ++++++++
 13 files changed, 496 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-10.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-11.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-12.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-2.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-3.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-4.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-5.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-6.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-7.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-8.c
 create mode 100644 gcc/testsuite/gcc.dg/pr88784-9.c

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index dfb31a02078..6d2472d2fcb 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -5535,6 +5535,62 @@ and_var_with_comparison_1 (gimple *stmt,
   return NULL_TREE;
 }
 
+/* Try to simplify the AND of two comparisons defined by
+   (OP1A CODE1 OP1B) and (OP2A CODE2 OP2B), respectively.
+   This optimization needs to satisfy op1a equal to op2a
+   or op1b equal to op2a.  If this can be done without
+   constructing an intermediate value, return the resulting
+   tree; otherwise NULL_TREE is returned.  */
+
+static tree
+and_comparisons_contain_equal_operands (enum tree_code code1, tree op1a,
+ tree op1b, enum tree_code code2,
+ tree op2a, tree op2b)
+{
+  /* Transform ((Y1 CODE1 X) AND (X CODE2 Y2)) to
+     ((X CODE1' Y1) AND (X CODE2 Y2)).  */
+  if (!operand_equal_p (op1a, op1b, 0) && operand_equal_p (op1b, op2a, 0)
+      && !operand_equal_p (op2a, op2b, 0))
+    return and_comparisons_contain_equal_operands (swap_tree_comparison (code1),
+   op1b, op1a, code2, op2a,
+   op2b);
+
+  tree op1a_type = TREE_TYPE (op1a);
+  tree op1b_type = TREE_TYPE (op1b);
+  tree op2b_type = TREE_TYPE (op2b);
+
+  if (INTEGRAL_TYPE_P (op1a_type) && INTEGRAL_TYPE_P (op1b_type)
+      && operand_equal_p (op1a, op2a, 0) && TREE_CODE (op2b) == INTEGER_CST)
+    {
+      if (wi::eq_p (wi::to_wide (op2b), wi::min_value (op2b_type)))
+ {
+  /* x > y && x != XXX_MIN --> x > y  */
+  if (code1 == GT_EXPR && code2 == NE_EXPR)
+    return fold_build2 (code1, boolean_type_node, op1a, op1b);
+  /* x > y && x == XXX_MIN --> false  */
+  if (code1 == GT_EXPR && code2 == EQ_EXPR)
+    return boolean_false_node;
+  /* x <= y && x == XXX_MIN --> x == XXX_MIN  */
+  if (code1 == LE_EXPR && code2 == EQ_EXPR)
+    return fold_build2 (code2, boolean_type_node, op2a, op2b);
+ }
+      else if (wi::eq_p (wi::to_wide (op2b), wi::max_value (op2b_type)))
+ {
+  /* x < y && x != XXX_MAX --> x < y  */
+  if (code1 == LT_EXPR && code2 == NE_EXPR)
+    return fold_build2 (code1, boolean_type_node, op1a, op1b);
+  /* x < y && x == XXX_MAX --> false  */
+  if (code1 == LT_EXPR && code2 == EQ_EXPR)
+    return boolean_false_node;
+  /* x >= y && x == XXX_MAX --> x == XXX_MAX  */
+  if (code1 == GE_EXPR && code2 == EQ_EXPR)
+    return fold_build2 (code2, boolean_type_node, op2a, op2b);
+ }
+    }
+
+  return NULL_TREE;
+}
+
 /* Try to simplify the AND of two comparisons defined by
    (OP1A CODE1 OP1B) and (OP2A CODE2 OP2B), respectively.
    If this can be done without constructing an intermediate value,
@@ -5546,6 +5602,12 @@ static tree
 and_comparisons_1 (enum tree_code code1, tree op1a, tree op1b,
    enum tree_code code2, tree op2a, tree op2b)
 {
+  /* Try to optimize ((x CODE1 y1) AND (x CODE2 y2))
+     and ((y1 CODE1 x) AND (x CODE2 y2)).  */
+  if (tree t = and_comparisons_contain_equal_operands (code1, op1a, op1b, code2,
+       op2a, op2b))
+    return t;
+
   tree truth_type = truth_type_for (TREE_TYPE (op1a));
 
   /* First check for ((x CODE1 y) AND (x CODE2 y)).  */
@@ -5999,6 +6061,62 @@ or_var_with_comparison_1 (gimple *stmt,
   return NULL_TREE;
 }
 
+/* Try to simplify the OR of two comparisons defined by
+   (OP1A CODE1 OP1B) or (OP2A CODE2 OP2B), respectively.
+   This optimization needs to satisfy op1a equal to op2a
+   or op1b equal to op2a.  If this can be done without
+   constructing an intermediate value, return the resulting
+   tree; otherwise NULL_TREE is returned.  */
+
+static tree
+or_comparisons_contain_equal_operands (enum tree_code code1, tree op1a,
+       tree op1b, enum tree_code code2,
+       tree op2a, tree op2b)
+{
+  /* Transform ((Y1 CODE1 X) OR (X CODE2 Y2)) to
+     ((X CODE1' Y1) OR (X CODE2 Y2)).  */
+  if (!operand_equal_p (op1a, op1b, 0) && operand_equal_p (op1b, op2a, 0)
+      && !operand_equal_p (op2a, op2b, 0))
+    return or_comparisons_contain_equal_operands (swap_tree_comparison (code1),
+  op1b, op1a, code2, op2a,
+  op2b);
+
+  tree op1a_type = TREE_TYPE (op1a);
+  tree op1b_type = TREE_TYPE (op1b);
+  tree op2b_type = TREE_TYPE (op2b);
+
+  if (INTEGRAL_TYPE_P (op1a_type) && INTEGRAL_TYPE_P (op1b_type)
+      && operand_equal_p (op1a, op2a, 0) && TREE_CODE (op2b) == INTEGER_CST)
+    {
+      if (wi::eq_p (wi::to_wide (op2b), wi::min_value (op2b_type)))
+ {
+  /* x > y || x != XXX_MIN --> x != XXX_MIN  */
+  if (code1 == GT_EXPR && code2 == NE_EXPR)
+    return fold_build2 (code2, boolean_type_node, op2a, op2b);
+  /* x <= y || x != XXX_MIN --> true  */
+  if (code1 == LE_EXPR && code2 == NE_EXPR)
+    return boolean_true_node;
+  /* x <= y || x == XXX_MIN --> x <= y  */
+  if (code1 == LE_EXPR && code2 == EQ_EXPR)
+    return fold_build2 (code1, boolean_type_node, op1a, op1b);
+ }
+      else if (wi::eq_p (wi::to_wide (op2b), wi::max_value (op2b_type)))
+ {
+  /* x < y || x != XXX_MAX --> x != XXX_MAX  */
+  if (code1 == LT_EXPR && code2 == NE_EXPR)
+    return fold_build2 (code2, boolean_type_node, op2a, op2b);
+  /* x >= y || x != XXX_MAX --> true  */
+  if (code1 == GE_EXPR && code2 == NE_EXPR)
+    return boolean_true_node;
+  /* x >= y || x == XXX_MAX --> x >= y  */
+  if (code1 == GE_EXPR && code2 == EQ_EXPR)
+    return fold_build2 (code1, boolean_type_node, op1a, op1b);
+ }
+    }
+
+  return NULL_TREE;
+}
+
 /* Try to simplify the OR of two comparisons defined by
    (OP1A CODE1 OP1B) and (OP2A CODE2 OP2B), respectively.
    If this can be done without constructing an intermediate value,
@@ -6010,6 +6128,12 @@ static tree
 or_comparisons_1 (enum tree_code code1, tree op1a, tree op1b,
   enum tree_code code2, tree op2a, tree op2b)
 {
+  /* Try to optimize ((x CODE1 y1) OR (x CODE2 y2))
+     and ((y1 CODE1 x) OR (x CODE2 y2)).  */
+  if (tree t = or_comparisons_contain_equal_operands (code1, op1a, op1b, code2,
+      op2a, op2b))
+    return t;
+
   tree truth_type = truth_type_for (TREE_TYPE (op1a));
 
   /* First check for ((x CODE1 y) OR (x CODE2 y)).  */
diff --git a/gcc/testsuite/gcc.dg/pr88784-1.c b/gcc/testsuite/gcc.dg/pr88784-1.c
new file mode 100644
index 00000000000..067d40d0739
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-1.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */
+
+#include <limits.h>
+
+_Bool and1(unsigned x, unsigned y)
+{
+  /* x > y && x != 0 --> x > y */
+  return x > y && x != 0;
+}
+
+_Bool and2(unsigned x, unsigned y)
+{
+  /* x < y && x != UINT_MAX --> x < y */
+  return x < y && x != UINT_MAX;
+}
+
+_Bool and3(signed x, signed y)
+{
+  /* x > y && x != INT_MIN --> x > y */
+  return x > y && x != INT_MIN;
+}
+
+_Bool and4(signed x, signed y)
+{
+  /* x < y && x != INT_MAX --> x < y */
+  return x < y && x != INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " != " "ifcombine" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-10.c b/gcc/testsuite/gcc.dg/pr88784-10.c
new file mode 100644
index 00000000000..fa185d680c2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-10.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */
+
+#include <limits.h>
+
+_Bool or1(unsigned x, unsigned y)
+{
+  /* x <= y || x != 0 --> true */
+  return x <= y || x != 0;
+}
+
+_Bool or2(unsigned x, unsigned y)
+{
+  /* x >= y || x != UINT_MAX --> true */
+  return x >= y || x != UINT_MAX;
+}
+
+_Bool or3(signed x, signed y)
+{
+  /* x <= y || x != INT_MIN --> true */
+  return x <= y || x != INT_MIN;
+}
+
+_Bool or4(signed x, signed y)
+{
+  /* x >= y || x != INT_MAX --> true */
+  return x >= y || x != INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " != " "dce3" } } */
+/* { dg-final { scan-tree-dump-not " <= " "dce3" } } */
+/* { dg-final { scan-tree-dump-not " >= " "dce3" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-11.c b/gcc/testsuite/gcc.dg/pr88784-11.c
new file mode 100644
index 00000000000..4465910efbb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-11.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */
+
+#include <limits.h>
+
+_Bool or1(unsigned x, unsigned y)
+{
+  /* x <= y || x == 0 --> x <= y */
+  return x <= y || x == 0;
+}
+
+_Bool or2(unsigned x, unsigned y)
+{
+  /* x >= y || x == UINT_MAX --> x >= y */
+  return x >= y || x == UINT_MAX;
+}
+
+_Bool or3(signed x, signed y)
+{
+  /* x <= y || x == INT_MIN --> x <= y */
+  return x <= y || x == INT_MIN;
+}
+
+_Bool or4(signed x, signed y)
+{
+  /* x >= y || x == INT_MAX --> x >= y */
+  return x >= y || x == INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " == " "ifcombine" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-12.c b/gcc/testsuite/gcc.dg/pr88784-12.c
new file mode 100644
index 00000000000..477bba07821
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-12.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */
+
+#include <limits.h>
+
+_Bool or1(unsigned x, unsigned y)
+{
+  /* x <= y || x == 0 --> x <= y */
+  return x <= y || x == 0;
+}
+
+_Bool or2(unsigned x, unsigned y)
+{
+  /* x >= y || x == UINT_MAX --> x >= y */
+  return x >= y || x == UINT_MAX;
+}
+
+_Bool or3(signed x, signed y)
+{
+  /* x <= y || x == INT_MIN --> x <= y */
+  return x <= y || x == INT_MIN;
+}
+
+_Bool or4(signed x, signed y)
+{
+  /* x >= y || x == INT_MAX --> x >= y */
+  return x >= y || x == INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " == " "dce3" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-2.c b/gcc/testsuite/gcc.dg/pr88784-2.c
new file mode 100644
index 00000000000..265ddc7ceeb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-2.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */
+
+#include <limits.h>
+
+_Bool and1(unsigned x, unsigned y)
+{
+  /* x > y && x != 0 --> x > y */
+  return x > y && x != 0;
+}
+
+_Bool and2(unsigned x, unsigned y)
+{
+  /* x < y && x != UINT_MAX --> x < y */
+  return x < y && x != UINT_MAX;
+}
+
+_Bool and3(signed x, signed y)
+{
+  /* x > y && x != INT_MIN --> x > y */
+  return x > y && x != INT_MIN;
+}
+
+_Bool and4(signed x, signed y)
+{
+  /* x < y && x != INT_MAX --> x < y */
+  return x < y && x != INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " != " "dce3" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-3.c b/gcc/testsuite/gcc.dg/pr88784-3.c
new file mode 100644
index 00000000000..be2ce315e60
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-3.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */
+
+#include <limits.h>
+
+_Bool and1(unsigned x, unsigned y)
+{
+  /* x > y && x == 0 --> false */
+  return x > y && x == 0;
+}
+
+_Bool and2(unsigned x, unsigned y)
+{
+  /* x < y && x == UINT_MAX --> false */
+  return x < y && x == UINT_MAX;
+}
+
+_Bool and3(signed x, signed y)
+{
+  /* x > y && x == INT_MIN --> false */
+  return x > y && x == INT_MIN;
+}
+
+_Bool and4(signed x, signed y)
+{
+  /* x < y && x == INT_MAX --> false */
+  return x < y && x == INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " == " "ifcombine" } } */
+/* { dg-final { scan-tree-dump-not " > " "ifcombine" } } */
+/* { dg-final { scan-tree-dump-not " < " "ifcombine" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-4.c b/gcc/testsuite/gcc.dg/pr88784-4.c
new file mode 100644
index 00000000000..b927e712464
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-4.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */
+
+#include <limits.h>
+
+_Bool and1(unsigned x, unsigned y)
+{
+  /* x > y && x == 0 --> false */
+  return x > y && x == 0;
+}
+
+_Bool and2(unsigned x, unsigned y)
+{
+  /* x < y && x == UINT_MAX --> false */
+  return x < y && x == UINT_MAX;
+}
+
+_Bool and3(signed x, signed y)
+{
+  /* x > y && x == INT_MIN --> false */
+  return x > y && x == INT_MIN;
+}
+
+_Bool and4(signed x, signed y)
+{
+  /* x < y && x == INT_MAX --> false */
+  return x < y && x == INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " == " "dce3" } } */
+/* { dg-final { scan-tree-dump-not " > " "dce3" } } */
+/* { dg-final { scan-tree-dump-not " < " "dce3" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-5.c b/gcc/testsuite/gcc.dg/pr88784-5.c
new file mode 100644
index 00000000000..c6a349d7c75
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-5.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */
+
+#include <limits.h>
+
+_Bool and1(unsigned x, unsigned y)
+{
+  /* x <= y && x == 0 --> x == 0 */
+  return x <= y && x == 0;
+}
+
+_Bool and2(unsigned x, unsigned y)
+{
+  /* x >= y && x == UINT_MAX --> x == UINT_MAX */
+  return x >= y && x == UINT_MAX;
+}
+
+_Bool and3(signed x, signed y)
+{
+  /* x <= y && x == INT_MIN --> x == INT_MIN */
+  return x <= y && x == INT_MIN;
+}
+
+_Bool and4(signed x, signed y)
+{
+  /* x >= y && x == INT_MAX --> x == INT_MAX */
+  return x >= y && x == INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " <= " "ifcombine" } } */
+/* { dg-final { scan-tree-dump-not " >= " "ifcombine" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-6.c b/gcc/testsuite/gcc.dg/pr88784-6.c
new file mode 100644
index 00000000000..5b5d2221bf0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-6.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */
+
+#include <limits.h>
+
+_Bool and1(unsigned x, unsigned y)
+{
+  /* x <= y && x == 0 --> x == 0 */
+  return x <= y && x == 0;
+}
+
+_Bool and2(unsigned x, unsigned y)
+{
+  /* x >= y && x == UINT_MAX --> x == UINT_MAX */
+  return x >= y && x == UINT_MAX;
+}
+
+_Bool and3(signed x, signed y)
+{
+  /* x <= y && x == INT_MIN --> x == INT_MIN */
+  return x <= y && x == INT_MIN;
+}
+
+_Bool and4(signed x, signed y)
+{
+  /* x >= y && x == INT_MAX --> x == INT_MAX */
+  return x >= y && x == INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " <= " "dce3" } } */
+/* { dg-final { scan-tree-dump-not " >= " "dce3" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-7.c b/gcc/testsuite/gcc.dg/pr88784-7.c
new file mode 100644
index 00000000000..937d2d26593
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-7.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */
+
+#include <limits.h>
+
+_Bool or1(unsigned x, unsigned y)
+{
+  /* x > y || x != 0 --> x != 0 */
+  return x > y || x != 0;
+}
+
+_Bool or2(unsigned x, unsigned y)
+{
+  /* x < y || x != UINT_MAX --> x != UINT_MAX */
+  return x < y || x != UINT_MAX;
+}
+
+_Bool or3(signed x, signed y)
+{
+  /* x > y || x != INT_MIN --> x != INT_MIN */
+  return x > y || x != INT_MIN;
+}
+
+_Bool or4(signed x, signed y)
+{
+  /* x < y || x != INT_MAX --> x != INT_MAX */
+  return x < y || x != INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " > " "ifcombine" } } */
+/* { dg-final { scan-tree-dump-not " < " "ifcombine" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-8.c b/gcc/testsuite/gcc.dg/pr88784-8.c
new file mode 100644
index 00000000000..7f5c845eb27
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-8.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dce3 --param logical-op-non-short-circuit=1" } */
+
+#include <limits.h>
+
+_Bool or1(unsigned x, unsigned y)
+{
+  /* x > y || x != 0 --> x != 0 */
+  return x > y || x != 0;
+}
+
+_Bool or2(unsigned x, unsigned y)
+{
+  /* x < y || x != UINT_MAX --> x != UINT_MAX */
+  return x < y || x != UINT_MAX;
+}
+
+_Bool or3(signed x, signed y)
+{
+  /* x > y || x != INT_MIN --> x != INT_MIN */
+  return x > y || x != INT_MIN;
+}
+
+_Bool or4(signed x, signed y)
+{
+  /* x < y || x != INT_MAX --> x != INT_MAX */
+  return x < y || x != INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " > " "dce3" } } */
+/* { dg-final { scan-tree-dump-not " < " "dce3" } } */
diff --git a/gcc/testsuite/gcc.dg/pr88784-9.c b/gcc/testsuite/gcc.dg/pr88784-9.c
new file mode 100644
index 00000000000..94f62d94ede
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88784-9.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ifcombine --param logical-op-non-short-circuit=0" } */
+
+#include <limits.h>
+
+_Bool or1(unsigned x, unsigned y)
+{
+  /* x <= y || x != 0 --> true */
+  return x <= y || x != 0;
+}
+
+_Bool or2(unsigned x, unsigned y)
+{
+  /* x >= y || x != UINT_MAX --> true */
+  return x >= y || x != UINT_MAX;
+}
+
+_Bool or3(signed x, signed y)
+{
+  /* x <= y || x != INT_MIN --> true */
+  return x <= y || x != INT_MIN;
+}
+
+_Bool or4(signed x, signed y)
+{
+  /* x >= y || x != INT_MAX --> true */
+  return x >= y || x != INT_MAX;
+}
+
+/* { dg-final { scan-tree-dump-not " != " "ifcombine" } } */
+/* { dg-final { scan-tree-dump-not " <= " "ifcombine" } } */
+/* { dg-final { scan-tree-dump-not " >= " "ifcombine" } } */
--
2.17.1

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

Jeff Law
On 6/27/19 12:11 AM, Li Jia He wrote:

> Hi,
>
> According to the optimizable case described by Qi Feng on
> issue 88784, we can combine the cases into the following:
>
> 1. x >  y  &&  x != XXX_MIN  -->   x > y
> 2. x >  y  &&  x == XXX_MIN  -->   false
> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
>
> 4. x <  y  &&  x != XXX_MAX  -->   x < y
> 5. x <  y  &&  x == XXX_MAX  -->   false
> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
>
> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
> 8. x <= y  ||  x != XXX_MIN  -->   true
> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
>
> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
> 11. x >= y  ||  x != XXX_MAX  -->   true
> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
>
> Note: XXX_MIN represents the minimum value of type x.
>       XXX_MAX represents the maximum value of type x.
>
> Here we don't need to care about whether the operation is
> signed or unsigned.  For example, in the below equation:
>
> 'x >  y  &&  x != XXX_MIN  -->   x > y'
>
> If the x type is signed int and XXX_MIN is INT_MIN, we can
> optimize it to 'x > y'.  However, if the type of x is unsigned
> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
>
> The regression testing for the patch was done on GCC mainline on
>
>     powerpc64le-unknown-linux-gnu (Power 9 LE)
>
> with no regressions.  Is it OK for trunk ?
>
> Thanks,
> Lijia He
>
> gcc/ChangeLog
>
> 2019-06-27  Li Jia He  <[hidden email]>
>    Qi Feng  <[hidden email]>
>
> PR middle-end/88784
> * gimple-fold.c (and_comparisons_contain_equal_operands): New function.
> (and_comparisons_1): Use and_comparisons_contain_equal_operands.
> (or_comparisons_contain_equal_operands): New function.
> (or_comparisons_1): Use or_comparisons_contain_equal_operands.
Would this be better done via match.pd?  ISTM this transformation would
be well suited for that framework.

jeff
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

Li Jia He-2


On 2019/6/27 11:48 PM, Jeff Law wrote:

> On 6/27/19 12:11 AM, Li Jia He wrote:
>> Hi,
>>
>> According to the optimizable case described by Qi Feng on
>> issue 88784, we can combine the cases into the following:
>>
>> 1. x >  y  &&  x != XXX_MIN  -->   x > y
>> 2. x >  y  &&  x == XXX_MIN  -->   false
>> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
>>
>> 4. x <  y  &&  x != XXX_MAX  -->   x < y
>> 5. x <  y  &&  x == XXX_MAX  -->   false
>> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
>>
>> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
>> 8. x <= y  ||  x != XXX_MIN  -->   true
>> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
>>
>> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
>> 11. x >= y  ||  x != XXX_MAX  -->   true
>> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
>>
>> Note: XXX_MIN represents the minimum value of type x.
>>        XXX_MAX represents the maximum value of type x.
>>
>> Here we don't need to care about whether the operation is
>> signed or unsigned.  For example, in the below equation:
>>
>> 'x >  y  &&  x != XXX_MIN  -->   x > y'
>>
>> If the x type is signed int and XXX_MIN is INT_MIN, we can
>> optimize it to 'x > y'.  However, if the type of x is unsigned
>> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
>>
>> The regression testing for the patch was done on GCC mainline on
>>
>>      powerpc64le-unknown-linux-gnu (Power 9 LE)
>>
>> with no regressions.  Is it OK for trunk ?
>>
>> Thanks,
>> Lijia He
>>
>> gcc/ChangeLog
>>
>> 2019-06-27  Li Jia He  <[hidden email]>
>>    Qi Feng  <[hidden email]>
>>
>> PR middle-end/88784
>> * gimple-fold.c (and_comparisons_contain_equal_operands): New function.
>> (and_comparisons_1): Use and_comparisons_contain_equal_operands.
>> (or_comparisons_contain_equal_operands): New function.
>> (or_comparisons_1): Use or_comparisons_contain_equal_operands.
> Would this be better done via match.pd?  ISTM this transformation would
> be well suited for that framework.

Hi, Jeff

I did this because of the following test case:
`
_Bool comp(unsigned x, unsigned y)
{
   return x > y && x != 0;
}
`
The gimple file dumped on the power platform is:
`
comp (unsigned int x, unsigned int y)
{
   _Bool D.2837;
   int iftmp.0;

   if (x > y) goto <D.2841>; else goto <D.2839>;
   <D.2841>:
   if (x != 0) goto <D.2842>; else goto <D.2839>;
   <D.2842>:
   iftmp.0 = 1;
   goto <D.2840>;
   <D.2839>:
   iftmp.0 = 0;
   <D.2840>:
   D.2837 = (_Bool) iftmp.0;
   return D.2837;
}
`
However, the gimple file dumped on x86 is
`
comp (unsigned int x, unsigned int y)
{
   _Bool D.2837;

   _1 = x > y;
   _2 = x != 0;
   _3 = _1 & _2;
   _4 = (int) _3;
   D.2837 = (_Bool) _4;
   return D.2837;
}
`

The reason for the inconsistency between these two behaviors is param
logical-op-non-short-circuit.  If we add the pattern to the match.pd
file, we can only optimize the situation in which the statement is in
the same basic block (logical-op-non-short-circuit=1, x86).  But for
a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
can't handle this situation.

Another reason is that I found out maybe_fold_and_comparisons and
maybe_fold_or_comparisons are not only called by ifcombine pass but
also by reassoc pass. Using this method can basically unify param
logical-op-non-short-circuit=0 or 1.

Thanks,
Lijia He

>
> jeff
>

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

Andrew Pinski-2
On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <[hidden email]> wrote:

>
>
>
> On 2019/6/27 11:48 PM, Jeff Law wrote:
> > On 6/27/19 12:11 AM, Li Jia He wrote:
> >> Hi,
> >>
> >> According to the optimizable case described by Qi Feng on
> >> issue 88784, we can combine the cases into the following:
> >>
> >> 1. x >  y  &&  x != XXX_MIN  -->   x > y
> >> 2. x >  y  &&  x == XXX_MIN  -->   false
> >> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
> >>
> >> 4. x <  y  &&  x != XXX_MAX  -->   x < y
> >> 5. x <  y  &&  x == XXX_MAX  -->   false
> >> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
> >>
> >> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
> >> 8. x <= y  ||  x != XXX_MIN  -->   true
> >> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
> >>
> >> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
> >> 11. x >= y  ||  x != XXX_MAX  -->   true
> >> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
> >>
> >> Note: XXX_MIN represents the minimum value of type x.
> >>        XXX_MAX represents the maximum value of type x.
> >>
> >> Here we don't need to care about whether the operation is
> >> signed or unsigned.  For example, in the below equation:
> >>
> >> 'x >  y  &&  x != XXX_MIN  -->   x > y'
> >>
> >> If the x type is signed int and XXX_MIN is INT_MIN, we can
> >> optimize it to 'x > y'.  However, if the type of x is unsigned
> >> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
> >>
> >> The regression testing for the patch was done on GCC mainline on
> >>
> >>      powerpc64le-unknown-linux-gnu (Power 9 LE)
> >>
> >> with no regressions.  Is it OK for trunk ?
> >>
> >> Thanks,
> >> Lijia He
> >>
> >> gcc/ChangeLog
> >>
> >> 2019-06-27  Li Jia He  <[hidden email]>
> >>          Qi Feng  <[hidden email]>
> >>
> >>      PR middle-end/88784
> >>      * gimple-fold.c (and_comparisons_contain_equal_operands): New function.
> >>      (and_comparisons_1): Use and_comparisons_contain_equal_operands.
> >>      (or_comparisons_contain_equal_operands): New function.
> >>      (or_comparisons_1): Use or_comparisons_contain_equal_operands.
> > Would this be better done via match.pd?  ISTM this transformation would
> > be well suited for that framework.
>
> Hi, Jeff
>
> I did this because of the following test case:
> `
> _Bool comp(unsigned x, unsigned y)
> {
>    return x > y && x != 0;
> }
> `
> The gimple file dumped on the power platform is:
> `
> comp (unsigned int x, unsigned int y)
> {
>    _Bool D.2837;
>    int iftmp.0;
>
>    if (x > y) goto <D.2841>; else goto <D.2839>;
>    <D.2841>:
>    if (x != 0) goto <D.2842>; else goto <D.2839>;
>    <D.2842>:
>    iftmp.0 = 1;
>    goto <D.2840>;
>    <D.2839>:
>    iftmp.0 = 0;
>    <D.2840>:
>    D.2837 = (_Bool) iftmp.0;
>    return D.2837;
> }
> `
> However, the gimple file dumped on x86 is
> `
> comp (unsigned int x, unsigned int y)
> {
>    _Bool D.2837;
>
>    _1 = x > y;
>    _2 = x != 0;
>    _3 = _1 & _2;
>    _4 = (int) _3;
>    D.2837 = (_Bool) _4;
>    return D.2837;
> }
> `
>
> The reason for the inconsistency between these two behaviors is param
> logical-op-non-short-circuit.  If we add the pattern to the match.pd
> file, we can only optimize the situation in which the statement is in
> the same basic block (logical-op-non-short-circuit=1, x86).  But for
> a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
> can't handle this situation.
>
> Another reason is that I found out maybe_fold_and_comparisons and
> maybe_fold_or_comparisons are not only called by ifcombine pass but
> also by reassoc pass. Using this method can basically unify param
> logical-op-non-short-circuit=0 or 1.


As mentioned before ifcombine pass should be using gimple-match
instead of fold_build.  Try converting ifcombine over to gimple-match
infrastructure and add these to match.pd.
NOTE tree-ssa-phiopt should also be moved over to gimple-match but
that is a different issue.


Thanks,
Andrew Pinski

>
> Thanks,
> Lijia He
>
> >
> > jeff
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

Richard Biener
On Fri, 28 Jun 2019, Andrew Pinski wrote:

> On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <[hidden email]> wrote:
> >
> >
> >
> > On 2019/6/27 11:48 PM, Jeff Law wrote:
> > > On 6/27/19 12:11 AM, Li Jia He wrote:
> > >> Hi,
> > >>
> > >> According to the optimizable case described by Qi Feng on
> > >> issue 88784, we can combine the cases into the following:
> > >>
> > >> 1. x >  y  &&  x != XXX_MIN  -->   x > y
> > >> 2. x >  y  &&  x == XXX_MIN  -->   false
> > >> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
> > >>
> > >> 4. x <  y  &&  x != XXX_MAX  -->   x < y
> > >> 5. x <  y  &&  x == XXX_MAX  -->   false
> > >> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
> > >>
> > >> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
> > >> 8. x <= y  ||  x != XXX_MIN  -->   true
> > >> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
> > >>
> > >> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
> > >> 11. x >= y  ||  x != XXX_MAX  -->   true
> > >> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
> > >>
> > >> Note: XXX_MIN represents the minimum value of type x.
> > >>        XXX_MAX represents the maximum value of type x.
> > >>
> > >> Here we don't need to care about whether the operation is
> > >> signed or unsigned.  For example, in the below equation:
> > >>
> > >> 'x >  y  &&  x != XXX_MIN  -->   x > y'
> > >>
> > >> If the x type is signed int and XXX_MIN is INT_MIN, we can
> > >> optimize it to 'x > y'.  However, if the type of x is unsigned
> > >> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
> > >>
> > >> The regression testing for the patch was done on GCC mainline on
> > >>
> > >>      powerpc64le-unknown-linux-gnu (Power 9 LE)
> > >>
> > >> with no regressions.  Is it OK for trunk ?
> > >>
> > >> Thanks,
> > >> Lijia He
> > >>
> > >> gcc/ChangeLog
> > >>
> > >> 2019-06-27  Li Jia He  <[hidden email]>
> > >>          Qi Feng  <[hidden email]>
> > >>
> > >>      PR middle-end/88784
> > >>      * gimple-fold.c (and_comparisons_contain_equal_operands): New function.
> > >>      (and_comparisons_1): Use and_comparisons_contain_equal_operands.
> > >>      (or_comparisons_contain_equal_operands): New function.
> > >>      (or_comparisons_1): Use or_comparisons_contain_equal_operands.
> > > Would this be better done via match.pd?  ISTM this transformation would
> > > be well suited for that framework.
> >
> > Hi, Jeff
> >
> > I did this because of the following test case:
> > `
> > _Bool comp(unsigned x, unsigned y)
> > {
> >    return x > y && x != 0;
> > }
> > `
> > The gimple file dumped on the power platform is:
> > `
> > comp (unsigned int x, unsigned int y)
> > {
> >    _Bool D.2837;
> >    int iftmp.0;
> >
> >    if (x > y) goto <D.2841>; else goto <D.2839>;
> >    <D.2841>:
> >    if (x != 0) goto <D.2842>; else goto <D.2839>;
> >    <D.2842>:
> >    iftmp.0 = 1;
> >    goto <D.2840>;
> >    <D.2839>:
> >    iftmp.0 = 0;
> >    <D.2840>:
> >    D.2837 = (_Bool) iftmp.0;
> >    return D.2837;
> > }
> > `
> > However, the gimple file dumped on x86 is
> > `
> > comp (unsigned int x, unsigned int y)
> > {
> >    _Bool D.2837;
> >
> >    _1 = x > y;
> >    _2 = x != 0;
> >    _3 = _1 & _2;
> >    _4 = (int) _3;
> >    D.2837 = (_Bool) _4;
> >    return D.2837;
> > }
> > `
> >
> > The reason for the inconsistency between these two behaviors is param
> > logical-op-non-short-circuit.  If we add the pattern to the match.pd
> > file, we can only optimize the situation in which the statement is in
> > the same basic block (logical-op-non-short-circuit=1, x86).  But for
> > a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
> > can't handle this situation.
> >
> > Another reason is that I found out maybe_fold_and_comparisons and
> > maybe_fold_or_comparisons are not only called by ifcombine pass but
> > also by reassoc pass. Using this method can basically unify param
> > logical-op-non-short-circuit=0 or 1.
>
>
> As mentioned before ifcombine pass should be using gimple-match
> instead of fold_build.  Try converting ifcombine over to gimple-match
> infrastructure and add these to match.pd.

Yes, I mentioned that in the PR.  The issue is that at the moment
to combine x > y with x <= y you'd have to build GENERIC trees
for both or temporary GIMPLE assign with a SSA def (and then feed
that into the GENERIC or GIMPLE match.pd path).

maybe_fold_and/or_comparisons handle two exploded binary expressions
while the current match.pd entries handle at most one exploded one
(the outermost then, either AND or OR).  But it would be definitely
doable to auto-generate maybe_fold_and/or_comparisons from match.pd
patterns which is what I'd ultimatively suggest to do (in some more
generalized form maybe).  Either with a separate genmatch invocation
or as part of the --gimple processing (not sure what is more feasible
here).

I told Li Jia He that I don't expect him to do this work.

Note I didn't review the actual patch yet.

Thanks,
Richard.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

Li Jia He-2


On 2019/7/1 3:30 PM, Richard Biener wrote:

> On Fri, 28 Jun 2019, Andrew Pinski wrote:
>
>> On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <[hidden email]> wrote:
>>>
>>>
>>>
>>> On 2019/6/27 11:48 PM, Jeff Law wrote:
>>>> On 6/27/19 12:11 AM, Li Jia He wrote:
>>>>> Hi,
>>>>>
>>>>> According to the optimizable case described by Qi Feng on
>>>>> issue 88784, we can combine the cases into the following:
>>>>>
>>>>> 1. x >  y  &&  x != XXX_MIN  -->   x > y
>>>>> 2. x >  y  &&  x == XXX_MIN  -->   false
>>>>> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
>>>>>
>>>>> 4. x <  y  &&  x != XXX_MAX  -->   x < y
>>>>> 5. x <  y  &&  x == XXX_MAX  -->   false
>>>>> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
>>>>>
>>>>> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
>>>>> 8. x <= y  ||  x != XXX_MIN  -->   true
>>>>> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
>>>>>
>>>>> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
>>>>> 11. x >= y  ||  x != XXX_MAX  -->   true
>>>>> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
>>>>>
>>>>> Note: XXX_MIN represents the minimum value of type x.
>>>>>         XXX_MAX represents the maximum value of type x.
>>>>>
>>>>> Here we don't need to care about whether the operation is
>>>>> signed or unsigned.  For example, in the below equation:
>>>>>
>>>>> 'x >  y  &&  x != XXX_MIN  -->   x > y'
>>>>>
>>>>> If the x type is signed int and XXX_MIN is INT_MIN, we can
>>>>> optimize it to 'x > y'.  However, if the type of x is unsigned
>>>>> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
>>>>>
>>>>> The regression testing for the patch was done on GCC mainline on
>>>>>
>>>>>       powerpc64le-unknown-linux-gnu (Power 9 LE)
>>>>>
>>>>> with no regressions.  Is it OK for trunk ?
>>>>>
>>>>> Thanks,
>>>>> Lijia He
>>>>>
>>>>> gcc/ChangeLog
>>>>>
>>>>> 2019-06-27  Li Jia He  <[hidden email]>
>>>>>           Qi Feng  <[hidden email]>
>>>>>
>>>>>       PR middle-end/88784
>>>>>       * gimple-fold.c (and_comparisons_contain_equal_operands): New function.
>>>>>       (and_comparisons_1): Use and_comparisons_contain_equal_operands.
>>>>>       (or_comparisons_contain_equal_operands): New function.
>>>>>       (or_comparisons_1): Use or_comparisons_contain_equal_operands.
>>>> Would this be better done via match.pd?  ISTM this transformation would
>>>> be well suited for that framework.
>>>
>>> Hi, Jeff
>>>
>>> I did this because of the following test case:
>>> `
>>> _Bool comp(unsigned x, unsigned y)
>>> {
>>>     return x > y && x != 0;
>>> }
>>> `
>>> The gimple file dumped on the power platform is:
>>> `
>>> comp (unsigned int x, unsigned int y)
>>> {
>>>     _Bool D.2837;
>>>     int iftmp.0;
>>>
>>>     if (x > y) goto <D.2841>; else goto <D.2839>;
>>>     <D.2841>:
>>>     if (x != 0) goto <D.2842>; else goto <D.2839>;
>>>     <D.2842>:
>>>     iftmp.0 = 1;
>>>     goto <D.2840>;
>>>     <D.2839>:
>>>     iftmp.0 = 0;
>>>     <D.2840>:
>>>     D.2837 = (_Bool) iftmp.0;
>>>     return D.2837;
>>> }
>>> `
>>> However, the gimple file dumped on x86 is
>>> `
>>> comp (unsigned int x, unsigned int y)
>>> {
>>>     _Bool D.2837;
>>>
>>>     _1 = x > y;
>>>     _2 = x != 0;
>>>     _3 = _1 & _2;
>>>     _4 = (int) _3;
>>>     D.2837 = (_Bool) _4;
>>>     return D.2837;
>>> }
>>> `
>>>
>>> The reason for the inconsistency between these two behaviors is param
>>> logical-op-non-short-circuit.  If we add the pattern to the match.pd
>>> file, we can only optimize the situation in which the statement is in
>>> the same basic block (logical-op-non-short-circuit=1, x86).  But for
>>> a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
>>> can't handle this situation.
>>>
>>> Another reason is that I found out maybe_fold_and_comparisons and
>>> maybe_fold_or_comparisons are not only called by ifcombine pass but
>>> also by reassoc pass. Using this method can basically unify param
>>> logical-op-non-short-circuit=0 or 1.
>>
>>
>> As mentioned before ifcombine pass should be using gimple-match
>> instead of fold_build.  Try converting ifcombine over to gimple-match
>> infrastructure and add these to match.pd.
>
> Yes, I mentioned that in the PR.  The issue is that at the moment
> to combine x > y with x <= y you'd have to build GENERIC trees
> for both or temporary GIMPLE assign with a SSA def (and then feed
> that into the GENERIC or GIMPLE match.pd path).
Hi,

I did some experimentation using ‘temporary GIMPLE with a SSA def (and
then feed that into the GIMPLE match.pd path’.  Could we consider the
code in the attachment(I did a test and the code took effect)?

Thanks,
Lijia He

>
> maybe_fold_and/or_comparisons handle two exploded binary expressions
> while the current match.pd entries handle at most one exploded one
> (the outermost then, either AND or OR).  But it would be definitely
> doable to auto-generate maybe_fold_and/or_comparisons from match.pd
> patterns which is what I'd ultimatively suggest to do (in some more
> generalized form maybe).  Either with a separate genmatch invocation
> or as part of the --gimple processing (not sure what is more feasible
> here).
>
> I told Li Jia He that I don't expect him to do this work.
>
> Note I didn't review the actual patch yet.
>
> Thanks,
> Richard.
>

match.diff (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

Richard Biener
On Tue, 2 Jul 2019, Li Jia He wrote:

>
>
> On 2019/7/1 3:30 PM, Richard Biener wrote:
> > On Fri, 28 Jun 2019, Andrew Pinski wrote:
> >
> > > On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <[hidden email]> wrote:
> > > >
> > > >
> > > >
> > > > On 2019/6/27 11:48 PM, Jeff Law wrote:
> > > > > On 6/27/19 12:11 AM, Li Jia He wrote:
> > > > > > Hi,
> > > > > >
> > > > > > According to the optimizable case described by Qi Feng on
> > > > > > issue 88784, we can combine the cases into the following:
> > > > > >
> > > > > > 1. x >  y  &&  x != XXX_MIN  -->   x > y
> > > > > > 2. x >  y  &&  x == XXX_MIN  -->   false
> > > > > > 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
> > > > > >
> > > > > > 4. x <  y  &&  x != XXX_MAX  -->   x < y
> > > > > > 5. x <  y  &&  x == XXX_MAX  -->   false
> > > > > > 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
> > > > > >
> > > > > > 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
> > > > > > 8. x <= y  ||  x != XXX_MIN  -->   true
> > > > > > 9. x <= y  ||  x == XXX_MIN  -->   x <= y
> > > > > >
> > > > > > 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
> > > > > > 11. x >= y  ||  x != XXX_MAX  -->   true
> > > > > > 12. x >= y  ||  x == XXX_MAX  -->   x >= y
> > > > > >
> > > > > > Note: XXX_MIN represents the minimum value of type x.
> > > > > >         XXX_MAX represents the maximum value of type x.
> > > > > >
> > > > > > Here we don't need to care about whether the operation is
> > > > > > signed or unsigned.  For example, in the below equation:
> > > > > >
> > > > > > 'x >  y  &&  x != XXX_MIN  -->   x > y'
> > > > > >
> > > > > > If the x type is signed int and XXX_MIN is INT_MIN, we can
> > > > > > optimize it to 'x > y'.  However, if the type of x is unsigned
> > > > > > int and XXX_MIN is 0, we can still optimize it to 'x > y'.
> > > > > >
> > > > > > The regression testing for the patch was done on GCC mainline on
> > > > > >
> > > > > >       powerpc64le-unknown-linux-gnu (Power 9 LE)
> > > > > >
> > > > > > with no regressions.  Is it OK for trunk ?
> > > > > >
> > > > > > Thanks,
> > > > > > Lijia He
> > > > > >
> > > > > > gcc/ChangeLog
> > > > > >
> > > > > > 2019-06-27  Li Jia He  <[hidden email]>
> > > > > >           Qi Feng  <[hidden email]>
> > > > > >
> > > > > >       PR middle-end/88784
> > > > > >       * gimple-fold.c (and_comparisons_contain_equal_operands): New
> > > > > > function.
> > > > > >       (and_comparisons_1): Use
> > > > > > and_comparisons_contain_equal_operands.
> > > > > >       (or_comparisons_contain_equal_operands): New function.
> > > > > >       (or_comparisons_1): Use or_comparisons_contain_equal_operands.
> > > > > Would this be better done via match.pd?  ISTM this transformation
> > > > > would
> > > > > be well suited for that framework.
> > > >
> > > > Hi, Jeff
> > > >
> > > > I did this because of the following test case:
> > > > `
> > > > _Bool comp(unsigned x, unsigned y)
> > > > {
> > > >     return x > y && x != 0;
> > > > }
> > > > `
> > > > The gimple file dumped on the power platform is:
> > > > `
> > > > comp (unsigned int x, unsigned int y)
> > > > {
> > > >     _Bool D.2837;
> > > >     int iftmp.0;
> > > >
> > > >     if (x > y) goto <D.2841>; else goto <D.2839>;
> > > >     <D.2841>:
> > > >     if (x != 0) goto <D.2842>; else goto <D.2839>;
> > > >     <D.2842>:
> > > >     iftmp.0 = 1;
> > > >     goto <D.2840>;
> > > >     <D.2839>:
> > > >     iftmp.0 = 0;
> > > >     <D.2840>:
> > > >     D.2837 = (_Bool) iftmp.0;
> > > >     return D.2837;
> > > > }
> > > > `
> > > > However, the gimple file dumped on x86 is
> > > > `
> > > > comp (unsigned int x, unsigned int y)
> > > > {
> > > >     _Bool D.2837;
> > > >
> > > >     _1 = x > y;
> > > >     _2 = x != 0;
> > > >     _3 = _1 & _2;
> > > >     _4 = (int) _3;
> > > >     D.2837 = (_Bool) _4;
> > > >     return D.2837;
> > > > }
> > > > `
> > > >
> > > > The reason for the inconsistency between these two behaviors is param
> > > > logical-op-non-short-circuit.  If we add the pattern to the match.pd
> > > > file, we can only optimize the situation in which the statement is in
> > > > the same basic block (logical-op-non-short-circuit=1, x86).  But for
> > > > a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
> > > > can't handle this situation.
> > > >
> > > > Another reason is that I found out maybe_fold_and_comparisons and
> > > > maybe_fold_or_comparisons are not only called by ifcombine pass but
> > > > also by reassoc pass. Using this method can basically unify param
> > > > logical-op-non-short-circuit=0 or 1.
> > >
> > >
> > > As mentioned before ifcombine pass should be using gimple-match
> > > instead of fold_build.  Try converting ifcombine over to gimple-match
> > > infrastructure and add these to match.pd.
> >
> > Yes, I mentioned that in the PR.  The issue is that at the moment
> > to combine x > y with x <= y you'd have to build GENERIC trees
> > for both or temporary GIMPLE assign with a SSA def (and then feed
> > that into the GENERIC or GIMPLE match.pd path).
>
> Hi,
>
> I did some experimentation using ‘temporary GIMPLE with a SSA def (and then
> feed that into the GIMPLE match.pd path’.  Could we consider the code in the
> attachment(I did a test and the code took effect)?
No, that's excessive overhead IMHO - the whole point of these functions
originally was to avoid build2 (...) on both conditionals.  If we
now create two SSA names and two GIMPLE statements that's too much.

As said it shouldn't be rocket science to teach genmatch to
auto-generate those functions from match.pd patterns but it certainly
is some work (less so for somebody with genmatch knowledge).
I'll give it a poke...

Richard.

> Thanks,
> Lijia He
>
> >
> > maybe_fold_and/or_comparisons handle two exploded binary expressions
> > while the current match.pd entries handle at most one exploded one
> > (the outermost then, either AND or OR).  But it would be definitely
> > doable to auto-generate maybe_fold_and/or_comparisons from match.pd
> > patterns which is what I'd ultimatively suggest to do (in some more
> > generalized form maybe).  Either with a separate genmatch invocation
> > or as part of the --gimple processing (not sure what is more feasible
> > here).
> >
> > I told Li Jia He that I don't expect him to do this work.
> >
> > Note I didn't review the actual patch yet.
> >
> > Thanks,
> > Richard.
> >
>
--
Richard Biener <[hidden email]>
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

Richard Biener
On Tue, 2 Jul 2019, Richard Biener wrote:

> On Tue, 2 Jul 2019, Li Jia He wrote:
>
> >
> >
> > On 2019/7/1 3:30 PM, Richard Biener wrote:
> > > On Fri, 28 Jun 2019, Andrew Pinski wrote:
> > >
> > > > On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <[hidden email]> wrote:
> > > > >
> > > > >
> > > > >
> > > > > On 2019/6/27 11:48 PM, Jeff Law wrote:
> > > > > > On 6/27/19 12:11 AM, Li Jia He wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > According to the optimizable case described by Qi Feng on
> > > > > > > issue 88784, we can combine the cases into the following:
> > > > > > >
> > > > > > > 1. x >  y  &&  x != XXX_MIN  -->   x > y
> > > > > > > 2. x >  y  &&  x == XXX_MIN  -->   false
> > > > > > > 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
> > > > > > >
> > > > > > > 4. x <  y  &&  x != XXX_MAX  -->   x < y
> > > > > > > 5. x <  y  &&  x == XXX_MAX  -->   false
> > > > > > > 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
> > > > > > >
> > > > > > > 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
> > > > > > > 8. x <= y  ||  x != XXX_MIN  -->   true
> > > > > > > 9. x <= y  ||  x == XXX_MIN  -->   x <= y
> > > > > > >
> > > > > > > 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
> > > > > > > 11. x >= y  ||  x != XXX_MAX  -->   true
> > > > > > > 12. x >= y  ||  x == XXX_MAX  -->   x >= y
> > > > > > >
> > > > > > > Note: XXX_MIN represents the minimum value of type x.
> > > > > > >         XXX_MAX represents the maximum value of type x.
> > > > > > >
> > > > > > > Here we don't need to care about whether the operation is
> > > > > > > signed or unsigned.  For example, in the below equation:
> > > > > > >
> > > > > > > 'x >  y  &&  x != XXX_MIN  -->   x > y'
> > > > > > >
> > > > > > > If the x type is signed int and XXX_MIN is INT_MIN, we can
> > > > > > > optimize it to 'x > y'.  However, if the type of x is unsigned
> > > > > > > int and XXX_MIN is 0, we can still optimize it to 'x > y'.
> > > > > > >
> > > > > > > The regression testing for the patch was done on GCC mainline on
> > > > > > >
> > > > > > >       powerpc64le-unknown-linux-gnu (Power 9 LE)
> > > > > > >
> > > > > > > with no regressions.  Is it OK for trunk ?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Lijia He
> > > > > > >
> > > > > > > gcc/ChangeLog
> > > > > > >
> > > > > > > 2019-06-27  Li Jia He  <[hidden email]>
> > > > > > >           Qi Feng  <[hidden email]>
> > > > > > >
> > > > > > >       PR middle-end/88784
> > > > > > >       * gimple-fold.c (and_comparisons_contain_equal_operands): New
> > > > > > > function.
> > > > > > >       (and_comparisons_1): Use
> > > > > > > and_comparisons_contain_equal_operands.
> > > > > > >       (or_comparisons_contain_equal_operands): New function.
> > > > > > >       (or_comparisons_1): Use or_comparisons_contain_equal_operands.
> > > > > > Would this be better done via match.pd?  ISTM this transformation
> > > > > > would
> > > > > > be well suited for that framework.
> > > > >
> > > > > Hi, Jeff
> > > > >
> > > > > I did this because of the following test case:
> > > > > `
> > > > > _Bool comp(unsigned x, unsigned y)
> > > > > {
> > > > >     return x > y && x != 0;
> > > > > }
> > > > > `
> > > > > The gimple file dumped on the power platform is:
> > > > > `
> > > > > comp (unsigned int x, unsigned int y)
> > > > > {
> > > > >     _Bool D.2837;
> > > > >     int iftmp.0;
> > > > >
> > > > >     if (x > y) goto <D.2841>; else goto <D.2839>;
> > > > >     <D.2841>:
> > > > >     if (x != 0) goto <D.2842>; else goto <D.2839>;
> > > > >     <D.2842>:
> > > > >     iftmp.0 = 1;
> > > > >     goto <D.2840>;
> > > > >     <D.2839>:
> > > > >     iftmp.0 = 0;
> > > > >     <D.2840>:
> > > > >     D.2837 = (_Bool) iftmp.0;
> > > > >     return D.2837;
> > > > > }
> > > > > `
> > > > > However, the gimple file dumped on x86 is
> > > > > `
> > > > > comp (unsigned int x, unsigned int y)
> > > > > {
> > > > >     _Bool D.2837;
> > > > >
> > > > >     _1 = x > y;
> > > > >     _2 = x != 0;
> > > > >     _3 = _1 & _2;
> > > > >     _4 = (int) _3;
> > > > >     D.2837 = (_Bool) _4;
> > > > >     return D.2837;
> > > > > }
> > > > > `
> > > > >
> > > > > The reason for the inconsistency between these two behaviors is param
> > > > > logical-op-non-short-circuit.  If we add the pattern to the match.pd
> > > > > file, we can only optimize the situation in which the statement is in
> > > > > the same basic block (logical-op-non-short-circuit=1, x86).  But for
> > > > > a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
> > > > > can't handle this situation.
> > > > >
> > > > > Another reason is that I found out maybe_fold_and_comparisons and
> > > > > maybe_fold_or_comparisons are not only called by ifcombine pass but
> > > > > also by reassoc pass. Using this method can basically unify param
> > > > > logical-op-non-short-circuit=0 or 1.
> > > >
> > > >
> > > > As mentioned before ifcombine pass should be using gimple-match
> > > > instead of fold_build.  Try converting ifcombine over to gimple-match
> > > > infrastructure and add these to match.pd.
> > >
> > > Yes, I mentioned that in the PR.  The issue is that at the moment
> > > to combine x > y with x <= y you'd have to build GENERIC trees
> > > for both or temporary GIMPLE assign with a SSA def (and then feed
> > > that into the GENERIC or GIMPLE match.pd path).
> >
> > Hi,
> >
> > I did some experimentation using ‘temporary GIMPLE with a SSA def (and then
> > feed that into the GIMPLE match.pd path’.  Could we consider the code in the
> > attachment(I did a test and the code took effect)?
>
> No, that's excessive overhead IMHO - the whole point of these functions
> originally was to avoid build2 (...) on both conditionals.  If we
> now create two SSA names and two GIMPLE statements that's too much.
>
> As said it shouldn't be rocket science to teach genmatch to
> auto-generate those functions from match.pd patterns but it certainly
> is some work (less so for somebody with genmatch knowledge).
> I'll give it a poke...
OK, it's not so easy.  I guess we could lower the cost of building
SSA names / gimple stmts significantly if we allocated them on the
stack like via

gimple *stmt1 = XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN) + 2 *
sizeof (tree));
memset (stmt1, 0, ...);
gimple_set_code (stmt1, GIMPLE_ASSIGN);
gimple_set_num_ops (stmt1, 3);
gimple_init_singleton (stmt1);

gimple_stmt_iterator gsi = gsi_for_stmt (stmt1);
gimple_assign_set_rhs_with_ops (&gsi, actual operation...);

tree lhs1 = XALLOCA (tree_ssa_name);
memset (lhs1, 0, sizeof (tree_ssa_name));
TREE_SET_CODE (lhs1, SSA_NAME);
TREE_TYPE (lhs1) = ...;

gimple_assing_set_lhs (stmt1, lhs1);

it's all a bit ugly and some factoring in the generic gimple machinery
might easen this kind of hacks, but well...

With the above you could then use

  gimple_match_op op (gimple_match_cond::UNCOND, BIT_AND_EXPR /* or OR */,
                      boolean_type_node, lhs1, lhs2);
  if (gimple_resimplify2 (NULL, &op, follow_all_ssa_edges))
    ... successfully simplified into 'op' ...

with the simplification path then extracting the simplified result
from 'op'.  Note this deliberately leaves out passing a gimple_seq
as storage for more complex simplification results to match
existing behavior.

Your patch has the GC allocation and SSA name (and namespace!)
allocation overhead and most definitely misses releasing the
SSA names you allocated.

Note with the above hack the simplified result has to be checked
for mentions of lhs1 or lhs2 - a case we have to reject because
their definitions are transitional.

Richard.

> Richard.
>
> > Thanks,
> > Lijia He
> >
> > >
> > > maybe_fold_and/or_comparisons handle two exploded binary expressions
> > > while the current match.pd entries handle at most one exploded one
> > > (the outermost then, either AND or OR).  But it would be definitely
> > > doable to auto-generate maybe_fold_and/or_comparisons from match.pd
> > > patterns which is what I'd ultimatively suggest to do (in some more
> > > generalized form maybe).  Either with a separate genmatch invocation
> > > or as part of the --gimple processing (not sure what is more feasible
> > > here).
> > >
> > > I told Li Jia He that I don't expect him to do this work.
> > >
> > > Note I didn't review the actual patch yet.
> > >
> > > Thanks,
> > > Richard.
> > >
> >
>
>
--
Richard Biener <[hidden email]>
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

Li Jia He-2


On 2019/7/2 4:51 PM, Richard Biener wrote:

> On Tue, 2 Jul 2019, Richard Biener wrote:
>
>> On Tue, 2 Jul 2019, Li Jia He wrote:
>>
>>>
>>>
>>> On 2019/7/1 3:30 PM, Richard Biener wrote:
>>>> On Fri, 28 Jun 2019, Andrew Pinski wrote:
>>>>
>>>>> On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <[hidden email]> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2019/6/27 11:48 PM, Jeff Law wrote:
>>>>>>> On 6/27/19 12:11 AM, Li Jia He wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> According to the optimizable case described by Qi Feng on
>>>>>>>> issue 88784, we can combine the cases into the following:
>>>>>>>>
>>>>>>>> 1. x >  y  &&  x != XXX_MIN  -->   x > y
>>>>>>>> 2. x >  y  &&  x == XXX_MIN  -->   false
>>>>>>>> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
>>>>>>>>
>>>>>>>> 4. x <  y  &&  x != XXX_MAX  -->   x < y
>>>>>>>> 5. x <  y  &&  x == XXX_MAX  -->   false
>>>>>>>> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
>>>>>>>>
>>>>>>>> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
>>>>>>>> 8. x <= y  ||  x != XXX_MIN  -->   true
>>>>>>>> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
>>>>>>>>
>>>>>>>> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
>>>>>>>> 11. x >= y  ||  x != XXX_MAX  -->   true
>>>>>>>> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
>>>>>>>>
>>>>>>>> Note: XXX_MIN represents the minimum value of type x.
>>>>>>>>          XXX_MAX represents the maximum value of type x.
>>>>>>>>
>>>>>>>> Here we don't need to care about whether the operation is
>>>>>>>> signed or unsigned.  For example, in the below equation:
>>>>>>>>
>>>>>>>> 'x >  y  &&  x != XXX_MIN  -->   x > y'
>>>>>>>>
>>>>>>>> If the x type is signed int and XXX_MIN is INT_MIN, we can
>>>>>>>> optimize it to 'x > y'.  However, if the type of x is unsigned
>>>>>>>> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
>>>>>>>>
>>>>>>>> The regression testing for the patch was done on GCC mainline on
>>>>>>>>
>>>>>>>>        powerpc64le-unknown-linux-gnu (Power 9 LE)
>>>>>>>>
>>>>>>>> with no regressions.  Is it OK for trunk ?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Lijia He
>>>>>>>>
>>>>>>>> gcc/ChangeLog
>>>>>>>>
>>>>>>>> 2019-06-27  Li Jia He  <[hidden email]>
>>>>>>>>            Qi Feng  <[hidden email]>
>>>>>>>>
>>>>>>>>        PR middle-end/88784
>>>>>>>>        * gimple-fold.c (and_comparisons_contain_equal_operands): New
>>>>>>>> function.
>>>>>>>>        (and_comparisons_1): Use
>>>>>>>> and_comparisons_contain_equal_operands.
>>>>>>>>        (or_comparisons_contain_equal_operands): New function.
>>>>>>>>        (or_comparisons_1): Use or_comparisons_contain_equal_operands.
>>>>>>> Would this be better done via match.pd?  ISTM this transformation
>>>>>>> would
>>>>>>> be well suited for that framework.
>>>>>>
>>>>>> Hi, Jeff
>>>>>>
>>>>>> I did this because of the following test case:
>>>>>> `
>>>>>> _Bool comp(unsigned x, unsigned y)
>>>>>> {
>>>>>>      return x > y && x != 0;
>>>>>> }
>>>>>> `
>>>>>> The gimple file dumped on the power platform is:
>>>>>> `
>>>>>> comp (unsigned int x, unsigned int y)
>>>>>> {
>>>>>>      _Bool D.2837;
>>>>>>      int iftmp.0;
>>>>>>
>>>>>>      if (x > y) goto <D.2841>; else goto <D.2839>;
>>>>>>      <D.2841>:
>>>>>>      if (x != 0) goto <D.2842>; else goto <D.2839>;
>>>>>>      <D.2842>:
>>>>>>      iftmp.0 = 1;
>>>>>>      goto <D.2840>;
>>>>>>      <D.2839>:
>>>>>>      iftmp.0 = 0;
>>>>>>      <D.2840>:
>>>>>>      D.2837 = (_Bool) iftmp.0;
>>>>>>      return D.2837;
>>>>>> }
>>>>>> `
>>>>>> However, the gimple file dumped on x86 is
>>>>>> `
>>>>>> comp (unsigned int x, unsigned int y)
>>>>>> {
>>>>>>      _Bool D.2837;
>>>>>>
>>>>>>      _1 = x > y;
>>>>>>      _2 = x != 0;
>>>>>>      _3 = _1 & _2;
>>>>>>      _4 = (int) _3;
>>>>>>      D.2837 = (_Bool) _4;
>>>>>>      return D.2837;
>>>>>> }
>>>>>> `
>>>>>>
>>>>>> The reason for the inconsistency between these two behaviors is param
>>>>>> logical-op-non-short-circuit.  If we add the pattern to the match.pd
>>>>>> file, we can only optimize the situation in which the statement is in
>>>>>> the same basic block (logical-op-non-short-circuit=1, x86).  But for
>>>>>> a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
>>>>>> can't handle this situation.
>>>>>>
>>>>>> Another reason is that I found out maybe_fold_and_comparisons and
>>>>>> maybe_fold_or_comparisons are not only called by ifcombine pass but
>>>>>> also by reassoc pass. Using this method can basically unify param
>>>>>> logical-op-non-short-circuit=0 or 1.
>>>>>
>>>>>
>>>>> As mentioned before ifcombine pass should be using gimple-match
>>>>> instead of fold_build.  Try converting ifcombine over to gimple-match
>>>>> infrastructure and add these to match.pd.
>>>>
>>>> Yes, I mentioned that in the PR.  The issue is that at the moment
>>>> to combine x > y with x <= y you'd have to build GENERIC trees
>>>> for both or temporary GIMPLE assign with a SSA def (and then feed
>>>> that into the GENERIC or GIMPLE match.pd path).
>>>
>>> Hi,
>>>
>>> I did some experimentation using ‘temporary GIMPLE with a SSA def (and then
>>> feed that into the GIMPLE match.pd path’.  Could we consider the code in the
>>> attachment(I did a test and the code took effect)?
>>
>> No, that's excessive overhead IMHO - the whole point of these functions
>> originally was to avoid build2 (...) on both conditionals.  If we
>> now create two SSA names and two GIMPLE statements that's too much.
>>
>> As said it shouldn't be rocket science to teach genmatch to
>> auto-generate those functions from match.pd patterns but it certainly
>> is some work (less so for somebody with genmatch knowledge).
>> I'll give it a poke...
>
> OK, it's not so easy.  I guess we could lower the cost of building
> SSA names / gimple stmts significantly if we allocated them on the
> stack like via
>
> gimple *stmt1 = XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN) + 2 *
> sizeof (tree));
> memset (stmt1, 0, ...);
> gimple_set_code (stmt1, GIMPLE_ASSIGN);
> gimple_set_num_ops (stmt1, 3);
> gimple_init_singleton (stmt1);
>
> gimple_stmt_iterator gsi = gsi_for_stmt (stmt1);
> gimple_assign_set_rhs_with_ops (&gsi, actual operation...);
>
> tree lhs1 = XALLOCA (tree_ssa_name);
> memset (lhs1, 0, sizeof (tree_ssa_name));
> TREE_SET_CODE (lhs1, SSA_NAME);
> TREE_TYPE (lhs1) = ...;
>
> gimple_assing_set_lhs (stmt1, lhs1);
>
> it's all a bit ugly and some factoring in the generic gimple machinery
> might easen this kind of hacks, but well...
>
> With the above you could then use
>
>    gimple_match_op op (gimple_match_cond::UNCOND, BIT_AND_EXPR /* or OR */,
>      boolean_type_node, lhs1, lhs2);
>    if (gimple_resimplify2 (NULL, &op, follow_all_ssa_edges))
>      ... successfully simplified into 'op' ...
>
> with the simplification path then extracting the simplified result
> from 'op'.  Note this deliberately leaves out passing a gimple_seq
> as storage for more complex simplification results to match
> existing behavior.
>
> Your patch has the GC allocation and SSA name (and namespace!)
> allocation overhead and most definitely misses releasing the
> SSA names you allocated.
>
> Note with the above hack the simplified result has to be checked
> for mentions of lhs1 or lhs2 - a case we have to reject because
> their definitions are transitional.
>
Hi,

   I made some changes based on the recommendations. Would you like to
   help me to see it again ? Sorry, it took so long time to provide the
   patch.

   Note: 1. I keep the code for and_comparisons_1 and or_comparisons_1.
        The reason is that I did not found a good way to handle the
        optimization of '((x CODE1 y) AND (x CODE2 y))' in match.pd.
        Maybe I missing some important information about match.pd.
        2. The gimple_resimplify2 function is not used.  Since stmt1,
        stmt2, lhs1 and lhs2 are allocated on the stack, Is there a
        question with the value on the stack as the return value ?
        I may have misunderstood Richard's intention.

Thanks,
Lijia He

> Richard.
>
>> Richard.
>>
>>> Thanks,
>>> Lijia He
>>>
>>>>
>>>> maybe_fold_and/or_comparisons handle two exploded binary expressions
>>>> while the current match.pd entries handle at most one exploded one
>>>> (the outermost then, either AND or OR).  But it would be definitely
>>>> doable to auto-generate maybe_fold_and/or_comparisons from match.pd
>>>> patterns which is what I'd ultimatively suggest to do (in some more
>>>> generalized form maybe).  Either with a separate genmatch invocation
>>>> or as part of the --gimple processing (not sure what is more feasible
>>>> here).
>>>>
>>>> I told Li Jia He that I don't expect him to do this work.
>>>>
>>>> Note I didn't review the actual patch yet.
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>
>>
>>
>

0001-Auto-generate-maybe_fold_and-or_comparisons-from-mat.patch (10K) Download Attachment
0002-Fix-PR88784-middle-end-is-missing-some-optimizations.patch (26K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

Martin Liška-2
@Richi: PING^1

On 7/16/19 8:35 AM, Li Jia He wrote:

>
>
> On 2019/7/2 4:51 PM, Richard Biener wrote:
>> On Tue, 2 Jul 2019, Richard Biener wrote:
>>
>>> On Tue, 2 Jul 2019, Li Jia He wrote:
>>>
>>>>
>>>>
>>>> On 2019/7/1 3:30 PM, Richard Biener wrote:
>>>>> On Fri, 28 Jun 2019, Andrew Pinski wrote:
>>>>>
>>>>>> On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <[hidden email]> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2019/6/27 11:48 PM, Jeff Law wrote:
>>>>>>>> On 6/27/19 12:11 AM, Li Jia He wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> According to the optimizable case described by Qi Feng on
>>>>>>>>> issue 88784, we can combine the cases into the following:
>>>>>>>>>
>>>>>>>>> 1. x >  y  &&  x != XXX_MIN  -->   x > y
>>>>>>>>> 2. x >  y  &&  x == XXX_MIN  -->   false
>>>>>>>>> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
>>>>>>>>>
>>>>>>>>> 4. x <  y  &&  x != XXX_MAX  -->   x < y
>>>>>>>>> 5. x <  y  &&  x == XXX_MAX  -->   false
>>>>>>>>> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
>>>>>>>>>
>>>>>>>>> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
>>>>>>>>> 8. x <= y  ||  x != XXX_MIN  -->   true
>>>>>>>>> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
>>>>>>>>>
>>>>>>>>> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
>>>>>>>>> 11. x >= y  ||  x != XXX_MAX  -->   true
>>>>>>>>> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
>>>>>>>>>
>>>>>>>>> Note: XXX_MIN represents the minimum value of type x.
>>>>>>>>>          XXX_MAX represents the maximum value of type x.
>>>>>>>>>
>>>>>>>>> Here we don't need to care about whether the operation is
>>>>>>>>> signed or unsigned.  For example, in the below equation:
>>>>>>>>>
>>>>>>>>> 'x >  y  &&  x != XXX_MIN  -->   x > y'
>>>>>>>>>
>>>>>>>>> If the x type is signed int and XXX_MIN is INT_MIN, we can
>>>>>>>>> optimize it to 'x > y'.  However, if the type of x is unsigned
>>>>>>>>> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
>>>>>>>>>
>>>>>>>>> The regression testing for the patch was done on GCC mainline on
>>>>>>>>>
>>>>>>>>>        powerpc64le-unknown-linux-gnu (Power 9 LE)
>>>>>>>>>
>>>>>>>>> with no regressions.  Is it OK for trunk ?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Lijia He
>>>>>>>>>
>>>>>>>>> gcc/ChangeLog
>>>>>>>>>
>>>>>>>>> 2019-06-27  Li Jia He  <[hidden email]>
>>>>>>>>>            Qi Feng  <[hidden email]>
>>>>>>>>>
>>>>>>>>>        PR middle-end/88784
>>>>>>>>>        * gimple-fold.c (and_comparisons_contain_equal_operands): New
>>>>>>>>> function.
>>>>>>>>>        (and_comparisons_1): Use
>>>>>>>>> and_comparisons_contain_equal_operands.
>>>>>>>>>        (or_comparisons_contain_equal_operands): New function.
>>>>>>>>>        (or_comparisons_1): Use or_comparisons_contain_equal_operands.
>>>>>>>> Would this be better done via match.pd?  ISTM this transformation
>>>>>>>> would
>>>>>>>> be well suited for that framework.
>>>>>>>
>>>>>>> Hi, Jeff
>>>>>>>
>>>>>>> I did this because of the following test case:
>>>>>>> `
>>>>>>> _Bool comp(unsigned x, unsigned y)
>>>>>>> {
>>>>>>>      return x > y && x != 0;
>>>>>>> }
>>>>>>> `
>>>>>>> The gimple file dumped on the power platform is:
>>>>>>> `
>>>>>>> comp (unsigned int x, unsigned int y)
>>>>>>> {
>>>>>>>      _Bool D.2837;
>>>>>>>      int iftmp.0;
>>>>>>>
>>>>>>>      if (x > y) goto <D.2841>; else goto <D.2839>;
>>>>>>>      <D.2841>:
>>>>>>>      if (x != 0) goto <D.2842>; else goto <D.2839>;
>>>>>>>      <D.2842>:
>>>>>>>      iftmp.0 = 1;
>>>>>>>      goto <D.2840>;
>>>>>>>      <D.2839>:
>>>>>>>      iftmp.0 = 0;
>>>>>>>      <D.2840>:
>>>>>>>      D.2837 = (_Bool) iftmp.0;
>>>>>>>      return D.2837;
>>>>>>> }
>>>>>>> `
>>>>>>> However, the gimple file dumped on x86 is
>>>>>>> `
>>>>>>> comp (unsigned int x, unsigned int y)
>>>>>>> {
>>>>>>>      _Bool D.2837;
>>>>>>>
>>>>>>>      _1 = x > y;
>>>>>>>      _2 = x != 0;
>>>>>>>      _3 = _1 & _2;
>>>>>>>      _4 = (int) _3;
>>>>>>>      D.2837 = (_Bool) _4;
>>>>>>>      return D.2837;
>>>>>>> }
>>>>>>> `
>>>>>>>
>>>>>>> The reason for the inconsistency between these two behaviors is param
>>>>>>> logical-op-non-short-circuit.  If we add the pattern to the match.pd
>>>>>>> file, we can only optimize the situation in which the statement is in
>>>>>>> the same basic block (logical-op-non-short-circuit=1, x86).  But for
>>>>>>> a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
>>>>>>> can't handle this situation.
>>>>>>>
>>>>>>> Another reason is that I found out maybe_fold_and_comparisons and
>>>>>>> maybe_fold_or_comparisons are not only called by ifcombine pass but
>>>>>>> also by reassoc pass. Using this method can basically unify param
>>>>>>> logical-op-non-short-circuit=0 or 1.
>>>>>>
>>>>>>
>>>>>> As mentioned before ifcombine pass should be using gimple-match
>>>>>> instead of fold_build.  Try converting ifcombine over to gimple-match
>>>>>> infrastructure and add these to match.pd.
>>>>>
>>>>> Yes, I mentioned that in the PR.  The issue is that at the moment
>>>>> to combine x > y with x <= y you'd have to build GENERIC trees
>>>>> for both or temporary GIMPLE assign with a SSA def (and then feed
>>>>> that into the GENERIC or GIMPLE match.pd path).
>>>>
>>>> Hi,
>>>>
>>>> I did some experimentation using ‘temporary GIMPLE with a SSA def (and then
>>>> feed that into the GIMPLE match.pd path’.  Could we consider the code in the
>>>> attachment(I did a test and the code took effect)?
>>>
>>> No, that's excessive overhead IMHO - the whole point of these functions
>>> originally was to avoid build2 (...) on both conditionals.  If we
>>> now create two SSA names and two GIMPLE statements that's too much.
>>>
>>> As said it shouldn't be rocket science to teach genmatch to
>>> auto-generate those functions from match.pd patterns but it certainly
>>> is some work (less so for somebody with genmatch knowledge).
>>> I'll give it a poke...
>>
>> OK, it's not so easy.  I guess we could lower the cost of building
>> SSA names / gimple stmts significantly if we allocated them on the
>> stack like via
>>
>> gimple *stmt1 = XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN) + 2 *
>> sizeof (tree));
>> memset (stmt1, 0, ...);
>> gimple_set_code (stmt1, GIMPLE_ASSIGN);
>> gimple_set_num_ops (stmt1, 3);
>> gimple_init_singleton (stmt1);
>>
>> gimple_stmt_iterator gsi = gsi_for_stmt (stmt1);
>> gimple_assign_set_rhs_with_ops (&gsi, actual operation...);
>>
>> tree lhs1 = XALLOCA (tree_ssa_name);
>> memset (lhs1, 0, sizeof (tree_ssa_name));
>> TREE_SET_CODE (lhs1, SSA_NAME);
>> TREE_TYPE (lhs1) = ...;
>>
>> gimple_assing_set_lhs (stmt1, lhs1);
>>
>> it's all a bit ugly and some factoring in the generic gimple machinery
>> might easen this kind of hacks, but well...
>>
>> With the above you could then use
>>
>>    gimple_match_op op (gimple_match_cond::UNCOND, BIT_AND_EXPR /* or OR */,
>>               boolean_type_node, lhs1, lhs2);
>>    if (gimple_resimplify2 (NULL, &op, follow_all_ssa_edges))
>>      ... successfully simplified into 'op' ...
>>
>> with the simplification path then extracting the simplified result
>> from 'op'.  Note this deliberately leaves out passing a gimple_seq
>> as storage for more complex simplification results to match
>> existing behavior.
>>
>> Your patch has the GC allocation and SSA name (and namespace!)
>> allocation overhead and most definitely misses releasing the
>> SSA names you allocated.
>>
>> Note with the above hack the simplified result has to be checked
>> for mentions of lhs1 or lhs2 - a case we have to reject because
>> their definitions are transitional.
>>
>
> Hi,
>
>   I made some changes based on the recommendations. Would you like to
>   help me to see it again ? Sorry, it took so long time to provide the
>   patch.
>
>   Note: 1. I keep the code for and_comparisons_1 and or_comparisons_1.
>     The reason is that I did not found a good way to handle the
>     optimization of '((x CODE1 y) AND (x CODE2 y))' in match.pd.
>     Maybe I missing some important information about match.pd.
>     2. The gimple_resimplify2 function is not used.  Since stmt1,
>     stmt2, lhs1 and lhs2 are allocated on the stack, Is there a
>     question with the value on the stack as the return value ?
>     I may have misunderstood Richard's intention.
>
> Thanks,
> Lijia He
>
>> Richard.
>>
>>> Richard.
>>>
>>>> Thanks,
>>>> Lijia He
>>>>
>>>>>
>>>>> maybe_fold_and/or_comparisons handle two exploded binary expressions
>>>>> while the current match.pd entries handle at most one exploded one
>>>>> (the outermost then, either AND or OR).  But it would be definitely
>>>>> doable to auto-generate maybe_fold_and/or_comparisons from match.pd
>>>>> patterns which is what I'd ultimatively suggest to do (in some more
>>>>> generalized form maybe).  Either with a separate genmatch invocation
>>>>> or as part of the --gimple processing (not sure what is more feasible
>>>>> here).
>>>>>
>>>>> I told Li Jia He that I don't expect him to do this work.
>>>>>
>>>>> Note I didn't review the actual patch yet.
>>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>
>>>
>>>
>>

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

Richard Biener
In reply to this post by Li Jia He-2
On Tue, 16 Jul 2019, Li Jia He wrote:

> Hi,
>
>   I made some changes based on the recommendations. Would you like to
>   help me to see it again ? Sorry, it took so long time to provide the
>   patch.
>
>   Note: 1. I keep the code for and_comparisons_1 and or_comparisons_1.
> The reason is that I did not found a good way to handle the
> optimization of '((x CODE1 y) AND (x CODE2 y))' in match.pd.
> Maybe I missing some important information about match.pd.
> 2. The gimple_resimplify2 function is not used.  Since stmt1,
> stmt2, lhs1 and lhs2 are allocated on the stack, Is there a
> question with the value on the stack as the return value ?
> I may have misunderstood Richard's intention.

Sorry for the delay in reviewing.

Rather than exporting gimple_set_code and gimple_size I'd split
out a

void
gimple_init (gimple *, enum gimple_code code, unsigned nops);

from gimple_alloc (changing that to GC allocate not cleared
memory) doing all of the actual initialization.  Then the
allocation would go like

gimple *stmt1 = (gimple *)XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN,
3));
gimple_init (stmt1, GIMPLE_ASSIGN, 3);

with an overload for gimple_size to account for # of ops.

+  /* Allocate SSA names(lhs1) on the stack.  */
+  tree lhs1 = XALLOCA (tree_node);

you can use (tree) XALLOCA (tree_ssa_name) here

+  /* Call the interface function of match.pd to simplify the expression.  
*/
+  tree t = gimple_simplify (code, boolean_type_node, gimple_assign_lhs
(stmt1),
+                           gimple_assign_lhs (stmt2), NULL,
+                           follow_all_ssa_edges);
+
+  if (t)

As I told Martin offline the appropriate function to use is

 You'd do

  gimple_match_op op (gimple_match_cond::UNCOND, code,
         boolean_type_node, gimple_assign_lhs (stmt1),
gimple_assign_lhs (stmt2));
  if (op->resimplify (NULL, follow_all_ssa_edges))
   {
      if (gimple_simplified_result_is_gimple_val (res_op))
        .. got a constant or SSA name ..
      else if (res_op->code.is_tree_code ()
                 && TREE_CODE_CLASS ((tree_code)res_op->code)) ==
tcc_comparison)
        ... got a comparison res_op->op[0] res_op->code res_op->op[1] ...

so you get the outermost expression back decomposed.

Otherwise with you passing NULL as the gimple_seq you'll miss quite
some simplification cases.

You also have to watch out for the result containing the LHS of one
of your temporary stmts.

+  if (tree t = maybe_fold_comparisons_from_match_pd (BIT_AND_EXPR, code1,
op1a,
+                                                    op1b, code2, op2a,
op2b))
+    return t;
+
+  if (tree t = maybe_fold_comparisons_from_match_pd (BIT_AND_EXPR, code2,
op2a,
+                                                    op2b, code1, op1a,
op1b))
+    return t;

with match.pd rules you shouldn't need to call this twice, once with
swapped operands.

Otherwise this first patch looks like what I'd have done and we
can build upon it.

Not sure if you or Martin wants to improve it according to my
comments.

Thanks,
Richard.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

Richard Biener
In reply to this post by Li Jia He-2
On Tue, 16 Jul 2019, Li Jia He wrote:

> Hi,
>
>   I made some changes based on the recommendations. Would you like to
>   help me to see it again ? Sorry, it took so long time to provide the
>   patch.
>
>   Note: 1. I keep the code for and_comparisons_1 and or_comparisons_1.
> The reason is that I did not found a good way to handle the
> optimization of '((x CODE1 y) AND (x CODE2 y))' in match.pd.
> Maybe I missing some important information about match.pd.
> 2. The gimple_resimplify2 function is not used.  Since stmt1,
> stmt2, lhs1 and lhs2 are allocated on the stack, Is there a
> question with the value on the stack as the return value ?
> I may have misunderstood Richard's intention.

And now for the match.pd patch.

+/* x >  y  &&  x != XXX_MIN  -->  x > y  */
+(for and (truth_and bit_and)
+ (simplify
+  (and:c (gt:c@3 @0 @1) (ne @0 INTEGER_CST@2))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P
(TREE_TYPE(@1))
+       && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2)))))
+    @3)))
+
+/* x >  y  &&  x == XXX_MIN  -->  false  */
+(for and (truth_and bit_and)
+ (simplify
+  (and:c (gt:c @0 @1) (eq @0 INTEGER_CST@2))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P
(TREE_TYPE(@1))
+       && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2)))))
+    { boolean_false_node; })))

you could merge those two via

 (for eqne (eq ne)
  (for and (....
   (simplify
    (and:c (gt:c @0 @1) (eqne @0 INTEGER_CST@2))
    (if (...)
     (switch
      (if (eqne == NE_EXPR)
       @3)
      (if (eqne == EQ_EXPR)
       { constant_boolean_node (false, type); }))))

notice using constant_boolean_node (false, type); instead of
boolean_false_node.  I suspect more unification is possible.

Also you could do

(match min_value
 INTEGER_CST
 (if (INTEGRAL_TYPE_P (type)
      && wi::eq_p (wi::to_wide (t), wi::min_value (type)))))

and then write

 (simplify
  (and:c (gt:c @0 @1) (eq @0 min_value))
  (...

Your

(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P
(TREE_TYPE(@1))

is redundant, it's enough to check either @0 or @1 given they
have to be compatible for the gt operation.  Note you probably
want to use

  (and:c (gt:c @0 @1) (eq @@0 min_value))

and verify that types_match (@1, @0) because when @0 are a constant
(and (eq @0 min_value) is not folded which can happen) then they
might have different types and thus you could have
(SHORT_MAX > intvar) && (SHORT_MAX == SHORT_MAX)

That said, the patterns can be quite a bit simplified I think.

Richard.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

Martin Liška-2
In reply to this post by Richard Biener
On 9/5/19 3:01 PM, Richard Biener wrote:
> Not sure if you or Martin wants to improve it according to my
> comments.

Yes please. I'm working on that based on the review you provided.

Thanks,
Martin
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

Martin Liška-2
In reply to this post by Richard Biener
On 9/5/19 3:17 PM, Richard Biener wrote:
> That said, the patterns can be quite a bit simplified I think.

I will take care of it as well.

Martin
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

Richard Biener
In reply to this post by Martin Liška-2
On Thu, 5 Sep 2019, Martin Liška wrote:

> On 9/5/19 3:01 PM, Richard Biener wrote:
> > Not sure if you or Martin wants to improve it according to my
> > comments.
>
> Yes please. I'm working on that based on the review you provided.

Oh, and it just occured to me since we're doing single_use checks
on SSA names in match.pd that you need to initialize the
SSA_NAME_IMM_USE_NODE () as well - see make_ssa_name_fn, probably
makes sense to split out a short helper for that.

Richard.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

Martin Liška-2
On 9/6/19 10:01 AM, Richard Biener wrote:

> On Thu, 5 Sep 2019, Martin Liška wrote:
>
>> On 9/5/19 3:01 PM, Richard Biener wrote:
>>> Not sure if you or Martin wants to improve it according to my
>>> comments.
>>
>> Yes please. I'm working on that based on the review you provided.
>
> Oh, and it just occured to me since we're doing single_use checks
> on SSA names in match.pd that you need to initialize the
> SSA_NAME_IMM_USE_NODE () as well - see make_ssa_name_fn, probably
> makes sense to split out a short helper for that.

Yes, I already hit the ICE :)

Martin

>
> Richard.
>

Reply | Threaded
Open this post in threaded view
|

[PATCH 1/2] Auto-generate maybe_fold_and/or_comparisons from match.pd

Martin Liška-2
In reply to this post by Richard Biener
On 9/5/19 3:01 PM, Richard Biener wrote:

> On Tue, 16 Jul 2019, Li Jia He wrote:
>
>> Hi,
>>
>>   I made some changes based on the recommendations. Would you like to
>>   help me to see it again ? Sorry, it took so long time to provide the
>>   patch.
>>
>>   Note: 1. I keep the code for and_comparisons_1 and or_comparisons_1.
>> The reason is that I did not found a good way to handle the
>> optimization of '((x CODE1 y) AND (x CODE2 y))' in match.pd.
>> Maybe I missing some important information about match.pd.
>> 2. The gimple_resimplify2 function is not used.  Since stmt1,
>> stmt2, lhs1 and lhs2 are allocated on the stack, Is there a
>> question with the value on the stack as the return value ?
>> I may have misunderstood Richard's intention.
>
> Sorry for the delay in reviewing.
>
> Rather than exporting gimple_set_code and gimple_size I'd split
> out a
>
> void
> gimple_init (gimple *, enum gimple_code code, unsigned nops);
>
> from gimple_alloc (changing that to GC allocate not cleared
> memory) doing all of the actual initialization.  Then the
> allocation would go like
>
> gimple *stmt1 = (gimple *)XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN,
> 3));
> gimple_init (stmt1, GIMPLE_ASSIGN, 3);
>
> with an overload for gimple_size to account for # of ops.
>
> +  /* Allocate SSA names(lhs1) on the stack.  */
> +  tree lhs1 = XALLOCA (tree_node);
>
> you can use (tree) XALLOCA (tree_ssa_name) here
>
> +  /* Call the interface function of match.pd to simplify the expression.  
> */
> +  tree t = gimple_simplify (code, boolean_type_node, gimple_assign_lhs
> (stmt1),
> +                           gimple_assign_lhs (stmt2), NULL,
> +                           follow_all_ssa_edges);
> +
> +  if (t)
>
> As I told Martin offline the appropriate function to use is
>
>  You'd do
>
>   gimple_match_op op (gimple_match_cond::UNCOND, code,
>          boolean_type_node, gimple_assign_lhs (stmt1),
> gimple_assign_lhs (stmt2));
>   if (op->resimplify (NULL, follow_all_ssa_edges))
>    {
>       if (gimple_simplified_result_is_gimple_val (res_op))
>         .. got a constant or SSA name ..
>       else if (res_op->code.is_tree_code ()
>                  && TREE_CODE_CLASS ((tree_code)res_op->code)) ==
> tcc_comparison)
>         ... got a comparison res_op->op[0] res_op->code res_op->op[1] ...
>
> so you get the outermost expression back decomposed.
>
> Otherwise with you passing NULL as the gimple_seq you'll miss quite
> some simplification cases.
>
> You also have to watch out for the result containing the LHS of one
> of your temporary stmts.
>
> +  if (tree t = maybe_fold_comparisons_from_match_pd (BIT_AND_EXPR, code1,
> op1a,
> +                                                    op1b, code2, op2a,
> op2b))
> +    return t;
> +
> +  if (tree t = maybe_fold_comparisons_from_match_pd (BIT_AND_EXPR, code2,
> op2a,
> +                                                    op2b, code1, op1a,
> op1b))
> +    return t;
>
> with match.pd rules you shouldn't need to call this twice, once with
> swapped operands.
>
> Otherwise this first patch looks like what I'd have done and we
> can build upon it.
>
> Not sure if you or Martin wants to improve it according to my
> comments.
>
> Thanks,
> Richard.
>
I'm sending patch that addresses the feedback from Richard.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

0001-Auto-generate-maybe_fold_and-or_comparisons-from-mat.patch (10K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

[PATCH 2/2] Fix PR88784, middle end is missing some optimizations about unsigned

Martin Liška-2
In reply to this post by Richard Biener
On 9/5/19 3:17 PM, Richard Biener wrote:

> On Tue, 16 Jul 2019, Li Jia He wrote:
>
>> Hi,
>>
>>   I made some changes based on the recommendations. Would you like to
>>   help me to see it again ? Sorry, it took so long time to provide the
>>   patch.
>>
>>   Note: 1. I keep the code for and_comparisons_1 and or_comparisons_1.
>> The reason is that I did not found a good way to handle the
>> optimization of '((x CODE1 y) AND (x CODE2 y))' in match.pd.
>> Maybe I missing some important information about match.pd.
>> 2. The gimple_resimplify2 function is not used.  Since stmt1,
>> stmt2, lhs1 and lhs2 are allocated on the stack, Is there a
>> question with the value on the stack as the return value ?
>> I may have misunderstood Richard's intention.
>
> And now for the match.pd patch.
>
> +/* x >  y  &&  x != XXX_MIN  -->  x > y  */
> +(for and (truth_and bit_and)
> + (simplify
> +  (and:c (gt:c@3 @0 @1) (ne @0 INTEGER_CST@2))
> +  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P
> (TREE_TYPE(@1))
> +       && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2)))))
> +    @3)))
> +
> +/* x >  y  &&  x == XXX_MIN  -->  false  */
> +(for and (truth_and bit_and)
> + (simplify
> +  (and:c (gt:c @0 @1) (eq @0 INTEGER_CST@2))
> +  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P
> (TREE_TYPE(@1))
> +       && (wi::eq_p (wi::to_wide (@2), wi::min_value (TREE_TYPE (@2)))))
> +    { boolean_false_node; })))
>
> you could merge those two via
>
>  (for eqne (eq ne)
>   (for and (....
>    (simplify
>     (and:c (gt:c @0 @1) (eqne @0 INTEGER_CST@2))
>     (if (...)
>      (switch
>       (if (eqne == NE_EXPR)
>        @3)
>       (if (eqne == EQ_EXPR)
>        { constant_boolean_node (false, type); }))))
>
> notice using constant_boolean_node (false, type); instead of
> boolean_false_node.  I suspect more unification is possible.
>
> Also you could do
>
> (match min_value
>  INTEGER_CST
>  (if (INTEGRAL_TYPE_P (type)
>       && wi::eq_p (wi::to_wide (t), wi::min_value (type)))))
>
> and then write
>
>  (simplify
>   (and:c (gt:c @0 @1) (eq @0 min_value))
>   (...
>
> Your
>
> (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P
> (TREE_TYPE(@1))
>
> is redundant, it's enough to check either @0 or @1 given they
> have to be compatible for the gt operation.  Note you probably
> want to use
>
>   (and:c (gt:c @0 @1) (eq @@0 min_value))
>
> and verify that types_match (@1, @0) because when @0 are a constant
> (and (eq @0 min_value) is not folded which can happen) then they
> might have different types and thus you could have
> (SHORT_MAX > intvar) && (SHORT_MAX == SHORT_MAX)
>
> That said, the patterns can be quite a bit simplified I think.
>
> Richard.
>
Likewise, I applied the suggested simplification.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

0002-Fix-PR88784-middle-end-is-missing-some-optimizations.patch (19K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 1/2] Auto-generate maybe_fold_and/or_comparisons from match.pd

Martin Liška-2
In reply to this post by Martin Liška-2
Hi.

I'm sending slightly updated version of the patch where we
need to properly select type in maybe_fold_comparisons_from_match_pd
function for the created SSA_NAMEs. We can be called for a VECTOR_TYPE
and so that we can't return a boolean_type_node.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

0001-Auto-generate-maybe_fold_and-or_comparisons-from-mat.patch (10K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

[PATCH 3/5] Rewrite part of and_comparisons_1 into match.pd.

Martin Liška-2
In reply to this post by Martin Liška-2
Hi.

The patch is about transition of and_comparisons_1 matching
into match.pd.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

0003-Rewrite-part-of-and_comparisons_1-into-match.pd.patch (25K) Download Attachment
123