[Bug rtl-optimization/87600] New: Fix for PRs 86939 and 87479 causes build issues for several targets

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/87600] New: Fix for PRs 86939 and 87479 causes build issues for several targets

segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87600

            Bug ID: 87600
           Summary: Fix for PRs 86939 and 87479 causes build issues for
                    several targets
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bergner at gcc dot gnu.org
  Target Milestone: ---

The IRA/LRA fix (revision 264897) for PR86939 and PR87479 has caused build
issues on several targets.  This bugzilla entry will be used to track the
change(s) required to get the targets building again.

The affected targets seem to be: aarch64, alpha, arm, hppa, s390x and sh4.
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/87600] Fix for PRs 86939 and 87479 causes build issues for several targets

segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87600

Peter Bergner <bergner at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2018-10-12
                 CC|                            |dje at gcc dot gnu.org,
                   |                            |law at gcc dot gnu.org,
                   |                            |segher at gcc dot gnu.org,
                   |                            |vmakarov at gcc dot gnu.org,
                   |                            |wschmidt at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org      |bergner at gcc dot gnu.org
   Target Milestone|---                         |9.0
     Ever confirmed|0                           |1
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/87600] Fix for PRs 86939 and 87479 causes build issues for several targets

segher at gcc dot gnu.org
In reply to this post by segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87600

--- Comment #1 from Peter Bergner <bergner at gcc dot gnu.org> ---
Pasting some edited commentary from the gcc-patches mailing list:

On 10/8/18 9:52 AM, Jeff Law wrote:
> My tester is showing a variety of problems as well.  hppa, sh4, aarch64,
> aarch64_be, alpha arm* and s390x bootstraps are all failing at various
> points.   Others may trickle in over time, but clearly something went
> wrong recently.  I'm going to start trying to pick them off after the
> morning meetings are wrapped up.

I looked into the PA-RISC test case issue Jeff sent me that is caused by my
patch that handles conflicts wrt reg copies and I _think_ I have a handle
on what the problem is, but don't yet know how to fix.  Jeff, I agree with
the analysis you gave in your email to me, but to get Vlad up to speed, here
it is again along with some additional info.

Compiling the test case, we have the following RTL during IRA.  I have also
annotated the pseudos in the RTL to include their hard reg assignment:

(insn 30 19 32 3 (set (reg/f:SI 112 "%r19") ....))
(insn 32 30 20 3 (set (reg:SI 26 %r26)
                      (reg/f:SI 101 "%r26")))
[snip]
(insn 23 22 49 3 (set (reg:SI 109 "%r28")
                 (mem:SI (reg/f:SI 101 "%r26"))))
(insn 49 23 31 3 (set (reg/f:SI 100 "%r28")
        (if_then_else:SI (eq (reg:SI 109 "%r28") (const_int 15 [0xf]))
            (reg/f:SI 101 "%r26")
            (const_int 0 [0])))
     (expr_list:REG_DEAD (reg:SI 109 "%r28")
        (expr_list:REG_DEAD (reg/f:SI 101 "%r26"))))
(insn 31 49 33 3 (set (mem/f:SI (reg/f:SI 112 "%r19"))
                      (reg/f:SI 100 "%r28"))
     (expr_list:REG_DEAD (reg/f:SI 112 "%r19")
        (expr_list:REG_DEAD (reg/f:SI 100 "%r28"))))
(call_insn 33 31 36 3 (parallel [
            (call (mem:SI (symbol_ref/v:SI ("@_Z3yowP11dw_cfi_node"))
                (const_int 16 [0x10]))
            (clobber (reg:SI 1 %r1))
            (clobber (reg:SI 2 %r2))
            (use (const_int 0 [0]))])
     (expr_list:REG_DEAD (reg:SI 26 %r26))
    (expr_list:SI (use (reg:SI 26 %r26)))))

Before my patch, hard reg %r26 and pseudo 101 conflicted, but with my patch
they now (correctly) do not.  IRA is able to assign hard regs to all of the
pseudos, but the constraints in the pattern for insn 49 forces op0 (pseudo 100)
and op1 (pseudo 101) to have the same hard reg.  They are not, so reload comes
along and spills insn 49 with the following reloads:

Reloads for insn # 49
Reload 0: reload_in (SI) = (reg/f:SI 26 %r26 [orig:101 _10 ] [101])
          reload_out (SI) = (reg/f:SI 28 %r28 [orig:100 iftmp.2_5 ] [100])
          GENERAL_REGS, RELOAD_OTHER (opnum = 0)
          reload_in_reg: (reg/f:SI 26 %r26 [orig:101 _10 ] [101])
          reload_out_reg: (reg/f:SI 28 %r28 [orig:100 iftmp.2_5 ] [100])
          reload_reg_rtx: (reg/f:SI 26 %r26 [orig:101 _10 ] [101])

..giving us the following updated insn:

(insn 49 23 58 3 (set (reg/f:SI 26 %r26 [101])
        (if_then_else:SI (eq (reg:SI 28 %r28 [109])
                (const_int 15))
            (reg/f:SI 26 %r26 [101])
            (const_int 0 [0]))))

The problem is, that hard reg %r26 is defined in insn 32, to be used in
insn 33, so using %r26 as the reload reg is wrong, because it will clobber
the value we set in insn 32.  HPPA is a reload (not LRA) target and looking
thru reload, it assumes that if any input pseudo dies in an insn, then the
hard reg assigned to the pseudo is free game for use with an output reload.
With my patch, that assumption is (now) wrong, because another simultaneously
live pseudo may be using that same hard reg or in this case, the hard reg
itself is still live.

Given no one wants to patch reload, the decision has been made to disable
the special reg copy handling my patch introduced whenever we're compiling
for a reload target via the following patch:

Index: gcc/ira-lives.c
===================================================================
--- gcc/ira-lives.c     (revision 264897)
+++ gcc/ira-lives.c     (working copy)
@@ -1064,6 +1064,11 @@ find_call_crossed_cheap_reg (rtx_insn *i
 rtx
 non_conflicting_reg_copy_p (rtx_insn *insn)
 {
+  /* Reload has issues with overlapping pseudos being assigned to the
+     same hard register, so don't allow it.  See PR87600 for details.  */
+  if (!targetm.lra_p ())
+    return NULL_RTX;
+
   rtx set = single_set (insn);

   /* Disallow anything other than a simple register to register copy
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/87600] Fix for PRs 86939 and 87479 causes build issues for several targets

segher at gcc dot gnu.org
In reply to this post by segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87600

--- Comment #2 from Peter Bergner <bergner at gcc dot gnu.org> ---
Author: bergner
Date: Fri Oct 12 16:31:11 2018
New Revision: 265113

URL: https://gcc.gnu.org/viewcvs?rev=265113&root=gcc&view=rev
Log:
        PR rtl-optimization/87600
        * ira-lives (non_conflicting_reg_copy_p): Disable for non LRA targets.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ira-lives.c
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/87600] Fix for PRs 86939 and 87479 causes build issues for several targets

segher at gcc dot gnu.org
In reply to this post by segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87600

--- Comment #3 from Segher Boessenkool <segher at gcc dot gnu.org> ---
Here is an example for aarch64:

===
long f(long x)
{
        register long a asm("x0");
        asm("bla %0 %1" : "+&r"(a) : "r"(x));
        return a;
}
===

The first asm operand is a local register variable set to x0, so this one
should end up in x0.  But "x" is passed in x0 as function argument, and then
combined into the asm, making both asm operands hard register 0.  And then LRA
has the job of making things work (operand 0 is earlyclobber so not both args
can be hard reg 0), and LRA picks the wrong solution:

        bla x1 x0

Not letting combine combine the register moves that copy from the function
argument registers into pseudos fixes this.  But it costs 1%-5% of code size
on all targets (most are about 2%): most targets can usefully combine register
moves into other instructions, for example many targets have operations that
set a flags register as side effect.

I'm going to try to disallow combining the hard reg -> pseudo moves, because
that should help with the register alloc problems (it also gives better
register alloc!), but at the same time introducing an extra copy (from pseudo
to a new pseudo).  In theory this should be the best of both worlds.
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/87600] Fix for PRs 86939 and 87479 causes build issues for several targets

segher at gcc dot gnu.org
In reply to this post by segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87600

--- Comment #4 from Segher Boessenkool <segher at gcc dot gnu.org> ---
Oh btw, the #c3 problem isn't new at all, it happens with 4.8 already for
example.
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/87600] Fix for PRs 86939 and 87479 causes build issues for several targets

segher at gcc dot gnu.org
In reply to this post by segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87600

--- Comment #5 from Segher Boessenkool <segher at gcc dot gnu.org> ---
Author: segher
Date: Mon Oct 22 20:23:39 2018
New Revision: 265398

URL: https://gcc.gnu.org/viewcvs?rev=265398&root=gcc&view=rev
Log:
combine: Do not combine moves from hard registers

On most targets every function starts with moves from the parameter
passing (hard) registers into pseudos.  Similarly, after every call
there is a move from the return register into a pseudo.  These moves
usually combine with later instructions (leaving pretty much the same
instruction, just with a hard reg instead of a pseudo).

This isn't a good idea.  Register allocation can get rid of unnecessary
moves just fine, and moving the parameter passing registers into many
later instructions tends to prevent good register allocation.  This
patch disallows combining moves from a hard (non-fixed) register.

This also avoid the problem mentioned in PR87600 #c3 (combining hard
registers into inline assembler is problematic).

Because the register move can often be combined with other instructions
*itself*, for example for setting some condition code, this patch adds
extra copies via new pseudos after every copy-from-hard-reg.

On some targets this reduces average code size.  On others it increases
it a bit, 0.1% or 0.2% or so.  (I tested this on all *-linux targets).


        PR rtl-optimization/87600
        * combine.c: Add include of expr.h.
        (cant_combine_insn_p): Do not combine moves from any hard non-fixed
        register to a pseudo.
        (make_more_copies): New function, add a copy to a new pseudo after
        the moves from hard registers into pseudos.
        (rest_of_handle_combine): Declare rebuild_jump_labels_after_combine
        later.  Call make_more_copies.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/combine.c
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/87600] Fix for PRs 86939 and 87479 causes build issues for several targets

segher at gcc dot gnu.org
In reply to this post by segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87600

--- Comment #6 from Peter Bergner <bergner at gcc dot gnu.org> ---
Author: bergner
Date: Thu Nov  8 22:39:45 2018
New Revision: 265942

URL: https://gcc.gnu.org/viewcvs?rev=265942&root=gcc&view=rev
Log:
gcc/
        PR rtl-optimization/87600
        * cfgexpand.c (expand_asm_stmt): Catch illegal asm constraint usage.
        * lra-constraints.c (process_alt_operands): Skip illegal hard
        register usage.  Prefer reloading non hard register operands.

gcc/testsuite/
        PR rtl-optimization/87600
        * gcc.dg/pr87600.h: New file.
        * gcc.dg/pr87600-1.c: New test.
        * gcc.dg/pr87600-2.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.dg/pr87600-1.c
    trunk/gcc/testsuite/gcc.dg/pr87600-2.c
    trunk/gcc/testsuite/gcc.dg/pr87600.h
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cfgexpand.c
    trunk/gcc/lra-constraints.c
    trunk/gcc/testsuite/ChangeLog
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/87600] Fix for PRs 86939 and 87479 causes build issues for several targets

segher at gcc dot gnu.org
In reply to this post by segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87600

Peter Bergner <bergner at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Peter Bergner <bergner at gcc dot gnu.org> ---
Ok, I think this resolves all but one ARM issue which is being tracked in
PR87899, so I'm going to mark this bugzilla as fixed.