[Bug rtl-optimization/86438] New: [8/9 Regression] wrong code at -Os

classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/86438] New: [8/9 Regression] wrong code at -Os

asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86438

            Bug ID: 86438
           Summary: [8/9 Regression] wrong code at -Os
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zsojka at seznam dot cz
  Target Milestone: ---
              Host: x86_64-pc-linux-gnu
            Target: x86_64-pc-linux-gnu

Created attachment 44365
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44365&action=edit
reduced testcase

Output:
$ x86_64-pc-linux-gnu-gcc -Os testcase.c
$ ./a.out
Aborted

Diff between -O2 -> -Os shows:
@@ -77,15 +76,15 @@
 # testcase.c:13:   u32 f = __builtin_sub_overflow_p (d, (u128) d, (u64)0);
        mov     ebx, 0  # _2,
 # testcase.c:12:   u64 d = (g ? 5 : 4);
-       setne   al      #, iftmp.0_9
-       movzx   eax, al # iftmp.0_9, iftmp.0_9
-       add     rax, 4  # iftmp.0_9,
+       cmp     rax, 1  # tmp100,
+       sbb     rax, rax        # iftmp.0_9
+       add     rax, 5  # iftmp.0_9,
 # testcase.c:13:   u32 f = __builtin_sub_overflow_p (d, (u128) d, (u64)0);
        mov     rcx, rax        # _2, iftmp.0_9
        setc    al      #, _11

The - (-O2) never sets CF (thus the last setc always sets al=0).
The + (-Os) set CF if g==0 (thus the last setc sets al=(g==0)).

The __builtin_sub_overflow_p() in fact never overflows, because it does
(u64)d-(u128)d == 0 in all cases.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-262509-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/9.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-262509-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
gcc version 9.0.0 20180709 (experimental) (GCC)
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/86438] [8/9 Regression] wrong code at -Os

asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86438

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-07-09
     Ever confirmed|0                           |1

--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
Looks like postreload cmp elimitation pass is at fault.

It converts:

(insn 64 63 76 2 (parallel [
            (set (reg:DI 0 ax [orig:94 iftmp.0_9 ] [94])
                (plus:DI (reg:DI 0 ax [orig:94 iftmp.0_9 ] [94])
                    (const_int 5 [0x5])))
            (clobber (reg:CC 17 flags))
        ]) "pr86438.c":17 232 {*adddi_1}
     (nil))
(insn 76 64 77 2 (set (reg:DI 4 si [orig:88 _2 ] [88])
        (const_int 0 [0])) "pr86438.c":18 85 {*movdi_internal}
     (nil))
(insn 77 76 14 2 (set (reg:DI 5 di [ _2+8 ])
        (const_int 0 [0])) "pr86438.c":18 85 {*movdi_internal}
     (nil))
(insn 14 77 15 2 (set (reg:DI 4 si [orig:88 _2 ] [88])
        (reg:DI 0 ax [orig:94 iftmp.0_9 ] [94])) "pr86438.c":18 85
{*movdi_internal}
     (nil))
(insn 15 14 66 2 (set (reg:DI 5 di [ _2+8 ])
        (const_int 0 [0])) "pr86438.c":18 85 {*movdi_internal}
     (nil))
(insn 66 15 67 2 (set (reg:CC 17 flags)
        (compare:CC (reg:DI 0 ax [orig:94 iftmp.0_9 ] [94])
            (reg:DI 4 si [orig:88 _2 ] [88]))) "pr86438.c":18 12 {*cmpdi_1}
     (nil))
(note 67 66 78 2 NOTE_INSN_DELETED)
(insn 78 67 79 2 (set (reg:QI 2 cx [orig:96 _11+8 ] [96])
        (ltu:QI (reg:CC 17 flags)
            (const_int 0 [0]))) "pr86438.c":18 700 {*setcc_qi}
     (nil))

to:

(insn 64 63 76 2 (parallel [
            (set (reg:CCC 17 flags)
                (compare:CCC (plus:DI (reg:DI 0 ax [orig:94 iftmp.0_9 ] [94])
                        (const_int 5 [0x5]))
                    (reg:DI 0 ax [orig:94 iftmp.0_9 ] [94])))
            (set (reg:DI 0 ax [orig:94 iftmp.0_9 ] [94])
                (plus:DI (reg:DI 0 ax [orig:94 iftmp.0_9 ] [94])
                    (const_int 5 [0x5])))
        ]) "pr86438.c":17 346 {*adddi3_cc_overflow_1}
     (nil))
(insn 76 64 77 2 (set (reg:DI 4 si [orig:88 _2 ] [88])
        (const_int 0 [0])) "pr86438.c":18 85 {*movdi_internal}
     (nil))
(insn 77 76 14 2 (set (reg:DI 5 di [ _2+8 ])
        (const_int 0 [0])) "pr86438.c":18 85 {*movdi_internal}
     (nil))
(insn 14 77 15 2 (set (reg:DI 4 si [orig:88 _2 ] [88])
        (reg:DI 0 ax [orig:94 iftmp.0_9 ] [94])) "pr86438.c":18 85
{*movdi_internal}
     (nil))
(insn 15 14 67 2 (set (reg:DI 5 di [ _2+8 ])
        (const_int 0 [0])) "pr86438.c":18 85 {*movdi_internal}
     (nil))
(note 67 15 78 2 NOTE_INSN_DELETED)
(insn 78 67 79 2 (set (reg:QI 2 cx [orig:96 _11+8 ] [96])
        (ltu:QI (reg:CCC 17 flags)
            (const_int 0 [0]))) "pr86438.c":18 700 {*setcc_qi}
     (nil))

Please note that (insn 66) is deleted and (insn 64) gets converted to an insn
that sets flags reg in CCCmode (where only carry flag is valid). The original
sequence sets flags reg in CCmode (where all flags are valid), so these two
sequences are not identical.

Adding -fno-compare-elim fixes the test.
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/86438] [8/9 Regression] wrong code at -Os

asolokha at gmx dot com
In reply to this post by asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86438

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |8.3
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/86438] [8/9 Regression] wrong code at -Os

asolokha at gmx dot com
In reply to this post by asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86438

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
                 CC|                            |ebotcazou at gcc dot gnu.org
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/86438] [8/9 Regression] wrong code at -Os

asolokha at gmx dot com
In reply to this post by asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86438

--- Comment #2 from Uroš Bizjak <ubizjak at gmail dot com> ---
Here is what happens:

compare operators in (insn 66) are substituted with their defs from (insn 64)
and (insn 14). The CC mode is calculated from SELECT_CC_MODE, which really
returns CCCmode. The flags reg clobber is substituted with the compare, and
insn then matches {*adddi3_cc_overflow_1}.

Unfortunately, the compare is not correct. %rax from (insn 14) is already
updated, and should not be compared with its previous value from (insn 64).
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/86438] [8/9 Regression] wrong code at -Os

asolokha at gmx dot com
In reply to this post by asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86438

Alexandre Oliva <aoliva at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|                            |aoliva at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org      |aoliva at gcc dot gnu.org

--- Comment #3 from Alexandre Oliva <aoliva at gcc dot gnu.org> ---
Mine
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/86438] [8/9 Regression] wrong code at -Os

asolokha at gmx dot com
In reply to this post by asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86438

--- Comment #4 from Alexandre Oliva <aoliva at gcc dot gnu.org> ---
Created attachment 44970
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44970&action=edit
candidate patch
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/86438] [8/9 Regression] wrong code at -Os

asolokha at gmx dot com
In reply to this post by asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86438

--- Comment #5 from Alexandre Oliva <aoliva at gcc dot gnu.org> ---
The candidate patch regresses the regression test added along with the fix for
bug 49095.  At first I thought it should, but there seems to be logic in place
to adjust the compare for the flags-clobbering insn.
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/86438] [8/9 Regression] wrong code at -Os

asolokha at gmx dot com
In reply to this post by asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86438

Alexandre Oliva <aoliva at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #44970|0                           |1
        is obsolete|                            |

--- Comment #6 from Alexandre Oliva <aoliva at gcc dot gnu.org> ---
Created attachment 44971
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44971&action=edit
another candidate patch

This one does not regress pr49095 with -m32, but -m64 scan-assembler-times
fails before and after.
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/86438] [8/9 Regression] wrong code at -Os

asolokha at gmx dot com
In reply to this post by asolokha at gmx dot com
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/86438] [8/9 Regression] wrong code at -Os

asolokha at gmx dot com
In reply to this post by asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86438

--- Comment #8 from Alexandre Oliva <aoliva at gcc dot gnu.org> ---
Author: aoliva
Date: Fri Nov  9 10:16:09 2018
New Revision: 265957

URL: https://gcc.gnu.org/viewcvs?rev=265957&root=gcc&view=rev
Log:
[PR86438] compare-elim: cope with set of in_b

When in_a resolves to a register set in the prev_clobber insn, we may
use the SET_SRC for the compare instead.  However, when in_b so
resolves, we proceed to use the reg with its earlier value.  When both
resolve to the same register and prev_clobber is an insn that modifies
the register, this arrangement may cause the compare to match (when it
shouldn't) and the elimination of the compare to incorrectly succeed.

(set (reg 1) (plus (reg 1) (const_int N)))
(set (reg 2) (reg 1))
(set (reg flags) (compare (reg 1) (reg 2)))

in_a: (reg 1)            --> (plus (reg 1) (const_int N))
in_b: (reg 2) -> (reg 1) -/> oops

(parallel [
 (set (reg flags) (compare (plus (reg 1) (const_int N))
                           (reg 1))) ;; should be (plus...)
 (set (reg 1) (plus (reg 1) (const_int N)))])
(set (reg 2) (reg 1))

This patch arranges for in_b to also undergo SET_SRC substitution
when appropriate, with a shortcut for when in_a and in_b are the same
rtx.


for  gcc/ChangeLog

        PR rtl-optimization/86438
        * compare-elim.c (try_eliminate_compare): Use SET_SRC instead
        of in_b for the compare if in_b is SET_DEST.

for  gcc/testsuite/ChangeLog

        PR rtl-optimization/86438
        * gcc.dg/torture/pr86438.c: New.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr86438.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/compare-elim.c
    trunk/gcc/testsuite/ChangeLog
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/86438] [8 Regression] wrong code at -Os

asolokha at gmx dot com
In reply to this post by asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86438

Andrew Stubbs <ams at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ams at gcc dot gnu.org

--- Comment #9 from Andrew Stubbs <ams at gcc dot gnu.org> ---
The testcase is undefined for targets that have no 128-bit type (i.e. long long
is 64-bit, and __int128 doesn't exist).

For (not yet in tree) GCN port I get "warning: right shift count >= width of
type [-Wshift-count-overflow]", and then the execution test aborts.