[Bug rtl-optimization/64242] New: Longjmp expansion incorrect on i386

classic Classic list List threaded Threaded
23 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/64242] New: Longjmp expansion incorrect on i386

msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

            Bug ID: 64242
           Summary: Longjmp expansion incorrect on i386
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wdijkstr at arm dot com

As PR rtl-optimization/64151 showed, the longjmp expansion on i386 is incorrect
if the base register is spilled. It turns out it is trivial to write an example
that reproduces this without my patch:

void
broken_longjmp(int x, void *buf[20])
{
  if (x == 0) return;
  __builtin_longjmp (buf, 1);
}

With -O2 this produces:

        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %eax
        testl   %eax, %eax
        jne     .L5
        leave
        ret
.L5:
        movl    12(%ebp), %eax
        movl    4(%eax), %eax
        movl    12(%ebp), %edx
        movl    (%edx), %ebp    *** load new ebp
        movl    12(%ebp), %ecx  *** try to use old ebp
        movl    8(%ecx), %esp
        jmp     *%eax
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/64242] Longjmp expansion incorrect on i386

msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com

--- Comment #1 from H.J. Lu <hjl.tools at gmail dot com> ---
I don't think this test is
valid. You can search for a
builtin longjmp documentation
bug of mine.
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/64242] Longjmp expansion incorrect on i386

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> ---
Dup of PR 59039?
Reply | Threaded
Open this post in threaded view
|

[Bug rtl-optimization/64242] Longjmp expansion incorrect on i386

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

--- Comment #3 from Wilco <wdijkstr at arm dot com> ---
(In reply to H.J. Lu from comment #2)
> Dup of PR 59039?

No that talks about not using __builtin_setjmp and __builtin_longjmp within the
same function. I only used longjmp. Or are they so broken you need to put both
the __builtin_longjmp and __builtin_setjmp in their own non-inlineable
functions and make sure you never pass the jmpbuf via an argument? If so what
is the point?
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/64242] Longjmp expansion incorrect

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

Wilco <wilco at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|i386                        |
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-11-29
                 CC|                            |wilco at gcc dot gnu.org
          Component|rtl-optimization            |middle-end
            Summary|Longjmp expansion incorrect |Longjmp expansion incorrect
                   |on i386                     |
     Ever confirmed|0                           |1
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/64242] Longjmp expansion incorrect

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

Wilco <wilco at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |wilco at gcc dot gnu.org

--- Comment #4 from Wilco <wilco at gcc dot gnu.org> ---
I can cause this to fail on AArch64 as well with a local buffer - the issue is
blatantly obvious in the expansion of longjmp.
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/64242] Longjmp expansion incorrect

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

--- Comment #5 from Jeffrey A. Law <law at gcc dot gnu.org> ---
Author: law
Date: Fri Nov 30 23:06:51 2018
New Revision: 266697

URL: https://gcc.gnu.org/viewcvs?rev=266697&root=gcc&view=rev
Log:
        PR middle-end/64242
        * builtins.c (expand_builtin_longjmp): Use a temporary when restoring
        the frame pointer.
        (expand_builtin_nonlocal_goto): Likewise.

        * gcc.c-torture/execute/pr64242.c: New test.

Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/pr64242.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/builtins.c
    trunk/gcc/testsuite/ChangeLog
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/64242] Longjmp expansion incorrect

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

Jeffrey A. Law <law at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |law at redhat dot com
         Resolution|---                         |FIXED

--- Comment #6 from Jeffrey A. Law <law at redhat dot com> ---
Fixed on trunk by Wilco's patch.
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/64242] Longjmp expansion incorrect

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

Christophe Lyon <clyon at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |clyon at gcc dot gnu.org

--- Comment #7 from Christophe Lyon <clyon at gcc dot gnu.org> ---
Hello,
I can see the new test failing on arm-none-eabi when:
- GCC is configured with default cpu/mode, QEMU simulates arm926
- GCC is configured --with-mode=thumb --with-cpu=cortex-m3, QEMU simulates
cortex-m3

FAIL: gcc.c-torture/execute/pr64242.c   -O2  execution test
FAIL: gcc.c-torture/execute/pr64242.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  execution test
FAIL: gcc.c-torture/execute/pr64242.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  execution test
FAIL: gcc.c-torture/execute/pr64242.c   -O3 -g  execution test

I'm attaching the traces from QEMU, starting with main().
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/64242] Longjmp expansion incorrect

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

--- Comment #8 from Christophe Lyon <clyon at gcc dot gnu.org> ---
Created attachment 45137
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45137&action=edit
QEMU traces for --with-cpu=default / QEMU --cpu arm926
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/64242] Longjmp expansion incorrect

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

--- Comment #9 from Christophe Lyon <clyon at gcc dot gnu.org> ---
Created attachment 45138
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45138&action=edit
QEMU traces for --with-cpu=cortex-m3 / QEMU --cpu cortex-m3
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/64242] Longjmp expansion incorrect

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The testcase also FAILs on x86_64-linux and i686-linux for me.
The testcase is just invalid, you can't assume that __builtin_alloca (0) will
not allocate anything on the stack at all, on many targets it does.
E.g. on x86_64 it does:
        movslq  x(%rip), %rax
        addq    $23, %rax
        andq    $-16, %rax
        subq    %rax, %rsp
        leaq    15(%rsp), %rax
        andq    $-16, %rax
        movq    %rax, p(%rip)
so for 0 it subtracts 16 bytes from %rsp.
What was the failure on i386?
I see
        .cfi_startproc
        pushl   %ebp
        .cfi_def_cfa_offset 8
        .cfi_offset 5, -8
        movl    %esp, %ebp
        .cfi_def_cfa_register 5
        subl    $32, %esp
        movl    8(%ebp), %eax
        movl    (%eax), %edx
        movl    8(%eax), %ecx
        movl    %edx, -20(%ebp)
        movl    4(%eax), %edx
        movl    %ecx, -12(%ebp)
        movl    12(%eax), %ecx
        movl    16(%eax), %eax
        movl    %edx, -16(%ebp)
        movl    %ecx, -8(%ebp)
        movl    %eax, -4(%ebp)
        movl    -20(%ebp), %ebp
        movl    -12(%ebp), %esp
        jmp     *%edx
        .cfi_endproc
At the start of the movl -20(%ebp), %ebp instruction both -12(%ebp) and
-12(-20(%ebp)) contain the same value though, it has been stored there a few
instructions earlier (movl %ecx, -12(%ebp) above it).
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/64242] Longjmp expansion incorrect

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

--- Comment #11 from H.J. Lu <hjl.tools at gmail dot com> ---
The testcase failed on many targets.
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/64242] Longjmp expansion incorrect

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #12 from H.J. Lu <hjl.tools at gmail dot com> ---
Reopen
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/64242] Longjmp expansion incorrect

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I wonder about following, on i686-linux it FAILs with older trunk and succeeds
with current trunk.  Without the (useless) stack realignment the right value of
stack pointer happened to be in exactly that slot from which it read memory.
While still not fully portable, I think if the two alloca (0) are more than
1024 bytes appart, something is wrong with the target or at least alloca is
helplessly expensive there.

--- gcc/testsuite/gcc.c-torture/execute/pr64242.c       2018-12-01
00:25:08.082009500 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr64242.c       2018-12-03
16:43:33.343875994 +0100
@@ -11,20 +11,40 @@ broken_longjmp(void *p)
   __builtin_longjmp (buf, 1);
 }

+__attribute ((noipa)) __UINTPTR_TYPE__
+foo(void *p)
+{
+  return (__UINTPTR_TYPE__) p;
+}
+
+__attribute ((noipa)) void
+bar(void *p)
+{
+  asm volatile ("" : : "r" (p));
+}
+
 volatile int x = 0;
-volatile void *p;
+void *volatile p;
+void *volatile q;
 int
 main (void)
 {
   void *buf[5];
+  struct __attribute__((aligned (32))) S { int a[4]; } s;
+  bar (&s);
   p = __builtin_alloca (x);
-
   if (!__builtin_setjmp (buf))
     broken_longjmp (buf);

   /* Fails if stack pointer corrupted.  */
-  if (p != __builtin_alloca (x))
-    abort();
+  q = __builtin_alloca (x);
+  if (foo (p) < foo (q))
+    {
+      if (foo (q) - foo (p) >= 1024)
+       abort ();
+    }
+  else if (foo (p) - foo (q) >= 1024)
+    abort ();

   return 0;
 }
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/64242] Longjmp expansion incorrect

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

--- Comment #14 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #13)
> I wonder about following, on i686-linux it FAILs with older trunk and
> succeeds with current trunk.  Without the (useless) stack realignment the
> right value of stack pointer happened to be in exactly that slot from which
> it read memory.

We could just increase the alloca size to 1 to avoid that (however there is
always a possibility that the loaded value happens to be valid).

> While still not fully portable, I think if the two alloca (0) are more than
> 1024 bytes appart, something is wrong with the target or at least alloca is
> helplessly expensive there.

If alloca (x) always allocates extra bytes for no reason then that's a separate
issue with alloca - I fixed this in the generic code last year.

> --- gcc/testsuite/gcc.c-torture/execute/pr64242.c 2018-12-01
> 00:25:08.082009500 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr64242.c 2018-12-03
> 16:43:33.343875994 +0100
> @@ -11,20 +11,40 @@ broken_longjmp(void *p)
>    __builtin_longjmp (buf, 1);
>  }
>  
> +__attribute ((noipa)) __UINTPTR_TYPE__
> +foo(void *p)
> +{
> +  return (__UINTPTR_TYPE__) p;
> +}
> +
> +__attribute ((noipa)) void
> +bar(void *p)
> +{
> +  asm volatile ("" : : "r" (p));
> +}
> +
>  volatile int x = 0;
> -volatile void *p;
> +void *volatile p;
> +void *volatile q;
>  int
>  main (void)
>  {
>    void *buf[5];
> +  struct __attribute__((aligned (32))) S { int a[4]; } s;
> +  bar (&s);

Not sure what the purpose of this would be?

>    p = __builtin_alloca (x);
> -
>    if (!__builtin_setjmp (buf))
>      broken_longjmp (buf);
>  
>    /* Fails if stack pointer corrupted.  */
> -  if (p != __builtin_alloca (x))
> -    abort();
> +  q = __builtin_alloca (x);
> +  if (foo (p) < foo (q))
> +    {
> +      if (foo (q) - foo (p) >= 1024)
> + abort ();
> +    }
> +  else if (foo (p) - foo (q) >= 1024)
> +    abort ();

I think that could just become __builtin_absl (foo (q) - foo (p)) > 64.

>    return 0;
>  }
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/64242] Longjmp expansion incorrect

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

--- Comment #15 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Author: jakub
Date: Mon Dec  3 20:57:14 2018
New Revision: 266766

URL: https://gcc.gnu.org/viewcvs?rev=266766&root=gcc&view=rev
Log:
        PR middle-end/64242
        * gcc.c-torture/execute/pr64242.c (foo, bar): New functions.
        (p): Make it void *volatile instead of volatile void *.
        (q): New variable.
        (main): Add a dummy 32-byte aligned variable and escape its address.
        Don't require that the two __builtin_alloca (0) calls return the
        same address, just require that their difference is smaller than
        1024 bytes.

Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.c-torture/execute/pr64242.c
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/64242] Longjmp expansion incorrect

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

--- Comment #16 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Christophe Lyon from comment #9)
> Created attachment 45138 [details]
> QEMU traces for --with-cpu=cortex-m3 / QEMU --cpu cortex-m3

Thanks, I can reproduce this now. It fails due to sched1 scheduling the move
before the sp restore (correct at this point since it doesn't use a frame
pointer) and reload then forward propagates a frame pointer use into it. Even
if I add additional clobbers for sfp it won't block the substitution.

So maybe we need an unspec around the address generation to be safe.
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/64242] Longjmp expansion incorrect

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

--- Comment #17 from Jeffrey A. Law <law at redhat dot com> ---
Or just emit a blockage insn to avoid the undesirable code motion.
Reply | Threaded
Open this post in threaded view
|

[Bug middle-end/64242] Longjmp expansion incorrect

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

--- Comment #18 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Jeffrey A. Law from comment #17)
> Or just emit a blockage insn to avoid the undesirable code motion.

I tried that already, it doesn't affect the forward substitution - I guess it
simply assumes it is always valid, and a function never writes to SFP/SP/FP.
12