[Bug inline-asm/85593] New: GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

[Bug inline-asm/85593] New: GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

            Bug ID: 85593
           Summary: GCC on ARM allocates R3 for local variable when
                    calling naked function with O2 optimizations enabled
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: inline-asm
          Assignee: unassigned at gcc dot gnu.org
          Reporter: austinpmorton at gmail dot com
  Target Milestone: ---

Created attachment 44048
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44048&action=edit
example code triggering bug

When a caller invokes a naked function that is defined in the same translation
unit, and that function's inline assembly modifies registers which the ARM ABI
does not preserve across function calls (R0-R3), GCC 5+ can incorrectly
allocate these registers for local variables in the caller and expect them to
persist across the call.

* -O2 must be enabled to observe this behavior.
* The function must be defined in the same translation unit; if it is only
declared extern, GCC correctly allocates preserved registers for its locals.
* GCC 4.6.4 correctly allocates R4 in the example code given.
* GCC 5.4 and all higher versions tested incorrectly allocate R3 in the
example code given.
* I have built a clean copy of GCC trunk and observed this issue as well.

note: this issue is present on all arm triplets I have tested, not just
arm-none-eabi.

compile bug.c with the following options to observe the bug
arm-none-eabi-gcc -O2 -DBUG -S bug.c -o -

compile bug.c with the following options to observe the correct behavior with
an extern declaration
arm-none-eabi-gcc -O2 -S bug.c -o -

note that in the non-working compiler versions gcc attempts to load a value
from r3 between calls to test2, whereas in the working compiler versions gcc
will produce identical code save for using r4.

For convenience's sake, here is the example shown in working vs. non-working
compiler versions on godbolt:
https://godbolt.org/g/kMSn8W
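
The attachment itself is not reproduced in this thread. As a rough illustration
of the pattern described above, the following is a hypothetical sketch, not the
attached bug.c. The names clobber_r0_to_r3, keep_across_call and
check_local_survives are made up, and the naked path is only compiled for
GCC-compatible compilers targeting ARM state; elsewhere a plain C stand-in
keeps the file buildable. Whether an affected compiler actually keeps the local
in R0-R3 depends on the version and the surrounding code.

```c
#include <assert.h>

#if defined(__GNUC__) && defined(__arm__) && !defined(__thumb__)
/* Naked: the compiler emits no prologue or epilogue, so the body has to
   return by itself. Only basic asm is used, so the compiler is told
   nothing about which registers the body writes. */
__attribute__((naked)) void clobber_r0_to_r3(void)
{
    __asm__("mov r0, #0\n\t"
            "mov r1, #0\n\t"
            "mov r2, #0\n\t"
            "mov r3, #0\n\t"
            "bx lr");
}
#else
/* Stand-in for other targets so the file still builds; the behavior
   discussed in this report cannot be observed on this path. */
void clobber_r0_to_r3(void)
{
}
#endif

/* A local value that has to stay live across the call. */
int keep_across_call(int seed)
{
    int local = seed * 3 + 1;
    clobber_r0_to_r3();
    return local;
}

/* Aborts if the local was lost across the call. */
void check_local_survives(void)
{
    assert(keep_across_call(14) == 43);
}
```

Built with -O2 on an affected ARM toolchain, the assertion may fail if the
local is kept in one of R0-R3; on unaffected compilers it is expected to hold.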

[Bug target/85593] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I don't think naked functions are supposed to be called directly.

[Bug target/85593] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

--- Comment #2 from Austin Morton <austinpmorton at gmail dot com> ---
Where is this limitation documented? The GCC documentation on the naked
function attribute makes no mention of such a caveat:
https://gcc.gnu.org/onlinedocs/gcc/ARM-Function-Attributes.html


See here for real world usage that triggers this issue:
https://git.kernel.org/pub/scm/bluetooth/sbc.git/tree/sbc/sbc_primitives_armv6.c

Note that I am not the original author of the above code, just someone who ran
into issues attempting to use it on gcc newer than 4.x

In this specific case it is simple enough to fix by pushing r3, but it seemed
like a bug with the compiler, considering the behavior changed between major
releases.

If this is indeed an issue with the code itself and not the compiler, that is
fine, but it should certainly be documented that naked functions are not meant
to be called directly.

[Bug target/85593] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

Ramana Radhakrishnan <ramana at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ramana at gcc dot gnu.org

--- Comment #3 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> ---
Changing the testcase to indicate the clobbered register r3 makes the test pass,
but that is wrong in a naked function. Extended inline assembler *cannot* be
used within a naked function because you could inadvertently end up requiring a
stack slot and therefore cause the function to require a prologue and epilogue,
which is exactly contrary to what we want with a naked function!

I *think* the problem really is that ipa-ra comes along and decides that r3
isn't used at all in the naked function. You can see the problem disappear with
-fno-ipa-ra, but that is not a workaround I would recommend using in your
general C flags, because you would be using a hammer, disabling a nice
optimization to make something like the example "work".

Thus I think what you want is to get rid of naked functions altogether, write
the whole thing in assembler, and stop faffing about with naked functions.
IIRC there is a hook for ipa-ra that says what registers can be clobbered, but
I can't find it immediately. I suppose for naked functions it is *all*
registers.



regards
Ramana

[Bug target/85593] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

--- Comment #4 from Austin Morton <austinpmorton at gmail dot com> ---
In my particular case I was able to work around the issue by removing the naked
attribute and using extended assembly with a clobbers list.

The resulting code is nearly identical (allowing GCC to generate the correct
pro/epilog instead of hand writing it), and gcc correctly allocates R4 instead
of R3.
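
A minimal sketch of that kind of rewrite, assuming a much simpler function than
the real sbc code: zero_scratch_regs, add_around_call and check_workaround are
hypothetical names, and the x86-64 branch is added only so the snippet does
something on a desktop compiler. The clobber list is what tells the compiler
that the asm overwrites R3.

```c
#include <assert.h>

/* Ordinary (non-naked) function: the compiler writes the prologue and
   epilogue, and the clobber list names every register the asm overwrites. */
void zero_scratch_regs(void)
{
#if defined(__GNUC__) && defined(__arm__) && !defined(__thumb__)
    __asm__ volatile("mov r3, #0"
                     : /* no outputs */
                     : /* no inputs */
                     : "r3");
#elif defined(__GNUC__) && defined(__x86_64__)
    /* Assumption: an x86-64 analogue, overwriting a call-clobbered register. */
    __asm__ volatile("xorl %%ecx, %%ecx"
                     : /* no outputs */
                     : /* no inputs */
                     : "rcx", "cc");
#endif
    /* On any other target this function does nothing. */
}

/* A local value that has to stay live across the call. */
int add_around_call(int a, int b)
{
    int local = a + b;
    zero_scratch_regs();
    return local;
}

/* Aborts if the local was lost across the call. */
void check_workaround(void)
{
    assert(add_around_call(40, 2) == 42);
}
```

Because the overwritten register is declared, the allocator can either avoid it
for the caller's local or save and restore the value around the call.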

This still feels like a bug in GCC.  In the example I gave, if you compiled the
naked function in a separate C file and linked them it would generate the
correct code.  The issue is that GCC is able to "see" the naked function and is
performing optimizations that it shouldn't as a result.

I believe that GCC should treat naked functions as opaque as far as
optimizations are concerned.
At the very least, there should be a note about this kind of issue included in
the documentation of the naked attribute.

[Bug target/85593] [5,6,7,8 Regression] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

Ramana Radhakrishnan <ramana at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-05-04
            Version|9.0                         |5.4.1
            Summary|GCC on ARM allocates R3 for |[5,6,7,8 Regression] GCC on
                   |local variable when calling |ARM allocates R3 for local
                   |naked function with O2      |variable when calling naked
                   |optimizations enabled       |function with O2
                   |                            |optimizations enabled
     Ever confirmed|0                           |1

--- Comment #5 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> ---
(In reply to Austin Morton from comment #4)
> In my particular case I was able to work around the issue by removing the
> naked attribute and using extended assembly with a clobbers list.

Removing the naked attribute and using the extended assembler with a clobbers
list is absolutely the correct thing to do.

>
> The resulting code is nearly identical (allowing GCC to generate the correct
> pro/epilog instead of hand writing it), and gcc correctly allocates R4
> instead of R3.
>
> This still feels like a bug in GCC.  In the example I gave, if you compiled
> the naked function in a separate C file and linked them it would generate
> the correct code.  The issue is that GCC is able to "see" the naked function
> and is performing optimizations that it shouldn't as a result.
>
> I believe that GCC should treat naked functions as opaque as far as
> optimizations are concerned.
> At the very least, there should be a note about this kind of issue included
> in the documentation of the naked attribute.

Yes, it should be opaque as far as this IPA-RA optimization is concerned - I
don't think there are many other optimizations that need to treat this as
opaque.  That's what I alluded to in my previous comment.


> IIRC there is a hook for ipa-ra that says what
> registers can be clobbered : can't find it immediately. I suppose for naked
> functions it is *all* registers.

I wasn't looking in the backend when I responded earlier; there is no such
hook. I think the correct fix would be to get arm_emit_call_insn to mark *all*
registers as clobbered if the target of the call insn is a naked function, i.e.
effectively disabling ipa-ra for naked functions. You'd have to figure out that
the DECL for the target of the call had a "naked" attribute attached to it ...

Do you feel up to writing up a patch, assuming you have copyright assignments
et al. sewn up?




>
>
>
> regards
> Ramana

[Bug target/85593] [6,7,8,9 Regression] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

Eric Gallager <egallager at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |egallager at gcc dot gnu.org
      Known to work|                            |4.6.4
            Summary|[5,6,7,8 Regression] GCC on |[6,7,8,9 Regression] GCC on
                   |ARM allocates R3 for local  |ARM allocates R3 for local
                   |variable when calling naked |variable when calling naked
                   |function with O2            |function with O2
                   |optimizations enabled       |optimizations enabled
      Known to fail|                            |5.4.1

--- Comment #6 from Eric Gallager <egallager at gcc dot gnu.org> ---
gcc 5 branch is closed; retitling

[Bug target/85593] [6,7,8,9 Regression] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

--- Comment #7 from Austin Morton <austinpmorton at gmail dot com> ---
I will certainly give writing a patch a try - but I will disclaim up front that
because there is a viable workaround for the issue I was having (patch below
[1]), this issue is "resolved" as far as my employer is concerned.

Nevertheless, I will attempt to tackle this on a weekend out of curiosity
(never had a reason to dig around in compiler guts before).

[1] https://marc.info/?l=linux-bluetooth&m=152535913710490&w=2

[Bug target/85593] [6/7/8/9 Regression] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |6.5
            Summary|[6,7,8,9 Regression] GCC on |[6/7/8/9 Regression] GCC on
                   |ARM allocates R3 for local  |ARM allocates R3 for local
                   |variable when calling naked |variable when calling naked
                   |function with O2            |function with O2
                   |optimizations enabled       |optimizations enabled

[Bug target/85593] [6/7/8/9 Regression] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

--- Comment #8 from Eric Gallager <egallager at gcc dot gnu.org> ---
(In reply to Austin Morton from comment #7)
> I will certainly give writing a patch a try - but I will disclaim up front
> that because there is a viable workaround for the issue I was having (patch
> below [1]), this issue is "resolved" as far as my employer is concerned.
>
> Nevertheless, I will attempt to tackle this on a weekend out of curiosity
> (never had a reason to dig around in compiler guts before).
>
> [1] https://marc.info/?l=linux-bluetooth&m=152535913710490&w=2

Have you had an open weekend to attempt to tackle this yet?

[Bug target/85593] [6/7/8/9 Regression] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

--- Comment #9 from Austin Morton <austinpmorton at gmail dot com> ---
Apologies for letting this sit so long.

I spent an afternoon digging through some of the mentioned functions trying to
familiarize myself with everything but I didn't make it further than that.

That was a few months ago.  I don't think I have the time to go back to it
now; it's probably better left to someone already familiar with the compiler.

Thanks,
Austin

[Bug target/85593] [6/7/8/9 Regression] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|6.5                         |7.4

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 6 branch is being closed

[Bug target/85593] [7/8/9 Regression] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
           Priority|P3                          |P2

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
So somehow the consensus was that IPA-RA needs to be told what to do about
naked functions.  IPA-RA is new in GCC 5.  I'm not sure whether there already
exists a hook for targets to use.

[Bug target/85593] [7/8/9 Regression] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|7.4                         |7.5

[Bug target/85593] [7/8/9 Regression] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Generic code already checks the "naked" attribute in a bunch of places, so
there is no need to add a target hook IMHO.
attribs.c:  /* A "naked" function attribute implies "noinline" and "noclone" for
attribs.c:      && lookup_attribute ("naked", attributes) != NULL
attribs.c:      && lookup_attribute_spec (get_identifier ("naked")))
bb-reorder.c:     && !lookup_attribute ("naked", DECL_ATTRIBUTES (fun->decl))
cfgexpand.c:          if (lookup_attribute ("naked",

[Bug target/85593] [7/8/9 Regression] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot       |jakub at gcc dot gnu.org
                   |gnu.org                     |

--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 45173
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45173&action=edit
gcc9-pr85593.patch

Untested fix.

[Bug target/85593] [7/8/9 Regression] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

--- Comment #14 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Author: jakub
Date: Thu Dec  6 23:41:04 2018
New Revision: 266881

URL: https://gcc.gnu.org/viewcvs?rev=266881&root=gcc&view=rev
Log:
        PR target/85593
        * final.c (rest_of_handle_final): Don't call collect_fn_hard_reg_usage
        for functions with naked attribute.

        * gcc.target/i386/pr85593.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr85593.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/final.c
    trunk/gcc/testsuite/ChangeLog
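
The effect of this change on a caller can be pictured with a toy model. This is
not GCC code: regset, toy_function, assumed_clobbers and the skip_naked flag
are invented for illustration, and the call-clobbered set is simplified to the
R0-R3 registers named in the report. The model assumes IPA-RA narrows the ABI
clobber set to the registers the compiler saw the callee write, which is empty
for a naked function whose body is opaque basic asm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t regset;                  /* bit n set means register rn */
#define REG(n) ((regset)1u << (n))

/* Simplified: only the R0-R3 set named in the report. */
#define ABI_CALL_CLOBBERED (REG(0) | REG(1) | REG(2) | REG(3))

struct toy_function {
    bool   is_naked;        /* carries the "naked" attribute */
    regset visible_writes;  /* registers the compiler can see the body write */
};

/* Registers a caller has to treat as overwritten by a call to `callee`.
   `skip_naked` models the fix: no usage data is collected for naked
   functions, so the caller falls back to the full ABI set. */
regset assumed_clobbers(const struct toy_function *callee,
                        bool ipa_ra_enabled, bool skip_naked)
{
    assert(callee != NULL);
    if (!ipa_ra_enabled)
        return ABI_CALL_CLOBBERED;
    if (skip_naked && callee->is_naked)
        return ABI_CALL_CLOBBERED;
    /* IPA-RA path: trust what was observed in the callee's body. */
    return callee->visible_writes & ABI_CALL_CLOBBERED;
}
```

With skip_naked false, a naked callee with no visible writes yields an empty
set, so the caller in this model would feel free to keep a live value in R3.
With skip_naked true, the same callee yields the full R0-R3 set.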

[Bug target/85593] [7/8 Regression] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled

msebor at gcc dot gnu.org
In reply to this post by msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593

--- Comment #15 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
ARM maintainers - feel free to add some ARM test for naked vs. IPA-RA too.