[Bug c/24027] New: A gcc primitive, under special circumstances, can crash the AVR

[Bug c/24027] New: A gcc primitive, under special circumstances, can crash the AVR

cvs-commit at gcc dot gnu.org
When the compiler needs to allocate stack space for a function,
it uses the following assembly fragment (commented by me):

in r28,0x3d ; get stack pointer low
in r29,0x3e ; get stack pointer high
sbiw r28,N ; decrement value by N
in r0,0x3f ; get status register
cli ; disable interrupts
out 0x3e,r29 ; write new stack pointer high
out 0x3f,r0 ; restore status register; interrupts are re-enabled one insn later
out 0x3d,r28 ; write new stack pointer low, interrupts are still disabled

Unfortunately, there is an AVR feature that the chip manual does not
mention, but that we could confirm with a logic analyser slapped onto
the chip.

If an interrupt arrives just after the cli but before the out 0x3f,r0,
the AVR does execute the out 0x3d,r28 before the interrupt is accepted.
However, it does NOT decrement the stack pointer after pushing the
return address to the stack.

That is, if the content of r29:r28 that is written to the SP is 0x1234,
then the return address will be pushed to locations 0x1234 and 0x1233,
but the stack pointer value at the start of the interrupt routine will
be 0x1234 instead of 0x1232. On return, the interrupt routine therefore
fetches its return address from 0x1235 and 0x1236, locations that contain
unrelated data.
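
The arithmetic above can be modelled on a host machine. The following C
sketch is only an illustration of the behaviour described in this report,
not a cycle-accurate AVR model; the names avr_model, isr_entry and
isr_return are hypothetical, and the byte order of the pushed address
(low byte at SP, high byte at SP-1) is an assumption.

```c
#include <assert.h>
#include <stdint.h>

enum { RAM_SIZE = 0x2000 };          /* simplified data space (assumption) */

typedef struct {
    uint16_t sp;                     /* stack pointer */
    uint8_t  ram[RAM_SIZE];
} avr_model;

/* Interrupt entry: push a 2-byte return address at SP and SP-1.
   With faulty != 0 the stack pointer is left unchanged, which is the
   misbehaviour described in the report. */
static void isr_entry(avr_model *m, uint16_t ret_addr, int faulty)
{
    uint16_t sp = m->sp;
    assert(sp >= 1 && sp < RAM_SIZE);
    m->ram[sp]     = (uint8_t)(ret_addr & 0xff);   /* low byte  */
    m->ram[sp - 1] = (uint8_t)(ret_addr >> 8);     /* high byte */
    if (!faulty)
        m->sp = (uint16_t)(sp - 2);                /* normal post-decrement */
}

/* Return from interrupt: pop the address from SP+1 and SP+2. */
static uint16_t isr_return(avr_model *m)
{
    assert(m->sp + 2 < RAM_SIZE);
    uint8_t hi = m->ram[m->sp + 1];
    uint8_t lo = m->ram[m->sp + 2];
    m->sp = (uint16_t)(m->sp + 2);
    return (uint16_t)((hi << 8) | lo);
}
```

With sp = 0x1234 and faulty set, isr_return() reads 0x1235 and 0x1236 and
so returns whatever happened to be stored there, not the pushed address.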

We analysed this problem in an AT90CAN128 chip, an avr5 core. The error
manifests itself under very special circumstances: you need a function
that allocates stack space (most AVR code does not use large on-stack
blocks and thus the compiler keeps everything in registers) and an
interrupt that arrives after the "cli" but before the "out 0x3d,r28"
instructions, a very narrow, 7 clock cycle window.

The solution is to change the order of instructions from

out 0x3f,r0
out 0x3d,r28

to

out 0x3d,r28
out 0x3f,r0

in gcc/config/avr/avr.c in functions out_set_stack_ptr() and output_movhi().
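
To make the proposed ordering concrete, here is a small C sketch that
writes out the tail of the sequence in either order. It is NOT the code in
avr.c; emit_set_sp is a hypothetical helper invented for this illustration,
and the register choices are simply those of the fragment quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the stack pointer update sequence into buf.
   restore_sreg_last != 0 selects the ordering proposed in this report
   (SP low byte written before the status register is restored). */
static int emit_set_sp(char *buf, size_t size, int restore_sreg_last)
{
    assert(buf != NULL && size > 0);
    const char *head =
        "in r0,0x3f\n"
        "cli\n"
        "out 0x3e,r29\n";
    const char *tail = restore_sreg_last
        ? "out 0x3d,r28\nout 0x3f,r0\n"    /* proposed order */
        : "out 0x3f,r0\nout 0x3d,r28\n";   /* order currently emitted */
    return snprintf(buf, size, "%s%s", head, tail);
}
```

In the proposed order the whole stack pointer is in place before the write
to 0x3f can re-enable interrupts.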

The same bug exists in gcc version 3.4.x.

To confirm the bug you should boot the chip, then do the following:

; write some code that causes a timer interrupt to happen in
; 9 clock cycles, then:

  sei ; interrupts are enabled clock is +1
  in r24,0x3d  ; get current stack pointer clock is +3
  in r25,0x3e ; clock is +5
  in r0,0x3f ; get status, clock is +7
  cli ; disable interrupt, clock is +8
  out 0x3e,r25 ; write sp high, clock is +10, interrupt just arrives
  out 0x3f,r0 ; write status, that enables the interrupt after 1 more insn
  out 0x3d,r24 ; stack low, interrupt accepted immediately after this insn

and then in your interrupt service routine:
isr_entry:
   in r26,0x3d
   in r27,0x3e

and you will find that r26:27 and r24:25 will be the same, even though
there should be a difference of 2 bytes (the return address) between
them (we actually checked the bus activities using external RAM as stack
and we also confirmed the behaviour using the above software-only method).
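
The comparison itself is simple enough to state in C. This is a minimal
sketch assuming the two register pairs have been copied out as 16-bit
values; sp_fault_detected and expect_normal_entry are hypothetical names,
and the 2-byte return address is taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Size of the return address pushed on interrupt entry, per this report. */
enum { RET_ADDR_BYTES = 2 };

/* sp_before: value read into r25:r24 before the interrupt.
   sp_in_isr: value read into r27:r26 at the start of the ISR.
   Returns 1 if the stack pointer did not move by the expected amount. */
static int sp_fault_detected(uint16_t sp_before, uint16_t sp_in_isr)
{
    return (uint16_t)(sp_before - sp_in_isr) != RET_ADDR_BYTES;
}

/* Aborts if the ISR saw a stack pointer that was not decremented. */
static void expect_normal_entry(uint16_t sp_before, uint16_t sp_in_isr)
{
    assert(!sp_fault_detected(sp_before, sp_in_isr));
}
```

On an affected chip the two values are equal, so sp_fault_detected()
returns 1.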

We don't know whether this behaviour of the AVR core is specific to the
AT90CAN128 or whether all avr5 cores do the same; we don't have other
chips handy.

This bug is a ticking time bomb: it manifests itself under very special
circumstances that are very hard to reproduce in a production system but
that can nonetheless happen (as it did in our case).

--
           Summary: A gcc primitive, under special circumstances,  can crash
                    the AVR
           Product: gcc
           Version: 4.0.1
            Status: UNCONFIRMED
          Severity: critical
          Priority: P2
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: zoltan at bendor dot com dot au
                CC: gcc-bugs at gcc dot gnu dot org,zoltan at bendor dot com
                    dot au
  GCC host triplet: i386-elf-linux
GCC target triplet: avr-elf-unknown


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24027

[Bug target/24027] A gcc primitive, under special circumstances, can crash the AVR

cvs-commit at gcc dot gnu.org


--
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|critical                    |normal
          Component|c                           |target



[Bug target/24027] A gcc primitive, under special circumstances, can crash the AVR

cvs-commit at gcc dot gnu.org

------- Additional Comments From pinskia at gcc dot gnu dot org  2005-09-23 12:56 -------
Is there some source someone can look at?

--
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code



[Bug target/24027] A gcc primitive, under special circumstances, can crash the AVR

cvs-commit at gcc dot gnu.org

------- Additional Comments From zoltan at bendor dot com dot au  2005-09-26 07:42 -------
Subject:  A gcc primitive, under special circumstances,  can crash the AVR

 >
 > ------- Additional Comments From pinskia at gcc dot gnu dot org  2005-09-23 12:56 -------
 > Is there some source someone can look at?

Yes, I attach an assembly source that details the bug and also serves as a
self-contained test case that demonstrates the bug.

It should be noted that the bug is in the AVR core (or its incomplete
documentation) and *not* in gcc, but gcc can very easily work around
the bug.

Best Regards,

Zoltan


------- Additional Comments From zoltan at bendor dot com dot au  2005-09-26 07:42 -------
Created an attachment (id=9806)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=9806&action=view)



[Bug target/24027] A gcc primitive, under special circumstances, can crash the AVR

cvs-commit at gcc dot gnu.org

------- Additional Comments From zoltan at bendor dot com dot au  2005-09-27 11:30 -------
Subject:  A gcc primitive, under special circumstances,  can crash the AVR

Additional comment:

The bug can be retired. Atmel confirmed (and the latest AT90CAN128
manual lists it in the errata) that this is a chip issue.
The behaviour reported here is just one manifestation of the chip bug.
In general, the AT90CAN128 works erroneously if the stack is located in
external SRAM and there is an already pending interrupt when
interrupts get enabled. The only known workaround is to keep
the stack in the internal SRAM. The bug does not affect other chips
with avr5 core.

Therefore, gcc can be left unchanged, as the bug is a chip error that
the compiler cannot work around.

Zoltan

