[Bug c/24027] New: A gcc primitive, under special circumstances, can crash the AVR
When the compiler needs to allocate stack space for a function,
it uses the following assembly fragment (commented by me):
in r28,0x3d ; get stack pointer low
in r29,0x3e ; get stack pointer high
sbiw r28,N ; decrement value by N
in r0,0x3f ; get status register
cli ; disable interrupts
out 0x3e,r29 ; write new stack pointer high
out 0x3f,r0 ; restore status register, interrupts are re-enabled only after one more insn
out 0x3d,r28 ; write new stack pointer low, interrupts are still disabled
Unfortunately, there is an AVR feature that the chip manual does not
mention, but that we could confirm with a logic analyser slapped onto
If an interrupt arrives just after the cli but before the out 0x3f,r0,
then the AVR does execute the out 0x3d,r28 before the interrupt is
accepted. However, it does NOT decrement the stack pointer after pushing
the return address to the stack.
That is, if the content of r29:r28 that is written to the SP is 0x1234,
then the return address will be pushed to locations 0x1234 and 0x1233,
but the stack pointer value at the start of the interrupt routine will
be 0x1234 instead of 0x1232. This naturally causes the interrupt to fetch
its return address from 0x1235 and 0x1236, locations that contain any
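
The address arithmetic above can be replayed on a host machine with a toy
model. This is only a sketch of the behaviour described in this report,
not of the real silicon: the type avr_model, the functions
enter_interrupt() and return_from_interrupt(), and the "faulty" flag are
names made up for illustration, and the byte order of the stored return
address is a modelling assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model (hypothetical): 64 KiB of data space and a 16-bit SP. */
typedef struct {
    uint8_t  ram[0x10000];
    uint16_t sp;
} avr_model;

/* Interrupt entry: store the 2-byte return address at SP and SP-1.
 * With faulty != 0 the bytes are written but SP is NOT decremented,
 * which is the behaviour described in the report. */
static void enter_interrupt(avr_model *m, uint16_t ret_addr, int faulty)
{
    assert(m != NULL);
    m->ram[m->sp] = (uint8_t)(ret_addr & 0xFF);        /* assumed: low byte first */
    m->ram[(uint16_t)(m->sp - 1)] = (uint8_t)(ret_addr >> 8);
    if (!faulty)
        m->sp = (uint16_t)(m->sp - 2);
}

/* Return from interrupt: fetch two bytes from SP+1 and SP+2,
 * then leave SP two bytes higher. */
static uint16_t return_from_interrupt(avr_model *m)
{
    assert(m != NULL);
    uint8_t hi = m->ram[(uint16_t)(m->sp + 1)];
    uint8_t lo = m->ram[(uint16_t)(m->sp + 2)];
    m->sp = (uint16_t)(m->sp + 2);
    return (uint16_t)((hi << 8) | lo);
}
```

With SP = 0x1234, a normal entry leaves SP at 0x1232 and the return
fetches the stored address back from 0x1233 and 0x1234. With faulty set,
SP stays at 0x1234 and the return reads whatever happens to be at 0x1235
and 0x1236.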
We analysed this problem in an AT90CAN128 chip, an avr5 core. The error
manifests itself under very special circumstances: you need a function
that allocates stack space (most AVR code does not use large on-stack
blocks and thus the compiler keeps everything in registers) and an
interrupt that arrives after the "cli" but before the "out 0x3d,r28"
instructions, a very narrow, 7 clock cycle window.
The solution is to change the order of instructions from
in gcc/config/avr/avr.c in functions out_set_stack_ptr() and output_movhi().
The same bug exists in gcc version 3.4.x.
To confirm the bug you should boot the chip, then do the following:
; write some code that causes a timer interrupt to happen in
; 9 clock cycles, then:
sei ; interrupts are enabled clock is +1
in r24,0x3d ; get current stack pointer clock is +3
in r25,0x3e ; clock is +5
in r0,0x3f ; get status, clock is +7
cli ; disable interrupt, clock is +8
out 0x3e,r25 ; write sp high, clock is +10, interrupt just arrives
out 0x3f,r0 ; write status, that enables the interrupt after 1 more insn
out 0x3d,r24 ; stack low, interrupt accepted immediately after this insn
and then in your interrupt service routine:
and you will find that r26:27 and r24:25 will be the same, even though
there should be a difference of 2 bytes (the return address) between
them (we actually checked the bus activities using external RAM as stack
and we also confirmed the behaviour using the above software-only method).
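
The pass/fail criterion of this test can be written down as a small
host-side check, for example when evaluating values dumped from the
target. This is a sketch only: the function names are hypothetical, and
it assumes the interrupt routine samples SP before pushing anything
itself (the routine's code is not shown in this report).

```c
#include <assert.h>
#include <stdint.h>

/* The interrupt entry should consume exactly the 2-byte return address. */
enum { RETURN_ADDRESS_BYTES = 2 };

/* sp_before:  SP sampled in the main code (r25:r24 in the test above).
 * sp_in_isr:  SP sampled at the start of the ISR (r27:r26).
 * Returns 1 when the difference is not the expected 2 bytes. */
static int isr_entry_sp_is_wrong(uint16_t sp_before, uint16_t sp_in_isr)
{
    uint16_t consumed = (uint16_t)(sp_before - sp_in_isr);
    return consumed != RETURN_ADDRESS_BYTES;
}

/* Abort (via assert) if a sampled pair shows the faulty entry. */
static void require_sane_isr_entry(uint16_t sp_before, uint16_t sp_in_isr)
{
    assert(!isr_entry_sp_is_wrong(sp_before, sp_in_isr));
}
```

Identical values in both register pairs, as observed here, make
isr_entry_sp_is_wrong() return 1.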
We don't know whether this behaviour of the AVR core is specific to the
AT90CAN128 or whether all avr5 cores do the same; we don't have other chips
This bug is a ticking timebomb: it manifests itself under very special
circumstances that are very hard to reproduce in a production system but
that can nonetheless happen (as it did in our case).
Summary: A gcc primitive, under special circumstances, can crash the AVR
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: zoltan at bendor dot com dot au
CC: gcc-bugs at gcc dot gnu dot org,zoltan at bendor dot com
GCC host triplet: i386-elf-linux
GCC target triplet: avr-elf-unknown
[Bug target/24027] A gcc primitive, under special circumstances, can crash the AVR
------- Additional Comments From zoltan at bendor dot com dot au 2005-09-27 11:30 -------
Subject: A gcc primitive, under special circumstances, can crash the AVR
The bug can be retired. Atmel confirmed (and the latest AT90CAN128
manual lists it in the errata) that this is a chip issue.
The reported behaviour is just one manifestation of the chip bug.
In general, the AT90CAN128 works erroneously if the stack is located in
external SRAM and there is an already pending interrupt when
interrupts get enabled. The only known workaround is to keep
the stack in the internal SRAM. The bug does not affect other chips
with avr5 core.
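
As an illustration of the workaround, a build-time or host-side sanity
check could verify that the configured stack lies entirely in internal
SRAM. This is a hedged sketch: the function name is hypothetical, and the
bounds 0x0100..0x10FF are my assumption about the AT90CAN128 internal
SRAM range, not something stated in this report, so check them against
the datasheet before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED internal SRAM bounds for the AT90CAN128 (verify in datasheet). */
#define INTERNAL_SRAM_START 0x0100u
#define INTERNAL_SRAM_END   0x10FFu

/* stack_top:  initial SP value (highest address used by the stack).
 * stack_size: number of bytes reserved below and including stack_top.
 * Returns 1 if the whole stack region is inside internal SRAM. */
static int stack_in_internal_sram(uint16_t stack_top, uint16_t stack_size)
{
    assert(stack_size > 0);
    if (stack_top < INTERNAL_SRAM_START || stack_top > INTERNAL_SRAM_END)
        return 0;
    /* Bytes available from the start of internal SRAM up to stack_top. */
    uint32_t available = (uint32_t)stack_top - INTERNAL_SRAM_START + 1u;
    return available >= stack_size;
}
```

A stack whose top is in external SRAM (for example 0x7FFF) fails this
check regardless of its size.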
Therefore, gcc can be left unchanged, as the bug is a chip error that
the compiler cannot work around.