Traps for signed arithmetic overflow

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

Traps for signed arithmetic overflow

Helmut Eller
Hello,

when compiling this example with gcc -O2 -ftrapv:

  long foo (long x, long y) { return x + y; }
 
  long bar (long x, long y) {
    long z;
    if (__builtin_add_overflow (x, y, &z))
      __builtin_trap ();
    return z;
  }

then GCC seems to produce less efficient code for foo than for bar:

  foo:
        subq    $8, %rsp
        call    __addvdi3@PLT
        addq    $8, %rsp
        ret

  bar:
        movq    %rdi, %rax
        addq    %rsi, %rax
        jo      .L9
        rep ret
  .L9:
        ud2

I see several inefficiencies:

1.) __addvdi3 is not inlined.

2.) %rsp is adjusted before calling __addvdi3.  Why is that needed?

3.) Obviously __addvdi3 is not implemented as sibling-call even though
    -O2 should enable that.

Where should I start, if I wanted to teach GCC how to produce the same
code for foo as for bar?  Would it be enough to add a pattern to
i386.md?  There is already a pattern for "addv<mode>4", but apparently
it's not used in this case.

Helmut

Reply | Threaded
Open this post in threaded view
|

Re: Traps for signed arithmetic overflow

Marc Glisse-6
On Fri, 23 Nov 2018, Helmut Eller wrote:

> Hello,
>
> when compiling this example with gcc -O2 -ftrapv:
>
>  long foo (long x, long y) { return x + y; }
>
>  long bar (long x, long y) {
>    long z;
>    if (__builtin_add_overflow (x, y, &z))
>      __builtin_trap ();
>    return z;
>  }
>
> then GCC seems to produce less efficient code for foo than for bar:
>
>  foo:
>        subq    $8, %rsp
>        call    __addvdi3@PLT
>        addq    $8, %rsp
>        ret
>
>  bar:
>        movq    %rdi, %rax
>        addq    %rsi, %rax
>        jo      .L9
>        rep ret
>  .L9:
>        ud2
>
> I see several inefficiencies:
>
> 1.) __addvdi3 is not inlined.
>
> 2.) %rsp is adjusted before calling __addvdi3.  Why is that needed?
>
> 3.) Obviously __addvdi3 is not implemented as sibling-call even though
>    -O2 should enable that.
>
> Where should I start, if I wanted to teach GCC how to produce the same
> code for foo as for bar?  Would it be enough to add a pattern to
> i386.md?  There is already a pattern for "addv<mode>4", but apparently
> it's not used in this case.

Try with "-fsanitize=undefined -fsanitize-undefined-trap-on-error".
-ftrapv is quite broken and not really maintained. I think fixing it would
mean making it an alias for the sanitizer version.

--
Marc Glisse
Reply | Threaded
Open this post in threaded view
|

Re: Traps for signed arithmetic overflow

Segher Boessenkool
In reply to this post by Helmut Eller
Hi!

On Fri, Nov 23, 2018 at 09:01:56PM +0100, Helmut Eller wrote:

> when compiling this example with gcc -O2 -ftrapv:
>
>   long foo (long x, long y) { return x + y; }
>  
>   long bar (long x, long y) {
>     long z;
>     if (__builtin_add_overflow (x, y, &z))
>       __builtin_trap ();
>     return z;
>   }
>
> then GCC seems to produce less efficient code for foo than for bar:
>
>   foo:
>         subq    $8, %rsp
>         call    __addvdi3@PLT
>         addq    $8, %rsp
>         ret
>
>   bar:
>         movq    %rdi, %rax
>         addq    %rsi, %rax
>         jo      .L9
>         rep ret
>   .L9:
>         ud2
>
> I see several inefficiencies:
>
> 1.) __addvdi3 is not inlined.

It is implemented in libgcc.  The x86 target code does not handle addvdi3,
only addvdi4 (3 calls abort, 4 jumps to its 4th arg).

> 2.) %rsp is adjusted before calling __addvdi3.  Why is that needed?

To keep the stack aligned (to 16 bytes).

> 3.) Obviously __addvdi3 is not implemented as sibling-call even though
>     -O2 should enable that.

It calls via the PLT, do sibling calls via the PLT work in your ABI?

> Where should I start, if I wanted to teach GCC how to produce the same
> code for foo as for bar?  Would it be enough to add a pattern to
> i386.md?  There is already a pattern for "addv<mode>4", but apparently
> it's not used in this case.

As Marc says, -ftrapv is probably not the way to go.

Adding an addv<mode>3 to the i386 backend might help.

You do *not* want exactly the same code, btw; addv3 calls abort on
overflow, that's not the same as executing an ud2 instruction.


Segher
Reply | Threaded
Open this post in threaded view
|

Re: Traps for signed arithmetic overflow

Helmut Eller
In reply to this post by Marc Glisse-6
On Fri, Nov 23 2018, Marc Glisse wrote:

> On Fri, 23 Nov 2018, Helmut Eller wrote:
>
>> when compiling this example with gcc -O2 -ftrapv:
>> [...]
> Try with "-fsanitize=undefined
> -fsanitize-undefined-trap-on-error". -ftrapv is quite broken and not
> really maintained. I think fixing it would mean making it an alias for
> the sanitizer version.

Thanks for the quick reply.  That does indeed produce exactly the code I
want.

It doesn't seem so detect overflow for short or char types, though:

  typedef signed short si;
 
  si foo (si x, si y) { return x + y; }
 
  si bar (si x, si y) {
    si z;
    if (__builtin_add_overflow (x, y, &z))
      __builtin_trap ();
    return z;
  }

compiled with either -O2 -fsanitize=signed-integer-overflow
-fsanitize-undefined-trap-on-error or -ftrapv produces:

  foo:
  leal (%rsi,%rdi), %eax
  ret
 
  bar:
  movswl %di, %eax
  movswl %si, %edi
  addw %di, %ax
  jo .L8
  rep ret
  .L8:
  ud2

For __int128 it seems to be detected again, but the code seems very
long-winded.

Helmut
Reply | Threaded
Open this post in threaded view
|

Re: Traps for signed arithmetic overflow

Marc Glisse-6
On Fri, 23 Nov 2018, Helmut Eller wrote:

> On Fri, Nov 23 2018, Marc Glisse wrote:
>
>> On Fri, 23 Nov 2018, Helmut Eller wrote:
>>
>>> when compiling this example with gcc -O2 -ftrapv:
>>> [...]
>> Try with "-fsanitize=undefined
>> -fsanitize-undefined-trap-on-error". -ftrapv is quite broken and not
>> really maintained. I think fixing it would mean making it an alias for
>> the sanitizer version.
>
> Thanks for the quick reply.  That does indeed produce exactly the code I
> want.
>
> It doesn't seem so detect overflow for short or char types, though:

That's because there is no overflow. Read about integer promotion.

> For __int128 it seems to be detected again, but the code seems very
> long-winded.

Feel free to search for an existing report in gcc's bugzilla and file a
new one if you can't find it, if there is a missed optimization.

--
Marc Glisse
Reply | Threaded
Open this post in threaded view
|

Re: Traps for signed arithmetic overflow

Helmut Eller
In reply to this post by Segher Boessenkool
On Fri, Nov 23 2018, Segher Boessenkool wrote:

> On Fri, Nov 23, 2018 at 09:01:56PM +0100, Helmut Eller wrote:
>> 2.) %rsp is adjusted before calling __addvdi3.  Why is that needed?
>
> To keep the stack aligned (to 16 bytes).

Hmm...  I guess %rsp is initially 16 byte aligned and we need to add 8
because the call instruction also pushes the return address.

>> 3.) Obviously __addvdi3 is not implemented as sibling-call even though
>>     -O2 should enable that.
>
> It calls via the PLT, do sibling calls via the PLT work in your ABI?

I think so.  At least

  extern long f (long x, long y);
  long g (long x, long y) { return f(x, y); }

is compiled to:

  g:
        jmp f@PLT

>> Where should I start, if I wanted to teach GCC how to produce the same
>> code for foo as for bar?  Would it be enough to add a pattern to
>> i386.md?  There is already a pattern for "addv<mode>4", but apparently
>> it's not used in this case.
>
> As Marc says, -ftrapv is probably not the way to go.
>
> Adding an addv<mode>3 to the i386 backend might help.

Maybe I'll try it; out of interest.

> You do *not* want exactly the same code, btw; addv3 calls abort on
> overflow, that's not the same as executing an ud2 instruction.

Hmm, yes.  I would prefer ud2, as that is more compact.

Helmut
Reply | Threaded
Open this post in threaded view
|

Re: Traps for signed arithmetic overflow

Liu Hao-2
In reply to this post by Marc Glisse-6
在 2018/11/24 5:36, Marc Glisse 写道:

> On Fri, 23 Nov 2018, Helmut Eller wrote:
>
>> On Fri, Nov 23 2018, Marc Glisse wrote:
>>
>>> On Fri, 23 Nov 2018, Helmut Eller wrote:
>>>
>>>> when compiling this example with gcc -O2 -ftrapv:
>>>> [...]
>>> Try with "-fsanitize=undefined
>>> -fsanitize-undefined-trap-on-error". -ftrapv is quite broken and not
>>> really maintained. I think fixing it would mean making it an alias for
>>> the sanitizer version.
>>
>> Thanks for the quick reply.  That does indeed produce exactly the code I
>> want.
>>
>> It doesn't seem so detect overflow for short or char types, though:
>
> That's because there is no overflow. Read about integer promotion.

I think I had better say something about this.

You are right; there is no overflow. But who needs such apparently
deficient (if not incomplete at all) behavior? People could start
arguing that integer promotion is mandated by the C standard and is what
'portable' code might rely on. Yes you are right, totally right about a
_very_ poor excuse: The so-called 'portable' code should not rely on the `int`
-> `short` conversion either, which is implementation-defined and not so
portable after all. It is a shame for C to have integer promotion. It is
a flaw for UBSan to discriminate overflows on `short`.


--
Best regards,
LH_Mouse
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] Traps for signed arithmetic overflow

Roman Lebedev
On Sat, Nov 24, 2018 at 4:34 PM Liu Hao via cfe-dev
<[hidden email]> wrote:

>
> 在 2018/11/24 5:36, Marc Glisse 写道:
> > On Fri, 23 Nov 2018, Helmut Eller wrote:
> >
> >> On Fri, Nov 23 2018, Marc Glisse wrote:
> >>
> >>> On Fri, 23 Nov 2018, Helmut Eller wrote:
> >>>
> >>>> when compiling this example with gcc -O2 -ftrapv:
> >>>> [...]
> >>> Try with "-fsanitize=undefined
> >>> -fsanitize-undefined-trap-on-error". -ftrapv is quite broken and not
> >>> really maintained. I think fixing it would mean making it an alias for
> >>> the sanitizer version.
> >>
> >> Thanks for the quick reply.  That does indeed produce exactly the code I
> >> want.
> >>
> >> It doesn't seem so detect overflow for short or char types, though:
> >
> > That's because there is no overflow. Read about integer promotion.
>
> I think I had better say something about this.
>
> You are right; there is no overflow. But who needs such apparently
> deficient (if not incomplete at all) behavior? People could start
> arguing that integer promotion is mandated by the C standard and is what
> 'portable' code might rely on. Yes you are right, totally right about a
> _very_ poor excuse: The so-called 'portable' code should not rely on the `int`
> -> `short` conversion either, which is implementation-defined and not so
> portable after all. It is a shame for C to have integer promotion.

> It is a flaw for UBSan to discriminate overflows on `short`.
It is not an ubsan flaw.
There is no overflow happening, therefore it has nothing to report.

First, the sub-int types are promoted to int-sized types, then
those promoted values are added (and this addition is what the ubsan sanitizes),
and then the result is implicitly cast back to the original type.

Naturally, sometimes that is lossy. That is indeed an issue.
There are several bugs about this:
https://bugs.llvm.org/show_bug.cgi?id=21530
https://github.com/google/sanitizers/issues/940

If you look at the release notes, you may see that clang-7 has
gained -fsanitize=implicit-conversion (https://reviews.llvm.org/D48958),
which *does* detect just *this* kind of issues. Yay.

TBN, the version in clang-7 is *very* partial (only detects lossy
truncations), so if you want to use that,
you should use trunk, in which more cases are now detected - sign
changes (https://reviews.llvm.org/D50250)
compound assignment operations (https://reviews.llvm.org/D53949).

Though trunk still does not catch everything - i still need to look
into unary inc/dec, bitfields; atomics.

> --
> Best regards,
> LH_Mouse
Roman.

> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Traps for signed arithmetic overflow

Segher Boessenkool
In reply to this post by Liu Hao-2
On Sat, Nov 24, 2018 at 09:33:20PM +0800, Liu Hao wrote:

> 在 2018/11/24 5:36, Marc Glisse 写道:
> > On Fri, 23 Nov 2018, Helmut Eller wrote:
> >> On Fri, Nov 23 2018, Marc Glisse wrote:
> >>> On Fri, 23 Nov 2018, Helmut Eller wrote:
> >>>> when compiling this example with gcc -O2 -ftrapv:
> >>>> [...]
> >>> Try with "-fsanitize=undefined
> >>> -fsanitize-undefined-trap-on-error". -ftrapv is quite broken and not
> >>> really maintained. I think fixing it would mean making it an alias for
> >>> the sanitizer version.
> >>
> >> Thanks for the quick reply.  That does indeed produce exactly the code I
> >> want.
> >>
> >> It doesn't seem so detect overflow for short or char types, though:
> >
> > That's because there is no overflow. Read about integer promotion.
>
> I think I had better say something about this.
>
> You are right; there is no overflow. But who needs such apparently
> deficient (if not incomplete at all) behavior? People could start
> arguing that integer promotion is mandated by the C standard and is what
> 'portable' code might rely on. Yes you are right, totally right about a
> _very_ poor excuse: The so-called 'portable' code should not rely on the `int`
> -> `short` conversion either, which is implementation-defined and not so
> portable after all. It is a shame for C to have integer promotion. It is
> a flaw for UBSan to discriminate overflows on `short`.

-fsanitize=undefined instruments undefined behaviour.  This isn't undefined
behaviour.  Also, both -fsanitize=signed-integer-overflow and -ftrapv are
documented to only do things for addition, subtraction, and multiplication
(not conversion).

If you want a warning for implementation-defined behaviour, sure, not many
people will oppose that (it will warn all of the time, making it not very
useful, but hey).  Still, it should be a separate option.  Implementation-
defined behaviour is not undefined, after all.


Segher
Reply | Threaded
Open this post in threaded view
|

Re: Traps for signed arithmetic overflow

Vincent Lefevre-3
On 2018-11-24 08:26:39 -0600, Segher Boessenkool wrote:
> -fsanitize=undefined instruments undefined behaviour.  This isn't undefined
> behaviour.  Also, both -fsanitize=signed-integer-overflow and -ftrapv are
> documented to only do things for addition, subtraction, and multiplication
> (not conversion).
>
> If you want a warning for implementation-defined behaviour, sure, not many
> people will oppose that (it will warn all of the time, making it not very
> useful, but hey).  Still, it should be a separate option.  Implementation-
> defined behaviour is not undefined, after all.

It is a design flaw in GCC, which should have chosen the
"implementation-defined signal" solution, as allowed by the
C standard. This would be much more secure.

--
Vincent Lefèvre <[hidden email]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
Reply | Threaded
Open this post in threaded view
|

Re: Traps for signed arithmetic overflow

Jonathan Wakely-4
On Mon, 26 Nov 2018 at 13:52, Vincent Lefevre <[hidden email]> wrote:

>
> On 2018-11-24 08:26:39 -0600, Segher Boessenkool wrote:
> > -fsanitize=undefined instruments undefined behaviour.  This isn't undefined
> > behaviour.  Also, both -fsanitize=signed-integer-overflow and -ftrapv are
> > documented to only do things for addition, subtraction, and multiplication
> > (not conversion).
> >
> > If you want a warning for implementation-defined behaviour, sure, not many
> > people will oppose that (it will warn all of the time, making it not very
> > useful, but hey).  Still, it should be a separate option.  Implementation-
> > defined behaviour is not undefined, after all.
>
> It is a design flaw in GCC, which should have chosen the
> "implementation-defined signal" solution, as allowed by the
> C standard. This would be much more secure.

As allowed by C99 and later, which was after GCC chose it's
implementation-defined behaviour for those conversions. And raising a
signal here would not be appreciated by all developers.

Adding a conversion sanitizer seems like a good solution, as it allows
optional checking, when the developer requests it. The
implementation-defined behaviour doesn't need to raise a signal that
way.