Improving GCC's line table information to help GDB


Luis Machado
Hi,

I'd like to get some feedback from the compiler's side before
implementing a fix for this line numbering problem. I also want to make
sure I fix it in the right tool.

This is related to this bug report in GDB's bugzilla:
https://sourceware.org/bugzilla/show_bug.cgi?id=21221

It deals with loops that have empty bodies, empty headers (for loops),
or that were simply written on a single line. In each of these cases GCC
fails to emit a line transition somewhere in the loop. As a consequence,
GDB won't see a line transition and will keep stepping (step/next)
until it sees one.

To end users it appears GDB is stuck in a particular loop, and most of
them hit Ctrl-C to interrupt it. In reality GDB is making progress in
the loop, but it will only stop once execution leaves the loop, where
it will finally see a line transition.

For the sake of reducing the scope of the problem, I'll assume the loops
are written across multiple lines and that we're interested in -O0
debugging. Higher optimization levels would probably reshape the loop or
reduce it to a single instruction in some cases.

Take for example the case of BZ #21221...

int main (void)
{
   while (1)
   {
5    for (unsigned int i = 0U; i < 0xFFFFFU; i++)
6    {
7       ;
8    }
   }
}

GCC generates the following code:

    0x00000000000005fa <+0>: push   %rbp
    0x00000000000005fb <+1>: mov    %rsp,%rbp
    0x00000000000005fe <+4>: movl   $0x0,-0x4(%rbp)
    0x0000000000000605 <+11>: jmp    0x60b <main+17>
    0x0000000000000607 <+13>: addl   $0x1,-0x4(%rbp)
    0x000000000000060b <+17>: cmpl   $0xffffe,-0x4(%rbp)
    0x0000000000000612 <+24>: jbe    0x607 <main+13>
    0x0000000000000614 <+26>: jmp    0x5fe <main+4>

And the line table looks like this:

  Line Number Statements:
   [0x00000047]  Extended opcode 2: set Address to 0x5fa
   [0x00000052]  Special opcode 6: advance Address by 0 to 0x5fa and Line by 1 to 2
   [0x00000053]  Special opcode 64: advance Address by 4 to 0x5fe and Line by 3 to 5
   [0x00000054]  Extended opcode 4: set Discriminator to 3
   [0x00000058]  Set is_stmt to 0
   [0x00000059]  Special opcode 131: advance Address by 9 to 0x607 and Line by 0 to 5
   [0x0000005a]  Extended opcode 4: set Discriminator to 1
   [0x0000005e]  Special opcode 61: advance Address by 4 to 0x60b and Line by 0 to 5
   [0x0000005f]  Special opcode 131: advance Address by 9 to 0x614 and Line by 0 to 5
   [0x00000060]  Advance PC by 2 to 0x616
   [0x00000062]  Extended opcode 1: End of Sequence

GCC doesn't generate any code or line number transitions for the empty
loop body, therefore GDB keeps cycling inside this loop, in line 5.
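
The "Special opcode N" rows in the table above can be checked by hand.
The sketch below decodes them with the DWARF special-opcode formula. It
assumes line_base = -5, line_range = 14 and a minimum instruction length
of 1. The line program header is not shown in this message, so those
values are an assumption, picked because they reproduce the quoted
numbers. It also assumes N is the opcode with opcode_base already
subtracted, which is how readelf appears to print it. The names
line_header, advance, decode_special and assumed_header are made up for
illustration.

```c
#include <assert.h>

/* Parameters from the line number program header. */
struct line_header {
    int line_base;        /* smallest line delta a special opcode encodes */
    int line_range;       /* number of distinct line deltas */
    int min_inst_length;  /* unit of address advance, in bytes */
};

struct advance {
    int addr;  /* bytes added to the address register */
    int line;  /* signed delta added to the line register */
};

/* Decode an "adjusted" special opcode (opcode_base already subtracted). */
static struct advance decode_special(const struct line_header *h, int adjusted)
{
    assert(h->line_range > 0 && adjusted >= 0);
    struct advance a;
    a.addr = (adjusted / h->line_range) * h->min_inst_length;
    a.line = h->line_base + adjusted % h->line_range;
    return a;
}

/* ASSUMED header values, not taken from the dump above. */
static const struct line_header assumed_header = { -5, 14, 1 };
```

For instance, decode_special(&assumed_header, 64) gives an address
advance of 4 and a line advance of 3, matching the row that moves to
0x5fe and line 5. Opcodes 131 and 61 both give a line advance of 0,
which is why every row after that stays on line 5.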

Clang, on the other hand, seems to be a bit smarter about this and will
generate a dummy jump to help the debugger.

Here's Clang's code:

    0x00000000004004a0 <+0>: push   %rbp
    0x00000000004004a1 <+1>: mov    %rsp,%rbp
    0x00000000004004a4 <+4>: movl   $0x0,-0x4(%rbp)
    0x00000000004004ab <+11>: movl   $0x0,-0x8(%rbp)
    0x00000000004004b2 <+18>: cmpl   $0xfffff,-0x8(%rbp)
    0x00000000004004b9 <+25>: jae    0x4004d2 <main+50>
X   0x00000000004004bf <+31>: jmpq   0x4004c4 <main+36>
X   0x00000000004004c4 <+36>: mov    -0x8(%rbp),%eax
    0x00000000004004c7 <+39>: add    $0x1,%eax
    0x00000000004004ca <+42>: mov    %eax,-0x8(%rbp)
    0x00000000004004cd <+45>: jmpq   0x4004b2 <main+18>
    0x00000000004004d2 <+50>: jmpq   0x4004ab <main+11>

X marks the spot where a dummy jump was inserted to aid the debugger.
The line table looks like this:

  Line Number Statements:
   [0x00000070]  Extended opcode 2: set Address to 0x4004a0
   [0x0000007b]  Special opcode 6: advance Address by 0 to 0x4004a0 and Line by 1 to 2
   [0x0000007c]  Set column to 23
   [0x0000007e]  Set prologue_end to true
   [0x0000007f]  Special opcode 162: advance Address by 11 to 0x4004ab and Line by 3 to 5
   [0x00000080]  Set column to 33
   [0x00000082]  Set is_stmt to 0
   [0x00000083]  Special opcode 103: advance Address by 7 to 0x4004b2 and Line by 0 to 5
   [0x00000084]  Set column to 5
   [0x00000086]  Set is_stmt to 1
   [0x00000087]  Special opcode 103: advance Address by 7 to 0x4004b9 and Line by 0 to 5
X  [0x00000088]  Special opcode 92: advance Address by 6 to 0x4004bf and Line by 3 to 8
X  [0x00000089]  Set column to 46
   [0x0000008b]  Special opcode 72: advance Address by 5 to 0x4004c4 and Line by -3 to 5
   [0x0000008c]  Set column to 5
   [0x0000008e]  Set is_stmt to 0
   [0x0000008f]  Special opcode 131: advance Address by 9 to 0x4004cd and Line by 0 to 5
   [0x00000090]  Set column to 3
   [0x00000092]  Set is_stmt to 1
   [0x00000093]  Special opcode 73: advance Address by 5 to 0x4004d2 and Line by -2 to 3
   [0x00000094]  Advance PC by 5 to 0x4004d7
   [0x00000096]  Extended opcode 1: End of Sequence

Again, X marks the spot where we tell the debugger there is a line
transition (from line 5 to line 8), and so step/next execution should end.

I'm inclined to say we should fix this in GCC in a similar way. GDB
relies on the line table information since it can't correctly tell when
we have transitioned to a new source line by looking just at the
instruction stream.
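
To make the difference concrete, here is a rough model of what a
debugger does with the two line tables quoted above. It is only a
sketch: GDB's real stepping logic is considerably more involved, and
this model ignores is_stmt, columns and discriminators. It assumes the
inner loop condition stays true for the whole simulation, so each
conditional branch has a single successor. The names row, insn,
program and step_line are made up for illustration; the addresses and
line numbers are transcribed from the dumps.

```c
#include <assert.h>
#include <stddef.h>

/* A line-table row: the address where it starts and its source line. */
struct row { unsigned long addr; int line; };

/* An instruction and the address executed after it (ASSUMES the inner
   loop keeps iterating, so branches always go the same way). */
struct insn { unsigned long addr; unsigned long next; };

struct program {
    const struct row *rows;   size_t nrows;
    const struct insn *insns; size_t ninsns;
};

#define LEN(a) (sizeof(a) / sizeof((a)[0]))

static const struct row gcc_rows[] = {
    {0x5fa, 2}, {0x5fe, 5}, {0x607, 5}, {0x60b, 5}, {0x614, 5},
};
static const struct insn gcc_insns[] = {
    {0x5fa, 0x5fb}, {0x5fb, 0x5fe}, {0x5fe, 0x605}, {0x605, 0x60b},
    {0x607, 0x60b}, {0x60b, 0x612}, {0x612, 0x607}, {0x614, 0x5fe},
};
static const struct program gcc_prog = {
    gcc_rows, LEN(gcc_rows), gcc_insns, LEN(gcc_insns)
};

static const struct row clang_rows[] = {
    {0x4004a0, 2}, {0x4004ab, 5}, {0x4004b2, 5}, {0x4004b9, 5},
    {0x4004bf, 8}, {0x4004c4, 5}, {0x4004cd, 5}, {0x4004d2, 3},
};
static const struct insn clang_insns[] = {
    {0x4004a0, 0x4004a1}, {0x4004a1, 0x4004a4}, {0x4004a4, 0x4004ab},
    {0x4004ab, 0x4004b2}, {0x4004b2, 0x4004b9}, {0x4004b9, 0x4004bf},
    {0x4004bf, 0x4004c4}, {0x4004c4, 0x4004c7}, {0x4004c7, 0x4004ca},
    {0x4004ca, 0x4004cd}, {0x4004cd, 0x4004b2}, {0x4004d2, 0x4004ab},
};
static const struct program clang_prog = {
    clang_rows, LEN(clang_rows), clang_insns, LEN(clang_insns)
};

/* Line of the row starting exactly at pc, or 0 if no row starts there. */
static int row_line_at(const struct program *p, unsigned long pc)
{
    for (size_t i = 0; i < p->nrows; i++)
        if (p->rows[i].addr == pc)
            return p->rows[i].line;
    return 0;
}

/* Address executed after the instruction at pc. */
static unsigned long next_pc(const struct program *p, unsigned long pc)
{
    for (size_t i = 0; i < p->ninsns; i++)
        if (p->insns[i].addr == pc)
            return p->insns[i].next;
    assert(!"pc outside the modelled function");
    return 0;
}

/* Single-step from pc until a row for a different line begins.
   Returns that line, or 0 if max_steps instructions ran without one. */
static int step_line(const struct program *p, unsigned long pc, int max_steps)
{
    int start_line = row_line_at(p, pc);
    for (int i = 0; i < max_steps; i++) {
        pc = next_pc(p, pc);
        int line = row_line_at(p, pc);
        if (line != 0 && line != start_line)
            return line;
    }
    return 0;
}
```

In this model, step_line(&gcc_prog, 0x5fe, 1000) exhausts its budget and
returns 0, because every row inside the loop maps to line 5. With
step_line(&clang_prog, 0x4004ab, 1000) the walk reaches the row at
0x4004bf and returns 8.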

My idea is to create a dummy jump (GIMPLE sounds more appropriate) with
the source location of the last line of the loop body (in this case line
number 8). That would trigger the creation of a new line table entry,
making GDB happy.

Is there a better way to force the compiler to output such a line table
transition without having to resort to a dummy jump? Is there a safer
way to add such transitions without worrying about the optimizer getting
rid of them later on? Should we even worry about preserving such
information for higher optimization levels?

I'll also need a way to store the source location of the last line of
the loop body, since closing braces and friends are ignored by GCC for
code generation purposes. We just consume those tokens without a second
thought.

There are other interesting variations, like the following:

int main(void)
{
int var = 0;

   for (;;)
   {
7    var++;
8  }

return 0;
}

In the case above, the debugger gets stuck in line 7. With the proposed
solution it would transition to line 8 and then return to line 7.

Another case is this one:

int main (void)
{
   while (1)
   {
5    for (unsigned int i = 0U; i < 0xFFFFFU; i++)
6       ;
   }
}

Similarly, GDB gets stuck in line 5. With the proposed fix, it would
transition to line 6 before returning to line 5.

Feedback would be greatly appreciated.

Thanks,
Luis

Re: Improving GCC's line table information to help GDB

Richard Biener-2
On Tue, Oct 15, 2019 at 9:58 PM Luis Machado <[hidden email]> wrote:

> Feedback would be greatly appreciated.

I think that adding an extra jump is unwanted.  Instead - if you disregard
the single-source-line case - there's always the jump and the label we jump
to which might/should get different source locations.  Like in one of the above
cases:

main ()
{
  int D.1803;

  [t.c:2:1] {
    int var;

    [t.c:3:5] var = 0;
    <D.1801>:
    [t.c:7:8] var = var + 1;
    [t.c:7:8] goto <D.1801>;
    [t.c:10:8] D.1803 = 0;
    [t.c:10:8] return D.1803;

seen at GIMPLE.  Of course we lose the label once we build the CFG,
but we retain a goto-locus which we could then put back on the
jump statement.  For this case we at the moment get

.L2:
        .loc 1 7 0 discriminator 1
        addl    $1, -4(%rbp)
        jmp     .L2

and we could do

.L2:
        .loc 1 7 0 discriminator 1
        addl    $1, -4(%rbp)
        .loc 1 5 0
        jmp     .L2

thus assign the "destination" location to the jump instruction?

The first question is of course what happens with the edge's
goto_locus at the moment and why we get the code we get.

The above solution might also be a bit odd since for the loop
entry we'd first see line 7 and only after that line 5.  But fixing
that would mean we have to output an extra instruction
(where I'd choose a nop instead of some random extra jump).
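
A small model of the ordering issue, as a sketch only. It walks the
three instructions involved, using the line numbers from the .loc
directives in the second assembly listing above, and records where
successive "step" commands would stop if the debugger stopped on every
line change. The addresses are hypothetical offsets, and the names
insn, proposed, find and step_lines are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* An instruction: address, source line from its .loc, and successor.
   Addresses are HYPOTHETICAL; only the line numbers come from the
   assembly. */
struct insn { unsigned addr; int line; unsigned next; };

static const struct insn proposed[] = {
    {0x04, 3, 0x0b},  /* movl $0, -4(%rbp)       var = 0, line 3       */
    {0x0b, 7, 0x0f},  /* .L2: addl $1, -4(%rbp)  var++, line 7         */
    {0x0f, 5, 0x0b},  /* jmp .L2                 tagged with line 5    */
};

static const struct insn *find(unsigned addr)
{
    for (size_t i = 0; i < sizeof(proposed) / sizeof(proposed[0]); i++)
        if (proposed[i].addr == addr)
            return &proposed[i];
    return NULL;
}

/* Record the lines where up to `count` successive steps stop, treating
   every change of line as a stop. Returns how many were recorded. */
static size_t step_lines(unsigned pc, int *out, size_t count)
{
    const struct insn *cur = find(pc);
    size_t n = 0;
    for (int budget = 0; cur && n < count && budget < 1000; budget++) {
        const struct insn *nxt = find(cur->next);
        if (nxt && nxt->line != cur->line)
            out[n++] = nxt->line;
        cur = nxt;
    }
    return n;
}
```

Starting from the store at line 3, the recorded stops are 7, 5, 7, 5:
the loop body on line 7 is reported before the loop header on line 5.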

Richard.

> Thanks,
> Luis

Re: Improving GCC's line table information to help GDB

Maciej W. Rozycki-5
In reply to this post by Luis Machado
Hi Luis,

> Is there a better way to force the compiler to output such a line table
> transition without having to resort to a dummy jump? Is there a safer
> way to add such transitions without worrying about the optimizer getting
> rid of them later on? Should we even worry about preserving such
> information for higher optimization levels?

 Isn't it exactly what statement frontier notes have been invented for
(and implemented) by Alexandre (cc-ed)?  Or am I confused and/or missing
something here?

  Maciej

Re: Improving GCC's line table information to help GDB

Luis Machado
Hi Maciej,

On 10/16/19 11:11 AM, Maciej W. Rozycki wrote:

> Hi Luis,
>
>> Is there a better way to force the compiler to output such a line table
>> transition without having to resort to a dummy jump? Is there a safer
>> way to add such transitions without worrying about the optimizer getting
>> rid of them later on? Should we even worry about preserving such
>> information for higher optimization levels?
>
>   Isn't it exactly what statement frontier notes have been invented for
> (and implemented) by Alexandre (cc-ed)?  Or am I confused and/or missing
> something here?
>
>    Maciej
>

I'm fresh to this topic. I recall he was working on improving debug
information for optimized binaries, which isn't the case for this
example. But it sounds promising nonetheless.

I'll do some reading.

Thanks,
Luis

Re: Improving GCC's line table information to help GDB

Luis Machado
In reply to this post by Richard Biener-2
On 10/16/19 5:59 AM, Richard Biener wrote:

> I think that adding an extra jump is unwanted.  Instead - if you disregard
> the single-source-line case - there's always the jump and the label we jump
> to which might/should get different source locations.  Like in one of the above
> cases:
>
> main ()
> {
>    int D.1803;
>
>    [t.c:2:1] {
>      int var;
>
>      [t.c:3:5] var = 0;
>      <D.1801>:
>      [t.c:7:8] var = var + 1;
>      [t.c:7:8] goto <D.1801>;
>      [t.c:10:8] D.1803 = 0;
>      [t.c:10:8] return D.1803;
>
> seen at GIMPLE.  Of course we lose the label once we build the CFG,
> but we retain a goto-locus which we could then put back on the
> jump statement.  For this case we at the moment get
>
> .L2:
>          .loc 1 7 0 discriminator 1
>          addl    $1, -4(%rbp)
>          jmp     .L2
>
> and we could do
>
> .L2:
>          .loc 1 7 0 discriminator 1
>          addl    $1, -4(%rbp)
>          .loc 1 5 0
>          jmp     .L2
>
> thus assign the "destination" location to the jump instruction?

At first, I considered reusing the jump instruction that did not get a
location assigned to it, but that didn't work right for all cases, such
as the one you showed below with the incorrect line ordering.

>
> The first question is of course what happens with the edge's
> goto_locus at the moment and why we get the code we get.
>
> The above solution might also be a bit odd since for the loop
> entry we'd first see line 7 and only after that line 5.  But fixing
> that would mean we have to output an extra instruction
> (where I'd choose a nop instead of some random extra jump).

Right. I wanted to preserve the correct order of execution, at least
from an -O0 perspective. A nop would work just as well. I'll give this a try.

I don't think it makes sense to output additional instructions in O1+
cases just because we want to have more debug info, but we do need a new
instruction address in some cases, in order to use it in the line table
for the line transition.

Would it make sense to have it restricted to O0?

Re: Improving GCC's line table information to help GDB

Luis Machado
In reply to this post by Luis Machado


On 10/16/19 11:17 AM, Luis Machado wrote:

> Hi Maciej,
>
> I'm fresh to this topic. I recall he was working on improving debug
> information for optimized binaries, which isn't the case for this
> example. But it sounds promising nonetheless.
>
> I'll do some reading.
>
> Thanks,
> Luis

It seems, from reading the blog post about SFNs, that they were meant to
help with debugging optimized binaries.

In my case we're aiming at -O0 binaries, as optimizations would likely
transform the loop enough that it wouldn't be fixable anyway.

Re: Improving GCC's line table information to help GDB

Richard Biener-2
In reply to this post by Luis Machado
On Wed, Oct 16, 2019 at 4:55 PM Luis Machado <[hidden email]> wrote:

>
> I don't think it makes sense to output additional instructions in O1+
> cases just because we want to have more debug info, but we do need a new
> instruction address in some cases, in order to use it in the line table
> for the line transition.
>
> Would it make sense to have it restricted to O0?

Generation of an extra NOP?  Sure.  For O1+ we may want to give
the jump a different location though?

Richard.

Re: Improving GCC's line table information to help GDB

Alexandre Oliva-3
In reply to this post by Luis Machado
On Oct 16, 2019, Luis Machado <[hidden email]> wrote:

> It seems, from reading the blog post about SFN's, that it was meant to
> help with debugging optimized binaries.

Indeed.  Getting rid of the dummy jumps would be one kind of
optimization, and then SFN might help preserve some of the location
info that would otherwise be lost.  However, SFN doesn't kick in at -O0
because the dummy jumps and all other artifacts of unoptimized code are
retained anyway, so SFN wouldn't have a chance to do any of the good
it's meant to do there.

--
Alexandre Oliva, freedom fighter  he/him   https://FSFLA.org/blogs/lxo
Be the change, be Free!        FSF VP & FSF Latin America board member
GNU Toolchain Engineer                        Free Software Evangelist
Hay que enGNUrecerse, pero sin perder la terGNUra jamás - Che GNUevara