Help with bug in GCC garbage collector

Steve Ellcey-10
I was wondering if anyone could help me investigate a bug I am seeing
in the GCC garbage collector.  This bug (which may or may not be PR
89179) is causing a segfault in GCC, but when I try to create a
preprocessed source file, the bug doesn't trigger.  The problem is with
the garbage collector trying to mark some memory that has already been
freed.  I have tracked down the initial allocation to:

symbol_table::allocate_cgraph_symbol

It has:

        node = ggc_cleared_alloc<cgraph_node> ();

to allocate a cgraph node.  With the GGC debugging on I see this
allocated:

        Allocating object, requested size=360, actual=360 at 0xffff7029c210 on 0x41b148c0

then freed:

        Freeing object, actual size=360, at 0xffff7029c210 on 0x41b148c0

And then later, while the garbage collector is marking nodes, I see:

        Marking 0xffff7029c210

The garbage collector shouldn't be marking this node if it has already
been freed.

So I guess my main question is how do I figure out how the garbage
collector got to this memory location?  I am guessing some GTY pointer
is still pointing to it and hadn't got nulled out when the memory was
freed.  Does that seem like the most likely cause?
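
One way to picture the question above is a toy marker that checks each address against a log of explicit frees and remembers the chain of objects it walked through. The sketch below is only an illustration under stated assumptions: `Obj`, `toy_free`, `mark`, and `find_dangling_path` are hypothetical names, and none of this is GCC's actual GGC code. The toy free only records the address, so the demo never touches released memory.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Toy heap object: a label plus the pointer fields a marker would follow.
struct Obj {
  std::string name;
  std::vector<Obj *> fields;
  bool marked = false;

  explicit Obj(std::string n, std::vector<Obj *> f = {})
      : name(std::move(n)), fields(std::move(f)) {}
};

// Addresses that were explicitly freed; a stand-in for a log of frees.
static std::unordered_set<const Obj *> freed_set;

// Toy free: records the address only. The storage is deliberately left
// alive so the demo never dereferences released memory.
static void toy_free(Obj *o) {
  assert(o != nullptr);
  freed_set.insert(o);
}

// Depth-first mark that records the chain of objects it walked through.
// Returns true as soon as it reaches a freed address, leaving `path`
// holding the route from the root to that address.
static bool mark(Obj *o, std::vector<std::string> &path) {
  if (o == nullptr) return false;
  if (freed_set.count(o)) {  // test the address before dereferencing it
    path.push_back("<freed>");
    return true;
  }
  if (o->marked) return false;  // already visited
  o->marked = true;
  path.push_back(o->name);
  for (Obj *f : o->fields)
    if (mark(f, path)) return true;
  path.pop_back();
  return false;
}

// Builds a root that still references a freed node and returns the route
// the marker took to reach it, or an empty string if none was found.
static std::string find_dangling_path() {
  freed_set.clear();
  Obj node("node");
  Obj holder("holder", {&node});
  Obj root("root", {&holder});
  toy_free(&node);  // freed, but holder.fields[0] was never cleared

  std::vector<std::string> path;
  if (!mark(&root, path)) return "";
  std::string out;
  for (std::size_t i = 0; i < path.size(); ++i) {
    if (i != 0) out += " -> ";
    out += path[i];
  }
  return out;
}
```

In this toy model, calling `find_dangling_path()` reports the route through `holder`, which is the object whose field was never cleared when `node` was freed.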

I am not sure why I am only running into this with one particular
application on my AArch64 platform.  I am building it with -fopenmp,
which could have something to do with it (though there are no simd
functions in the application).  The application is not that large as
C++ programs go.

Steve Ellcey

Re: Help with bug in GCC garbage collector

Jeff Law
On 8/19/19 4:59 PM, Steve Ellcey wrote:

> I was wondering if anyone could help me investigate a bug I am
> seeing in the GCC garbage collector.  This bug (which may or may not
> be PR 89179) is causing a segfault in GCC, but when I try to create
> a preprocessed source file, the bug doesn't trigger.  The problem is
> with the garbage collector trying to mark some memory that has
> already been freed.  I have tracked down the initial allocation to:
>
> symbol_table::allocate_cgraph_symbol
>
> It has:
>
> node = ggc_cleared_alloc<cgraph_node> ();
>
> to allocate a cgraph node.  With the GGC debugging on I see this
> allocated:
>
> Allocating object, requested size=360, actual=360 at 0xffff7029c210
> on 0x41b148c0
>
> then freed:
>
> Freeing object, actual size=360, at 0xffff7029c210 on 0x41b148c0
>
> And then later, while the garbage collector is marking nodes, I see:
>
> Marking 0xffff7029c210
>
> The garbage collector shouldn't be marking this node if it has already
> been freed.
>
> So I guess my main question is how do I figure out how the garbage
> collector got to this memory location?  I am guessing some GTY
> pointer is still pointing to it and hadn't got nulled out when the
> memory was freed.  Does that seem like the most likely cause?
>
> I am not sure why I am only running into this with one particular
> application on my Aarch64 platform.  I am building it with -fopenmp,
> which could have something to do with it (though there are no simd
> functions in the application).  The application is not that large as
> C++ programs go.
There's a real good chance Martin Liska has already fixed this.  He's
made a couple fixes in the last week or so with the interactions between
the GC system and the symbol tables.


2019-08-15  Martin Liska  <[hidden email]>

        PR ipa/91404
        * passes.c (order): Remove.
        (uid_hash_t): Likewise.
        (remove_cgraph_node_from_order): Remove from set
        of pointers (cgraph_node *).
        (insert_cgraph_node_to_order): New.
        (duplicate_cgraph_node_to_order): New.
        (do_per_function_toporder): Register all 3 cgraph hooks.
        Skip removed_nodes now as we know about all of them.
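
The ChangeLog entry above describes keeping a set of cgraph_node pointers in step with the symbol table by registering removal, insertion, and duplication hooks. The following is a rough, self-contained sketch of that general pattern, not Martin's patch: `ToyTable`, `Node`, the hook vectors, and `order_stays_in_sync` are all hypothetical names invented for illustration and are not GCC's interfaces.

```cpp
#include <cassert>
#include <functional>
#include <unordered_set>
#include <vector>

// Toy stand-in for a call-graph node.
struct Node {
  int uid;
};

// Toy symbol table that notifies registered hooks when nodes are
// inserted, duplicated, or removed. This is not GCC's symbol_table.
class ToyTable {
 public:
  std::vector<std::function<void(Node *)>> insertion_hooks;
  std::vector<std::function<void(Node *)>> removal_hooks;
  std::vector<std::function<void(Node *, Node *)>> duplication_hooks;

  Node *insert() {
    Node *n = new Node{next_uid_++};
    for (auto &hook : insertion_hooks) hook(n);
    return n;
  }

  Node *duplicate(Node *src) {
    assert(src != nullptr);
    Node *copy = new Node{next_uid_++};
    for (auto &hook : duplication_hooks) hook(src, copy);
    return copy;
  }

  void remove(Node *n) {
    for (auto &hook : removal_hooks) hook(n);  // listeners drop n first
    delete n;
  }

 private:
  int next_uid_ = 0;
};

// Runs a small insert/duplicate/remove sequence and reports whether the
// side set ends up holding exactly the nodes that are still alive.
static bool order_stays_in_sync() {
  ToyTable table;
  std::unordered_set<Node *> order;  // side set of pointers to live nodes

  table.insertion_hooks.push_back([&](Node *n) { order.insert(n); });
  table.duplication_hooks.push_back(
      [&](Node *, Node *copy) { order.insert(copy); });
  table.removal_hooks.push_back([&](Node *n) { order.erase(n); });

  Node *a = table.insert();
  Node *b = table.insert();
  Node *c = table.duplicate(a);
  table.remove(b);  // b is erased from `order` before it is deleted

  bool ok = order.size() == 2 && order.count(a) == 1 && order.count(c) == 1;
  table.remove(a);
  table.remove(c);
  return ok && order.empty();
}
```

If only the removal hook were registered in this toy, nodes created by insertion or duplication would never enter the set, so the set and the table would drift apart; registering all three keeps the side set limited to live nodes.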


The way I'd approach this would be to configure a compiler with
--enable-checking=gc,gcac and just build it through stage1.  Then run
your test through that compiler, which should fail.  Then apply Martin's
patch (or update to the head of the trunk), rebuild the stage1 compiler
and verify it works.


jeff

Re: Help with bug in GCC garbage collector

Steve Ellcey-10
On Mon, 2019-08-19 at 17:05 -0600, Jeff Law wrote:

> There's a real good chance Martin Liska has already fixed this.  He's
> made a couple fixes in the last week or so with the interactions between
> the GC system and the symbol tables.
>
>
> 2019-08-15  Martin Liska  <[hidden email]>
>
>         PR ipa/91404
>         * passes.c (order): Remove.
>         (uid_hash_t): Likewise.
>         (remove_cgraph_node_from_order): Remove from set
>         of pointers (cgraph_node *).
>         (insert_cgraph_node_to_order): New.
>         (duplicate_cgraph_node_to_order): New.
>         (do_per_function_toporder): Register all 3 cgraph hooks.
>         Skip removed_nodes now as we know about all of them.
>
>
> The way I'd approach this would be to configure a compiler with
> --enable-checking=gc,gcac and just build it through stage1.  Then run
> your test through that compiler, which should fail.  Then apply Martin's
> patch (or update to the head of the trunk), rebuild the stage1 compiler
> and verify it works.

I had already built a compiler with --enable-checking=gc,gcac, but that
did not catch the bug (I still got a segfault).  I did update my sources,
though, and the bug does not happen at ToT, so it looks like Martin's
patch did fix my bug.

Steve Ellcey