LTO+profiled enabled builds

Matthias Klose-6
I'm running into issues building LTO+profile-enabled configurations in a
constrained build environment (the buildds), where each machine has four cores
and 16GB of RAM.

GCC is configured for all frontends (the maximum number of LTO links) with

  --enable-bootstrap \
  --with-build-config=bootstrap-lto-lean \
  --enable-link-mutex

and built with the make profiledbootstrap-lean target.

Most builds time out after 150 minutes.

A typical LTO link runs for around one minute on this hardware; however, an LTO
link with -fprofile-use runs for up to three hours.

So gcc/lock-and-run.sh runs the first LTO link, makes all the others wait for
300 seconds, then removes the "stale" lock and runs everything in parallel.
Surprisingly, that goes well: because -flto=jobserver is in effect, I don't see
any memory constraints yet.
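
A simplified illustration of that behaviour, assuming a mkdir-based mutex with
a fixed timeout. This is not the actual gcc/lock-and-run.sh; the function names
acquire_lock and release_lock are hypothetical:

```shell
#!/bin/sh
# Simplified mkdir-based mutex with a fixed "stale lock" timeout.
# Hypothetical sketch, not the real gcc/lock-and-run.sh.

# acquire_lock LOCKDIR TIMEOUT_SECONDS
# Polls once per second. Once TIMEOUT_SECONDS have passed, the waiter
# assumes the holder is dead and steals the lock, even if the holder is
# in fact still running a long link.
acquire_lock() {
    lockdir=$1
    timeout=$2
    waited=0
    until mkdir "$lockdir" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "removing stale $lockdir" >&2
            rm -rf "$lockdir"
            waited=0
        else
            sleep 1
            waited=$((waited + 1))
        fi
    done
}

# release_lock LOCKDIR
release_lock() {
    rmdir "$1" 2>/dev/null || true
}
```

With a timeout of 300 and a link that takes hours, every waiter steals the lock
after five minutes, so the mutex no longer serializes anything.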

The machine then starts building all front-ends, but apparently is not
overloaded, as -flto=jobserver is in effect.  However, there is no output, and
that triggers the timeout. Richi mentioned on IRC that the LTO links only have
buffered output (unless you run in debug mode), and that it is only emitted once
the link finishes.  However, even with unbuffered output, there could be long
stretches with nothing to print, e.g. when there are no warnings.

I'm currently experimenting with a modified lock-and-run.sh, which basically
sets the delay for releasing the "stale" locks to 30 minutes instead of 5
minutes, runs the LTO link in the background, and polls the status of the
background job, emitting "running ..." messages until it has finished.  I'm
still adjusting some parameters, but at least that succeeds on some of my
configurations.
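
The background-job approach could look roughly like the following sketch. It is
not the attached script; run_with_heartbeat and its interval argument are
hypothetical names chosen for illustration:

```shell
#!/bin/sh
# Run a command in the background and print a heartbeat while it runs,
# so that a "no output" watchdog on the build machine is not triggered.
# Hypothetical sketch, not the attached lock-and-run.sh.

# run_with_heartbeat INTERVAL_SECONDS COMMAND [ARGS...]
# Returns the exit status of COMMAND.
run_with_heartbeat() {
    interval=$1
    shift
    "$@" &
    pid=$!
    elapsed=0
    # kill -0 sends no signal; it only tests whether the process exists.
    while kill -0 "$pid" 2>/dev/null; do
        sleep 1
        elapsed=$((elapsed + 1))
        if [ $((elapsed % interval)) -eq 0 ]; then
            echo "running ... (${elapsed}s)" >&2
        fi
    done
    # Collect the exit status of the background job.
    wait "$pid"
}
```

For example, `run_with_heartbeat 60 $prog "$@"` would print one line per minute
while the link runs. The heartbeat goes to stderr here so that the command's
own stdout stays untouched.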

The locking mechanism was introduced in 2013,
https://gcc.gnu.org/ml/gcc-patches/2013-05/msg00001.html

lock-and-run.sh should probably be modified so that it does not release the
"stale" locks based on a fixed timeout value. How?
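
One possible direction, sketched under assumptions rather than taken from any
actual patch: record the holder's PID inside the lock directory and treat the
lock as stale only when that process no longer exists. The function name
acquire_lock_pid and the "pid" file name are hypothetical:

```shell
#!/bin/sh
# Stale-lock detection by holder PID instead of a fixed timeout.
# Hypothetical sketch. Assumes all lock users run on the same machine and
# as the same user; otherwise kill -0 cannot reliably probe the holder.

# acquire_lock_pid LOCKDIR
acquire_lock_pid() {
    lockdir=$1
    until mkdir "$lockdir" 2>/dev/null; do
        holder=$(cat "$lockdir/pid" 2>/dev/null)
        # An empty pid file means the holder has not finished setting up
        # the lock yet, so it is treated as alive.
        if [ -n "$holder" ] && ! kill -0 "$holder" 2>/dev/null; then
            echo "removing stale $lockdir (holder $holder is gone)" >&2
            rm -rf "$lockdir"
        else
            sleep 1
        fi
    done
    echo "$$" > "$lockdir/pid"
}

# release_lock_pid LOCKDIR
release_lock_pid() {
    rm -rf "$1"
}
```

This sketch still has gaps: two waiters can race when removing a stale lock,
and a holder that dies before writing its PID leaves the lock stuck.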

While the "no-output" problem can be fixed in the lock script as well
(attached), that fix doesn't help third-party applications.  Having unbuffered
output and/or an option to print progress would be beneficial.

Matthias




Attachment: lock-and-run.sh (2K)

Re: LTO+profiled enabled builds

Jason Merrill
How does this do for you?

On Thu, Jul 4, 2019 at 7:15 AM Matthias Klose <[hidden email]> wrote:

> [...]

Attachment: lock-check-pid.diff (1K)

Re: LTO+profiled enabled builds

Jason Merrill
Have you had a chance to try this?

On Sat, Sep 14, 2019 at 2:39 PM Jason Merrill <[hidden email]> wrote:

> [...]