[Bug fortran/91778] New: gfortran GCC9 optimizer bug

rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91778

            Bug ID: 91778
           Summary: gfortran GCC9 optimizer bug
           Product: gcc
           Version: 9.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mark.wieczorek at oca dot eu
  Target Milestone: ---

I am writing about a possible bug in the gfortran GCC9 optimizer on macOS
(installed via brew).

Before going into the details, I note that my code (SHTOOLS/pyshtools) is
widely used on many platforms and compilers. My code works with GCC8 compiled
with optimizations "-O" or "-O3", and it works fine with GCC9 when compiled
_without_ optimizations. I was able to "fix" my code to work with GCC9, but I
feel that what I am doing is avoiding a bug in the GCC9 optimizer, and that I
am not in fact "fixing" my code (perhaps I am wrong...).

The problem is related to using the FFTW3 library, which is the most widely
used FFT library for scientific computing. If this is a bug, then others will
probably encounter similar problems. As my code is somewhat long (and given the
lack of time I have now), I will just give you a summary of two problems. If
necessary, I could try to write a "small" example that reproduces these
problems when I have more free time later.

I start by describing how FFTW routines are used. First, you initialize the FFT
operation and get pointers to all the input and output arrays, which are stored
in the variable "plan":

    call dfftw_plan_dft_c2r_1d(plan, nlong, coef, grid)

Then you perform the FFT simply by calling

    call dfftw_execute(plan)

The first problem boils down to this:

    call dfftw_plan_dft_c2r_1d(plan, nlong, coef, grid)

    coef(1) = dcmplx(coef0,0.0d0) ! A
    coef(2:lmax_comp+1) = coef(2:lmax_comp+1) / 2.0d0

    call dfftw_execute(plan) ! AA
    gridglq(i,1:nlong) = grid(1:nlong)

    coef(1) = dcmplx(coef0s,0.0d0) ! B
    coef(2:lmax_comp+1) = coefs(2:lmax_comp+1)/2.0d0

    call dfftw_execute(plan) ! BB
    gridglq(i_s,1:nlong) = grid(1:nlong)


The problem is that the optimizer thinks that line A is redundant with line B
(the same variable is being defined twice). Thus, the optimizer effectively
assigns the value from line B at line A and deletes line B. I have verified
this by making that change by hand in my code.
However, line A is necessary to execute line AA, and line B is necessary to
execute line BB. The optimizer probably doesn't realize this because the
variable "coef" is not explicitly included when calling the function
dfftw_execute(plan).

The second problem I encountered is a little more mysterious. These are the
_last_ 4 lines of the subroutine:

    coef(lmax_comp+1) = coef(lmax_comp+1) + cilm(1,lmax_comp+1,lmax_comp+1)
    coef(nlong-(lmax_comp-1)) = coef(nlong-(lmax_comp-1)) &
                                + cilm(2,lmax_comp+1,lmax_comp+1)

    call dfftw_execute(plan)

    griddh(i_eq,1:nlong) = grid(1:nlong)

The problem is that the optimizer ignores the first two lines. The reason for
this is probably because (1) the variable coef is not explicitly noted in the
fftw call, and (2) the variable coef is not output in the subroutine. Thus, the
optimizer probably thinks that it doesn't need to compute the first two lines.

So, in summary, I believe that the GCC9 optimizer is not working correctly
because it doesn't realize that the function call dfftw_execute(plan) actually
depends on the variables coef and grid. Given that my code has worked well with
all other versions of GCC, I suspect that there has been a change in how the
optimizer works.

[Bug fortran/91778] gfortran GCC9 optimizer bug

rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91778

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Are you using c bindings to bind to fftw functions?

[Bug fortran/91778] gfortran GCC9 optimizer bug

rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91778

kargl at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2019-09-16
                 CC|                            |kargl at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #2 from kargl at gcc dot gnu.org ---
Need a reproducer.

It would also be beneficial to know what happens when
your code is compiled with -Wall -Werror -fcheck=all
-ffpe-trap=invalid,zero

[Bug fortran/91778] gfortran GCC9 optimizer bug

rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91778

Thomas Koenig <tkoenig at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2019-09-16 00:00:00         |
                 CC|                            |tkoenig at gcc dot gnu.org

--- Comment #3 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
(In reply to Mark Wieczorek from comment #0)

> I am writing about a possible bug in the gfortran GCC9 optimizer on macOS
> (installed via brew).
>
> Before going into the details, I note that my code (SHTOOLS/pyshtools) is
> widely used on many platforms and compilers. My code works with GCC8
> compiled with optimizations "-O" or "-O3", and it works fine with GCC9 when
> compiled _without_ optimizations. I was able to "fix" my code to work with
> GCC9, but I feel that what I am doing is avoiding a bug in the GCC9
> optimizer, and that I am not in fact "fixing" my code (perhaps I am
> wrong...).
>
> The problem is related to using the FFTW3 library, which is the most widely
> used FFT library for scientific computing. If this is a bug, then others
> will probably encounter similar problems. As my code is somewhat long (and
> given the lack of time I have now), I will just give you a summary of two
> problems. If necessary, I could try to write a "small" example that
> reproduces these problems when I have more free time later.

If it turns out that this is needed, please do.  However...

> I start by describing how FFTW routines are used. First, you initialize the
> FFT operation and get pointers to all the input and output arrays, which are
> stored in the variable "plan":
>
>     call dfftw_plan_dft_c2r_1d(plan, nlong, coef, grid)

This sounds very suspicious. According to the Fortran standard, you
cannot stash away a pointer to a Fortran array unless that array
is marked as TARGET. Well, you can, but it's liable to break any time,
and apparently it did.

Can you show the declaration of dfftw_plan_dft_c2r_1d ?

> Then you perform the FFT simply by calling
>
>     call dfftw_execute(plan)
>
> The first problem boils down to this:
>
>     call dfftw_plan_dft_c2r_1d(plan, nlong, coef, grid)
>
>     coef(1) = dcmplx(coef0,0.0d0) ! A
>     coef(2:lmax_comp+1) = coef(2:lmax_comp+1) / 2.0d0
>
>     call dfftw_execute(plan) ! AA
>     gridglq(i,1:nlong) = grid(1:nlong)
>
>     coef(1) = dcmplx(coef0s,0.0d0) ! B
>     coef(2:lmax_comp+1) = coefs(2:lmax_comp+1)/2.0d0
>
>     call dfftw_execute(plan) ! BB
>     gridglq(i_s,1:nlong) = grid(1:nlong)
>
>
> The problem is that the optimizer thinks the line A is redundant with line B
> (the same variable is being defined twice).

And that is correct behavior.

Try marking coef as TARGET or VOLATILE; this should inhibit this
optimization.
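
A minimal sketch of the VOLATILE variant follows. It is a cut-down stand-in
for the routine described in comment #0, not the actual SHTOOLS code: the
subroutine name, the argument list, and the array sizes are assumptions made
for illustration. It also assumes that fftw3.f is on the include path, and it
passes a planner flag (FFTW_ESTIMATE) as the fifth argument, which the
snippets quoted in this thread leave out.

```fortran
! Sketch only: illustrative stand-in, not the reporter's routine.
subroutine c2r_volatile_sketch(nlong, coef0, row)
    implicit none
    include 'fftw3.f'                       ! provides FFTW_ESTIMATE
    integer, intent(in) :: nlong            ! number of real output samples
    real(kind=8), intent(in) :: coef0       ! hypothetical DC coefficient
    real(kind=8), intent(out) :: row(nlong)
    integer(kind=8) :: plan                 ! opaque FFTW plan handle

    ! VOLATILE tells the compiler that these arrays may be read or
    ! written by means it cannot see, so stores to coef are kept and
    ! grid is reloaded from memory after the call.
    complex(kind=8), volatile :: coef(nlong/2 + 1)
    real(kind=8), volatile :: grid(nlong)

    ! The plan stores the addresses of coef and grid.
    call dfftw_plan_dft_c2r_1d(plan, nlong, coef, grid, FFTW_ESTIMATE)

    coef = (0.0d0, 0.0d0)
    coef(1) = cmplx(coef0, 0.0d0, kind=8)

    ! Reads coef and writes grid through the addresses held in plan.
    call dfftw_execute(plan)
    row(1:nlong) = grid(1:nlong)

    call dfftw_destroy_plan(plan)
end subroutine c2r_volatile_sketch
```

VOLATILE can cost some performance because every access goes through memory,
so it is probably best confined to the arrays that the plan refers to.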


> The second problem I encountered is a little more mysterious. These are the
> _last_ 4 lines of the subroutine:
>
>     coef(lmax_comp+1) = coef(lmax_comp+1) + cilm(1,lmax_comp+1,lmax_comp+1)
>     coef(nlong-(lmax_comp-1)) = coef(nlong-(lmax_comp-1)) &
>                                 + cilm(2,lmax_comp+1,lmax_comp+1)
>
>     call dfftw_execute(plan)
>
>     griddh(i_eq,1:nlong) = grid(1:nlong)
>
> The problem is that the optimizer ignores the first two lines. The reason
> for this is probably because (1) the variable coef is not explicitly noted
> in the fftw call, and (2) the variable coef is not output in the subroutine.
> Thus, the optimizer probably thinks that it doesn't need to compute the
> first two lines.

Sounds reasonable.

> So, in summary, I believe that the GCC9 optimizer is not working correctly
> because it doesn't realize that the function call dfftw_execute(plan)
> actually depends on the variables coef and grid. Given that my code has
> worked well with all other versions of GCC, I suspect that there has been a
> change in how the optimizer works.

I assume that your program was always non-conforming, and that gcc
has simply gotten better at finding optimization opportunities.

[Bug fortran/91778] gfortran GCC9 optimizer bug

rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91778

Mark Wieczorek <mark.wieczorek at oca dot eu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Mark Wieczorek <mark.wieczorek at oca dot eu> ---
Thanks for the help. After realizing that the fftw_execute call was in fact
suspicious, I went to the FFTW web site and found that it had been updated
recently. They state that

"we have had reports that this causes problems with some recent optimizing
Fortran compilers. The problem is, because the input/output arrays are not
passed as explicit arguments to dfftw_execute, the semantics of Fortran (unlike
C) allow the compiler to assume that the input/output arrays are not changed by
dfftw_execute. As a consequence, certain compilers end up optimizing out or
repositioning the call to dfftw_execute, assuming incorrectly that it does
nothing."

They then suggest using new convenience functions that are like

call fftw_execute(plan, coef, grid)

where the coef and grid variables are just placeholders so that the optimizer
understands the dependencies.
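
In FFTW's legacy Fortran interface these routines are named per transform
type; for a complex-to-real plan like the one above, the call would be
dfftw_execute_dft_c2r. A minimal, self-contained sketch of that usage is
given here, assuming fftw3.f is on the include path and the program is linked
against FFTW (for example with -lfftw3). The program name, the size
nlong = 8, and the test input are chosen purely for illustration.

```fortran
! Sketch only: shows the explicit-argument execute call for a c2r plan.
program c2r_explicit_args
    implicit none
    include 'fftw3.f'                       ! provides FFTW_ESTIMATE
    integer, parameter :: nlong = 8         ! illustrative transform size
    integer(kind=8) :: plan                 ! opaque FFTW plan handle
    complex(kind=8) :: coef(nlong/2 + 1)    ! half-complex input
    real(kind=8) :: grid(nlong)             ! real output

    ! FFTW_ESTIMATE plans without overwriting the arrays.
    call dfftw_plan_dft_c2r_1d(plan, nlong, coef, grid, FFTW_ESTIMATE)

    ! Illustrative input: only the DC term is non-zero.
    coef = (0.0d0, 0.0d0)
    coef(1) = (1.0d0, 0.0d0)

    ! coef and grid appear as actual arguments, so the compiler can see
    ! that the call may read coef and modify grid.
    call dfftw_execute_dft_c2r(plan, coef, grid)
    print *, grid

    call dfftw_destroy_plan(plan)
end program c2r_explicit_args
```

Passing the same arrays that were used to create the plan keeps the behaviour
identical to dfftw_execute(plan), while making the data dependencies visible
at the call site.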

I am going to consider this closed. Thanks again!