[Patch][Fortran] OpenACC – permit common blocks in some clauses

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

[Patch][Fortran] OpenACC – permit common blocks in some clauses

Tobias Burnus-3
This OpenACC-only patch extends the support for /common/ blocks.

[In OpenMP (4.0 to 5.0, unchanged) and gfortran, common blocks are supported in copyin/copyprivate, in firstprivate/lastprivate/private/shared, in threadprivate and in declare target.]

For OpenACC, gfortran already supports common blocks for device_resident/usedevice/cache/flush/link.

This patch adds them (for OpenACC only) to copy/copyin/copyout, create/delete,
host, pcopy/pcopy_in/pcopy_out, present_or_copy, present_or_copy_in,
present_or_copy_out, present_or_create and self.
[Of those, only "copy()" is also an OpenMP clause name.]
[Cf. OpenACC 2.7 in 1.9 (for the p* variants) and 2.13; the latter is new since OpenACC 2.0.]



I think the Fortran part is obvious, once one agrees on the list of clauses; and OK from a Fortran's maintainer view.

gcc/gimplify.c: oacc_default_clause contains some changes; there are additionally two lines which only differ for ORT_ACC – Hence, it is an OpenACC-only change!
The ME change is about privatizing common blocks (I haven't studied this part closer.)


@Thomas: Please review
@Jakub, all: comments and approvals are welcome.

Tobias

PS: This patch is the rediffed OG9 (alias OG8) patch 0793cef408c9937f4c4e2423dd1f7d6a97b9bed3 by Cesar Philippidis from 2016. (Which was on gomp-4_0-branch as r240165). Due to the wonders of GIT – when not requiring linear history and due to rebasing with GCC9, it is also part of the OG9 commit ac6c90812344f4f4cfe4d2f5901c1a9d038a4000 – which in addition also does some other things like handling OpenACC device pointers.


acc-common2.diff (21K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

Thomas Schwinge-8
Hi!

On 2019-10-15T23:32:32+0200, Tobias Burnus <[hidden email]> wrote:
> This OpenACC-only patch extends the support for /common/ blocks.

I'll be quick to note that I don't have any first-hand experience with
Fortran common blocks.  :-P

> [In OpenMP (4.0 to 5.0, unchanged) and gfortran, common blocks are supported in copyin/copyprivate, in firstprivate/lastprivate/private/shared, in threadprivate and in declare target.]
>
> For OpenACC, gfortran already supports common blocks for device_resident/usedevice/cache/flush/link.
>
> This patch adds them (for OpenACC only) to copy/copyin/copyout, create/delete,
> host, pcopy/pcopy_in/pcopy_out, present_or_copy, present_or_copy_in,
> present_or_copy_out, present_or_create and self.
> [Of those, only "copy()" is also an OpenMP clause name.]

I'm confused: in
<http://mid.mail-archive.com/20181204133007.GO12380@tucnak> Jakub stated
that "OpenMP doesn't have a copy clause, so I'd expect true here":

| @@ -1051,7 +1052,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
|    if ((mask & OMP_CLAUSE_COPY)
|        && gfc_match ("copy ( ") == MATCH_YES
|        && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
| -   OMP_MAP_TOFROM))
| +   OMP_MAP_TOFROM, openacc))
|      continue;

> [Cf. OpenACC 2.7 in 1.9 (for the p* variants) and 2.13; the latter is new since OpenACC 2.0.]
>
>
>
> I think the Fortran part is obvious, once one agrees on the list of clauses; and OK from a Fortran's maintainer view.

I'll defer to your judgement there, but just one comment: I noticed that
OpenACC 2.7 in 2.7. "Data Clauses" states that "For all clauses except
'deviceptr' and 'present', the list argument may include a Fortran
_common block_ name enclosed within slashes, if that _common block_ name
also appears in a 'declare' directive 'link' clause".

Are we already properly handling the aspect that requires that the "that
_common block_ name also appears in a 'declare' directive 'link' clause"?

The libgomp execution test cases you're adding all state that "This test
does not exercise ACC DECLARE", yet they supposedly already do work fine.
Or am I understading the OpenACC specification wrongly here?

I'm certainly aware of (big) deficiencies in the OpenACC 'declare'
handling, so I guess my question here may be whether these test cases are
valid after all?

> gcc/gimplify.c: oacc_default_clause contains some changes; there are additionally two lines which only differ for ORT_ACC – Hence, it is an OpenACC-only change!
> The ME change is about privatizing common blocks (I haven't studied this part closer.)

So, please do study that closer.  ;-P

In <http://mid.mail-archive.com/87efx6haep.fsf@...>
I raised some questions, got a bit of an answer, and in
<http://mid.mail-archive.com/87bms85kra.fsf@...>
asked further, didn't get an answer.

All the rationale from Cesar's original submission email should be
transferred into 'gcc/gimplify.c' as much as feasible, to make that
"voodoo code" better understandable.

> @Jakub, all: comments and approvals are welcome.

Indeed.  :-)

> gcc/
> * gimplify.c (oacc_default_clause): Privatize fortran common blocks.
> (omp_notice_variable): Defer the expansion of DECL_VALUE_EXPR for
> common block decls.

> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -7218,15 +7218,20 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree decl, unsigned flags)
>  {
>    const char *rkind;
>    bool on_device = false;
> +  bool is_private = false;
>    bool declared = is_oacc_declared (decl);
>    tree type = TREE_TYPE (decl);
>  
>    if (lang_hooks.decls.omp_privatize_by_reference (decl))
>      type = TREE_TYPE (type);
>  
> +  if (RECORD_OR_UNION_TYPE_P (type))
> +    is_private = lang_hooks.decls.omp_disregard_value_expr (decl, false);
> +
>    if ((ctx->region_type & (ORT_ACC_PARALLEL | ORT_ACC_KERNELS)) != 0
>        && is_global_var (decl)
> -      && device_resident_p (decl))
> +      && device_resident_p (decl)
> +      && !is_private)
>      {
>        on_device = true;
>        flags |= GOVD_MAP_TO_ONLY;
> @@ -7237,7 +7242,9 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree decl, unsigned flags)
>      case ORT_ACC_KERNELS:
>        rkind = "kernels";
>  
> -      if (AGGREGATE_TYPE_P (type))
> +      if (is_private)
> + flags |= GOVD_MAP;
> +      else if (AGGREGATE_TYPE_P (type))
>   {
>    /* Aggregates default to 'present_or_copy', or 'present'.  */
>    if (ctx->default_kind != OMP_CLAUSE_DEFAULT_PRESENT)
> @@ -7254,7 +7261,9 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree decl, unsigned flags)
>      case ORT_ACC_PARALLEL:
>        rkind = "parallel";
>  
> -      if (on_device || declared)
> +      if (is_private)
> + flags |= GOVD_FIRSTPRIVATE;
> +      else if (on_device || declared)
>   flags |= GOVD_MAP;
>        else if (AGGREGATE_TYPE_P (type))
>   {
> @@ -7320,7 +7329,8 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
>   {
>    tree value = get_base_address (DECL_VALUE_EXPR (decl));
>  
> -  if (value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
> +  if (!(ctx->region_type & ORT_ACC)
> +      && value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
>      return omp_notice_threadprivate_variable (ctx, decl, value);
>   }
>  
> @@ -7352,7 +7362,8 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
>    n = splay_tree_lookup (ctx->variables, (splay_tree_key)decl);
>    if ((ctx->region_type & ORT_TARGET) != 0)
>      {
> -      ret = lang_hooks.decls.omp_disregard_value_expr (decl, true);
> +      shared = !(ctx->region_type & ORT_ACC);
> +      ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);
>        if (n == NULL)
>   {
>    unsigned nflags = flags;
> @@ -7520,6 +7531,8 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
>      }
>  
>    shared = ((flags | n->value) & GOVD_SHARED) != 0;
> +  if (ctx->region_type & ORT_ACC)
> +    shared = false;
>    ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);
>  
>    /* If nothing changed, there's nothing left to do.  */

> PS: This patch is the rediffed OG9 (alias OG8) patch 0793cef408c9937f4c4e2423dd1f7d6a97b9bed3 by Cesar Philippidis from 2016. (Which was on gomp-4_0-branch as r240165). Due to the wonders of GIT – when not requiring linear history and due to rebasing with GCC9, it is also part of the OG9 commit ac6c90812344f4f4cfe4d2f5901c1a9d038a4000 – which in addition also does some other things like handling OpenACC device pointers.

There's no Git magic involved there: somebody just (manually) merged
several these patches together into one, for no good reason.  ;-\


Grüße
 Thomas


> diff --git a/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90 b/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90
> new file mode 100644
> index 00000000000..1cbbb49d638
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90
> @@ -0,0 +1,69 @@
> +! Test data clauses involving common blocks and common block data.
> +! Specifically, validates early matching errors.
> +
> +subroutine subtest
> +  implicit none
> +  integer, parameter :: n = 10
> +  integer a(n), b(n), c, d(n), e
> +  real*4 x(n), y(n), z, w(n), v
> +  common /blockA/ a, c, x
> +  common /blockB/ b, y, z
> +  !$acc declare link(/blockA/, /blockB/, e, v)
> +end subroutine subtest
> +
> +program test
> +  implicit none
> +  integer, parameter :: n = 10
> +  integer a(n), b(n), c, d(n), e
> +  real*4 x(n), y(n), z, w(n), v
> +  common /blockA/ a, c, x
> +  common /blockB/ b, y, z
> +  !$acc declare link(/blockA/, /blockB/, e, v)
> +
> +  !$acc data copy(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data copyin(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data copyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data create(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data copyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopy(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopyin(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcreate(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data present(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
> +
> +  !$acc parallel private(/blockA/, /blockB/, e, v)
> +  !$acc end parallel
> +
> +  !$acc parallel firstprivate(/blockA/, /blockB/, e, v)
> +  !$acc end parallel
> +
> +  !$acc exit data delete(/blockA/, /blockB/, e, v)
> +
> +  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
> +
> +  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
> +end program test
> diff --git a/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90 b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90
> new file mode 100644
> index 00000000000..b83638918a3
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90
> @@ -0,0 +1,49 @@
> +! Test data clauses involving common blocks and common block data.
> +! Specifically, resolver errors such as duplicate data clauses.
> +
> +program test
> +  implicit none
> +  integer, parameter :: n = 10
> +  integer a(n), b(n), c, d(n), e
> +  real*4 x(n), y(n), z, w(n), v
> +  common /blockA/ a, c, x
> +  common /blockB/ b, y, z
> +
> +  !$acc data copy(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data copyin(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data copyout(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data create(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data copyout(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data pcopy(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data pcopyin(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data pcopyout(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data pcreate(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data pcopyout(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc parallel private(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end parallel
> +
> +  !$acc parallel firstprivate(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end parallel
> +
> +  !$acc exit data delete(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +end program test
> diff --git a/libgomp/testsuite/libgomp.oacc-fortran/common-block-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/common-block-1.f90
> new file mode 100644
> index 00000000000..a17a33536f3
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/common-block-1.f90
> @@ -0,0 +1,105 @@
> +! Test data located inside common blocks.  This test does not exercise
> +! ACC DECLARE.
> +
> +module const
> +  integer, parameter :: n = 100
> +end module const
> +
> +subroutine check
> +  use const
> +
> +  implicit none
> +  integer i, x(n), y
> +  common /BLOCK/ x, y
> +
> +  do i = 1, n
> +     if (x(i) .ne. y) call abort
> +  end do
> +end subroutine check
> +
> +module m
> +  use const
> +  integer a(n), b
> +  common /BLOCK/ a, b
> +
> +contains
> +  subroutine mod_implicit_incr
> +    implicit none
> +    integer i
> +
> +    !$acc parallel loop
> +    do i = 1, n
> +       a(i) = b
> +    end do
> +    !$acc end parallel loop
> +
> +    call check
> +  end subroutine mod_implicit_incr
> +
> +  subroutine mod_explicit_incr
> +    implicit none
> +    integer i
> +
> +    !$acc parallel loop copy(a(1:n)) copyin(b)
> +    do i = 1, n
> +       a(i) = b
> +    end do
> +    !$acc end parallel loop
> +
> +    call check
> +  end subroutine mod_explicit_incr
> +end module m
> +
> +subroutine sub_implicit_incr
> +  use const
> +
> +  implicit none
> +  integer i, x(n), y
> +  common /BLOCK/ x, y
> +
> +  !$acc parallel loop
> +  do i = 1, n
> +     x(i) = y
> +  end do
> +  !$acc end parallel loop
> +
> +  call check
> +end subroutine sub_implicit_incr
> +
> +subroutine sub_explicit_incr
> +  use const
> +
> +  implicit none
> +  integer i, x(n), y
> +  common /BLOCK/ x, y
> +
> +  !$acc parallel loop copy(x(1:n)) copyin(y)
> +  do i = 1, n
> +     x(i) = y
> +  end do
> +  !$acc end parallel loop
> +
> +  call check
> +end subroutine sub_explicit_incr
> +
> +program main
> +  use m
> +
> +  implicit none
> +
> +  a(:) = -1
> +  b = 5
> +  call mod_implicit_incr
> +
> +  a(:) = -2
> +  b = 6
> +  call mod_explicit_incr
> +
> +  a(:) = -3
> +  b = 7
> +  call sub_implicit_incr
> +
> +  a(:) = -4
> +  b = 8
> +  call sub_explicit_incr
> +end program main
> diff --git a/libgomp/testsuite/libgomp.oacc-fortran/common-block-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/common-block-2.f90
> new file mode 100644
> index 00000000000..e27a225a024
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/common-block-2.f90
> @@ -0,0 +1,150 @@
> +! Test data located inside common blocks.  This test does not exercise
> +! ACC DECLARE.  All data clauses are explicit.
> +
> +module consts
> +  integer, parameter :: n = 100
> +end module consts
> +
> +subroutine validate
> +  use consts
> +
> +  implicit none
> +  integer i, j
> +  real*4 x(n), y(n), z
> +  common /BLOCK/ x, y, z, j
> +
> +  do i = 1, n
> +     if (abs(x(i) - i - z) .ge. 0.0001) call abort
> +  end do
> +end subroutine validate
> +
> +subroutine incr
> +  use consts
> +
> +  implicit none
> +  integer i, j
> +  real*4 x(n), y(n), z
> +  common /BLOCK/ x, y, z, j
> +
> +  !$acc parallel loop pcopy(/BLOCK/)
> +  do i = 1, n
> +     x(i) = x(i) + z
> +  end do
> +  !$acc end parallel loop
> +end subroutine incr
> +
> +program main
> +  use consts
> +
> +  implicit none
> +  integer i, j
> +  real*4 a(n), b(n), c
> +  common /BLOCK/ a, b, c, j
> +
> +  ! Test copyout, pcopy, device
> +
> +  !$acc data copyout(a, c)
> +
> +  c = 1.0
> +
> +  !$acc update device(c)
> +
> +  !$acc parallel loop pcopy(a)
> +  do i = 1, n
> +     a(i) = i
> +  end do
> +  !$acc end parallel loop
> +
> +  call incr
> +  call incr
> +  call incr
> +  !$acc end data
> +
> +  c = 3.0
> +  call validate
> +
> +  ! Test pcopy without copyout
> +
> +  c = 2.0
> +  call incr
> +  c = 5.0
> +  call validate
> +
> +  ! Test create, delete, host, copyout, copyin
> +
> +  !$acc enter data create(b)
> +
> +  !$acc parallel loop pcopy(b)
> +  do i = 1, n
> +     b(i) = i
> +  end do
> +  !$acc end parallel loop
> +
> +  !$acc update host (b)
> +
> +  !$acc parallel loop pcopy(b) copyout(a) copyin(c)
> +  do i = 1, n
> +     a(i) = b(i) + c
> +  end do
> +  !$acc end parallel loop
> +
> +  !$acc exit data delete(b)
> +
> +  call validate
> +
> +  a(:) = b(:)
> +  c = 0.0
> +  call validate
> +
> +  ! Test copy
> +
> +  c = 1.0
> +  !$acc parallel loop copy(/BLOCK/)
> +  do i = 1, n
> +     a(i) = b(i) + c
> +  end do
> +  !$acc end parallel loop
> +
> +  call validate
> +
> +  ! Test pcopyin, pcopyout FIXME
> +
> +  c = 2.0
> +  !$acc data copyin(b, c) copyout(a)
> +
> +  !$acc parallel loop pcopyin(b, c) pcopyout(a)
> +  do i = 1, n
> +     a(i) = b(i) + c
> +  end do
> +  !$acc end parallel loop
> +
> +  !$acc end data
> +
> +  call validate
> +
> +  ! Test reduction, private
> +
> +  j = 0
> +
> +  !$acc parallel private(i) copy(j)
> +  !$acc loop reduction(+:j)
> +  do i = 1, n
> +     j = j + 1
> +  end do
> +  !$acc end parallel
> +
> +  if (j .ne. n) call abort
> +
> +  ! Test firstprivate, copy
> +
> +  a(:) = 0
> +  c = j
> +
> +  !$acc parallel loop firstprivate(c) copyout(a)
> +  do i = 1, n
> +     a(i) = i + c
> +  end do
> +  !$acc end parallel loop
> +
> +  call validate
> +end program main
> diff --git a/libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90
> new file mode 100644
> index 00000000000..90448d2da72
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90
> @@ -0,0 +1,137 @@
> +! Test data located inside common blocks.  This test does not exercise
> +! ACC DECLARE.  Most of the data clauses are implicit.
> +
> +module consts
> +  integer, parameter :: n = 100
> +end module consts
> +
> +subroutine validate
> +  use consts
> +
> +  implicit none
> +  integer i, j
> +  real*4 x(n), y(n), z
> +  common /BLOCK/ x, y, z, j
> +
> +  do i = 1, n
> +     if (abs(x(i) - i - z) .ge. 0.0001) call abort
> +  end do
> +end subroutine validate
> +
> +subroutine incr_parallel
> +  use consts
> +
> +  implicit none
> +  integer i, j
> +  real*4 x(n), y(n), z
> +  common /BLOCK/ x, y, z, j
> +
> +  !$acc parallel loop
> +  do i = 1, n
> +     x(i) = x(i) + z
> +  end do
> +  !$acc end parallel loop
> +end subroutine incr_parallel
> +
> +subroutine incr_kernels
> +  use consts
> +
> +  implicit none
> +  integer i, j
> +  real*4 x(n), y(n), z
> +  common /BLOCK/ x, y, z, j
> +
> +  !$acc kernels
> +  do i = 1, n
> +     x(i) = x(i) + z
> +  end do
> +  !$acc end kernels
> +end subroutine incr_kernels
> +
> +program main
> +  use consts
> +
> +  implicit none
> +  integer i, j
> +  real*4 a(n), b(n), c
> +  common /BLOCK/ a, b, c, j
> +
> +  !$acc data copyout(a, c)
> +
> +  c = 1.0
> +
> +  !$acc update device(c)
> +
> +  !$acc parallel loop
> +  do i = 1, n
> +     a(i) = i
> +  end do
> +  !$acc end parallel loop
> +
> +  call incr_parallel
> +  call incr_parallel
> +  call incr_parallel
> +  !$acc end data
> +
> +  c = 3.0
> +  call validate
> +
> +  ! Test pcopy without copyout
> +
> +  c = 2.0
> +  call incr_kernels
> +  c = 5.0
> +  call validate
> +
> +  !$acc kernels
> +  do i = 1, n
> +     b(i) = i
> +  end do
> +  !$acc end kernels
> +
> +  !$acc parallel loop
> +  do i = 1, n
> +     a(i) = b(i) + c
> +  end do
> +  !$acc end parallel loop
> +
> +  call validate
> +
> +  a(:) = b(:)
> +  c = 0.0
> +  call validate
> +
> +  ! Test copy
> +
> +  c = 1.0
> +  !$acc parallel loop
> +  do i = 1, n
> +     a(i) = b(i) + c
> +  end do
> +  !$acc end parallel loop
> +
> +  call validate
> +
> +  c = 2.0
> +  !$acc data copyin(b, c) copyout(a)
> +
> +  !$acc kernels
> +  do i = 1, n
> +     a(i) = b(i) + c
> +  end do
> +  !$acc end kernels
> +
> +  !$acc end data
> +
> +  call validate
> +
> +  j = 0
> +
> +  !$acc parallel loop reduction(+:j)
> +  do i = 1, n
> +     j = j + 1
> +  end do
> +  !$acc end parallel loop
> +
> +  if (j .ne. n) call abort
> +end program main

signature.asc (671 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

Tobias Burnus-3
Hi Thomas,

Updated version attached. Changes:
* Use "true" instead of "openacc" for the OpenACC-only "copy()" clause
(as not shared w/ OpenMP)
* Add some documentation to gimplify.c
* Use GOVD_FIRSTPRIVATE also for "kernel"

The patch survived bootstrapping + regtesting on my laptop (no
offloading) and on a build server (with nvptx offloading).

On 10/18/19 3:26 PM, Thomas Schwinge wrote:
> I'll be quick to note that I don't have any first-hand experience with
> Fortran common blocks. :-P

To quote you from below: "So, please do study that closer. ;-P"

I also do not have first-hand experience (as I started with Fortran 95 +
some of F2003), but common blocks were a nice idea of the early 1960 to
provide access to global memory, avoiding to pass all data as arguments
(which also has stack issues). They have been replaced by derived types
and variables declared at module level since Fortran 90. See
https://j3-fortran.org/doc/year/18/18-007r1.pdf or
https://web.stanford.edu/class/me200c/tutorial_77/13_common.html


On 10/18/19 3:26 PM, Thomas Schwinge wrote:
>> For OpenACC, gfortran already supports common blocks for device_resident/usedevice/cache/flush/link.
>> […] [Of those, only "copy()" is also an OpenMP clause name.]
> I'm confused: in […] "OpenMP doesn't have a copy clause, so I'd expect true here":

I concur – only "copyin" and "copyprivate" exist in OpenMP. (But thanks
to "if (openacc)" no "openacc" is needed, either.)


> I'll defer to your judgement there, but just one comment: I noticed
> that OpenACC 2.7 in 2.7. "Data Clauses" states that "For all clauses
> except 'deviceptr' and 'present', the list argument may include a
> Fortran_common block_ name enclosed within slashes, if that _common
> block_ name also appears in a 'declare' directive 'link' clause".
>
> Are we already properly handling the aspect that requires that the
> "that _common block_ name also appears in a 'declare' directive 'link'
> clause"?

I don't know neither the OpenACC spec nor the GCC implementation well
enough to claim proper (!) handling. However, as stated above:
device_resident/usedevice/cache/flush/link do support common block
arguments.

Looking at the testsuite, link and device_resident are tested in
gfortran.dg/goacc/declare-2.f95. (list.f95 and reduction.f95 also use
come common blocks.) – And gfortran.dg/goacc/common-block-1.f90 has been
added.


> The libgomp execution test cases you're adding all state that "This test does not exercise ACC DECLARE", yet they supposedly already do work fine. Or am I understading the OpenACC specification wrongly here?

You need to ask Cesar, who wrote the test case and that comment, why he
added it.

The patch does not touch 'link'/'device_resident' clauses of 'declare',
hence, I think he didn't see a reason to add a run-time test case for
it. – That's independent from whether it is supported by the OpenACC
spec and whether it is "properly" implemented in GCC/gfortran.

> I'm certainly aware of (big) deficiencies in the OpenACC 'declare' handling so I guess my question here may be whether these test cases are valid after all?

Well, you are the OpenACC specialist – both spec wise and
GCC-implementation wise. However, as the test cases are currently
parsing-only test cases, I think they should be fine.


>> gcc/gimplify.c: oacc_default_clause contains some changes; there are additionally two lines which only differ for ORT_ACC – Hence, it is an OpenACC-only change!
>> The ME change is about privatizing common blocks (I haven't studied this part closer.)
> So, please do study that closer.  ;-P
>
> In<http://mid.mail-archive.com/87efx6haep.fsf@...>
> I raised some questions, got a bit of an answer, and in
> <http://mid.mail-archive.com/87bms85kra.fsf@...>
> asked further, didn't get an answer.
>
> All the rationale from Cesar's original submission email should be
> transferred into 'gcc/gimplify.c' as much as feasible, to make that
> "voodoo code" better understandable.

I have now added some comments to the patch. I also changed GOVD_MAP to
GOVD_FIRSTPRIVATE for "acc kernels" to match "acc parallel"; I think
that makes sense in terms of what Cesar has written – but I am not
completely sure about this.

Cross ref: The original email is
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00950.html ; the review
starts here https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00250.html 
(same email as mid.mail-archive.com link above).

BTW: That patch – rediffed for OG9 and augmented by several other
patches (including deviceptr) – was then submitted at
https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01911.html and first
reviewed at https://gcc.gnu.org/ml/gcc-patches/2018-12/msg00176.html and
then committed to OG9 at
https://gcc.gnu.org/ml/gcc-patches/2019-01/msg00051.html


>> Due to the wonders of GIT – when not requiring linear history and due to rebasing with GCC9, it is also part of the OG9 commit ac6c90812344f4f4cfe4d2f5901c1a9d038a4000
> There's no Git magic involved there: somebody just (manually) merged
> several these patches together into one, for no good reason.  ;-\

Well, there is more. If you do not enforce linear history, you cannot
easily say to git: Give me all changes between this commit and that
commit – as they pass by in a sneak path. And by default, GIT merges
such that the private version is the "main" branch – and one merges the
other branch ("upstream") into the own branch. This can quickly become
quite confusing.

In particular, it is not easy to see when/why some code disappeared. You
have some patch – someone else had merge problems, accidentally removed
it and then if you diff or do "log -p", it looks as if the code was
never there, unless you explicitly dig into the branch whose commits
were merged into the "main" branch.

Tobias

PS: I am a great fan of patch submissions by the authors – it avoids
later digging and guess work for reasons why someone else wrote
something in a particular way.


acc-common4.diff (22K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

Thomas Schwinge-8
Hi Tobias!

On 2019-10-23T22:34:42+0200, Tobias Burnus <[hidden email]> wrote:
> Updated version attached. Changes:
> * Use "true" instead of "openacc" for the OpenACC-only "copy()" clause
> (as not shared w/ OpenMP)
> * Add some documentation to gimplify.c
> * Use GOVD_FIRSTPRIVATE also for "kernel"

Thanks!

> The patch survived bootstrapping + regtesting on my laptop (no
> offloading) and on a build server (with nvptx offloading).

OK for trunk, with the following few small items considered.  To record
the review effort, please include "Reviewed-by: Thomas Schwinge
<[hidden email]>" in the commit log, see
<https://gcc.gnu.org/wiki/Reviewed-by>.


> On 10/18/19 3:26 PM, Thomas Schwinge wrote:
>> I'll be quick to note that I don't have any first-hand experience with
>> Fortran common blocks. :-P
>
> To quote you from below: "So, please do study that closer. ;-P"

Haha!  ;-P (You don't want to know how long is my list of items that I
might/want/could look into...)

> I also do not have first-hand experience (as I started with Fortran 95 +
> some of F2003), but common blocks were a nice idea of the early 1960 to
> provide access to global memory, avoiding to pass all data as arguments
> (which also has stack issues). They have been replaced by derived types
> and variables declared at module level since Fortran 90. See
> https://j3-fortran.org/doc/year/18/18-007r1.pdf or
> https://web.stanford.edu/class/me200c/tutorial_77/13_common.html

..., and didn't "They have been replaced by [...]" go as far as that
they've actually been deprecated in recent Fortran standard revisions?
(I may be misremembering.)


Anyway:

> On 10/18/19 3:26 PM, Thomas Schwinge wrote:
>>> For OpenACC, gfortran already supports common blocks for device_resident/usedevice/cache/flush/link.

>> I'll defer to your judgement there, but just one comment: I noticed
>> that OpenACC 2.7 in 2.7. "Data Clauses" states that "For all clauses
>> except 'deviceptr' and 'present', the list argument may include a
>> Fortran_common block_ name enclosed within slashes, if that _common
>> block_ name also appears in a 'declare' directive 'link' clause".
>>
>> Are we already properly handling the aspect that requires that the
>> "that _common block_ name also appears in a 'declare' directive 'link'
>> clause"?
>
> I don't know neither the OpenACC spec nor the GCC implementation well
> enough to claim proper (!) handling. However, as stated above:
> device_resident/usedevice/cache/flush/link do support common block
> arguments.
(... in the front end at least.)

> Looking at the testsuite, link and device_resident are tested in
> gfortran.dg/goacc/declare-2.f95. (list.f95 and reduction.f95 also use
> come common blocks.) – And gfortran.dg/goacc/common-block-1.f90 has been
> added.

(..., again, that'S all front end testing, so not sufficient to claim it
actually works for executing user code.)  ;-\

>> The libgomp execution test cases you're adding all state that "This test does not exercise ACC DECLARE", yet they supposedly already do work fine. Or am I understading the OpenACC specification wrongly here?
>
> You need to ask Cesar, who wrote the test case and that comment, why he
> added it.

Well, Cesar is not working on GCC anymore, thus you've been asked to
adopt his patch, and fix it up, change it as necessary.

> The patch does not touch 'link'/'device_resident' clauses of 'declare',
> hence, I think he didn't see a reason to add a run-time test case for
> it.

(Or such testing didn't work, but there was no time/interest at that
point to make it work.)

> – That's independent from whether it is supported by the OpenACC
> spec and whether it is "properly" implemented in GCC/gfortran.
>
>> I'm certainly aware of (big) deficiencies in the OpenACC 'declare' handling so I guess my question here may be whether these test cases are valid after all?
>
> Well, you are the OpenACC specialist – both spec wise and
> GCC-implementation wise.

Sure, I do know some things, but I'm certainly not all-knowing -- that's
why I needed you to look into this in more detail.

> However, as the test cases are currently
> parsing-only test cases, I think they should be fine.

OK, and everything else we're thus delaying for later.  That's OK -- what
we got here now is certainly an improvement on its own.  I just wanted to
make sure that we're not missing something obvious.


>>> gcc/gimplify.c: oacc_default_clause contains some changes; there are additionally two lines which only differ for ORT_ACC – Hence, it is an OpenACC-only change!
>>> The ME change is about privatizing common blocks (I haven't studied this part closer.)
>> So, please do study that closer.  ;-P
>>
>> In<http://mid.mail-archive.com/87efx6haep.fsf@...>
>> I raised some questions, got a bit of an answer, and in
>> <http://mid.mail-archive.com/87bms85kra.fsf@...>
>> asked further, didn't get an answer.

By the way, in the mean time I also found the original GCC trunk
submission email:
<http://mid.mail-archive.com/a028d039-7e9e-792a-7424-ccab1bb425f4@...>.
(Mentioning that just in case that carries any additional information for
you.)


>> All the rationale from Cesar's original submission email should be
>> transferred into 'gcc/gimplify.c' as much as feasible, to make that
>> "voodoo code" better understandable.
>
> I have now added some comments to the patch.

Thanks.


> I also changed GOVD_MAP to
> GOVD_FIRSTPRIVATE for "acc kernels" to match "acc parallel"; I think
> that makes sense in terms of what Cesar has written – but I am not
> completely sure about this.

OK.  Given that this "abstractly" seems to make sense to both of us,
let's do it that way.

Or, would it be easy to add an OpenACC 'kernels' test case that otherwise
faild (at run time, say, with aforementioned duplicate mapping errors, or
would contain "strange"/duplicate/conflicting mapping items in the
'-fdump-tree-gimple' dump)?


>>> Due to the wonders of GIT – when not requiring linear history and due to rebasing with GCC9, it is also part of the OG9 commit ac6c90812344f4f4cfe4d2f5901c1a9d038a4000
>> There's no Git magic involved there: somebody just (manually) merged
>> several these patches together into one, for no good reason.  ;-\
>
> Well, there is more. If you do not enforce linear history, you cannot
> easily say to git: Give me all changes between this commit and that
> commit – as they pass by in a sneak path. And by default, GIT merges
> such that the private version is the "main" branch – and one merges the
> other branch ("upstream") into the own branch. This can quickly become
> quite confusing.
I'm not sure I'm following your argument there.

> In particular, it is not easy to see when/why some code disappeared. You
> have some patch – someone else had merge problems, accidentally removed
> it and then if you diff or do "log -p", it looks as if the code was
> never there, unless you explicitly dig into the branch whose commits
> were merged into the "main" branch.

There are "Diff Formatting" options like '-c', '--cc', '-m', '-t' which
may help.

Anyway, that's certainly not related to Fortran common block support.


> PS: I am a great fan of patch submissions by the authors – it avoids
> later digging and guess work for reasons why someone else wrote
> something in a particular way.

Absolutely agreed!

On the other hand, what you've now done, re-engineering the original
rationale etc., makes my review much easier, because that then gives
greater confidence in the changes, I can then trust that you didn't just
copy the original patch, but instead spent your own time thinking it
through.


> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -7219,15 +7219,28 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree decl, unsigned flags)
>  {
>    const char *rkind;
>    bool on_device = false;
> +  bool is_private = false;
>    bool declared = is_oacc_declared (decl);
>    tree type = TREE_TYPE (decl);
>  
>    if (lang_hooks.decls.omp_privatize_by_reference (decl))
>      type = TREE_TYPE (type);
>  
> +  /* For Fortran COMMON blocks, only used variables in those blocks are
> +     transfered and remapped.  The block itself will have a private clause to
> +     avoid transfering the data twice.
> +     The hook evaluates to false by default.  For a variable in Fortran's COMMON
> +     or EQUIVALENCE block, returns 'true' (as we have shared=false) - as only
> +     the variables in such a COMMON/EQUIVALENCE block shall be privatized not
> +     the whole block.  For C++ and Fortran, it can also be true under certain
> +     other conditions, if DECL_HAS_VALUE_EXPR.  */
> +  if (RECORD_OR_UNION_TYPE_P (type))
> +    is_private = lang_hooks.decls.omp_disregard_value_expr (decl, false);
> +
>    if ((ctx->region_type & (ORT_ACC_PARALLEL | ORT_ACC_KERNELS)) != 0
>        && is_global_var (decl)
> -      && device_resident_p (decl))
> +      && device_resident_p (decl)
> +      && !is_private)
>      {
>        on_device = true;
>        flags |= GOVD_MAP_TO_ONLY;
> @@ -7238,7 +7251,9 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree decl, unsigned flags)
>      case ORT_ACC_KERNELS:
>        rkind = "kernels";
>  
> -      if (AGGREGATE_TYPE_P (type))
> +      if (is_private)
> + flags |= GOVD_FIRSTPRIVATE;
> +      else if (AGGREGATE_TYPE_P (type))
>   {
>    /* Aggregates default to 'present_or_copy', or 'present'.  */
>    if (ctx->default_kind != OMP_CLAUSE_DEFAULT_PRESENT)
> @@ -7255,7 +7270,9 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree decl, unsigned flags)
>      case ORT_ACC_PARALLEL:
>        rkind = "parallel";
>  
> -      if (on_device || declared)
> +      if (is_private)
> + flags |= GOVD_FIRSTPRIVATE;
> +      else if (on_device || declared)
>   flags |= GOVD_MAP;
>        else if (AGGREGATE_TYPE_P (type))
>   {
> @@ -7321,7 +7338,11 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
|        if (DECL_HAS_VALUE_EXPR_P (decl))

>   {
>    tree value = get_base_address (DECL_VALUE_EXPR (decl));
>  
> -  if (value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
> +  /* For OpenACC, defer expansion of value to avoid transfering
> +     privatized common block data instead of im-/explicitly transfered
> +     variables which are in common blocks.  */
> +  if (!(ctx->region_type & ORT_ACC)
> +      && value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
>      return omp_notice_threadprivate_variable (ctx, decl, value);
>   }
Wouldn't it be clearer if that latter one were written as follows:

    if (DECL_HAS_VALUE_EXPR_P (decl))
      {
        if (ctx->region_type & ORT_ACC)
          /* For OpenACC, defer expansion of value to avoid transfering
             privatized common block data instead of im-/explicitly transfered
             variables which are in common blocks.  */
          ;
        else
          {
            tree value = get_base_address (DECL_VALUE_EXPR (decl));
   
            if (value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
              return omp_notice_threadprivate_variable (ctx, decl, value);
          }
      }

> @@ -7353,7 +7374,9 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
>    n = splay_tree_lookup (ctx->variables, (splay_tree_key)decl);
>    if ((ctx->region_type & ORT_TARGET) != 0)
>      {
> -      ret = lang_hooks.decls.omp_disregard_value_expr (decl, true);
> +      /* For OpenACC, as remarked above, defer expansion.  */
> +      shared = !(ctx->region_type & ORT_ACC);
> +      ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);

Also more explicit, easier to read:

    if (ctx->region_type & ORT_ACC)
      /* For OpenACC, as remarked above, defer expansion.  */
      shared = false;
    else
      shared = true;

> @@ -7521,6 +7544,9 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
>      }
>  
>    shared = ((flags | n->value) & GOVD_SHARED) != 0;
> +  /* For OpenACC, cf. remark above regaring common blocks.  */
> +  if (ctx->region_type & ORT_ACC)
> +    shared = false;
>    ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);

And again:

    if (ctx->region_type & ORT_ACC)
      /* For OpenACC, cf. remark above regaring common blocks.  */
      shared = false;
    else
      shared = ((flags | n->value) & GOVD_SHARED) != 0;

(In all three cases, using an easy 'if (ctx->region_type & ORT_ACC)' to
point out the special case.)

It's still some kind of voodoo to me -- but at least, you've now also
reviewed this, and it's now documented what's going on.


> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90

> @@ -0,0 +1,69 @@
> +! Test data clauses involving common blocks and common block data.
> +! Specifically, validates early matching errors.
> +
> +subroutine subtest
> +  implicit none
> +  integer, parameter :: n = 10
> +  integer a(n), b(n), c, d(n), e
> +  real*4 x(n), y(n), z, w(n), v
> +  common /blockA/ a, c, x
> +  common /blockB/ b, y, z
> +  !$acc declare link(/blockA/, /blockB/, e, v)
> +end subroutine subtest
> +
> +program test
> +  implicit none
> +  integer, parameter :: n = 10
> +  integer a(n), b(n), c, d(n), e
> +  real*4 x(n), y(n), z, w(n), v
> +  common /blockA/ a, c, x
> +  common /blockB/ b, y, z
> +  !$acc declare link(/blockA/, /blockB/, e, v)
> +
> +  !$acc data copy(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data copyin(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data copyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data create(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data copyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopy(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopyin(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcreate(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data present(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
> +
> +  !$acc parallel private(/blockA/, /blockB/, e, v)
> +  !$acc end parallel
> +
> +  !$acc parallel firstprivate(/blockA/, /blockB/, e, v)
> +  !$acc end parallel
> +
> +  !$acc exit data delete(/blockA/, /blockB/, e, v)
I note there is one single 'exit data' test, but no 'enter data'.

Also, 'update' is missing, to test the 'device' and 'self'/'host' clauses.

> +  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
> +
> +  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
> +end program test

Is there a reason for the duplicated 'deviceptr' testing?

Move 'data deviceptr' up a little bit, next to the other 'data' construct
testing?

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90

Similarly.


Grüße
 Thomas

signature.asc (671 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

Tobias Burnus-3
Hi Thomas,

On 10/25/19 10:43 AM, Thomas Schwinge wrote:
> OK for trunk, with the following few small items considered.

Committed as Rev. 277451 – after a fresh bootstrap and regtesting.

Changes:
* I have now a new test case
libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90 which looks at
omplower.
* In the compile-time *{2,3} test case, there is now also a 'enter data'
and 'update host/self/device' test.
* the libgomp tests have a 'dg-do run'.
* I modified the code in gimplify.c as proposed.


Regarding the new test case: Without the gcc/gimplify.c changes, one has
(see last item before child fn):

     #pragma omp target oacc_parallel map(tofrom:a [len: 400])
map(tofrom:b [len: 400]) map(tofrom:c [len: 4]) map(tofrom:block [len:
812]) [child fn …
     #pragma omp target oacc_kernels map(force_tofrom:i [len: 4])
map(tofrom:y [len: 400]) map(tofrom:x [len: 400])
map(tofrom:kernel_block [len: 804]) map(force_tofrom:c [len: 4])
map(tofrom:block [len: 812])  [child fn …

With the changes of gcc/gimplify.c, one has:

     #pragma omp target oacc_parallel map(tofrom:a [len: 400])
map(tofrom:b [len: 400]) map(tofrom:c [len: 4]) [child fn …
     #pragma omp target oacc_kernels map(force_tofrom:i [len: 4])
map(tofrom:y [len: 400]) map(tofrom:x [len: 400]) map(force_tofrom:c
[len: 4])  [child fn …


And without gimplify.c, the added run-tests indeed fail with:
libgomp: Trying to map into device [0x407100..0x407294) object when
[0x407100..0x407290) is already mapped


Tobias

PS:
> Or, would it be easy to add an OpenACC 'kernels' test case that otherwise
> faild (at run time, say, with aforementioned duplicate mapping errors, or
> would contain "strange"/duplicate/conflicting mapping items in the
> '-fdump-tree-gimple' dump)?

See new test case and result for the current tests.

Additionally, I have applied:

> Wouldn't it be clearer if that latter one were written as follows:
>      if (DECL_HAS_VALUE_EXPR_P (decl))
>        {
>          if (ctx->region_type & ORT_ACC)
>            /* For OpenACC, defer expansion of value to avoid transfering
>               privatized common block data instead of im-/explicitly transfered
>               variables which are in common blocks.  */
>            ;
>          else
>            {
>              tree value = get_base_address (DECL_VALUE_EXPR (decl));
>      
>              if (value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
>                return omp_notice_threadprivate_variable (ctx, decl, value);
>            }
>        }
>
>> @@ -7353,7 +7374,9 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
>>     n = splay_tree_lookup (ctx->variables, (splay_tree_key)decl);
>>     if ((ctx->region_type & ORT_TARGET) != 0)
>>       {
>> -      ret = lang_hooks.decls.omp_disregard_value_expr (decl, true);
>> +      /* For OpenACC, as remarked above, defer expansion.  */
>> +      shared = !(ctx->region_type & ORT_ACC);
>> +      ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);
> Also more explicit, easier to read:
>
>      if (ctx->region_type & ORT_ACC)
>        /* For OpenACC, as remarked above, defer expansion.  */
>        shared = false;
>      else
>        shared = true;
>
>> @@ -7521,6 +7544,9 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
>>       }
>>  
>>     shared = ((flags | n->value) & GOVD_SHARED) != 0;
>> +  /* For OpenACC, cf. remark above regaring common blocks.  */
>> +  if (ctx->region_type & ORT_ACC)
>> +    shared = false;
>>     ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);
> And again:
>
>      if (ctx->region_type & ORT_ACC)
>        /* For OpenACC, cf. remark above regaring common blocks.  */
>        shared = false;
>      else
>        shared = ((flags | n->value) & GOVD_SHARED) != 0;
>
> (In all three cases, using an easy 'if (ctx->region_type & ORT_ACC)' to
> point out the special case.)
>
> It's still some kind of voodoo to me -- but at least, you've now also
> reviewed this, and it's now documented what's going on.

And changed the test case based on:

>> +  !$acc exit data delete(/blockA/, /blockB/, e, v)
> I note there is one single 'exit data' test, but no 'enter data'.
>
> Also, 'update' is missing, to test the 'device' and 'self'/'host' clauses.
>
>> +  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
>> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
>> +
>> +  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
>> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
>> +end program test
> Is there a reason for the duplicated 'deviceptr' testing?
>
> Move 'data deviceptr' up a little bit, next to the other 'data' construct
> testing?
>
>> --- /dev/null
>> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90
> Similarly.

acc-common-committed.diff (28K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

Thomas Schwinge-8
Hi Tobias!

By the way, do you know what's the status is for Fortran common blocks in
OpenMP: supported vs. expected per the specification?


On 2019-10-25T16:36:10+0200, Tobias Burnus <[hidden email]> wrote:
> On 10/25/19 10:43 AM, Thomas Schwinge wrote:
>> Or, would it be easy to add an OpenACC 'kernels' test case that otherwise
>> faild (at run time, say, with aforementioned duplicate mapping errors, or
>> would contain "strange"/duplicate/conflicting mapping items in the
>> '-fdump-tree-gimple' dump)?

> * I have now a new test case
> libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90 which looks at
> omplower.

Thanks.

Curious: why 'omplower' instead of 'gimple' dump?


> Regarding the new test case: Without the gcc/gimplify.c changes, one has
> (see last item before child fn):
>
>      #pragma omp target oacc_parallel map(tofrom:a [len: 400])
> map(tofrom:b [len: 400]) map(tofrom:c [len: 4]) map(tofrom:block [len:
> 812]) [child fn …
>      #pragma omp target oacc_kernels map(force_tofrom:i [len: 4])
> map(tofrom:y [len: 400]) map(tofrom:x [len: 400])
> map(tofrom:kernel_block [len: 804]) map(force_tofrom:c [len: 4])
> map(tofrom:block [len: 812])  [child fn …
>
> With the changes of gcc/gimplify.c, one has:
>
>      #pragma omp target oacc_parallel map(tofrom:a [len: 400])
> map(tofrom:b [len: 400]) map(tofrom:c [len: 4]) [child fn …
>      #pragma omp target oacc_kernels map(force_tofrom:i [len: 4])
> map(tofrom:y [len: 400]) map(tofrom:x [len: 400]) map(force_tofrom:c
> [len: 4])  [child fn …
>
>
> And without gimplify.c, the added run-tests indeed fail with:
> libgomp: Trying to map into device [0x407100..0x407294) object when
> [0x407100..0x407290) is already mapped
OK, good, my suspicion was thus right that there's something "strange"
there.  ;-)

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
> @@ -0,0 +1,39 @@
> +! { dg-options "-fopenacc -fdump-tree-omplower" }

(For later: we usually just use 'dg-additional-options
"-fdump-tree-omplower"'; '-fopenacc' is implied inside '*/goacc/'.)

> +
> +module consts
> +  integer, parameter :: n = 100
> +end module consts
> +
> +program main
> +  use consts
> +  implicit none
> +
> +  integer :: i, j
> +  real ::  a(n) = 0, b(n) = 0, c, d
> +  real ::  x(n) = 0, y(n), z
> +  common /BLOCK/ a, b, c, j, d
> +  common /KERNELS_BLOCK/ x, y, z
> +
> +  c = 1.0
> +  !$acc parallel loop copy(/BLOCK/)
> +  do i = 1, n
> +     a(i) = b(i) + c
> +  end do
> +  !$acc kernels
> +  do i = 1, n
> +     x(i) = y(i) + c
> +  end do
> +  !$acc end kernels
> +end program main
> +
> +! { dg-final { scan-tree-dump-times "omp target oacc_parallel .*map\\(tofrom:a \\\[len: 400\\\]\\)" 1 "omplower" } }
> +! { dg-final { scan-tree-dump-times "omp target oacc_parallel .*map\\(tofrom:b \\\[len: 400\\\]\\\)" 1 "omplower" } }
> +! { dg-final { scan-tree-dump-times "omp target oacc_parallel .*map\\(tofrom:c \\\[len: 4\\\]\\)" 1 "omplower" } }
> +
> +! { dg-final { scan-tree-dump-times "omp target oacc_kernels .*map\\(force_tofrom:i \\\[len: 4\\\]\\)" 1 "omplower" } }
> +! { dg-final { scan-tree-dump-times "omp target oacc_kernels .*map\\(tofrom:x \\\[len: 400\\\]\\)" 1 "omplower" } }
> +! { dg-final { scan-tree-dump-times "omp target oacc_kernels .*map\\(tofrom:y \\\[len: 400\\\]\\\)" 1 "omplower" } }
> +! { dg-final { scan-tree-dump-times "omp target oacc_kernels .*map\\(force_tofrom:c \\\[len: 4\\\]\\)" 1 "omplower" } }
> +
> +! { dg-final { scan-tree-dump-not "map\\(.*:block\\)" "omplower" } }
> +! { dg-final { scan-tree-dump-not "map\\(.*:kernels_block\\)" "omplower" } }
For my understanding: the several unused variables in the common blocks
are to make sure that they don't cause any issues, don't get mapped at
all?

I we were to add to 'gfortran.dg/goacc/common-block-3.f90' a test case
for the upcoming OpenACC 'serial' construct (which basically equals the
OpenACC 'parallel' construct), would we copy/adapt the 'parallel' 'BLOCK'
test case, or add a new, separate common block?

Or, asking the other way round: why aren't in the current test case,
'parallel' and 'kernels' using the same common block, and both explicitly
'copy' the common block vs. not do that?


> * In the compile-time *{2,3} test case, there is now also a 'enter data'
> and 'update host/self/device' test.

;-) Heh, 'update' got inside the 'parallel' region:

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90
> @@ -0,0 +1,74 @@
> +[...]
> +  !$acc parallel firstprivate(/blockA/, /blockB/, e, v)
> +  !$acc update device(/blockA/)
> +  !$acc update self(/blockB/, v)
> +  !$acc update host(/blockA/, e, /blockB/)
> +  !$acc end parallel
> +[...]

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90

Likewise.

As obvoius; see attached, committed "Fix OpenACC directives nesting in
'gfortran.dg/goacc/common-block-1.f90',
'gfortran.dg/goacc/common-block-2.f90'" to trunk in r278047.


Grüße
 Thomas



From 068b41bc6db50d6f600ef95049f41dbbd12f5216 Mon Sep 17 00:00:00 2001
From: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Mon, 11 Nov 2019 09:26:40 +0000
Subject: [PATCH] Fix OpenACC directives nesting in
 'gfortran.dg/goacc/common-block-1.f90',
 'gfortran.dg/goacc/common-block-2.f90'

        gcc/testsuite/
        * gfortran.dg/goacc/common-block-1.f90: Fix OpenACC directives
        nesting.
        * gfortran.dg/goacc/common-block-2.f90: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@278047 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/testsuite/ChangeLog                            | 6 ++++++
 gcc/testsuite/gfortran.dg/goacc/common-block-1.f90 | 3 ++-
 gcc/testsuite/gfortran.dg/goacc/common-block-2.f90 | 3 ++-
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index cc60856a6a63..f8e626b2fd43 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2019-11-11  Thomas Schwinge  <[hidden email]>
+
+ * gfortran.dg/goacc/common-block-1.f90: Fix OpenACC directives
+ nesting.
+ * gfortran.dg/goacc/common-block-2.f90: Likewise.
+
 2019-11-11  Jiufu Guo  <[hidden email]>
 
  PR tree-optimization/88760
diff --git a/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90 b/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90
index ea437526b464..228637f5883c 100644
--- a/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90
@@ -55,10 +55,11 @@ program test
   !$acc end parallel
 
   !$acc parallel firstprivate(/blockA/, /blockB/, e, v)
+  !$acc end parallel
+
   !$acc update device(/blockA/)
   !$acc update self(/blockB/, v)
   !$acc update host(/blockA/, e, /blockB/)
-  !$acc end parallel
 
   !$acc enter data pcopyin(/blockA/, /blockB/, e, v)
   !$acc exit data delete(/blockA/, /blockB/, e, v)
diff --git a/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90 b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90
index 1ba945019f9e..5d49f6195b84 100644
--- a/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90
@@ -43,10 +43,11 @@ program test
   !$acc end parallel
 
   !$acc parallel firstprivate(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
+  !$acc end parallel
+
   !$acc update device(b, /blockA/, x) ! { dg-error "Symbol .x. present on multiple clauses" }
   !$acc update self(z, /blockB/, v) ! { dg-error "Symbol .z. present on multiple clauses" }
   !$acc update host(/blockA/, c) ! { dg-error "Symbol .c. present on multiple clauses" }
-  !$acc end parallel
 
   !$acc enter data copyin(/blockB/, e, v, a, c, y) ! { dg-error "Symbol .y. present on multiple clauses" }
   !$acc exit data delete(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
--
2.17.1


signature.asc (671 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

Thomas Schwinge-8
Hi Tobias!

On 2019-11-25T15:02:16+0100, Tobias Burnus <[hidden email]> wrote:
> sorry for the belated reply.

Eh, no worries -- I'm way more behind on things...


> On 11/11/19 10:39 AM, Thomas Schwinge wrote:
>> By the way, do you know what's the status is for Fortran common blocks in
>> OpenMP: supported vs. expected per the specification?
>
> No; however, I had a quick glance at the spec and at the test cases;
> both compile-time and run-time test have some coverage, although I
> didn't spot a run-time test for one 'omp target'.

Thanks.

> Definition (3.32.1 in F2018): "blank common" = "unnamed common block".
> 'common /name/ x" (continues) define the common block named "name" by
> adding 'x' to it. While "common // y" or "common y" appends 'y' to the
> blank common.

Thanks for the concise summary.

> In OpenMP 5, common blocks appear twice – once [2.1, p.39, ll.11ff.] as
> general rule in the definition of "list item" (which are inherited by
> "extended list item" and "locator-list item"). [There are also some
> constraints and notes regarding common blocks)]. It does not really tell
> whether blank commons are permitted or not; some description is
> explicitly for named-common variables, leaving blank-common ones out
> (and undefined). But later sections explicitly make reference to blank
> commons, hence, one can assume they are permitted unless explicitly
> stated that they are not.

Yes, I go by the assumption that everything contained in the base
languages of OpenACC/OpenMP (so, the respective C, C++, Fortran
standards), should also work in an OpenACC/OpenMP context in a sensible
manner (detailed/clarified in the respective specification as necessary),
and if not supported then that ought to be spelled out explicitly.  (For
example, see either the "catch-all" notes in OpenACC 3.0,
1.7. "References", or the more in-detail notes in specific sections.)
Anything else I'd consider a bug in the respective specification, which
should be reported/fixed.

That said, if you think OpenMP needs to clarify whether Fortran blank
common blocks are supported or not, then file an issue or directly submit
a pull request against the specification on <https://github.com/OpenMP>
(once we've got access).

> And then very selectively for some items:
> * allocate – only with default allocator.
> * declare target – some restrictions and no blank commons
> * depend clause – no common permitted
> * threadprivate – some notes and explanation of the syntax (why?)
>    also only here requirement regarding common blocks with bind(c)
>    (why not also for declare target?)
> * linear clause – no common permitted
> * copyin – some notes
> * copyprivate – some notes
>
> As target test cases were suspiciously left out, I tries '!$omp target
> map(/name/)' which was rejected. I think one should add test cases for
> newer features – which mostly means 'omp target' and add the missing
> common-block checks. – And one has to find out why blank commons are not
> permitted and for the places where they are permitted, support has to be
> added.
ACK.  Instead of "burying" such things in long emails, I like to see GCC
PRs filed, which can then be actioned on individually.

> Talking about blank common blocks, the current OpenACC implementation
> does not seem to like them (see below); the spec (2.7) does not mention
> blank common blocks at all. – It talks about name between two slashes,
> but leaves it open whether the name can also be an empty string.

My assumption would thus be: yes, ought to be supported -- but I haven't
thought through whether that makes sense, so...

> common // x,y  !blank common
> !$acc parallel copyin(//)
> !$acc end parallel
> end
>
> fails with:
>
>      2 | !$acc parallel copyin(//)
>        |                       1
> Error: Syntax error in OpenMP variable list at (1)
..., please test with the PGI compiler (just to get more data), and
determine whether that makes sense to support in an OpenACC context
(likewise for OpenMP, of course), and then (once you've got access)
either file an issue, or (better) directly submit a pull request for
<https://github.com/OpenACC/openacc-spec/> to clarify that.  Sometimes
it's as easy as replacing non-standard text ("name between two slashes")
with the corresponding standard text (whatever the Fortran specification
calls this).


> On 2019-10-25T16:36:10+0200, Tobias Burnus<[hidden email]>  wrote:
>
>>> * I have now a new test case
>>> libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90 which looks at
>>> omplower.
>> Thanks. Curious: why 'omplower' instead of 'gimple' dump?
>
> So far I found -fdump-tree-original, -fdump-omplower and
> -fdump-optimized quite useful – I have so far not used
> -fdump-tree-gimple, hence, I cannot state what's the advantage of the
> latter.
My rationale is that your code changes are in 'gcc/gimplify.c', so you'd
test for that stuff in the 'gimple' dump (which is between 'original' and
'omplower').

> The original dump I like because it shows what the FE generates, the
> omplower dump has the result after lowering including the assignments to
> the omp_arr variables but it keeps a readable pragma line (avoids
> guessing what the kind value was again etc.) while the optimized dump
> really shows what ends up in the call (with the pro and con that it
> depends on the optimization option).
>
> If you think it makes sense, one can switch.

I think it does (but please argue if it doesn't to you), but that's not
high priority, of course.


>>> --- /dev/null
>>> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
>>> @@ -0,0 +1,39 @@
>>> +! { dg-options "-fopenacc -fdump-tree-omplower" }
>> (For later: we usually just use 'dg-additional-options
>> "-fdump-tree-omplower"'; '-fopenacc' is implied inside '*/goacc/'.)
>
> My impression was that it is not effective unless repeated – and I think
> I even tries it. In gcc/testsuite/gfortran.dg/gomp/ all 64 test cases
> with dg-options specify re-add the "-fopenmp".
>
> And in gcc/testsuite/gfortran.dg/goacc, 4 test cases use dg-options, 3
> specify -fopenacc and one doesn't. I wouldn't call this 'usually'
Note that I said 'dg-additional-options', not 'dg-options', so please
re-consider.


> and I
> wonder whether -fopenacc is really active for goacc/pr84963.f90. (The
> file uses no directive at all!)
>
> Hence, I wonder whether one should add it to goacc/pr84963.f90 – see
> attached patch.

Good find.  Please confirm that indeed this is meant to enable OpenACC
processing by reading discussion in <https://gcc.gnu.org/PR84963> and
<http://mid.mail-archive.com/ead44f52-ee30-6b5e-e18a-5dd49d9a2614@...>.
(Likely yes, of course.)

> Without the patch, I get:
>
> Executing on host: …/gfortran/../../gfortran -B…/gfortran/../../
> -B…/./libgfortran/ …/goacc/pr84963.f90 -fno-diagnostics-show-caret
> -fno-diagnostics-show-line-numbers -fdiagnostics-color=never 
> -fdiagnostics-urls=never    -O  -O2 -S -o pr84963.s    (timeout = 300)

(Confirmed.)

> And with the patch, I get
> … -fdiagnostics-urls=never -O -fopenacc -O2 -S -o pr84963.s …

> --- a/gcc/testsuite/gfortran.dg/goacc/pr84963.f90
> +++ b/gcc/testsuite/gfortran.dg/goacc/pr84963.f90
> @@ -1,5 +1,5 @@
>  ! PR ipa/84963
> -! { dg-options "-O2" }
> +! { dg-options "-fopenacc -O2" }

I suggest to change 'dg-options "-O2"' to 'dg-additional-options "-O2"'.
Please verify, and then commit that to trunk, gcc-8-branch, gcc-7-branch,
referencing PR84963 in the ChangeLog update.

That's a separate fix/commit from everything else discussed here.


>>> +  integer :: i, j
>>> +  real ::  a(n) = 0, b(n) = 0, c, d
>>> +  real ::  x(n) = 0, y(n), z
>>> +  common /BLOCK/ a, b, c, j, d
>>> +  common /KERNELS_BLOCK/ x, y, z
>> For my understanding: the several unused variables in the common blocks
>> are to make sure that they don't cause any issues, don't get mapped at
>> all?
>
> I think that's the idea
Then let's please document that in the test case sources, for that's not
quite obvious.

> – common-block variables which are not used
> should also not get mapped (= optimization). But, obviously, they should
> also not cause any issues.
>
> Hence, one could/should also check that they are not mapped – done in
> the attached patch.

Good, thanks.

> --- a/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
> @@ -9,7 +9,7 @@ program main
>    implicit none
>  
>    integer :: i, j
> -  real ::  a(n) = 0, b(n) = 0, c, d
> +  real ::  a(n) = 0, b(n) = 0, c, d, e(n)
>    real ::  x(n) = 0, y(n), z
>    common /BLOCK/ a, b, c, j, d
>    common /KERNELS_BLOCK/ x, y, z
> @@ -35,5 +35,8 @@ end program main
>  ! { dg-final { scan-tree-dump-times "omp target oacc_kernels .*map\\(tofrom:y \\\[len: 400\\\]\\\)" 1 "omplower" } }
>  ! { dg-final { scan-tree-dump-times "omp target oacc_kernels .*map\\(force_tofrom:c \\\[len: 4\\\]\\)" 1 "omplower" } }
>  
> -! { dg-final { scan-tree-dump-not "map\\(.*:block\\)" "omplower" } }
> -! { dg-final { scan-tree-dump-not "map\\(.*:kernels_block\\)" "omplower" } }
> +! { dg-final { scan-tree-dump-not "map\\(.*:block" "omplower" } }
> +! { dg-final { scan-tree-dump-not "map\\(.*:kernels_block" "omplower" } }
> +! { dg-final { scan-tree-dump-not "map\\(.*:d " "omplower" } }
> +! { dg-final { scan-tree-dump-not "map\\(.*:e " "omplower" } }
> +! { dg-final { scan-tree-dump-not "map\\(.*:z " "omplower" } }
OK for trunk, with some suitable commentary added ("expecting no mapping
of un-referenced blocks/variables", or something like that, before the
'scan-tree-dump-not' ones).  To record the review effort, please include
"Reviewed-by: Thomas Schwinge <[hidden email]>" in the commit
log, see <https://gcc.gnu.org/wiki/Reviewed-by>.


>> I we were to add to 'gfortran.dg/goacc/common-block-3.f90' a test case
>> for the upcoming OpenACC 'serial' construct (which basically equals the
>> OpenACC 'parallel' construct), would we copy/adapt the 'parallel' 'BLOCK'
>> test case, or add a new, separate common block?
>>
>> Or, asking the other way round: why aren't in the current test case,
>> 'parallel' and 'kernels' using the same common block, and both explicitly
>> 'copy' the common block vs. not do that?
>
> I think one could do either way – by itself, the blocks should be
> independent and, hence, could re-use the same common block. Re-using the
> same common block tests other things, hence, maybe you should do so –
> and use different variables from both blocks.
ACK, thanks.


Grüße
 Thomas

signature.asc (671 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

Tobias Burnus-3
In reply to this post by Thomas Schwinge-8
Hi Thomas,

Quick version: The attached patch seems to work, kind of,  but fails at
run time with:
libgomp: Trying to map into device [0x407218..0x40721c) object when
[0x407210..0x40721c) is already mapped
This marks the common-block decl but not the common-block vars as
'device resident' (alias "omp declare target").

The attached with '#if 0'  set to '1' does not work as it gives an ICE
in lto1. – If one only marks the common-block variables, it fails as the
ME check complains "variable 'block' has been referenced in offloaded
code but hasn't been marked to be included in the offloaded code" –

I think the first version is fine, but it seems as if the ME needs to
use pcreate and not create for those. I think that's also the reason for
the odd is-program check mentioned at the very bottom.

Tobias

PS: Hmm, I really wonder why it seemed to have passed before. Looking at
the code, it cannot have passed — more below. That goes wrong since
r272453 for PR85221 (well, it can't before). I don't quickly see whether
it also affects OpenMP or other clauses.

I think for a proper fix it would be very useful to know some more
details about the intention of 'declare device_resident' (existing only
on the device, existing on both host and device etc.). Cf. previous email.

In terms of this issue, if one does:  "integer :: a, b, c; common /name/
a,b,c; !$acc declare device_resident(a)", should this make all of the
common-block variables as device resident or not? Internally, one gets
for declare-5.f90 the following, i.e. /another/ is the common name and
g, h and i are common-name variables:

   static integer(kind=4) g [value-expr: another.g];
   static integer(kind=4) h[3] [value-expr: another.h];
   static integer(kind=4) i[3] [value-expr: another.i];

For the test case, the issue is that 'gfc_get_symbol_decl' only called
after it's tree representation (sym->backend_decl) has already been
created; this happens for common blocks. – The attached patch fixes
this, marking the common block decl and all its variables as declare
device_resident.

One could think of handling other attributes (which ones?). For
EQUIVALENCE in commons, the attributes are collected using
accumulate_equivalence_attributes – and for normal variables, it is
handled in trans-decl.c's add_attributes_to_decl

  * * *

Additionally, and unrelated to the test case, the following code looks
very suspicious (from finish_oacc_declare in fortran/trans-decl.c):

   module_oacc_clauses = NULL;
   gfc_traverse_ns (ns, find_module_oacc_declare_clauses);
   if (module_oacc_clauses && sym->attr.flavor == FL_PROGRAM)

First, it very much looks like memory leak – one creates a linked list,
but always dumps it w/o using or freeing it if one is currently not
processing the main program. Additionally, it assumes that the main program
has a full view of module-declared 'declare device_resident' variables, but
it is trivial to construct programs where the main program does not see this
property. Most trivial example is:
   subroutine foo()
     use module_w_device_decl
   end subroutine
independent whether that function exists as such or as module procedure or
(at some place) is a procedure contained in another procedure. A general
assumption is also that the whole program is compiled with -fopenacc
and that the main program is written in Fortran and not, e.g., in C or C++.


common-decl-v4.diff (1015 bytes) Download Attachment