[PR90743] Fortran 'allocatable' with OpenACC data/OpenMP 'target' 'map' clauses: 'declare'

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

[PR90743] Fortran 'allocatable' with OpenACC data/OpenMP 'target' 'map' clauses: 'declare'

Thomas Schwinge-8

Hi!

On Wed, 05 Jun 2019 11:25:07 +0200, I wrote:
> I [...] had a look at what OpenMP 5.0
> is saying about Fortran 'allocatable' in 'map' clauses [...]

Will you (Jakub, and/or Fortran guys), please have a look at the attached
test case.

If I disable the '!$omp declare target (ar)', then this works with
offloading enabled, otherwise it fails, and here is my understanding what
happens.

With '!$omp declare target (ar)' active, the array descriptor globally
gets allocated on the device, via the 'offload_vars' handling in
'gcc/omp-offload.c' etc.

Then, in 'gcc/omp-low.c:scan_sharing_clauses', the 'GOMP_MAP_TO_PSET(ar)'
(which would update the array descriptor on the device after the
host-side 'allocate') gets removed, because 'Global variables with "omp
declare target" attribute don't need to be copied, the receiver side will
use them directly'.  So, on the device, we'll still have the dummy
(unallocated) array descriptor, and will thus fail.

But even with that behavior disabled, we'll still fail, because in
'libgomp/target.c:gomp_map_vars_internal', when processing the
'GOMP_MAP_TO_PSET(ar)', we find that 'ar' has already been mapped
(globally, as mentioned above), so for 'n && n->refcount !=
REFCOUNT_LINK', we'll call 'gomp_map_vars_existing', and that one again
won't update device-side the array descriptor, because it's not
'GOMP_MAP_ALWAYS_TO_P'.

Indeed, if I use '!$omp declare target link(ar)' ('link' added), then
things seem to work as expected.

Unless I got something wrong, at which level do you suggest this should
be fixed?


Grüße
 Thomas



! { dg-do run }

module mod
  implicit none
  integer, parameter :: n = 40
  integer, allocatable :: ar(:,:,:)
  !$omp declare target (ar)
end module mod

program main
  use mod
  implicit none

  integer :: i
  integer, parameter :: ar_rank = rank(ar)
  integer :: ar_rank_v
  integer :: ar_size_v
  integer, dimension(ar_rank) :: ar_extent_v
  integer, dimension(ar_rank) :: ar_shape_v

  allocate (ar(n,0:n+1,-1:n+2))
  !$omp target enter data map(alloc:ar)

  !$omp target map(from:ar_rank_v, ar_size_v, ar_extent_v, ar_shape_v)
  ar_rank_v = rank(ar)
  ar_size_v = size(ar)
  do i = 1, ar_rank
     ar_extent_v(i) = size(ar, i)
  end do
  ar_shape_v = shape(ar)
  !$omp end target

  print *, ar_rank_v
  if (ar_rank_v /= rank(ar)) error stop
  print *, ar_size_v
  if (ar_size_v /= size(ar)) error stop
  print *, ar_extent_v
  if (any (ar_extent_v /= shape(ar))) error stop
  print *, ar_shape_v
  if (any (ar_shape_v /= shape(ar))) error stop
end program main

signature.asc (671 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PR90743] Fortran 'allocatable' with OpenACC data/OpenMP 'target' 'map' clauses: 'declare'

Thomas Schwinge-8
Hi!

On Fri, 21 Jun 2019 13:46:09 +0200, I wrote:

> On Wed, 05 Jun 2019 11:25:07 +0200, I wrote:
> > I [...] had a look at what OpenMP 5.0
> > is saying about Fortran 'allocatable' in 'map' clauses [...]
>
> Will you (Jakub, and/or Fortran guys), please have a look at the attached
> test case.
>
> If I disable the '!$omp declare target (ar)', then this works with
> offloading enabled, otherwise it fails, and here is my understanding what
> happens.
>
> With '!$omp declare target (ar)' active, the array descriptor globally
> gets allocated on the device, via the 'offload_vars' handling in
> 'gcc/omp-offload.c' etc.
>
> Then, in 'gcc/omp-low.c:scan_sharing_clauses', the 'GOMP_MAP_TO_PSET(ar)'
> (which would update the array descriptor on the device after the
> host-side 'allocate') gets removed, because 'Global variables with "omp
> declare target" attribute don't need to be copied, the receiver side will
> use them directly'.  So, on the device, we'll still have the dummy
> (unallocated) array descriptor, and will thus fail.
>
> But even with that behavior disabled, we'll still fail, because in
> 'libgomp/target.c:gomp_map_vars_internal', when processing the
> 'GOMP_MAP_TO_PSET(ar)', we find that 'ar' has already been mapped
> (globally, as mentioned above), so for 'n && n->refcount !=
> REFCOUNT_LINK', we'll call 'gomp_map_vars_existing', and that one again
> won't update device-side the array descriptor, because it's not
> 'GOMP_MAP_ALWAYS_TO_P'.
>
> Indeed, if I use '!$omp declare target link(ar)' ('link' added), then
> things seem to work as expected.
>
> Unless I got something wrong, at which level do you suggest this should
> be fixed?
So, handling 'declare'd 'allocatable's via the 'link' case does seem to
make some sense to me -- but not yet fully thought through, for example,
how that'd relate to the (default) 'to' property of '!$omp declare target
[to](ar)'?  Is there even something in OpenMP that we'd indeed copy data
to the device (as opposed to just allocate)?  Because, in OpenACC, we
actually do have '!$acc declare copyin', etc.  (But what would that mean
with an 'allocatable'?)

Anyway, I tried a WIP patch, and I'm also attaching the previous test
case rewritten to add a '!$omp declare target' routine 'ar_properties' to
read the device-side array properties.  However, compiling that for
offloading results in:

    [...]/target-array-properties-allocatable-declare-1-2.f90:6: error: variable 'ar' has been referenced in offloaded code but hasn't been marked to be included in the offloaded code
        6 |   integer, allocatable :: ar(:,:,:)
          |
    lto1: fatal error: errors during merging of translation units

Same, if I use an explicit '!$omp declare target link(ar)'.  (I thought
that was the very point of the 'link' clause, that you could then
reference such a variable in offloaded code; I shall re-read the
standards again.  Or, is that just another bug?)

(There doesn't seem to be any Fortran libgomp test case exercising the
'!$omp declare target link'?)

For the record, for the corresponding OpenACC code, we get:

    [...]/array-properties-allocatable-declare-1-2.f90:6:0:
   
        6 |   integer, allocatable :: ar(:,:,:)
          |
    Error: 'ar' with 'link' clause used in 'routine' function
    [...]/array-properties-allocatable-declare-1-2.f90:6:0: Error: 'ar' with 'link' clause used in 'routine' function
    [...]/array-properties-allocatable-declare-1-2.f90:6:0: Error: 'ar' with 'link' clause used in 'routine' function

..., which is different -- but not better.

WIP patch:

diff --git gcc/fortran/trans-common.c gcc/fortran/trans-common.c
index debdbd98ac0..5c2cd9b539e 100644
--- gcc/fortran/trans-common.c
+++ gcc/fortran/trans-common.c
@@ -452,16 +452,18 @@ build_common_decl (gfc_common_head *com, tree union_type, bool is_init)
       DECL_USER_ALIGN (decl) = 0;
       GFC_DECL_COMMON_OR_EQUIV (decl) = 1;
 
       gfc_set_decl_location (decl, &com->where);
 
       if (com->threadprivate)
  set_decl_tls_model (decl, decl_default_tls_model (decl));
 
+      /* Not possible to get an 'allocatable' here, no special handling
+ required.  */
       if (com->omp_declare_target_link)
  DECL_ATTRIBUTES (decl)
   = tree_cons (get_identifier ("omp declare target link"),
        NULL_TREE, DECL_ATTRIBUTES (decl));
       else if (com->omp_declare_target)
  DECL_ATTRIBUTES (decl)
   = tree_cons (get_identifier ("omp declare target"),
        NULL_TREE, DECL_ATTRIBUTES (decl));
diff --git gcc/fortran/trans-decl.c gcc/fortran/trans-decl.c
index 8803525f6ae..029962f65c6 100644
--- gcc/fortran/trans-decl.c
+++ gcc/fortran/trans-decl.c
@@ -1429,18 +1429,26 @@ add_attributes_to_decl (symbol_attribute sym_attr, tree list)
       tree c = build_omp_clause (UNKNOWN_LOCATION, code);
       OMP_CLAUSE_CHAIN (c) = clauses;
       clauses = c;
 
       tree dims = oacc_build_routine_dims (clauses);
       list = oacc_replace_fn_attrib_attr (list, dims);
     }
 
+  /* For an 'allocatable', it's the user's responsibility to 'allocate' it, and
+     create a device copy, so here, we handle it like the 'link' case.  */
   if (sym_attr.omp_declare_target_link
-      || sym_attr.oacc_declare_link)
+      || sym_attr.oacc_declare_link
+      || ((sym_attr.omp_declare_target
+  || sym_attr.oacc_declare_create
+  || sym_attr.oacc_declare_copyin
+  || sym_attr.oacc_declare_deviceptr
+  || sym_attr.oacc_declare_device_resident)
+  && sym_attr.allocatable))
     list = tree_cons (get_identifier ("omp declare target link"),
       NULL_TREE, list);
   else if (sym_attr.omp_declare_target
    || sym_attr.oacc_declare_create
    || sym_attr.oacc_declare_copyin
    || sym_attr.oacc_declare_deviceptr
    || sym_attr.oacc_declare_device_resident)
     list = tree_cons (get_identifier ("omp declare target"),
@@ -6473,17 +6481,22 @@ find_module_oacc_declare_clauses (gfc_symbol *sym)
  map_op = OMP_MAP_DEVICE_RESIDENT;
 
       if (sym->attr.oacc_declare_create
   || sym->attr.oacc_declare_copyin
   || sym->attr.oacc_declare_deviceptr
   || sym->attr.oacc_declare_device_resident)
  {
   sym->attr.referenced = 1;
-  add_clause (sym, map_op);
+  /* For an 'allocatable', it's the user's responsibility to 'allocate'
+     it, and create a device copy, so here, we handle it like the
+     'link' case, and don't add it to the outer OpenACC 'data'
+     construct.  */
+  if (!sym->attr.allocatable)
+    add_clause (sym, map_op);
  }
     }
 }
 
 
 void
 finish_oacc_declare (gfc_namespace *ns, gfc_symbol *sym, bool block)
 {


Grüße
 Thomas



! { dg-do run }

module mod
  implicit none
  integer, parameter :: n = 40
  integer, allocatable :: ar(:,:,:)
  integer, parameter :: ar_rank = rank(ar)
  integer :: ar_rank_v
  integer :: ar_size_v
  integer, dimension(ar_rank) :: ar_extent_v
  integer, dimension(ar_rank) :: ar_shape_v
  !$omp declare target (ar, ar_rank_v, ar_size_v, ar_extent_v, ar_shape_v)
end module mod

subroutine ar_properties
  use mod
  implicit none
  !$omp declare target

  integer :: i

  ar_rank_v = rank(ar)
  ar_size_v = size(ar)
  do i = 1, ar_rank
     ar_extent_v(i) = size(ar, i)
  end do
  ar_shape_v = shape(ar)
end subroutine ar_properties

program main
  use mod
  implicit none

  allocate (ar(n,0:n+1,-1:n+2))
  !$omp target enter data map(alloc:ar)

  !$omp target map(from:ar_rank_v, ar_size_v, ar_extent_v, ar_shape_v)
  call ar_properties
  !$omp end target

  !$omp target update from(ar_rank_v, ar_size_v, ar_extent_v, ar_shape_v)

  print *, ar_rank_v
  if (ar_rank_v /= rank(ar)) error stop
  print *, ar_size_v
  if (ar_size_v /= size(ar)) error stop
  print *, ar_extent_v
  if (any (ar_extent_v /= shape(ar))) error stop
  print *, ar_shape_v
  if (any (ar_shape_v /= shape(ar))) error stop
end program main

signature.asc (671 bytes) Download Attachment