[patch][OpenMP,Fortran] Fix trans-openmp.c, add use_device_addr run-time test case (has known issues with actual offloading)

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

[patch][OpenMP,Fortran] Fix trans-openmp.c, add use_device_addr run-time test case (has known issues with actual offloading)

Tobias Burnus-3
This use_device_addr patch does:
* Add trivial but crucial missing change to "fortran/trans-openmp.c (Ups!)
* Add a comprehensive set of test cases (only scalars and
non-array-descriptor arrays)

Remarks:

* The test cases are known to mishandle "cc/dd/ee/ff" (= scalars with
allocatable + pointer attribute). That's only visible with actual
offloading as with shared memory the errors cancel and it works.

* OpenMP spec: As the test case shows, "is_device_ptr" with
"type(c_ptr), VALUE" would be nice; but OpenMP's spec for is_device_ptr
only permits dummy arguments without VALUE attribute.

* Known shortcomings (omp-low.c implementation; not tested for): arrays
with descriptor (alloctable/pointer arrays), absent optional variables,
polymorphic variables. [For the first two, draft patches exist.]

Comments, esp. to the test case?

Tobias

PS: My next planned task is to fix the "scalars with allocatable +
pointer attribute" issue revealed in this case.


use_device_addr_fix-testcase.diff (47K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [patch][OpenMP,Fortran] Fix several OpenMP use_device_addr/map/update errors found by a length test case

Jakub Jelinek
On Tue, Oct 08, 2019 at 01:11:53AM +0200, Tobias Burnus wrote:

> gcc/fortran/
> * f95-lang.c (LANG_HOOKS_OMP_IS_ALLOCATABLE_OR_PTR): Re-define to
> gfc_omp_is_allocatable_or_ptr.
> * trans-decl.c (create_function_arglist): Set GFC_DECL_OPTIONAL_ARGUMENT
> only if not passed by value.
> * trans-openmp.c (gfc_omp_is_allocatable_or_ptr): New.
> (gfc_trans_omp_clauses): Actually pass use_device_addr on to the middle
> end; for MAP, handle (present) optional arguments; for target update,
> handle allocatable/pointer scalars.
> * trans.h (gfc_omp_is_allocatable_or_ptr): Declare.
>
> gcc/
> * langhooks-def.h (LANG_HOOKS_OMP_IS_ALLOCATABLE_OR_PTR): Define.
> (LANG_HOOKS_DECLS): Add it.
> * langhooks.h (lang_hooks_for_decls): Add omp_is_allocatable_or_ptr;
> update comment for omp_is_optional_argument.
> * omp-general.c (omp_is_allocatable_or_ptr): New.
> * omp-general.h (omp_is_allocatable_or_ptr): Declare.
> * omp-low.c (scan_sharing_clauses, lower_omp_target): Handle
> Fortran's optional arguments and allocatable/pointer scalars
> with use_device_addr.

This looks reasonable, with a small nit.

> libgomp/
> * testsuite/libgomp.fortran/use_device_addr-1.f90: New.
> * testsuite/libgomp.fortran/use_device_addr-2.f90: New.

I'm worried about the tests though.

> @@ -11678,7 +11680,18 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
>    }
>   else
>    {
> -    var = build_fold_addr_expr (var);
> +    // While MAP is handled explicitly by the FE,
> +    // for 'target update', only the identified is passed.

omp-low.c (like most of gcc/*.c) uses /* ... */ comments almost everywhere,
can you please just use the same here?
Otherwise the non-test part LGTM.

What worries me about the test is that the officially only portable way
to use it in a target region is is_device_ptr.  I believe the intention was
to allow the implementation to transform the pointers from e.g.
use_device_ptr to whatever the implementation wants, where it could be e.g.
a structure containing pointer and something or whatever and is_device_ptr
actually finishing it up, even when in our implementation is_device_ptr is
basically just copying the pointer bits from host to device (==
firstprivate).
Yes, I know there is the restriction that the is_device_ptr list item must
be a dummy variable without VALUE/ALLOCATABLE/POINTER.  VALUE itself
wouldn't be a big deal, we could call just by reference instead of value,
but allocatable/pointer I bet is a problem.  So, if there is no portable
way in Fortran to pass c_loc result as a dummy argument to some subprogram
where the dummy argument is not allocatable/pointer, and the caller doesn't
access the actual data in any way, I'm afraid we need to do what you are
doing.  But then the test should start with a comment that it is not
portable and assumes that is_device_ptr doesn't need to transform the
use_device_ptr addresses in any way.  Or another option would be to use C
code for the actual target region, c_loc is for C pointers and if you pass
to C code the c_loc as a pointer and pass in the array size/whatever else it
needs to know, it can then implement it portably with is_device_ptr clause
on the C pointer.

        Jakub