gcc.gnu.org Git - gcc.git/log

OpenMP: need_device_ptr and need_device_addr support for adjust_args

This patch adds support for the "need_device_addr" modifier to the
"adjust args" clause for the "declare variant" directive, and
extends/re-works the support for "need_device_ptr" as well.

This patch builds on waffl3x's recently posted patch, "OpenMP: C/C++
adjust-args numeric ranges", here.

https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681806.html

In C++, "need_device_addr" supports mapping reference arguments to
device pointers. In Fortran, it similarly supports arguments passed
by reference, the default for the language, in contrast to
"need_device_ptr" which is used to map arguments of c_ptr type. The
C++ support is straightforward, but Fortran has some additional
wrinkles involving arrays passed by descriptor (a new descriptor must
be constructed with a pointer to the array data which is the only part
mapped to the device), plus special cases for passing optional
arguments and a whole array instead of a reference to its first element.

gcc/cp/ChangeLog
* parser.cc (cp_finish_omp_declare_variant): Adjust error messages.

gcc/fortran/ChangeLog
* trans-openmp.cc (gfc_trans_omp_declare_variant): Disallow
polymorphic and optional arguments with need_device_addr for now, but
don't reject need_device_addr entirely.

gcc/ChangeLog
* gimplify.cc (modify_call_for_omp_dispatch): Rework logic for
need_device_ptr and need_device_addr adjustments.

gcc/testsuite/Changelog
* c-c++-common/gomp/adjust-args-10.c: Ignore the new sorry since the
lack of proper diagnostic is already xfail'ed.
* g++.dg/gomp/adjust-args-1.C: Adjust output patterns.
* g++.dg/gomp/adjust-args-17.C: New.
* gcc.dg/gomp/adjust-args-3.c: New.
* gfortran.dg/gomp/adjust-args-14.f90: Don't expect this to fail now.

libgomp/ChangeLog
* libgomp.texi: Mark need_device_addr as supported.
* testsuite/libgomp.c-c++-common/dispatch-3.c: New.
* testsuite/libgomp.c++/need-device-ptr.C: New.
* testsuite/libgomp.fortran/adjust-args-array-descriptor.f90: New.
* testsuite/libgomp.fortran/need-device-ptr.f90: New.

Co-Authored-By: Tobias Burnus <tburnus@baylibre.com>

OpenMP: testsuite fixup for g++.dg/gomp/adjust-args-7.C

I screwed up this testcase when backporting "OpenMP: C/C++ adjust-args
numeric ranges" to OG14.

gcc/testsuite/ChangeLog
* g++.dg/gomp/adjust-args-7.C: Put dg-require-effective-target
in a comment.

OpenMP: Silence uninitialized variable warning in C++ front end.

There's no actual problem with the code here, just a false-positive
warning emitted by some older GCC versions.

gcc/cp/ChangeLog
* parser.cc (cp_finish_omp_declare_variant): Initialize
append_args_last.

(cherry picked from commit c978965b445079abbb88c22ba74de1e26e9f5b81)

libgomp.texi: For HIP interop, mention cpp defines to set

The HIP header files recognize the used compiler, defaulting to either AMD
or Nvidia/CUDA; thus, the alternative way of explicitly defining a macro is
less prominently documented. With GCC, the user has to define the
preprocessor macro manually. Hence, as a service to the user, mention
__HIP_PLATFORM_AMD__ and __HIP_PLATFORM_NVIDIA__ in the interop documentation,
even though it has only indirectly to do with GCC and its interop support.

Note to commit-log readers, only: For Fortran, the hipfort modules can be
used; when compiling the hipfort package (defaults to use gfortran), it
generates the module (*.mod) files in include/hipfort/{amdgcn,nvidia}/ such
that the choice is made by setting the respective include path.

libgomp/ChangeLog:

* libgomp.texi (gcn interop, nvptx interop): For HIP with C/C++, add
a note about setting a preprocessor define.

(cherry picked from commit 4bff3f0b89af9a9aad69b8f85859c0a3667533ae)

libgomp.texi: Document supported OpenMP 'interop' types for nvptx and gcn

Note that this commit also updates the API interface to OpenMP 6.0;
while 5.1 and 5.2 use 'int *' for the the ret_code argument,
OpenMP 6.0 changed this to omp_interop_rc_t *; this enum also exists in
OpenMP 5.1. However, C++ does not like this change such that unless NULL
is passed (i.e. the argument is ignored), OpenMP 5.x and 6.x are not
compatible.

Note that GCC's omp.h already follows OpenMP 6.0 and is now in sync with
the documentation.

libgomp/ChangeLog:

* libgomp.texi (OpenMP 5.1): Add @ref to offload-target specifics
for 'interop'.
(OpenMP 6.0): Mark dispatch's interop clause as implemented.
(omp_get_interop_int, omp_get_interop_str,
omp_get_interop_ptr, omp_get_interop_type_desc): Add @ref to
Offload-Target Specifics; change ret_code argument type to
'omp_interop_rc_t *'.
(Offload-Target Specifics): Document the supported OpenMP
interop foreign runtimes on AMD and Nvidia GPUs.

(cherry picked from commit 2e7c1b589bc58be0e155098cf87d8535d41adeab)

OpenMP, GCN: Add interop-hsa testcase

This testcase ensures that the interop HSA support is sufficient to run
a kernel manually on the same device.

libgomp/ChangeLog:

* testsuite/libgomp.c/interop-hsa.c: New test.

(cherry picked from commit 8d84ea28510054fbbb8a2b7441916bd75e29163f)

OpenMP: C/C++ adjust-args numeric ranges

This is a backport of the proposed mainline patch posted at
https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681806.html

Add support for OpenMP parameter-lists in an adjust_args clause, more
specifically, numeric ranges and parameter indices.  Many bugs are also
fixed along the way.  Most of the fixes rely on the changes to handling of
clause arguments and can't reasonably be split out, while most of the
changes that fix PR119602 came as a side effect of properly handling numeric
ranges with dependent bounds so it makes sense to include it here.

Variadic arguments with an incorrect type are not currently diagnosed, but
handling for them is otherwise functional.  It is unclear how references to
pointers are supposed to be handled, so for now we sorry for that case.

PR c++/119659
PR c++/118859
PR c++/119601
PR c++/119602
PR c++/119775

gcc/c/ChangeLog:

* c-parser.cc (c_omp_numeric_ranges_always_overlap): New function.
(c_parser_omp_parm_list): New function.
(c_finish_omp_declare_variant): Use c_parser_omp_parm_list instead
of c_parser_omp_variable_list.  Refactor, change format of
"omp declare variant variant args" attribute.

gcc/cp/ChangeLog:

PR c++/119659
PR c++/118859
PR c++/119601
PR c++/119602
PR c++/119775
* cp-tree.h (finish_omp_parm_list): New declaration.
(finish_omp_adjust_args): New declaration.
* decl.cc (omp_declare_variant_finalize_one): Refactor and change
attribute unpacking, use finish_omp_parm_list and
finish_omp_adjust_args, refactor append_args diagnostics, add
nbase_parms to append_args attribute, remove special handling for
member functions.
* parser.cc (cp_parser_direct_declarator): Don't pass parms.
(cp_parser_late_return_type_opt): Remove parms parameter.
(cp_parser_omp_parm_list): New function.
(cp_finish_omp_declare_variant): Remove parms parameter.
Add NULL_TREE instead of nbase_args to append_args_tree.  Refactor,
use cp_parser_omp_parm_list not cp_parser_omp_var_list_no_open,
handle "need_device_addr" and remove handling and diagnostics of
parm list arguments that are done too early.  Change format of
unnamed variant attribute.
(cp_parser_late_parsing_omp_declare_simd): Remove parms parameter.
* pt.cc (tsubst_attribute): Copy "omp declare variant base" nodes,
substitute parm list numeric range bounds.
* semantics.cc (finish_omp_parm_list): New function.
(finish_omp_adjust_args): New function.

gcc/fortran/ChangeLog:

* trans-openmp.cc (gfc_trans_omp_declare_variant): Change format of
"omp declare variant variant args" attribute.

gcc/ChangeLog:

* gimplify.cc (modify_call_for_omp_dispatch): Refactor and change
attribute unpacking.  For adjust_args variadic functions, expand
numeric ranges with relative bounds.  Refactor argument adjustment.

libgomp/ChangeLog:

* libgomp.texi: Set 'adjust args' variadic arguments support to Y.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/pr118579.c: Change error text.
* g++.dg/gomp/adjust-args-1.C: Fix error text, add dg-* directives.
* g++.dg/gomp/adjust-args-2.C: Add dg-* directives.
* g++.dg/gomp/append-args-1.C: Add dg-* directives.
* gcc.dg/gomp/adjust-args-1.c: Fix error text, add dg-* directives.
* gcc.dg/gomp/append-args-1.c: Fix error text, add dg-* directives.
* c-c++-common/gomp/adjust-args-7.c: New test.
* c-c++-common/gomp/adjust-args-8.c: New test.
* c-c++-common/gomp/adjust-args-9.c: New test.
* c-c++-common/gomp/adjust-args-10.c: New test.
* c-c++-common/gomp/adjust-args-11.c: New test.
* c-c++-common/gomp/adjust-args-12.c: New test.
* c-c++-common/gomp/adjust-args-13.c: New test.
* c-c++-common/gomp/adjust-args-14.c: New test.
* c-c++-common/gomp/adjust-args-15.c: New test.
* g++.dg/gomp/adjust-args-5.C: New test.
* g++.dg/gomp/adjust-args-6.C: New test.
* g++.dg/gomp/adjust-args-7.C: New test.
* g++.dg/gomp/adjust-args-8.C: New test.
* g++.dg/gomp/adjust-args-9.C: New test.
* g++.dg/gomp/adjust-args-10.C: New test.
* g++.dg/gomp/adjust-args-11.C: New test.
* g++.dg/gomp/adjust-args-12.C: New test.
* g++.dg/gomp/adjust-args-13.C: New test.
* g++.dg/gomp/adjust-args-14.C: New test.
* g++.dg/gomp/adjust-args-15.C: New test.
* g++.dg/gomp/adjust-args-16.C: New test.
* g++.dg/gomp/append-args-9.C: New test.
* g++.dg/gomp/append-args-10.C: New test.
* g++.dg/gomp/append-args-11.C: New test.
* g++.dg/gomp/append-args-omp-interop-t.h: New header.

Signed-off-by: Waffl3x <waffl3x@baylibre.com>

GCN, nvptx offloading: Host/device compatibility: Itanium C++ ABI, DSO Object Destruction API [PR119853, PR119854]

'__dso_handle' for '__cxa_atexit', '__cxa_finalize'. See
<https://itanium-cxx-abi.github.io/cxx-abi/abi.html#dso-dtor>.

PR target/119853
PR target/119854
libgcc/
* config/gcn/crt0.c (_fini_array): Call
'__GCC_offload___cxa_finalize'.
* config/nvptx/gbl-ctors.c (__static_do_global_dtors): Likewise.
libgomp/
* target-cxa-dso-dtor.c: New.
* config/accel/target-cxa-dso-dtor.c: Likewise.
* Makefile.am (libgomp_la_SOURCES): Add it.
* Makefile.in: Regenerate.
* testsuite/libgomp.c++/target-cdtor-1.C: New.
* testsuite/libgomp.c++/target-cdtor-2.C: Likewise.

(cherry picked from commit aafe942227baf8c2bcd4cac2cb150e49a4b895a9)

Add 'libgomp.c-c++-common/target-cdtor-1.c'

libgomp/
* testsuite/libgomp.c-c++-common/target-cdtor-1.c: New.

(cherry picked from commit 40ce48e87c1e7344c622c8eb6bed53f1311f5a0a)

GCN: Properly switch sections in 'gcn_hsa_declare_function_name' [PR119737]

There are GCN/C++ target as well as offloading codes, where the hard-coded
section names in 'gcn_hsa_declare_function_name' do not fit, and assembly thus
fails:

    LLVM ERROR: Size expression must be absolute.

This commit progresses GCN target:

    [-FAIL: g++.dg/init/call1.C  -std=gnu++17 (internal compiler error: Aborted signal terminated program as)-]
    [-FAIL:-]{+PASS:+} g++.dg/init/call1.C  -std=gnu++17 (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} g++.dg/init/call1.C  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}
    [-FAIL: g++.dg/init/call1.C  -std=gnu++26 (internal compiler error: Aborted signal terminated program as)-]
    [-FAIL:-]{+PASS:+} g++.dg/init/call1.C  -std=gnu++26 (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} g++.dg/init/call1.C  -std=gnu++26 [-compilation failed to produce executable-]{+execution test+}
    UNSUPPORTED: g++.dg/init/call1.C  -std=gnu++98: exception handling not supported

..., and GCN offloading:

    [-XFAIL: libgomp.c++/target-exceptions-throw-1.C (internal compiler error: Aborted signal terminated program as)-]
    [-XFAIL: libgomp.c++/target-exceptions-throw-1.C PR119737 at line 7 (test for bogus messages, line )-]
    [-XFAIL:-]{+PASS:+} libgomp.c++/target-exceptions-throw-1.C (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.c++/target-exceptions-throw-1.C [-compilation failed to produce executable-]{+execution test+}
    {+PASS: libgomp.c++/target-exceptions-throw-1.C output pattern test+}

    [-XFAIL: libgomp.c++/target-exceptions-throw-2.C (internal compiler error: Aborted signal terminated program as)-]
    [-XFAIL: libgomp.c++/target-exceptions-throw-2.C PR119737 at line 7 (test for bogus messages, line )-]
    [-XFAIL:-]{+PASS:+} libgomp.c++/target-exceptions-throw-2.C (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.c++/target-exceptions-throw-2.C [-compilation failed to produce executable-]{+execution test+}
    {+PASS: libgomp.c++/target-exceptions-throw-2.C output pattern test+}

    [-XFAIL: libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  (internal compiler error: Aborted signal terminated program as)-]
    [-XFAIL: libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  PR119737 at line 7 (test for bogus messages, line )-]
    [-XFAIL:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  [-compilation failed to produce executable-]{+execution test+}
    {+PASS: libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  output pattern test+}

    [-XFAIL: libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  (internal compiler error: Aborted signal terminated program as)-]
    [-XFAIL: libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  PR119737 at line 9 (test for bogus messages, line )-]
    [-XFAIL:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  [-compilation failed to produce executable-]{+execution test+}
    {+PASS: libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  output pattern test+}

PR target/119737
gcc/
* config/gcn/gcn.cc (gcn_hsa_declare_function_name): Properly
switch sections.
libgomp/
* testsuite/libgomp.c++/target-exceptions-throw-1.C: Remove
PR119737 XFAILing.
* testsuite/libgomp.c++/target-exceptions-throw-2.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-1.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-2.C: Likewise.

Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com>
(cherry picked from commit dfc43afe719898c3eafbed37fac7e6809d8b97ab)

Adjust 'libgomp.c++/target-exceptions-pr118794-1.C' for 'targetm.arm_eabi_unwinder' [PR118794]

Fix-up for commit aa3e72f943032e5f074b2bd2fd06d130dda8760b
"Add test cases for exception handling constructs in dead code for GCN, nvptx target and OpenMP 'target' offloading [PR118794]":
we need to adjust for configurations with 'targetm.arm_eabi_unwinder', as per:

    gcc/config/arm/arm.cc:#define TARGET_ARM_EABI_UNWINDER true
    gcc/config/c6x/c6x.cc:#define TARGET_ARM_EABI_UNWINDER true

..., which for ARM is conditional to '#if ARM_UNWIND_INFO' (defined in
'gcc/config/arm/bpabi.h', used for various GCC configurations), and for
C6x unconditional.

This gets us:

    --- target-exceptions-pr118794-1.C.269t.optimized
    +++ target-exceptions-pr118794-1.C.270t.optimized
    [...]
     __attribute__((omp declare target))
     void f ()
    [...]
       gimple_call <__dt_comp , NULL, &c>
    -  gimple_call <__builtin_eh_pointer, _7, 2>
    -  gimple_call <__builtin_unwind_resume, NULL, _7>
    +  gimple_call <__builtin_cxa_end_cleanup, NULL>

     }
    [...]

PR target/118794
libgomp/
* testsuite/libgomp.c++/target-exceptions-pr118794-1.C: Adjust for
'targetm.arm_eabi_unwinder'.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-GCN.C:
Likewise.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-nvptx.C:
Likewise.

(cherry picked from commit 8a1f5424b04130f88e9dcd5cbecd58300bc5166e)

openmp: Fix up cloning of dynamic C++ initializers for OpenMP target [PR119370]

The following testcase ICEs, because we emit the dynamic initialization twice,
once for host and once for target initialization, and although we use
copy_tree_body_r to unshare it, e.g. for the array initializers it can contain
TARGET_EXPRs with local temporaries (or local temporaries alone).
Now, these temporaries were created when current_function_decl was NULL,
they are outside of any particular function, so they have DECL_CONTEXT NULL.
That is normally fine, gimple_add_tmp_var will set DECL_CONTEXT for them
later on to the host __static_init* function into which they are gimplified.
The problem is that the copy_tree_body_r cloning happens before that (and has
to, gimplification is destructive and so we couldn't gimplify the same tree
again in __omp_static_init* function) and uses auto_var_in_fn_p to see what
needs to be remapped. But that means it doesn't remap temporaries with
NULL DECL_CONTEXT and having the same temporaries in two different functions
results in ICEs (sure, one can e.g. use parent function's temporaries in a
nested function).

The following patch just arranges to set DECL_CONTEXT on the temporaries
to the host dynamic initialization function, so that they get remapped.
If gimplification cared whether DECL_CONTEXT is NULL or not, we could
remember those that we've changed in a vector and undo it afterwards,
but seems it doesn't really care.

2025-03-20 Jakub Jelinek <jakub@redhat.com>

PR c++/119370
* decl2.cc (set_context_for_auto_vars_r): New function.
(emit_partial_init_fini_fn): Call walk_tree with that function
on &init before walk_tree with copy_tree_body_r.

* g++.dg/gomp/pr119370.C: New test.

(cherry picked from commit e0b3eeb67f6d3bfe95591d8fb0c7dfd3f1b3b4ef)

openmp: Fix handling of declare target statics with array type which need destruction [PR118876]

The following testcase ICEs because it attempts to emit the __tcfa function twice,
once when handling the host destruction and once when handling nohost destruction.

This patch fixes it by using __omp_tcfa function for the nohost case and marks it
with the needed "omp declare target" and "omp declare target nohost" attributes.

2025-02-25 Jakub Jelinek <jakub@redhat.com>

PR c++/118876
* cp-tree.h (register_dtor_fn): Add a bool argument defaulted to false.
* decl.cc (start_cleanup_fn): Add OMP_TARGET argument, use
"__omp_tcf" prefix rather than "__tcf" in that case. Add
"omp declare target" and "omp declare target nohost" attributes
to the fndecl.
(register_dtor_fn): Add OMP_TARGET argument, pass it down to
start_cleanup_fn.
* decl2.cc (one_static_initialization_or_destruction): Add OMP_TARGET
argument, pass it down to register_dtor_fn.
(emit_partial_init_fini_fn): Pass omp_target to
one_static_initialization_or_destruction.
(handle_tls_init): Pass false to
one_static_initialization_or_destruction.

* g++.dg/gomp/pr118876.C: New test.

(cherry picked from commit 86a4af2793393e47af6b78cb7094c97914890091)

OpenMP: Constructors and destructors for "declare target" static aggregates: Fix effective-target keyword in test cases

(Most of) the tests added in commit f1bfba3a9b3f31e3e06bfd1911c9f223869ea03f
"OpenMP: Constructors and destructors for "declare target" static aggregates"
had a mismatch between dump file production and its scanning; the former needs
to use 'offload_target_nvptx' (like 'offload_target_amdgcn'), not
'offload_device_nvptx'.

libgomp/
* testsuite/libgomp.c++/static-aggr-constructor-destructor-1.C:
Fix effective-target keyword.
* testsuite/libgomp.c++/static-aggr-constructor-destructor-2.C:
Likewise.
* testsuite/libgomp.c-c++-common/target-is-initial-host-2.c:
Likewise.
* testsuite/libgomp.c-c++-common/target-is-initial-host.c:
Likewise.
* testsuite/libgomp.fortran/target-is-initial-host-2.f90:
Likewise.
* testsuite/libgomp.fortran/target-is-initial-host.f: Likewise.
* testsuite/libgomp.fortran/target-is-initial-host.f90: Likewise.

(cherry picked from commit 9f5d22e3e2b8e4532896a4f3837cb86006d5930c)

libgomp/libgomp.texi: Mention -fno-builtin-omp_is_initial_device

libgomp/ChangeLog:

* libgomp.texi (omp_is_initial_device): Mention
-fno-builtin-omp_is_initial_device and folding by default.

(cherry picked from commit 8b5a8b1f60e7d1a51429f118e0fb3d851abe6cd4)

libgomp.c++/static-aggr-constructor-destructor-{1,2}.C: Fix scan-tree-dump

In principle, the optimized dump should be the same on the host, but as
'nohost' is not handled, is is present. However when ENABLE_OFFLOADING is
false, it is handled early enough to remove the function.

libgomp/ChangeLog:

* testsuite/libgomp.c++/static-aggr-constructor-destructor-1.C: Split
scan-tree-dump into with and without target offload_target_any.
* testsuite/libgomp.c++/static-aggr-constructor-destructor-2.C:
Likewise.

(cherry picked from commit e3a6dec326a127ad549246435b9d3835e9a32407)

OpenMP: Constructors and destructors for "declare target" static aggregates

This commit also compile-time expands (__builtin_)omp_is_initial_device for
both Fortran and C/C++ (unless, -fno-builtin-omp_is_initial_device is used).
But the main change is:

This commit adds support for running constructors and destructors for
static (file-scope) aggregates for C++ objects which are marked with
"declare target" directives on OpenMP offload targets.

Before this commit, space is allocated on the target for such aggregates,
but nothing ever constructs them properly, so they end up zero-initialised.

(See the new test static-aggr-constructor-destructor-3.C for a reason
why running constructors on the target is preferable to e.g. constructing
on the host and then copying the resulting object to the target.)

2024-08-07 Julian Brown <julian@codesourcery.com>
Tobias Burnus <tobias@baylibre.com>

gcc/ChangeLog:

* builtins.def (DEF_GOMP_BUILTIN_COMPILER): Define
DEF_GOMP_BUILTIN_COMPILER to handle the non-prefix version.
* gimple-fold.cc (gimple_fold_builtin_omp_is_initial_device): New.
(gimple_fold_builtin): Call it.
* omp-builtins.def (BUILT_IN_OMP_IS_INITIAL_DEVICE): Define.
* tree.cc (get_file_function_name): Support names for on-target
constructor/destructor functions.

gcc/cp/
* decl2.cc (tree-inline.h): Include.
(static_init_fini_fns): Bump to four entries. Update comment.
(start_objects, start_partial_init_fini_fn): Add 'omp_target'
parameter. Support "declare target" decls. Update forward declaration.
(emit_partial_init_fini_fn): Add 'host_fn' parameter. Return tree for
the created function. Support "declare target".
(OMP_SSDF_IDENTIFIER): New macro.
(partition_vars_for_init_fini): Support partitioning "declare target"
variables also.
(generate_ctor_or_dtor_function): Add 'omp_target' parameter. Support
"declare target" decls.
(c_parse_final_cleanups): Support constructors/destructors on OpenMP
offload targets.

gcc/fortran/ChangeLog:

* gfortran.h (gfc_option_t): Add disable_omp_is_initial_device.
* lang.opt (fbuiltin-): Add.
* options.cc (gfc_handle_option): Handle
-fno-builtin-omp_is_initial_device.
* f95-lang.cc (gfc_init_builtin_functions): Handle
DEF_GOMP_BUILTIN_COMPILER.
* trans-decl.cc (gfc_get_extern_function_decl): Add code to use
DEF_GOMP_BUILTIN_COMPILER for 'omp_is_initial_device'.

libgomp/ChangeLog:

* testsuite/libgomp.c++/static-aggr-constructor-destructor-1.C: New test.
* testsuite/libgomp.c++/static-aggr-constructor-destructor-2.C: New test.
* testsuite/libgomp.c++/static-aggr-constructor-destructor-3.C: New test.
* testsuite/libgomp.c-c++-common/target-is-initial-host.c: New test.
* testsuite/libgomp.c-c++-common/target-is-initial-host-2.c: New test.
* testsuite/libgomp.fortran/target-is-initial-host.f: New test.
* testsuite/libgomp.fortran/target-is-initial-host.f90: New test.
* testsuite/libgomp.fortran/target-is-initial-host-2.f90: New test.

Co-authored-by: Tobias Burnus <tobias@baylibre.com>
(cherry picked from commit f1bfba3a9b3f31e3e06bfd1911c9f223869ea03f)

Revert 'OpenMP: Constructors and destructors for "declare target" static aggregates'

... in preparation of cherry-picking the upstream trunk changes.

This reverts commit 9127b4db9c58f48e0b47816d64e22c21404136b3.

gcc/
Revert:
2023-05-12  Julian Brown  <julian@codesourcery.com>

* omp-builtins.def (BUILT_IN_OMP_IS_INITIAL_DEVICE): New builtin.
* tree.cc (get_file_function_name): Support names for on-target
constructor/destructor functions.
gcc/cp/
Revert:
2023-05-12  Julian Brown  <julian@codesourcery.com>

* decl2.cc (tree-inline.h): Include.
(static_init_fini_fns): Bump to four entries. Update comment.
(start_objects, start_partial_init_fini_fn): Add 'omp_target'
parameter. Support "declare target" decls. Update forward declaration.
(emit_partial_init_fini_fn): Add 'host_fn' parameter. Return tree for
the created function. Support "declare target".
(OMP_SSDF_IDENTIFIER): New macro.
(partition_vars_for_init_fini): Support partitioning "declare target"
variables also.
(generate_ctor_or_dtor_function): Add 'omp_target' parameter. Support
"declare target" decls.
(c_parse_final_cleanups): Support constructors/destructors on OpenMP
offload targets.
libgomp/
Revert:
2023-05-12  Julian Brown  <julian@codesourcery.com>

* testsuite/libgomp.c++/static-aggr-constructor-destructor-1.C: New
test.
* testsuite/libgomp.c++/static-aggr-constructor-destructor-2.C: New
test.
* testsuite/libgomp.c++/static-aggr-constructor-destructor-3.C: New
test.

libgomp/testsuite: Fix hip_header_nvidia check, add workaround to test

This is all about using the AMD's HIP header files with
__HIP_PLATFORM_NVIDIA__ defined, i.e. HIP with Nvidia/CUDA; in that case,
HIP is a thin layer on top of CUDA.

First, the check_effective_target_gomp_hip_header_nvidia check failed;
to fix it, -Wno-deprecated-declarations was added - and likewise to the
two affected testcases that actually used the HIP headers on Nvidia.

Doing so, the HIP tested was successful but the HIP-BLAS one showed two
issues:

* One seems to be related to include search paths as the HIP header uses
  #include "library_types.h" to include that CUDA header. Seemingly, it
  tried to included (again) the HIP header hip/library_types.h, not the
  CUDA one. I guess, some tweaking of -isystem vs. -I could have
  prevented this, but the simpler workaround was to just explicitly
  include the CUDA one before the HIP header files.

* Once done, everything compiled but linking failed as the association
  between three HIP-BLAS functions and their CUDA-BLAS ones did not
  work. Solution: Just add three #define for mapping them.

libgomp/ChangeLog:

* testsuite/lib/libgomp.exp
(check_effective_target_gomp_hip_header_nvidia): Compile with
"-Wno-deprecated-declarations".
* testsuite/libgomp.c/interop-hip-nvidia-full.c: Likewise.
* testsuite/libgomp.c/interop-hipblas-nvidia-full.c: Likewise.
* testsuite/libgomp.c/interop-hipblas.h: Add workarounds
when using the HIP headers with __HIP_PLATFORM_NVIDIA__.

(cherry picked from commit 8ef0518bce489c4c0c252a0e0c44193c5f7cf777)

libgomp: Add additional OpenMP interop runtime tests

Add checks for nowait/depend and for checks that the returned
CUDA, CUDA_DRIVER and HIP interop objects actually work.

While the CUDA/CUDA_DRIVER ones are only for Nvidia GPUs, HIP
works on both AMD and Nvidia GPUs; on Nvidia GPUs, it is a
very thin wrapper around CUDA.

For Fortran, only a HIP test has been added - using hipfort.

While libgomp.c-c++-common/interop-2.c always works - even without
GPU - and checks for depend / nowait, all others require that
runtime libraries are found at link (and execution) time:
For Nvidia GPUs, libcuda + libcudart or libcublas,
For AMD GPUs, libamdhip64 or libhipblas.

The header files and hipfort modules do not need to be present as a
fallback has been implemented, but if they are, they get used.

Due to the combinations, the basic 1x C/C++, 4x C and 1x Fortran tests
yield 1x C/C++, 14x C and 4 Fortran run-test files.

libgomp/ChangeLog:

* testsuite/lib/libgomp.exp (check_effective_target_openacc_cublas,
check_effective_target_openacc_cudart): Update description as
the check requires more.
(check_effective_target_openacc_libcuda,
check_effective_target_openacc_libcublas,
check_effective_target_openacc_libcudart,
check_effective_target_gomp_hip_header_amd,
check_effective_target_gomp_hip_header_nvidia,
check_effective_target_gomp_hipfort_module,
check_effective_target_gomp_libamdhip64,
check_effective_target_gomp_libhipblas): New.
* testsuite/libgomp.c-c++-common/interop-2.c: New test.
* testsuite/libgomp.c/interop-cublas-full.c: New test.
* testsuite/libgomp.c/interop-cublas-libonly.c: New test.
* testsuite/libgomp.c/interop-cuda-full.c: New test.
* testsuite/libgomp.c/interop-cuda-libonly.c: New test.
* testsuite/libgomp.c/interop-hip-amd-full.c: New test.
* testsuite/libgomp.c/interop-hip-amd-no-hip-header.c: New test.
* testsuite/libgomp.c/interop-hip-nvidia-full.c: New test.
* testsuite/libgomp.c/interop-hip-nvidia-no-headers.c: New test.
* testsuite/libgomp.c/interop-hip-nvidia-no-hip-header.c: New test.
* testsuite/libgomp.c/interop-hip.h: New test.
* testsuite/libgomp.c/interop-hipblas-amd-full.c: New test.
* testsuite/libgomp.c/interop-hipblas-amd-no-hip-header.c: New test.
* testsuite/libgomp.c/interop-hipblas-nvidia-full.c: New test.
* testsuite/libgomp.c/interop-hipblas-nvidia-no-headers.c: New test.
* testsuite/libgomp.c/interop-hipblas-nvidia-no-hip-header.c: New test.
* testsuite/libgomp.c/interop-hipblas.h: New test.
* testsuite/libgomp.fortran/interop-hip-amd-full.F90: New test.
* testsuite/libgomp.fortran/interop-hip-amd-no-module.F90: New test.
* testsuite/libgomp.fortran/interop-hip-nvidia-full.F90: New test.
* testsuite/libgomp.fortran/interop-hip-nvidia-no-module.F90: New test.
* testsuite/libgomp.fortran/interop-hip.h: New test.

(cherry picked from commit 515d9be7944e89f5ec4363f9816ad4031ab6394b)

openmp, fortran: Add support for non-constant iterator bounds in Fortran deep-mapping iterator support

gcc/fortran/

* trans-openmp.cc (gfc_omp_deep_mapping_map): Add new argument for
vector of newly created iterators.  Push new iterators onto the
vector.
(gfc_omp_deep_mapping_comps): Add new argument for vector of new
iterators.  Pass argument in calls to gfc_omp_deep_mapping_item and
gfc_omp_deep_mapping_comps.
(gfc_omp_deep_mapping_item): Add new argument for vector of new
iterators.  Pass argument in calls to gfc_omp_deep_mapping_map and
gfc_omp_deep_mapping_comps.
(gfc_omp_deep_mapping_do): Add new argument for vector of new
iterators.  Pass argument in calls to gfc_omp_deep_mapping_item.
(gfc_omp_deep_mapping_cnt): Pass NULL to new argument for
gfc_omp_deep_mapping_do.
(gfc_omp_deep_mapping): Add new argument for vector of new
iterators.  Pass argument in calls to gfc_omp_deep_mapping_do.
* trans.h (gfc_omp_deep_mapping): Add new argument.

gcc/

* langhooks-def.h (lhd_omp_deep_mapping): Add new argument.
* langhooks.cc (lhd_omp_deep_mapping): Likewise.
* langhooks.h (omp_deep_mapping): Likewise.
* omp-low.cc (allocate_omp_iterator_elems): Work on the supplied
iterator set instead of the iterators in a supplied set of clauses.
(free_omp_iterator_elems): Likewise.
(lower_omp_target): Maintain vector of new iterators generated by
deep-mapping.  Allocate and free iterator element arrays using
iterators found in clauses and in the new iterator vector.

libgomp/

* testsuite/libgomp.fortran/allocatable-comp-iterators.f90: Add test
for non-const iterator boundaries.

openmp, fortran: Add iterator support for Fortran deep-mapping of allocatables

gcc/fortran/

* trans-openmp.cc (gfc_omp_deep_mapping_map): Remove const from ctx
argument.  Add arguments for iterators and the statement sequence to
go into the iterator loop.  Add statement sequence to iterator loop
body.  Generate iterator loop entries for generated maps, insert
the map decls and sizes into the iterator element arrays, replace
original decls with the address of the element arrays, and
sizes/biases with SIZE_INT.
(gfc_omp_deep_mapping_comps): Remove const from ctx. Add argument for
iterators.  Pass iterators to calls to gfc_omp_deep_mapping_item and
gfc_omp_deep_mapping_comps.
(gfc_omp_deep_mapping_item): Remove const from ctx. Add argument for
iterators.  Collect generated side-effect statements and pass to
gfc_omp_deep_mapping_map along with the iterators.  Pass iterators
to gfc_omp_deep_mapping_comps.
(gfc_omp_deep_mapping_do): Remove const from ctx.  Pass iterators to
gfc_omp_deep_mapping_item.
(gfc_omp_deep_mapping_cnt): Remove const from ctx.
(gfc_omp_deep_mapping): Likewise.
* trans.h (gfc_omp_deep_mapping_cnt): Likewise.
(gfc_omp_deep_mapping): Likewise.

gcc/

* gimplify.cc (enter_omp_iterator_loop_context): New function variant.
(enter_omp_iterator_loop_context): Delegate to new variant.
(exit_omp_iterator_loop_context): New function variant.
(exit_omp_iterator_loop_context): Delegate to new variant.
(assign_to_iterator_elems_array): New.
(add_new_omp_iterators_entry): New.
(add_new_omp_iterators_clause): Delegate to
add_new_omp_iterators_entry.
* gimplify.h (enter_omp_iterator_loop_context): New prototype.
(enter_omp_iterator_loop_context): Remove default argument.
(exit_omp_iterator_loop_context): Remove argument.
(assign_to_iterator_elems_array): New prototype.
(add_new_omp_iterators_entry): New prototype.
(add_new_omp_iterators_clause): New prototype.
* langhooks-def.h (lhd_omp_deep_mapping_cnt): Remove const from
argument.
(lhd_omp_deep_mapping): Likewise.
* langhooks.h (omp_deep_mapping_cnt): Likewise.
(omp_deep_mapping): Likewise.
* omp-low.cc (lower_omp_map_iterator_expr): Delegate to
assign_to_iterator_elems_array.
(lower_omp_map_iterator_size): Likewise.
(lower_omp_target): Remove sorry for deep mapping.

libgomp/

* testsuite/libgomp.fortran/allocatable-comp-iterators.f90: New.

openmp, Fortran: Add support using iterators with custom mappers in Fortran

gcc/fortran/

* openmp.cc (gfc_omp_instantiate_mapper): Add argument for namespace.
Apply namespace to new clauses. Propagate namespace to nested
mappers.
(gfc_omp_instantiate_mappers): Pass namespace of clause to clauses
generated by mappers.

libgomp/

* testsuite/libgomp.fortran/mapper-iterators-1.f90: New test.
* testsuite/libgomp.fortran/mapper-iterators-2.f90: New test.
* testsuite/libgomp.fortran/mapper-iterators-3.f90: New test.
* testsuite/libgomp.fortran/mapper-iterators-4.f90: New test.

Co-authored-by: Andrew Stubbs <ams@baylibre.com>

openmp: Add support for using custom mappers with iterators (C, C++)

gcc/c-family/

* c-omp.cc (omp_instantiate_mapper): Apply iterator to new clauses
generated from mapper.

gcc/c/

* c-parser.cc (c_parser_omp_clause_map): Apply iterator to push and
pop mapper clauses.

gcc/cp/

* parser.cc (cp_parser_omp_clause_map): Apply iterator to push and
pop mapper clauses.
* semantics.cc (cxx_omp_map_array_section): Allow array types for
base type of array sections.

libgomp/

* testsuite/libgomp.c-c++-common/mapper-iterators-1.c: New test.
* testsuite/libgomp.c-c++-common/mapper-iterators-2.c: New test.
* testsuite/libgomp.c-c++-common/mapper-iterators-3.c: New test.

Co-authored-by: Andrew Stubbs <ams@baylibre.com>

openmp: Fix struct handling for OpenMP iterators

New clauses can be created for structs, and these will also need to have
iterators applied to them if the base clause is using iterators. As this
occurs after the initial iterator expansion, a new mechanism for allocating
new entries in the iterator loop is required.

gcc/

* gimplify.cc (add_new_omp_iterators_clause): New.
(build_omp_struct_comp_nodes): Add extra argument for loops sequence.
Call add_new_omp_iterators_clause on newly generated clauses.
(omp_accumulate_sibling_list): Add extra argument for loops sequence.
Pass to calls to build_omp_struct_comp_nodes. Add iterators to newly
generator clauses for struct accesses.
(omp_build_struct_sibling_lists): Add extra argument for loops
sequence. Pass to call to omp_accumulate_sibling_list.
(gimplify_adjust_omp_clauses): Pass loops sequence to
omp_build_struct_sibling_lists.

openmp: Add macros for iterator element access

gcc/c/

* c-parser.cc (c_parser_omp_iterators): Use macros for accessing
iterator elements.
(c_parser_omp_clause_affinity): Likewise.
(c_parser_omp_clause_depend): Likewise.
(c_parser_omp_clause_map): Likewise.
(c_parser_omp_clause_from_to): Likewise.
* c-typeck.cc (c_omp_finish_iterators): Likewise.

gcc/cp/

* parser.cc (cp_parser_omp_iterators): Use macros for accessing
iterator elements.
(cp_parser_omp_clause_affinity): Likewise.
(cp_parser_omp_clause_depend): Likewise.
(cp_parser_omp_clause_from_to): Likewise.
(cp_parser_omp_clause_map): Likewise.
* semantics.cc (cp_omp_finish_iterators): Likewise.

gcc/fortran/

* trans-openmp.cc (gfc_trans_omp_array_section): Use macros for
accessing iterator elements.
(handle_iterator): Likewise.
(gfc_trans_omp_clauses): Likewise.

gcc/

* gimplify.cc (gimplify_omp_affinity): Use macros for accessing
iterator elements.
(compute_omp_iterator_count): Likewise.
(build_omp_iterator_loop): Likewise.
(remove_unused_omp_iterator_vars): Likewise.
(build_omp_iterators_loops): Likewise.
(enter_omp_iterator_loop_context_1): Likewise.
(extract_base_bit_offset): Likewise.
* omp-low.cc (lower_omp_map_iterator_expr): Likewise.
(lower_omp_map_iterator_size): Likewise.
(allocate_omp_iterator_elems): Likewise.
(free_omp_iterator_elems): Likewise.
* tree-inline.cc (copy_tree_body_r): Likewise.
* tree-pretty-print.cc (dump_omp_iterators): Likewise.
* tree.h (OMP_ITERATORS_VAR, OMP_ITERATORS_BEGIN, OMP_ITERATORS_END,
OMP_ITERATORS_STEP, OMP_ITERATORS_ORIG_STEP, OMP_ITERATORS_BLOCK,
OMP_ITERATORS_LABEL, OMP_ITERATORS_INDEX, OMP_ITERATORS_ELEMS,
OMP_ITERATORS_COUNT, OMP_ITERATORS_EXPANDED_P): New macros.

openmp: Add support for non-constant iterator parameters in map, to and from clauses

This patch enables support for using non-constant expressions when specifying
iterators in the map clause of target constructs and to/from clauses of
target update constructs.

gcc/

* gimplify.cc (omp_iterator_elems_length): New.
(build_omp_iterators_loops): Change type of elements
array to pointer of pointers if array length is non-constant, and
assign size with indirect reference.  Reorder elements added to
iterator vector and add element containing the iteration count.  Use
omp_iterator_elems_length to compute element array size required.
* gimplify.h (omp_iterator_elems_length): New prototype.
* omp-low.cc (lower_omp_map_iterator_expr): Reorder elements read
from iterator vector.  If elements field is a pointer type, assign
using pointer arithmetic followed by indirect reference, and return
the field directly.
(lower_omp_map_iterator_size): Reorder elements read from iterator
vector.  If elements field is a pointer type, assign using pointer
arithmetic followed by indirect reference.
(allocate_omp_iterator_elems): New.
(free_omp_iterator_elems): New.
(lower_omp_target): Call allocate_omp_iterator_elems before inserting
loops sequence, and call free_omp_iterator_elems afterwards.
* tree-pretty-print.cc (dump_omp_iterators): Print extra elements in
iterator vector.

gcc/testsuite/

* c-c++-common/gomp/target-map-iterators-3.c: Update expected Gimple
output.
* c-c++-common/gomp/target-map-iterators-5.c: New.
* c-c++-common/gomp/target-update-iterators-3.c: Update expected
Gimple output.
* gfortran.dg/gomp/target-map-iterators-3.f90: Likewise.
* gfortran.dg/gomp/target-map-iterators-5.f90: New.
* gfortran.dg/gomp/target-update-iterators-3.f90: Update expected
Gimple output.

libgomp/

* testsuite/libgomp.c-c++-common/target-map-iterators-4.c: New.
* testsuite/libgomp.c-c++-common/target-map-iterators-5.c: New.
* testsuite/libgomp.c-c++-common/target-update-iterators-4.c: New.
* testsuite/libgomp.fortran/target-map-iterators-4.f90: New.
* testsuite/libgomp.fortran/target-map-iterators-5.f90: New.
* testsuite/libgomp.fortran/target-update-iterators-4.f90: New.

openmp: Disable strided target updates when iterators are used

Non-contiguous target updates result in the new strided target updates code
being used, resulting in new clauses such as GOMP_MAP_GRID_DIM,
GOMP_MAP_GRID_STRIDE etc. These are not currently supported in conjunction
with iterators, so this code-path is disabled when used together with
iterators.

The older target updates supports non-contiguous updates as long as a stride
is not applied.

gcc/c/

* c-typeck.cc (handle_omp_array_sections): Add extra argument.  Set
argument to true if array section has a stride that is not one.
(c_finish_omp_clauses): Disable strided updates when iterators are
used in the clause.  Emit sorry if strided.

gcc/cp/

* semantics.cc (handle_omp_array_sections): Add extra argument.  Set
argument to true if array section has a stride that is not one.
(finish_omp_clauses): Disable strided updates when iterators are
used in the clause.  Emit sorry if strided.

gcc/fortran/

* trans-openmp.cc (gfc_trans_omp_clauses): Disable strided updates
when iterators are used in the clause.

openmp, fortran: Add support for iterators in OpenMP 'target update' constructs (Fortran)

This adds Fortran support for iterators in 'to' and 'from' clauses in the
'target update' OpenMP directive.

gcc/fortran/

* dump-parse-tree.cc (show_omp_namelist): Add iterator support for
OMP_LIST_TO and OMP_LIST_FROM.
* match.cc (gfc_free_namelist): Free namespace for OMP_LIST_TO and
OMP_LIST_FROM.
* openmp.cc (gfc_free_omp_clauses): Free namespace for OMP_LIST_TO
and OMP_LIST_FROM.
(gfc_match_motion_var_list): Parse 'iterator' modifier.
(resolve_omp_clauses): Resolve iterators for OMP_LIST_TO and
OMP_LIST_FROM.
* trans-openmp.cc (gfc_trans_omp_clauses): Handle iterators in
OMP_LIST_TO and OMP_LIST_FROM clauses. Add expressions to
iter_block rather than block.

gcc/testsuite/

* gfortran.dg/gomp/target-update-iterators-1.f90: New.
* gfortran.dg/gomp/target-update-iterators-2.f90: New.
* gfortran.dg/gomp/target-update-iterators-3.f90: New.

libgomp/

* testsuite/libgomp.fortran/target-update-iterators-1.f90: New.
* testsuite/libgomp.fortran/target-update-iterators-2.f90: New.
* testsuite/libgomp.fortran/target-update-iterators-3.f90: New.

Co-authored-by: Andrew Stubbs <ams@baylibre.com>

openmp, fortran: Add support for map iterators in OpenMP target construct (Fortran)

This adds support for iterators in map clauses within OpenMP
'target' constructs in Fortran.

Some special handling for struct field maps has been added to libgomp in
order to handle arrays of derived types.

gcc/fortran/

* dump-parse-tree.cc (show_omp_namelist): Add iterator support for
OMP_LIST_MAP.
* match.cc (gfc_free_namelist): Free namespace for OMP_LIST_MAP.
* openmp.cc (gfc_free_omp_clauses): Free namespace in namelist for
OMP_LIST_MAP.
(gfc_match_omp_clauses): Parse 'iterator' modifier for 'map' clause.
(resolve_omp_clauses): Resolve iterators for OMP_LIST_MAP.
* trans-openmp.cc: Include tree-ssa-loop-niter.h.
(gfc_trans_omp_array_section): Add iterator argument.  Replace
instances of iterator variables with the initial value when
computing biases.
* trans-openmp.cc (gfc_trans_omp_clauses): Handle iterators in
OMP_LIST_MAP clauses.  Add expressions to iter_block rather than
block.  Do not apply iterators to firstprivate maps.  Pass iterator
to gfc_trans_omp_array_section.

gcc/

* gimplify.cc (compute_omp_iterator_count): Account for difference
in loop boundaries in Fortran.
(build_omp_iterator_loop): Change upper boundary condition for
Fortran.  Insert block statements into innermost loop.
(remove_unused_omp_iterator_vars): Copy block subblocks of old
iterator to new iterator and remove original.
(contains_vars_1): New.
(contains_vars): New.
(extract_base_bit_offset): Add iterator argument.  Remove iterator
variables from base.  Do not set variable_offset if the offset
does not contain any remaining variables.
(omp_accumulate_sibling_list): Add iterator argument to
extract_base_bit_offset.
* tree-pretty-print.cc (dump_block_node): Ignore BLOCK_SUBBLOCKS
containing iterator block statements.

gcc/testsuite/

* gfortran.dg/gomp/target-map-iterators-1.f90: New.
* gfortran.dg/gomp/target-map-iterators-2.f90: New.
* gfortran.dg/gomp/target-map-iterators-3.f90: New.
* gfortran.dg/gomp/target-map-iterators-4.f90: New.

libgomp/

* target.c (kind_to_name): Handle GOMP_MAP_STRUCT and
GOMP_MAP_STRUCT_UNORD.
(gomp_add_map): New.
(gomp_merge_iterator_maps): Expand fields of a struct mapping
breadth-first.
* testsuite/libgomp.fortran/target-map-iterators-1.f90: New.
* testsuite/libgomp.fortran/target-map-iterators-2.f90: New.
* testsuite/libgomp.fortran/target-map-iterators-3.f90: New.

Co-authored-by: Andrew Stubbs <ams@baylibre.com>

openmp, fortran: Revert to using tree expressions when translating Fortran OpenMP array sections

In the patch 'OpenACC 2.7: Implement reductions for arrays and records',
temporaries are used to hold the decl and bias of clauses resulting from array
sections, which breaks some assumptions made for map iterator support.

This patch reverts the change for OpenMP only.

gcc/fortran/

* trans-openmp.cc (gfc_trans_omp_array_section): Use temporaries only
when translating OpenACC.

gcc/testsuite/

* gfortran.dg/gomp/target-enter-exit-data.f90: Revert expected tree
dumps.

openmp, fortran: Move udm field of gfc_omp_namelist into a new union

This patch moves u2.udm into u3.udm.

This is necessary to avoid clashes when mappers are used together with
iterators, which uses u2.ns.

gcc/fortran/

* gfortran.h (struct gfc_omp_namelist): Move udm field into a new
union u3.
* match.cc (gfc_free_omp_namelist): Change references to u2.udm to
u3.udm.
* module.cc (load_omp_udms): Likewise.
(write_omp_udm): Likewise.
* openmp.cc (gfc_match_motion_var_list): Likewise.
(gfc_match_omp_clauses): Likewise.
(resolve_omp_clauses): Likewise.
(gfc_omp_instantiate_mapper): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Likewise.
(gfc_find_nested_mappers): Likewise.

openmp: Add support for iterators in 'target update' clauses (C/C++)

This adds support for iterators in 'to' and 'from' clauses in the
'target update' OpenMP directive.

gcc/c/

* c-parser.cc (c_parser_omp_clause_from_to): Parse 'iterator' modifier.
* c-typeck.cc (c_finish_omp_clauses): Finish iterators for to/from
clauses.

gcc/cp/

* parser.cc (cp_parser_omp_clause_from_to): Parse 'iterator' modifier.
* semantics.cc (finish_omp_clauses): Finish iterators for to/from
clauses.

gcc/

* gimplify.cc (gimplify_scan_omp_clauses): Add argument for iterator
loop sequence.   Gimplify the clause decl and size into the iterator
loop if iterators are used.
(gimplify_omp_workshare): Add argument for iterator loops sequence
in call to gimplify_scan_omp_clauses.
(gimplify_omp_target_update): Call remove_unused_omp_iterator_vars and
build_omp_iterators_loops.  Add loop sequence as argument when calling
gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses and building
the Gimple statement.
* tree-pretty-print.cc (dump_omp_clause): Call dump_omp_iterators
for to/from clauses with iterators.
* tree.cc (omp_clause_num_ops): Add extra operand for OMP_CLAUSE_FROM
and OMP_CLAUSE_TO.
* tree.h (OMP_CLAUSE_HAS_ITERATORS): Add check for OMP_CLAUSE_TO and
OMP_CLAUSE_FROM.
(OMP_CLAUSE_ITERATORS): Likewise.

gcc/testsuite/

* c-c++-common/gomp/target-update-iterators-1.c: New.
* c-c++-common/gomp/target-update-iterators-2.c: New.
* c-c++-common/gomp/target-update-iterators-3.c: New.

libgomp/

* target.c (gomp_update): Call gomp_merge_iterator_maps.  Free
allocated variables.
* testsuite/libgomp.c-c++-common/target-update-iterators-1.c: New.
* testsuite/libgomp.c-c++-common/target-update-iterators-2.c: New.
* testsuite/libgomp.c-c++-common/target-update-iterators-3.c: New.

openmp: Add support for iterators in map clauses (C/C++)

This adds preliminary support for iterators in map clauses within OpenMP
'target' constructs (which includes constructs such as 'target enter data').

Iterators with non-constant loop bounds are not currently supported.

gcc/c/

* c-parser.cc (c_parser_omp_clause_map): Parse 'iterator' modifier.
* c-typeck.cc (c_finish_omp_clauses): Finish iterators.  Apply
iterators to generated clauses.

gcc/cp/

* parser.cc (cp_parser_omp_clause_map): Parse 'iterator' modifier.
* semantics.cc (finish_omp_clauses): Finish iterators.  Apply
iterators to generated clauses.

gcc/

* gimple-pretty-print.cc (dump_gimple_omp_target): Print expanded
iterator loops.
* gimple.cc (gimple_build_omp_target): Add argument for iterator
loops sequence.  Initialize iterator loops field.
* gimple.def (GIMPLE_OMP_TARGET): Set GSS symbol to GSS_OMP_TARGET.
* gimple.h (gomp_target): Set GSS symbol to GSS_OMP_TARGET.  Add extra
field for iterator loops.
(gimple_build_omp_target): Add argument for iterator loops sequence.
(gimple_omp_target_iterator_loops): New.
(gimple_omp_target_iterator_loops_ptr): New.
(gimple_omp_target_set_iterator_loops): New.
* gimplify.cc (find_var_decl): New.
(copy_omp_iterator): New.
(remap_omp_iterator_var_1): New.
(remap_omp_iterator_var): New.
(remove_unused_omp_iterator_vars): New.
(struct iterator_loop_info_t): New type.
(iterator_loop_info_map_t): New type.
(build_omp_iterators_loops): New.
(enter_omp_iterator_loop_context_1): New.
(enter_omp_iterator_loop_context): New.
(enter_omp_iterator_loop_context): New.
(exit_omp_iterator_loop_context): New.
(gimplify_adjust_omp_clauses): Add argument for iterator loop
sequence.  Gimplify the clause decl and size into the iterator
loop if iterators are used.
(gimplify_omp_workshare): Call remove_unused_omp_iterator_vars and
build_omp_iterators_loops for OpenMP target expressions.  Add
loop sequence as argument when calling gimplify_adjust_omp_clauses
and building the Gimple statement.
* gimplify.h (enter_omp_iterator_loop_context): New prototype.
(exit_omp_iterator_loop_context): New prototype.
* gsstruct.def (GSS_OMP_TARGET): New.
* omp-low.cc (lower_omp_map_iterator_expr): New.
(lower_omp_map_iterator_size): New.
(finish_omp_map_iterators): New.
(lower_omp_target): Add sorry if iterators used with deep mapping.
Call lower_omp_map_iterator_expr before assigning to sender ref.
Call lower_omp_map_iterator_size before setting the size.  Insert
iterator loop sequence before the statements for the target clause.
* tree-nested.cc (convert_nonlocal_reference_stmt): Walk the iterator
loop sequence of OpenMP target statements.
(convert_local_reference_stmt): Likewise.
(convert_tramp_reference_stmt): Likewise.
* tree-pretty-print.cc (dump_omp_iterators): Dump extra iterator
information if present.
(dump_omp_clause): Call dump_omp_iterators for iterators in map
clauses.
* tree.cc (omp_clause_num_ops): Add operand for OMP_CLAUSE_MAP.
(walk_tree_1): Do not walk last operand of OMP_CLAUSE_MAP.
* tree.h (OMP_CLAUSE_HAS_ITERATORS): New.
(OMP_CLAUSE_ITERATORS): New.

gcc/testsuite/

* c-c++-common/gomp/map-6.c (foo): Amend expected error message.
* c-c++-common/gomp/target-map-iterators-1.c: New.
* c-c++-common/gomp/target-map-iterators-2.c: New.
* c-c++-common/gomp/target-map-iterators-3.c: New.
* c-c++-common/gomp/target-map-iterators-4.c: New.

libgomp/

* target.c (kind_to_name): New.
(gomp_merge_iterator_maps): New.
(gomp_map_vars_internal): Call gomp_merge_iterator_maps.  Copy
address of only the first iteration to target vars.  Free allocated
variables.
* testsuite/libgomp.c-c++-common/target-map-iterators-1.c: New.
* testsuite/libgomp.c-c++-common/target-map-iterators-2.c: New.
* testsuite/libgomp.c-c++-common/target-map-iterators-3.c: New.

Co-authored-by: Andrew Stubbs <ams@baylibre.com>

openmp: Refactor handling of iterators

Move code to calculate the iteration size and to generate the iterator
expansion loop into separate functions.

Use OMP_ITERATOR_DECL_P to check for iterators in clause declarations.

gcc/c-family/

* c-omp.cc (c_finish_omp_depobj): Use OMP_ITERATOR_DECL_P.

gcc/c/

* c-typeck.cc (handle_omp_array_sections): Use OMP_ITERATOR_DECL_P.
(c_finish_omp_clauses): Likewise.

gcc/cp/

* pt.cc (tsubst_omp_clause_decl): Use OMP_ITERATOR_DECL_P.
* semantics.cc (handle_omp_array_sections): Likewise.
(finish_omp_clauses): Likewise.

gcc/

* gimplify.cc (gimplify_omp_affinity): Use OMP_ITERATOR_DECL_P.
(compute_omp_iterator_count): New.
(build_omp_iterator_loop): New.
(gimplify_omp_depend): Use OMP_ITERATOR_DECL_P,
compute_omp_iterator_count and build_omp_iterator_loop.
* tree-inline.cc (copy_tree_body_r): Use OMP_ITERATOR_DECL_P.
* tree-pretty-print.cc (dump_omp_clause): Likewise.
* tree.h (OMP_ITERATOR_DECL_P): New macro.

OpenACC: Improve implicit mapping for non-lexically nested offload regions: Adjust cherry-picked test cases

Adjust cherry-picked test cases per
OG14 commit b918a7e4b4bdf070bfa9ede48ef9d22f89ff7795
"OpenACC: Improve implicit mapping for non-lexically nested offload regions"
(in combination with
OG14 commit 5fb2987d33c7296543fa7b8dbeab597fc552b110
"Clarify 'OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P' in 'gcc/tree-pretty-print.cc:dump_omp_clause'"
(cherry picked from trunk commit d6e66e7b3a40315ad303344e19bccb4006c51cac)).

libgomp/
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-3.C: Adjust.
* testsuite/libgomp.oacc-c++/exceptions-throw-3.C: Likewise.
* testsuite/libgomp.oacc-c++/pr119692-1-1.C: Likewise.
* testsuite/libgomp.oacc-c++/pr119692-1-2.C: Likewise.
* testsuite/libgomp.oacc-c++/pr119692-1-3.C: Likewise.

Remove 'ALWAYS_INLINE' workaround in 'libgomp.c++/target-exceptions-pr118794-1.C'

With commit ca9cffe737d20953082333dacebb65d4261e0d0c
"For nvptx offloading, make sure to emit C++ constructor, destructor aliases [PR97106]",
we're able to remove the 'ALWAYS_INLINE' workaround added in
commit fe283dba774be57b705a7a871b000d2894d2e553
"GCN, nvptx: Support '-mfake-exceptions', and use it for offloading compilation [PR118794]".

libgomp/
* testsuite/libgomp.c++/target-exceptions-pr118794-1.C: Remove
'ALWAYS_INLINE' workaround.

(cherry picked from commit 518efed8cb7d003cd85477060b1fe926a2d7a53b)

Add 'libgomp.c++/pr106445-1{,-O0}.C' [PR106445]

PR target/106445
libgomp/
* testsuite/libgomp.c++/pr106445-1.C: New.
* testsuite/libgomp.c++/pr106445-1-O0.C: Likewise.

(cherry picked from commit 0b2a2490bebd29acc4da18562eec7464601cbb05)

For nvptx offloading, make sure to emit C++ constructor, destructor aliases [PR97106]

PR target/97106
gcc/
* config/nvptx/nvptx.cc (nvptx_asm_output_def_from_decls)
[ACCEL_COMPILER]: Make sure to emit C++ constructor, destructor
aliases.
libgomp/
* testsuite/libgomp.c++/pr96390.C: Un-XFAIL nvptx offloading.
* testsuite/libgomp.c-c++-common/pr96390.c: Adjust.

(cherry picked from commit ca9cffe737d20953082333dacebb65d4261e0d0c)

GCN, nvptx: Support '-mfake-exceptions', and use it for offloading compilation [PR118794]

With '-mfake-exceptions' enabled, the user-visible behavior in presence of
exception handling constructs changes such that the compile-time
'sorry, unimplemented: exception handling not supported' is skipped, code
generation proceeds, and instead, exception handling constructs 'abort' at
run time. (..., or don't, if they're in dead code.)

PR target/118794
gcc/
* config/gcn/gcn.opt (-mfake-exceptions): Support.
* config/nvptx/nvptx.opt (-mfake-exceptions): Likewise.
* config/gcn/gcn.md (define_expand "exception_receiver"): Use it.
* config/nvptx/nvptx.md (define_expand "exception_receiver"):
Likewise.
* config/gcn/mkoffload.cc (main): Set it.
* config/nvptx/mkoffload.cc (main): Likewise.
* config/nvptx/nvptx.cc (nvptx_assemble_integer)
<in_section == exception_section>: Special handling for
'SYMBOL_REF's.
* except.cc (expand_dw2_landing_pad_for_region): Don't generate
bogus code for (default)
'#define EH_RETURN_DATA_REGNO(N) INVALID_REGNUM'.
libgcc/
* config/gcn/unwind-gcn.c (_Unwind_Resume): New.
* config/nvptx/unwind-nvptx.c (_Unwind_Resume): Likewise.
gcc/testsuite/
* g++.target/gcn/exceptions-bad_cast-2.C: Set
'-mno-fake-exceptions'.
* g++.target/gcn/exceptions-pr118794-1.C: Likewise.
* g++.target/gcn/exceptions-throw-2.C: Likewise.
* g++.target/nvptx/exceptions-bad_cast-2.C: Likewise.
* g++.target/nvptx/exceptions-pr118794-1.C: Likewise.
* g++.target/nvptx/exceptions-throw-2.C: Likewise.
* g++.target/gcn/exceptions-bad_cast-2_-mfake-exceptions.C: New.
* g++.target/gcn/exceptions-pr118794-1_-mfake-exceptions.C:
Likewise.
* g++.target/gcn/exceptions-throw-2_-mfake-exceptions.C: Likewise.
* g++.target/nvptx/exceptions-bad_cast-2_-mfake-exceptions.C:
Likewise.
* g++.target/nvptx/exceptions-pr118794-1_-mfake-exceptions.C:
Likewise.
* g++.target/nvptx/exceptions-throw-2_-mfake-exceptions.C:
Likewise.
libgomp/
* testsuite/libgomp.c++/target-exceptions-bad_cast-2-offload-sorry-GCN.C:
Set '-foffload-options=-mno-fake-exceptions'.
* testsuite/libgomp.c++/target-exceptions-bad_cast-2-offload-sorry-nvptx.C:
Likewise.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-GCN.C:
Likewise.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-nvptx.C:
Likewise.
* testsuite/libgomp.c++/target-exceptions-throw-2-offload-sorry-GCN.C:
Likewise.
* testsuite/libgomp.c++/target-exceptions-throw-2-offload-sorry-nvptx.C:
Likewise.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-2-offload-sorry-GCN.C:
Likewise.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-2-offload-sorry-nvptx.C:
Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-2-offload-sorry-GCN.C:
Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-2-offload-sorry-nvptx.C:
Likewise.
* testsuite/libgomp.c++/target-exceptions-bad_cast-2.C: Adjust.
* testsuite/libgomp.c++/target-exceptions-pr118794-1.C: Likewise.
* testsuite/libgomp.c++/target-exceptions-throw-2.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-2.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-2.C: Likewise.
* testsuite/libgomp.c++/target-exceptions-throw-2-O0.C: New.

(cherry picked from commit fe283dba774be57b705a7a871b000d2894d2e553)

Add 'throw', dead code test cases for GCN, nvptx target and OpenACC, OpenMP 'target' offloading

gcc/testsuite/
* g++.target/gcn/exceptions-throw-3.C: New.
* g++.target/nvptx/exceptions-throw-3.C: Likewise.
libgomp/
* testsuite/libgomp.c++/target-exceptions-throw-3.C: New.
* testsuite/libgomp.oacc-c++/exceptions-throw-3.C: Likewise.

(cherry picked from commit 6c0ea840265454b01d9ef4eb8850a4f015788788)

Add 'throw', caught test cases for GCN, nvptx target and OpenACC, OpenMP 'target' offloading

gcc/testsuite/
* g++.target/gcn/exceptions-throw-2.C: New.
* g++.target/nvptx/exceptions-throw-2.C: Likewise.
libgomp/
* testsuite/libgomp.c++/target-exceptions-throw-2.C: New.
* testsuite/libgomp.c++/target-exceptions-throw-2-offload-sorry-GCN.C: Likewise.
* testsuite/libgomp.c++/target-exceptions-throw-2-offload-sorry-nvptx.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-2.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-2-offload-sorry-GCN.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-2-offload-sorry-nvptx.C: Likewise.

(cherry picked from commit 1daf57049878a79b8bb40fb57ee2370f3cd4d4e7)

Add 'throw' test cases for GCN, nvptx target and OpenACC, OpenMP 'target' offloading

gcc/testsuite/
* g++.target/gcn/exceptions-throw-1.C: New.
* g++.target/nvptx/exceptions-throw-1.C: Likewise.
libgomp/
* testsuite/libgomp.c++/target-exceptions-throw-1.C: New.
* testsuite/libgomp.c++/target-exceptions-throw-1-O0.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-1.C: Likewise.

(cherry picked from commit 1362d9d49449bec12741ef158f4bf81a8b3ff2cd)

Add 'std::bad_cast' exception, dead code test cases for GCN, nvptx target and OpenACC, OpenMP 'target' offloading

gcc/testsuite/
* g++.target/gcn/exceptions-bad_cast-3.C: New.
* g++.target/nvptx/exceptions-bad_cast-3.C: Likewise.
libgomp/
* testsuite/libgomp.c++/target-exceptions-bad_cast-3.C: New.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-3.C: Likewise.

(cherry picked from commit 27f88cc79934b689a92e08342e25b2ee947a1bce)

Add 'std::bad_cast' exception, caught test cases for GCN, nvptx target and OpenACC, OpenMP 'target' offloading

gcc/testsuite/
* g++.target/gcn/exceptions-bad_cast-2.C: New.
* g++.target/nvptx/exceptions-bad_cast-2.C: Likewise.
libgomp/
* testsuite/libgomp.c++/target-exceptions-bad_cast-2.C: New.
* testsuite/libgomp.c++/target-exceptions-bad_cast-2-offload-sorry-GCN.C: Likewise.
* testsuite/libgomp.c++/target-exceptions-bad_cast-2-offload-sorry-nvptx.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-2.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-2-offload-sorry-GCN.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-2-offload-sorry-nvptx.C: Likewise.

(cherry picked from commit 00cde164eeb834958ae8f9eec0c7c9cc3ac697a1)

Add 'std::bad_cast' exception test cases for GCN, nvptx target and OpenACC, OpenMP 'target' offloading

gcc/testsuite/
* g++.target/gcn/exceptions-bad_cast-1.C: New.
* g++.target/nvptx/exceptions-bad_cast-1.C: Likewise.
libgomp/
* testsuite/libgomp.c++/target-exceptions-bad_cast-1.C: New.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-1.C: Likewise.

(cherry picked from commit 0e68f49db986d8d6c1eaaa96dd14f7b2a927a0d0)

Add test cases for exception handling constructs in dead code for GCN, nvptx target and OpenMP 'target' offloading [PR118794]

PR target/118794
gcc/testsuite/
* g++.target/gcn/exceptions-pr118794-1.C: New.
* g++.target/nvptx/exceptions-pr118794-1.C: Likewise.
libgomp/
* testsuite/libgomp.c++/target-exceptions-pr118794-1.C: New.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-GCN.C:
Likewise.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-nvptx.C:
Likewise.

(cherry picked from commit aa3e72f943032e5f074b2bd2fd06d130dda8760b)

Add PR119692 "C++ 'typeinfo', 'vtable' vs. OpenACC, OpenMP 'target' offloading" test cases [PR119692]

... documenting the status quo.

PR c++/119692
gcc/testsuite/
* g++.target/gcn/pr119692-1-1.C: New.
* g++.target/nvptx/pr119692-1-1.C: Likewise.
libgomp/
* testsuite/libgomp.c++/pr119692-1-1.C: New.
* testsuite/libgomp.c++/pr119692-1-2.C: Likewise.
* testsuite/libgomp.c++/pr119692-1-3.C: Likewise.
* testsuite/libgomp.c++/pr119692-1-4.C: Likewise.
* testsuite/libgomp.c++/pr119692-1-5.C: Likewise.
* testsuite/libgomp.oacc-c++/pr119692-1-1.C: Likewise.
* testsuite/libgomp.oacc-c++/pr119692-1-2.C: Likewise.
* testsuite/libgomp.oacc-c++/pr119692-1-3.C: Likewise.

(cherry picked from commit a304c88b6feb9ee580d3e389f08a6ed40dfe2292)

Add 'g++.target/gcn/gcn.exp' for GCN-specific C++ test cases

Like 'gcc.target/gcn/gcn.exp' is modeled after 'gcc.dg/dg.exp', this new
'g++.target/gcn/gcn.exp' is modeled after 'g++.dg/dg.exp'.

gcc/testsuite/
* g++.target/gcn/gcn.exp: New.

(cherry picked from commit 785448f0465cf9802f1ccfea798f5b732f73f62c)

GCN, nvptx: Define '_Unwind_RaiseException', '_Unwind_Resume_or_Rethrow'

This resolves GCN:

    ld: error: undefined symbol: _Unwind_RaiseException
    >>> referenced by eh_throw.cc:93 ([...]/source-gcc/libstdc++-v3/libsupc++/eh_throw.cc:93)
    >>>               eh_throw.o:(__cxa_throw) in archive /srv/data/tschwinge/amd-instinct2/gcc/build/submit-light-target_gcn/build-gcc/amdgcn-amdhsa/gfx908/libstdc++-v3/src/.libs/libstdc++.a
    [...]
    collect2: error: ld returned 1 exit status

..., and/or:

    ld: error: undefined symbol: _Unwind_Resume_or_Rethrow
    >>> referenced by eh_throw.cc:129 ([...]/source-gcc/libstdc++-v3/libsupc++/eh_throw.cc:129)
    >>>               eh_throw.o:(__cxa_rethrow) in archive /srv/data/tschwinge/amd-instinct2/gcc/build/submit-light-target_gcn/build-gcc/amdgcn-amdhsa/gfx908/libstdc++-v3/src/.libs/libstdc++.a
    [...]
    collect2: error: ld returned 1 exit status

..., and nvptx:

    unresolved symbol _Unwind_RaiseException
    collect2: error: ld returned 1 exit status

..., or:

    unresolved symbol _Unwind_Resume_or_Rethrow
    collect2: error: ld returned 1 exit status

For both GCN, nvptx, this each progresses ~25 'check-gcc-c++',
and ~10 'check-target-libstdc++-v3' test cases:

    [-FAIL:-]{+PASS:+} [...] (test for excess errors)

..., with (if applicable, for most of them):

    [-UNRESOLVED:-]{+PASS:+} [...] [-compilation failed to produce executable-]{+execution test+}

..., or some 'FAIL: [...] execution test' where these test cases now FAIL when
attempting to use these interfaces, or, if applicable, FAIL due to run-time
'GCC/nvptx: sorry, unimplemented: dynamic stack allocation not supported'.

libgcc/
* config/gcn/unwind-gcn.c (_Unwind_RaiseException)
(_Unwind_Resume_or_Rethrow): New.
* config/nvptx/unwind-nvptx.c (_Unwind_RaiseException)
(_Unwind_Resume_or_Rethrow): Likewise.

(cherry picked from commit 54ab0f6785c006ddff056dab9c0240cddae82020)

GCN, nvptx: Define '_Unwind_DeleteException'

This resolves GCN:

    ld: error: undefined symbol: _Unwind_DeleteException
    >>> referenced by eh_catch.cc:109 ([...]/source-gcc/libstdc++-v3/libsupc++/eh_catch.cc:109)
    >>>               eh_catch.o:(__cxa_end_catch) in archive [...]/build-gcc/amdgcn-amdhsa/libstdc++-v3/src/.libs/libstdc++.a
    [...]
    collect2: error: ld returned 1 exit status

..., and nvptx:

    unresolved symbol _Unwind_DeleteException
    collect2: error: ld returned 1 exit status

For both GCN, nvptx, this each progresses ~100 'check-gcc-c++',
and ~500 'check-target-libstdc++-v3' test cases:

    [-FAIL:-]{+PASS:+} [...] (test for excess errors)

..., with (if applicable, for most of them):

    [-UNRESOLVED:-]{+PASS:+} [...] [-compilation failed to produce executable-]{+execution test+}

..., or just a few 'FAIL: [...] execution test' where these test cases now
FAIL for unrelated reasons, or, if applicable, FAIL due to run-time
'GCC/nvptx: sorry, unimplemented: dynamic stack allocation not supported'.

libgcc/
* config/gcn/unwind-gcn.c (_Unwind_DeleteException): New.
* config/nvptx/unwind-nvptx.c (_Unwind_DeleteException): Likewise.

(cherry picked from commit 815abd68353751d53ed9299e218f8ca1cc108d5f)

GCN, nvptx libstdc++: Force use of '__atomic' builtins [PR119645]

For both GCN, nvptx, this gets rid of 'configure'-time:

    configure: WARNING: No native atomic operations are provided for this platform.
    configure: WARNING: They will be faked using a mutex.
    configure: WARNING: Performance of certain classes will degrade as a result.

..., and changes:

    -checking for lock policy for shared_ptr reference counts... mutex
    +checking for lock policy for shared_ptr reference counts... atomic

That means, '[...]/[target]/libstdc++-v3/', 'Makefile's change:

    -ATOMICITY_SRCDIR = config/cpu/generic/atomicity_mutex
    +ATOMICITY_SRCDIR = config/cpu/generic/atomicity_builtins

..., and '[...]/[target]/libstdc++-v3/config.h' changes:

    /* Defined if shared_ptr reference counting should use atomic operations. */
    -/* #undef HAVE_ATOMIC_LOCK_POLICY */
    +#define HAVE_ATOMIC_LOCK_POLICY 1

    /* Define if the compiler supports C++11 atomics. */
    -/* #undef _GLIBCXX_ATOMIC_BUILTINS */
    +#define _GLIBCXX_ATOMIC_BUILTINS 1

..., and '[...]/[target]/libstdc++-v3/include/[target]/bits/c++config.h'
changes:

    /* Defined if shared_ptr reference counting should use atomic operations. */
    -/* #undef _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY */
    +#define _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY 1

    /* Define if the compiler supports C++11 atomics. */
    -/* #undef _GLIBCXX_ATOMIC_BUILTINS */
    +#define _GLIBCXX_ATOMIC_BUILTINS 1

This means that '[...]/[target]/libstdc++-v3/libsupc++/atomicity.cc',
'[...]/[target]/libstdc++-v3/libsupc++/atomicity.o' then uses atomic
instructions for synchronization instead of C++ static local variables, which
in turn for their guard variables, via 'libstdc++-v3/libsupc++/guard.cc', used
'libgcc/gthr.h' recursive mutexes, which currently are unsupported for GCN.

For GCN, this turns ~500 libstdc++ execution test FAILs into PASSes, and also
progresses:

    PASS: g++.dg/tree-ssa/pr20458.C  -std=gnu++17 (test for excess errors)
    [-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C  -std=gnu++17 execution test
    PASS: g++.dg/tree-ssa/pr20458.C  -std=gnu++26 (test for excess errors)
    [-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C  -std=gnu++26 execution test
    UNSUPPORTED: g++.dg/tree-ssa/pr20458.C  -std=gnu++98: exception handling not supported

(For nvptx, there is no effective change, due to other misconfiguration.)

PR target/119645
libstdc++-v3/
* acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY) [GCN, nvptx]:
Hard-code results.
* configure: Regenerate.
* configure.host [GCN, nvptx] (atomicity_dir): Set to
'cpu/generic/atomicity_builtins'.

(cherry picked from commit 059b5509c14904b55c37f659170240ae0d2c1c8e)

nvptx: Support '-mfake-ptx-alloca': defer failure to run-time 'alloca' usage

Follow-up to commit 1146410c0feb0e82c689b1333fdf530a2b34dc2b
"nvptx: Support '-mfake-ptx-alloca'".  '-mfake-ptx-alloca' is applicable only
for configurations where PTX 'alloca' is not supported, where target libraries
are built with it enabled (that is, libstdc++, libgfortran).

This change progresses:

    [-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C  -std=gnu++17 (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}
    [-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C  -std=gnu++26 (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C  -std=gnu++26 [-compilation failed to produce executable-]{+execution test+}
    UNSUPPORTED: g++.dg/tree-ssa/pr20458.C  -std=gnu++98: exception handling not supported

..., and "enables" a few test cases:

    FAIL: g++.old-deja/g++.other/sibcall1.C  -std=gnu++17 (test for excess errors)
    [Etc.]

    FAIL: g++.old-deja/g++.other/unchanging1.C  -std=gnu++17 (test for excess errors)
    [Etc.]

..., which now (unrelatedly to 'alloca', and in the same way as configurations
where PTX 'alloca' is supported) FAIL due to:

    unresolved symbol _Unwind_DeleteException
    collect2: error: ld returned 1 exit status

Most importantly, it progresses ~830 libstdc++ test cases:

    [-FAIL:-]{+PASS:+} [...] (test for excess errors)

..., with (if applicable, for most of them):

    [-UNRESOLVED:-]{+PASS:+} [...] [-compilation failed to produce executable-]{+execution test+}

..., or just a few 'FAIL: [...] execution test' where these test cases also
FAIL in configurations where PTX 'alloca' is supported, or ~120 instances of
'FAIL: [...]  execution test' due to run-time
'GCC/nvptx: sorry, unimplemented: dynamic stack allocation not supported'.

This change also resolves the cases noted in
commit bac2d8a246892334e24dfa7d62be0cd0648c5606
"nvptx: Build libgfortran with '-mfake-ptx-alloca' [PR107635]":

| With '-mfake-ptx-alloca', libgfortran again succeeds to build, and compared
| to before, we've got only a small number of regressions due to nvptx 'ld'
| complaining about 'unresolved symbol __GCC_nvptx__PTX_alloca_not_supported':
|
|     [-PASS:-]{+FAIL:+} gfortran.dg/coarray/codimension_2.f90 -fcoarray=lib  -O2  -lcaf_single (test for excess errors)

    [-FAIL:-]{+PASS:+} gfortran.dg/coarray/codimension_2.f90 -fcoarray=lib  -O2  -lcaf_single (test for excess errors)

|     [-PASS:-]{+FAIL:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib  -O2  -lcaf_single (test for excess errors)
|     [-PASS:-]{+UNRESOLVED:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib  -O2  -lcaf_single [-execution test-]{+compilation failed to produce executable+}

    [-FAIL:-]{+PASS:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib  -O2  -lcaf_single (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib  -O2  -lcaf_single [-compilation failed to produce executable-]{+execution test+}

|     [-PASS:-]{+FAIL:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib  -O2  -lcaf_single (test for excess errors)
|     [-PASS:-]{+UNRESOLVED:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib  -O2  -lcaf_single [-execution test-]{+compilation failed to produce executable+}

    [-FAIL:-]{+PASS:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib  -O2  -lcaf_single (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib  -O2  -lcaf_single [-compilation failed to produce executable-]{+execution test+}

|     [-PASS:-]{+FAIL:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib  -O2  -lcaf_single (test for excess errors)
|     [-PASS:-]{+UNRESOLVED:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib  -O2  -lcaf_single [-execution test-]{+compilation failed to produce executable+}

    [-FAIL:-]{+PASS:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib  -O2  -lcaf_single (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib  -O2  -lcaf_single [-compilation failed to produce executable-]{+execution test+}

|     [-PASS:-]{+FAIL:+} gfortran.dg/coarray_43.f90   -O  (test for excess errors)

    [-FAIL:-]{+PASS:+} gfortran.dg/coarray_43.f90   -O  (test for excess errors)

..., and further progresses:

    [-FAIL:-]{+PASS:+} gfortran.dg/coarray_lib_comm_1.f90   -O0  (test for excess errors)
    [-UNRESOLVED:-]{+FAIL:+} gfortran.dg/coarray_lib_comm_1.f90   -O0  [-compilation failed to produce executable-]{+execution test+}
    [Etc.]

..., which now (unrelatedly to 'alloca', and in the same way as configurations
where PTX 'alloca' is supported) FAILs due to:

    error   : Prototype doesn't match for '_gfortran_caf_transfer_between_remotes' in 'input file 9 at offset 159897', first defined in 'input file 9 at offset 159897'
    error   : Prototype doesn't match for '_gfortran_caf_stop_numeric' in 'input file 9 at offset 159897', first defined in 'input file 9 at offset 159897'
    nvptx-run: cuLinkAddData failed: device kernel image is invalid (CUDA_ERROR_INVALID_SOURCE, 300)

gcc/
* config/nvptx/nvptx.opt (-mfake-ptx-alloca): Update.
gcc/testsuite/
* gcc.target/nvptx/alloca-2-O0_-mfake-ptx-alloca.c: Adjust.
libgcc/
* config/nvptx/alloca.c: New.
* config/nvptx/t-nvptx (LIB2ADD): Add it.

(cherry picked from commit 199f1abeef579912b4c40c42519825cedca6530f)

libstdc++, nvptx: Remove machinery to inject per-file flags

Not used anymore.

libstdc++-v3/
* config/cpu/nvptx/t-nvptx: Remove.
* configure.host [nvptx]: Adjust.

(cherry picked from commit 287f360b3e75a19c48ee14c71f51b6e7968474ef)

nvptx: Don't use PTX '.const', constant state space [PR119573]

This avoids cases where a "File uses too much global constant data" (final
executable, or single object file), and avoids cases of wrong code generation:
"error : State space incorrect for instruction 'st'" ('st.const'), or another
case where an "illegal instruction was encountered", or a lot of cases where
for two compilation units (such as a library linked with user code) we ran into
"error : Memory space doesn't match" due to differences in '.const' usage
between definition and use of a variable.

We progress:

    ptxas error   : File uses too much global constant data (0x1f01a bytes, 0x10000 max)
    nvptx-run: cuLinkAddData failed: a PTX JIT compilation failed (CUDA_ERROR_INVALID_PTX, 218)

... into:

    PASS: 20_util/to_chars/103955.cc  -std=gnu++17 (test for excess errors)
    [-FAIL:-]{+PASS:+} 20_util/to_chars/103955.cc  -std=gnu++17 execution test

We progress:

    ptxas error   : File uses too much global constant data (0x36c65 bytes, 0x10000 max)
    nvptx-as: ptxas returned 255 exit status

... into:

    [-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c   -O0  {+(test for excess errors)+}
    [-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c   -O1  {+(test for excess errors)+}
    [-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c   -O2  {+(test for excess errors)+}
    [-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c   -O3 -g  {+(test for excess errors)+}
    [-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c   -Os  {+(test for excess errors)+}

    [-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C   -O0  (test for excess errors)
    [-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C   -O1  (test for excess errors)
    [-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C   -O2  (test for excess errors)
    [-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C   -O3 -g  (test for excess errors)
    [-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C   -Os  (test for excess errors)

    [-FAIL:-]{+PASS:+} gfortran.dg/bind-c-contiguous-1.f90   -O0  (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} gfortran.dg/bind-c-contiguous-1.f90   -O0  [-compilation failed to produce executable-]{+execution test+}

    [-FAIL:-]{+PASS:+} gfortran.dg/bind-c-contiguous-4.f90   -O0  (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} gfortran.dg/bind-c-contiguous-4.f90   -O0  [-compilation failed to produce executable-]{+execution test+}

    [-FAIL:-]{+PASS:+} gfortran.dg/bind-c-contiguous-5.f90   -O0  (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} gfortran.dg/bind-c-contiguous-5.f90   -O0  [-compilation failed to produce executable-]{+execution test+}

    [-FAIL:-]{+PASS:+} 20_util/to_chars/double.cc  -std=gnu++17 (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} 20_util/to_chars/double.cc  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}

    [-FAIL:-]{+PASS:+} 20_util/to_chars/float.cc  -std=gnu++17 (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} 20_util/to_chars/float.cc  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}

    [-FAIL:-]{+PASS:+} special_functions/13_ellint_3/check_value.cc  -std=gnu++17 (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} special_functions/13_ellint_3/check_value.cc  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}

    [-FAIL:-]{+PASS:+} tr1/5_numerical_facilities/special_functions/14_ellint_3/check_value.cc  -std=gnu++17 (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} tr1/5_numerical_facilities/special_functions/14_ellint_3/check_value.cc  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}

..., and progress likewise, but fail later with an unrelated error:

    [-FAIL:-]{+PASS:+} ext/special_functions/hyperg/check_value.cc  -std=gnu++17 (test for excess errors)
    [-UNRESOLVED:-]{+FAIL:+} ext/special_functions/hyperg/check_value.cc  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}

    [...]/libstdc++-v3/testsuite/ext/special_functions/hyperg/check_value.cc:12317: void test(const testcase_hyperg<Ret> (&)[Num], Ret) [with Ret = double; unsigned int Num = 19]: Assertion 'max_abs_frac < toler' failed.

..., and:

    [-FAIL:-]{+PASS:+} tr1/5_numerical_facilities/special_functions/17_hyperg/check_value.cc  -std=gnu++17 (test for excess errors)
    [-UNRESOLVED:-]{+FAIL:+} tr1/5_numerical_facilities/special_functions/17_hyperg/check_value.cc  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}

    [...]/libstdc++-v3/testsuite/tr1/5_numerical_facilities/special_functions/17_hyperg/check_value.cc:12316: void test(const testcase_hyperg<Ret> (&)[Num], Ret) [with Ret = double; unsigned int Num = 19]: Assertion 'max_abs_frac < toler' failed.

We progress:

    nvptx-run: error getting kernel result: an illegal instruction was encountered (CUDA_ERROR_ILLEGAL_INSTRUCTION, 715)

... into:

    PASS: g++.dg/cpp1z/inline-var1.C  -std=gnu++17 (test for excess errors)
    [-FAIL:-]{+PASS:+} g++.dg/cpp1z/inline-var1.C  -std=gnu++17 execution test
    PASS: g++.dg/cpp1z/inline-var1.C  -std=gnu++20 (test for excess errors)
    [-FAIL:-]{+PASS:+} g++.dg/cpp1z/inline-var1.C  -std=gnu++20 execution test
    PASS: g++.dg/cpp1z/inline-var1.C  -std=gnu++26 (test for excess errors)
    [-FAIL:-]{+PASS:+} g++.dg/cpp1z/inline-var1.C  -std=gnu++26 execution test

(A lot of '.const' -> '.global' etc.  Haven't researched what the actual
problem was.)

We progress:

    ptxas /tmp/cc5TSZZp.o, line 142; error   : State space incorrect for instruction 'st'
    ptxas /tmp/cc5TSZZp.o, line 174; error   : State space incorrect for instruction 'st'
    ptxas fatal   : Ptx assembly aborted due to errors
    nvptx-as: ptxas returned 255 exit status

... into:

    [-FAIL:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C   -O0  (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C   -O0  [-compilation failed to produce executable-]{+execution test+}
    PASS: g++.dg/torture/builtin-clear-padding-1.C   -O1  (test for excess errors)
    PASS: g++.dg/torture/builtin-clear-padding-1.C   -O1  execution test
    [-FAIL:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C   -O2  (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C   -O2  [-compilation failed to produce executable-]{+execution test+}
    [-FAIL:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C   -O3 -g  (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C   -O3 -g  [-compilation failed to produce executable-]{+execution test+}
    [-FAIL:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C   -Os  (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C   -Os  [-compilation failed to produce executable-]{+execution test+}

This indeed tried to write ('st.const') into 's2', which was '.const'
(also: 's1' was '.const') -- even though, no explicit 'const' in
'g++.dg/torture/builtin-clear-padding-1.C'; "interesting".

We progress:

    error   : Memory space doesn't match for '_ZNSt3tr18__detail12__prime_listE' in 'input file 3 at offset 53085', first specified in 'input file 1 at offset 1924'
    nvptx-run: cuLinkAddData failed: device kernel image is invalid (CUDA_ERROR_INVALID_SOURCE, 300)

... into execution test PASS for a few dozens of libstdc++ test cases.

We progress:

    error   : Memory space doesn't match for '_ZNSt6locale17_S_twinned_facetsE' in 'input file 11 at offset 479903', first specified in 'input file 9 at offset 59300'
    nvptx-run: cuLinkAddData failed: device kernel image is invalid (CUDA_ERROR_INVALID_SOURCE, 300)

... into:

    PASS: g++.dg/tree-ssa/pr20458.C  -std=gnu++17 (test for excess errors)
    [-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C  -std=gnu++17 execution test
    PASS: g++.dg/tree-ssa/pr20458.C  -std=gnu++26 (test for excess errors)
    [-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C  -std=gnu++26 execution test

..., and likewise for a few hundreds of libstdc++ test cases.

We progress:

    error   : Memory space doesn't match for '_ZNSt6locale5_Impl19_S_facet_categoriesE' in 'input file 11 at offset 821962', first specified in 'input file 10 at offset 676317'
    nvptx-run: cuLinkAddData failed: device kernel image is invalid (CUDA_ERROR_INVALID_SOURCE, 300)

... into execution test PASS for a hundred of libstdc++ test cases.

We progress:

    error   : Memory space doesn't match for '_ctype_' in 'input file 22 at offset 1698331', first specified in 'input file 9 at offset 57095'
    nvptx-run: cuLinkAddData failed: device kernel image is invalid (CUDA_ERROR_INVALID_SOURCE, 300)

... into execution test PASS for another few libstdc++ test cases.

PR target/119573
gcc/
* config/nvptx/nvptx.cc (nvptx_encode_section_info): Don't set
'DATA_AREA_CONST' for 'TREE_CONSTANT', or 'TREE_READONLY'.
(nvptx_asm_declare_constant_name): Use '.global' instead of
'.const'.
gcc/testsuite/
* gcc.c-torture/compile/pr46534.c: Don't 'dg-skip-if' nvptx.
* gcc.target/nvptx/decl.c: Adjust.
libstdc++-v3/
* config/cpu/nvptx/t-nvptx (AM_MAKEFLAGS): Don't amend.

(cherry picked from commit 5deeae29dab2af64e3342daf7a30000e424c64ea)

nvptx: In offloading compilation, special-case certain host-setup symbol aliases: avoid unused label 'emit_ptx_alias' diagnostic

Minor fix-up for commit 65b31b3fff2fced015ded1026733605f34053796
"nvptx: In offloading compilation, special-case certain host-setup symbol aliases [PR101544]",
as of which we see for non-offloading configurations:

    +[...]/source-gcc/gcc/config/nvptx/nvptx.cc: In function 'void nvptx_asm_output_def_from_decls(FILE*, tree, tree)':
    +[...]/source-gcc/gcc/config/nvptx/nvptx.cc:7769:2: warning: label 'emit_ptx_alias' defined but not used [-Wunused-label]
    + 7769 |  emit_ptx_alias:
    +      |  ^~~~~~~~~~~~~~

gcc/
* config/nvptx/nvptx.cc (nvptx_asm_output_def_from_decls)
[!ACCEL_COMPILER]: Don't define label 'emit_ptx_alias'.

(cherry picked from commit 175016de6f9d800343ce31cf1837a3265569b657)

OpenACC 2.7: adjust 2.6 references to 2.7

More adjustments to indicate OpenACC 2.7 support.

2025-04-11 Chung-Lin Tang <cltang@baylibre.com>

gcc/fortran/ChangeLog:

* intrinsic.texi (OpenACC Module OPENACC): Adjust version
references to 2.7 from 2.6.

libgomp/ChangeLog:

* libgomp.texi (Enabling OpenACC): Adjust version
references to 2.7 from 2.6.
* openacc.f90 (module openacc): Adjust openacc_version to 201811.
* openacc_lib.h (openacc_version): Adjust openacc_version to 201811.
* testsuite/libgomp.oacc-fortran/openacc_version-1.f: Adjust
test value to 201811.
* testsuite/libgomp.oacc-fortran/openacc_version-2.f90: Likewise.

OpenACC 2.7: update _OPENACC value test in testcases

Adjust value test of _OPENACC to 201811 for OpenACC 2.7 update.

2025-04-11 Chung-Lin Tang <cltang@baylibre.com>

gcc/testsuite/ChangeLog:

* c-c++-common/cpp/openacc-define-3.c: Adjust test.
* gfortran.dg/openacc-define-3.f90: Adjust test.

OpenACC 2.7: update _OPENACC symbol to 201811

This patch updates the _OPENACC preprocessor symbol to "201811",
to indicate OpenACC 2.7 support.

2025-04-11 Chung-Lin Tang <cltang@baylibre.com>

gcc/c-family/ChangeLog:

* c-cppbuiltin.cc (c_cpp_builtins): Updated _OPENACC to "201811"
for OpenACC 2.7.

gcc/fortran/ChangeLog:

* cpp.cc (cpp_define_builtins): Updated _OPENACC to "201811"
for OpenACC 2.7.

Merge remote-tracking branch 'origin/releases/gcc-14' into devel/omp/gcc-14

Merge up to r14-11542-g059107eb22c480 (8th Apr 2025)

OpenMP: Fix append_args handling in modify_call_for_omp_dispatch

At tree level, the addr ref is also required for array dummy arguments,
contrary to C; the GOMP_interop calls in modify_call_for_omp_dispatch
were updated accordingly (using build_fold_addr_expr).

As the GOMP_interop calls had no location data associated with them,
the init call happened as soon as executing the previous line of code,
which was confusing; solution: use the location data of the function
call itself.

PR middle-end/119662

gcc/ChangeLog:

* gimplify.cc (modify_call_for_omp_dispatch): Fix GOMP_interop
arg passing; add location info to function calls.

libgomp/ChangeLog:

* testsuite/libgomp.c/append-args-fr-1.c: New test.
* testsuite/libgomp.c/append-args-fr.h: New test.

gcc/testsuite/ChangeLog:
* c-c++-common/gomp/append-args-interop.c: Update for fixed
GOMP_interop call.
* g++.dg/gomp/append-args-8.C: Likewise.
* gfortran.dg/gomp/append-args-interop.f90: Likewise.

(cherry picked from commit 0f77d88fdf797842ac0134a4013b4227dd5a658f)

libstdc++: Fix use-after-free in std::format [PR119671]

When formatting floating-point values to wide strings there's a case
where we invalidate a std::wstring buffer while a std::wstring_view is
still referring to it.

libstdc++-v3/ChangeLog:

PR libstdc++/119671
* include/std/format (__formatter_fp::format): Do not invalidate
__wstr unless _M_localized returns a valid string.
* testsuite/std/format/functions/format.cc: Check wide string
formatting of floating-point types with classic locale.

Reviewed-by: Tomasz Kaminski <tkaminsk@redhat.com>
(cherry picked from commit e33b62eed7fd0a82d758b23252d288585b6790d2)

libstdc++: Add new header to Doxygen config file

libstdc++-v3/ChangeLog:

* doc/doxygen/user.cfg.in (INPUT): Add text_encoding.

(cherry picked from commit 5430fcd1a3222d62c1b9560de251268c8bc50303)

libstdc++: Replace use of __mindist in ranges::uninitialized_xxx algos [PR101587]

In r15-8980-gf4b6acfc36fb1f I introduced a new function object for
finding the smaller of two distances. In bugzilla Hewill Kang pointed
out that we still need to explicitly convert the result back to the
right difference type, because the result might be an integer-like class
type that doesn't convert to an integral type explicitly.

Rather than doing that conversion in the __mindist function object, I
think it's simpler to remove it again and just do a comparison and
assignment. We always want the result to have a specific type, so we can
just check if the value of the other type is smaller, and then convert
that to the other type if so.

libstdc++-v3/ChangeLog:

PR libstdc++/101587
* include/bits/ranges_uninitialized.h (__detail::__mindist):
Remove.
(ranges::uninitialized_copy, ranges::uninitialized_copy_n)
(ranges::uninitialized_move, ranges::uninitialized_move_n): Use
comparison and assignment instead of __mindist.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/constrained.cc:
Check with ranges that use integer-like class type for
difference type.
* testsuite/20_util/specialized_algorithms/uninitialized_move/constrained.cc:
Likewise.

Reviewed-by: Tomasz Kaminski <tkaminsk@redhat.com>
Reviewed-by: Hewill Kang <hewillk@gmail.com>
(cherry picked from commit 03ac8886e5c1fa16da90276fd721a57fa9435f4f)

libstdc++: Replace use of std::min in ranges::uninitialized_xxx algos [PR101587]

Because ranges can have any signed integer-like type as difference_type,
it's not valid to use std::min(diff1, diff2). Instead of calling
std::min with an explicit template argument, this adds a new __mindist
helper that determines the common type and uses that with std::min.

libstdc++-v3/ChangeLog:

PR libstdc++/101587
* include/bits/ranges_uninitialized.h (__detail::__mindist):
New function object.
(ranges::uninitialized_copy, ranges::uninitialized_copy_n)
(ranges::uninitialized_move, ranges::uninitialized_move_n): Use
__mindist instead of std::min.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/constrained.cc:
Check ranges with difference difference types.
* testsuite/20_util/specialized_algorithms/uninitialized_move/constrained.cc:
Likewise.

(cherry picked from commit f4b6acfc36fb1f72fdd5bf4da208515e6495a062)

LoongArch: Add LoongArch architecture detection to __float128 support in libgfortran and libquadmath [PR119408].

In GCC14, LoongArch added __float128 as an alias for _Float128.
In commit r15-8962, support for q/Q suffixes for 128-bit floating point
numbers.  This will cause the compiler to automatically link libquadmath
when compiling Fortran programs.  But on LoongArch `long double` is
IEEE quad, so there is no need to implement libquadmath.
This causes link failure.

PR target/119408

libgfortran/ChangeLog:

* acinclude.m4: When checking for __float128 support, determine
whether the current architecture is LoongArch.  If so, return false.
* configure: Regenerate.

libquadmath/ChangeLog:

* configure.ac: When checking for __float128 support, determine
whether the current architecture is LoongArch.  If so, return false.
* configure: Regenerate.

Sigend-off-by: Xi Ruoyao <xry111@xry111.site>
Sigend-off-by: Jakub Jelinek <jakub@redhat.com>
(cherry picked from commit 1534f0099c98ea14c08a401302b05edf2231f411)

Daily bump.

libstdc++: Work around C++20 tuple<tuple<any>> constraint recursion [PR116440]

The type tuple<tuple<any>> is clearly copy/move constructible, but for
reasons that are not yet completely understood checking this triggers
constraint recursion with our C++20 tuple implementation (but not the
C++17 implementation).

It turns out this recursion stems from considering the non-template
tuple(const _Elements&) constructor during the copy/move constructibility
check.  Considering this constructor is ultimately redundant, since the
defaulted copy/move constructors are better matches.

GCC has a non-standard "perfect candidate" optimization[1] that causes
overload resolution to shortcut considering template candidates if we
find a (non-template) perfect candidate.  So to work around this issue
(and as a general compile-time optimization) this patch turns the
problematic constructor into a template so that GCC doesn't consider it
when checking for copy/move constructibility of this tuple type.

Changing the template-ness of a constructor can affect overload
resolution (since template-ness is a tiebreaker) so there's a risk this
change could e.g. introduce overload resolution ambiguities.  But the
original C++17 implementation has long defined this constructor as a
template (in order to constrain it etc), so doing the same thing in the
C++20 mode should naturally be quite safe.

The testcase still fails with Clang (in C++20 mode) since it doesn't
implement said optimization.

[1]: See r11-7287-g187d0d5871b1fa and
https://isocpp.org/files/papers/P3606R0.html

PR libstdc++/116440

libstdc++-v3/ChangeLog:

* include/std/tuple (tuple::tuple(const _Elements&...))
[C++20]: Turn into a template.
* testsuite/20_util/tuple/116440.C: New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit 6570fa6f2612a4e4ddd2fcfc119369a1a48656e4)

c++: constinit and value-initialization [PR119652]

Value-initialization built an AGGR_INIT_EXPR to set AGGR_INIT_ZERO_FIRST on.
Passing that AGGR_INIT_EXPR to maybe_constant_value returned a TARGET_EXPR,
which potential_constant_expression_1 mistook for a temporary.

We shouldn't add a TARGET_EXPR to the AGGR_INIT_EXPR in this case, just like
we already avoid adding it to CONSTRUCTOR or CALL_EXPR.

PR c++/119652

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_outermost_constant_expr): Also don't add a
TARGET_EXPR around AGGR_INIT_EXPR.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constinit20.C: New test.

(cherry picked from commit c7dc9b6f889fa8f9e4ef060c3af107eaf54265c5)

c++: __FUNCTION__ in lambda return type [PR118629]

In this testcase, the use of __FUNCTION__ is within a function parameter
scope, the lambda's. And P1787 changed __func__ to live in the parameter
scope. But [basic.scope.pdecl] says that the point of declaration of
__func__ is immediately before {, so in the trailing return type it isn't in
scope yet, so this __FUNCTION__ should refer to foo().

Looking first for a block scope, then a function parameter scope, gives us
the right result.

PR c++/118629

gcc/cp/ChangeLog:

* name-lookup.cc (pushdecl_outermost_localscope): Look for an
sk_block.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-__func__3.C: New test.

(cherry picked from commit 7d561820525fd3b9d8f3876333c0584d75e7c053)

Daily bump.

c++: lambda in requires outside template [PR99546]

Since r10-7441 we set processing_template_decl in a requires-expression so
that we can use tsubst_expr to evaluate the requirements, but that confuses
lambdas terribly; begin_lambda_type silently returns error_mark_node and we
continue into other failures. This patch clears processing_template_decl
again while we're defining the closure and op() function, so it only remains
set while parsing the introducer (i.e. any init-captures) and building the
resulting object. This properly avoids trying to create another lambda in
tsubst_lambda_expr.

PR c++/99546
PR c++/113925
PR c++/106976
PR c++/109961
PR c++/117336

gcc/cp/ChangeLog:

* lambda.cc (build_lambda_object): Handle fake
requires-expr processing_template_decl.
* parser.cc (cp_parser_lambda_expression): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-requires2.C: New test.
* g++.dg/cpp2a/lambda-requires3.C: New test.
* g++.dg/cpp2a/lambda-requires4.C: New test.
* g++.dg/cpp2a/lambda-requires5.C: New test.

(cherry picked from commit 25992d8daff60726a247ec7850d540aed5335639)

c++: constraint variable used in evaluated context [PR117849]

Here we wrongly reject the type-requirement at parse time due to its use
of the constraint variable 't' within a template argument (an evaluated
context). Fix this simply by refining the "use of parameter outside
function body" error path to exclude constraint variables.

PR c++/104255 tracks the same issue for function parameters, but fixing
that would be more involved, requiring changes to the PARM_DECL case of
tsubst_expr.

PR c++/117849

gcc/cp/ChangeLog:

* semantics.cc (finish_id_expression_1): Allow use of constraint
variable outside an unevaluated context.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-requires41.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 6e973e87e3fec6f33e97edf8fce2fcd121e53961)

c++: P2280R4 and speculative constexpr folding [PR119387]

Compiling the testcase in this PR uses 2.5x more memory and 6x more
time ever since r14-5979 which implements P2280R4.  This is because
our speculative constexpr folding now does a lot more work trying to
fold ultimately non-constant calls to constexpr functions, and in turn
produces a lot of garbage.  We do sometimes successfully fold more
thanks to P2280R4, but it seems to be trivial stuff like calls to
std::array::size or std::addressof.  The benefit of P2280 therefore
doesn't seem worth the cost during speculative constexpr folding, so
this patch restricts the paper to only manifestly-constant evaluation.

PR c++/119387

gcc/cp/ChangeLog:

* constexpr.cc (p2280_active_p): New.
(cxx_eval_constant_expression) <case VAR_DECL>: Use it to
restrict P2280 relaxations.
<case PARM_DECL>: Likewise.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit a926345f22b500a2620adb83e6821e01fb8cc8fd)

Ada: Fix thinko in Eigensystem for complex Hermitian matrices

The implementation solves the eigensystem for a NxN complex Hermitian matrix
by first solving it for a 2Nx2N real symmetric matrix and then interpreting
the 2Nx1 real vectors as Nx1 complex ones, but the last step does not work.

The patch fixes the last step and also performs a small cleanup throughout
the implementation, mostly in the commentary and without functional changes.

gcc/ada/
* libgnat/a-ngcoar.adb (Eigensystem): Adjust notation and fix the
layout of the real symmetric matrix in the main comment. Adjust
the layout of the associated code accordingly and correctly turn
the 2Nx1 real vectors into Nx1 complex ones.
(Eigenvalues): Minor similar tweaks.
* libgnat/a-ngrear.adb (Jacobi): Minor tweaks in the main comment.
Adjust notation and corresponding parameter names of functions.
Fix call to Unit_Matrix routine. Adjust the comment describing
the various kinds of iterations to match the implementation.

vect: Relax scan-tree-dump strict pattern matching [PR118597]

Using specific SSA names in pattern matching in `dg-final' makes tests
"unstable", in that changes in passes prior to the pass whose dump is
analyzed in the particular test may change the numbering of the SSA
variables, causing the test to start failing spuriously.

We thus switch from specific SSA names to the use of a multi-line
regular expression making use of capture groups for matching particular
variables across different statements, ensuring the test will pass
more consistently across different versions of GCC.

PR testsuite/118597

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-fncall-mask.c: Update test directives.

Daily bump.

OpenMP: Require target and/or targetsync init modifier [PR118965]

As noted in PR 118965, the initial interop implementation overlooked
the requirement in the OpenMP spec that at least one of the "target"
and "targetsync" modifiers is required in both the interop construct
init clause and the declare variant append_args clause.

Adding the check was fairly straightforward, but it broke about a
gazillion existing test cases.  In particular, things like "init (x, y)"
which were previously accepted (and tested for being accepted) aren't
supposed to be allowed by the spec, much less things like "init (target)"
where target was previously interpreted as a variable name instead of a
modifier.  Since one of the effects of the change is that at least one
modifier is always required, I found that deleting all the code that was
trying to detect and handle the no-modifier case allowed for better
diagnostics.

gcc/c/ChangeLog
PR middle-end/118965
* c-parser.cc (c_parser_omp_clause_init_modifiers): Adjust
error message.
(c_parser_omp_clause_init): Remove code for recognizing clauses
without modifiers.  Diagnose missing target/targetsync modifier.
(c_finish_omp_declare_variant): Diagnose missing target/targetsync
modifier.

gcc/cp/ChangeLog
PR middle-end/118965
* parser.cc (c_parser_omp_clause_init_modifiers): Adjust
error message.
(cp_parser_omp_clause_init): Remove code for recognizing clauses
without modifiers.  Diagnose missing target/targetsync modifier.
(cp_finish_omp_declare_variant): Diagnose missing target/targetsync
modifier.

gcc/fortran/ChangeLog
PR middle-end/118965
* openmp.cc (gfc_parser_omp_clause_init_modifiers): Fix some
inconsistent code indentation.  Remove code for recognizing
clauses without modifiers.  Diagnose prefer_type without a
following paren.  Adjust error message for an unrecognized modifier.
Diagnose missing target/targetsync modifier.
(gfc_match_omp_init): Fix more inconsistent code indentation.

gcc/testsuite/ChangeLog
PR middle-end/118965
* c-c++-common/gomp/append-args-1.c: Add target/targetsync
modifiers so tests do what they were previously supposed to do.
Adjust expected output.
* c-c++-common/gomp/append-args-7.c: Likewise.
* c-c++-common/gomp/append-args-8.c: Likewise.
* c-c++-common/gomp/append-args-9.c: Likewise.
* c-c++-common/gomp/interop-1.c: Likewise.
* c-c++-common/gomp/interop-2.c: Likewise.
* c-c++-common/gomp/interop-3.c: Likewise.
* c-c++-common/gomp/interop-4.c: Likewise.
* c-c++-common/gomp/pr118965-1.c: New.
* c-c++-common/gomp/pr118965-2.c: New.
* g++.dg/gomp/append-args-1.C: Add target/targetsync modifiers
and adjust expected output.
* g++.dg/gomp/append-args-2.C: Likewise.
* g++.dg/gomp/append-args-6.C: Likewise.
* g++.dg/gomp/append-args-7.C: Likewise.
* g++.dg/gomp/append-args-8.C: Likewise.
* g++.dg/gomp/interop-5.C: Likewise.
* gfortran.dg/gomp/append_args-1.f90: Add target/targetsync
modifiers and adjust expected output.
* gfortran.dg/gomp/append_args-2.f90: Likewise.
* gfortran.dg/gomp/append_args-3.f90: Likewise.
* gfortran.dg/gomp/append_args-4.f90: Likewise.
* gfortran.dg/gomp/interop-1.f90: Likewise.
* gfortran.dg/gomp/interop-2.f90: Likewise.
* gfortran.dg/gomp/interop-3.f90: Likewise.
* gfortran.dg/gomp/interop-4.f90: Likewise.
* gfortran.dg/gomp/pr118965-1.f90: New.
* gfortran.dg/gomp/pr118965-2.f90: New.

(cherry picked from commit aca8155c09001f269a20d6df438fa0e749dd5388)

libstdc++: Restored accidentally removed test case.

It was removed by accident r14-11523-gad1b71fc2882c1.

libstdc++-v3/ChangeLog:

* testsuite/std/format/functions/format.cc: Restored line.

(cherry picked from commit 81c990aa84b22562157ce2926577b392b4a129d3)

libstdc++: Fix handling of field width for wide strings and characters [PR119593]

This patch corrects handling of UTF-32LE and UTF32-BE in
__unicode::__literal_encoding_is_unicode<_CharT>, so they are
recognized as unicode and functions produces correct result for wchar_t.

Use `__unicode::__field_width` to compute the estimated witdh
of the charcter for unicode wide encoding.

PR libstdc++/119593

libstdc++-v3/ChangeLog:

* include/bits/unicode.h
(__unicode::__literal_encoding_is_unicode<_CharT>):
Corrected handing for UTF-16 and UTF-32 with "LE" or "BE" suffix.
* include/std/format (__formatter_str::_S_character_width):
Define.
(__formatter_str::_S_character_width): Updated passed char
length.
* testsuite/std/format/functions/format.cc: Test for wchar_t.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>

Fortran: Fix freeing procedure pointer components [PR119380]

Backported from gcc-15.

PR fortran/119380

gcc/fortran/ChangeLog:

* trans-array.cc (structure_alloc_comps): Prevent freeing of
procedure pointer components.

gcc/testsuite/ChangeLog:

* gfortran.dg/proc_ptr_comp_54.f90: New test.

Daily bump.

tree-optimization/119145 - avoid stray .MASK_CALL after vectorization

When we BB vectorize an if-converted loop body we make sure to not
leave around .MASK_LOAD or .MASK_STORE created by if-conversion but
we failed to check for .MASK_CALL.

PR tree-optimization/119145
* tree-vectorizer.cc (try_vectorize_loop_1): Avoid BB
vectorizing an if-converted loop body when there's a .MASK_CALL
in the loop body.

* gcc.dg/vect/pr119145.c: New testcase.

(cherry picked from commit 7950d4cceb9fc7559b1343c95fc651cefbe287a0)

middle-end/119119 - re-gimplification of empty CTOR assignments

The following testcase runs into a re-gimplification issue during
inlining when processing

  MEM[(struct e *)this_2(D)].a = {};

where re-gimplification does not handle assignments in the same
way than the gimplifier but instead relies on rhs_predicate_for
and gimplifying the RHS standalone.  This fails to handle
special-casing of CTORs.  The is_gimple_mem_rhs_or_call predicate
already handles clobbers but not empty CTORs so we end up in
the fallback code trying to force the CTOR into a separate stmt
using a temporary - but as we have a non-copyable type here that ICEs.

The following generalizes empty CTORs in is_gimple_mem_rhs_or_call
since those need no additional re-gimplification.

PR middle-end/119119
* gimplify.cc (is_gimple_mem_rhs_or_call): All empty CTORs
are OK when not a register type.

* g++.dg/torture/pr11911.C: New testcase.

(cherry picked from commit 3bd61c1dfaa2d7153eb4be82f423533ea937d0f9)

tree-optimization/119096 - bogus conditional reduction vectorization

When we vectorize a .COND_ADD reduction and apply the single-use-def
cycle optimization we can end up chosing the wrong else value for
subsequent .COND_ADD. The following rectifies this.

PR tree-optimization/119096
* tree-vect-loop.cc (vect_transform_reduction): Use the
correct else value for .COND_fn.

* gcc.dg/vect/pr119096.c: New testcase.

(cherry picked from commit 10e4107dfcf9fe324d0902f16411a75c596dab91)

ipa/119067 - bogus TYPE_PRECISION check on VECTOR_TYPE

odr_types_equivalent_p can end up using TYPE_PRECISION on vector
types which is a no-go. The following instead uses TYPE_VECTOR_SUBPARTS
for vector types so we also end up comparing the number of vector elements.

PR ipa/119067
* ipa-devirt.cc (odr_types_equivalent_p): Check
TYPE_VECTOR_SUBPARTS for vectors.

* g++.dg/lto/pr119067_0.C: New testcase.
* g++.dg/lto/pr119067_1.C: Likewise.

(cherry picked from commit f22e89167b3abfbf6d67f42fc4d689d8ffdc1810)

tree-optimization/119057 - bogus double reduction detection

We are detecting a cycle as double reduction where the inner loop
cycle has extra out-of-loop uses.  This clashes at least with
assumptions from the SLP discovery code which says the cycle
isn't reachable from another SLP instance.  It also was not intended
to support this case, in fact with GCC 14 we seem to generate wrong
code here.

PR tree-optimization/119057
* tree-vect-loop.cc (check_reduction_path): Add argument
specifying whether we're analyzing the inner loop of a
double reduction.  Do not allow extra uses outside of the
double reduction cycle in this case.
(vect_is_simple_reduction): Adjust.

* gcc.dg/vect/pr119057.c: New testcase.

(cherry picked from commit 758de6263dfc7ba8701965fa468691ac23cb7eb5)

lto/114501 - missed free-lang-data for CONSTRUCTOR index

The following makes sure to also walk CONSTRUCTOR element indexes
which can be FIELD_DECLs, referencing otherwise unused types we
need to clean. walk_tree only walks CONSTRUCTOR element data.

PR lto/114501
* ipa-free-lang-data.cc (find_decls_types_r): Explicitly
handle CONSTRUCTORs as walk_tree handling of those is
incomplete.

* g++.dg/pr114501_0.C: New testcase.

(cherry picked from commit fdd95e1cf29137a19baed25f8c817d320dfe63e3)

ipa/111245 - bogus modref analysis for store in call that might throw

We currently record a kill for

*x_4(D) = always_throws ();

because we consider the store always executing since the appropriate
check for whether the stmt could throw is guarded by
!cfun->can_throw_non_call_exceptions.

PR ipa/111245
* ipa-modref.cc (modref_access_analysis::analyze_store): Do
not guard the check of whether the stmt could throw by
cfun->can_throw_non_call_exceptions.

* g++.dg/torture/pr111245.C: New testcase.

(cherry picked from commit e6037af6d5e5a43c437257580d75bc8b35a6dcfd)

aarch64: Use PAUTH instead of V8_3A in some places

PR target/119383

gcc/ChangeLog:

* config/aarch64/aarch64.cc
(aarch64_expand_epilogue): Use TARGET_PAUTH.
* config/aarch64/aarch64.md: Update comment.

(cherry picked from commit 20385cb92cbd4a1934661ab97a162c1e25935836)

libstdc++: Fix std::ranges::iter_move for function references [PR119469]

The result of std::move (or a cast to an rvalue reference) on a function
reference is always an lvalue. Because std::ranges::iter_move was using
the type std::remove_reference_t<X>&& as the result of std::move, it was
giving the wrong type for function references. Use a decltype-specifier
with declval<remove_reference_t<X>>() instead of just using the
remove_reference_t<X>&& type directly. This gives the right result,
while still avoiding the cost of doing overload resolution for
std::move.

libstdc++-v3/ChangeLog:

PR libstdc++/119469
* include/bits/iterator_concepts.h (_IterMove::__result): Use
decltype-specifier instead of an explicit type.
* testsuite/24_iterators/customization_points/iter_move.cc:
Check results for function references.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
(cherry picked from commit 3e52eb28c537aaa03afb78ef9dff8325c5f41f78)

libstdc++: Fix ranges::iter_move handling of rvalues [PR106612]

The specification for std::ranges::iter_move apparently requires us to
handle types which do not satisfy std::indirectly_readable, for example
with overloaded operator* which behaves differently for different value
categories.

libstdc++-v3/ChangeLog:

PR libstdc++/106612
* include/bits/iterator_concepts.h (_IterMove::__iter_ref_t):
New alias template.
(_IterMove::__result): Use __iter_ref_t instead of
std::iter_reference_t.
(_IterMove::__type): Remove incorrect __dereferenceable
constraint.
(_IterMove::operator()): Likewise. Add correct constraints. Use
__iter_ref_t instead of std::iter_reference_t. Forward parameter
as correct value category.
(iter_swap): Add comments.
* testsuite/24_iterators/customization_points/iter_move.cc: Test
that iter_move is found by ADL and that rvalue arguments are
handled correctly.

Reviewed-by: Patrick Palka <ppalka@redhat.com>
(cherry picked from commit a8ee522c5923ba17851e4b71316a2dff19d6368f)

libstdc++: Fix -Warray-bounds warning in std::vector<bool> [PR110498]

In this case, we need to tell the compiler that the current size is not
larger than the new size so that all the existing elements can be copied
to the new storage. This avoids bogus warnings about overflowing the new
storage when the compiler can't tell that that cannot happen.

We might as well also hoist the loads of begin() and end() before the
allocation too. All callers will have loaded at least begin() before
calling _M_reallocate.

libstdc++-v3/ChangeLog:

PR libstdc++/110498
* include/bits/vector.tcc (vector<bool, A>::_M_reallocate):
Hoist loads of begin() and end() before allocation and use them
to state an unreachable condition.
* testsuite/23_containers/vector/bool/capacity/110498.cc: New
test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
(cherry picked from commit aa3aaf2bfb8fcc17076993df4297597b68bc5f60)

libstdc++: Fix -Wstringop-overread warning in std::vector<bool> [PR114758]

As in r13-4393-gcca06f0d6d76b0 and a few other commits, we can avoid
bogus warnings in std::vector<bool> by hoisting some loads to before the
allocation that calls operator new. This means that the compiler has
enough info to remove the dead branches that trigger bogus warnings.

On trunk this is only needed with -fno-assume-sane-operators-new-delete
but it will help on the branches where that option doesn't exist.

libstdc++-v3/ChangeLog:

PR libstdc++/114758
* include/bits/vector.tcc (vector<bool, A>::_M_fill_insert):
Hoist loads of begin() and end() before allocation.
* testsuite/23_containers/vector/bool/capacity/114758.cc: New
test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
(cherry picked from commit 1f6c19f307c8de9830130a0ba071c24e3835beb3)

libstdc++: Fix bogus -Wstringop-overflow in std::vector::insert [PR117983]

This was fixed on trunk by r15-4473-g3abe751ea86e34, but that isn't
suitable for backporting. Instead, just add another unreachable
condition in std::vector::_M_range_insert so the compiler knows this
memcpy doesn't use a length originating from a negative ptrdiff_t
converted to a very positive size_t.

libstdc++-v3/ChangeLog:

PR libstdc++/117983
* include/bits/vector.tcc (vector::_M_range_insert): Add
unreachable condition to tell the compiler begin() <= end().
* testsuite/23_containers/vector/modifiers/insert/117983.cc: New
test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
(cherry picked from commit 878812b6f6905774ab37cb78903e3e11bf1c508c)

libstdc++: Fix -Warray-bounds warning in std::vector::resize [PR114945]

This is yet another false positive warning fix. This time the compiler
can't prove that when the vector has sufficient excess capacity to
append new elements, the pointer to the existing storage is not null.

libstdc++-v3/ChangeLog:

PR libstdc++/114945
* include/bits/vector.tcc (vector::_M_default_append): Add
unreachable condition so the compiler knows that _M_finish is
not null.
* testsuite/23_containers/vector/capacity/114945.cc: New test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
(cherry picked from commit 844eed3364309bd20cbb7d6793a16b7c6b889ba4)

GCN: Don't emit weak undefined symbols [PR119369]

This resolves all instances of PR119369
"GCN: weak undefined symbols -> execution test FAIL, 'HSA_STATUS_ERROR_VARIABLE_UNDEFINED'";
for all affected test cases, the execution test status progresses FAIL -> PASS.

This however also causes a small number of (expected) regressions, very similar
to GCC/nvptx:

    [-PASS:-]{+FAIL:+} g++.dg/abi/pure-virtual1.C  -std=c++17 (test for excess errors)
    [-PASS:-]{+FAIL:+} g++.dg/abi/pure-virtual1.C  -std=c++26 (test for excess errors)
    [-PASS:-]{+FAIL:+} g++.dg/abi/pure-virtual1.C  -std=c++98 (test for excess errors)

    [-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C  -std=c++11  scan-assembler .weak[ \t]*_?_ZTH11derived_obj
    [-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C  -std=c++11  scan-assembler .weak[ \t]*_?_ZTH13container_obj
    [-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C  -std=c++11  scan-assembler .weak[ \t]*_?_ZTH8base_obj
    PASS: g++.dg/cpp0x/pr84497.C  -std=c++11 (test for excess errors)
    [-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C  -std=c++17  scan-assembler .weak[ \t]*_?_ZTH11derived_obj
    [-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C  -std=c++17  scan-assembler .weak[ \t]*_?_ZTH13container_obj
    [-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C  -std=c++17  scan-assembler .weak[ \t]*_?_ZTH8base_obj
    PASS: g++.dg/cpp0x/pr84497.C  -std=c++17 (test for excess errors)
    [-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C  -std=c++26  scan-assembler .weak[ \t]*_?_ZTH11derived_obj
    [-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C  -std=c++26  scan-assembler .weak[ \t]*_?_ZTH13container_obj
    [-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C  -std=c++26  scan-assembler .weak[ \t]*_?_ZTH8base_obj
    PASS: g++.dg/cpp0x/pr84497.C  -std=c++26 (test for excess errors)

    [-PASS:-]{+FAIL:+} g++.dg/ext/weak2.C  -std=gnu++17  scan-assembler weak[^ \t]*[ \t]_?_Z3foov
    PASS: g++.dg/ext/weak2.C  -std=gnu++17 (test for excess errors)
    [-PASS:-]{+FAIL:+} g++.dg/ext/weak2.C  -std=gnu++26  scan-assembler weak[^ \t]*[ \t]_?_Z3foov
    PASS: g++.dg/ext/weak2.C  -std=gnu++26 (test for excess errors)
    [-PASS:-]{+FAIL:+} g++.dg/ext/weak2.C  -std=gnu++98  scan-assembler weak[^ \t]*[ \t]_?_Z3foov
    PASS: g++.dg/ext/weak2.C  -std=gnu++98 (test for excess errors)

    [-PASS:-]{+FAIL:+} gcc.dg/attr-weakref-1.c (test for excess errors)
    [-FAIL:-]{+UNRESOLVED:+} gcc.dg/attr-weakref-1.c [-execution test-]{+compilation failed to produce executable+}

    @@ -131211,25 +131211,25 @@ PASS: gcc.dg/weak/weak-1.c scan-assembler weak[^ \t]*[ \t]_?c
    PASS: gcc.dg/weak/weak-1.c scan-assembler weak[^ \t]*[ \t]_?d
    PASS: gcc.dg/weak/weak-1.c scan-assembler weak[^ \t]*[ \t]_?e
    PASS: gcc.dg/weak/weak-1.c scan-assembler weak[^ \t]*[ \t]_?g
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-1.c scan-assembler weak[^ \t]*[ \t]_?j
    PASS: gcc.dg/weak/weak-1.c scan-assembler-not weak[^ \t]*[ \t]_?i

    PASS: gcc.dg/weak/weak-12.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-12.c scan-assembler weak[^ \t]*[ \t]_?foo

    PASS: gcc.dg/weak/weak-15.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-15.c scan-assembler weak[^ \t]*[ \t]_?a
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-15.c scan-assembler weak[^ \t]*[ \t]_?c
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-15.c scan-assembler weak[^ \t]*[ \t]_?d
    PASS: gcc.dg/weak/weak-15.c scan-assembler-not weak[^ \t]*[ \t]_?b

    PASS: gcc.dg/weak/weak-16.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-16.c scan-assembler weak[^ \t]*[ \t]_?kallsyms_token_index
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-16.c scan-assembler weak[^ \t]*[ \t]_?kallsyms_token_table
    PASS: gcc.dg/weak/weak-2.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-2.c scan-assembler weak[^ \t]*[ \t]_?ffoo1a
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-2.c scan-assembler weak[^ \t]*[ \t]_?ffoo1b
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-2.c scan-assembler weak[^ \t]*[ \t]_?ffoo1c
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-2.c scan-assembler weak[^ \t]*[ \t]_?ffoo1e
    PASS: gcc.dg/weak/weak-2.c scan-assembler-not weak[^ \t]*[ \t]_?ffoo1d

    PASS: gcc.dg/weak/weak-3.c  (test for warnings, line 58)
    PASS: gcc.dg/weak/weak-3.c  (test for warnings, line 73)
    PASS: gcc.dg/weak/weak-3.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1a
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1b
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1c
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1e
    PASS: gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1f
    PASS: gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1g
    PASS: gcc.dg/weak/weak-3.c scan-assembler-not weak[^ \t]*[ \t]_?ffoo1d

    PASS: gcc.dg/weak/weak-4.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1a
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1b
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1c
    PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1d
    PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1e
    PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1f
    @@ -131267,16 +131267,16 @@ PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1i
    PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1j
    PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1k

    PASS: gcc.dg/weak/weak-5.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1a
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1b
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1c
    PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1d
    PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1e
    PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1f
    PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1g
    PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1h
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1i
    [-PASS:-]{+FAIL:+} gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1j
    PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1k
    PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1l

These get 'dg-xfail-if'ed or 'dg-skip-if'ed, (mostly) similar to GCC/nvptx.

PR target/119369
gcc/
* config/gcn/gcn-protos.h (gcn_asm_weaken_decl): Declare.
* config/gcn/gcn.cc (gcn_asm_weaken_decl): New.
* config/gcn/gcn-hsa.h (ASM_WEAKEN_DECL): '#define' to this.
gcc/testsuite/
* g++.dg/abi/pure-virtual1.C: 'dg-xfail-if' GCN.
* g++.dg/cpp0x/pr84497.C: 'dg-skip-if' GCN.
* g++.dg/ext/weak2.C: Likewise.
* gcc.dg/attr-weakref-1.c: Likewise.
* gcc.dg/weak/weak-1.c: Likewise.
* gcc.dg/weak/weak-12.c: Likewise.
* gcc.dg/weak/weak-15.c: Likewise.
* gcc.dg/weak/weak-16.c: Likewise.
* gcc.dg/weak/weak-2.c: Likewise.
* gcc.dg/weak/weak-3.c: Likewise.
* gcc.dg/weak/weak-4.c: Likewise.
* gcc.dg/weak/weak-5.c: Likewise.

(cherry picked from commit 2f58d8ac03911063d6a8887a2bee7b4e25ac1871)

GCN, libstdc++: '#define _GLIBCXX_USE_WEAK_REF 0' [PR119369]

This fixes a few hundreds of compilation/linking FAILs (similar to PR69506),
where the GCN/LLVM 'ld' reported:

    ld: error: relocation R_AMDGPU_REL32_LO cannot be used against symbol '_ZGTtnam'; recompile with -fPIC
    >>> defined in [...]/amdgcn-amdhsa/./libstdc++-v3/src/.libs/libstdc++.a(cow-stdexcept.o)
    >>> referenced by cow-stdexcept.cc:259 ([...]/libstdc++-v3/src/c++11/cow-stdexcept.cc:259)
    >>>               cow-stdexcept.o:(_txnal_cow_string_C1_for_exceptions(void*, char const*, void*)) in archive [...]/amdgcn-amdhsa/./libstdc++-v3/src/.libs/libstdc++.a

    ld: error: relocation R_AMDGPU_REL32_HI cannot be used against symbol '_ZGTtnam'; recompile with -fPIC
    >>> defined in [...]/amdgcn-amdhsa/./libstdc++-v3/src/.libs/libstdc++.a(cow-stdexcept.o)
    >>> referenced by cow-stdexcept.cc:259 ([...]/source-gcc/libstdc++-v3/src/c++11/cow-stdexcept.cc:259)
    >>>               cow-stdexcept.o:(_txnal_cow_string_C1_for_exceptions(void*, char const*, void*)) in archive [...]/amdgcn-amdhsa/./libstdc++-v3/src/.libs/libstdc++.a

    [...]

..., which is:

    $ c++filt _ZGTtnam
    transaction clone for operator new[](unsigned long)

..., and similarly for other libitm symbols.

However, the affected test cases, if applicable, then run into execution test
FAILs, due to PR119369
"GCN: weak undefined symbols -> execution test FAIL, 'HSA_STATUS_ERROR_VARIABLE_UNDEFINED'".

PR target/119369
libstdc++-v3/
* config/cpu/gcn/cpu_defines.h: New.
* configure.host [GCN] (cpu_defines_dir): Point to it.

(cherry picked from commit 816335960d020eac92d49bc9cd13729afd313da7)