Ensure that GOMP_MAX_VF does the right thing for explicit schedules, when
offloading is enabled ("target" directives are present), and is inactive
otherwise.
libgomp/ChangeLog:
* testsuite/libgomp.c/max_vf-1.c: New test.
* testsuite/libgomp.c/max_vf-2.c: New test.
gcc/testsuite/ChangeLog:
* gcc.dg/gomp/max_vf-1.c: New test.
Delay omp_max_vf call until after the host and device compilers have diverged
so that the max_vf value can be tuned exactly right on both variants.
This change means that the ompdevlow pass must be enabled for functions that
use OpenMP directives with both "simd" and "schedule" enabled.
gcc/ChangeLog:
* internal-fn.cc (expand_GOMP_MAX_VF): New function.
* internal-fn.def (GOMP_MAX_VF): New internal function.
* omp-expand.cc (omp_adjust_chunk_size): Emit IFN_GOMP_MAX_VF when
called in offload context, otherwise assume host context.
* omp-offload.cc (execute_omp_device_lower): Expand IFN_GOMP_MAX_VF.
The chunk size for SIMD loops should be right for the current device; too big
allocates too much memory, too small is inefficient. Getting it wrong doesn't
actually break anything though.
This patch attempts to choose the optimal setting based on the context. Both
host-fallback and device will get the same chunk size, but device performance
is the most important in this case.
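As a rough illustration of why the vectorization factor matters for the chunk
size, the sketch below rounds a chunk size up to a multiple of a power-of-two
VF. The function name adjust_chunk_size and the rounding formula are
illustrative assumptions, not a copy of omp_adjust_chunk_size.

```cpp
#include <cassert>
#include <cstdint>

// Round a schedule chunk size up to a multiple of the vectorization factor.
// Assumption: `vf` is a power of two, as SIMD widths normally are.
inline std::uint64_t adjust_chunk_size(std::uint64_t chunk_size,
                                       std::uint64_t vf)
{
  assert(vf != 0 && (vf & (vf - 1)) == 0);  // power-of-two precondition
  if (vf == 1)
    return chunk_size;                      // scalar loop: nothing to adjust
  return (chunk_size + vf - 1) & ~(vf - 1);
}
```

With a VF of 8, a requested chunk of 10 iterations becomes 16, so every chunk
holds a whole number of vector iterations.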
gcc/ChangeLog:
* omp-expand.cc (is_in_offload_region): New function.
(omp_adjust_chunk_size): Add pass-through "offload" parameter.
(get_ws_args_for): Likewise.
(determine_parallel_type): Use is_in_offload_region to adjust call to
get_ws_args_for.
(expand_omp_for_generic): Likewise.
(expand_omp_for_static_chunk): Likewise.
If requested, return the vectorization factor appropriate for the offload
device, if any.
This change gives a significant speedup in the BabelStream "dot" benchmark on
amdgcn.
The omp_adjust_chunk_size use case is set to "false" for now, but I intend
to change that in a follow-up patch.
Note that NVPTX SIMT offload does not use this code-path.
gcc/ChangeLog:
* gimple-loop-versioning.cc (loop_versioning::loop_versioning): Set
omp_max_vf to offload == false.
* omp-expand.cc (omp_adjust_chunk_size): Likewise.
* omp-general.cc (omp_max_vf): Add "offload" parameter, and detect
amdgcn offload devices.
* omp-general.h (omp_max_vf): Likewise.
* omp-low.cc (lower_rec_simd_input_clauses): Pass offload state to
omp_max_vf.
The Assume pass simply produces results, with no indication of how it
arrived at the results it gets. Add some output to the details listing.
The only functional change is that when gori is used to calculate a range
more than once (i.e., multiple uses), we now load the merged range rather
than just using the last calculated one.
* tree-assume.cc (assume_query::assume_query): Add debug output.
(assume_query::update_parms): Likewise.
(assume_query::calculate_phi): Likewise.
(assume_query::calculate_op): Likewise. Also pick up any
merged path values.
(assume_query::calculate_stmt): Likewise.
These headers make no sense for C++ programs, because they either define
different content to the corresponding <xxx.h> C header, or define
nothing at all in namespace std. They were all deprecated in C++17, so
add deprecation warnings to them, which can be disabled with
-Wno-deprecated. For C++20 and later these headers are no longer in the
standard at all, so compiling with _GLIBCXX_USE_DEPRECATED defined to 0
will give an error when they are included.
Because #warning is non-standard before C++23 we need to use pragmas to
ignore -Wc++23-extensions for the -Wsystem-headers -pedantic case.
One g++ test needs adjustment because it includes <ciso646>, but that
can be made conditional on the __cplusplus value without any reduction
in test coverage.
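A minimal sketch of the kind of conditional include described above, for code
that still wants <ciso646> on old dialects. The helper in_range is
hypothetical and only exists to exercise the alternative tokens, which are
keywords in C++ and need no header.

```cpp
#include <cassert>

// <ciso646> is deprecated from C++17 on, so only include it for older
// dialects.
#if __cplusplus < 201703L
# include <ciso646>
#endif

// Hypothetical helper using the alternative tokens `not` and `and`.
inline bool in_range(int lo, int x, int hi)
{
  assert(lo <= hi);
  return not (x < lo) and not (hi < x);
}
```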
For the library tests, consolidate the std_c++0x_neg.cc XFAIL tests into
the macros.cc test, using dg-error with a { target c++98_only }
selector. This avoids having two separate test files, one for C++98 and
one for everything later. Also add tests for the <xxx.h> headers to
ensure that they behave as expected and don't give deprecated warnings.
libstdc++-v3/ChangeLog:
* doc/xml/manual/evolution.xml: Document deprecations.
* doc/html/*: Regenerate.
* include/c_compatibility/complex.h (_GLIBCXX_COMPLEX_H): Move
include guard to start of file. Include <complex> directly
instead of <ccomplex>.
* include/c_compatibility/tgmath.h: Include <cmath> and
<complex> directly, instead of <ctgmath>.
* include/c_global/ccomplex: Add deprecated #warning for C++17
and #error for C++20 if _GLIBCXX_USE_DEPRECATED == 0.
* include/c_global/ciso646: Likewise.
* include/c_global/cstdalign: Likewise.
* include/c_global/cstdbool: Likewise.
* include/c_global/ctgmath: Likewise.
* include/c_std/ciso646: Likewise.
* include/precompiled/stdc++.h: Do not include ccomplex,
ciso646, cstdalign, cstdbool, or ctgmath in C++17 and later.
* testsuite/18_support/headers/cstdalign/macros.cc: Check for
warnings and errors for unsupported dialects.
* testsuite/18_support/headers/cstdbool/macros.cc: Likewise.
* testsuite/26_numerics/headers/ctgmath/complex.cc: Likewise.
* testsuite/27_io/objects/char/1.cc: Do not include <ciso646>.
* testsuite/27_io/objects/wchar_t/1.cc: Likewise.
* testsuite/18_support/headers/cstdbool/std_c++0x_neg.cc: Removed.
* testsuite/18_support/headers/cstdalign/std_c++0x_neg.cc: Removed.
* testsuite/26_numerics/headers/ccomplex/std_c++0x_neg.cc: Removed.
* testsuite/26_numerics/headers/ctgmath/std_c++0x_neg.cc: Removed.
* testsuite/18_support/headers/ciso646/macros.cc: New test.
* testsuite/18_support/headers/ciso646/macros.h.cc: New test.
* testsuite/18_support/headers/cstdbool/macros.h.cc: New test.
* testsuite/26_numerics/headers/ccomplex/complex.cc: New test.
* testsuite/26_numerics/headers/ccomplex/complex.h.cc: New test.
* testsuite/26_numerics/headers/ctgmath/complex.h.cc: New test.
gcc/testsuite/ChangeLog:
* g++.old-deja/g++.other/headers1.C: Do not include ciso646 for
C++17 and later.
libstdc++-v3/ChangeLog:
* include/c_compatibility/complex.h (_GLIBCXX_COMPLEX_H): Move
include guard to start of the header.
* include/c_global/ctgmath (_GLIBCXX_CTGMATH): Likewise.
Currently dereferencing an empty shared_ptr prints a complicated
internal type in the assertion message:
include/bits/shared_ptr_base.h:1377: std::__shared_ptr_access<_Tp, _Lp, <anonymous>, <anonymous> >::element_type& std::__shared_ptr_access<_Tp, _Lp, <anonymous>, <anonymous> >::operator*() const [with _Tp = std::filesystem::__cxx11::recursive_directory_iterator::_Dir_stack; __gnu_cxx::_Lock_policy _Lp = __gnu_cxx::_S_atomic; bool <anonymous> = false; bool <anonymous> = false; element_type = std::filesystem::__cxx11::recursive_directory_iterator::_Dir_stack]: Assertion '_M_get() != nullptr' failed.
Users don't care about any of the _Lp and <anonymous> template
parameters, so this is unnecessarily verbose.
We can simplify it to something that only mentions "shared_ptr_deref"
and the element type:
include/bits/shared_ptr_base.h:1371: _Tp* std::__shared_ptr_deref(_Tp*) [with _Tp = filesystem::__cxx11::recursive_directory_iterator::_Dir_stack]: Assertion '__p != nullptr' failed.
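The idea can be sketched outside the library as follows. The names
checked_deref and ptr_access are hypothetical stand-ins, and plain assert is
used in place of the library's own assertion macro.

```cpp
#include <cassert>

// Hypothetical checked-dereference helper. Because the assertion lives in a
// function template with a single parameter, a failure message only has to
// name the element type.
template<typename T>
inline T* checked_deref(T* p)
{
  assert(p != nullptr);
  return p;
}

// Hypothetical smart-pointer-like class with extra template parameters that
// would otherwise all appear in the assertion message.
template<typename T, int Policy, bool A = false, bool B = false>
struct ptr_access
{
  explicit ptr_access(T* p) : ptr(p) {}
  T& operator*() const { return *checked_deref(ptr); }
  T* ptr;
};
```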
libstdc++-v3/ChangeLog:
* include/bits/shared_ptr_base.h (__shared_ptr_deref): New
function template.
(__shared_ptr_access, __shared_ptr_access<>): Use it.
Several member functions of filesystem::directory_iterator and
filesystem::recursive_directory_iterator currently dereference their
shared_ptr data member without checking for non-null. Because they use
operator-> and that function only uses _GLIBCXX_DEBUG_PEDASSERT rather
than __glibcxx_assert there is no assertion even when the library is
built with _GLIBCXX_ASSERTIONS defined. This means that dereferencing
invalid directory iterators gives an unhelpful segfault.
By using (*p). instead of p-> we get an assertion when the library is
built with _GLIBCXX_ASSERTIONS, with a "_M_get() != nullptr" message.
libstdc++-v3/ChangeLog:
* src/c++17/fs_dir.cc (fs::directory_iterator::operator*): Use
shared_ptr::operator* instead of shared_ptr::operator->.
(fs::recursive_directory_iterator::options): Likewise.
(fs::recursive_directory_iterator::depth): Likewise.
(fs::recursive_directory_iterator::recursion_pending): Likewise.
(fs::recursive_directory_iterator::operator*): Likewise.
(fs::recursive_directory_iterator::disable_recursion_pending):
Likewise.
This patch disables propagation of ipcp information into partitions
where all instances of the node are marked to be inlined.
Motivation:
Incremental LTO needs stable values between compilations to be
effective. This requirement fails with the following example:
void heavily_used_function(int);
...
heavily_used_function(__LINE__);
IPCP creates a long list of all __LINE__ arguments, and then
propagates it with every function clone, even though for inlined
functions this information is not useful.
gcc/ChangeLog:
* ipa-prop.cc (write_ipcp_transformation_info): Disable
unneeded value propagation.
* lto-cgraph.cc (lto_symtab_encoder_encode): Default values.
(lto_symtab_encoder_always_inlined_p): New.
(lto_set_symtab_encoder_not_always_inlined): New.
(add_node_to): Set always inlined.
* lto-streamer.h (struct lto_encoder_entry): New field.
(lto_symtab_encoder_always_inlined_p): New.
Store merging assumes a merged region won't be too large. The assumption
shows up e.g. in the use of inappropriate types in various spots (int for
bit sizes and bit positions in a few spots, or unsigned for the total size
in bytes of the merged region), in doing XNEWVEC for the whole total size of
the merged region and preparing everything in there, and even in using
XALLOCAVEC for that in two spots. The last case is what was breaking the
test below in the patch: a 64MB XALLOCAVEC is just too large. But even with
that fixed, I think we just shouldn't be merging gigabyte-sized merge groups.
We already have --param=store-merging-max-size= parameter, right now with
65536 bytes maximum (if needed, we could raise that limit a little bit).
That parameter is currently used when merging two adjacent stores, if the
size of the already merged bitregion together with the new store's bitregion
is above that limit, we don't merge those.
I guess initially that was sufficient, at that time a store was always
limited to MAX_BITSIZE_MODE_ANY_INT bits.
But later on we've added support for empty ctors ({} and even later
{CLOBBER}) and also added another spot where we merge further stores into
the merge group, if there is some overlap, we can merge various other stores
in one coalesce_immediate_stores iteration.
And, we weren't applying the --param=store-merging-max-size= parameter
in either of those cases. So a single store can be gigabytes long, and
if there is some overlap, we can extend the region again to gigabytes in
size.
The following patch attempts to apply that parameter even in those cases.
So, when testing whether to merge the merged group with info (we've already
punted if those together are above the parameter) and some other stores,
the first two hunks just punt if that would make the merge group too large.
And the third hunk doesn't even add stores which are over the limit.
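The size check described above can be sketched like this. The bit_region
type, the merge_fits name, and rounding the size up to whole bytes are
assumptions made for illustration; the actual pass works on its own store
descriptors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical description of a store's bit region: [start, end) in bits.
struct bit_region
{
  std::uint64_t start;
  std::uint64_t end;
};

constexpr std::uint64_t bits_per_unit = 8;

// Return true if the union of the two regions, measured in bytes (rounded
// up), stays within max_size_bytes.
inline bool merge_fits(const bit_region& group, const bit_region& store,
                       std::uint64_t max_size_bytes)
{
  assert(group.start <= group.end && store.start <= store.end);
  const std::uint64_t start = std::min(group.start, store.start);
  const std::uint64_t end = std::max(group.end, store.end);
  const std::uint64_t bytes = (end - start + bits_per_unit - 1) / bits_per_unit;
  return bytes <= max_size_bytes;
}
```

A caller would punt on the merge whenever merge_fits returns false, whether
the candidate is an adjacent store or an additional overlapping one.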
2024-11-06 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/117439
* gimple-ssa-store-merging.cc
(imm_store_chain_info::coalesce_immediate_stores): Punt if merging of
any of the additional overlapping stores would result in growing the
bitregion size over param_store_merging_max_size.
(pass_store_merging::process_store): Terminate all aliasing chains
for stores with bitregion larger than param_store_merging_max_size.
* g++.dg/opt/pr117439.C: New test.
encode_tree_to_bitpos uses the more expensive sub_byte_op_p mode in which
it has to allocate a buffer and do various extra work like shifting the bits
etc. if bitlen or bitpos aren't multiples of BITS_PER_UNIT, or if bitlen
doesn't have corresponding integer mode.
The last case is explained later in the comments:
/* The native_encode_expr machinery uses TYPE_MODE to determine how many
bytes to write. This means it can write more than
ROUND_UP (bitlen, BITS_PER_UNIT) / BITS_PER_UNIT bytes (for example
write 8 bytes for a bitlen of 40). Skip the bytes that are not within
bitlen and zero out the bits that are not relevant as well (that may
contain a sign bit due to sign-extension). */
Now, we've later added empty_ctor_p support, either {} CONSTRUCTOR
or {CLOBBER}, which doesn't use native_encode_expr at all, just memset,
so that case doesn't need those fancy games unless bitlen or bitpos
aren't multiples of BITS_PER_UNIT (unlikely, but let's pretend it is
possible).
The following patch makes us use the fast path even for empty_ctor_p
which occupy full bytes, we can just memset that in the provided buffer and
don't need to XALLOCAVEC another buffer.
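A simplified model of the resulting decision is sketched below. The names
needs_sub_byte_path and has_integer_mode are hypothetical, and the set of
integer modes is assumed to be the usual 8/16/32/64-bit ones.

```cpp
#include <cassert>

constexpr unsigned bits_per_unit = 8;

// Hypothetical predicate: does an integer mode of exactly `bitlen` bits
// exist? Assumption: only 8, 16, 32 and 64 bits.
inline bool has_integer_mode(unsigned long bitlen)
{
  return bitlen == 8 || bitlen == 16 || bitlen == 32 || bitlen == 64;
}

// Decide whether the slower sub-byte path is needed.
inline bool needs_sub_byte_path(unsigned long bitlen, unsigned long bitpos,
                                bool empty_ctor)
{
  assert(bitlen > 0);
  if (bitlen % bits_per_unit != 0 || bitpos % bits_per_unit != 0)
    return true;    // not byte-aligned: always the slow path
  if (empty_ctor)
    return false;   // byte-aligned zeroing is a plain memset
  return !has_integer_mode(bitlen);
}
```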
This patch in itself fixes the testcase from the PR (which was about using
huge XALLOCAVEC), but I want to do some other changes, to be posted in a
follow-up patch.
2024-11-06 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/117439
* gimple-ssa-store-merging.cc (encode_tree_to_bitpos): For
empty_ctor_p use !sub_byte_op_p even if bitlen doesn't have an
integral mode.
2024-11-06 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/117434
* interface.cc (gfc_compare_actual_formal): Skip 'Expected a
procedure pointer error' if the formal argument typespec has an
interface and the type of the actual arg is BT_PROCEDURE.
gcc/testsuite/
PR fortran/117434
* gfortran.dg/proc_ptr_54.f90: New test. This is temporarily
compile-only until PR117455 is fixed.
* gfortran.dg/proc_ptr_55.f90: New test.
* gfortran.dg/proc_ptr_56.f90: New test.
Since x32 uses (%reg32), instead of (%r.x), also scan (%e.x).
* gcc.target/i386/avx10_2-512-movrs-1.c: Also scan (%e.x).
* gcc.target/i386/avx10_2-movrs-1.c: Likewise.
* gcc.target/i386/movrs-1.c: Likewise.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Since x32 uses (%edi), instead of (%rdi), also scan (%edi).
* gcc.target/i386/apx-ndd.c: Also scan (%edi).
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
PR fortran/117442 reports a crash on exit of f951 when configured
with --enable-gather-detailed-mem-stats.
The crash happens if any diagnostics were ever buffered into
error_buffer. The root cause is that error_buffer is statically
allocated and thus has a non-trivial destructor called at exit.
If error_buffer's diagnostic_buffer ever buffered anything, then
a diagnostic_per_format_buffer will have been created for the
buffer per-output-sink, and the destructors for these call
into the mem-stats subsystem, which has already been cleaned up.
The simplest fix is to allocate error_buffer on the heap, rather
than statically, which fixes the crash.
There's a comment about error_buffer:
/* pp_error_buffer is statically allocated. This simplifies memory
management when using gfc_push/pop_error. */
added by Manu in r6-1748-g5862c189c2c3c2 while fixing PR fortran/66528.
The comment appears to be out of date. I've tested maxerrors.f90 under
valgrind, and it's clean with the patch.
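The shape of the fix can be sketched in isolation as follows. The diag_buffer
type and the two function names are hypothetical stand-ins; only the
error_buffer name is taken from the description above.

```cpp
#include <cassert>

// Hypothetical buffer type whose destructor depends on another subsystem
// that may already be shut down when static destructors run.
struct diag_buffer
{
  int pending = 0;
};

// A pointer has a trivial destructor, so nothing runs for it at exit.
static diag_buffer* error_buffer = nullptr;

inline void diagnostics_init()
{
  assert(error_buffer == nullptr);
  error_buffer = new diag_buffer();
}

inline void diagnostics_finish()
{
  delete error_buffer;      // destroyed at a point the program controls
  error_buffer = nullptr;
}
```

The trade-off is that every user of the buffer now has to go through the
pointer, which is why so many functions are touched in the ChangeLog below.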
gcc/fortran/ChangeLog:
PR fortran/117442
* error.cc (error_buffer): Convert to a pointer so it can be
heap-allocated.
(gfc_error_now): Update for error_buffer being heap-allocated.
(gfc_clear_error): Likewise.
(gfc_error_flag_test): Likewise.
(gfc_error_check): Likewise.
(gfc_push_error): Likewise.
(gfc_pop_error): Likewise.
(gfc_diagnostics_init): Allocate error_buffer on the heap, rather
than statically.
(gfc_diagnostics_finish): Delete error_buffer.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Just a small comment fix: the `(` was in the wrong location,
making it look like it was transforming into `(X - X) != 0`
rather than `X - (X != 0)`.
Pushed as obvious after a quick build for x86_64-linux-gnu.
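For reference, the equivalence that the comment describes can be checked with
a small standalone sketch on unsigned 32-bit values. The function names are
illustrative only and do not come from match.pd.

```cpp
#include <cassert>
#include <cstdint>

// Original form: X != 0 ? X + ~0 : 0 (X + ~0 is X - 1 in modular arithmetic).
inline std::uint32_t original_form(std::uint32_t x)
{
  return x != 0 ? x + ~std::uint32_t(0) : 0;
}

// Simplified form: X - (X != 0).
inline std::uint32_t simplified_form(std::uint32_t x)
{
  return x - static_cast<std::uint32_t>(x != 0);
}

// Compare both forms on a handful of boundary values.
inline void check_equivalence()
{
  const std::uint32_t samples[] = {0u, 1u, 2u, 0x7fffffffu, 0xffffffffu};
  for (std::uint32_t x : samples)
    assert(original_form(x) == simplified_form(x));
}
```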
gcc/ChangeLog:
* match.pd (X != 0 ? X + ~0 : 0): Fix comment.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Since r15-1619-g3b9b8d6cfdf, test5 and test8 fail because "ip" might be
used and r3 might be moved to another register for later dereference.
gcc/testsuite/ChangeLog:
PR testsuite/116623
* gcc.target/arm/mve/dlstp-compile-asm-2.c: Align test5 and
test8 with changes in r15-1619-g3b9b8d6cfdf.
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
The test case assumes that -mfp16-format=alternative is accepted for the
target, but not all targets support this flag. One such target is
Cortex-M85 that does support FP16, but not the alternative format.
gcc/testsuite/ChangeLog:
* gcc.target/arm/pr98636.c: Use effective-target
arm_fp16_alternative.
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
Since r13-707-g68e0063397ba82, COND_EXPR/VEC_COND_EXPR has not
allowed a comparison as the first operand but the gimple front-end
was not updated for this change and you would error out later on.
An assert was added with r15-4791-gb60031e8f9f8fe which meant an ICE
would happen from the gimple FE.
This removes support for parsing of the `?:` expressions except for an
identifier.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/c/ChangeLog:
PR c/117445
* gimple-parser.cc (c_parser_gimple_statement): Remove
support for comparisons before the query (`?`) token.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
The vector rotate splitter has some logic to deal with post-reload splitting
but not all cases in aarch64_emit_opt_vec_rotate are post-reload-safe.
In particular the ROTATE+XOR expansion for TARGET_SHA3 can create RTL that
can later be simplified to a simple ROTATE post-reload, which would then
match the insn again and try to split it.
So do a clean split pre-reload and avoid going down this path post-reload
by restricting the insn_and_split to can_create_pseudo_p ().
Bootstrapped and tested on aarch64-none-linux.
Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
gcc/
PR target/117449
* config/aarch64/aarch64-simd.md (*aarch64_simd_rotate_imm<mode>):
Match only when can_create_pseudo_p ().
* config/aarch64/aarch64.cc (aarch64_emit_opt_vec_rotate): Assume
can_create_pseudo_p ().
gcc/testsuite/
PR target/117449
* gcc.c-torture/compile/pr117449.c: New test.
The test safe-indirect-jump-3.c FAILs on powerpc64le-linux with the change
in jump table generation behavior with commit r15-4756-g06bc3a734e8890,
since it is compiled without optimization and expects jump tables to be
generated. Add an explicit -fjump-tables to dg-options to get the old
behavior back.
2024-11-05 Peter Bergner <bergner@linux.ibm.com>
gcc/testsuite/
PR testsuite/117444
* gcc.target/powerpc/safe-indirect-jump-3.c: Add -fjump-tables to
dg-options.
We've accidentally accepted this forever (at least as far back as 4.7), but
it's always been ill-formed; this was PR59465. And we didn't accept it for
scalar types. But rather than switch to a hard error for this code, let's
give a permerror so affected code can continue to work with -fpermissive.
PR c++/116634
gcc/cp/ChangeLog:
* init.cc (can_init_array_with_p): Allow PR59465 case with
permerror.
gcc/testsuite/ChangeLog:
* g++.dg/diagnostic/aggr-init1.C: Expect warning with -fpermissive.
* g++.dg/init/array62.C: Adjust diagnostic.
* g++.dg/init/array63.C: Adjust diagnostic.
* g++.dg/init/array64.C: Adjust diagnostic.
The ARM simulator is no longer able to simulate modern ARM cores, so it
is being deprecated. Once this change has been active for a while, and
assuming that no problems have been found, the ARM simulator codebase
will be removed.
2024-11-05 Nick Clifton <nickc@redhat.com>
* configure.ac: Add sim to noconfigdirs for ARM targets.
* configure: Regenerate.
PR117099 and PR117129 are ICEs upon invalid code that happen when NRVO
is activated, both caused by the fact that we don't consistently set
current_function_return_value to error_mark_node upon error in
finish_return_expr.
This patch fixes this inconsistency, which fixes both cases since we skip
calling finalize_nrv when current_function_return_value is
error_mark_node.
PR c++/117099
PR c++/117129
gcc/cp/ChangeLog:
* typeck.cc (check_return_expr): Upon error, set
current_function_return_value to error_mark_node.
gcc/testsuite/ChangeLog:
* g++.dg/parse/crash78.C: New test.
* g++.dg/parse/crash78a.C: New test.
* g++.dg/parse/crash79.C: New test.
We currently crash upon the following invalid code (notice the "void
void**" parameter)
=== cut here ===
using size_t = decltype(sizeof(int));
void *operator new(size_t, void void **p) noexcept { return p; }
int x;
void f() {
int y;
new (&y) int(x);
}
=== cut here ===
The problem is that in this case, we end up with a NULL_TREE parameter
list for the new operator because of the error, and (1) coerce_new_type
wrongly complains about the first parameter type not being size_t,
(2) std_placement_new_fn_p blindly accesses the parameter list, hence a
crash.
This patch does NOT address #1 since we can't easily distinguish between
a new operator declaration without parameters from one with erroneous
parameters (and it's not worth the risk to refactor and break things for
an error recovery issue) hence a dg-bogus in new52.C, but it does
address #2 and the ICE by simply checking the first parameter against
NULL_TREE.
It also adds a new testcase checking that we complain about new
operators with no or invalid first parameters, since we did not have
any.
PR c++/117101
gcc/cp/ChangeLog:
* init.cc (std_placement_new_fn_p): Check first_arg against
NULL_TREE.
gcc/testsuite/ChangeLog:
* g++.dg/init/new52.C: New test.
* g++.dg/init/new53.C: New test.
Since r10-3793-g1a37b6d9a7e57c, we ICE upon the following valid code
with -std=c++17 and above
=== cut here ===
struct Base {
unsigned int *intarray;
};
template <typename T> struct Sub : public Base {
bool Get(int i) {
return (Base::intarray[++i] == 0);
}
};
=== cut here ===
The problem is that from c++17 on, we use -fstrong-eval-order and need
to wrap the array access expression into a SAVE_EXPR. We do so at
template declaration time, and end up calling contains_placeholder_p
with a SCOPE_REF, that it does not handle well.
This patch fixes this by deferring the wrapping into SAVE_EXPR to
instantiation time for templates, when the SCOPE_REF will have been
turned into a COMPONENT_REF.
PR c++/117158
gcc/cp/ChangeLog:
* typeck.cc (cp_build_array_ref): Only wrap array expression
into a SAVE_EXPR at template instantiation time.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/eval-order13.C: New test.
* g++.dg/parse/crash77.C: New test.
The test case is for targets that support FMA. Previously
the "target" selector was missing from the dg-final command.
gcc/testsuite/ChangeLog:
PR tree-optimization/110279
* gcc.dg/pr110279-1.c: Add target selector.
The BF16-to-SF vector extension is supported by a vector permutation with a
zero vector.
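A scalar sketch of why interleaving with zeros is enough: a bfloat16 value is
the upper 16 bits of an IEEE binary32 value, so widening only has to place
zeros in the lower 16 bits. The function names are hypothetical, and the
shift below is an element-wise model of what the permutation achieves, not
the expander's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Widen one bfloat16 bit pattern to float by filling the low half with zeros.
inline float bf16_bits_to_float(std::uint16_t bits)
{
  const std::uint32_t widened = static_cast<std::uint32_t>(bits) << 16;
  float result;
  static_assert(sizeof(result) == sizeof(widened), "float must be 32 bits");
  std::memcpy(&result, &widened, sizeof(result));
  return result;
}

// Element-wise extension of an array of bfloat16 bit patterns.
inline void extend_bf16(const std::uint16_t* in, float* out, std::size_t n)
{
  assert((in != nullptr && out != nullptr) || n == 0);
  for (std::size_t i = 0; i < n; ++i)
    out[i] = bf16_bits_to_float(in[i]);
}
```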
gcc/ChangeLog:
* config/i386/i386-expand.cc
(ix86_expand_vector_bf2sf_with_vec_perm): New function.
* config/i386/i386-protos.h
(ix86_expand_vector_bf2sf_with_vec_perm): Declare.
* config/i386/mmx.md (extendv2bfv2sf2): New expander.
* config/i386/sse.md (extend<sf_cvt_bf16_lower><mode>2):
Ditto.
(VF1_AVX512BW): New mode iterator.
(sf_cvt_bf16): Add V4SF.
(sf_cvt_bf16_lower): New mode attr.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512bw-extendbf2sf.c: New test.
* gcc.target/i386/sse2-extendbf2sf.c: New test.
cxx_init_decl_processing predeclares 12 out of the 20 replaceable global
new/delete operators and sets DECL_IS_REPLACEABLE_OPERATOR on those.
But it doesn't handle the remaining 8, in particular
void* operator new(std::size_t, const std::nothrow_t&) noexcept;
void* operator new[](std::size_t, const std::nothrow_t&) noexcept;
void operator delete(void*, const std::nothrow_t&) noexcept;
void operator delete[](void*, const std::nothrow_t&) noexcept;
void* operator new(std::size_t, std::align_val_t, const std::nothrow_t&) noexcept;
void* operator new[](std::size_t, std::align_val_t, const std::nothrow_t&) noexcept;
void operator delete(void*, std::align_val_t, const std::nothrow_t&) noexcept;
void operator delete[](void*, std::align_val_t, const std::nothrow_t&) noexcept;
The following patch sets that flag during grok_op_properties for those, so
that they don't need to be predeclared.
The patch doesn't fix the whole PR, as some work is needed on the CDDCE side
too: unlike the throwing operator new case, the if (ptr) conditional around
operator delete isn't removed by VRP, and so we need to handle conditional
delete for unconditional new.
2024-11-05 Jakub Jelinek <jakub@redhat.com>
PR c++/117370
* cp-tree.h (is_std_class): Declare.
* constexpr.cc (is_std_class): New function.
(is_std_allocator): Use it.
* decl.cc (grok_op_properties): Mark global replaceable
operator new/delete operators with const std::nothrow_t & last
argument with DECL_IS_REPLACEABLE_OPERATOR.
op1 should be between 0 and 2, and op3 should be 0 or 1. Raise a warning
when either operand has an invalid value, and fall back to 0 for an
out-of-range op1.
gcc/ChangeLog:
PR target/117416
* config/i386/i386-expand.cc (ix86_expand_builtin): Raise warning when
op1 isn't in range of [0, 2] and set op1 as const0_rtx, and raise
warning when op3 isn't in range of [0, 1].
gcc/testsuite/ChangeLog:
PR target/117416
* gcc.target/i386/pr117416-1.c: New test.
* gcc.target/i386/pr117416-2.c: Ditto.
When we end up expanding an SSA name copy with BLKmode regs, which can
happen for vectors, possibly wrapped in a NOP-conversion or
a PAREN_EXPR, and we are not optimizing, we can end up with two
BLKmode MEMs that expand_gimple_stmt_1 doesn't properly handle
when expanding, trying to emit_move_insn them. Looking at store_expr,
which is what expand_gimple_stmt_1 is really doing, reveals a lot of
magic that's missing. It eventually falls back to emit_block_move
(store_expr isn't exported), so this is what I ended up using here,
given I think we'll only have BLKmode "registers" for vectors.
PR middle-end/117433
* cfgexpand.cc (expand_gimple_stmt_1): Use emit_block_move
when moving temp to BLKmode target.
* gcc.dg/pr117433.c: New testcase.
This code is not well tested: there is only a single testcase
(gcc.target/aarch64/pr94530.c) that enables this code, and it only
checks that there is no ICE.
The falkor cores have not been supported by Qualcomm since 2019 or so
either, and I don't have a way to test these cores internally.
Bootstrapped and tested on aarch64-linux-gnu.
gcc/ChangeLog:
* config/aarch64/aarch64-passes.def: Don't add pass_tag_collision_avoidance.
* config/aarch64/aarch64-protos.h (make_pass_tag_collision_avoidance): Remove.
* config/aarch64/aarch64-tuning-flags.def (RENAME_LOAD_REGS): Remove.
* config/aarch64/tuning_models/qdf24xx.h (qdf24xx_tunings): Set tuning flags to
AARCH64_EXTRA_TUNE_NONE.
* config/aarch64/falkor-tag-collision-avoidance.cc: Removed.
* config/aarch64/t-aarch64 (falkor-tag-collision-avoidance.o): Remove.
* config.gcc (aarch64*-*-*): Remove falkor-tag-collision-avoidance.o from extra_objs.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
These 2 Qualcomm cores are long gone: Qualcomm has not supported them
since at least 2019. Removing their schedule models will, I think, make it
easier to change the insn type attributes, since the models no longer
need to be kept up to date.
Note this does not remove the cores, just the schedule models.
Bootstrapped and tested on aarch64-linux-gnu.
gcc/ChangeLog:
* config/aarch64/aarch64-cores.def (falkor): Use cortex-a57 scheduler.
(saphira): Likewise.
* config/aarch64/aarch64.md: Don't include falkor.md and saphira.md.
* config/aarch64/falkor.md: Removed.
* config/aarch64/saphira.md: Removed.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
This patch enables the use of the VCOMSBF16 instruction from AVX10.2 for
efficient BF16 comparisons.
gcc/ChangeLog:
* config/i386/i386-expand.cc (ix86_expand_branch): Handle BFmode
when TARGET_AVX10_2_256 is enabled.
(ix86_prepare_fp_compare_args): Use SSE_FLOAT_MODE_SSEMATH_OR_HFBF_P.
(ix86_expand_fp_movcc): Ditto.
(ix86_expand_fp_compare): Handle BFmode under IX86_FPCMP_COMI.
* config/i386/i386.cc (ix86_multiplication_cost): Use
SSE_FLOAT_MODE_SSEMATH_OR_HFBF_P.
(ix86_division_cost): Ditto.
(ix86_rtx_costs): Ditto.
(ix86_vector_costs::add_stmt_cost): Ditto.
* config/i386/i386.h (SSE_FLOAT_MODE_SSEMATH_OR_HF_P): Rename to ...
(SSE_FLOAT_MODE_SSEMATH_OR_HFBF_P): ...this, and add BFmode.
* config/i386/i386.md (*cmpibf): New define_insn.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx10_2-comibf-1.c: New test.
* gcc.target/i386/avx10_2-comibf-2.c: Ditto.
Follow MSVC in having a special type value, T_HRESULT, for (signed)
longs that have been typedef'd with the name "HRESULT". This is so that
the debugger can display user-friendly constant names when debugging COM
code.
gcc/
* dwarf2codeview.cc (get_type_num_typedef): New function.
(get_type_num): Call get_type_num_typedef.
* dwarf2codeview.h (T_HRESULT): Define.
Translate DW_TAG_ptr_to_member_type DIEs into special extended
LF_POINTER CodeView types.
gcc/
* dwarf2codeview.cc (struct codeview_custom_type): Add new fields to
lf_pointer struct in union.
(write_lf_pointer): Write containing_class and ptr_to_mem_type if
applicable.
(get_type_num_subroutine_type): Write correct containing_class_type if
this is a pointer to a member function.
(get_type_num_ptr_to_member_type): New function.
(get_type_num): Call get_type_num_ptr_to_member_type.
* dwarf2codeview.h (CV_PTR_MODE_MASK, CV_PTR_MODE_PMEM): Define.
(CV_PTR_MODE_PMFUNC, CV_PMTYPE_D_Single, CV_PMTYPE_F_Single): Likewise.
When writing the CodeView type definition for a struct, translate
DW_TAG_inheritance DIEs into LF_BCLASS records, to record which other
structs this one inherits from.
gcc/
* dwarf2codeview.cc (enum cv_leaf_type): Add LF_BCLASS.
(struct codeview_subtype): Add lf_bclass to union.
(write_cv_padding): Add declaration.
(write_lf_fieldlist): Handle LF_BCLASS records.
(add_struct_inheritance): New function.
(get_type_num_struct): Call add_struct_inheritance.
When merging a newly imported declaration with an existing declaration
we don't currently propagate new default arguments, which causes issues
when modularising header units. This patch adds logic to propagate
default arguments to existing declarations on import, and error if the
defaults do not match.
PR c++/99274
gcc/cp/ChangeLog:
* module.cc (trees_in::is_matching_decl): Merge default
arguments.
* tree.cc (cp_tree_equal) <AGGR_INIT_EXPR>: Handle unification
of AGGR_INIT_EXPRs with new VAR_DECL slots.
gcc/testsuite/ChangeLog:
* g++.dg/modules/lambda-7.h: Skip ODR-violating declaration when
testing ODR deduplication.
* g++.dg/modules/lambda-7_b.C: Note we're testing ODR
deduplication.
* g++.dg/modules/default-arg-1_a.H: New test.
* g++.dg/modules/default-arg-1_b.C: New test.
* g++.dg/modules/default-arg-2_a.H: New test.
* g++.dg/modules/default-arg-2_b.C: New test.
* g++.dg/modules/default-arg-3.h: New test.
* g++.dg/modules/default-arg-3_a.H: New test.
* g++.dg/modules/default-arg-3_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
The 'location_t' type currently only stores a limited number of distinct
locations. In some cases, if many modules are imported that sum up to a
large number of locations, we may run out of room to represent new
locations for these imported declarations. In such a case, any new
declarations from the affected modules simply get given a location of
"the module interface as a whole".
'write_location' sometimes gets confused when this happens: it finds that
the location is a location we've noted to get streamed out, but it's
inconsistent whether it's an ordinary location from the current module
or an imported location from a different module. This causes
random-looking locations to be associated with these declarations, and
occasionally (checking-only) ICEs.
This patch fixes the issue by first checking whether an ordinary
location represents a module (rather than a location inside a module);
if so, we instead write the location of the point that we imported this
module. This will continue recursively in case the importing location
also was not able to be stored.
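The recursion can be pictured with a small model. Everything here is
hypothetical (the location struct, its fields, and resolve_for_writing); it
only illustrates following import points until a location that is not a
whole-module fallback is found.

```cpp
#include <cassert>

// Hypothetical model: a location either points inside a module, or stands
// for "the module as a whole" and remembers where that module was imported.
struct location
{
  bool whole_module;            // true if this is only a module-level fallback
  const location* imported_at;  // where the module was imported, if known
  int line;                     // illustrative payload
};

// Follow import points until reaching a location that is not a whole-module
// fallback, or until there is no further import point to follow.
inline const location* resolve_for_writing(const location* loc)
{
  assert(loc != nullptr);
  while (loc->whole_module && loc->imported_at != nullptr)
    loc = loc->imported_at;
  return loc;
}
```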
We only need to handle this in the IS_ORDINARY_LOC case: even for
locations originally within macro expansions, the remapping logic for
location exhaustion will make them look like ordinary locs again.
This is a relatively expensive addition, so this new check only occurs
if we've noted resource exhaustion has occurred while preparing imported
line maps, or in checking builds.
PR c++/105443
gcc/cp/ChangeLog:
* module.cc (loc_spans::locs_exhausted_p): New field.
(loc_spans::loc_spans): Initialise it.
(loc_spans::locations_exhausted_p): New function.
(module_state::read_prepare_maps): Move inform into...
(loc_spans::report_location_exhaustion): ...this new function.
(module_state::write_location): Check for writing module
locations stored due to resource exhaustion.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
When gdb defaults to using debuginfod, it emits a warning in the
simulate-thread tests:
spawn gdb -nx -nw -batch -x /export/gnu/import/git/gitlab/x86-gcc/gcc/testsuite/gcc.dg/simulate-thread/simulate-thread.gdb ./atomic-load-int.exe
Breakpoint 1 at 0x4005cc: file /export/gnu/import/git/gitlab/x86-gcc/gcc/testsuite/gcc.dg/simulate-thread/atomic-load-int.c, line 97.
This GDB supports auto-downloading debuginfo from the following URLs:
<https://debuginfod.fedoraproject.org/>
Enable debuginfod for this session? (y or [n]) [answered N; input not from terminal]
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
Silence the gdb warning by setting DEBUGINFOD_URLS to "" and restoring the
original value afterwards if it was set.
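The actual change is made in the Tcl .exp files. Purely as an analogue, the
same save, override, and restore pattern looks like this in C++ on a POSIX
system. The scoped_env class is hypothetical; setenv and unsetenv are POSIX
functions rather than ISO C++.

```cpp
#include <cassert>
#include <stdlib.h>   // setenv/unsetenv are POSIX, not ISO C++
#include <string>

// Hypothetical RAII guard: override an environment variable for the lifetime
// of the object, then restore the old value or unset the variable again.
class scoped_env
{
public:
  scoped_env(const char* name, const char* value)
    : name_(name), had_value_(false)
  {
    assert(value != nullptr);
    if (const char* old = getenv(name))
      {
        had_value_ = true;
        old_value_ = old;
      }
    setenv(name, value, 1);
  }

  ~scoped_env()
  {
    if (had_value_)
      setenv(name_.c_str(), old_value_.c_str(), 1);
    else
      unsetenv(name_.c_str());
  }

  scoped_env(const scoped_env&) = delete;
  scoped_env& operator=(const scoped_env&) = delete;

private:
  std::string name_;
  std::string old_value_;
  bool had_value_;
};
```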
PR testsuite/117300
* g++.dg/simulate-thread/simulate-thread.exp: Set DEBUGINFOD_URLS
to "" and restore it if it exists.
* gcc.dg/simulate-thread/simulate-thread.exp: Likewise.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>