210388 Commits

Author SHA1 Message Date
Alex Coplan
73c8e24b69 aarch64: Fix typo in aarch64-ldp-fusion.cc:combine_reg_notes [PR114936]
This fixes a typo in combine_reg_notes in the load/store pair fusion
pass.  As it stands, the calls to filter_notes store any
REG_FRAME_RELATED_EXPR to fr_expr with the following association:

 - i2 -> fr_expr[0]
 - i1 -> fr_expr[1]

but then the checks inside the following if statement expect the
opposite (more natural) association, i.e.:

 - i2 -> fr_expr[1]
 - i1 -> fr_expr[0]

this patch fixes the oversight by swapping the fr_expr indices in the
calls to filter_notes.

In hindsight it would probably have been less confusing / error-prone to
have combine_reg_notes take an array of two insns, then we wouldn't have
to mix 1-based and 0-based indexing as well as remembering to call
filter_notes in reverse program order.  This however is a minimal fix
for backporting purposes.

gcc/ChangeLog:

	PR target/114936
	* config/aarch64/aarch64-ldp-fusion.cc (combine_reg_notes):
	Ensure insn iN has its REG_FRAME_RELATED_EXPR (if any) stored in
	FR_EXPR[N-1], thus matching the correspondence expected by the
	copy_rtx calls.
2024-05-08 11:52:57 +01:00
Stefan Schulze Frielinghaus
e755f478c2 tree-ssa-loop-prefetch.cc: Honour -fno-unroll-loops
This fixes a couple of tests (gcc.dg/vect/pr109011-*.c) on s390 where
loops are unrolled although -fno-unroll-loops is specified.

gcc/ChangeLog:

	* tree-ssa-loop-prefetch.cc (determine_unroll_factor): Honour
	-fno-unroll-loops.
2024-05-08 10:48:45 +02:00
Georg-Johann Lay
41bc359c32 AVR: target/114975 - Add combine-pattern for __parityqi2.
PR target/114975
gcc/
	* config/avr/avr.md: Add combine pattern for
	8-bit parity detection.

gcc/testsuite/
	* gcc.target/avr/pr114975-parity.c: New test.
2024-05-08 10:41:12 +02:00
Georg-Johann Lay
c8f4bbb824 AVR: target/114975 - Add combine-pattern for __popcountqi2.
PR target/114975
gcc/
	* config/avr/avr.md: Add combine pattern for
	8-bit popcount detection.

gcc/testsuite/
	* gcc.target/avr/pr114975-popcount.c: New test.
2024-05-08 10:40:41 +02:00
Richard Biener
245a6d478a Fix and speedup IDF pruning by dominator
When insert_updated_phi_nodes_for tries to skip pruning the IDF to
blocks dominated by the nearest common dominator of the set of
definition blocks it compares against ENTRY_BLOCK but that's never
going to be the common dominator.  In fact, if it ever were, the code
fails to copy IDF to PRUNED_IDF, leading to wrong code.

The following fixes that by avoiding the copy and pruning from the
IDF in-place as well as using the more appropriate check against
the single successor of the ENTRY_BLOCK.

	* tree-into-ssa.cc (insert_updated_phi_nodes_for): Skip
	pruning when the nearest common dominator is the successor
	of ENTRY_BLOCK.  Do not copy IDF but prune it directly.
2024-05-08 10:29:33 +02:00
Jakub Jelinek
9adec2d91e reassoc: Fix up optimize_range_tests_to_bit_test [PR114965]
The optimize_range_tests_to_bit_test optimization normally emits a range
test first:
          if (entry_test_needed)
            {
              tem = build_range_check (loc, optype, unshare_expr (exp),
                                       false, lowi, high);
              if (tem == NULL_TREE || is_gimple_val (tem))
                continue;
            }
so during the bit test we already know that exp is in the [lowi, high]
range, but skips it if we have range info which tells us this isn't
necessary.
Also, normally it emits shifts by exp - lowi counter, but has an
optimization to use just exp counter if the mask isn't a more expensive
constant in that case and lowi is > 0 and high is smaller than prec.

The following testcase is miscompiled because the two abnormal cases
are triggered.  The range of exp is [43, 43][48, 48][95, 95], so we on
64-bit arch decide we don't need the entry test, because 95 - 43 < 64.
And we also decide to use just exp as counter, because the range test
tests just for exp == 43 || exp == 48, so high is smaller than 64 too.
Because 95 is in the exp range, we can't do that, we'd either need to
do a range test first, i.e.
if (exp - 43U <= 48U - 43U) if ((1UL << exp) & mask1)
or need to subtract lowi from the shift counter, i.e.
if ((1UL << (exp - 43)) & mask2)
but can't do both unless r.upper_bound () is < prec.

The following patch ensures that.
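
The two safe forms described above can be sketched in plain C.  This is
only an illustration, not the code the pass emits: the function names and
the MASK1/MASK2 constants are hypothetical, built for the test
exp == 43 || exp == 48, and unsigned long long stands in for the 64-bit
word.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical masks for the test exp == 43 || exp == 48.  */
#define MASK1 ((1ULL << 43) | (1ULL << 48))               /* indexed by exp */
#define MASK2 ((1ULL << (43 - 43)) | (1ULL << (48 - 43))) /* indexed by exp - 43 */

/* Form 1: range test first, then shift by exp itself.  The range test
   guarantees exp <= 48 < 64 before the shift is evaluated.  */
static bool
match_with_entry_test (unsigned exp)
{
  return exp - 43U <= 48U - 43U && ((1ULL << exp) & MASK1) != 0;
}

/* Form 2: no entry test, shift by exp - 43.  This is only valid when
   range info bounds the shift count below 64, which holds for the
   range [43, 43][48, 48][95, 95] from the PR.  */
static bool
match_with_subtraction (unsigned exp)
{
  assert (exp - 43U < 64U);
  return ((1ULL << (exp - 43U)) & MASK2) != 0;
}
```

Dropping both the entry test and the subtraction would shift by 95 for
the third value in the range, which is an out-of-range shift count on a
64-bit word.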

2024-05-08  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/114965
	* tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): Don't try to
	optimize away exp - lowi subtraction from shift count unless entry
	test is emitted or unless r.upper_bound () is smaller than prec.

	* gcc.c-torture/execute/pr114965.c: New test.
2024-05-08 10:17:32 +02:00
Eric Botcazou
10e34aa5b1 Minor tweaks to code computing modular multiplicative inverse
This removes the last parameter of choose_multiplier, which is unused, adds
another assertion and more details to the description and various comments.
Likewise to the closely related invert_mod2n, except for the last parameter.
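
For readers unfamiliar with the operation named in the subject line, a
minimal sketch of computing a multiplicative inverse modulo 2^n follows.
It is not the code in expmed.cc: the function name inverse_mod_2n and its
signature are assumptions made for illustration, and it uses the
textbook Newton iteration on 64-bit unsigned arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Inverse of odd X modulo 2^N, 1 <= N <= 64, by Newton's iteration.
   If x * y == 1 (mod 2^k), then y * (2 - x * y) is an inverse
   modulo 2^(2k), so each step doubles the number of correct bits.  */
static uint64_t
inverse_mod_2n (uint64_t x, int n)
{
  assert ((x & 1) == 1);
  assert (n >= 1 && n <= 64);
  uint64_t mask = n == 64 ? UINT64_MAX : (UINT64_C (1) << n) - 1;
  uint64_t y = x;	/* x * x == 1 (mod 8), so 3 bits are correct.  */
  for (int bits = 3; bits < n; bits *= 2)
    y = y * (2 - x * y);	/* Unsigned arithmetic wraps modulo 2^64.  */
  return y & mask;
}
```

The assertion on the first argument mirrors the requirement that only
odd values have an inverse modulo a power of two.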

[changelog]
	* expmed.h (choose_multiplier): Tweak description and remove last
	parameter.
	* expmed.cc (choose_multiplier): Likewise.  Add assertion for the
	third parameter and add details to various comments.
	(invert_mod2n): Tweak description and add assertion for the first
	parameter.
	(expand_divmod): Adjust calls to choose_multiplier.
	* tree-vect-generic.cc (expand_vector_divmod): Likewise.
	* tree-vect-patterns.cc (vect_recog_divmod_pattern): Likewise.
2024-05-08 10:06:17 +02:00
konglin1
d826f79456 x86: Fix cmov cost model issue [PR109549]
(if_then_else:SI (eq (reg:CCZ 17 flags)
        (const_int 0 [0]))
    (reg/v:SI 101 [ e ])
    (reg:SI 102))
The cost is 8 for the rtx, the cost for
(eq (reg:CCZ 17 flags) (const_int 0 [0])) is 4,
but this is just an operator, so there is no need to compute its cost in cmov.

gcc/ChangeLog:

	PR target/109549
	* config/i386/i386.cc (ix86_rtx_costs): The XEXP (x, 0) for cmov
	is an operator, so there is no need to compute its cost.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/cmov6.c: Fixed.
2024-05-08 15:46:22 +08:00
Aldy Hernandez
36e8779969 Enable prange support.
This throws the switch on prange.  After this patch, it is no longer
valid to store a pointer in an irange (or vice versa).  Instead, they
must go in prange, which is faster and more memory efficient.

I will push this now, so I have time to do any follow-up bugfixing
before going on paternity leave.

There are various cleanups we plan on doing after this patch (faster
intersect/union, remove range-op-mixed.h, remove value_range in favor
of int_range_max, reclaim the name for the Value_Range temporary,
clean up range-ops, etc etc).  But we will hold off on those for now
to make it easier to revert this patch, if for some reason we need to
do so while I'm away.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* gimple-range-cache.cc (sbr_sparse_bitmap::sbr_sparse_bitmap):
	Change irange to prange.
	* gimple-range-fold.cc (fold_using_range::fold_stmt): Same.
	(fold_using_range::range_of_address): Same.
	* gimple-range-fold.h (range_of_address): Same.
	* gimple-range-infer.cc (gimple_infer_range::add_nonzero): Same.
	* gimple-range-op.cc (class cfn_strlen): Same.
	* gimple-range-path.cc
	(path_range_query::adjust_for_non_null_uses): Same.
	* gimple-ssa-warn-access.cc (pass_waccess::check_pointer_uses): Same.
	* tree-ssa-structalias.cc (find_what_p_points_to): Same.
	* range-op-ptr.cc (range_op_table::initialize_pointer_ops): Remove
	hybrid entries in table.
	* range-op.cc (range_op_table::range_op_table): Add pointer
	entries for bitwise and/or and min/max.
	* value-range.cc (irange::verify_range): Add assert.
	* value-range.h (irange::varying_compatible_p): Remove check for
	error_mark_node.
	(irange::supports_p): Remove pointer support.
	* ipa-cp.h (ipa_supports_p): Add prange support.
2024-05-08 08:12:48 +02:00
Hans-Peter Nilsson
f6ce85502e Revert "Revert "testsuite/gcc.target/cris/pr93372-2.c: Handle xpass from combine improvement""
This reverts commit 39f81924d8.
2024-05-08 04:11:20 +02:00
Nathaniel Shead
e60032b382 c++/modules: Stream unmergeable temporaries by value again [PR114856]
In r14-9266-g2823b4d96d9ec4 I gave all temporary vars a DECL_CONTEXT,
including those at namespace or global scope, so that they could be
properly merged across importers.  However, not all of these temporary
vars are actually supposed to be mergeable.

For instance, in the attached testcase we have an unnamed temporary var
used in the NSDMI of a class member, which cannot be properly merged -- but
it also doesn't need to be, as it'll be thrown away when the class type
itself is merged anyway.

This patch reverts the change made above and instead makes a weaker
adjustment that only causes temporary vars with linkage to have a
DECL_CONTEXT to merge from.  This way these unnamed, "unmergeable"
temporaries are properly streamed by value again.

	PR c++/114856

gcc/cp/ChangeLog:

	* call.cc (make_temporary_var_for_ref_to_temp): Set context for
	temporaries with linkage.
	* init.cc (create_temporary_var): Revert to only set context
	when in a function decl.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/pr114856.h: New test.
	* g++.dg/modules/pr114856_a.H: New test.
	* g++.dg/modules/pr114856_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
2024-05-08 09:16:40 +10:00
Andrew Pinski
4421d35167 c++/c-common: Fix convert_vector_to_array_for_subscript for qualified vector types [PR89224]
After r7-987-gf17a223de829cb, the access for the elements of a vector type would lose the qualifiers.
So if we had `constvector[0]`, the type of the element of the array would not have const on it.
This was due to a missing build_qualified_type for the inner type of the vector when building the array type.
Adding back the call to build_qualified_type gives the access the correct qualifiers, so
overload resolution, and even whether the result is an lvalue or an rvalue, is now handled correctly.

Note that we now correctly reject the testcase gcc.dg/pr83415.c, which was incorrectly accepted after r7-987-gf17a223de829cb.

Built and tested for aarch64-linux-gnu.

	PR c++/89224

gcc/c-family/ChangeLog:

	* c-common.cc (convert_vector_to_array_for_subscript): Call build_qualified_type
	for the inner type.

gcc/cp/ChangeLog:

	* constexpr.cc (cxx_eval_array_reference): Compare main variants
	for the vector/array types instead of the types directly.

gcc/testsuite/ChangeLog:

	* g++.dg/torture/vector-subaccess-1.C: New test.
	* gcc.dg/pr83415.c: Change warning to error.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-05-07 15:42:36 -07:00
Andrew Pinski
c9dd853680 DCE __cxa_atexit calls where the function is pure/const [PR19661]
In C++ you sometimes have a destructor function which is "empty", for
example with unions or with arrays.  The front-end might not know it is empty either,
so this should be done during optimization.
To implement it I added it to DCE where we mark if a statement is necessary or not.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Changes since v1:
  * v2: Add support for __aeabi_atexit for arm-*eabi. Add extra comments.
        Add cxa_atexit-5.C testcase for -fPIC case.
  * v3: Fix testcases for the __aeabi_atexit (forgot to do in the v2).

	PR tree-optimization/19661

gcc/ChangeLog:

	* tree-ssa-dce.cc (is_cxa_atexit): New function.
	(is_removable_cxa_atexit_call): New function.
	(mark_stmt_if_obviously_necessary): Don't mark removable
	cxa_atexit calls.
	(mark_all_reaching_defs_necessary_1): Likewise.
	(propagate_necessity): Likewise.

gcc/testsuite/ChangeLog:

	* g++.dg/tree-ssa/cxa_atexit-1.C: New test.
	* g++.dg/tree-ssa/cxa_atexit-2.C: New test.
	* g++.dg/tree-ssa/cxa_atexit-3.C: New test.
	* g++.dg/tree-ssa/cxa_atexit-4.C: New test.
	* g++.dg/tree-ssa/cxa_atexit-5.C: New test.
	* g++.dg/tree-ssa/cxa_atexit-6.C: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-05-07 14:46:37 -07:00
Andrew Pinski
e472527c7b MATCH: Add some more value_replacement simplifications (a != 0 ? expr : 0) to match
This adds a few more of what is currently done in phiopt's value_replacement
to match. I noticed this when I was hooking up phiopt's value_replacement
code to use match and disabling the old code. But this can be done
independently of hooking up phiopt's value_replacement, as phiopt
is already hooked up for simplified versions.

/* a != 0 ? a / b : 0  -> a / b iff b is nonzero. */
/* a != 0 ? a * b : 0 -> a * b */
/* a != 0 ? a & b : 0 -> a & b */

We prefer the `cond ? a : 0` forms to allow optimization of `a * cond` which
uses that form.
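
As a sanity check of the three identities listed above, here is a small
brute-force sketch in C.  It is not part of the patch or its testcase;
the function name patterns_agree and the idea of checking a small range
are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Compare the conditional form with the simplified form for every
   pair (a, b) in [lo, hi].  Division is only checked for b != 0,
   matching the "iff b is nonzero" condition on that pattern.  The
   range is expected to be small so that a * b cannot overflow.  */
static bool
patterns_agree (int lo, int hi)
{
  assert (lo <= hi);
  for (int a = lo; a <= hi; a++)
    for (int b = lo; b <= hi; b++)
      {
	if (b != 0 && (a != 0 ? a / b : 0) != a / b)
	  return false;
	if ((a != 0 ? a * b : 0) != a * b)
	  return false;
	if ((a != 0 ? (a & b) : 0) != (a & b))
	  return false;
      }
  return true;
}
```

In each case the false arm is taken only when a is 0, and the simplified
expression also evaluates to 0 there, which is why the condition can be
dropped.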

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

	PR tree-optimization/114894

gcc/ChangeLog:

	* match.pd (`a != 0 ? a / b : 0`): New pattern.
	(`a != 0 ? a * b : 0`): New pattern.
	(`a != 0 ? a & b : 0`): New pattern.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/phi-opt-value-5.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-05-07 14:41:15 -07:00
Jeff Law
9f14f19782 [committed][RISC-V] Turn on overlap_op_by_pieces for generic-ooo tuning
Per quick email exchange with Palmer.  Given the triviality, I'm just pushing
it.

gcc/
	* config/riscv/riscv.cc (generic_ooo_tune_info): Turn on
	overlap_op_by_pieces.
2024-05-07 15:35:21 -06:00
Christoph Müllner
300393484d [committed] [RISC-V] Allow uarchs to set TARGET_OVERLAP_OP_BY_PIECES_P
This is almost exclusively work from the VRULL team.

As we've discussed in the Tuesday meeting in the past, we'd like to have a knob
in the tuning structure to indicate that overlapped stores during
move_by_pieces expansion of memcpy & friends are acceptable.

This patch adds that capability in our tuning structure.  It's off for all
the uarchs upstream, but we have been using it inside Ventana for our uarch
with success.  So technically it's NFC upstream, but puts in the infrastructure
multiple organizations likely need.
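
To illustrate what an overlapped by-pieces expansion means, here is a
minimal sketch in C.  It is not what the RISC-V backend generates: the
function name copy_4_to_8_overlapping is hypothetical, and memcpy of four
bytes stands in for a single 32-bit load or store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy N bytes, 4 <= N <= 8, using exactly two 4-byte moves.  The
   second move covers the last four bytes, so for N < 8 it overlaps
   the first one instead of falling back to 2-byte and 1-byte moves.  */
static void
copy_4_to_8_overlapping (void *dst, const void *src, size_t n)
{
  assert (n >= 4 && n <= 8);
  uint32_t head, tail;
  memcpy (&head, src, 4);
  memcpy (&tail, (const unsigned char *) src + n - 4, 4);
  memcpy (dst, &head, 4);
  memcpy ((unsigned char *) dst + n - 4, &tail, 4);
}
```

For a 7-byte copy this writes bytes 0-3 and then bytes 3-6, storing
byte 3 twice, whereas a non-overlapping expansion would need 4-, 2- and
1-byte moves.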

gcc/

	* config/riscv/riscv.cc (struct riscv_tune_param): Add new
	"overlap_op_by_pieces" field.
	(rocket_tune_info, sifive_7_tune_info): Set it.
	(sifive_p400_tune_info, sifive_p600_tune_info): Likewise.
	(thead_c906_tune_info, xiangshan_nanhu_tune_info): Likewise.
	(generic_ooo_tune_info, optimize_size_tune_info): Likewise.
	(riscv_overlap_op_by_pieces): New function.
	(TARGET_OVERLAP_OP_BY_PIECES_P): define.

gcc/testsuite/

	* gcc.target/riscv/memcpy-nonoverlapping.c: New test.
	* gcc.target/riscv/memset-nonoverlapping.c: New test.
2024-05-07 15:17:16 -06:00
Jakub Jelinek
17458d2bc7 c++: Implement C++26 P2893R3 - Variadic friends [PR114459]
The following patch implements the C++26 P2893R3 - Variadic friends
paper.  The paper allows for the friend type declarations to specify
more than one friend type specifier and allows to specify ... at
the end of each.  The patch doesn't introduce tentative parsing of
friend-type-declaration non-terminal, but rather just extends existing
parsing where it is a friend declaration which ends with ; after the
declaration specifiers to the cases where it ends with ...; or , or ...,
In that case it pedwarns for cxx_dialect < cxx26, handles the ... and
if there is , continues in a loop to parse the further friend type
specifiers.

2024-05-07  Jakub Jelinek  <jakub@redhat.com>

	PR c++/114459
gcc/c-family/
	* c-cppbuiltin.cc (c_cpp_builtins): Predefine
	__cpp_variadic_friend=202403L for C++26.
gcc/cp/
	* parser.cc (cp_parser_member_declaration): Implement C++26
	P2893R3 - Variadic friends.  Parse friend type declarations
	with ... or with more than one friend type specifier.
	* friend.cc (make_friend_class): Allow TYPE_PACK_EXPANSION.
	* pt.cc (instantiate_class_template): Handle PACK_EXPANSION_P
	in friend classes.
gcc/testsuite/
	* g++.dg/cpp26/feat-cxx26.C (__cpp_variadic_friend): Add test.
	* g++.dg/cpp26/variadic-friend1.C: New test.
2024-05-07 22:38:01 +02:00
Jakub Jelinek
28ee13db2e expansion: Use __trunchfbf2 calls rather than __extendhfbf2 [PR114907]
The HF and BF modes have the same size/precision and neither is
a subset nor superset of the other.
So, using either __extendhfbf2 or __trunchfbf2 is weird.
The expansion apparently emits __extendhfbf2, but on the libgcc side
we apparently have __trunchfbf2 implemented.

I think it is easier to switch to using what is available rather than
adding new entrypoints to libgcc, even alias, because this is backportable.

2024-05-07  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/114907
	* expr.cc (convert_mode_scalar): Use trunc_optab rather than
	sext_optab for HF->BF conversions.
	* optabs-libfuncs.cc (gen_trunc_conv_libfunc): Likewise.

	* gcc.dg/pr114907.c: New test.
2024-05-07 21:30:21 +02:00
Jakub Jelinek
d4e25cf4f7 tree-inline: Remove .ASAN_MARK calls when inlining functions into no_sanitize callers [PR114956]
In r9-5742 we've started allowing to inline always_inline functions into
functions which have disabled e.g. address sanitization even when the
always_inline function is implicitly from command line options sanitized.

This mostly works fine because most of the asan instrumentation is done only
late after ipa, but as the following testcase shows, the .ASAN_MARK ifn calls
the gimplifier adds can result in ICEs.

Fixed by dropping those during inlining, similarly to how we drop
.TSAN_FUNC_EXIT calls.

2024-05-07  Jakub Jelinek  <jakub@redhat.com>

	PR sanitizer/114956
	* tree-inline.cc: Include asan.h.
	(copy_bb): Remove also .ASAN_MARK calls if id->dst_fn has asan/hwasan
	sanitization disabled.

	* gcc.dg/asan/pr114956.c: New test.
2024-05-07 21:29:14 +02:00
Marek Polacek
7887d80887 c++: DECL_DECOMPOSITION_P cleanup
DECL_DECOMPOSITION_P already checks VAR_P but we repeat the check
in a lot of places.

gcc/cp/ChangeLog:

	* decl.cc (duplicate_decls): Don't check VAR_P before
	DECL_DECOMPOSITION_P.
	* init.cc (build_aggr_init): Likewise.
	* parser.cc (cp_parser_range_for): Likewise.
	(do_range_for_auto_deduction): Likewise.
	(cp_convert_range_for): Likewise.
	(cp_convert_omp_range_for): Likewise.
	(cp_finish_omp_range_for): Likewise.
	* pt.cc (extract_locals_r): Likewise.
	(tsubst_omp_for_iterator): Likewise.
	(tsubst_decomp_names): Likewise.
	(tsubst_stmt): Likewise.
	* typeck.cc (maybe_warn_about_returning_address_of_local): Likewise.
2024-05-07 15:09:20 -04:00
Gaius Mulley
76e591200f PR modula2/114133 bugfix constants must be cast prior to vararg call
This bug fix corrects the test codes below by converting the constant
literals to the type required by C.  In the testcases below the values, 1
etc were converted into the INTEGER type before being passed to a C
vararg function.  By default in modula2 constant literal ordinals are
represented as the ZTYPE (the largest GCC integer type node).

gcc/testsuite/ChangeLog:

	PR modula2/114133
	* gm2/extensions/run/pass/callingc10.mod: Convert constant
	literal numbers into INTEGER.
	* gm2/extensions/run/pass/callingc11.mod: Ditto.
	* gm2/extensions/run/pass/vararg2.mod: Ditto.
	* gm2/iso/run/pass/packed.mod: Emit a printf as a runtime
	diagnostic.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-05-07 19:24:08 +01:00
Jeff Law
1139f38e79 [RISC-V] [PATCH v2] Enable inlining str* by default
So with Christoph's patches from late 2022 we've had the ability to inline
strlen, and str[n]cmp (scalar).  However, we never actually turned this
capability on by default!

This patch flips those defaults to allow inlining by default.  It also
fixes one bug exposed by our internal testing when NBYTES is zero for strncmp.
I don't think that case happens enough to try and optimize it, we just disable
inline expansion for that instance.

This has been bootstrapped and regression tested on rv64gc at various times as
well as cross tested on rv64gc more times than I can probably count (we've had
this patch internally for a while).  More importantly, I just successfully
tested it on rv64gc and rv32gcv elf configurations with the trunk.

gcc/

	* config/riscv/riscv-string.cc (riscv_expand_strcmp): Do not inline
	strncmp with zero size.
	(emit_strcmp_scalar_compare_subword): Adjust rotation for rv32 vs rv64.
	* config/riscv/riscv.opt (var_inline_strcmp): Enable by default.
	(vriscv_inline_strncmp, riscv_inline_strlen): Likewise.

gcc/testsuite

	* gcc.target/riscv/zbb-strlen-disabled-2.c: Turn off inlining.
2024-05-07 11:43:09 -06:00
Zac Walker
d6d7afcdbc Add aarch64-w64-mingw32 target to libgcc
Reuse MinGW definitions from i386 for libgcc. Move reused files to
libgcc/config/mingw folder.

libgcc/ChangeLog:

	* config.host: Add aarch64-w64-mingw32 target. Adjust targets
	after moving MinGW files.
	* config/i386/t-gthr-win32: Move to...
	* config/mingw/t-gthr-win32: ...here.
	* config/i386/t-mingw-pthread: Move to...
	* config/mingw/t-mingw-pthread: ...here.
	* config/aarch64/t-no-eh: New file. EH is not yet implemented for
	the target, and the default definition should be disabled.
2024-05-07 16:02:35 +00:00
Zac Walker
0c23efc04b aarch64: Add aarch64-w64-mingw32 target to libatomic
libatomic/ChangeLog:

	* configure.tgt: Add aarch64-w64-mingw32 target.
2024-05-07 16:02:35 +00:00
Zac Walker
10a2f11b41 aarch64: Build and add objects for Cygwin and MinGW for AArch64
gcc/ChangeLog:

	* config.gcc: Build and add objects for Cygwin and MinGW. Add Cygwin
	and MinGW options to the target.
2024-05-07 16:02:34 +00:00
Zac Walker
e8d003736e Rename "x86 Windows Options" to "Cygwin and MinGW Options"
Rename "x86 Windows Options" to "Cygwin and MinGW Options".
It will be used also for AArch64.

gcc/ChangeLog:

	* config/i386/mingw-w64.opt.urls: Rename options' name and
	regenerate option URLs.
	* config/lynx.opt.urls: Likewise.
	* config/mingw/cygming.opt.urls: Likewise.
	* config/mingw/mingw.opt.urls: Likewise.
	* doc/invoke.texi: Likewise.
2024-05-07 16:02:34 +00:00
Zac Walker
38e422e2ef aarch64: Add SEH to machine_function
SEH is not enabled in aarch64-w64-mingw32 target yet. However, it
needs to be declared in machine_function for reusing winnt.cc.

gcc/ChangeLog:

	* config/aarch64/aarch64.h (struct seh_frame_state): Declare SEH
	structure in machine_function.
	(GTY): Add SEH field.
2024-05-07 16:02:34 +00:00
Zac Walker
565b782bfa aarch64: Add Cygwin and MinGW environments for AArch64
Define Cygwin and MinGW environment such as types, SEH definitions,
shared libraries, etc.

gcc/ChangeLog:

	* config.gcc: Add Cygwin and MinGW definitions.
	* config/aarch64/aarch64-protos.h
	(mingw_pe_maybe_record_exported_symbol): Declare functions
	which are used in Cygwin and MinGW environment.
	(mingw_pe_section_type_flags): Likewise.
	(mingw_pe_unique_section): Likewise.
	(mingw_pe_encode_section_info): Likewise.
	* config/aarch64/cygming.h: New file.
2024-05-07 16:02:34 +00:00
Zac Walker
de2bcdaf39 Exclude i386 functionality from aarch64 build
This patch defines TARGET_AARCH64_MS_ABI in config.gcc and uses it to
exclude i386 functionality from aarch64 build and adjust MinGW headers
for AArch64 MS ABI.

gcc/ChangeLog:

	* config.gcc: Define TARGET_AARCH64_MS_ABI.
	* config/mingw/mingw-stdint.h (INTPTR_TYPE): Use
	TARGET_AARCH64_MS_ABI to adjust MinGW headers for
	AArch64 MS ABI.
	(UINTPTR_TYPE): Likewise.
	(defined): Likewise.
	* config/mingw/mingw32.h (DEFAULT_ABI): Likewise.
	(defined): Likewise.
	* config/mingw/winnt.cc (defined): Use TARGET_AARCH64_MS_ABI to
	exclude ix86_get_callcvt.
	(i386_pe_maybe_mangle_decl_assembler_name): Likewise.
	(i386_pe_mangle_decl_assembler_name): Likewise.
2024-05-07 16:02:34 +00:00
Zac Walker
99d7d5ec8d Rename section and encoding functions from i386 which will be used in aarch64
gcc/ChangeLog:

	* config/i386/cygming.h (SUBTARGET_ENCODE_SECTION_INFO):
	Rename functions in mingw folder which will be reused for
	aarch64.
	(TARGET_ASM_UNIQUE_SECTION): Likewise.
	(TARGET_ASM_NAMED_SECTION): Likewise.
	(TARGET_SECTION_TYPE_FLAGS): Likewise.
	(ASM_DECLARE_COLD_FUNCTION_NAME): Likewise.
	(ASM_OUTPUT_EXTERNAL_LIBCALL): Likewise.
	* config/i386/i386-protos.h (i386_pe_unique_section):
	Rename into ...
	(mingw_pe_unique_section): ... this.
	(i386_pe_declare_function_type): Rename into ...
	(mingw_pe_declare_function_type): ... this.
	(i386_pe_encode_section_info): Rename into ...
	(mingw_pe_encode_section_info): ... this.
	(i386_pe_maybe_record_exported_symbol): Rename into ...
	(mingw_pe_maybe_record_exported_symbol): ... this.
	(i386_pe_section_type_flags): Rename into ...
	(mingw_pe_section_type_flags): ... this.
	(i386_pe_asm_named_section): Rename into ...
	(mingw_pe_asm_named_section): ... this.
	* config/mingw/winnt.cc (i386_pe_encode_section_info):
	Rename into ...
	(mingw_pe_encode_section_info): ... this.
	(i386_pe_unique_section): Rename into ...
	(mingw_pe_unique_section): ... this.
	(i386_pe_section_type_flags): Rename into ...
	(mingw_pe_section_type_flags): ... this.
	(i386_pe_asm_named_section): Rename into ...
	(mingw_pe_asm_named_section): ... this.
	(i386_pe_asm_output_aligned_decl_common): Likewise.
	(i386_pe_declare_function_type): Rename into ...
	(mingw_pe_declare_function_type): ... this.
	(i386_pe_maybe_record_exported_symbol): Rename into ...
	(mingw_pe_maybe_record_exported_symbol): ... this.
	(i386_pe_start_function): Likewise.
	* varasm.cc (switch_to_comdat_section): Likewise.
2024-05-07 16:02:33 +00:00
Zac Walker
1f05dfc131 Reuse MinGW from i386 for AArch64
This patch creates a new config/mingw directory to share MinGW
related definitions, and moves there the corresponding existing files
from config/i386.

gcc/ChangeLog:

	* config.gcc: Adjust targets after moving MinGW related files
	from i386 to mingw folder.
	* config/i386/cygming.opt: Move to...
	* config/mingw/cygming.opt: ...here.
	* config/i386/cygming.opt.urls: Move to...
	* config/mingw/cygming.opt.urls: ...here.
	* config/i386/cygwin-d.cc: Move to...
	* config/mingw/cygwin-d.cc: ...here.
	* config/i386/mingw-stdint.h: Move to...
	* config/mingw/mingw-stdint.h: ...here.
	* config/i386/mingw.opt: Move to...
	* config/mingw/mingw.opt: ...here.
	* config/i386/mingw.opt.urls: Move to...
	* config/mingw/mingw.opt.urls: ...here.
	* config/i386/mingw32.h: Move to...
	* config/mingw/mingw32.h: ...here.
	* config/i386/msformat-c.cc: Move to...
	* config/mingw/msformat-c.cc: ...here.
	* config/i386/t-cygming: Move to...
	* config/mingw/t-cygming: ...here and updated.
	* config/i386/winnt-cxx.cc: Move to...
	* config/mingw/winnt-cxx.cc: ...here.
	* config/i386/winnt-d.cc: Move to...
	* config/mingw/winnt-d.cc: ...here.
	* config/i386/winnt-stubs.cc: Move to...
	* config/mingw/winnt-stubs.cc: ...here.
	* config/i386/winnt.cc: Move to...
	* config/mingw/winnt.cc: ...here.
2024-05-07 16:02:33 +00:00
Zac Walker
21fbaa1a2d aarch64: Add aarch64-w64-mingw32 COFF
Define ASM specific for COFF format on AArch64.

gcc/ChangeLog:

	* config.gcc: Add COFF format support definitions.
	* config/aarch64/aarch64-coff.h: New file.
2024-05-07 16:02:33 +00:00
Zac Walker
b9415046fa aarch64: Mark x18 register as a fixed register for MS ABI
Define the MS ABI for aarch64-w64-mingw32.
Adjust FIXED_REGISTERS, CALL_REALLY_USED_REGISTERS and
STATIC_CHAIN_REGNUM for AArch64 MS ABI.
The X18 register is reserved on Windows for the TEB.

gcc/ChangeLog:

	* config.gcc: Define TARGET_AARCH64_MS_ABI when
	AArch64 MS ABI is used.
	* config/aarch64/aarch64.h (FIXED_X18): Adjust
	FIXED_REGISTERS, CALL_REALLY_USED_REGISTERS and
	STATIC_CHAIN_REGNUM for AArch64 MS ABI.
	(CALL_USED_X18): Likewise.
	(FIXED_REGISTERS): Likewise.
	* config/aarch64/aarch64-abi-ms.h: New file.
2024-05-07 16:02:33 +00:00
Zac Walker
13bad1ac7a Introduce aarch64-w64-mingw32 target
Add the initial aarch64-w64-mingw32 target for gcc.

This is the first commit in a sequence of patch series to add the
new aarch64-w64-mingw32 target.

Coauthors: Zac Walker <zacwalker@microsoft.com>,
Mark Harmstone <mark@harmstone.com>  and
Ron Riddle <ron.riddle@microsoft.com>

Refactored, prepared, and validated by
Radek Barton <radek.barton@microsoft.com> and
Evgeny Karpov <evgeny.karpov@microsoft.com>

fixincludes/ChangeLog:

	* mkfixinc.sh: Extend for *-mingw32* targets.

gcc/ChangeLog:

	* config.gcc: Add aarch64-w64-mingw32 target.
2024-05-07 16:02:33 +00:00
Wolfgang Hospital
8d2c93fcfe AVR: target/114835 - Tweak popcountqi2
libgcc/
	PR target/114835
	* config/avr/lib1funcs.S (__popcountqi2): Use code that
	is one instruction shorter / faster.
2024-05-07 16:32:07 +02:00
Jonathan Wakely
3f04f3939e libstdc++: Fix handling of incomplete UTF-8 sequences in _Unicode_view
Eddie Nolan reported to me that _Unicode_view was not correctly
implementing the substitution of ill-formed subsequences with U+FFFD,
due to failing to increment the counter when the iterator reaches the
end of the sequence before a multibyte sequence is complete.  As a
result, the incomplete sequence was not completely consumed, and then
the remaining character was treated as another ill-formed sequence,
giving two U+FFFD characters instead of one.

To avoid similar mistakes in future, this change introduces a lambda
that increments the iterator and the counter together. This ensures the
counter is always incremented when the iterator is incremented, so that
we always know how many characters have been consumed.
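
The idea of tying the iterator and the counter together can be sketched
in C.  This is not the libstdc++ implementation: struct reader, advance
and read_utf8 are hypothetical names, and the decoder is simplified (it
does not reject overlong encodings or surrogates).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct reader
{
  const unsigned char *it, *end;
  size_t consumed;	/* Bytes consumed so far.  */
};

/* The only place where the iterator moves, so the count cannot drift.  */
static void
advance (struct reader *r)
{
  ++r->it;
  ++r->consumed;
}

/* Decode one code point from a non-empty reader.  Ill-formed or
   truncated input yields U+FFFD, with every byte read counted.  */
static uint32_t
read_utf8 (struct reader *r)
{
  assert (r->it != r->end);
  unsigned char lead = *r->it;
  uint32_t c;
  int extra;
  advance (r);
  if (lead < 0x80)
    return lead;
  else if ((lead & 0xE0) == 0xC0)
    extra = 1, c = lead & 0x1F;
  else if ((lead & 0xF0) == 0xE0)
    extra = 2, c = lead & 0x0F;
  else if ((lead & 0xF8) == 0xF0)
    extra = 3, c = lead & 0x07;
  else
    return 0xFFFD;	/* Stray continuation or invalid lead byte.  */
  while (extra-- > 0)
    {
      if (r->it == r->end || (*r->it & 0xC0) != 0x80)
	return 0xFFFD;	/* Incomplete sequence, consumed as one unit.  */
      c = (c << 6) | (*r->it & 0x3F);
      advance (r);
    }
  return c;
}
```

With the truncated input E2 82, a single call returns U+FFFD and leaves
the reader at the end with consumed == 2, so no second U+FFFD is
produced.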

libstdc++-v3/ChangeLog:

	* include/bits/unicode.h (_Unicode_view::_M_read_utf8): Ensure
	count of characters consumed is correct when the end of the
	input is reached unexpectedly.
	* testsuite/ext/unicode/view.cc: Test incomplete UTF-8
	sequences.
2024-05-07 14:47:50 +01:00
Jonathan Wakely
9927059bb8 libstdc++: Fix <memory> for -std=c++23 -ffreestanding [PR114866]
std::shared_ptr isn't declared for freestanding, so guard uses of it
with #if _GLIBCXX_HOSTED in <bits/out_ptr.h>.

libstdc++-v3/ChangeLog:

	PR libstdc++/114866
	* include/bits/out_ptr.h [!_GLIBCXX_HOSTED]: Don't refer to
	shared_ptr, __shared_ptr or __is_shared_ptr.
	* testsuite/20_util/headers/memory/114866.cc: New test.
2024-05-07 14:47:49 +01:00
Jonathan Wakely
6709e35457 libstdc++: Simplify std::variant comparison operators
libstdc++-v3/ChangeLog:

	* include/std/variant (_VARIANT_RELATION_FUNCTION_TEMPLATE):
	Simplify.
2024-05-07 14:44:36 +01:00
Alex Coplan
74690ff96b aarch64: Preserve mem info on change of base for ldp/stp [PR114674]
The ldp/stp fusion pass can change the base of an access so that the two
accesses end up using a common base register.  So far we have been using
adjust_address_nv to do this, but this means that we don't preserve
other properties of the mem we're replacing.  It seems better to use
replace_equiv_address_nv, as this will preserve e.g. the MEM_ALIGN of the
mem whose address we're changing.

The PR shows that by adjusting the other mem we lose alignment
information about the original access and therefore end up rejecting an
otherwise viable pair when --param=aarch64-stp-policy=aligned is passed.
This patch fixes that by using replace_equiv_address_nv instead.

Notably this is the same approach as taken by
aarch64_check_consecutive_mems when a change of base is required, so
this at least makes things more consistent between the ldp fusion pass
and the peepholes.

gcc/ChangeLog:

	PR target/114674
	* config/aarch64/aarch64-ldp-fusion.cc (ldp_bb_info::fuse_pair):
	Use replace_equiv_address_nv on a change of base instead of
	adjust_address_nv on the other access.

gcc/testsuite/ChangeLog:

	PR target/114674
	* gcc.target/aarch64/pr114674.c: New test.
2024-05-07 14:42:08 +01:00
Richard Biener
bf10f0db20 Fix block index check in insert_updated_phi_nodes_for
This replaces a >= 0 block index check with the appropriate NUM_FIXED_BLOCKS
check; the old check dates from the times when ENTRY_BLOCK was negative.

	* tree-into-ssa.cc (insert_updated_phi_nodes_for): Fix block
	index check.
2024-05-07 15:28:55 +02:00
Richard Biener
cb4e2685a3 Avoid re-allocating vector
The following avoids re-allocating the var map BB vector by
pre-allocating it to the exact size needed when operating on the
whole function.

	* tree-ssa-live.cc (init_var_map): Pre-allocate vec_bbs vector
	to the correct size and use quick_push.
2024-05-07 15:28:55 +02:00
Jonathan Wakely
b72e7addf8 libstdc++: Constrain equality ops for std::pair, std::tuple, std::variant
Implement the changes from P2944R3 which add constraints to the
comparison operators of std::pair, std::tuple, and std::variant.

The paper also changes std::optional, but we already constrain its
comparisons using SFINAE on the return type. However, we need some
additional constraints on the [optional.comp.with.t] operators that
compare an optional with a value. The paper doesn't say to do that, but
I think it's needed because otherwise when the comparison for two
optional objects fails its constraints, the two overloads that are
supposed to be for comparing to a non-optional become the best overload
candidates, but are ambiguous (and we don't even get as far as checking
the constraints for satisfaction). I reported LWG 4072 for this.

The paper does not change std::expected, but probably should have done.
I'll submit an LWG issue about that and implement it separately.

Also add [[nodiscard]] to all these comparison operators.

libstdc++-v3/ChangeLog:

	* include/bits/stl_pair.h (operator==): Add constraint.
	* include/bits/version.def (constrained_equality): Define.
	* include/bits/version.h: Regenerate.
	* include/std/optional: Define feature test macro.
	(__optional_rep_op_t): Use is_convertible_v instead of
	is_convertible.
	* include/std/tuple: Define feature test macro.
	(operator==, __tuple_cmp, operator<=>): Reimplement C++20
	comparisons using lambdas. Add constraints.
	* include/std/utility: Define feature test macro.
	* include/std/variant: Define feature test macro.
	(_VARIANT_RELATION_FUNCTION_TEMPLATE): Add constraints.
	(variant): Remove unnecessary friend declarations for comparison
	operators.
	* testsuite/20_util/optional/relops/constrained.cc: New test.
	* testsuite/20_util/pair/comparison_operators/constrained.cc:
	New test.
	* testsuite/20_util/tuple/comparison_operators/constrained.cc:
	New test.
	* testsuite/20_util/variant/relops/constrained.cc: New test.
	* testsuite/20_util/tuple/comparison_operators/overloaded.cc:
	Disable for C++20 and later.
	* testsuite/20_util/tuple/comparison_operators/overloaded2.cc:
	Remove dg-error line for target c++20.
2024-05-07 13:46:11 +01:00
Jonathan Wakely
9ebd123432 libstdc++: Use https instead of http in some comments
libstdc++-v3/ChangeLog:

	* include/backward/auto_ptr.h: Use https for URL in comment.
	* include/bits/basic_ios.h: Likewise.
	* include/std/iostream: Likewise.
2024-05-07 13:46:11 +01:00
Jonathan Wakely
6e25ca387f libstdc++: Update ABI test to disallow adding to released symbol versions
If we update the list of "active" symbol versions now, rather than when
adding a new symbol version, we will notice if new symbols get added to
the wrong version (as in PR 114692).

libstdc++-v3/ChangeLog:

	* testsuite/util/testsuite_abi.cc: Update latest versions to
	new versions that should be used in future.
2024-05-07 13:46:11 +01:00
Richard Biener
bed6ec161b middle-end/27800 - avoid unnecessary temporary during gimplification
This avoids a temporary when gimplifying reg = a ? b : c, re-using
the LHS of an assignment if that's a register.

	PR middle-end/27800
	* gimplify.cc (gimplify_modify_expr_rhs): For a COND_EXPR
	avoid a temporary from gimplify_cond_expr when the LHS is
	a register by pushing the assignment into the COND_EXPR arms.

	* gcc.dg/pr27800.c: New testcase.
2024-05-07 14:19:35 +02:00
Richard Biener
cc2f3e408e Remove redundant check
operand_equal_p already has checking code to verify the hash
is equal, avoid doing that again in gimplify_hasher::equal.

	* gimplify.cc (gimplify_hasher::equal): Remove redundant
	checking.
2024-05-07 14:19:34 +02:00
Stefan Schulze Frielinghaus
e1f56c67a8 tree-optimization/110490 - bitcount for narrow modes
Bitcount operations popcount, clz, and ctz are emulated for narrow modes
in case an operation is only supported for wider modes.  Besides that, ctz
may be emulated via clz in expand_ctz.  Reflect this in
expression_expensive_p.

I considered the emulation of ctz via clz as not expensive since this
basically reduces to ctz (x) = c - (clz (x & -x)) where c is the mode
precision minus 1, which should be faster than a loop.

gcc/ChangeLog:

	PR tree-optimization/110490
	* tree-scalar-evolution.cc (expression_expensive_p): Also
	consider mode widening for popcount, clz, and ctz.
2024-05-07 14:12:55 +02:00
Richard Biener
c69eda94f2 Use unsigned for stack var indexes during RTL expansion
We currently use size_t for these indexes but also store them in
bitmaps, which only support unsigned int indexes.  The following makes
them unsigned int throughout, saving memory as well.

	* cfgexpand.cc (stack_var::representative): Use 'unsigned'
	for stack var indexes instead of 'size_t'.
	(stack_var::next): Likewise.
	(EOC): Likewise.
	(stack_vars_alloc): Likewise.
	(stack_vars_num): Likewise.
	(decl_to_stack_part): Likewise.
	(stack_vars_sorted): Likewise.
	(add_stack_var): Likewise.
	(add_stack_var_conflict): Likewise.
	(stack_var_conflict_p): Likewise.
	(visit_op): Likewise.
	(visit_conflict): Likewise.
	(add_scope_conflicts_1): Likewise.
	(stack_var_cmp): Likewise.
	(part_hashmap): Likewise.
	(update_alias_info_with_stack_vars): Likewise.
	(union_stack_vars): Likewise.
	(partition_stack_vars): Likewise.
	(dump_stack_var_partition): Likewise.
	(expand_stack_vars): Likewise.
	(account_stack_vars): Likewise.
	(stack_protect_decl_phase_1): Likewise.
	(stack_protect_decl_phase_2): Likewise.
	(asan_decl_phase_3): Likewise.
	(init_vars_expansion): Likewise.
	(estimated_stack_frame_size): Likewise.
2024-05-07 13:14:56 +02:00
Rainer Orth
35b05a02de build: Derive object names in make_sunver.pl
The recent move of libgfortran object files to subdirs and the resulting
breakage of libgfortran.so symbol exports demonstrated how fragile
deriving object and archive names from their libtool counterparts in the
Makefiles is.  Therefore, this patch moves that step into
make_sunver.pl, considerably simplifying the Makefile rules to create
the version scripts.

Bootstrapped without regressions on i386-pc-solaris2.11 and
sparc-sun-solaris2.11, verifying that the version scripts are identical
except for the input filenames.

2024-05-06  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	contrib:
	* make_sunver.pl: Use File::Basename;
	Skip -lLIB args.
	Convert libtool object/archive names to underlying
	objects/archives.

	libatomic:
	* Makefile.am [LIBAT_BUILD_VERSIONED_SHLIB_SUN]
	(libatomic.map-sun): Pass $(libatomic_la_OBJECTS),
	$(libatomic_la_LIBADD) to make_sunver.pl unmodified.
	* Makefile.in: Regenerate.

	libffi:
	* Makefile.am [LIBFFI_BUILD_VERSIONED_SHLIB_SUN] (libffi.map-sun):
	Pass $(libffi_la_OBJECTS), $(libffi_la_LIBADD) to make_sunver.pl
	unmodified.
	* Makefile.in: Regenerate.

	libgfortran:
	* Makefile.am [LIBGFOR_USE_SYMVER_SUN] (gfortran.ver-sun): Pass
	$(libgfortran_la_OBJECTS), $(libgfortran_la_LIBADD) to
	make_sunver.pl unmodified.
	* Makefile.in: Regenerate.

	libgomp:
	* Makefile.am [LIBGOMP_BUILD_VERSIONED_SHLIB_SUN]
	(libgomp.ver-sun): Pass $(libgomp_la_OBJECTS),
	$(libgomp_la_LIBADD) to make_sunver.pl unmodified.
	* Makefile.in: Regenerate.

	libitm:
	* Makefile.am [LIBITM_BUILD_VERSIONED_SHLIB_SUN] (libitm.map-sun):
	Pass $(libitm_la_OBJECTS), $(libitm_la_LIBADD) to make_sunver.pl
	unmodified.
	* Makefile.in: Regenerate.

	libquadmath:
	* Makefile.am [LIBQUAD_USE_SYMVER_SUN] (quadmath.map-sun): Pass
	$(libquadmath_la_OBJECTS), $(libquadmath_la_LIBADD) to
	make_sunver.pl unmodified.
	* Makefile.in: Regenerate.

	libssp:
	* Makefile.am [LIBSSP_USE_SYMVER_SUN] (ssp.map-sun): Pass
	$(libssp_la_OBJECTS), $(libssp_la_LIBADD) to make_sunver.pl
	unmodified.
	* Makefile.in: Regenerate.

	libstdc++-v3:
	* src/Makefile.am [ENABLE_SYMVERS_SUN]
	(libstdc++-symbols.ver-sun): Pass $(libstdc___la_OBJECTS),
	$(libstdc___la_LIBADD) to make_sunver.pl unmodified.
	* src/Makefile.in: Regenerate.
2024-05-07 13:14:05 +02:00
Richard Biener
b09c2e9560 middle-end/114931 - type_hash_canon and structural equality types
TYPE_STRUCTURAL_EQUALITY_P is part of our type system so we have
to make sure to include it in the type unification done via
type_hash_canon.  This requires the flag to be set before querying
the hash, which is the biggest part of the patch.
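
The requirement can be pictured with a small hash-consing sketch. It uses std::unordered_set, not GCC's type hash table, and TypeKey, KeyHash, KeyEq and canon are hypothetical names. Because the flag takes part in both the hash and the equality test, a key must carry its final flag value before the lookup, otherwise it could be unified with an entry that differs only in that flag.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical stand-in for a type node: an element id plus a flag playing
// the role of TYPE_STRUCTURAL_EQUALITY_P.
struct TypeKey {
    int element_id;
    bool structural_equality;
};

struct KeyHash {
    std::size_t operator()(const TypeKey& k) const noexcept {
        // The flag is mixed into the hash, not only the element id.
        return std::hash<int>{}(k.element_id) * 2u
               + (k.structural_equality ? 1u : 0u);
    }
};

struct KeyEq {
    bool operator()(const TypeKey& a, const TypeKey& b) const noexcept {
        return a.element_id == b.element_id
               && a.structural_equality == b.structural_equality;
    }
};

using TypeTable = std::unordered_set<TypeKey, KeyHash, KeyEq>;

// Returns the canonical entry for k, inserting it if none exists yet.
// The caller must set the flag on k before calling, as the patch requires.
const TypeKey* canon(TypeTable& table, TypeKey k) {
    return &*table.insert(k).first;
}
```

Two keys that differ only in the flag get distinct canonical entries, while repeated lookups of the same key return the same entry.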

	PR middle-end/114931
gcc/
	* tree.cc (type_hash_canon_hash): Hash TYPE_STRUCTURAL_EQUALITY_P.
	(type_cache_hasher::equal): Compare TYPE_STRUCTURAL_EQUALITY_P.
	(build_array_type_1): Set TYPE_STRUCTURAL_EQUALITY_P before
	probing with type_hash_canon.
	(build_function_type): Likewise.
	(build_method_type_directly): Likewise.
	(build_offset_type): Likewise.
	(build_complex_type): Likewise.
	* attribs.cc (build_type_attribute_qual_variant): Likewise.

gcc/c-family/
	* c-common.cc (complete_array_type): Set TYPE_STRUCTURAL_EQUALITY_P
	before probing with type_hash_canon.

gcc/testsuite/
	* gcc.dg/pr114931.c: New testcase.
2024-05-07 13:05:12 +02:00