mirror/gcc - gcc - Alpaca Bi's Private Git Repository

mirror/gcc

mirror of https://github.com/gcc-mirror/gcc.git synced 2024-11-21 13:40:47 +00:00

Author	SHA1	Message	Date
Ian Lance Taylor	93c54caa64	libbacktrace: add cast to avoid warning * print.c (print_syminfo_callback): Add cast to avoid warning.	2024-07-17 17:59:41 -07:00
Patrick Palka	144b6099cd	c++: missing -Wunused-value for !<expr> [PR114104] Here we're neglecting to issue a -Wunused-value warning for suitable ! operator expressions, and in turn for != operator expressions that are rewritten as !(x == y), only because we don't call warn_if_unused_value on TRUTH_NOT_EXPR since its class is tcc_expression. This patch makes us also consider warning for TRUTH_NOT_EXPR and also for ADDR_EXPR. PR c++/114104 gcc/cp/ChangeLog: * cvt.cc (convert_to_void): Call warn_if_unused_value for TRUTH_NOT_EXPR and ADDR_EXPR as well. gcc/testsuite/ChangeLog: * g++.dg/warn/Wunused-20.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>	2024-07-17 20:57:54 -04:00
Patrick Palka	313afcfdab	c++: diagnose failed qualified lookup into current inst When the scope of a qualified name is the current instantiation, and qualified lookup finds nothing at template definition time, then we know it'll find nothing at instantiation time (unless the current instantiation has dependent bases). So such qualified name lookup failure can be diagnosed ahead of time as per [temp.res.general]/6. This patch implements that, for qualified names of the form (where the current instantiation is A<T>): this->non_existent a.non_existent A::non_existent typename A::non_existent It turns out we already optimistically attempt qualified lookup of seemingly every qualified name, even when it's dependently scoped, and then suppress issuing a lookup failure diagnostic after the fact. So implementing this is mostly a matter of restricting the diagnostic suppression to "dependentish" scopes (i.e. dependent scopes or the current instantiation with dependent bases), rather than suppressing for any dependently-typed scope as we currently do. The cp_parser_conversion_function_id change is needed to avoid regressing lookup/using8.C: using A<T>::operator typename A<T>::Nested; When looking up A<T>::Nested we consider it not dependently scoped since we entered A<T> from cp_parser_conversion_function_id earlier. But this A<T> is the implicit instantiation A<T> not the primary template type A<T>, and so the lookup fails which we now diagnose. This patch works around this by not entering the template scope of a qualified conversion function-id in this case, i.e. if we're in an expression vs declaration context, by seeing if the type already went through finish_template_type with entering_scope=true. gcc/cp/ChangeLog: decl.cc (make_typename_type): Restrict name lookup failure punting to dependentish_scope_p instead of dependent_type_p. * error.cc (qualified_name_lookup_error): Improve diagnostic when the scope is the current instantiation. 
* parser.cc (cp_parser_diagnose_invalid_type_name): Likewise. (cp_parser_conversion_function_id): Don't call push_scope on a template scope unless we're in a declaration context. (cp_parser_lookup_name): Restrict name lookup failure punting to dependentish_scope_p instead of dependent_type_p. * semantics.cc (finish_id_expression_1): Likewise. * typeck.cc (finish_class_member_access_expr): Likewise. libstdc++-v3/ChangeLog: * include/experimental/socket (basic_socket_iostream::basic_socket_iostream): Fix typo. * include/tr2/dynamic_bitset (__dynamic_bitset_base::_M_is_proper_subset_of): Likewise. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/alignas18.C: Expect name lookup error for U::X. * g++.dg/cpp0x/forw_enum13.C: Expect name lookup error for D3::A and D4<T>::A. * g++.dg/parse/access13.C: Declare A::E::V to avoid name lookup failure and preserve intent of the test. * g++.dg/parse/enum11.C: Expect extra errors, matching the non-template case. * g++.dg/template/crash123.C: Avoid name lookup failure to preserve intent of the test. * g++.dg/template/crash124.C: Likewise. * g++.dg/template/crash7.C: Adjust expected diagnostics. * g++.dg/template/dtor6.C: Declare A::~A() to avoid name lookup failure and preserve intent of the test. * g++.dg/template/error22.C: Adjust expected diagnostics. * g++.dg/template/static30.C: Avoid name lookup failure to preserve intent of the test. * g++.old-deja/g++.other/decl5.C: Adjust expected diagnostics. * g++.dg/template/non-dependent34.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>	2024-07-17 20:54:14 -04:00
Ian Lance Taylor	30875fa698	libbacktrace: better backtrace_print when no debug info Fixes https://github.com/ianlancetaylor/libbacktrace/issues/59 * print.c (print_syminfo_callback): New static function. (print_callback): Call backtrace_syminfo if there is no function or file name.	2024-07-17 17:39:27 -07:00
GCC Administrator	a922de0a7a	Daily bump.	2024-07-18 00:18:58 +00:00
Ian Lance Taylor	a8b5ce1580	libbacktrace: add notes about dl_iterate_phdr to README * README: Add notes about dl_iterate_phdr.	2024-07-17 17:03:30 -07:00
Jakub Jelinek	3bbc8ea2e3	testsuite: Fix up pr111150* tests on i686-linux [PR111150] The tests FAIL on i686-linux due to unexpected -Wpsabi diagnostics. Fixed as usually by adding -Wno-psabi to dg-options. 2024-07-17 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/111150 * gcc.dg/tree-ssa/pr111150.c: Add -Wno-psabi to dg-options. * g++.dg/tree-ssa/pr111150.C: Likewise.	2024-07-17 23:47:17 +02:00
Jørgen Kvalsvik	ec64666f97	Use foreach, not lmap, for tcl <= 8.5 compat lmap was introduced in tcl 8.6, and while it was released in 2012, lmap does not really make too much of a difference to warrant the friction on conservative (and relevant) systems. gcc/testsuite/ChangeLog: * lib/gcov.exp: Use foreach, not lmap, for tcl <= 8.5 compat.	2024-07-17 23:31:33 +02:00
Richard Sandiford	43a7ece873	rtl-ssa: Fix move range canonicalisation [PR115929] In this PR, canonicalize_move_range walked off the end of a list and triggered a null dereference. There are multiple ways of fixing that, but I think the approach taken in the patch should be relatively efficient. gcc/ PR rtl-optimization/115929 * rtl-ssa/movement.h (canonicalize_move_range): Check for null prev and next insns and create an invalid move range for them. gcc/testsuite/ PR rtl-optimization/115929 * gcc.dg/torture/pr115929-2.c: New test.	2024-07-17 19:38:12 +01:00
Richard Sandiford	71b31690a7	rtl-ssa: Fix split_clobber_group [PR115928] One of the goals of the rtl-ssa representation was to allow a group of consecutive clobbers to be skipped in constant time, with amortised sublinear insertion and deletion. This involves putting consecutive clobbers in groups. Splitting or joining groups would be linear if we had to update every clobber on each update, so the operation to query a clobber's group is lazy and (again) amortised sublinear. This means that, when splitting a group into two, we cannot reuse the old group for one side. We have to invalidate it, so that the lazy clobber_info::group query can tell that something has changed. The ICE in the PR came from failing to do that. gcc/ PR rtl-optimization/115928 * rtl-ssa/accesses.h (clobber_group): Add a new constructor that takes the first, last and root clobbers. * rtl-ssa/internals.inl (clobber_group::clobber_group): Define it. * rtl-ssa/accesses.cc (function_info::split_clobber_group): Use it. Allocate a new group for both sides and invalidate the previous group. (function_info::add_def): After calling split_clobber_group, remove the old group from the splay tree. gcc/testsuite/ PR rtl-optimization/115928 * gcc.dg/torture/pr115928.c: New test.	2024-07-17 19:38:11 +01:00
Richard Sandiford	b19906a029	genattrtab: Drop enum tags, consolidate type names genattrtab printed an "enum" tag before references to attribute enums, but that's redundant in C++. Removing it means that each attribute type becomes a single token and can be easily stored in the attr_desc structure. gcc/ * genattrtab.cc (attr_desc::cxx_type): New field. (write_attr_get, write_attr_value): Use it. (gen_attr, find_attr, make_internal_attr): Initialize it, dropping enum tags.	2024-07-17 19:34:46 +01:00
Marek Polacek	d890b04197	c++: wrong error initializing empty class [PR115900] In r14-409, we started handling empty bases first in cxx_fold_indirect_ref_1 so that we don't need to recurse and waste time. This caused a bogus "modifying a const object" error. I'm appending my analysis from the PR, but basically, cxx_fold_indirect_ref now returns a different object than before, and we mark the wrong thing as const, but since we're initializing an empty object, we should avoid setting the object constness. ~~ Pre-r14-409: we're evaluating the call to C::C(), which is in the body of B::B(), which is the body of D::D(&d): C::C ((struct C *) this, NON_LVALUE_EXPR <0>) It's a ctor so we get here: 3118 /* Remember the object we are constructing or destructing. */ 3119 tree new_obj = NULL_TREE; 3120 if (DECL_CONSTRUCTOR_P (fun) || DECL_DESTRUCTOR_P (fun)) 3121 { 3122 /* In a cdtor, it should be the first `this' argument. 3123 At this point it has already been evaluated in the call 3124 to cxx_bind_parameters_in_call. */ 3125 new_obj = TREE_VEC_ELT (new_call.bindings, 0); new_obj=(struct C *) &d.D.2656 3126 new_obj = cxx_fold_indirect_ref (ctx, loc, DECL_CONTEXT (fun), new_obj); new_obj=d.D.2656.D.2597 We proceed to evaluate the call, then we get here: 3317 /* At this point, the object's constructor will have run, so 3318 the object is no longer under construction, and its possible 3319 'const' semantics now apply. Make a note of this fact by 3320 marking the CONSTRUCTOR TREE_READONLY. */ 3321 if (new_obj && DECL_CONSTRUCTOR_P (fun)) 3322 cxx_set_object_constness (ctx, new_obj, /*readonly_p=*/true, 3323 non_constant_p, overflow_p); new_obj is still d.D.2656.D.2597, its type is "C", cxx_set_object_constness doesn't set anything as const. This is fine. 
After r14-409: on line 3125, new_obj is (struct C *) &d.D.2656 as before, but we go to cxx_fold_indirect_ref_1: 5739 if (is_empty_class (type) 5740 && CLASS_TYPE_P (optype) 5741 && lookup_base (optype, type, ba_any, NULL, tf_none, off)) 5742 { 5743 if (empty_base) 5744 *empty_base = true; 5745 return op; type is C, which is an empty class; optype is "const D", and C is a base of D. So we return the VAR_DECL 'd'. Then we get to cxx_set_object_constness with object=d, which is const, so we mark the constructor READONLY. Then we're evaluating A::A() which has ((A*)this)->data = 0; we evaluate the LHS to d.D.2656.a, for which the initializer is {.D.2656={.a={.data=}}} which is TREE_READONLY and 'd' is const, so we think we're modifying a const object and fail the constexpr evaluation. PR c++/115900 gcc/cp/ChangeLog: * constexpr.cc (cxx_eval_call_expression): Set new_obj to NULL_TREE if cxx_fold_indirect_ref set empty_base to true. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/constexpr-init23.C: New test.	2024-07-17 13:55:17 -04:00
Edwin Lu	5bb01e91d4	RISC-V: Fix testcase missing arch attribute The C + F extension implies the zcf extension on rv32. Add missing zcf extension for the rv32 target. gcc/testsuite/ChangeLog: * gcc.target/riscv/target-attr-16.c: Update expected assembly Signed-off-by: Edwin Lu <ewlu@rivosinc.com>	2024-07-17 10:19:23 -07:00
Eikansh Gupta	44fcc1ca11	MATCH: Simplify (a ? x : y) eq/ne (b ? x : y) [PR111150] This patch adds match pattern for `(a ? x : y) eq/ne (b ? x : y)`. In forwprop1 pass, depending on the type of `a` and `b`, GCC produces `vec_cond` or `cond_expr`. Based on the observation that `(x != y)` is TRUE, the pattern can be optimized to produce `(a^b ? TRUE : FALSE)`. The patch adds match pattern for a, b: (a ? x : y) != (b ? x : y) --> (a^b) ? TRUE : FALSE (a ? x : y) == (b ? x : y) --> (a^b) ? FALSE : TRUE (a ? x : y) != (b ? y : x) --> (a^b) ? FALSE : TRUE (a ? x : y) == (b ? y : x) --> (a^b) ? TRUE : FALSE PR tree-optimization/111150 gcc/ChangeLog: * match.pd (`(a ? x : y) eq/ne (b ? x : y)`): New pattern. (`(a ? x : y) eq/ne (b ? y : x)`): New pattern. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr111150.c: New test. * gcc.dg/tree-ssa/pr111150-1.c: New test. * g++.dg/tree-ssa/pr111150.C: New test. Signed-off-by: Eikansh Gupta <quic_eikagupt@quicinc.com>	2024-07-17 09:58:11 -07:00
Andrew Pinski	7c3287f361	Add debug counter for ext_dce Like r15-1610-gb6215065a5b143 (which adds one for late_combine), adding one for ext_dce is useful to debug some issues with this pass. Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: * dbgcnt.def (ext_dce): New debug counter. * ext-dce.cc (ext_dce_try_optimize_insn): Reject the insn if the debug counter says so. (ext_dce): Rename to ... (ext_dce_execute): This. (pass_ext_dce::execute): Update for the name of ext_dce. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>	2024-07-17 09:40:10 -07:00
Uros Bizjak	0841fd4c42	alpha: Fix duplicate !tlsgd!62 assemble error [PR115526] Add missing "cannot_copy" attribute to instructions that have to stay in 1-1 correspondence with another insn. PR target/115526 gcc/ChangeLog: * config/alpha/alpha.md (movdi_er_high_g): Add cannot_copy attribute. (movdi_er_tlsgd): Ditto. (movdi_er_tlsldm): Ditto. (call_value_osf_<tls>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/alpha/pr115526.c: New test.	2024-07-17 18:12:38 +02:00
Mark Wielaard	3412b6e994	Regenerate c.opt.urls The addition of -Wunterminated-string-initialization should have regenerated the c.opt.urls file. Fixes: `44c9403ed1` ("c, objc: Add -Wunterminated-string-initialization") gcc/c-family/ChangeLog: * c.opt.urls: Regenerate.	2024-07-17 18:06:31 +02:00
Georg-Johann Lay	e21fef7da9	AVR: target/90616 - Improve adding constants that are 0 mod 256. This patch introduces a new insn that works as an insn combine pattern for (plus:HI (zero_extend:HI (reg:QI)) (const_0mod256_operand:HI)) which requires at most 2 instructions. When the input register operand is already in HImode, the addhi3 printer only adds the hi8 part when it sees a SYMBOL_REF or CONST aligned to at least 256 bytes. (The CONST_INT case was already handled). gcc/ PR target/90616 * config/avr/predicates.md (const_0mod256_operand): New predicate. * config/avr/constraints.md (Cp8): New constraint. * config/avr/avr.md (aligned_add_symbol): New insn. * config/avr/avr.cc (avr_out_plus_symbol) [HImode]: When op2 is a multiple of 256, there is no need to add / subtract the lo8 part. (avr_rtx_costs_1) [PLUS && HImode]: Return expected costs for new insn *aligned_add_symbol as it applies.	2024-07-17 17:37:10 +02:00
Jakub Jelinek	5104fe4c78	bitint: Use gsi_insert_on_edge rather than gsi_insert_on_edge_immediate [PR115887] The following testcase ICEs on x86_64-linux, because we try to gsi_insert_on_edge_immediate a statement on an edge which already has statements queued with gsi_insert_on_edge, and the deferral has been intentional so that we don't need to deal with cfg changes in between. The following patch uses the delayed insertion as well. 2024-07-17 Jakub Jelinek <jakub@redhat.com> PR middle-end/115887 * gimple-lower-bitint.cc (gimple_lower_bitint): Use gsi_insert_on_edge instead of gsi_insert_on_edge_immediate and set edge_insertions to true. * gcc.dg/bitint-108.c: New test.	2024-07-17 17:32:21 +02:00
Jakub Jelinek	d8a75353dd	varasm: Shorten assembly of strings with larger zero regions When not using .base64 directive, we emit for long sequences of zeros .string "foobarbaz" .string "" .string "" .string "" .string "" .string "" .string "" .string "" .string "" .string "" .string "" .string "" .string "" The following patch changes that to .string "foobarbaz" .zero 12 It keeps emitting .string "" if there is just one zero or two zeros where the first one is preceded by non-zeros, so we can have .string "foobarbaz" .string "" or .base64 "VG8gYmUgb3Igbm90IHRvIGJlLCB0aGF0IGlzIHRoZSBxdWVzdGlvbg==" .string "" but not 2 .string "" in a row. On a testcase I have with around 310440 0-255 unsigned char character constants mostly derived from cc1plus start but with too long sequences of 0s which broke transformation to STRING_CST adjusted to have at most 126 consecutive 0s, I see: 1504498 bytes long assembly without this patch on i686-linux (without .base64 support in binutils) 1155071 bytes long assembly with this patch on i686-linux (without .base64 support in binutils) 431390 bytes long assembly without this patch on x86_64-linux (with .base64 support in binutils) 427593 bytes long assembly with this patch on x86_64-linux (with .base64 support in binutils) All 4 assemble to identical .o file when using x86_64-linux .base64 supporting gas, and the former 2 when using older x86_64-linux gas assemble to identical content as well. 2024-07-17 Jakub Jelinek <jakub@redhat.com> varasm.cc (default_elf_asm_output_ascii): Use ASM_OUTPUT_SKIP instead of 2 or more default_elf_asm_output_limited_string (f, "") calls and adjust base64 heuristics correspondingly.	2024-07-17 17:30:24 +02:00
Tamar Christina	0135a90de5	middle-end: fix 0 offset creation and folding [PR115936] As shown in PR115936 SCEV and IVOPTS create an invalid IV when the IV is a pointer type: ivtmp.39_65 = ivtmp.39_59 + 0B; where the IVs are DI mode and the offset is a pointer. This comes from this weird candidate: Candidate 8: Var befor: ivtmp.39_59 Var after: ivtmp.39_65 Incr POS: before exit test IV struct: Type: sizetype Base: 0 Step: 0B Biv: N Overflowness wrto loop niter: No-overflow This IV was always created; it just ended up not being used. This is created by SCEV. simple_iv_with_niters in the case where no CHREC is found creates an IV with base == ev, offset == 0; however in this case EV is a POINTER_PLUS_EXPR and so the type is a pointer. It ends up creating an unusable expression. gcc/ChangeLog: PR tree-optimization/115936 * tree-scalar-evolution.cc (simple_iv_with_niters): Use sizetype for pointers.	2024-07-17 16:22:14 +01:00
Patrick Palka	247335823f	c++: constrained partial spec type context [PR111890] maybe_new_partial_specialization wasn't propagating TYPE_CONTEXT when creating a new class type corresponding to a constrained partial spec, which do_friend relies on via template_class_depth to distinguish a template friend from a non-template friend, and so in the below testcase we were incorrectly instantiating the non-template operator+ as if it were a template leading to an ICE. PR c++/111890 gcc/cp/ChangeLog: * pt.cc (maybe_new_partial_specialization): Propagate TYPE_CONTEXT to the newly created partial specialization. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-partial-spec15.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>	2024-07-17 11:08:35 -04:00
Feng Xue	db3c8c9726	vect: Optimize order of lane-reducing operations in loop def-use cycles When transforming multiple lane-reducing operations in a loop reduction chain, originally, corresponding vectorized statements are generated into def-use cycles starting from 0. The def-use cycle with smaller index, would contain more statements, which means more instruction dependency. For example: int sum = 1; for (i) { sum += d0[i] * d1[i]; // dot-prod <vector(16) char> sum += w[i]; // widen-sum <vector(16) char> sum += abs(s0[i] - s1[i]); // sad <vector(8) short> sum += n[i]; // normal <vector(4) int> } Original transformation result: for (i / 16) { sum_v0 = DOT_PROD (d0_v0[i: 0 ~ 15], d1_v0[i: 0 ~ 15], sum_v0); sum_v1 = sum_v1; // copy sum_v2 = sum_v2; // copy sum_v3 = sum_v3; // copy sum_v0 = WIDEN_SUM (w_v0[i: 0 ~ 15], sum_v0); sum_v1 = sum_v1; // copy sum_v2 = sum_v2; // copy sum_v3 = sum_v3; // copy sum_v0 = SAD (s0_v0[i: 0 ~ 7 ], s1_v0[i: 0 ~ 7 ], sum_v0); sum_v1 = SAD (s0_v1[i: 8 ~ 15], s1_v1[i: 8 ~ 15], sum_v1); sum_v2 = sum_v2; // copy sum_v3 = sum_v3; // copy ... } For a higher instruction parallelism in final vectorized loop, an optimal means is to make those effective vector lane-reducing ops be distributed evenly among all def-use cycles. Transformed as the below, DOT_PROD, WIDEN_SUM and SADs are generated into disparate cycles, instruction dependency among them could be eliminated. for (i / 16) { sum_v0 = DOT_PROD (d0_v0[i: 0 ~ 15], d1_v0[i: 0 ~ 15], sum_v0); sum_v1 = sum_v1; // copy sum_v2 = sum_v2; // copy sum_v3 = sum_v3; // copy sum_v0 = sum_v0; // copy sum_v1 = WIDEN_SUM (w_v1[i: 0 ~ 15], sum_v1); sum_v2 = sum_v2; // copy sum_v3 = sum_v3; // copy sum_v0 = sum_v0; // copy sum_v1 = sum_v1; // copy sum_v2 = SAD (s0_v2[i: 0 ~ 7 ], s1_v2[i: 0 ~ 7 ], sum_v2); sum_v3 = SAD (s0_v3[i: 8 ~ 15], s1_v3[i: 8 ~ 15], sum_v3); ... 
} 2024-03-22 Feng Xue <fxue@os.amperecomputing.com> gcc/ PR tree-optimization/114440 * tree-vectorizer.h (struct _stmt_vec_info): Add a new field reduc_result_pos. * tree-vect-loop.cc (vect_transform_reduction): Generate lane-reducing statements in an optimized order.	2024-07-17 21:54:06 +08:00
Feng Xue	178cc41951	vect: Support multiple lane-reducing operations for loop reduction [PR114440] For lane-reducing operations (dot-prod/widen-sum/sad) in loop reduction, the current vectorizer could only handle the pattern if the reduction chain does not contain other operation, no matter the other is normal or lane-reducing. This patch removes some constraints in reduction analysis to allow multiple arbitrary lane-reducing operations with mixed input vectypes in a loop reduction chain. For example: int sum = 1; for (i) { sum += d0[i] * d1[i]; // dot-prod <vector(16) char> sum += w[i]; // widen-sum <vector(16) char> sum += abs(s0[i] - s1[i]); // sad <vector(8) short> } The vector size is 128-bit, vectorization factor is 16. Reduction statements would be transformed as: vector<4> int sum_v0 = { 0, 0, 0, 1 }; vector<4> int sum_v1 = { 0, 0, 0, 0 }; vector<4> int sum_v2 = { 0, 0, 0, 0 }; vector<4> int sum_v3 = { 0, 0, 0, 0 }; for (i / 16) { sum_v0 = DOT_PROD (d0_v0[i: 0 ~ 15], d1_v0[i: 0 ~ 15], sum_v0); sum_v1 = sum_v1; // copy sum_v2 = sum_v2; // copy sum_v3 = sum_v3; // copy sum_v0 = WIDEN_SUM (w_v0[i: 0 ~ 15], sum_v0); sum_v1 = sum_v1; // copy sum_v2 = sum_v2; // copy sum_v3 = sum_v3; // copy sum_v0 = SAD (s0_v0[i: 0 ~ 7 ], s1_v0[i: 0 ~ 7 ], sum_v0); sum_v1 = SAD (s0_v1[i: 8 ~ 15], s1_v1[i: 8 ~ 15], sum_v1); sum_v2 = sum_v2; // copy sum_v3 = sum_v3; // copy } sum_v = sum_v0 + sum_v1 + sum_v2 + sum_v3; // = sum_v0 + sum_v1 2024-03-22 Feng Xue <fxue@os.amperecomputing.com> gcc/ PR tree-optimization/114440 * tree-vectorizer.h (vectorizable_lane_reducing): New function declaration. * tree-vect-stmts.cc (vect_analyze_stmt): Call new function vectorizable_lane_reducing to analyze lane-reducing operation. * tree-vect-loop.cc (vect_model_reduction_cost): Remove cost computation code related to emulated_mixed_dot_prod. (vectorizable_lane_reducing): New function. (vectorizable_reduction): Allow multiple lane-reducing operations in loop reduction. 
Move some original lane-reducing related code to vectorizable_lane_reducing. (vect_transform_reduction): Adjust comments with updated example. gcc/testsuite/ PR tree-optimization/114440 * gcc.dg/vect/vect-reduc-chain-1.c * gcc.dg/vect/vect-reduc-chain-2.c * gcc.dg/vect/vect-reduc-chain-3.c * gcc.dg/vect/vect-reduc-chain-dot-slp-1.c * gcc.dg/vect/vect-reduc-chain-dot-slp-2.c * gcc.dg/vect/vect-reduc-chain-dot-slp-3.c * gcc.dg/vect/vect-reduc-chain-dot-slp-4.c * gcc.dg/vect/vect-reduc-dot-slp-1.c	2024-07-17 21:54:05 +08:00
Feng Xue	8b59fa9d8c	vect: Refit lane-reducing to be normal operation Vector stmts number of an operation is calculated based on output vectype. This is over-estimated for lane-reducing operation, which would cause vector def/use mismatched when we want to support loop reduction mixed with lane-reducing and normal operations. One solution is to refit lane-reducing to make it behave like a normal one, by adding new pass-through copies to fix possible def/use gap. And resultant superfluous statements could be optimized away after vectorization. For example: int sum = 1; for (i) { sum += d0[i] * d1[i]; // dot-prod <vector(16) char> } The vector size is 128-bit, vectorization factor is 16. Reduction statements would be transformed as: vector<4> int sum_v0 = { 0, 0, 0, 1 }; vector<4> int sum_v1 = { 0, 0, 0, 0 }; vector<4> int sum_v2 = { 0, 0, 0, 0 }; vector<4> int sum_v3 = { 0, 0, 0, 0 }; for (i / 16) { sum_v0 = DOT_PROD (d0_v0[i: 0 ~ 15], d1_v0[i: 0 ~ 15], sum_v0); sum_v1 = sum_v1; // copy sum_v2 = sum_v2; // copy sum_v3 = sum_v3; // copy } sum_v = sum_v0 + sum_v1 + sum_v2 + sum_v3; // = sum_v0 2024-07-02 Feng Xue <fxue@os.amperecomputing.com> gcc/ * tree-vect-loop.cc (vect_reduction_update_partial_vector_usage): Calculate effective vector stmts number with generic vect_get_num_copies. (vect_transform_reduction): Insert copies for lane-reducing so as to fix over-estimated vector stmts number. (vect_transform_cycle_phi): Calculate vector PHI number only based on output vectype. * tree-vect-slp.cc (vect_slp_analyze_node_operations_1): Remove adjustment on vector stmts number specific to slp reduction.	2024-07-17 21:54:05 +08:00
Feng Xue	e7fbae834f	vect: Add a unified vect_get_num_copies for slp and non-slp Extend original vect_get_num_copies (pure loop-based) to calculate number of vector stmts for slp node regarding a generic vect region. 2024-07-12 Feng Xue <fxue@os.amperecomputing.com> gcc/ * tree-vectorizer.h (vect_get_num_copies): New overload function. * tree-vect-slp.cc (vect_slp_analyze_node_operations_1): Calculate number of vector stmts for slp node with vect_get_num_copies. (vect_slp_analyze_node_operations): Calculate number of vector elements for constant/external slp node with vect_get_num_copies.	2024-07-17 21:54:05 +08:00
Richard Biener	24689b84b8	tree-optimization/115959 - ICE with SLP condition reduction The following fixes how during reduction epilogue generation we gather conditional compares for condition reductions, thereby following the reduction chain via STMT_VINFO_REDUC_IDX. The issue is that SLP nodes for COND_EXPRs can have either three or four children dependent on whether we have legacy GENERIC expressions in the transitional pattern GIMPLE for the COND_EXPR condition. PR tree-optimization/115959 * tree-vect-loop.cc (vect_create_epilog_for_reduction): Get at the REDUC_IDX child in a safer way for COND_EXPR nodes. * gcc.dg/vect/pr115959.c: New testcase.	2024-07-17 13:19:41 +02:00
Jakub Jelinek	2790800c61	testsuite: Add dg-do run to another test This is another test which clearly has been written with the assumption that it will be executed, but it isn't. It works fine when it is executed on both x86_64-linux and i686-linux. 2024-07-17 Jakub Jelinek <jakub@redhat.com> * c-c++-common/torture/builtin-convertvector-1.c: Add dg-do run directive.	2024-07-17 11:40:58 +02:00
Jakub Jelinek	74bcef4cf1	varasm: Fix bootstrap after the .base64 changes [PR115958] Apparently there is a -Wsign-compare warning if ptrdiff_t has precision of int, then (t - s + 1 + 2) / 3 * 4 has int type while cnt unsigned int. This doesn't warn if ptrdiff_t has larger precision, say on x86_64 it is 64-bit and so (t - s + 1 + 2) / 3 * 4 has long type and cnt unsigned int. And it doesn't warn when using older binutils (in my tests I've used new binutils on x86_64 and old binutils on i686). Anyway, earlier condition guarantees that t - s is at most 256-ish and t >= s by construction, so we can just cast it to (unsigned) to avoid the warning. 2024-07-17 Jakub Jelinek <jakub@redhat.com> PR other/115958 * varasm.cc (default_elf_asm_output_ascii): Cast t - s to unsigned to avoid -Wsign-compare warnings.	2024-07-17 11:40:03 +02:00
Jakub Jelinek	8b5919bae1	gimple-fold: Fix up __builtin_clear_padding lowering [PR115527] The builtin-clear-padding-6.c testcase fails as clear_padding_type doesn't correctly recompute the buf->size and buf->off members after expanding clearing of an array using a runtime loop. buf->size should be in that case the offset after which it should continue with next members or padding before them modulo UNITS_PER_WORD and buf->off that offset minus buf->size. That is what the code was doing, but with off being the start of the loop cleared array, not its end. So, the last hunk in gimple-fold.cc fixes that. When adding the testcase, I've noticed that the c-c++-common/torture/builtin-clear-padding-* tests, although clearly written as runtime tests to test the builtins at runtime, didn't have { dg-do run } directive and were just compile tests because of that. When adding that to the tests, builtin-clear-padding-1.c was already failing without that clear_padding_type hunk too, but builtin-clear-padding-5.c was still failing even after the change. That is due to a bug in clear_padding_flush which the patch fixes as well - when clear_padding_flush is called with full=true (that happens at the end of the whole __builtin_clear_padding or on those array padding clears done by a runtime loop), it wants to flush all the pending padding clearings rather than just some. If it is at the end of the whole object, it decreases wordsize when needed to make sure the code never writes including RMW cycles to something outside of the object: if ((unsigned HOST_WIDE_INT) (buf->off + i + wordsize) > (unsigned HOST_WIDE_INT) buf->sz) { gcc_assert (wordsize > 1); wordsize /= 2; i -= wordsize; continue; } but if it is full==true flush in the middle, this doesn't happen, but we still process just the buffer bytes before the current end. If that end is not on a wordsize boundary, e.g. 
on the builtin-clear-padding-5.c test the last chunk is 2 bytes, '\0', '\xff', i is 16 and end is 18, nonzero_last might be equal to the end - i, i.e. 2 here, but still all_ones might be true, so in some spots we just didn't emit any clearing in that last chunk. 2024-07-17 Jakub Jelinek <jakub@redhat.com> PR middle-end/115527 * gimple-fold.cc (clear_padding_flush): Introduce endsize variable and use it instead of wordsize when comparing it against nonzero_last. (clear_padding_type): Increment off by sz. * c-c++-common/torture/builtin-clear-padding-1.c: Add dg-do run directive. * c-c++-common/torture/builtin-clear-padding-2.c: Likewise. * c-c++-common/torture/builtin-clear-padding-3.c: Likewise. * c-c++-common/torture/builtin-clear-padding-4.c: Likewise. * c-c++-common/torture/builtin-clear-padding-5.c: Likewise. * c-c++-common/torture/builtin-clear-padding-6.c: New test.	2024-07-17 11:38:33 +02:00
Haochen Gui	ecc2c3cb72	rs6000: Remove redundant guard for float128 mode pattern gcc/ * config/rs6000/rs6000.md (mov<mode>cc, mov<mode>cc_p10, mov<mode>cc_invert_p10, fpmask<mode>, xxsel<mode>, @ieee_128bit_vsx_abs<mode>2, ieee_128bit_vsx_nabs<mode>2, add<mode>3, sub<mode>3, mul<mode>3, div<mode>3, sqrt<mode>2, copysign<mode>3, copysign<mode>3_hard, copysign<mode>3_soft, @neg<mode>2_hw, @abs<mode>2_hw, nabs<mode>2_hw, fma<mode>4_hw, fms<mode>4_hw, nfma<mode>4_hw, nfms<mode>4_hw, extend<SFDF:mode><IEEE128:mode>2_hw, trunc<mode>df2_hw, trunc<mode>sf2_hw, fix<uns>_<IEEE128:mode><SDI:mode>2_hw, fix<uns>_trunc<IEEE128:mode><QHI:mode>2, fix<uns>_trunc<IEEE128:mode><QHSI:mode>2_mem, float_<mode>di2_hw, float_<mode>si2_hw, float<QHI:mode><IEEE128:mode>2, floatuns_<mode>di2_hw, floatuns_<mode>si2_hw, floatuns<QHI:mode><IEEE128:mode>2, floor<mode>2, ceil<mode>2, btrunc<mode>2, round<mode>2, add<mode>3_odd, sub<mode>3_odd, mul<mode>3_odd, div<mode>3_odd, sqrt<mode>2_odd, fma<mode>4_odd, fms<mode>4_odd, nfma<mode>4_odd, nfms<mode>4_odd, trunc<mode>df2_odd, cmp<mode>_hw for IEEE128): Remove guard FLOAT128_IEEE_P. (@extenddf<mode>2_fprs, @extenddf<mode>2_vsx, trunc<mode>df2_internal1, trunc<mode>df2_internal2, fix_trunc_helper<mode>, neg<mode>2, cmp<mode>_internal1, cmp<IBM128:mode>_internal2 for IBM128): Remove guard FLOAT128_IBM_P.	2024-07-17 14:49:00 +08:00
Kewen Lin	dd4d71ca4d	rs6000: Change optab for ibm128 and ieee128 conversion Currently for 128 bit floating-point ibm128 and ieee128 formats conversion, the corresponding libcalls are: ibm128 -> ieee128 "__trunctfkf2" ieee128 -> ibm128 "__extendkftf2" , and generic code handling (like convert_mode_scalar) also adopts sext_optab for ieee128 -> ibm128 while trunc_optab for ibm128 -> ieee128. But in rs6000 port as function rs6000_expand_float128_convert and init_float128_ieee show, we adopt sext_optab for ibm128 -> ieee128 with "__trunctfkf2" while trunc_optab for ieee128 -> ibm128 with "__extendkftf2". To make them consistent and avoid some surprises, this patch is to adjust rs6000 internal handlings by adopting trunc_optab for ibm128 -> ieee128 with "__trunctfkf2" while sext_optab for ieee128 -> ibm128 with "__extendkftf2". gcc/ChangeLog: * config/rs6000/rs6000.cc (init_float128_ieee): Use trunc_optab rather than sext_optab for converting FLOAT128_IBM_P mode to FLOAT128_IEEE_P mode, and use sext_optab rather than trunc_optab for converting FLOAT128_IEEE_P mode to FLOAT128_IBM_P mode. (rs6000_expand_float128_convert): Likewise.	2024-07-17 00:19:30 -05:00
Kewen Lin	b5c813ed60	tree: Remove KFmode workaround [PR112993] The fix for PR112993 makes KFmode have 128 bit mode precision, we don't need this workaround to fix up the type precision any more, and just go with mode precision. So this patch is to remove KFmode workaround. PR target/112993 gcc/ChangeLog: * tree.cc (build_common_tree_nodes): Drop the workaround for rs6000 KFmode precision adjustment.	2024-07-17 00:19:00 -05:00
Kewen Lin	fa86f510f5	ranger: Revert the workaround introduced in PR112788 [PR112993] This reverts commit r14-6478-gfda8e2f8292a90 "range: Workaround different type precision between _Float128 and long double [PR112788]" as the fixes for PR112993 make all 128 bits scalar floating point have the same 128 bit precision, this workaround isn't needed any more. PR target/112993 gcc/ChangeLog: * value-range.h (range_compatible_p): Remove the workaround on different type precision between _Float128 and long double.	2024-07-17 00:17:42 -05:00
Kewen Lin	de6969fd31	fortran: Teach get_real_kind_from_node for Power 128 fp modes [PR112993] Previously effective target fortran_real_c_float128 never passes on Power regardless of the default 128 long double is ibmlongdouble or ieeelongdouble. It's due to that TF mode is always used for kind 16 real, which has precision 127, while the node float128_type_node for c_float128 has 128 type precision, get_real_kind_from_node can't find a matching as it only checks gfc_real_kinds[i].mode_precision and type precision. With changing TFmode/IFmode/KFmode to have the same mode precision 128, now fortran_real_c_float128 can pass with ieeelongdouble enabled by default and test cases guarded with it get tested accordingly. But with ibmlongdouble enabled by default, since TFmode has precision 128 which is the same as type precision 128 of float128_type_node, get_real_kind_from_node considers kind for TFmode matches float128_type_node, but it's wrong as at this time point TFmode is with ibm extended format. So this patch is to teach get_real_kind_from_node to check one more field which can be differentiable from the underlying real format, it can avoid the unexpected matching when more than one mode has the same precision. PR target/112993 gcc/fortran/ChangeLog: * trans-types.cc (get_real_kind_from_node): Consider the case where more than one modes have the same precision.	2024-07-17 00:16:59 -05:00
Kewen Lin	33dca0a4c1	rs6000: Make all 128 bit scalar FP modes have 128 bit precision [PR112993] On rs6000, there are three 128 bit scalar floating point modes TFmode, IFmode and KFmode. With some historical reasons, we defines them with different mode precisions, that is KFmode 126, TFmode 127 and IFmode 128. But in fact all of them should have the same mode precision 128, this special setting has caused some issues like some unexpected failures mentioned in [1] and also made us have to introduce some workarounds, such as: the workaround in build_common_tree_nodes for KFmode 126, the workaround in range_compatible_p for same mode but different precision issue. This patch is to make these three 128 bit scalar floating point modes TFmode, IFmode and KFmode have 128 bit mode precision, and keep the order same as previous in order to make machine independent parts of the compiler not try to widen IFmode to TFmode. Besides, build_common_tree_nodes adopts the newly added hook mode_for_floating_type so we don't need to worry about unexpected mode for long double type node. In function convert_mode_scalar, with the proposed change, it adopts sext_optab for converting ieee128 format mode to ibm128 format mode while trunc_optab for converting ibm128 format mode to ieee128 format mode. Thus this patch removes useless extend and trunc optab supports, supplements new define_expands expandkftf2 and trunctfkf2 to align with convert_mode_scalar implementation. It also unnames two define_insn_and_split to avoid conflicts and make them more clear. Considering the current implementation that there is no chance to have KF <-> IF conversion (since either of them would be TF already), it adds two dummy define_expands to assert this. 
[1] https://inbox.sourceware.org/gcc-patches/718677e7-614d-7977-312d-05a75e1fd5b4@linux.ibm.com/ PR target/112993 gcc/ChangeLog: * config/rs6000/rs6000-modes.def (IFmode, KFmode, TFmode): Define with FLOAT_MODE instead of FRACTIONAL_FLOAT_MODE, don't use special precisions any more. (rs6000-modes.h): Remove include. * config/rs6000/rs6000-modes.h: Remove. * config/rs6000/rs6000.h (rs6000-modes.h): Remove include. * config/rs6000/t-rs6000: Remove rs6000-modes.h include. * config/rs6000/rs6000.cc (rs6000_option_override_internal): Replace all uses of FLOAT_PRECISION_TFmode with 128. (rs6000_c_mode_for_floating_type): Likewise. * config/rs6000/rs6000.md (define_expand extendiftf2): Remove. (define_expand extendifkf2): Remove. (define_expand extendtfkf2): Remove. (define_expand trunckftf2): Remove. (define_expand trunctfif2): Remove. (define_expand extendtfif2): Add new assertion. (define_expand expandkftf2): New. (define_expand trunciftf2): Add new assertion. (define_expand trunctfkf2): New. (define_expand truncifkf2): Change with gcc_unreachable. (define_expand expandkfif2): New. (define_insn_and_split extendkftf2): Rename to ... (define_insn_and_split extendkftf2): ... this. (define_insn_and_split trunctfkf2): Rename to ... (define_insn_and_split extendtfkf2): ... this.	2024-07-17 00:14:43 -05:00
Kewen Lin	3f6e6d4b40	expr: Allow same precision modes conversion between {ibm_extended, ieee_quad}_format With some historical reasons, rs6000 defines KFmode, TFmode and IFmode to have different mode precisions, but it causes some issues and needs some workarounds such as PR112993. So we are going to make all rs6000 128 bit scalar FP modes have 128 bit precision. Be prepared for that, this patch is to make function convert_mode_scalar allow same precision FP modes conversion if their underlying formats are ibm_extended_format and ieee_quad_format respectively, just like the existing special treatment on arm_bfloat_half_format <-> ieee_half_format. It also factors out all the relevant checks into a lambda function. Besides, similar to ieee fp16 -> bfloat conversion, it adopts trunc_optab rather than sext_optab for ibm128 to ieee128 conversion. PR target/112993 gcc/ChangeLog: * expr.cc (convert_mode_scalar): Allow same precision conversion between scalar floating point modes if whose underlying format is ibm_extended_format or ieee_quad_format, and refactor assertion with new lambda function acceptable_same_precision_modes. Use trunc_optab rather than sext_optab for ibm128 to ieee128 conversion. * optabs-libfuncs.cc (gen_trunc_conv_libfunc): Use trunc_optab rather than sext_optab for ibm128 to ieee128 conversion.	2024-07-17 00:14:18 -05:00
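The same-precision check described in 3f6e6d4b40 can be pictured with a minimal sketch. The `Format` and `Optab` enums and both function names below are hypothetical stand-ins for illustration only; they are not GCC's `real_format` objects or its optab API, and the real `acceptable_same_precision_modes` lambda in `expr.cc` may be structured differently.

```cpp
#include <cassert>

// Hypothetical stand-ins for the floating-point formats named in the commit.
enum class Format { ieee_half, arm_bfloat_half, ibm_extended, ieee_quad, ieee_double };

// Hypothetical stand-in for the two conversion optabs mentioned in the commit.
enum class Optab { sext, trunc };

// Two modes of equal precision may be converted only when their formats form
// one of the special pairs: bfloat <-> IEEE half, or IBM extended <-> IEEE quad.
inline bool acceptable_same_precision_formats(Format from, Format to) {
  auto is_pair = [&](Format a, Format b) {
    return (from == a && to == b) || (from == b && to == a);
  };
  return is_pair(Format::arm_bfloat_half, Format::ieee_half)
      || is_pair(Format::ibm_extended, Format::ieee_quad);
}

// For the 128-bit pair, the commits in this series describe trunc for
// ibm128 -> ieee128 and sext for ieee128 -> ibm128.
inline Optab optab_for_128bit_conversion(Format from, Format to) {
  assert((from == Format::ibm_extended && to == Format::ieee_quad)
         || (from == Format::ieee_quad && to == Format::ibm_extended));
  (void) to;  // only needed by the assertion above
  return from == Format::ibm_extended ? Optab::trunc : Optab::sext;
}
```

With these definitions, `optab_for_128bit_conversion(Format::ibm_extended, Format::ieee_quad)` yields `Optab::trunc`, and a pair such as `ieee_double`/`ieee_quad` is rejected by the check.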
Ian Lance Taylor	f438299ef6	libbacktrace: update xcoff.c for base_address changes * xcoff.c (struct xcoff_fileline_data): Change base_address field to struct libbacktrace_base_address. (xcoff_initialize_syminfo): Change base_address to struct libbacktrace_base_address. Use libbacktrace_add_base. (xcoff_initialize_fileline): Likewise. (xcoff_lookup_pc): Use libbacktrace_add_base. (xcoff_add): Change base_address to struct libbacktrace_base_address. (xcoff_armem_add, xcoff_add_shared_libs): Likewise. (backtrace_initialize): Likewise. * Makefile.am (xcoff.lo): Remove unused target. (xcoff_32.lo, xcoff_64.lo): New targets. * Makefile.in: Regenerate.	2024-07-16 21:27:05 -07:00
Peter Bergner	6f2bab9b5d	rs6000: Error on CPUs and ABIs that don't support the ROP protection insns [PR114759] We currently silently ignore the -mrop-protect option for old CPUs we don't support with the ROP hash insns, but we throw an error for unsupported ABIs. This patch treats unsupported CPUs and ABIs similarly by throwing an error for both. This matches clang behavior and allows us to simplify our tests in the code that generates our prologue and epilogue code. 2024-06-26 Peter Bergner <bergner@linux.ibm.com> gcc/ PR target/114759 * config/rs6000/rs6000.cc (rs6000_option_override_internal): Disallow CPUs and ABIs that do not support the ROP protection insns. * config/rs6000/rs6000-logue.cc (rs6000_stack_info): Remove now unneeded tests. (rs6000_emit_prologue): Likewise. Remove unneeded gcc_assert. (rs6000_emit_epilogue): Likewise. * config/rs6000/rs6000.md: Likewise. gcc/testsuite/ PR target/114759 * gcc.target/powerpc/pr114759-3.c: New test.	2024-07-16 20:35:25 -05:00
Peter Bergner	a05c3d23d1	rs6000: ROP - Emit hashst and hashchk insns on Power8 and later [PR114759] We currently only emit the ROP-protect hash* insns for Power10, where the insns were added to the architecture. We want to emit them for earlier cpus (where they operate as NOPs), so that if those older binaries are ever executed on a Power10, then they'll be protected from ROP attacks. Binutils accepts hashst and hashchk back to Power8, so change GCC to emit them for Power8 and later. This matches clang's behavior. 2024-06-19 Peter Bergner <bergner@linux.ibm.com> gcc/ PR target/114759 * config/rs6000/rs6000-logue.cc (rs6000_stack_info): Use TARGET_POWER8. (rs6000_emit_prologue): Likewise. * config/rs6000/rs6000.md (hashchk): Likewise. (hashst): Likewise. Fix whitespace. gcc/testsuite/ PR target/114759 * gcc.target/powerpc/pr114759-2.c: New test. * lib/target-supports.exp (rop_ok): Use check_effective_target_has_arch_pwr8.	2024-07-16 20:35:25 -05:00
Nathaniel Shead	1aa0f16278	c++/modules: Propagate BINDING_VECTOR__DUPS_P on realloc [PR99242] When importing modules, when a binding vector for a name runs out of slots it gets reallocated with a larger size, and existing bindings are copied across. However, the flags to indicate whether deduping needs to occur did not: this causes ICEs, as it allows a duplicate binding to be added which then violates assumptions later on. PR c++/99242 gcc/cp/ChangeLog: name-lookup.cc (append_imported_binding_slot): Propagate dups flags. gcc/testsuite/ChangeLog: * g++.dg/modules/pr99242_a.H: New test. * g++.dg/modules/pr99242_b.H: New test. * g++.dg/modules/pr99242_c.H: New test. * g++.dg/modules/pr99242_d.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>	2024-07-17 11:21:58 +10:00
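The bug class fixed in 1aa0f16278 (metadata lost when a container is reallocated) can be shown with a small sketch. `BindingVector`, its fields, and `grow` are hypothetical simplifications, not GCC's binding-vector layout or the real `append_imported_binding_slot`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified binding vector: a few slots plus a flag recording
// whether duplicate checking is required when a new binding is added.
struct BindingVector {
  std::vector<int> slots;    // stand-in for the per-import binding slots
  std::size_t capacity = 0;  // number of slots available before regrowth
  bool needs_dedup = false;  // stand-in for the "dups" flag in the commit
};

// Reallocate with a larger capacity. Copying only the slots would silently
// reset needs_dedup to false; the flag has to be carried across as well.
inline BindingVector grow(const BindingVector &old, std::size_t new_capacity) {
  assert(new_capacity >= old.slots.size());
  BindingVector fresh;
  fresh.capacity = new_capacity;
  fresh.slots = old.slots;              // existing bindings are copied across
  fresh.needs_dedup = old.needs_dedup;  // ... and so is the dedup flag
  return fresh;
}
```

Dropping the last assignment in `grow` reproduces the shape of the reported problem: the regrown vector would accept a duplicate binding because it no longer remembers that deduplication is needed.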
GCC Administrator	72bce1fbef	Daily bump.	2024-07-17 00:18:32 +00:00
Andrew MacLeod	73a8286d3a	range-ops should return the requested boolean type. The pointer based relation operator's fold_range () routines should return a boolean range with the requested type, not the default type. PR tree-optimization/115951 * range-op-ptr.cc (operator_equal::fold_range): Return a boolean range with the requested type. (operator_not_equal::fold_range): Likewise. (operator_lt::fold_range): Likewise. (operator_le::fold_range): Likewise. (operator_gt::fold_range): Likewise. (operator_ge::fold_range): Likewise.	2024-07-16 17:22:10 -04:00
Nina Ranns	40a990c8b5	c++/contracts: ICE in C++ Contracts with '-fno-exceptions' [PR 110159] We currently only initialise terminate_fn if exceptions are enabled. However, contract handling requires terminate_fn when building the contract because a contract failure may result in std::terminate call regardless of whether the exceptions are enabled. Refactored init_exception_processing to extract the initialisation of terminate_fn. New function init_terminate_fn added that initialises terminate_fn if it hasn't already been initialised. Call to terminate_fn added in cxx_init_decl_processing if contracts are enabled. PR c++/110159 gcc/cp/ChangeLog: * cp-tree.h (init_terminate_fn): Declaration of a new function. * decl.cc (cxx_init_decl_processing): If contracts are enabled, call init_terminate_fn. * except.cc (init_exception_processing): Function refactored to call init_terminate_fn. (init_terminate_fn): Added new function that initializes terminate_fn if it hasn't already been initialised. gcc/testsuite/ChangeLog: * g++.dg/contracts/pr110159.C: New test. Signed-off-by: Nina Ranns <dinka.ranns@gmail.com>	2024-07-16 14:51:52 -04:00
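The refactoring in 40a990c8b5 amounts to an idempotent initialiser that two independent features can both call. The sketch below reuses the names `init_terminate_fn`, `init_exception_processing`, and `terminate_fn` from the commit message, but the bodies, the counter, and the `init_frontend` driver are hypothetical and do not reflect the real GCC code.

```cpp
#include <cassert>

// Hypothetical stand-ins: the real terminate_fn is a tree node, not a string.
static const char *terminate_fn = nullptr;
static int terminate_fn_init_count = 0;  // only here to observe the behaviour

// Initialise terminate_fn once; later calls are no-ops.
inline void init_terminate_fn() {
  if (terminate_fn)
    return;
  terminate_fn = "std::terminate";
  ++terminate_fn_init_count;
}

// Exception setup now delegates to the shared initialiser.
inline void init_exception_processing() {
  init_terminate_fn();
  // ... other exception-handling setup would go here ...
}

// Hypothetical driver: contracts need terminate_fn even with -fno-exceptions.
inline void init_frontend(bool exceptions_enabled, bool contracts_enabled) {
  if (exceptions_enabled)
    init_exception_processing();
  if (contracts_enabled)
    init_terminate_fn();
}
```

Calling `init_frontend(false, true)` models the `-fno-exceptions` case from the bug report: `terminate_fn` is still set up, and enabling both features never initialises it twice.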
Georg-Johann Lay	a3d1469c7c	AVR: testsuite - Attribute ipa implies noinline and noclone. gcc/testsuite/ * gcc.target/avr/isr-test.h: Attribute ipa implies noinline and noclone. * gcc.target/avr/pr114981-powif.c: Same. * gcc.target/avr/pr114981-powil.c: Same. * gcc.target/avr/pr71676-1.c: Same. * gcc.target/avr/pr71676-2.c: Same. * gcc.target/avr/pr71676-3.c: Same. * gcc.target/avr/pr71676.c: Same. * gcc.target/avr/torture/add-extend.c: Same. * gcc.target/avr/torture/fix-types.h: Same. * gcc.target/avr/torture/fuse-add.c: Same. * gcc.target/avr/torture/get-mem.c: Same. * gcc.target/avr/torture/insv-anyshift-hi.c: Same. * gcc.target/avr/torture/insv-anyshift-si.c: Same. * gcc.target/avr/torture/isr-02-call.c: Same. * gcc.target/avr/torture/isr-03-fixed.c: Same. * gcc.target/avr/torture/pr109650-1.c: Same. * gcc.target/avr/torture/pr109650-2.c: Same. * gcc.target/avr/torture/pr109907-1.c: Same. * gcc.target/avr/torture/pr109907-2.c: Same. * gcc.target/avr/torture/pr114132-2.c: Same. * gcc.target/avr/torture/pr39633.c: Same. * gcc.target/avr/torture/pr51782-1.c: Same. * gcc.target/avr/torture/pr61055.c: Same. * gcc.target/avr/torture/pr61443.c: Same. * gcc.target/avr/torture/pr64331.c: Same. * gcc.target/avr/torture/pr77326.c: Same. * gcc.target/avr/torture/pr83729.c: Same. * gcc.target/avr/torture/pr83801.c: Same. * gcc.target/avr/torture/pr87376.c: Same. * gcc.target/avr/torture/pr88236-pr115726.c: Same. * gcc.target/avr/torture/pr92606.c: Same. * gcc.target/avr/torture/pr98762.c: Same. * gcc.target/avr/torture/sat-hr-plus-minus.c: Same. * gcc.target/avr/torture/sat-k-plus-minus.c: Same. * gcc.target/avr/torture/sat-llk-plus-minus.c: Same. * gcc.target/avr/torture/sat-r-plus-minus.c: Same. * gcc.target/avr/torture/sat-uhr-plus-minus.c: Same. * gcc.target/avr/torture/sat-uk-plus-minus.c: Same. * gcc.target/avr/torture/sat-ullk-plus-minus.c: Same. * gcc.target/avr/torture/sat-ur-plus-minus.c: Same. * gcc.target/avr/torture/set-mem.c: Same. 
* gcc.target/avr/torture/sub-extend.c: Same. * gcc.target/avr/torture/tiny-progmem.c: Same.	2024-07-16 19:55:52 +02:00
Iain Sandoe	d1706235ed	c++, coroutines, contracts: Handle coroutine and void functions [PR110871,PR110872,PR115434]. The current implementation of contracts emits the checks into function bodies in three places; for pre-conditions at the start of the body, for asserts in-line in the function body and for post-conditions as an addition to return statements. In general (at least with existing "2a" contract semantics) the in-line contract asserts behave as expected. However, the mechanism is not applicable to: * Handling pre conditions in coroutines since, for those, the standard specifies a wrapping of the original function body by functionality implementing initial and final suspends (along with some housekeeping to route exceptions). Thus for such transformed function bodies, the preconditions then get actioned after the initial suspend, which does not behave as intended. * Handling post conditions in functions that do not have return statements (which applies to coroutines and void functions). In the following, we identify a potentially transformed function body (in the case of coroutines, this is usually called the "ramp()" function). The patch here re-implements the code insertion in one of the two following ways (code for exposition only): * For functions with no post-conditions we wrap the potentially transformed function as follows: { handle_pre_condition_checking (); potentially_transformed_function_body (); } This implements the intent that the preconditions are processed after the function parameters are initialised but before any other actions. * For functions with post-conditions: if (preconditions_exist) handle_pre_condition_checking (); try { potentially_transformed_function_body (); } finally { handle_post_condition_checking (); } else [only if the function is not marked noexcept(true) ] { ; } In this, post-conditions [that might apply to the return value etc.] are evaluated on every non-exceptional edge out of the function. 
At present, the model here is that exceptions thrown by the function propagate upwards as if there were no contracts present. If the desired semantic becomes that an exception is counted as equivalent to a contract violation - then we can add a second handler in place of the empty statement. This patch specifically does not address changes to code-gen and constexpr handling that are contained in P2900. PR c++/115434 PR c++/110871 PR c++/110872 gcc/cp/ChangeLog: * constexpr.cc (cxx_eval_constant_expression): Handle EH_ELSE_EXPR. * contracts.cc (finish_contract_attribute): Remove excess line. (build_contract_condition_function): Post condition handlers are void now. (emit_postconditions_cleanup): Remove. (emit_postconditions): New. (add_pre_condition_fn_call): New. (add_post_condition_fn_call): New. (apply_preconditions): New. (apply_postconditions): New. (maybe_apply_function_contracts): New. (apply_postcondition_to_return): Remove. * contracts.h (apply_postcondition_to_return): Remove. (maybe_apply_function_contracts): Add. * coroutines.cc (coro_build_actor_or_destroy_function): Do not copy contracts to coroutine helpers. * decl.cc (finish_function): Handle wrapping a possibly transformed function body in contract checks. * typeck.cc (check_return_expr): Remove handling of post conditions on return expressions. gcc/ChangeLog: * gimplify.cc (struct gimplify_ctx): Add a flag to show we are expanding a handler. (gimplify_expr): When we are expanding a handler, and the body transforms might have re-written DECL_RESULT into a gimple var, ensure that handler references to DECL_RESULT are also re-written to refer to the gimple var. When we are processing an EH_ELSE expression, then add it if either of the cleanup slots is in use. gcc/testsuite/ChangeLog: * g++.dg/contracts/pr115434.C: New test. * g++.dg/coroutines/pr110871.C: New test. * g++.dg/coroutines/pr110872.C: New test. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>	2024-07-16 16:58:52 +01:00
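The control flow that d1706235ed describes for functions with post-conditions can be approximated in ordinary C++. The three function names come from the exposition-only pseudocode in the commit message; the `trace` log, the `do_throw` parameter, and the `wrapped_function` wrapper are assumptions added for illustration, and the compiler performs this wrapping on its internal trees, not on source text.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical log used only to make the order of events observable.
static std::vector<std::string> trace;

inline void handle_pre_condition_checking() { trace.push_back("pre"); }
inline void handle_post_condition_checking() { trace.push_back("post"); }

// Stand-in for the (possibly coroutine-transformed) function body.
inline void potentially_transformed_function_body(bool do_throw) {
  trace.push_back("body");
  if (do_throw)
    throw std::runtime_error("failure inside the body");
}

// Pre-conditions run first, then the body. Post-conditions run only when the
// body returns normally; an exception propagates as if no contracts existed,
// which mirrors the empty "else" branch in the commit's pseudocode.
inline void wrapped_function(bool do_throw) {
  handle_pre_condition_checking();
  potentially_transformed_function_body(do_throw);
  handle_post_condition_checking();
}
```

A normal call records `pre`, `body`, `post`, while a call whose body throws records only `pre` and `body` before the exception reaches the caller.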
Georg-Johann Lay	f8b302c983	AVR: testsuite - Add noipa function attribute to noclone functions. Many functions under test have the noinline and noclone function attributes attached so that no (constant) values are propagated into the functions, so that we actually are testing what's supposed to be tested. In order to enforce that, noipa may also be required when inter-procedural analysis / optimizations are on. gcc/testsuite/ * gcc.target/avr/isr-test.h: Add noipa function attribute to noclone functions. * gcc.target/avr/pr114981-powif.c: Same. * gcc.target/avr/pr114981-powil.c: Same. * gcc.target/avr/pr71676-1.c: Same. * gcc.target/avr/pr71676-2.c: Same. * gcc.target/avr/pr71676-3.c: Same. * gcc.target/avr/pr71676.c: Same. * gcc.target/avr/torture/fix-types.h: Same. * gcc.target/avr/torture/fuse-add.c: Same. * gcc.target/avr/torture/get-mem.c: Same. * gcc.target/avr/torture/insv-anyshift-hi.c: Same. * gcc.target/avr/torture/insv-anyshift-si.c: Same. * gcc.target/avr/torture/isr-02-call.c: Same. * gcc.target/avr/torture/isr-03-fixed.c: Same. * gcc.target/avr/torture/pr109650-1.c: Same. * gcc.target/avr/torture/pr109650-2.c: Same. * gcc.target/avr/torture/pr109907-1.c: Same. * gcc.target/avr/torture/pr109907-2.c: Same. * gcc.target/avr/torture/pr114132-2.c: Same. * gcc.target/avr/torture/pr39633.c: Same. * gcc.target/avr/torture/pr51782-1.c: Same. * gcc.target/avr/torture/pr61055.c: Same. * gcc.target/avr/torture/pr61443.c: Same. * gcc.target/avr/torture/pr64331.c: Same. * gcc.target/avr/torture/pr77326.c: Same. * gcc.target/avr/torture/pr83729.c: Same. * gcc.target/avr/torture/pr83801.c: Same. * gcc.target/avr/torture/pr87376.c: Same. * gcc.target/avr/torture/pr88236-pr115726.c: Same. * gcc.target/avr/torture/pr92606.c: Same. * gcc.target/avr/torture/pr98762.c: Same. * gcc.target/avr/torture/sat-hr-plus-minus.c: Same. * gcc.target/avr/torture/sat-k-plus-minus.c: Same. * gcc.target/avr/torture/sat-llk-plus-minus.c: Same. 
* gcc.target/avr/torture/sat-r-plus-minus.c: Same. * gcc.target/avr/torture/sat-uhr-plus-minus.c: Same. * gcc.target/avr/torture/sat-uk-plus-minus.c: Same. * gcc.target/avr/torture/sat-ullk-plus-minus.c: Same. * gcc.target/avr/torture/sat-ur-plus-minus.c: Same. * gcc.target/avr/torture/set-mem.c: Same. * gcc.target/avr/torture/tiny-progmem.c: Same.	2024-07-16 17:33:50 +02:00
Paul Thomas	9f966b6a8f	Fortran: Simplify len_trim with array ref and fix mapping bug[PR84868]. 2024-07-16 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/84868 * simplify.cc (gfc_simplify_len_trim): If the argument is an element of a parameter array, simplify all the elements and build a new parameter array to hold the result, after checking that it doesn't already exist. * trans-expr.cc (gfc_get_interface_mapping_array) if a string length is available, use it for the typespec. (gfc_add_interface_mapping): Supply the se string length. gcc/testsuite/ PR fortran/84868 * gfortran.dg/pr84868.f90: New test.	2024-07-16 15:56:44 +01:00
Richard Sandiford	fec38d7987	rtl-ssa: Fix removal of order_nodes [PR115929] order_nodes are used to implement ordered comparisons between two insns with the same program point number. remove_insn would remove an order_node from its splay tree, but didn't remove it from the insn. This caused confusion if the insn was later reinserted somewhere else that also needed an order_node. gcc/ PR rtl-optimization/115929 * rtl-ssa/insns.cc (function_info::remove_insn): Remove an order_node from the instruction as well as from the splay tree. gcc/testsuite/ PR rtl-optimization/115929 * gcc.dg/torture/pr115929-1.c: New test.	2024-07-16 15:33:23 +01:00
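The invariant restored by fec38d7987 is that an order node must be detached from both the tree and the instruction that points at it. The sketch below is a simplified model: `OrderNode`, `Insn`, `FunctionInfo`, and `attach_order_node` are hypothetical, and a `std::map` stands in for the splay tree used by rtl-ssa.

```cpp
#include <cassert>
#include <map>

// Hypothetical, simplified order node and instruction.
struct OrderNode { int uid; };
struct Insn {
  int uid = 0;
  OrderNode *order_node = nullptr;  // back-pointer into the ordering tree
};

struct FunctionInfo {
  std::map<int, OrderNode> order_tree;  // stand-in for the splay tree

  // Give the instruction an order node and record it in the tree.
  void attach_order_node(Insn &insn) {
    auto it = order_tree.emplace(insn.uid, OrderNode{insn.uid}).first;
    insn.order_node = &it->second;
  }

  // Remove the node from the tree and clear the instruction's pointer, so a
  // later reinsertion cannot observe a stale node.
  void remove_insn(Insn &insn) {
    if (insn.order_node) {
      order_tree.erase(insn.uid);
      insn.order_node = nullptr;
    }
  }
};
```

Omitting the `insn.order_node = nullptr` line models the reported bug: the tree no longer holds the node, yet the instruction still refers to it when it is reinserted elsewhere.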
Richard Sandiford	851ec9960b	recog: restrict paradoxical mode punning in insn_propagation [PR115901] In g:44fc801e97a8dc626a4806ff4124439003420b20 I'd extended insn_propagation to handle simple cases of hard-reg mode punning. One of the checks was that the new use mode occupied the same number of registers as the original definition mode. However, as PR115901 shows, we need to avoid increasing the size of any registers in the punned "to" expression as well. Specifically, the test includes a DImode move from GPR x0 to a vector register, followed by a V2DI use of the vector register. The simplification would then create a V2DI spanning x0 and x1, manufacturing a new, unwanted use of x1. Checking for that kind of thing directly seems too cumbersome, and is not related to the original motivation (which was to improve handling of shared vector zeros on aarch64). This patch therefore restricts the paradoxical case to constants. gcc/ PR rtl-optimization/115901 * recog.cc (insn_propagation::apply_to_rvalue_1): Restrict paradoxical mode punning to cases where "to" is constant. gcc/testsuite/ PR rtl-optimization/115901 * gcc.dg/torture/pr115901.c: New test.	2024-07-16 15:31:17 +01:00

212187 Commits