Fix the missing dependency between the gcc and libgrust.
ChangeLog:
* Makefile.def: Add a dependency to libgrust for all-gcc.
* Makefile.in: Regenerate the file.
Signed-off-by: Pierre-Emmanuel Patry <pierre-emmanuel.patry@embecosm.com>
pr88077 fails on SPARC since char HeaderStr[1] in pr88077_1.c and
long HeaderStr in pr88077_0.c differs in alignment.
Warning printed by Binutils ld:
warning: alignment 4 of normal symbol `HeaderStr' in c_lto_pr88077_0.o is
smaller than 8 used by the common definition in c_lto_pr88077_1.o
gcc/testsuite/ChangeLog:
* gcc.dg/lto/pr88077_0.c: Change type to match alignment for SPARC
This is to handle the membar_empty instruction that can be generated
when compiling for UT699.
gcc/ChangeLog:
* config/sparc/sparc.cc (next_active_non_empty_insn): Length 0 treated as empty
LEON now uses the standard V8 membar patterns that contains an ldstub
instruction. This instruction needs to be aligned properly when the
GR712RC errata workaround is enabled.
gcc/ChangeLog:
* config/sparc/sparc.cc (atomic_insn_for_leon3_p): Treat membar_storeload as atomic
* config/sparc/sync.md (membar_storeload): Turn into named insn
and add GR712RC errata workaround.
(membar_v8): Add GR712RC errata workaround.
LEON5 has a deeper write-buffer and hence stb is not enough to flush a
write out. For compatibility, use the default V8 approach for both
LEON3 and LEON5.
This reverts commit 49cc765db3,
"sync.md (*membar_storeload_leon3): New insn."
gcc/ChangeLog:
* config/sparc/sync.md (*membar_storeload_leon3): Remove
(*membar_storeload): Enable for LEON
The following patch adds a quick workaround to bugs in VAR_DECL
partitioning.
The problem is that there is no dependency between ADDR_EXPRs of local
decls and CLOBBERs of those vars, so VN can CSE uses of ADDR_EXPRs
(including ivopts integral variants thereof), which can break
add_scope_conflicts discovery of what variables are actually used
in certain region.
E.g. we can have
ivtmp.40_3 = (unsigned long) &MEM <unsigned long[100]> [(void *)&bitint.6 + 8B];
...
uses of ivtmp.40_3
...
bitint.6 ={v} {CLOBBER(eos)};
...
ivtmp.28_43 = (unsigned long) &MEM <unsigned long[100]> [(void *)&bitint.6 + 8B];
...
uses of ivtmp.28_43
before VN (such as dom3), which the add_scope_conflicts code identifies as 2
independent uses of bitint.6 variable (which is correct), but then VN
determines ivtmp.28_43 is the same as ivtmp.40_3 and just uses ivtmp.40_3
even in the second region; at that point add_scope_conflict thinks the
bitint.6 variable is not used in that region anymore.
The following patch does a simple single def-stmt check for such ADDR_EXPRs
(rather than say trying to do a full propagation of what SSA_NAMEs can
contain ADDR_EXPRs of local variables), which seems to workaround all 4 PRs.
In addition to this patch I've used the attached one to gather statistics
on the total size of all variable partitions in a function and seems besides
the new testcases nothing is really affected compared to no patch (I've
actually just modified the patch to == OMP_SCAN instead of == ADDR_EXPR, so
it looks the same except that it never triggers). The comparison wasn't
perfect because I've only gathered BITS_PER_WORD, main_input_filename (did
some replacement of build directories and /tmp/ccXXXXXX names of LTO to make
it more similar between the two bootstraps/regtests), current_function_name
and the total size of all variable partitions if any, because I didn't
record e.g. the optimization options and so e.g. torture tests which iterate
over options could have different partition sizes even in one compiler when
BITS_PER_WORD, main_input_filename and current_function_name are all equal.
So had to write an awk script to check if the first triple in the second
build appeared in the first one and the quadruple in the second build
appeared in the first one too, otherwise print result and that only
triggered in the new tests.
Also, the cc1plus binary according to objdump -dr is identical between the
two builds except for the ADDR_EXPR vs. OMP_SCAN constant in the two spots.
2024-01-16 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/113372
PR middle-end/90348
PR middle-end/110115
PR middle-end/111422
* cfgexpand.cc (add_scope_conflicts_2): New function.
(add_scope_conflicts_1): Use it.
* gcc.dg/torture/bitint-49.c: New test.
* gcc.c-torture/execute/pr90348.c: New test.
* gcc.c-torture/execute/pr110115.c: New test.
* gcc.c-torture/execute/pr111422.c: New test.
When pattern recognition is involved, a statement whose definition is
consumed in some pattern, may not be included in the final replacement
pattern statements, and would be skipped when building SLP graph.
* Original
char a_c = *(char *) a;
char b_c = *(char *) b;
unsigned short a_s = (unsigned short) a_c;
int a_i = (int) a_s;
int b_i = (int) b_c;
int r_i = a_i - b_i;
* After pattern replacement
a_s = (unsigned short) a_c;
a_i = (int) a_s;
patt_b_s = (unsigned short) b_c; // b_i = (int) b_c
patt_b_i = (int) patt_b_s; // b_i = (int) b_c
patt_r_s = widen_minus(a_c, b_c); // r_i = a_i - b_i
patt_r_i = (int) patt_r_s; // r_i = a_i - b_i
The definitions of a_i(original statement) and b_i(pattern statement)
are related to, but actually not part of widen_minus pattern.
Vectorizing the pattern does not cause these definition statements to
be marked as PURE_SLP. For this case, we need to recursively check
whether their uses are all absorbed into vectorized code. But there
is an exception that some use may participate in an vectorized
operation via an external SLP node containing that use as an element.
gcc/ChangeLog:
PR tree-optimization/113091
* tree-vect-slp.cc (vect_slp_has_scalar_use): New function.
(vect_bb_slp_mark_live_stmts): New parameter scalar_use_map, check
scalar use with new function.
(vect_bb_slp_mark_live_stmts): New function as entry to existing
overriden functions with same name.
(vect_slp_analyze_operations): Call new entry function to mark
live statements.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/bb-slp-pr113091.c: New test.
As PR113404 mentioned: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113404
We have ICE when we enable RVV in big-endian mode:
during RTL pass: expand
a-float-point-dynamic-frm-66.i:2:14: internal compiler error: in to_constant, at poly-int.h:588
0xab4c2c poly_int<2u, unsigned short>::to_constant() const
/repo/gcc-trunk/gcc/poly-int.h:588
0xab4de1 poly_int<2u, unsigned short>::to_constant() const
/repo/gcc-trunk/gcc/tree.h:4055
0xab4de1 default_function_arg_padding(machine_mode, tree_node const*)
/repo/gcc-trunk/gcc/targhooks.cc:844
0x12e2327 locate_and_pad_parm(machine_mode, tree_node*, int, int, int, tree_node*, args_size*, locate_and_pad_arg_data*)
/repo/gcc-trunk/gcc/function.cc:4061
0x12e2aca assign_parm_find_entry_rtl
/repo/gcc-trunk/gcc/function.cc:2614
0x12e2c89 assign_parms
/repo/gcc-trunk/gcc/function.cc:3693
0x12e59df expand_function_start(tree_node*)
/repo/gcc-trunk/gcc/function.cc:5152
0x112fafb execute
/repo/gcc-trunk/gcc/cfgexpand.cc:6739
Report users that we don't support RVV in big-endian mode for the following reasons:
1. big-endian in RISC-V is pretty rare case.
2. We didn't test RVV in big-endian and we don't have enough time to test it since it's stage 4 now.
Naive disallow RVV in big-endian.
Tested no regression, ok for trunk ?
gcc/ChangeLog:
PR target/113404
* config/riscv/riscv.cc (riscv_override_options_internal): Report sorry
for RVV in big-endian mode.
gcc/testsuite/ChangeLog:
PR target/113404
* gcc.target/riscv/rvv/base/big_endian-1.c: New test.
* gcc.target/riscv/rvv/base/big_endian-2.c: New test.
As pointed out by the discussion in PR109705, the current
vect_long_mult effective target check on Power is broken.
This patch is to fix it accordingly.
With additional change by adding a guard vect_long_mult
in gcc.dg/vect/pr25413a.c, it's tested well on Power{8,9}
LE & BE (also on Power10 LE as before).
PR testsuite/109705
gcc/testsuite/ChangeLog:
* lib/target-supports.exp (check_effective_target_vect_long_mult):
Fix powerpc*-*-* checks.
gcc/analyzer/ChangeLog:
PR analyzer/106229
* analyzer.h (compare_constants): New decl.
* constraint-manager.cc (compare_constants): Make non-static.
* sm-taint.cc: Add include "fold-const.h".
(class concrete_range): New.
(get_possible_range): New.
(index_can_be_out_of_bounds_p): New.
(region_model::check_region_for_taint): Reject
-Wanalyzer-tainted-array-index if the type of the value makes it
impossible for it to be out-of-bounds of the array.
gcc/testsuite/ChangeLog:
PR analyzer/106229
* c-c++-common/analyzer/taint-index-pr106229.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
In particular, accessing the result of *calloc (1, SZ) (if non-NULL)
should be known to be all zeroes.
gcc/analyzer/ChangeLog:
PR analyzer/113333
* region-model-manager.cc
(region_model_manager::maybe_fold_unaryop): Casting all zeroes
should give all zeroes.
gcc/testsuite/ChangeLog:
PR analyzer/113333
* c-c++-common/analyzer/calloc-1.c: Add tests.
* c-c++-common/analyzer/pr96639.c: Update expected results.
* gcc.dg/analyzer/data-model-9.c: Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Here we started crashing with r14-1659 because that removed the
auto checking in cp_parser_template_type_arg which seemed like
dead code. But the attached test shows that the code can still
be reached because cp_parser_type_id_1 checks auto only when
auto_is_implicit_function_template_parm_p is on.
Then I noticed that we're still crashing in C++20, and that ICE
started with r12-4772. So I changed the reemerged check to use
flag_concepts_ts rather than flag_concepts on the basis that
check_auto_in_tmpl_args also checks flag_concepts_ts.
PR c++/110065
gcc/cp/ChangeLog:
* parser.cc (cp_parser_template_type_arg): Add auto checking.
gcc/testsuite/ChangeLog:
* g++.dg/concepts/auto8.C: New test.
* g++.dg/concepts/auto8a.C: New test.
It is time to add myself to DCO section for my quicinc email account.
ChangeLog:
* MAINTAINERS (DCO): Add myself.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Since partial specializations can't be named directly, their access
when declared at class scope is irrelevant, so we shouldn't have to set
their TREE_PRIVATE / TREE_PROTECTED in maybe_new_partial_specialization
(which is used only for constrained partial specializations anyway).
This code was added by r10-4833-gcce3c9db9e6ffa for PR92078, but it
seems better to just disable the access consistency check for partial
specializations, which lets us accept the below testcase.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_check_access_in_redeclaration): Don't
check access for a partial or explicit specialization.
* pt.cc (maybe_new_partial_specialization): Don't set TREE_PRIVATE
or TREE_PROTECTED on the newly created partial specialization.
gcc/testsuite/ChangeLog:
* g++.dg/template/partial-specialization14.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
Here we neglect to emit the definitions of A<double>::f2 and A<double*>::f4
despite the explicit instantiations ultimately because TREE_PUBLIC isn't
set on the corresponding partial specializations, whose declarations are
created from maybe_new_partial_specialization which is responsible for
disambiguating them from the first and third partial specializations (which
have the same class-head but different constraints). This makes grokfndecl
in turn clear TREE_PUBLIC for f2 and f4 as if they have internal linkage.
This patch fixes this by setting TREE_PUBLIC appropriately for such partial
specializations.
PR c++/104634
gcc/cp/ChangeLog:
* pt.cc (maybe_new_partial_specialization): Propagate TREE_PUBLIC
to the newly created partial specialization.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-explicit-inst6.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
The get_target_expr call added in r12-7069-g119cea98f66476 causes us
for the below testcase to call build_vec_delete in a template context,
which builds a templated destructor call and checks expr_noexcept_p for
it, which ICEs because the call has templated form.
Much of the work of build_vec_delete however is code generation and thus
will just get discarded in a template context, and that includes the
code guarded by expr_noexcept_p. So this patch narrowly fixes this ICE
by eliding the expr_noexcept_p call when in a template context.
PR c++/109899
gcc/cp/ChangeLog:
* init.cc (build_vec_delete_1): Assume expr_noexcept_p returns
false in a template context.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/initlist-array21.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
The recursively defined constraints on _Variadic_union's user-defined
destructor (used for maintaining trivial destructibility of the variant
iff all of its alternatives are) turn out to require a template
instantiation depth of 3x the number of variants in C++20 mode, with the
instantiation stack looking like
...
_Variadic_union<B, C, ...>
std::is_trivially_destructible_v<_Variadic_union<B, C, ...>>
_Variadic_union<A, B, C, ...>::~_Variadic_union()
_Variadic_union<A, B, C, ...>
...
Ideally the template depth should be ~equal to the number of variants
(plus a constant). Luckily it seems we don't need to compute trivial
destructibility of the alternatives at all from _Variadic_union, since
its only user _Variant_storage already has that information. To that
end this patch removes these recursive constraints and instead passes
this information down from _Variant_storage. After this patch, the
template instantiation depth for 87619.cc in C++20 mode is ~270 instead
of ~780.
libstdc++-v3/ChangeLog:
* include/std/variant (__detail::__variant::_Variadic_union):
Add bool __trivially_destructible template parameter.
(__detail::__variant::_Variadic_union::~_Variadic_union):
Use __trivially_destructible in constraints instead.
(__detail::__variant::_Variant_storage): Pass
__trivially_destructible value to _Variadic_union.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
--save-temps is needed to scan assembly outputs for assemble, link and
run tests. Not all compile tests need --save-temps unless they used to
trigger GCC bugs. Run --save-temps from compile tests if not needed.
PR testsuite/113369
* g++.dg/abi/ref-temp1.C: Remove --save-temps.
* g++.target/i386/bfloat_cpp_typecheck.C: Likewise.
* gcc.dg/debug/dwarf2/pr111080.c: Likewise.
* gcc.dg/debug/dwarf2/pr47939-1.c: Likewise.
* gcc.dg/debug/dwarf2/pr47939-2.c: Likewise.
* gcc.dg/debug/dwarf2/pr47939-3.c: Likewise.
* gcc.dg/debug/dwarf2/pr47939-4.c: Likewise.
When using a compiler that doesn't define __cpp_conditional_explicit
there's a redefinition error for tuple::__nothrow_assignable. This is
because it's defined in different places for the pre-C++20 and C++20
implementations, which are controled by different preprocessor
conditions. For certain combinations of C++20 feature test macros it's
possible for both __nothrow_assignable definitions to be in scope.
Move the pre-C++20 __assignable and __nothrow_assignable definitions adjacent to
their use, so that only one set of definitions is visible for any given
set of feature test macros.
libstdc++-v3/ChangeLog:
PR libstdc++/108822
* include/std/tuple (__assignable, __is_nothrow_assignable):
Move pre-C++20 definitions adjacent to their use.
There's an error for -fconcepts-ts due to using a concept where a bool
NTTP is required, which is fixed by using the vraiable template that
already exists in the class scope.
This doesn't fix the problem with -fconcepts-ts as changes to the
placement of attributes is also needed.
libstdc++-v3/ChangeLog:
PR testsuite/113366
* include/std/format (basic_format_arg): Use __formattable
variable template instead of __format::__formattable_with
concept.
Import the new 2023d tzdata.zi file. The leapseconds file was also
updated to have a new expiry (no new leap seconds were added).
libstdc++-v3/ChangeLog:
* src/c++20/tzdata.zi: Import new file from 2023d release.
* src/c++20/tzdb.cc (tzdb_list::_Node::_S_read_leap_seconds)
Update expiry date for leap seconds list.
The ICE on this testcase was fixed by r14-7141.
2024-01-15 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/113048
* gcc.target/i386/pr113048.c: New test.
This patch adds C intrinsics for Bitmanip Extension.
RISCV_BUILTIN_NO_PREFIX is a new riscv_builtin_description like RISCV_BUILTIN.
But it uses CODE_FOR_##INSN rather than CODE_FOR_riscv_##INSN.
Changed orcb, clmul, brev8 pattern's mode form X to GPR because orcbsi, clmul_si,
brev8_si are both included in rv32 and rv64. Test them in scalar_bitmanip_intrinsic-64-emulated.c.
gcc/ChangeLog:
* config.gcc: Include riscv_bitmanip.h.
* config/riscv/bitmanip.md: Changed mode form X to GPR in orcb and clmul pattern.
* config/riscv/crypto.md: Changed mode form X to GPR in brev8 pattern.
* config/riscv/riscv-builtins.cc (AVAIL): Adding new bitmanip builtins.
(RISCV_BUILTIN_NO_PREFIX): New helper macro.
* config/riscv/riscv-cmo.def (RISCV_BUILTIN): Add '_32'/'_64' postfix to builtins.
* config/riscv/riscv-ftypes.def (2): New ftypes.
* config/riscv/riscv-scalar-crypto.def (RISCV_BUILTIN): New builtins.
(RISCV_BUILTIN_NO_PREFIX): Likewise.
* config/riscv/riscv_bitmanip.h: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/scalar_bitmanip_intrinsic-32.c: New test.
* gcc.target/riscv/scalar_bitmanip_intrinsic-64-emulated.c: New test.
* gcc.target/riscv/scalar_bitmanip_intrinsic-64.c: New test.
This patch adds C intrinsics for Scalar Crypto Extension.
gcc/ChangeLog:
* config.gcc: Include riscv_crypto.h.
* config/riscv/riscv_crypto.h: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/scalar_crypto_intrinsic-32.c: New test.
* gcc.target/riscv/scalar_crypto_intrinsic-64.c: New test.
My recent patch for PR112918 triggered a hidden bug in LRA on MIPS. A
pseudo is matched to a register constraint and assigned to a hard
registers at the first constraint sub-pass but later it is matched to
X constraint. Keeping this pseudo in the register (MD0) prevents to
use the same register for another pseudo in the insn and this results
in LRA failure. The patch fixes this by spilling the pseudo at the
constraint subpass when the chosen alternative constraint not require
hard register anymore.
gcc/ChangeLog:
PR middle-end/113354
* lra-constraints.cc (curr_insn_transform): Spill pseudo only used
in the insn if the corresponding operand does not require hard
register anymore.
driver-avr.cc contains a spec that discriminates bwtween cores
and devices by means of a mmcu=avr* spec pattern. This does not
work for new devices like AVR128* which also start with mmcu=avr
like all cores do. The patch uses a new spec function in order to
tell apart cores from devices.
gcc/
PR target/107201
* config/avr/avr.h (EXTRA_SPEC_FUNCTIONS): Add no-devlib, avr_no_devlib.
* config/avr/driver-avr.cc (avr_no_devlib): New function.
(avr_devicespecs_file): Use it to remove -nodevicelib from the
options for cores only.
* config/avr/avr-arch.h (avr_get_parch): New prototype.
* config/avr/avr-devices.cc (avr_get_parch): New function.
This patch try to fix the bug when HAVE_ATOMIC_FETCH_ADD is
not defined in dec_waiting_unlocked function. As io.h does
not include async.h, the WRLOCK and RWUNLOCK macros are
undefined.
libgfortran/ChangeLog:
* io/io.h (dec_waiting_unlocked): Use
__gthread_rwlock_wrlock/__gthread_rwlock_unlock or
__gthread_mutex_lock/__gthread_mutex_unlock functions
to replace WRLOCK and RWUNLOCK macros.
Signed-off-by: Lipeng Zhu <lipeng.zhu@intel.com>
This patch fixes -70% performance drop from GCC-13.2 to GCC-14 with -march=rv64gcv in real hardware.
The root cause is incorrect cost model cause inefficient vectorization which makes us performance drop significantly.
So this patch does:
1. Adjust vector to scalar cost by introducing v to scalar reg move.
2. Adjust vec_construct cost since we does spend NUNITS instructions to construct the vector.
Tested on both RV32/RV64 no regression, Rebase to the trunk and commit it as it is approved by Robin.
PR target/113247
gcc/ChangeLog:
* config/riscv/riscv-protos.h (struct regmove_vector_cost): Add vector to scalar regmove.
* config/riscv/riscv-vector-costs.cc (adjust_stmt_cost): Ditto.
* config/riscv/riscv.cc (riscv_builtin_vectorization_cost): Adjust vec_construct cost.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls/reduc-19.c: Adapt test.
* gcc.target/riscv/rvv/autovec/vls/reduc-20.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/reduc-21.c: Ditto.
* gcc.dg/vect/costmodel/riscv/rvv/pr113247-1.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/pr113247-2.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/pr113247-3.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/pr113247-4.c: New test.
Rebase in v3: Rebase to the trunk and commit it as it's approved by Robin.
Update in v2: Add dynmaic lmul test.
This patch fixes the regression between GCC 13.2.0 and trunk GCC (GCC-14)
GCC 13.2.0:
lui a5,%hi(a)
li a4,19
sb a4,%lo(a)(a5)
li a0,0
ret
Trunk GCC:
vsetvli a5,zero,e8,mf2,ta,ma
li a4,-32768
vid.v v1
vsetvli zero,zero,e16,m1,ta,ma
addiw a4,a4,104
vmv.v.i v3,15
lui a1,%hi(a)
li a0,19
vsetvli zero,zero,e8,mf2,ta,ma
vadd.vi v1,v1,1
sb a0,%lo(a)(a1)
vsetvli zero,zero,e16,m1,ta,ma
vzext.vf2 v2,v1
vmv.v.x v1,a4
vminu.vv v2,v2,v3
vsrl.vv v1,v1,v2
vslidedown.vi v1,v1,17
vmv.x.s a0,v1
snez a0,a0
ret
The root cause we are vectorizing the codes inefficiently since we doesn't cost len when NITERS < VF.
Leverage loop control of mask targets or rs6000 fixes the regression.
Tested no regression. Ok for trunk ?
PR target/113281
gcc/ChangeLog:
* config/riscv/riscv-vector-costs.cc (costs::adjust_vect_cost_per_loop): New function.
(costs::finish_cost): Adjust cost for LOOP LEN with NITERS < VF.
* config/riscv/riscv-vector-costs.h: New function.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/costmodel/riscv/rvv/pr113281-3.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/pr113281-4.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/pr113281-5.c: New test.
The following avoids splitting an edge before redirecting it. This
allows the loop father of the new block to be correct in the first
place.
PR tree-optimization/113385
* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
First redirect, then split the exit edge.
Notice the m_num_vector_iterations is not used, remove the redundant codes.
Committed.
gcc/ChangeLog:
* config/riscv/riscv-vector-costs.cc (costs::analyze_loop_vinfo):
Remove m_num_vector_iterations.
* config/riscv/riscv-vector-costs.h: Ditto.
Multilib options -mdouble= and -mlong-double= are not orthogonal:
TARGET_HANDLE_OPTION = avr-common.cc::avr_handle_option() sets them
such that sizeof(double) <= sizeof(long double) is always true.
gcc/
PR target/113156
* config/avr/avr.opt (-mdouble, -mlong-double): Add "Save" flag.
(-mbranch-cost): Set "Optimization" flag.
The INTEGER_CST code uses the remainder bits in computations whether to use
whole constant or just part of it and extend it at runtime, and furthermore
uses it to avoid using all bits even when using the (almost) whole constant.
The problem is that the prec % (2 * limb_prec) computation it uses is
appropriate only for the normal lowering of mergeable operations (where
we process 2 limbs at a time in a loop starting with least significant
limbs and process the remaining 0-2 limbs after the loop (there with
constant indexes). For that case it is ok not to emit the upper
prec % (2 * limb_prec) bits into the constant, because those bits will be
extracted using INTEGER_CST idx and so will be used directly in the
statements as INTEGER_CSTs.
For other cases, where we either process just a single limb in a loop,
process it downwards (e.g. non-equality comparisons) or with some runtime
addends (some shifts), there is either just at most one limb lowered with
INTEGER_CST idx after the loop (e.g. for right shift) or before the loop
(e.g. non-equality comparisons), or all limbs are processed with
non-INTEGER_CST indexes (e.g. for left shift, when m_var_msb is set).
Now, the m_var_msb case is already handled through
if (m_var_msb)
type = TREE_TYPE (op);
else
/* If we have a guarantee the most significant partial limb
(if any) will be only accessed through handle_operand
with INTEGER_CST idx, we don't need to include the partial
limb in .rodata. */
type = build_bitint_type (prec - rem, 1);
but for the right shifts or comparisons the prec - rem when rem was
prec % (2 * limb_prec) was incorrect, so the following patch fixes it
to use remainder for 2 limbs only if m_upwards_2limb and remainder for
1 limb otherwise.
2024-01-15 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/113370
* gimple-lower-bitint.cc (bitint_large_huge::handle_operand): Only
set rem to prec % (2 * limb_prec) if m_upwards_2limb, otherwise
set it to just prec % limb_prec.
* gcc.dg/torture/bitint-48.c: New test.
This patch fixes the following FAILs:
Running target riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
FAIL: gcc.c-torture/execute/pr68532.c -O0 execution test
FAIL: gcc.c-torture/execute/pr68532.c -O1 execution test
FAIL: gcc.c-torture/execute/pr68532.c -O2 execution test
FAIL: gcc.c-torture/execute/pr68532.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
FAIL: gcc.c-torture/execute/pr68532.c -O3 -g execution test
FAIL: gcc.c-torture/execute/pr68532.c -Os execution test
FAIL: gcc.c-torture/execute/pr68532.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
Running target riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
FAIL: gcc.dg/vect/pr60196-1.c execution test
FAIL: gcc.dg/vect/pr60196-1.c -flto -ffat-lto-objects execution test
Running target riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
FAIL: gcc.dg/vect/pr60196-1.c execution test
FAIL: gcc.dg/vect/pr60196-1.c -flto -ffat-lto-objects execution test
Running target riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
FAIL: gcc.dg/vect/pr60196-1.c execution test
FAIL: gcc.dg/vect/pr60196-1.c -flto -ffat-lto-objects execution test
The root cause is attributes of ternary intructions are incorrect which cause AVL prop PASS and VSETVL PASS behave
incorrectly.
Tested no regression and committed.
PR target/113393
gcc/ChangeLog:
* config/riscv/vector.md: Fix ternary attributes.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr113393-1.c: New test.
* gcc.target/riscv/rvv/autovec/pr113393-2.c: New test.
* gcc.target/riscv/rvv/autovec/pr113393-3.c: New test.
PIC/abicalls option will generate some GOT operation, and some
`ld/sd` instructions are used.
Let's skip them.
gcc/testsuite
* gcc.target/mips/unaligned-2.c: Add -mno-abicalls option.
hppa*-*-hpux* doesn't have strdup or strndup.
2024-01-14 John David Anglin <danglin@gcc.gnu.org>
gcc/testsuite/ChangeLog:
* gcc.dg/builtin-object-size-1.c: Disable tests for strdup/strndup
on __hpux__.
* gcc.dg/builtin-object-size-2.c: Likewise.
* gcc.dg/builtin-object-size-3.c: Likewise.
* gcc.dg/builtin-object-size-4.c: Likewise.