Commit Graph

214837 Commits

Author SHA1 Message Date
Jakub Jelinek
cd5535494c genmatch: Fix build on hppa64-hpux [PR117348]
Apparently autoconf defines the HAVE_DECL_* macros to 0
rather than not defining them at all, so the defined(HAVE_DECL_FMEMOPEN)
test doesn't do much.

The following patch fixes it by testing HAVE_DECL_FMEMOPEN
for being non-zero instead.
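
As an illustration of the difference described above, here is a minimal sketch. It assumes a config.h-style definition of HAVE_DECL_FMEMOPEN to 0, which is what AC_CHECK_DECLS produces when the declaration is missing. The function names are hypothetical and are not taken from genmatch.cc.

```c
#include <assert.h>

/* Assumption for illustration: what config.h would contain on a host
   that does not declare fmemopen.  */
#define HAVE_DECL_FMEMOPEN 0

/* Hypothetical helper: 1 if a defined()-style guard enables the path.  */
int guard_with_defined (void)
{
#if defined(HAVE_DECL_FMEMOPEN)
  return 1;   /* Taken even though the macro's value is 0.  */
#else
  return 0;
#endif
}

/* Hypothetical helper: 1 if a value-style guard enables the path.  */
int guard_with_value (void)
{
#if HAVE_DECL_FMEMOPEN
  return 1;
#else
  return 0;   /* Taken, because the macro expands to 0.  */
#endif
}

/* The two guards disagree when the macro is defined to 0.  */
void check_guards (void)
{
  assert (guard_with_defined () == 1);
  assert (guard_with_value () == 0);
}
```

With the macro defined to 0, only the value-style guard correctly skips the fmemopen path.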

2024-10-30  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/117348
	* genmatch.cc: Replace defined(HAVE_DECL_FMEMOPEN)
	test with HAVE_DECL_FMEMOPEN.
2024-10-30 09:58:26 +01:00
Paul Thomas
6f0f202b9f Fortran: Move pr115070.f90 to ieee directory [PR117335].
2024-10-30  Paul Thomas  <pault@gcc.gnu.org>

gcc/testsuite/
	PR fortran/117335
	* gfortran.dg/pr115070.f90: Delete.
	* gfortran.dg/ieee/pr115070.f90: Moved to ieee directory to
	prevent failures on incompatible architectures.
2024-10-30 07:50:37 +00:00
Uros Bizjak
ee09fcc4e3 i386: Use assign_stack_temp instead of assign_386_stack_local with SLOT_TEMP
It is better to use assign_stack_temp instead of assign_386_stack_local
with SLOT_TEMP because assign_stack_temp also shares sub-space of stack
slots (e.g. HImode temp shares stack slot with SImode stack slot).

Use assign_386_stack_local only for special stack slots (SLOT_STV_TEMP that
can be nested inside other stack temp access, SLOT_FLOATxFDI_387 that has
relaxed alignment constraint) or slots that can't be shared (SLOT_CW_*).

The patch removes SLOT_TEMP. assign_stack_temp should be used instead.

	gcc/ChangeLog:

	* config/i386/i386.h (enum ix86_stack_slot): Remove SLOT_TEMP.
	* config/i386/i386-expand.cc (ix86_expand_builtin)
	<case IX86_BUILTIN_LDMXCSR>: Use assign_stack_temp instead of
	assign_386_stack_local with SLOT_TEMP.
	<case IX86_BUILTIN_LDMXCSR>: Ditto.
	(ix86_expand_divmod_libfunc): Ditto.
	* config/i386/i386.md (floatunssi<mode>2): Ditto.
	* config/i386/sync.md (atomic_load<mode>): Ditto.
	(atomic_store<mode>): Ditto.
2024-10-30 08:17:50 +01:00
Jakub Jelinek
abcfe1e51c c: Add C2Y N3370 - Case range expressions support [PR117021]
The following patch adds the C2Y N3370 paper support.
We had the case ranges as a GNU extension for decades, so this patch
simply:
1) adds different diagnostics when it is used in C (depending on flag_isoc2y
   and pedantic and warn_c23_c2y_compat)
2) emits a pedwarn in C if a conversion in a range changes the value of
   the low or high bound and in that case doesn't emit -Woverflow and
   similar warnings anymore if the pedwarn has been diagnosed
3) changes the handling of empty ranges both in C and C++; previously
   we just warned but let the values be still looked up in the splay
   tree/entered into it (and let only gimplification throw away those
   empty cases), so e.g. case -6 ... -8: break; case -6: break;
   complained about duplicate case label.  But that actually isn't
   duplicate case label, case -6 ... -8: stands for nothing at all
   and that is how it is treated later on (thrown away)
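
For readers unfamiliar with the syntax, here is a small sketch of a case range. It uses the long-standing GNU form (low ... high) that the message says N3370 standardizes. The function name classify is hypothetical, and the sketch deliberately avoids an empty range such as case -6 ... -8, whose handling is what item 3 above changes.

```c
#include <assert.h>

/* Hypothetical example: classify a character with case ranges.
   Each "low ... high" label covers every value in the closed interval.  */
int classify (int c)
{
  switch (c)
    {
    case '0' ... '9':
      return 1;   /* digit */
    case 'a' ... 'z':
    case 'A' ... 'Z':
      return 2;   /* letter */
    default:
      return 0;   /* anything else */
    }
}

/* A few sample inputs and the class each one falls into.  */
void classify_examples (void)
{
  assert (classify ('5') == 1);
  assert (classify ('q') == 2);
  assert (classify (' ') == 0);
}
```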

2024-10-30  Jakub Jelinek  <jakub@redhat.com>

	PR c/117021
gcc/c-family/
	* c-common.cc (c_add_case_label): Emit different diagnostics for C
	on case ranges.  Diagnose for C using pedwarn conversions of range
	expressions changing value and don't emit further conversion
	diagnostics if the pedwarn has been diagnosed.  For empty ranges
	bail out after emitting warning, don't add anything into splay
	trees nor add a CASE_LABEL_EXPR.
gcc/testsuite/
	* gcc.dg/switch-6.c: Expect different diagnostics.  Add -std=gnu23
	to dg-options.
	* gcc.dg/switch-7.c: Expect different diagnostics.  Add -std=c23
	to dg-options.
	* gcc.dg/gnu23-switch-1.c: New test.
	* gcc.dg/gnu23-switch-2.c: New test.
	* gcc.dg/c23-switch-1.c: New test.
	* gcc.dg/c2y-switch-1.c: New test.
	* gcc.dg/c2y-switch-2.c: New test.
	* gcc.dg/c2y-switch-3.c: New test.
2024-10-30 07:59:52 +01:00
Haochen Jiang
1208686523 testsuite: Adjust AVX10.2 check_effective_target
Since Binutils hasn't fully merged all AVX10.2 insts, testing only
one inst/intrin in AVX10.2 is not sufficient for check_effective_target.
Like APX_F, use inline asm to do the target check.

gcc/testsuite/ChangeLog:

	PR target/117301
	* lib/target-supports.exp (check_effective_target_avx10_2):
	Use inline asm instead of intrin for check_effective_target.
	(check_effective_target_avx10_2_512): Ditto.
2024-10-30 10:38:59 +08:00
xuli
179a682d04 RISC-V: Add testcases for unsigned .SAT_SUB form 2 with IMM = 1.
form2:
T __attribute__((noinline))             \
sat_u_sub_imm##IMM##_##T##_fmt_2 (T x)  \
{                                       \
  return x >= (T)IMM ? x - (T)IMM : 0;  \
}

Passed the rv64gcv regression test.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>
gcc/testsuite/ChangeLog:

	* gcc.target/riscv/sat_u_sub_imm-run-5.c: add run case for imm=1.
	* gcc.target/riscv/sat_u_sub_imm-run-6.c: Ditto.
	* gcc.target/riscv/sat_u_sub_imm-run-7.c: Ditto.
	* gcc.target/riscv/sat_u_sub_imm-run-8.c: Ditto.
	* gcc.target/riscv/sat_u_sub_imm-5_3.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-6_3.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-7_3.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-8_1.c: New test.
2024-10-30 00:57:26 +00:00
xuli
4af8db3eca Match: Simplify (x != 0 ? x + ~0 : 0) to (x - x != 0).
When the imm operand op1=1 in the unsigned scalar sat_sub form2 below,
we can simplify (x != 0 ? x + ~0 : 0) to (x - x != 0), that is x - (x != 0),
thereby eliminating a branch instruction.  This simplification also applies
to signed integers.
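
The equivalence can be checked exhaustively for uint8_t. The following is a sketch only: the function names are hypothetical, and it tests the C-level identity rather than the match.pd pattern itself.

```c
#include <assert.h>
#include <stdint.h>

/* Branchy form: x != 0 ? x + ~0 : 0, i.e. x - 1 saturating at 0.  */
uint8_t sat_sub1_branchy (uint8_t x)
{
  return x != 0 ? (uint8_t) (x + ~0) : 0;
}

/* Branchless form: subtract the 0/1 result of the comparison.  */
uint8_t sat_sub1_branchless (uint8_t x)
{
  return (uint8_t) (x - (x != 0));
}

/* Compare both forms on every uint8_t value.  */
void check_equivalence (void)
{
  for (int i = 0; i <= UINT8_MAX; i++)
    assert (sat_sub1_branchy ((uint8_t) i)
            == sat_sub1_branchless ((uint8_t) i));
}
```

Note that the parentheses in x - (x != 0) matter: without them C parses the expression as (x - x) != 0.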

Form2:
T __attribute__((noinline))             \
sat_u_sub_imm##IMM##_##T##_fmt_2 (T x)  \
{                                       \
  return x >= (T)IMM ? x - (T)IMM : 0;  \
}

Take below form 2 as example:
DEF_SAT_U_SUB_IMM_FMT_2(uint8_t, 1)

Before this patch:
__attribute__((noinline))
uint8_t sat_u_sub_imm1_uint8_t_fmt_2 (uint8_t x)
{
  uint8_t _1;
  uint8_t _3;

  <bb 2> [local count: 1073741824]:
  if (x_2(D) != 0)
    goto <bb 3>; [50.00%]
  else
    goto <bb 4>; [50.00%]

  <bb 3> [local count: 536870912]:
  _3 = x_2(D) + 255;

  <bb 4> [local count: 1073741824]:
  # _1 = PHI <x_2(D)(2), _3(3)>
  return _1;

}

Assembly code:
sat_u_sub_imm1_uint8_t_fmt_2:
	beq	a0,zero,.L2
	addiw	a0,a0,-1
	andi	a0,a0,0xff
.L2:
	ret

After this patch:
__attribute__((noinline))
uint8_t sat_u_sub_imm1_uint8_t_fmt_2 (uint8_t x)
{
  _Bool _1;
  unsigned char _2;
  uint8_t _4;

  <bb 2> [local count: 1073741824]:
  _1 = x_3(D) != 0;
  _2 = (unsigned char) _1;
  _4 = x_3(D) - _2;
  return _4;

}

Assembly code:
sat_u_sub_imm1_uint8_t_fmt_2:
	snez	a5,a0
	subw	a0,a0,a5
	andi	a0,a0,0xff
	ret

The following test suites passed for this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>

gcc/ChangeLog:

	* match.pd: Simplify (x != 0 ? x + ~0 : 0) to (x - x != 0).

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/phi-opt-44.c: New test.
	* gcc.dg/tree-ssa/phi-opt-45.c: New test.
2024-10-30 00:57:22 +00:00
GCC Administrator
17643e5a68 Daily bump. 2024-10-30 00:19:35 +00:00
Andi Kleen
220e0570f0 Revert "Simplify switch bit test clustering algorithm"
This reverts commit 3d06e9c3e0.
2024-10-29 16:41:57 -07:00
David Malcolm
0b73e9382a diagnostics: support multiple output formats simultaneously [PR116613]
This patch generalizes diagnostic_context so that rather than having
a single output format, it has a vector of zero or more.

It adds two new options:
 -fdiagnostics-add-output=DIAGNOSTICS-OUTPUT-SPEC
 -fdiagnostics-set-output=DIAGNOSTICS-OUTPUT-SPEC
which both take a new configuration syntax of the form SCHEME ("text" or
"sarif"), optionally followed by ":" and one or more KEY=VALUE pairs,
in this form:

  <SCHEME>
  <SCHEME>:<KEY>=<VALUE>
  <SCHEME>:<KEY>=<VALUE>,<KEY2>=<VALUE2>
  ...etc

where each SCHEME supports some set of keys.  For example, it's now
possible to use:

  -fdiagnostics-add-output=sarif:version=2.1,file=foo.2.1.sarif \
  -fdiagnostics-add-output=sarif:version=2.2-prerelease,file=foo.2.2.sarif

to add a pair of outputs, each writing to a different file, using
versions 2.1 and 2.2 of the SARIF standard respectively, whilst also
emitting the classic text form of the diagnostics to stderr.
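
To make the SCHEME:KEY=VALUE,KEY2=VALUE2 syntax concrete, here is a minimal parsing sketch. It is not the code in opts-diagnostic.cc: the struct, the function names and the fixed size limits are all assumptions made for illustration, and it does not validate which keys a scheme supports.

```c
#include <assert.h>
#include <string.h>

#define MAX_PAIRS 8    /* Arbitrary limit for this sketch.  */
#define MAX_LEN 64     /* Arbitrary limit for this sketch.  */

/* Hypothetical container for one parsed output spec.  */
struct output_spec
{
  char scheme[MAX_LEN];
  char keys[MAX_PAIRS][MAX_LEN];
  char values[MAX_PAIRS][MAX_LEN];
  int num_pairs;
};

/* Copy LEN bytes starting at START into DST as a C string.
   Returns 0 on success, -1 if the piece does not fit.  */
static int copy_piece (char *dst, const char *start, size_t len)
{
  if (len >= MAX_LEN)
    return -1;
  memcpy (dst, start, len);
  dst[len] = '\0';
  return 0;
}

/* Parse "SCHEME" or "SCHEME:KEY=VALUE,KEY2=VALUE2".
   Returns 0 on success, -1 on malformed input.  */
int parse_output_spec (const char *spec, struct output_spec *out)
{
  assert (spec != NULL && out != NULL);
  out->num_pairs = 0;

  /* Everything before the first ':' (or the whole string) is the scheme.  */
  const char *colon = strchr (spec, ':');
  size_t scheme_len = colon ? (size_t) (colon - spec) : strlen (spec);
  if (scheme_len == 0 || copy_piece (out->scheme, spec, scheme_len) != 0)
    return -1;
  if (!colon)
    return 0;

  /* The rest is a comma-separated list of KEY=VALUE pairs.  */
  const char *p = colon + 1;
  for (;;)
    {
      const char *end = strchr (p, ',');
      if (!end)
        end = p + strlen (p);
      const char *eq = memchr (p, '=', (size_t) (end - p));
      if (!eq || eq == p || out->num_pairs == MAX_PAIRS)
        return -1;
      if (copy_piece (out->keys[out->num_pairs], p, (size_t) (eq - p)) != 0
          || copy_piece (out->values[out->num_pairs], eq + 1,
                         (size_t) (end - eq - 1)) != 0)
        return -1;
      out->num_pairs++;
      if (*end == '\0')
        break;
      p = end + 1;
    }
  return 0;
}
```

Parsing "sarif:version=2.1,file=foo.2.1.sarif" with this sketch yields the scheme "sarif" and the pairs version=2.1 and file=foo.2.1.sarif.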

I hope the new syntax gives us room to potentially add new kinds of
output sink in the future (e.g. RPC notifications), and to add new
key/value pairs as needed by the different sinks.

Implementation-wise, the diagnostic_context's m_printer which previously
was used directly by the single output format now becomes a "reference
printer", created by the client (such as the frontend), with defaults
modified by command-line options.  Each of the multiple output sinks has
its own pretty_printer instance, created by cloning the context's
reference printer.

gcc/ChangeLog:
	PR other/116613
	* Makefile.in (OBJS-libcommon-target): Add opts-diagnostic.o.
	* common.opt (fdiagnostics-add-output=): New.
	(fdiagnostics-set-output=): New.
	(diagnostics_output_format): Drop sarif-file-2.2-prerelease from
	enum.
	* common.opt.urls: Regenerate.
	* diagnostic-buffer.h (diagnostic_buffer::~diagnostic_buffer): New.
	(diagnostic_buffer::ensure_per_format_buffer): Rename to...
	(diagnostic_buffer::ensure_per_format_buffers): ...this.
	(diagnostic_buffer::m_per_format_buffer): Replace with...
	(diagnostic_buffer::m_per_format_buffers): ...this, updating type.
	* diagnostic-format-json.cc (json_output_format::update_printer):
	New.
	(json_output_format::follows_reference_printer_p): New.
	(diagnostic_output_format_init_json): Drop redundant call to
	set_path_format, as this is not a text output format.
	* diagnostic-format-sarif.cc: Include "diagnostic-format-text.h".
	(sarif_builder::set_printer): New.
	(sarif_builder::sarif_builder): Add "printer" param and use it for
	m_printer.
	(sarif_builder::make_location_object::escape_nonascii_renderer::render):
	Rather than using dc.m_printer, create a
	diagnostic_text_output_format instance and use its printer.
	(sarif_output_format::follows_reference_printer_p): New.
	(sarif_output_format::update_printer): New.
	(sarif_output_format::sarif_output_format): Pass in correct
	printer to m_builder's ctor.
	(diagnostic_output_format_init_sarif): Drop redundant call to
	set_path_format, as this is not a text output format.  Replace
	calls to pp_show_color and set_token_printer with call to
	update_printer.  Drop redundant call to set_show_highlight_colors,
	as this printer does not show colors.
	(diagnostic_output_format_init_sarif_file): Split out file opening
	into...
	(diagnostic_output_format_open_sarif_file): ...this new function.
	(make_sarif_sink): New.
	(selftest::test_make_location_object): Provide a pp for the
	builder.
	* diagnostic-format-sarif.h
	(diagnostic_output_format_open_sarif_file): New decl.
	(make_sarif_sink): New decl.
	* diagnostic-format-text.cc (diagnostic_text_output_format::dump):
	Dump sm_follows_reference_printer.
	(diagnostic_text_output_format::on_report_verbatim): New.
	(diagnostic_text_output_format::follows_reference_printer_p): New.
	(diagnostic_text_output_format::update_printer): New.
	* diagnostic-format-text.h
	(diagnostic_text_output_format::diagnostic_text_output_format):
	Add optional "follows_reference_printer" param.
	(diagnostic_text_output_format::on_report_verbatim): New decl.
	(diagnostic_text_output_format::after_diagnostic): Drop "final".
	(diagnostic_text_output_format::follows_reference_printer_p): New
	decl.
	(class diagnostic_text_output_format): Convert private members to
	protected.
	(diagnostic_text_output_format::m_follows_reference_printer): New
	field.
	* diagnostic-format.h
	(diagnostic_output_format::on_report_verbatim): New vfunc.
	(diagnostic_output_format::follows_reference_printer_p): New vfunc.
	(diagnostic_output_format::update_printer): New vfunc.
	(diagnostic_output_format::get_printer): Use m_printer rather than
	a printer from m_context.
	(diagnostic_output_format::diagnostic_output_format): Initialize
	m_printer by cloning the context's printer.
	(diagnostic_output_format::m_printer): New field.
	* diagnostic-global-context.cc (verbatim): Reimplement in terms of
	global_dc->report_verbatim, moving existing implementation to
	diagnostic_text_output_format::on_report_verbatim.
	(fnotice): Support multiple output sinks by using a new
	global_dc->supports_fnotice_on_stderr_p.
	* diagnostic-output-file.h
	(diagnostic_output_file::diagnostic_output_file): New default ctor.
	(diagnostic_output_file::operator=): Implement move assignment.
	* diagnostic-path.cc (selftest::test_interprocedural_path_1): Pass
	false for new param of text_output's ctor.
	* diagnostic-show-locus.cc
	(selftest::test_layout_x_offset_display_utf8): Use reference
	printer.
	(selftest::test_layout_x_offset_display_tab): Likewise.
	(selftest::test_one_liner_fixit_remove): Likewise.
	* diagnostic.cc: Include "pretty-print-urlifier.h".
	(diagnostic_set_caret_max_width): Update for global_dc's m_printer
	becoming reference printer.
	(diagnostic_context::initialize): Update for m_printer becoming
	m_reference_printer.  Use ::make_unique to create it.  Update for
	m_output_format becoming m_output_sinks.
	(diagnostic_context::color_init): Update the reference printer,
	then update the printers for any output sinks that follow it.
	(diagnostic_context::urls_init): Likewise.
	(diagnostic_context::finish): Update comment.  Update for
	m_output_format becoming m_output_sinks.  Update for m_printer
	becoming m_reference_printer and use "delete" on it rather than
	XDELETE.
	(diagnostic_context::dump): Update for m_printer becoming
	reference printer, and for multiple output sinks.
	(diagnostic_context::set_output_format): Reimplement for
	supporting multiple output sinks.
	(diagnostic_context::get_output_format): Likewise.
	(diagnostic_context::add_sink): New.
	(diagnostic_context::supports_fnotice_on_stderr_p): New.
	(diagnostic_context::set_pretty_printer): New.
	(diagnostic_context::refresh_output_sinks): New.
	(diagnostic_context::set_format_decoder): New.
	(diagnostic_context::set_show_highlight_colors): New.
	(diagnostic_context::set_prefixing_rule): New.
	(diagnostic_context::report_diagnostic): Update to support
	multiple output sinks.
	(diagnostic_context::report_verbatim): New.
	(diagnostic_context::emit_diagram): Update to support multiple
	output sinks.
	(diagnostic_context::error_recursion): Update to use
	m_reference_printer.
	(fancy_abort): Likewise.
	(diagnostic_context::end_group): Update to support multiple
	output sinks.
	(diagnostic_output_format::dump): Implement.
	(diagnostic_output_format::on_report_verbatim): Likewise.
	(diagnostic_output_format_init): Drop
	DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE_2_2_PRERELEASE.
	(diagnostic_context::set_diagnostic_buffer): Reimplement to
	support multiple output sinks.
	(diagnostic_context::clear_diagnostic_buffer): Likewise.
	(diagnostic_context::flush_diagnostic_buffer): Likewise.
	(diagnostic_buffer::diagnostic_buffer): Initialize
	m_per_format_buffers.
	(diagnostic_buffer::~diagnostic_buffer): New dtor.
	(diagnostic_buffer::dump): Reimplement to support multiple output
	sinks.
	(diagnostic_buffer::empty_p): Likewise.
	(diagnostic_buffer::move_to): Likewise.
	(diagnostic_buffer::ensure_per_format_buffer): Likewise, renaming
	to...
	(diagnostic_buffer::ensure_per_format_buffers): ...this.
	* diagnostic.h
	(DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE_2_2_PRERELEASE): Delete.
	(class diagnostic_context): Add friend class diagnostic_buffer.
	(diagnostic_context::set_pretty_printer): New decl.
	(diagnostic_context::refresh_output_sinks): New decl.
	(diagnostic_context::report_verbatim): New decl.
	(diagnostic_context::get_output_format): Drop.
	(diagnostic_context::set_show_highlight_colors): Drop body.
	(diagnostic_context::set_format_decoder): New decl.
	(diagnostic_context::set_prefixing_rule): New decl.
	(diagnostic_context::clone_printer): Reimplement.
	(diagnostic_context::get_reference_printer): New accessor.
	(diagnostic_context::add_sink): New decl.
	(diagnostic_context::supports_fnotice_on_stderr_p): New decl.
	(diagnostic_context::m_printer): Replace with...
	(diagnostic_context::m_reference_printer): ...this, and make
	private.
	(diagnostic_context::m_output_format): Replace with...
	(diagnostic_context::m_output_sinks): ...this.
	(diagnostic_format_decoder): Delete.
	(diagnostic_prefixing_rule): Delete.
	(diagnostic_ready_p): Delete.
	* doc/invoke.texi: Document -fdiagnostics-add-output= and
	-fdiagnostics-set-output=.
	* gcc.cc: Include "opts-diagnostic.h".
	(driver_handle_option): Handle cases OPT_fdiagnostics_add_output_
	and OPT_fdiagnostics_set_output_.
	* opts-diagnostic.cc: New file.
	* opts-diagnostic.h (handle_OPT_fdiagnostics_add_output_): New decl.
	(handle_OPT_fdiagnostics_set_output_): New decl.
	* opts-global.cc (init_options_once): Update for global_dc's
	m_printer becoming reference printer.  Call
	global_dc->refresh_output_sinks.
	* opts.cc (common_handle_option): Replace use of
	diagnostic_prefixing_rule with dc->set_prefixing_rule.  Handle
	cases OPT_fdiagnostics_add_output_ and
	OPT_fdiagnostics_set_output_.  Update for m_printer becoming
	reference printer.
	* selftest-diagnostic.cc
	(selftest::test_diagnostic_context::test_diagnostic_context):
	Update for m_printer becoming reference printer.
	(test_diagnostic_context::test_show_locus): Likewise.
	* selftest-run-tests.cc (selftest::run_tests): Call
	selftest::opts_diagnostic_cc_tests.
	* selftest.h (selftest::opts_diagnostic_cc_tests): New decl.
	* simple-diagnostic-path.cc
	(selftest::simple_diagnostic_path_cc_tests): Use reference
	printer.
	* toplev.cc (announce_function): Update for global_dc's m_printer
	becoming reference printer.
	(toplev::main): Likewise.
	* tree-diagnostic.cc (tree_diagnostics_defaults): Replace use of
	diagnostic_format_decoder with context->set_format_decoder.
	* tree-diagnostic.h
	(tree_dump_pretty_printer::tree_dump_pretty_printer): Update for
	global_dc's m_printer becoming reference printer.
	* tree.cc (escaped_string::escape): Likewise.
	(selftest::test_escaped_strings): Likewise.

gcc/ada/ChangeLog:
	PR other/116613
	* gcc-interface/misc.cc (internal_error_function): Update for
	m_printer becoming reference printer.

gcc/analyzer/ChangeLog:
	PR other/116613
	* analyzer-language.cc (on_finish_translation_unit): Update for
	m_printer becoming reference printer.
	* engine.cc (run_checkers): Likewise.
	* program-point.cc (function_point::print_source_line): Likewise.

gcc/c-family/ChangeLog:
	PR other/116613
	* c-format.cc (selftest::test_type_mismatch_range_labels): Update
	for m_printer becoming reference printer.
	(selftest::test_type_mismatch_range_labels): Likewise.

gcc/c/ChangeLog:
	PR other/116613
	* c-objc-common.cc: Include "make-unique.h".
	(c_initialize_diagnostics): Use unique_ptr for pretty_printer.
	Use context->set_format_decoder.

gcc/cp/ChangeLog:
	PR other/116613
	* error.cc (cxx_initialize_diagnostics): Use unique_ptr for
	pretty_printer.  Use context->set_format_decoder.
	* module.cc (noisy_p): Update for global_dc's m_printer becoming
	reference printer.

gcc/d/ChangeLog:
	PR other/116613
	* d-diagnostic.cc (d_diagnostic_report_diagnostic): Update for
	m_printer becoming reference printer.

gcc/fortran/ChangeLog:
	PR other/116613
	* error.cc (gfc_diagnostic_build_kind_prefix): Update for
	global_dc's m_printer becoming reference printer.
	(gfc_diagnostics_init): Replace usage of diagnostic_format_decoder
	with global_dc->set_format_decoder.

gcc/jit/ChangeLog:
	PR other/116613
	* dummy-frontend.cc: Include "make-unique.h".
	(class jit_diagnostic_listener): New.
	(jit_begin_diagnostic): Update comment.
	(jit_end_diagnostic): Drop call to add_diagnostic.
	(jit_langhook_init): Set the output format to a new
	jit_diagnostic_listener.
	* jit-playback.cc (playback::context::add_diagnostic): Add "text"
	param and use that rather than trying to get the text from a
	pretty_printer.
	* jit-playback.h (playback::context::add_diagnostic): Add "text"
	param.

gcc/testsuite/ChangeLog:
	PR other/116613
	* gcc.dg/plugin/analyzer_cpython_plugin.c (dump_refcnt_info):
	Update for global_dc's m_printer becoming reference printer.
	* gcc.dg/plugin/crash-test-ice-in-header-sarif-2.2.c: Replace usage
	of -fdiagnostics-format=sarif-file-2.2-prerelease with
	-fdiagnostics-set-output=sarif:version=2.2-prerelease.
	* gcc.dg/plugin/diagnostic_plugin_test_paths.c: Update for
	global_dc's m_printer becoming reference printer.
	* gcc.dg/plugin/diagnostic_plugin_xhtml_format.c: Update for
	changes to output formats.
	* gcc.dg/plugin/expensive_selftests_plugin.c: Update for
	global_dc's m_printer becoming reference printer.
	* gcc.dg/sarif-output/add-output-sarif-defaults.c: New test.
	* gcc.dg/sarif-output/bad-binary-op.c: New test.
	* gcc.dg/sarif-output/bad-binary-op.py: New support script.
	* gcc.dg/sarif-output/multiple-outputs.c: New test.
	* gcc.dg/sarif-output/multiple-outputs.py: New support script.
	* lib/scansarif.exp (verify-sarif-file): Add an optional second
	argument specifying the expected filename of the .sarif file.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-10-29 19:12:02 -04:00
Andrew Pinski
3d8cd34a45 aarch64: Use canonicalize_comparison in ccmp expansion [PR117346]
While testing the patch for PR 85605 on aarch64, it was noticed that
imm_choice_comparison.c test failed. This was because canonicalize_comparison
was not being called in the ccmp case. This can be noticed without the patch
for PR 85605, as evidenced by the new testcase.

Bootstrapped and tested on aarch64-linux-gnu.

	PR target/117346

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (aarch64_gen_ccmp_first): Call
	canonicalize_comparison before figuring out the cmp_mode/cc_mode.
	(aarch64_gen_ccmp_next): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/imm_choice_comparison-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-10-29 15:34:45 -07:00
Andi Kleen
3d06e9c3e0 Simplify switch bit test clustering algorithm
The current switch bit test clustering enumerates all possible case
cluster combinations to find ones that fit the bit test constraints
best.  This causes performance problems with very large switches.

For bit test clustering which happens naturally in word sized chunks
I don't think such an expensive algorithm is really needed.

This patch implements a simple greedy algorithm that walks
the sorted list and examines word sized windows and tries
to cluster them.
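
The general idea of a greedy pass over word sized windows can be sketched as follows. This is an illustration only and not the code in bit_test_cluster::find_bit_tests: the function name, the 64-bit window and the representation of cases as plain sorted values are all assumptions, and the real pass has further profitability checks that are omitted here.

```c
#include <assert.h>
#include <stddef.h>

#define WORD_BITS 64   /* Assumed word size for this sketch.  */

/* Greedy pass over sorted, distinct case values.  A cluster starts at the
   first unclustered value and is extended while the next value still fits
   in a WORD_BITS-wide window anchored at the cluster's first value.
   STARTS receives the index at which each cluster begins; the return
   value is the number of clusters.  Values are assumed small enough that
   the subtraction cannot overflow.  */
size_t greedy_windows (const long *vals, size_t n, size_t *starts)
{
  size_t count = 0;
  size_t i = 0;
  while (i < n)
    {
      long low = vals[i];
      starts[count++] = i;
      while (i < n && vals[i] - low < WORD_BITS)
        {
          /* The input must be strictly increasing.  */
          assert (i == 0 || vals[i - 1] < vals[i]);
          i++;
        }
    }
  return count;
}
```

Each value is visited once, so the pass is linear in the number of case values, in contrast to enumerating all cluster combinations.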

Surprisingly the new algorithm gives consistently better clusters
for the examples I tried.

For example from the gcc bootstrap:

old: 0-15 16-31 96-175
new: 0-31 96-175

I'm not fully sure why that is, probably some bug in the old
algorithm? This even shows up in the test suite, where if-to-switch-6
can now generate a switch, as well as a case in switch-1.c.

I don't have a proof that the new algorithm is always as good or better,
but so far at least I don't see any counter examples.

It also fixes the excessive compile time in PR117091,
however this was already fixed by an earlier patch
that doesn't run clustering when no targets have multiple
values.

gcc/ChangeLog:

	PR middle-end/117091
	* tree-switch-conversion.cc (bit_test_cluster::find_bit_tests):
	Change clustering algorithm to simple greedy.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/if-to-switch-6.c: Allow condition chain.
	* gcc.dg/tree-ssa/switch-1.c: Allow more bit tests.
	* gcc.dg/pr21643.c: Use -fno-bit-tests
	* gcc.target/aarch64/pr99988.c: Use -fno-bit-tests
2024-10-29 15:10:23 -07:00
Andi Kleen
a4e2b13888 Only do switch bit test clustering when multiple labels point to same bb
The bit cluster code generation strategy is only beneficial when
multiple case labels point to the same code. Do a quick check if
that is the case before trying to cluster.
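
As background, a bit test replaces several case labels that share one target with a single mask lookup. The sketch below writes the transformation by hand. The function names and the choice of vowels are hypothetical, and it assumes an ASCII-like character set where 'a' to 'u' span fewer than 64 values.

```c
#include <assert.h>

/* Switch form: five case labels point to the same code.  */
int is_vowel_switch (char c)
{
  switch (c)
    {
    case 'a': case 'e': case 'i': case 'o': case 'u':
      return 1;
    default:
      return 0;
    }
}

/* Hand-written bit-test form: one range check plus one mask test.  */
int is_vowel_bittest (char c)
{
  const unsigned long long mask
    = (1ULL << ('a' - 'a')) | (1ULL << ('e' - 'a')) | (1ULL << ('i' - 'a'))
      | (1ULL << ('o' - 'a')) | (1ULL << ('u' - 'a'));
  /* Values below 'a' wrap to a large unsigned number and fail the check.  */
  unsigned int idx = (unsigned int) (c - 'a');
  return idx < 64 && ((mask >> idx) & 1);
}

/* Both forms agree on every 7-bit character.  */
void check_bittest (void)
{
  for (int c = 0; c < 128; c++)
    assert (is_vowel_switch ((char) c) == is_vowel_bittest ((char) c));
}
```

If every label had its own target, there would be nothing to merge into one mask, which is the situation the quick check is meant to detect.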

This fixes the switch part of PR117091, where all case labels are unique;
however, it doesn't address the performance problems for non-unique
cases.

gcc/ChangeLog:

	PR middle-end/117091
	* gimple-if-to-switch.cc (if_chain::is_beneficial): Update
	find_bit_test call.
	* tree-switch-conversion.cc (bit_test_cluster::find_bit_tests):
	Get max_c argument and bail out early if all case labels are
	unique.
	(switch_decision_tree::compute_cases_per_edge): Record number of
	targets per label and return.
	(switch_decision_tree::analyze_switch_statement): ... pass to
	find_bit_tests.
	* tree-switch-conversion.h: Update prototypes.
2024-10-29 15:08:11 -07:00
Andi Kleen
06bc3a734e Disable -fbit-tests and -fjump-tables at -O0
gcc/ChangeLog:

	* common.opt: Enable -fbit-tests and -fjump-tables only at -O1.
	* opts.cc (default_options_table): Ditto.
2024-10-29 15:08:00 -07:00
Eric Botcazou
7211155732 Fix miscompilation of function containing __builtin_unreachable
This is a wrong-code generation on the SPARC for a function containing
a call to __builtin_unreachable caused by the delay slot scheduling pass,
and more specifically the find_end_label function which has these lines:

  /* Otherwise, see if there is a label at the end of the function. If there
     is, it must be that RETURN insns aren't needed, so that is our return
     label and we don't have to do anything else.  */

The comment was correct 20 years ago but no longer is nowadays in the
presence of RTL epilogues and calls to __builtin_unreachable, so the
patch just removes the associated two lines of code:

  else if (LABEL_P (insn))
    *plabel = as_a <rtx_code_label *> (insn);

and otherwise contains just adjustments to the commentary.

gcc/
	PR rtl-optimization/117327
	* reorg.cc (find_end_label): Do not return a dangling label at the
	end of the function and adjust commentary.

gcc/testsuite/
	* gcc.c-torture/execute/20241029-1.c: New test.
2024-10-29 21:43:59 +01:00
Andrew Pinski
9dd9a88b75 aarch64: Remove unnecessary casts to rtx_code [PR117349]
In aarch64_gen_ccmp_first/aarch64_gen_ccmp_next, the casts
were no longer needed after r14-3412-gbf64392d66f291 which
changed the type of the arguments to rtx_code.

In aarch64_rtx_costs, they were no longer needed since
r12-4828-g1d5c43db79b7ea which changed the type of code
to rtx_code.

Pushed as obvious after a build/test for aarch64-linux-gnu.

gcc/ChangeLog:

	PR target/117349
	* config/aarch64/aarch64.cc (aarch64_rtx_costs): Remove
	unnecessary casts to rtx_code.
	(aarch64_gen_ccmp_first): Likewise.
	(aarch64_gen_ccmp_next): Likewise.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-10-29 13:07:16 -07:00
Jakub Jelinek
28b7aed124 c-family: Handle RAW_DATA_CST in complete_array_type [PR117313]
The following testcase ICEs, because
add_flexible_array_elts_to_size -> complete_array_type
is done only after braced_lists_to_strings which optimizes
RAW_DATA_CST surrounded by INTEGER_CST into a larger RAW_DATA_CST
covering even the boundaries, while I thought it is done before
that.
So, RAW_DATA_CST now can be the last constructor_elt in a CONSTRUCTOR
and so we need the function to take it into account (handle it as
RAW_DATA_CST standing for RAW_DATA_LENGTH consecutive elements).

The function wants to support both CONSTRUCTORs without indexes and with
them (for non-RAW_DATA_CST elts it was just adding 1 for the current
index).  So, if the RAW_DATA_CST elt has ce->index, we need to add
RAW_DATA_LENGTH (ce->value) - 1, while if it doesn't (and it isn't cnt == 0
case where curindex is 0), add that plus 1, i.e. RAW_DATA_LENGTH (ce->value).
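
The index arithmetic above can be modelled on plain integers. In this sketch the struct and function are hypothetical stand-ins for constructor_elt and the loop in complete_array_type: an ordinary element is modelled as length 1, and a RAW_DATA_CST as its RAW_DATA_LENGTH.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one constructor element.  */
struct elt
{
  int has_index;   /* Nonzero if the element carries an explicit index.  */
  size_t index;    /* The explicit index, if any.  */
  size_t length;   /* 1 for an ordinary element, N for N raw-data bytes.  */
};

/* Return the highest array index covered by the N elements in V.
   With an explicit index, or for the very first element, the element
   ends at start + length - 1; otherwise it starts one past the previous
   index, so it ends at previous + length.  */
size_t max_index (const struct elt *v, size_t n)
{
  assert (n > 0);
  size_t cur = 0, max = 0;
  for (size_t cnt = 0; cnt < n; cnt++)
    {
      if (v[cnt].has_index)
        cur = v[cnt].index;
      cur += v[cnt].length - ((v[cnt].has_index || cnt == 0) ? 1 : 0);
      if (cur > max)
        max = cur;
    }
  return max;
}
```

For example, one ordinary element followed by five raw-data bytes and another ordinary element covers indexes 0 through 6, so the array needs seven elements.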

2024-10-29  Jakub Jelinek  <jakub@redhat.com>

	PR c/117313
gcc/c-family/
	* c-common.cc (complete_array_type): For RAW_DATA_CST elements
	advance curindex by RAW_DATA_LENGTH or one less than that if
	ce->index is non-NULL.  Handle even the first element if
	it is RAW_DATA_CST.  Formatting fix.
gcc/testsuite/
	* c-c++-common/init-6.c: New test.
2024-10-29 20:14:09 +01:00
Jason Merrill
e6d21cbf5c c++: printing AGGR_INIT_EXPR args
PR30854 was about wrongly dumping the dummy object argument to a
constructor; r126582 in 4.3 fixed that by skipping the first argument.  But
not all functions called by AGGR_INIT_EXPR are constructors, as observed in
PR116634; we shouldn't skip for non-member functions.  And let's combine the
printing code for CALL_EXPR and AGGR_INIT_EXPR.

This doesn't make us accept the ill-formed 116634 testcase again with a
pedwarn, just fixes the diagnostic issue.

	PR c++/30854
	PR c++/116634

gcc/cp/ChangeLog:

	* error.cc (dump_aggr_init_expr_args): Remove.
	(dump_call_expr_args): Handle AGGR_INIT_EXPR.
	(dump_expr): Combine AGGR_INIT_EXPR and CALL_EXPR cases.

gcc/testsuite/ChangeLog:

	* g++.dg/coroutines/coro-bad-alloc-02-no-op-new-nt.C: Adjust
	diagnostic.
	* g++.dg/diagnostic/aggr-init1.C: New test.
2024-10-29 12:25:18 -04:00
Tsung Chun Lin
f003834bad [RISC-V] RISC-V: Add implication for M extension.
That M implies Zmmul.

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc: M implies Zmmul.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/attribute-15.c: Add _zmmul1p0 to arch string.
	* gcc.target/riscv/attribute-16.c: Ditto.
	* gcc.target/riscv/attribute-17.c: Ditto.
	* gcc.target/riscv/attribute-18.c: Ditto.
	* gcc.target/riscv/attribute-19.c: Ditto.
	* gcc.target/riscv/pr110696.c: Ditto.
	* gcc.target/riscv/target-attr-01.c: Ditto.
	* gcc.target/riscv/target-attr-02.c: Ditto.
	* gcc.target/riscv/target-attr-03.c: Ditto.
	* gcc.target/riscv/target-attr-04.c: Ditto.
	* gcc.target/riscv/target-attr-08.c: Ditto.
	* gcc.target/riscv/target-attr-11.c: Ditto.
	* gcc.target/riscv/target-attr-14.c: Ditto.
	* gcc.target/riscv/target-attr-15.c: Ditto.
	* gcc.target/riscv/target-attr-16.c: Ditto.
	* gcc.target/riscv/rvv/base/pr114352-1.c: Likewise.
	* gcc.target/riscv/rvv/base/pr114352-3.c: Likewise.
	* gcc.dg/pr90838.c: Fix search string for rv64.

	    Co-Authored-By: Jeff Law  <jlaw@ventanamicro.com>
2024-10-29 09:51:28 -06:00
Andrew Pinski
17f6add3ab testcase: Add testcase for tree-optimization/117341
Even though PR 117341 was a duplicate of PR 116768, it does not hurt
to have another testcase, this time in C++.
The testcase is self-contained and does not directly use libstdc++
except for operator new (it does not even call delete).

Tested on x86_64-linux-gnu with it working.

	PR tree-optimization/117341

gcc/testsuite/ChangeLog:

	* g++.dg/torture/pr117341-1.C: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-10-29 08:49:41 -07:00
yulong
b22d9c8f82 [PATCH 2/2] RISC-V: Add intrinsic cases for the CMOs extensions
gcc/testsuite/ChangeLog:

	* gcc.target/riscv/cmo-32.c: New test.
	* gcc.target/riscv/cmo-64.c: New test.
2024-10-29 09:46:17 -06:00
yulong
d2c8548e0c [PATCH 1/2] RISC-V: Add intrinsic support for the CMOs extensions
gcc/ChangeLog:

	* config.gcc: Add riscv_cmo.h.
	* config/riscv/riscv_cmo.h: New file.
2024-10-29 09:46:17 -06:00
Pan Li
072d6bb67a RISC-V: Add testcases for form 1 of MASK_LEN_STRIDED_LOAD{STORE}
Form 1:
  void __attribute__((noinline))                                        \
  vec_strided_load_store_##T##_form_1 (T *restrict out, T *restrict in, \
				       long stride, size_t size)        \
  {                                                                     \
    for (size_t i = 0; i < size; i++)                                   \
      out[i * stride] = in[i * stride];                                 \
  }

The following test suite passed for this patch:
* The riscv full regression test.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/rvv.exp: Add strided folder.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f16.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f32.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f64.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i16.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i32.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i64.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i8.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u16.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u32.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u64.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u8.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f16.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f32.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f64.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i16.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i32.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i64.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i8.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u16.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u32.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u64.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u8.c: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st.h: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st_data.h: New test.
	* gcc.target/riscv/rvv/autovec/strided/strided_ld_st_run.h: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-By: Juzhe-Zhong <juzhe.zhong@rivai.ai>
2024-10-29 22:19:46 +08:00
Pan Li
30435cc261 RISC-V: Implement the MASK_LEN_STRIDED_LOAD{STORE}
This patch would like to implement the MASK_LEN_STRIDED_LOAD{STORE} in
the RISC-V backend by leveraging the vector strided load/store insn.

For example:
void foo (int * __restrict a, int * __restrict b, int stride, int n)
{
    for (int i = 0; i < n; i++)
      a[i*stride] = b[i*stride] + 100;
}

Before this patch:
  38   │     vsetvli a5,a3,e32,m1,ta,ma
  39   │     vluxei64.v  v1,(a1),v4
  40   │     mul a4,a2,a5
  41   │     sub a3,a3,a5
  42   │     vadd.vv v1,v1,v2
  43   │     vsuxei64.v  v1,(a0),v4
  44   │     add a1,a1,a4
  45   │     add a0,a0,a4

After this patch:
  33   │     vsetvli a5,a3,e32,m1,ta,ma
  34   │     vlse32.v    v1,0(a1),a2
  35   │     mul a4,a2,a5
  36   │     sub a3,a3,a5
  37   │     vadd.vv v1,v1,v2
  38   │     vsse32.v    v1,0(a0),a2
  39   │     add a1,a1,a4
  40   │     add a0,a0,a4

The test suites below pass with this patch:
* The riscv full regression test.

gcc/ChangeLog:

	* config/riscv/autovec.md (mask_len_strided_load_<mode>): Add
	new pattern for MASK_LEN_STRIDED_LOAD.
	(mask_len_strided_store_<mode>): Ditto but for store.
	* config/riscv/riscv-protos.h (expand_strided_load): Add new
	func decl to expand strided load.
	(expand_strided_store): Ditto but for store.
	* config/riscv/riscv-v.cc (expand_strided_load): Add new
	func impl to expand strided load.
	(expand_strided_store): Ditto but for store.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-By: Juzhe-Zhong <juzhe.zhong@rivai.ai>
2024-10-29 22:19:41 +08:00
Pan Li
372060d787 RISC-V: Adjust the gather-scatter testcases due to middle-end change
Now that we have MASK_LEN_STRIDED_LOAD{STORE} in the middle-end, the
strided cases need to be adjusted for the IR check.

The test suites below pass with this patch:
* The riscv full regression test.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c:
	Adjust IR for MASK_LEN_LOAD check.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c:
	Ditto but for store.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c:
	Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-By: Juzhe-Zhong <juzhe.zhong@rivai.ai>
2024-10-29 22:19:34 +08:00
Pan Li
a0292ddb21 Vect: Introduce MASK_LEN_STRIDED_LOAD{STORE} to loop vectorizer
This patch would like to allow generation of MASK_LEN_STRIDED_LOAD{STORE} IR
for invariant stride memory access.  For example as below

void foo (int * __restrict a, int * __restrict b, int stride, int n)
{
    for (int i = 0; i < n; i++)
      a[i*stride] = b[i*stride] + 100;
}

Before this patch:
  66   │   _73 = .SELECT_VL (ivtmp_71, POLY_INT_CST [4, 4]);
  67   │   _52 = _54 * _73;
  68   │   vect__5.16_61 = .MASK_LEN_GATHER_LOAD (vectp_b.14_59, _58, 4, { 0, ... }, { -1, ... }, _73, 0);
  69   │   vect__7.17_63 = vect__5.16_61 + { 100, ... };
  70   │   .MASK_LEN_SCATTER_STORE (vectp_a.18_67, _58, 4, vect__7.17_63, { -1, ... }, _73, 0);
  71   │   vectp_b.14_60 = vectp_b.14_59 + _52;
  72   │   vectp_a.18_68 = vectp_a.18_67 + _52;
  73   │   ivtmp_72 = ivtmp_71 - _73;

After this patch:
  60   │   _70 = .SELECT_VL (ivtmp_68, POLY_INT_CST [4, 4]);
  61   │   _52 = _54 * _70;
  62   │   vect__5.16_58 = .MASK_LEN_STRIDED_LOAD (vectp_b.14_56, _55, { 0, ... }, { -1, ... }, _70, 0);
  63   │   vect__7.17_60 = vect__5.16_58 + { 100, ... };
  64   │   .MASK_LEN_STRIDED_STORE (vectp_a.18_64, _55, vect__7.17_60, { -1, ... }, _70, 0);
  65   │   vectp_b.14_57 = vectp_b.14_56 + _52;
  66   │   vectp_a.18_65 = vectp_a.18_64 + _52;
  67   │   ivtmp_69 = ivtmp_68 - _70;
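
As a reading aid, the loop shape in the dump above can be modelled in scalar
C++.  This is only a sketch under stated assumptions: foo_model and kLanes are
hypothetical names, kLanes stands in for the run-time vector length, and the
inner loop plays the role of one strided load, add and strided store.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for the number of lanes; real hardware chooses it
// at run time via .SELECT_VL.
constexpr std::size_t kLanes = 4;

// Scalar model of the strip-mined loop: each outer iteration handles up to
// kLanes elements, then advances by stride * vl, like vectp_a/vectp_b.
void foo_model(int *a, const int *b, std::size_t stride, std::size_t n)
{
  std::size_t remaining = n;   // plays the role of ivtmp
  std::size_t base = 0;        // element offset of the current chunk
  while (remaining > 0)
    {
      const std::size_t vl = std::min(remaining, kLanes);  // .SELECT_VL
      for (std::size_t lane = 0; lane < vl; ++lane)
        a[base + lane * stride] = b[base + lane * stride] + 100;
      base += stride * vl;
      remaining -= vl;
    }
}
```

The model uses an index rather than advancing the pointers so that it never
forms a pointer past the end of the arrays on the last chunk.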

The test suites below pass with this patch:
* The x86 bootstrap test.
* The x86 full regression test.
* The riscv full regression test.

gcc/ChangeLog:

	* tree-vect-stmts.cc (vect_get_strided_load_store_ops): Handle
	MASK_LEN_STRIDED_LOAD{STORE} after supported check.
	(vectorizable_store): Generate MASK_LEN_STRIDED_LOAD when the offset
	of gather is not vector type.
	(vectorizable_load): Ditto but for store.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-By: Juzhe-Zhong <juzhe.zhong@rivai.ai>
2024-10-29 22:19:26 +08:00
Pan Li
1fdee26ee9 Internal-fn: Introduce new IFN MASK_LEN_STRIDED_LOAD{STORE}
This patch would like to introduce new IFN for strided load and store.

LOAD:  v = MASK_LEN_STRIDED_LOAD (ptr, stride, mask, len, bias)
STORE: MASK_LEN_STRIDED_STORE (ptr, stride, v, mask, len, bias)

The IFN targets code similar to the example below

void foo (int * a, int * b, int stride, int n)
{
  for (int i = 0; i < n; i++)
    a[i * stride] = b[i * stride];
}
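
For illustration only, a scalar reference model of the two IFNs might look
like the sketch below.  The function names are hypothetical, and three points
are assumptions rather than something this message states: the stride is
taken in bytes, at most len + bias lanes are active, and inactive lanes of a
load read as zero.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Number of lanes that may touch memory: min (lanes, len + bias).
// Assumption: bias is a small non-positive adjustment to len.
inline std::size_t active_lanes(std::size_t lanes, std::size_t len,
                                std::ptrdiff_t bias)
{
  const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(len) + bias;
  if (n <= 0)
    return 0;
  return std::min(lanes, static_cast<std::size_t>(n));
}

// Model of v = MASK_LEN_STRIDED_LOAD (ptr, stride, mask, len, bias).
template <typename T>
std::vector<T> strided_load_model(const void *ptr, std::ptrdiff_t stride_bytes,
                                  const std::vector<bool> &mask,
                                  std::size_t len, std::ptrdiff_t bias)
{
  std::vector<T> v(mask.size(), T{});  // inactive lanes stay zero (assumption)
  const std::size_t active = active_lanes(mask.size(), len, bias);
  for (std::size_t i = 0; i < active; ++i)
    if (mask[i])
      std::memcpy(&v[i],
                  static_cast<const char *>(ptr)
                    + static_cast<std::ptrdiff_t>(i) * stride_bytes,
                  sizeof(T));
  return v;
}

// Model of MASK_LEN_STRIDED_STORE (ptr, stride, v, mask, len, bias).
// v must have at least mask.size () elements.
template <typename T>
void strided_store_model(void *ptr, std::ptrdiff_t stride_bytes,
                         const std::vector<T> &v,
                         const std::vector<bool> &mask,
                         std::size_t len, std::ptrdiff_t bias)
{
  const std::size_t active = active_lanes(mask.size(), len, bias);
  for (std::size_t i = 0; i < active; ++i)
    if (mask[i])
      std::memcpy(static_cast<char *>(ptr)
                    + static_cast<std::ptrdiff_t>(i) * stride_bytes,
                  &v[i], sizeof(T));
}
```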

The test suites below pass with this patch:
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

	* internal-fn.cc (strided_load_direct): Add new define direct
	for strided load.
	(strided_store_direct): Ditto but for store.
	(expand_strided_load_optab_fn): Add new func to expand the IFN
	MASK_LEN_STRIDED_LOAD in middle-end.
	(expand_strided_store_optab_fn): Ditto but for store.
	(direct_strided_load_optab_supported_p): Add define for stride
	load optab supported.
	(direct_strided_store_optab_supported_p): Ditto but for store.
	(internal_fn_len_index): Add strided load/store len index.
	(internal_fn_mask_index): Ditto but for mask.
	(internal_fn_stored_value_index): Add strided store value index.
	* internal-fn.def (MASK_LEN_STRIDED_LOAD): Add new IFN for
	strided load.
	(MASK_LEN_STRIDED_STORE): Ditto but for store.
	* optabs.def (mask_len_strided_load_optab): Add strided load optab.
	(mask_len_strided_store_optab): Add strided store optab.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-By: Juzhe-Zhong <juzhe.zhong@rivai.ai>
2024-10-29 22:19:19 +08:00
Richard Biener
4cfff6d413 Remove dead vect_recog_mixed_size_cond_pattern
vect_recog_mixed_size_cond_pattern only applies to COMPARISON_CLASS_P
rhs1 COND_EXPRs which no longer appear - the following removes it.
Its testcases still pass, I believe the situation is mitigated by
bool pattern handling of the compare use in COND_EXPRs.

	* tree-vect-patterns.cc (type_conversion_p): Remove.
	(vect_recog_mixed_size_cond_pattern): Likewise.
	(vect_vect_recog_func_ptrs): Remove vect_recog_mixed_size_cond_pattern
	entry.
2024-10-29 15:07:01 +01:00
Richard Biener
c738a15c50 Remove dead code in vectorizer pattern recog
The following removes the code path in vect_recog_mask_conversion_pattern
dealing with comparisons in COND_EXPRs.  That can no longer happen.

	* tree-vect-patterns.cc (vect_recog_mask_conversion_pattern):
	Remove COMPARISON_CLASS_P rhs1 of COND_EXPR case and assert
	it doesn't happen.
2024-10-29 15:07:01 +01:00
Patrick Palka
7f622ee83f libstdc++: Fix complexity of drop_view::begin() const [PR112641]
Views are required to have an amortized O(1) begin(), but our drop_view's
const begin overload is O(n) for non-common ranges with a non-sized
sentinel.  This patch reimplements it so that it's O(1) always.  See
also LWG 4009.

	PR libstdc++/112641

libstdc++-v3/ChangeLog:

	* include/std/ranges (drop_view::begin): Reimplement const
	overload so that it's O(1) always.
	* testsuite/std/ranges/adaptors/drop.cc (test10): New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
2024-10-29 09:26:19 -04:00
David Malcolm
7f41203f08 jit: fix leak of pending_assemble_externals_set [PR117275]
My recent r15-4580-g779c0390e3b57d fix for resetting state in
varasm.cc introduced some noise to "make selftest-valgrind" and,
presumably, a memory leak in libgccjit:

==2462086== 160 (56 direct, 104 indirect) bytes in 1 blocks are definitely lost in loss record 248 of 352
==2462086==    at 0x5270E7D: operator new(unsigned long) (vg_replace_malloc.c:342)
==2462086==    by 0x1D1EB89: init_varasm_once() (varasm.cc:6806)
==2462086==    by 0x181C845: backend_init() (toplev.cc:1826)
==2462086==    by 0x181D41A: do_compile() (toplev.cc:2193)
==2462086==    by 0x181D99C: toplev::main(int, char**) (toplev.cc:2371)
==2462086==    by 0x378391D: main (main.cc:39)

Fixed thusly.

gcc/ChangeLog:
	PR jit/117275
	* varasm.cc (process_pending_assemble_externals): Reset
	pending_assemble_externals_set to nullptr after deleting it.
	(varasm_cc_finalize): Delete pending_assemble_externals_set.
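
The pattern described in the ChangeLog entry can be sketched as follows.  All
names here are hypothetical stand-ins, not the real declarations in varasm.cc;
the sketch only shows why resetting the pointer after deletion lets a later
finalize step delete it unconditionally.

```cpp
#include <set>

// Hypothetical stand-in for a lazily allocated global set.
static std::set<int> *pending_set = nullptr;

// Allocate the set once per compilation (hypothetical init hook).
void init_once()
{
  pending_set = new std::set<int>;
}

// Consume the pending entries, then free the set.
void process_pending()
{
  if (!pending_set)
    return;
  // ... walk the entries here ...
  delete pending_set;
  pending_set = nullptr;  // reset so nothing sees a dangling pointer
}

// Finalize hook: safe whether or not process_pending () already ran.
void finalize()
{
  delete pending_set;     // deleting a null pointer is a no-op
  pending_set = nullptr;
}
```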

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-10-29 08:25:56 -04:00
Richard Biener
9999cc79e9 tree-optimization/117343 - decide_masked_load_lanes and stale graph
It turns out decide_masked_load_lanes accesses a stale SLP graph
so the following re-builds it instead.

	PR tree-optimization/117343
	* tree-vect-slp.cc (vect_optimize_slp_pass::build_vertices):
	Support re-building the SLP graph.
	(vect_optimize_slp_pass::run): Re-build the SLP graph before
	decide_masked_load_lanes.
2024-10-29 12:17:38 +01:00
Richard Biener
0e99b22aa6 tree-optimization/117333 - ICE with NULL access size DR
dr_may_alias_p ICEs when TYPE_SIZE of DR->ref is NULL but this is
valid IL when the access size of an aggregate copy can be inferred
from the RHS.

	PR tree-optimization/117333
	* tree-data-ref.cc (dr_may_alias_p): Guard against NULL
	access size.

	* gcc.dg/torture/pr117333.c: New testcase.
2024-10-29 11:25:07 +01:00
Jakub Jelinek
5e247ac0c2 libstdc++: Use if consteval rather than if (std::__is_constant_evaluated()) for {,b}float16_t nextafter [PR117321]
The nextafter_c++23.cc testcase fails to link at -O0.
The problem is that even though std::__is_constant_evaluated() has
always_inline attribute, that at -O0 just means that we inline the
call, but its result is still assigned to a temporary which is tested
later, nothing at -O0 propagates that false into the if and optimizes
away the if body.  And the __builtin_nextafterf16{,b} calls are meant
to be used solely for constant evaluation, the C libraries don't
define nextafterf16 these days.

As __STDCPP_FLOAT16_T__ and __STDCPP_BFLOAT16_T__ are predefined right
now only by GCC, not by clang which doesn't implement the extended floating
point types paper, and as they are predefined in C++23 and later modes only,
I think we can just use if consteval which is folded already during the FE
and the body isn't included even at -O0.  I've added a feature test for
that just in case clang implements those and implements those in some weird
way.  Note, if (__builtin_is_constant_evaluated()) would work correctly too,
that is also folded to false at gimplification time and the corresponding
if block not emitted at all.  But for -O0 it can't be wrapped into a helper
inline function.
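
A small sketch of the alternative mentioned in the note above, with the
builtin tested directly in the if rather than through a helper.  pick_path
and pick_path_at_runtime are hypothetical names used only for illustration;
the library change itself uses if consteval, which needs C++23 and is not
shown here.

```cpp
// Hypothetical function, used only to show where the check is written.
constexpr int pick_path(int x)
{
  if (__builtin_is_constant_evaluated())  // tested directly, no wrapper
    return x * 2;                         // constant-evaluation path
  return x * 3;                           // run-time path
}

static_assert(pick_path(2) == 4, "constant evaluation takes the first branch");

// The volatile read keeps the call out of constant evaluation, so the
// run-time branch is the one that executes.
int pick_path_at_runtime(int x)
{
  volatile int v = x;
  int r = pick_path(v);
  return r;
}
```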

2024-10-29  Jakub Jelinek  <jakub@redhat.com>

	PR libstdc++/117321
	* include/c_global/cmath (nextafter(_Float16, _Float16)): Use
	if consteval rather than if (std::__is_constant_evaluated()) around
	the __builtin_nextafterf16 call.
	(nextafter(__gnu_cxx::__bfloat16_t, __gnu_cxx::__bfloat16_t)): Use
	if consteval rather than if (std::__is_constant_evaluated()) around
	the __builtin_nextafterf16b call.
	* testsuite/26_numerics/headers/cmath/117321.cc: New test.
2024-10-29 11:14:12 +01:00
Marc Poulhiès
61977b8af0 ada: Fix static_assert with one argument
Single argument static_assert is C++17 only and breaks the build using
older GCC (prerequisite is C++14).
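
For reference, the difference between the two forms, using a hypothetical
type rather than the actual check in types.h:

```cpp
#include <cstdint>

// Hypothetical type; stands in for whatever the header checks.
using Example_Id = std::int32_t;

// Two-argument form: accepted since C++11, so it builds as C++14.
static_assert(sizeof(Example_Id) == 4, "Example_Id must be 32 bits");

// Single-argument form: only valid from C++17 on, so it is left commented out.
// static_assert(sizeof(Example_Id) == 4);
```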

gcc/ada

	* types.h: fix static_assert.
2024-10-29 11:08:22 +01:00
Alfie Richards
63b6967b06 arm: [MVE intrinsics] Rework MVE vld/vst intrinsics
Implement the mve vld and vst intrinsics using the MVE builtins framework.

The main part of the patch is to reimplement the vstr/vldr patterns
such that we now have far fewer of them:
- non-truncating stores
- predicated non-truncating stores
- truncating stores
- predicated truncating stores
- non-extending loads
- predicated non-extending loads
- extending loads
- predicated extending loads

This enables us to update the implementation of vld1/vst1 and use the
new vldr/vstr builtins.

The patch also adds support for the predicated vld1/vst1 versions.

gcc.target/arm/pr112337.c needs an update, to call the intrinsic
instead of the builtin, which this patch deletes.

2024-09-11  Alfie Richards  <Alfie.Richards@arm.com>
	    Christophe Lyon  <christophe.lyon@arm.com>

	gcc/

	* config/arm/arm-mve-builtins-base.cc (vld1q_impl): Add support
	for predicated version.
	(vst1q_impl): Likewise.
	(vstrq_impl): New class.
	(vldrq_impl): New class.
	(vldrbq): New.
	(vldrhq): New.
	(vldrwq): New.
	(vstrbq): New.
	(vstrhq): New.
	(vstrwq): New.
	* config/arm/arm-mve-builtins-base.def (vld1q): Add predicated
	version.
	(vldrbq): New.
	(vldrhq): New.
	(vldrwq): New.
	(vst1q): Add predicated version.
	(vstrbq): New.
	(vstrhq): New.
	(vstrwq): New.
	(vrev32q): Update types to float_16.
	* config/arm/arm-mve-builtins-base.h (vldrbq): New.
	(vldrhq): New.
	(vldrwq): New.
	(vstrbq): New.
	(vstrhq): New.
	(vstrwq): New.
	* config/arm/arm-mve-builtins-functions.h (memory_vector_mode):
	Remove conversion of floating point vectors to integer.
	* config/arm/arm-mve-builtins.cc (TYPES_float16): Change to...
	(TYPES_float_16): ...this.
	(TYPES_float_32): New.
	(float16): Change to...
	(float_16): ...this.
	(float_32): New.
	(preds_z_or_none): New.
	(function_resolver::check_gp_argument): Add support for _z
	predicate.
	* config/arm/arm_mve.h (vstrbq): Remove.
	(vstrbq_p): Likewise.
	(vstrhq): Likewise.
	(vstrhq_p): Likewise.
	(vstrwq): Likewise.
	(vstrwq_p): Likewise.
	(vst1q_p): Likewise.
	(vld1q_z): Likewise.
	(vldrbq_s8): Likewise.
	(vldrbq_u8): Likewise.
	(vldrbq_s16): Likewise.
	(vldrbq_u16): Likewise.
	(vldrbq_s32): Likewise.
	(vldrbq_u32): Likewise.
	(vstrbq_s8): Likewise.
	(vstrbq_s32): Likewise.
	(vstrbq_s16): Likewise.
	(vstrbq_u8): Likewise.
	(vstrbq_u32): Likewise.
	(vstrbq_u16): Likewise.
	(vstrbq_p_s8): Likewise.
	(vstrbq_p_s32): Likewise.
	(vstrbq_p_s16): Likewise.
	(vstrbq_p_u8): Likewise.
	(vstrbq_p_u32): Likewise.
	(vstrbq_p_u16): Likewise.
	(vldrbq_z_s16): Likewise.
	(vldrbq_z_u8): Likewise.
	(vldrbq_z_s8): Likewise.
	(vldrbq_z_s32): Likewise.
	(vldrbq_z_u16): Likewise.
	(vldrbq_z_u32): Likewise.
	(vldrhq_s32): Likewise.
	(vldrhq_s16): Likewise.
	(vldrhq_u32): Likewise.
	(vldrhq_u16): Likewise.
	(vldrhq_z_s32): Likewise.
	(vldrhq_z_s16): Likewise.
	(vldrhq_z_u32): Likewise.
	(vldrhq_z_u16): Likewise.
	(vldrwq_s32): Likewise.
	(vldrwq_u32): Likewise.
	(vldrwq_z_s32): Likewise.
	(vldrwq_z_u32): Likewise.
	(vldrhq_f16): Likewise.
	(vldrhq_z_f16): Likewise.
	(vldrwq_f32): Likewise.
	(vldrwq_z_f32): Likewise.
	(vstrhq_f16): Likewise.
	(vstrhq_s32): Likewise.
	(vstrhq_s16): Likewise.
	(vstrhq_u32): Likewise.
	(vstrhq_u16): Likewise.
	(vstrhq_p_f16): Likewise.
	(vstrhq_p_s32): Likewise.
	(vstrhq_p_s16): Likewise.
	(vstrhq_p_u32): Likewise.
	(vstrhq_p_u16): Likewise.
	(vstrwq_f32): Likewise.
	(vstrwq_s32): Likewise.
	(vstrwq_u32): Likewise.
	(vstrwq_p_f32): Likewise.
	(vstrwq_p_s32): Likewise.
	(vstrwq_p_u32): Likewise.
	(vst1q_p_u8): Likewise.
	(vst1q_p_s8): Likewise.
	(vld1q_z_u8): Likewise.
	(vld1q_z_s8): Likewise.
	(vst1q_p_u16): Likewise.
	(vst1q_p_s16): Likewise.
	(vld1q_z_u16): Likewise.
	(vld1q_z_s16): Likewise.
	(vst1q_p_u32): Likewise.
	(vst1q_p_s32): Likewise.
	(vld1q_z_u32): Likewise.
	(vld1q_z_s32): Likewise.
	(vld1q_z_f16): Likewise.
	(vst1q_p_f16): Likewise.
	(vld1q_z_f32): Likewise.
	(vst1q_p_f32): Likewise.
	(__arm_vstrbq_s8): Likewise.
	(__arm_vstrbq_s32): Likewise.
	(__arm_vstrbq_s16): Likewise.
	(__arm_vstrbq_u8): Likewise.
	(__arm_vstrbq_u32): Likewise.
	(__arm_vstrbq_u16): Likewise.
	(__arm_vldrbq_s8): Likewise.
	(__arm_vldrbq_u8): Likewise.
	(__arm_vldrbq_s16): Likewise.
	(__arm_vldrbq_u16): Likewise.
	(__arm_vldrbq_s32): Likewise.
	(__arm_vldrbq_u32): Likewise.
	(__arm_vstrbq_p_s8): Likewise.
	(__arm_vstrbq_p_s32): Likewise.
	(__arm_vstrbq_p_s16): Likewise.
	(__arm_vstrbq_p_u8): Likewise.
	(__arm_vstrbq_p_u32): Likewise.
	(__arm_vstrbq_p_u16): Likewise.
	(__arm_vldrbq_z_s8): Likewise.
	(__arm_vldrbq_z_s32): Likewise.
	(__arm_vldrbq_z_s16): Likewise.
	(__arm_vldrbq_z_u8): Likewise.
	(__arm_vldrbq_z_u32): Likewise.
	(__arm_vldrbq_z_u16): Likewise.
	(__arm_vldrhq_s32): Likewise.
	(__arm_vldrhq_s16): Likewise.
	(__arm_vldrhq_u32): Likewise.
	(__arm_vldrhq_u16): Likewise.
	(__arm_vldrhq_z_s32): Likewise.
	(__arm_vldrhq_z_s16): Likewise.
	(__arm_vldrhq_z_u32): Likewise.
	(__arm_vldrhq_z_u16): Likewise.
	(__arm_vldrwq_s32): Likewise.
	(__arm_vldrwq_u32): Likewise.
	(__arm_vldrwq_z_s32): Likewise.
	(__arm_vldrwq_z_u32): Likewise.
	(__arm_vstrhq_s32): Likewise.
	(__arm_vstrhq_s16): Likewise.
	(__arm_vstrhq_u32): Likewise.
	(__arm_vstrhq_u16): Likewise.
	(__arm_vstrhq_p_s32): Likewise.
	(__arm_vstrhq_p_s16): Likewise.
	(__arm_vstrhq_p_u32): Likewise.
	(__arm_vstrhq_p_u16): Likewise.
	(__arm_vstrwq_s32): Likewise.
	(__arm_vstrwq_u32): Likewise.
	(__arm_vstrwq_p_s32): Likewise.
	(__arm_vstrwq_p_u32): Likewise.
	(__arm_vst1q_p_u8): Likewise.
	(__arm_vst1q_p_s8): Likewise.
	(__arm_vld1q_z_u8): Likewise.
	(__arm_vld1q_z_s8): Likewise.
	(__arm_vst1q_p_u16): Likewise.
	(__arm_vst1q_p_s16): Likewise.
	(__arm_vld1q_z_u16): Likewise.
	(__arm_vld1q_z_s16): Likewise.
	(__arm_vst1q_p_u32): Likewise.
	(__arm_vst1q_p_s32): Likewise.
	(__arm_vld1q_z_u32): Likewise.
	(__arm_vld1q_z_s32): Likewise.
	(__arm_vldrwq_f32): Likewise.
	(__arm_vldrwq_z_f32): Likewise.
	(__arm_vldrhq_z_f16): Likewise.
	(__arm_vldrhq_f16): Likewise.
	(__arm_vstrwq_p_f32): Likewise.
	(__arm_vstrwq_f32): Likewise.
	(__arm_vstrhq_f16): Likewise.
	(__arm_vstrhq_p_f16): Likewise.
	(__arm_vld1q_z_f16): Likewise.
	(__arm_vst1q_p_f16): Likewise.
	(__arm_vld1q_z_f32): Likewise.
	(__arm_vst2q_f32): Likewise.
	(__arm_vst1q_p_f32): Likewise.
	(__arm_vstrbq): Likewise.
	(__arm_vstrbq_p): Likewise.
	(__arm_vstrhq): Likewise.
	(__arm_vstrhq_p): Likewise.
	(__arm_vstrwq): Likewise.
	(__arm_vstrwq_p): Likewise.
	(__arm_vst1q_p): Likewise.
	(__arm_vld1q_z): Likewise.
	* config/arm/arm_mve_builtins.def:
	(vstrbq_s): Delete.
	(vstrbq_u): Likewise.
	(vldrbq_s): Likewise.
	(vldrbq_u): Likewise.
	(vstrbq_p_s): Likewise.
	(vstrbq_p_u): Likewise.
	(vldrbq_z_s): Likewise.
	(vldrbq_z_u): Likewise.
	(vld1q_u): Likewise.
	(vld1q_s): Likewise.
	(vldrhq_z_u): Likewise.
	(vldrhq_u): Likewise.
	(vldrhq_z_s): Likewise.
	(vldrhq_s): Likewise.
	(vld1q_f): Likewise.
	(vldrhq_f): Likewise.
	(vldrhq_z_f): Likewise.
	(vldrwq_f): Likewise.
	(vldrwq_s): Likewise.
	(vldrwq_u): Likewise.
	(vldrwq_z_f): Likewise.
	(vldrwq_z_s): Likewise.
	(vldrwq_z_u): Likewise.
	(vst1q_u): Likewise.
	(vst1q_s): Likewise.
	(vstrhq_p_u): Likewise.
	(vstrhq_u): Likewise.
	(vstrhq_p_s): Likewise.
	(vstrhq_s): Likewise.
	(vst1q_f): Likewise.
	(vstrhq_f): Likewise.
	(vstrhq_p_f): Likewise.
	(vstrwq_f): Likewise.
	(vstrwq_s): Likewise.
	(vstrwq_u): Likewise.
	(vstrwq_p_f): Likewise.
	(vstrwq_p_s): Likewise.
	(vstrwq_p_u): Likewise.
	* config/arm/iterators.md (MVE_w_narrow_TYPE): New iterator.
	(MVE_w_narrow_type): New iterator.
	(MVE_wide_n_TYPE): New attribute.
	(MVE_wide_n_type): New attribute.
	(MVE_wide_n_sz_elem): New attribute.
	(MVE_wide_n_VPRED): New attribute.
	(MVE_elem_ch): New attribute.
	(supf): Remove VSTRBQ_S, VSTRBQ_U, VLDRBQ_S, VLDRBQ_U, VLD1Q_S,
	VLD1Q_U, VLDRHQ_S, VLDRHQ_U, VLDRWQ_S, VLDRWQ_U, VST1Q_S, VST1Q_U,
	VSTRHQ_S, VSTRHQ_U, VSTRWQ_S, VSTRWQ_U.
	(VSTRBQ, VLDRBQ, VLD1Q, VLDRHQ, VLDRWQ, VST1Q, VSTRHQ, VSTRWQ):
	Delete.
	* config/arm/mve.md (mve_vstrbq_<supf><mode>): Remove.
	(mve_vldrbq_<supf><mode>): Likewise.
	(mve_vstrbq_p_<supf><mode>): Likewise.
	(mve_vldrbq_z_<supf><mode>): Likewise.
	(mve_vldrhq_fv8hf): Likewise.
	(mve_vldrhq_<supf><mode>): Likewise.
	(mve_vldrhq_z_fv8hf): Likewise.
	(mve_vldrhq_z_<supf><mode>): Likewise.
	(mve_vldrwq_fv4sf): Likewise.
	(mve_vldrwq_<supf>v4si): Likewise.
	(mve_vldrwq_z_fv4sf): Likewise.
	(mve_vldrwq_z_<supf>v4si): Likewise.
	(@mve_vld1q_f<mode>): Likewise.
	(@mve_vld1q_<supf><mode>): Likewise.
	(mve_vstrhq_fv8hf): Likewise.
	(mve_vstrhq_p_fv8hf): Likewise.
	(mve_vstrhq_p_<supf><mode>): Likewise.
	(mve_vstrhq_<supf><mode>): Likewise.
	(mve_vstrwq_fv4sf): Likewise.
	(mve_vstrwq_p_fv4sf): Likewise.
	(mve_vstrwq_p_<supf>v4si): Likewise.
	(mve_vstrwq_<supf>v4si): Likewise.
	(@mve_vst1q_f<mode>): Likewise.
	(@mve_vst1q_<supf><mode>): Likewise.
	(@mve_vstrq_<mode>): New.
	(@mve_vstrq_p_<mode>): New.
	(@mve_vstrq_truncate_<mode>): New.
	(@mve_vstrq_p_truncate_<mode>): New.
	(@mve_vldrq_<mode>): New.
	(@mve_vldrq_z_<mode>): New.
	(@mve_vldrq_extend_<mode><US>): New.
	(@mve_vldrq_z_extend_<mode><US>): New.
	* config/arm/unspecs.md:
	(VSTRBQ_S): Remove.
	(VSTRBQ_U): Likewise.
	(VLDRBQ_S): Likewise.
	(VLDRBQ_U): Likewise.
	(VLD1Q_F): Likewise.
	(VLD1Q_S): Likewise.
	(VLD1Q_U): Likewise.
	(VLDRHQ_F): Likewise.
	(VLDRHQ_U): Likewise.
	(VLDRHQ_S): Likewise.
	(VLDRWQ_F): Likewise.
	(VLDRWQ_S): Likewise.
	(VLDRWQ_U): Likewise.
	(VSTRHQ_F): Likewise.
	(VST1Q_S): Likewise.
	(VST1Q_U): Likewise.
	(VSTRHQ_U): Likewise.
	(VSTRWQ_S): Likewise.
	(VSTRWQ_U): Likewise.
	(VSTRWQ_F): Likewise.
	(VST1Q_F): Likewise.
	(VLDRQ): New.
	(VLDRQ_Z): Likewise.
	(VLDRQ_EXT): Likewise.
	(VLDRQ_EXT_Z): Likewise.
	(VSTRQ): Likewise.
	(VSTRQ_P): Likewise.
	(VSTRQ_TRUNC): Likewise.
	(VSTRQ_TRUNC_P): Likewise.

	gcc/testsuite/
	* gcc.target/arm/pr112337.c: Call intrinsic instead of builtin.
2024-10-29 10:03:06 +01:00
Alfie Richards
16ee5c64e6 arm: [MVE intrinsics] Add support for predicated contiguous loads and stores
This patch extends
function_expander::use_contiguous_load_insn and
function_expander::use_contiguous_store_insn functions to
support predicated versions.

2024-09-11  Alfie Richards  <Alfie.Richards@arm.com>
	    Christophe Lyon  <christophe.lyon@arm.com>

	gcc/

	* config/arm/arm-mve-builtins.cc
	(function_expander::use_contiguous_load_insn): Add support for
	PRED_z.
	(function_expander::use_contiguous_store_insn): Add support for
	PRED_p.
2024-10-29 10:03:06 +01:00
Alfie Richards
52e36cde0f arm: [MVE intrinsics] Add load_extending and store_truncating function bases
This patch adds the load_extending and store_truncating function bases
for MVE intrinsics.

The constructors have parameters describing the memory element
type/width which is part of the function base name (e.g. "h" in
vldrhq).

2024-09-11  Alfie Richards <Alfie.Richards@arm.com>

	gcc/

	* config/arm/arm-mve-builtins-functions.h
	(load_extending): New class.
	(store_truncating): New class.
	* config/arm/arm-protos.h (arm_mve_data_mode): New helper function.
	* config/arm/arm.cc (arm_mve_data_mode): New helper function.
2024-10-29 10:03:06 +01:00
Alfie Richards
c31cdc3d85 arm: [MVE intrinsics] Add load_ext intrinsic shape
This patch adds the extending load shape.
It also adds/fixes comments for the load and store shapes.

2024-09-11  Alfie Richards <Alfie.Richards@arm.com>
	    Christophe Lyon  <christophe.lyon@arm.com>

	gcc/
	* config/arm/arm-mve-builtins-shapes.cc:
	(load_ext): New.
	* config/arm/arm-mve-builtins-shapes.h:
	(load_ext): New.
2024-10-29 10:03:06 +01:00
Alfie Richards
3aca5aa0f0 arm: [MVE intrinsics] fix vst tests
The tests for vst* intrinsics use functions which return a void
expression which can generate a warning. This hasn't come up previously
as the inlining presumably prevents the warning.

This change removes the unnecessary and incorrect returns.

2024-09-11  Alfie Richards <Alfie.Richards@arm.com>

	gcc/testsuite/
	* gcc.target/arm/mve/intrinsics/vst1q_p_f16.c: Remove `return`.
	* gcc.target/arm/mve/intrinsics/vst1q_p_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst1q_p_s16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst1q_p_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst1q_p_s8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst1q_p_u16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst1q_p_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst1q_p_u8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst2q_f16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst2q_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst2q_s16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst2q_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst2q_s8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst2q_u16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst2q_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst2q_u8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst4q_f16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst4q_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst4q_s16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst4q_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst4q_s8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst4q_u16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst4q_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst4q_u8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_p_s16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_p_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_p_s8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_p_u16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_p_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_p_u8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_s16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_s8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_s16.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_s32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_s8.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_u16.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_u32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_u8.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_s16.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_s32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_s8.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_u16.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_u32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_u8.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_u16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrbq_u8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_p_s64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_p_u64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_s64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_u64.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_s64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_u64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_s64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_u64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrdq_scatter_offset_p_s64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrdq_scatter_offset_p_u64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrdq_scatter_offset_s64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrdq_scatter_offset_u64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrdq_scatter_shifted_offset_p_s64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrdq_scatter_shifted_offset_p_u64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrdq_scatter_shifted_offset_s64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrdq_scatter_shifted_offset_u64.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_f16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_p_f16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_p_s16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_p_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_p_u16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_p_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_s16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_f16.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_p_f16.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_p_s16.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_p_s32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_p_u16.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_p_u32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_s16.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_s32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_u16.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_offset_u32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_f16.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_p_f16.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_p_s16.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_p_s32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_p_u16.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_p_u32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_s16.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_s32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_u16.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_scatter_shifted_offset_u32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_u16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrhq_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_p_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_p_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_p_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_f32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_p_f32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_p_s32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_p_u32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_s32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_u32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_u32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_f32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_p_f32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_p_s32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_p_u32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_s32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_u32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_f32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_p_f32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_p_s32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_p_u32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_s32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_u32.c:
	Likewise.
	* gcc.target/arm/mve/intrinsics/vstrwq_u32.c: Likewise.
2024-10-29 10:03:06 +01:00
Jakub Jelinek
972f653cad c: Add __builtin_stdc_rotate_{left,right} builtins [PR117030]
I believe the new C2Y <stdbit.h> type-generic functions
stdc_rotate_{left,right} have the same problems the other stdc_*
type-generic functions had.  If we want to support arbitrary
unsigned _BitInt(N), don't want to use statement expressions
(so that one can actually use them in static variable initializers),
don't want to evaluate the arguments multiple times and don't want
to expand the arguments multiple times during preprocessing to avoid the
old tgmath preprocessing bloat, we need a built-in for those.

The following patch adds those.  And as we need to support rotations by 0
and tree-ssa-forwprop.cc is only able to pattern recognize with BIT_AND_EXPR
for that case (i.e. for power of two widths), the patch just constructs
LROTATE_EXPR/RROTATE_EXPR right away.  Negative second arguments are
considered UB, while positive ones are modulo precision.
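
As a rough illustration of the semantics described above (not the builtin itself, and not how GCC expands it), the following portable C sketch rotates a 32-bit value with the count reduced modulo the width.  The helper names rotl32, rotr32 and rotate_self_check are hypothetical.  The masking with 31 is what keeps a rotation by 0 well defined; it only works because 32 is a power of two, which matches the BIT_AND_EXPR remark above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rotate X left by N bits, N taken modulo 32.  */
static uint32_t
rotl32 (uint32_t x, unsigned n)
{
  n &= 31u;
  /* The second mask avoids an undefined shift by 32 when N is 0.  */
  return (x << n) | (x >> ((32u - n) & 31u));
}

/* Hypothetical helper: rotate X right by N bits, N taken modulo 32.  */
static uint32_t
rotr32 (uint32_t x, unsigned n)
{
  n &= 31u;
  return (x >> n) | (x << ((32u - n) & 31u));
}

/* Small self-check of the properties named in the text.  */
static void
rotate_self_check (void)
{
  assert (rotl32 (0x80000001u, 0) == 0x80000001u);	/* rotation by 0 */
  assert (rotl32 (0x80000001u, 1) == 0x00000003u);
  assert (rotl32 (0x80000001u, 33) == rotl32 (0x80000001u, 1));	/* modulo */
  assert (rotr32 (0x00000003u, 1) == 0x80000001u);
}
```

Unlike these helpers, the builtins are type-generic and also cover unsigned _BitInt(N) of arbitrary width, where the masking trick does not apply.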

2024-10-29  Jakub Jelinek  <jakub@redhat.com>

	PR c/117030
gcc/
	* doc/extend.texi (__builtin_stdc_rotate_left,
	__builtin_stdc_rotate_right): Document.
gcc/c-family/
	* c-common.cc (c_common_reswords): Add __builtin_stdc_rotate_left
	and __builtin_stdc_rotate_right.
	* c-ubsan.cc (ubsan_instrument_shift): For {L,R}ROTATE_EXPR
	just check if op1 is negative.
gcc/c/
	* c-parser.cc: Include asan.h and c-family/c-ubsan.h.
	(c_parser_postfix_expression): Handle __builtin_stdc_rotate_left
	and __builtin_stdc_rotate_right.
	* c-fold.cc (c_fully_fold_internal): Handle LROTATE_EXPR and
	RROTATE_EXPR.
gcc/testsuite/
	* gcc.dg/builtin-stdc-rotate-1.c: New test.
	* gcc.dg/builtin-stdc-rotate-2.c: New test.
	* gcc.dg/ubsan/builtin-stdc-rotate-1.c: New test.
	* gcc.dg/ubsan/builtin-stdc-rotate-2.c: New test.
2024-10-29 09:06:25 +01:00
GCC Administrator
87fa88222f Daily bump. 2024-10-29 00:18:25 +00:00
David Malcolm
a67594d181 testsuite: drop the "test-" prefix from sarif-output python scripts
Drop the "test-" prefix from the various gcc.dg/sarif-output/test-*.py
scripts so that the scripts are close to the .c files they are used by
when the files are sorted by name.

gcc/testsuite/ChangeLog:
	* gcc.dg/sarif-output/test-bad-pragma.py: Rename to...
	* gcc.dg/sarif-output/bad-pragma.py: ...this.
	* gcc.dg/sarif-output/bad-pragma.c: Update for script renaming.
	* gcc.dg/sarif-output/test-include-chain-1.py: Rename to...
	* gcc.dg/sarif-output/include-chain-1.py: ...this.
	* gcc.dg/sarif-output/include-chain-1.c: Update for script renaming.
	* gcc.dg/sarif-output/test-include-chain-2.py: Rename to...
	* gcc.dg/sarif-output/include-chain-2.py: ...this.
	* gcc.dg/sarif-output/include-chain-2.c: Update for script renaming.
	* gcc.dg/sarif-output/test-missing-semicolon.py: Rename to...
	* gcc.dg/sarif-output/missing-semicolon.py: ...this.
	* gcc.dg/sarif-output/missing-semicolon.c: Update for script renaming.
	* gcc.dg/sarif-output/test-no-diagnostics.py: Rename to...
	* gcc.dg/sarif-output/no-diagnostics.py: ...this.
	* gcc.dg/sarif-output/no-diagnostics.c: Update for script renaming.
	* gcc.dg/sarif-output/test-werror.py: Rename to...
	* gcc.dg/sarif-output/werror.py: ...this.
	* gcc.dg/sarif-output/werror.c: Update for script renaming.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-10-28 18:43:11 -04:00
Andrew Pinski
e20ced2cb8 testcase: Add testcase for PR 117330 [PR117330]
This testcase was causing an ICE during vectorization
due to r15-4695-gd17e672ce82e69 but was fixed with
r15-4713-g0942bb85fc5573.

Pushed as obvious after a quick test on x86_64-linux-gnu to
make sure the testcase passes.

	PR tree-optimization/117330

gcc/testsuite/ChangeLog:

	* gcc.dg/torture/pr117330-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-10-28 13:32:05 -07:00
Dimitar Dimitrov
6638fcc161 testsuite: Require atomic operations for pr47333_0
Since the test uses __sync_fetch_and_add, add a requirement for
target to support atomic operations on int and long types.

This fixes a spurious test failure on pru-unknown-elf, which lacks
atomic ops. The test still passes on x86_64-linux-gnu.
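
For context, a minimal sketch of the kind of call the test relies on.  __sync_fetch_and_add is a real GCC builtin; the wrapper names fetch_and_bump and bump_self_check are hypothetical, and the actual test is a C++ LTO test rather than this C fragment.  On a target without atomic support the builtin may be lowered to a call to an external __sync_* routine that nothing provides, which is one way such a test can fail spuriously.

```c
#include <assert.h>

/* Hypothetical wrapper: atomically add 1 to *COUNTER and return the
   value it held before the addition.  */
static int
fetch_and_bump (int *counter)
{
  return __sync_fetch_and_add (counter, 1);
}

/* Small self-check: the old value is returned, the new one is stored.  */
static void
bump_self_check (void)
{
  int c = 41;
  assert (fetch_and_bump (&c) == 41);
  assert (c == 42);
}
```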

gcc/testsuite/ChangeLog:

	* g++.dg/lto/pr47333_0.C: Require target that supports atomic
	operations on int and long types.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2024-10-28 22:11:30 +02:00
Sam James
ca078d260a gcc: fix 'statements' comment typo
gcc/ChangeLog:

	* opts-common.cc (prune_options): Fix typo.
2024-10-28 18:24:21 +00:00
Sam James
4e09ae37db testsuite: add testcase for fixed PR107467
PR107467 ended up being fixed by the fix for PR115110, but let's
add the testcase on top.

gcc/testsuite/ChangeLog:
	PR tree-optimization/107467
	PR middle-end/115110

	* g++.dg/lto/pr107467_0.C: New test.
2024-10-28 17:05:24 +00:00
Andrew MacLeod
1c0ecf06c0 Fix bitwise_or logic for prange.
Set non-zero only if at least one of the two operands does not contain zero.
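
The rule can be sketched with a toy model, assuming a range that tracks nothing but whether zero is a possible value.  struct toy_range and toy_bitwise_or are hypothetical names and are not the prange API in range-op-ptr.cc.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a range: records only whether the value
   may be zero.  */
struct toy_range
{
  bool may_be_zero;
};

/* A | B is zero only when both A and B are zero, so the result is
   known non-zero as soon as one operand excludes zero.  */
static struct toy_range
toy_bitwise_or (struct toy_range a, struct toy_range b)
{
  struct toy_range r;
  r.may_be_zero = a.may_be_zero && b.may_be_zero;
  return r;
}

/* Small self-check covering the four operand combinations.  */
static void
toy_or_self_check (void)
{
  struct toy_range z = { true }, nz = { false };
  assert (!toy_bitwise_or (nz, z).may_be_zero);
  assert (!toy_bitwise_or (z, nz).may_be_zero);
  assert (!toy_bitwise_or (nz, nz).may_be_zero);
  assert (toy_bitwise_or (z, z).may_be_zero);
}
```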

	* range-op-ptr.cc (operator_bitwise_or::fold_range): Fix logic
	for setting nonzero.
2024-10-28 12:46:55 -04:00
Kyrylo Tkachov
7c0e4963d5 aarch64: Use implementation namespace for vxarq_u64 immediate argument
Looks like this immediate variable was missed out when I last fixed the
namespace issues in arm_neon.h.  Fixed in the obvious manner.
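
A sketch of why the rename matters, assuming a user who defines a macro called imm6 before the header-style function is seen.  toy_xar is a hypothetical scalar stand-in (XOR, then rotate right), not the real vxarq_u64 intrinsic; the double-underscore names imitate what a system header does and should not be used in ordinary user code.

```c
#include <assert.h>
#include <stdint.h>

/* User code may legally define this macro before including a header.
   A parameter spelled plain "imm6" below would be rewritten to "63"
   by the preprocessor and fail to compile.  */
#define imm6 63

/* Header-style function: reserved names are out of reach of user
   macros.  Hypothetical scalar model: XOR the operands, then rotate
   right by the immediate.  */
static inline uint64_t
toy_xar (uint64_t __a, uint64_t __b, const int __imm6)
{
  uint64_t __x = __a ^ __b;
  unsigned __n = (unsigned) __imm6 & 63u;
  return __n ? (__x >> __n) | (__x << (64u - __n)) : __x;
}

/* Small self-check of the toy model.  */
static void
toy_xar_self_check (void)
{
  assert (toy_xar (1, 0, 1) == UINT64_C (0x8000000000000000));
  assert (toy_xar (5, 5, 7) == 0);
  assert (toy_xar (0xf0, 0x0f, 0) == 0xff);
}
```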

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>

	* config/aarch64/arm_neon.h (vxarq_u64): Rename imm6 to __imm6.
2024-10-28 16:28:18 +01:00
Jonathan Wakely
e320846fec libstdc++: Fix tests for std::vector range operations
The commit I pushed was not the one I'd tested, so it had older versions
of the tests, with bugs that I'd already fixed locally. This commit has
the fixed tests that I'd intended to push in the first place.

libstdc++-v3/ChangeLog:

	* testsuite/23_containers/vector/bool/cons/from_range.cc: Use
	dg-do run instead of compile.
	(test_ranges): Use do_test instead of do_test_a for rvalue
	range.
	(test_constexpr): Call function template instead of just
	instantiating it.
	* testsuite/23_containers/vector/bool/modifiers/assign/assign_range.cc:
	Use dg-do run instead of compile.
	(do_test): Use same test logic for vector<bool> as for primary
	template.
	(test_constexpr): Call function template instead of just
	instantiating it.
	* testsuite/23_containers/vector/bool/modifiers/insert/append_range.cc:
	Use dg-do run instead of compile.
	(test_ranges): Use do_test instead of do_test_a for rvalue
	range.
	(test_constexpr): Call function template instead of just
	instantiating it.
	* testsuite/23_containers/vector/bool/modifiers/insert/insert_range.cc:
	Use dg-do run instead of compile.
	(do_test): Fix incorrect function arguments to match intended
	results.
	(test_ranges): Use do_test instead of do_test_a for rvalue
	range.
	(test_constexpr): Call function template instead of just
	instantiating it.
	* testsuite/23_containers/vector/cons/from_range.cc: Use dg-do
	run instead of compile.
	(test_ranges): Fix ill-formed call to do_test.
	(test_constexpr): Call function template instead of just
	instantiating it.
	* testsuite/23_containers/vector/modifiers/append_range.cc:
	Use dg-do run instead of compile.
	(test_constexpr): Likewise.
	* testsuite/23_containers/vector/modifiers/assign/assign_range.cc:
	Use dg-do run instead of compile.
	(do_test): Do not reuse input ranges.
	(test_constexpr): Call function template instead of just
	instantiating it.
	* testsuite/23_containers/vector/modifiers/insert/insert_range.cc:
	Use dg-do run instead of compile.
	(do_test): Fix incorrect function arguments to match intended
	results.
	(test_constexpr): Call function template instead of just
	instantiating it.
2024-10-28 13:52:53 +00:00