Roger Sayle 79649a5dcd PR target/106060: Improved SSE vector constant materialization on x86.
This patch resolves PR target/106060 by providing efficient methods for
materializing/synthesizing special "vector" constants on x86.  Currently
there are three methods of materializing a vector constant; the most
general is to load a vector from the constant pool, secondly "duplicated"
constants can be synthesized by moving an integer between units and
broadcasting (or shuffling) it, and finally the special cases of the
all-zeros vector and the all-ones vector can be loaded via a single SSE
instruction.  This patch handles additional cases that can be synthesized
in two instructions, loading an all-ones vector followed by another SSE
instruction.  Following my recent patch for PR target/112992, there's
conveniently a single place in i386-expand.cc where these special cases
can be handled.

Two examples are given in the original bugzilla PR for 106060.

__m256i should_be_cmpeq_abs ()
{
  return _mm256_set1_epi8 (1);
}

is now generated (with -O3 -march=x86-64-v3) as:

        vpcmpeqd        %ymm0, %ymm0, %ymm0
        vpabsb  %ymm0, %ymm0
        ret

and

__m256i should_be_cmpeq_add ()
{
  return _mm256_set1_epi8 (-2);
}

is now generated as:

        vpcmpeqd        %ymm0, %ymm0, %ymm0
        vpaddb  %ymm0, %ymm0, %ymm0
        ret
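
The lane arithmetic behind both sequences can be checked with a small
portable sketch.  This is not the compiler's code: the helper names
lane_abs, lane_add and lane_synthesis_holds are hypothetical, and each
helper models a single 8-bit lane of the vector instruction in plain C
rather than using intrinsics.

```c
#include <assert.h>
#include <stdint.h>

/* Models one 8-bit lane of vpabsb: absolute value of a signed byte.
   Hypothetical helper, for illustration only.  */
static uint8_t lane_abs (uint8_t x)
{
  int v = (x & 0x80) ? (int) x - 256 : (int) x;  /* sign-extend the byte */
  return (uint8_t) (v < 0 ? -v : v);
}

/* Models one 8-bit lane of vpaddb: byte addition that wraps modulo 256.  */
static uint8_t lane_add (uint8_t a, uint8_t b)
{
  return (uint8_t) (a + b);
}

/* Returns nonzero if the two-instruction sequences yield the expected
   lane values when started from one lane of the all-ones vector.  */
static int lane_synthesis_holds (void)
{
  const uint8_t ones = 0xff;  /* one lane of the vpcmpeqd result, i.e. -1 */
  return lane_abs (ones) == 0x01            /* |-1| == 1     */
         && lane_add (ones, ones) == 0xfe;  /* -1 + -1 == -2 */
}
```

Every lane of the all-ones vector holds -1, so taking the absolute value
gives 1 in each byte and adding the vector to itself gives -2 in each
byte, which matches the two constants requested above.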

2024-05-07  Roger Sayle  <roger@nextmovesoftware.com>
	    Hongtao Liu  <hongtao.liu@intel.com>

gcc/ChangeLog
	PR target/106060
	* config/i386/i386-expand.cc (enum ix86_vec_bcast_alg): New.
	(struct ix86_vec_bcast_map_simode_t): New type for table below.
	(ix86_vec_bcast_map_simode): Table of SImode constants that may
	be efficiently synthesized by a ix86_vec_bcast_alg method.
	(ix86_vec_bcast_map_simode_cmp): New comparator for bsearch.
	(ix86_vector_duplicate_simode_const): Efficiently synthesize
	V4SImode and V8SImode constants that duplicate special constants.
	(ix86_vector_duplicate_value): Attempt to synthesize "special"
	vector constants using ix86_vector_duplicate_simode_const.
	* config/i386/i386.cc (ix86_rtx_costs) <case ABS>: ABS of a
	vector integer mode costs with a single SSE instruction.
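
The table-plus-comparator arrangement described in the entries above can
be sketched as follows.  This is a simplified illustration under stated
assumptions, not the code in i386-expand.cc: the type, enumerator and
function names are hypothetical, and the table holds only the two
constants from the examples in this commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical method tags; the members of the real enum are not shown
   in this commit message.  */
enum bcast_alg { ALG_ABS_BYTE, ALG_ADD_BYTE };

struct bcast_entry
{
  uint32_t key;        /* 32-bit pattern duplicated across the vector */
  enum bcast_alg alg;  /* how to derive it from the all-ones vector */
};

/* Must stay sorted by key so that bsearch can be used.  */
static const struct bcast_entry bcast_table[] = {
  { 0x01010101u, ALG_ABS_BYTE },  /* bytes of 1: byte-wise abs of all-ones */
  { 0xfefefefeu, ALG_ADD_BYTE },  /* bytes of -2: all-ones plus all-ones */
};

/* Comparator for bsearch: first argument is the key, second an entry.  */
static int bcast_entry_cmp (const void *key, const void *entry)
{
  uint32_t k = *(const uint32_t *) key;
  uint32_t e = ((const struct bcast_entry *) entry)->key;
  return (k > e) - (k < e);
}

/* Returns the matching entry, or NULL if VALUE is not a special constant.  */
static const struct bcast_entry *bcast_lookup (uint32_t value)
{
  return bsearch (&value, bcast_table,
                  sizeof bcast_table / sizeof bcast_table[0],
                  sizeof bcast_table[0], bcast_entry_cmp);
}
```

A caller would fall back to the existing materialization methods whenever
bcast_lookup returns NULL.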

gcc/testsuite/ChangeLog
	PR target/106060
	* gcc.target/i386/auto-init-8.c: Update test case.
	* gcc.target/i386/avx512fp16-13.c: Likewise.
	* gcc.target/i386/pr100865-9a.c: Likewise.
	* gcc.target/i386/pr101796-1.c: Likewise.
	* gcc.target/i386/pr106060-1.c: New test case.
	* gcc.target/i386/pr106060-2.c: Likewise.
	* gcc.target/i386/pr106060-3.c: Likewise.
	* gcc.target/i386/pr70314.c: Update test case.
	* gcc.target/i386/vect-shiftv4qi.c: Likewise.
	* gcc.target/i386/vect-shiftv8qi.c: Likewise.
2024-05-07 07:16:58 +01:00
.github
c++tools
config Daily bump. 2024-04-17 00:18:45 +00:00
contrib Daily bump. 2024-05-07 00:18:28 +00:00
fixincludes
gcc PR target/106060: Improved SSE vector constant materialization on x86. 2024-05-07 07:16:58 +01:00
gnattools
gotools Daily bump. 2024-04-16 00:18:06 +00:00
include
INSTALL
libada
libatomic Daily bump. 2024-04-27 00:18:05 +00:00
libbacktrace Daily bump. 2024-05-04 00:16:30 +00:00
libcc1
libcody
libcpp Daily bump. 2024-05-01 00:17:56 +00:00
libdecnumber
libffi
libgcc Daily bump. 2024-05-07 00:18:28 +00:00
libgfortran Daily bump. 2024-05-07 00:18:28 +00:00
libgm2 Daily bump. 2024-05-03 00:17:26 +00:00
libgo runtime: dump registers on Solaris 2024-04-29 11:39:58 -07:00
libgomp Daily bump. 2024-05-03 00:17:26 +00:00
libgrust
libiberty
libitm
libobjc
libphobos
libquadmath
libsanitizer
libssp
libstdc++-v3 Daily bump. 2024-05-04 00:16:30 +00:00
libvtv
lto-plugin
maintainer-scripts Daily bump. 2024-04-27 00:18:05 +00:00
zlib
.dir-locals.el
.gitattributes
.gitignore
ABOUT-NLS
ar-lib
ChangeLog Daily bump. 2024-05-07 00:18:28 +00:00
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in config-ml.in: Fix multi-os-dir search 2024-05-06 12:08:28 +08:00
config.guess
config.rpath
config.sub
configure build: Use of cargo not yet supported here in Canadian cross configurations 2024-04-16 09:43:47 +02:00
configure.ac build: Use of cargo not yet supported here in Canadian cross configurations 2024-04-16 09:43:47 +02:00
COPYING
COPYING3
COPYING3.LIB
COPYING.LIB
COPYING.RUNTIME
depcomp
install-sh
libtool-ldflags
libtool.m4
lt~obsolete.m4
ltgcc.m4
ltmain.sh
ltoptions.m4
ltsugar.m4
ltversion.m4
MAINTAINERS
Makefile.def
Makefile.in
Makefile.tpl
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
SECURITY.txt
symlink-tree
test-driver
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.