Kyrylo Tkachov 19757e1c28
aarch64: Optimize vector rotates as vector permutes where possible
Some vector rotate operations can be implemented in a single instruction
rather than using the fallback SHL+USRA sequence.
In particular, when the rotate amount is half the bitwidth of the element
we can use a REV64,REV32,REV16 instruction.
More generally, rotates by a byte amount can be implemented using vector
permutes.
This patch adds such a generic routine in expmed.cc called
expand_rotate_as_vec_perm that calculates the required permute indices
and uses the expand_vec_perm_const interface.
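
To illustrate the idea only (this is not the code from expmed.cc), here is a
minimal scalar sketch in C. The helper names rotate_perm_indices and
rotate_matches_permute are hypothetical, and the sketch assumes 32-bit
elements stored little-endian in a 16-byte vector. It computes the byte
permute indices for a left rotate by a whole number of bytes and compares
the permuted bytes against an ordinary scalar rotate.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar reference: rotate left by AMOUNT bits, with 0 < AMOUNT < 32.  */
static uint32_t rotl32(uint32_t x, unsigned amount)
{
    return (x << amount) | (x >> (32 - amount));
}

/* Store and load a 32-bit value in little-endian byte order, so the
   sketch behaves the same regardless of host endianness.  */
static void store_le32(uint8_t *p, uint32_t x)
{
    for (unsigned i = 0; i < 4; i++)
        p[i] = (uint8_t) (x >> (8 * i));
}

static uint32_t load_le32(const uint8_t *p)
{
    uint32_t x = 0;
    for (unsigned i = 0; i < 4; i++)
        x |= (uint32_t) p[i] << (8 * i);
    return x;
}

/* Hypothetical helper (not the GCC routine): fill SEL with byte indices
   that rotate every ELT_BYTES-wide element of an NBYTES-wide vector left
   by ROT_BYTES bytes.  Assumes little-endian bytes within each element.  */
static void rotate_perm_indices(uint8_t *sel, unsigned nbytes,
                                unsigned elt_bytes, unsigned rot_bytes)
{
    assert(rot_bytes < elt_bytes);
    for (unsigned i = 0; i < nbytes; i++) {
        unsigned base = i - i % elt_bytes;  /* first byte of this element */
        unsigned pos = i % elt_bytes;       /* byte position inside it */
        /* Result byte POS comes from source byte (POS - ROT_BYTES),
           wrapping around within the element.  */
        sel[i] = (uint8_t) (base + (pos + elt_bytes - rot_bytes) % elt_bytes);
    }
}

/* Return 1 if permuting bytes gives the same result as rotating each
   32-bit lane by 8 * ROT_BYTES bits, 0 otherwise.  ROT_BYTES is 1..3.  */
static int rotate_matches_permute(unsigned rot_bytes)
{
    const uint32_t lanes[4] = { 0x11223344u, 0xdeadbeefu, 1u, 0x80000000u };
    uint8_t in[16], out[16], sel[16];

    for (unsigned l = 0; l < 4; l++)
        store_le32(in + 4 * l, lanes[l]);

    rotate_perm_indices(sel, 16, 4, rot_bytes);
    for (unsigned i = 0; i < 16; i++)
        out[i] = in[sel[i]];                /* the byte permute itself */

    for (unsigned l = 0; l < 4; l++)
        if (load_le32(out + 4 * l) != rotl32(lanes[l], 8 * rot_bytes))
            return 0;
    return 1;
}
```

For rot_bytes equal to 2 the indices of the first element come out as
2, 3, 0, 1, that is, the two 16-bit halves of each 32-bit element are
swapped, which is the half-bitwidth case described above.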

On aarch64 this ends up generating the single-instruction sequences above
where possible and can use LDR+TBL sequences too, which are a good choice.

With help from Richard, the routine should be VLA-safe.
However, the only use of expand_rotate_as_vec_perm introduced in this patch
is in aarch64-specific code that for now only handles fixed-width modes.

A runtime aarch64 test is added to ensure the permute indices are not messed
up.

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>

gcc/

	* expmed.h (expand_rotate_as_vec_perm): Declare.
	* expmed.cc (expand_rotate_as_vec_perm): Define.
	* config/aarch64/aarch64-protos.h (aarch64_emit_opt_vec_rotate):
	Declare prototype.
	* config/aarch64/aarch64.cc (aarch64_emit_opt_vec_rotate): Implement.
	* config/aarch64/aarch64-simd.md (*aarch64_simd_rotate_imm<mode>):
	Call the above.

gcc/testsuite/

	* gcc.target/aarch64/vec-rot-exec.c: New test.
	* gcc.target/aarch64/simd/pr117048_2.c: New test.
2024-11-04 09:41:09 +01:00

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.