Remove allocations which are used only for NULL pointer check and free

Extend tree-ssa-dce to remove memory allocations whose return value is
used only in a non-NULL check and then passed to free.

The new -fmalloc-dce flag controls malloc/free removal.  I ended up
copying what -fallocation-dce does: -fmalloc-dce=1 enables malloc/free
removal provided the return value is otherwise unused, and
-fmalloc-dce=2 additionally allows NULL pointer checks on the return
value, which are folded in the non-NULL direction.
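
As an illustration of the two levels, here is a minimal sketch of the kind of code each one is meant to cover.  The function names are made up for this example and do not come from the testsuite.

```c
#include <stddef.h>
#include <stdlib.h>

/* Candidate for -fmalloc-dce=1: the pointer returned by malloc is
   used only as the argument of free.  */
int
only_freed (size_t n)
{
  void *p = malloc (n);
  free (p);
  return 0;
}

/* Candidate for -fmalloc-dce=2: the pointer is additionally compared
   with NULL.  If the allocation is removed, the check is folded as if
   malloc had succeeded, so the function would always return 0.  */
int
checked_then_freed (size_t n)
{
  void *p = malloc (n);
  if (p == NULL)
    return 1;
  free (p);
  return 0;
}
```

Compiling this with -O2 -fdump-tree-optimized, once with the default and once with -fno-malloc-dce, should make the difference visible in the dump, assuming a compiler that contains this patch.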

I also added compensation for the gcc.dg/analyzer/pr101837.c testcase and
added a testcase checking that the std::nothrow variant of operator new is
now optimized away.

With -fmalloc-dce=n I can also add a level that emits a runtime check for
requests larger than half of the address space and for calloc overflow, if
that seems useful, but perhaps incrementally.  Adding size parameter
tracking is not that hard (I posted a WIP patch for that).
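
To make the idea of such a runtime check concrete, here is a rough sketch of the conditions it might test.  This is only my reading of "half of address space and calloc overflow"; the helper names are hypothetical and nothing like this is part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if a request of SIZE bytes does not exceed
   half of the address space, taken here as PTRDIFF_MAX.  */
bool
size_below_half_address_space (size_t size)
{
  return size <= PTRDIFF_MAX;
}

/* Hypothetical helper: true if NMEMB * SIZE does not overflow size_t
   and the product also passes the half-address-space check.
   __builtin_mul_overflow is the GCC builtin for checked multiplication.  */
bool
calloc_size_ok (size_t nmemb, size_t size)
{
  size_t total;
  if (__builtin_mul_overflow (nmemb, size, &total))
    return false;
  return size_below_half_address_space (total);
}
```

A removed allocation whose NULL check is folded to non-NULL could keep a test of this kind in place of the call, so that requests which can never succeed still take the failure path.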

gcc/ChangeLog:

	PR tree-optimization/117370
	* common.opt: Add -fmalloc-dce.
	* common.opt.urls: Update.
	* doc/invoke.texi: Document it; also add missing -flifetime-dse entry.
	* tree-ssa-dce.cc (is_removable_allocation_p): Break out from
	...
	(mark_stmt_if_obviously_necessary): ... here; also check that
	operator new satisfies gimple_call_from_new_or_delete.
	(checks_return_value_of_removable_allocation_p): New function.
	(mark_all_reaching_defs_necessary_1): Add missing case for
	STRDUP and STRNDUP.
	(propagate_necessity): Use is_removable_allocation_p and
	checks_return_value_of_removable_allocation_p.
	(eliminate_unnecessary_stmts): Update conditionals that use
	removed allocation; use is_removable_allocation_p.

gcc/testsuite/ChangeLog:

	* g++.dg/cdce3.C: Disable allocation dce.
	* g++.dg/tree-ssa/pr19476-1.C: Likewise.
	* g++.dg/tree-ssa/pr19476-2.C: Likewise.
	* g++.dg/tree-ssa/pr19476-3.C: Likewise.
	* g++.dg/tree-ssa/pr19476-4.C: Likewise.
	* gcc.dg/analyzer/pr101837.c: Disable malloc dce.
	* gcc.dg/tree-ssa/pr19831-3.c: Update.
	* gfortran.dg/pr68078.f90: Disable malloc DCE.
Jan Hubicka 2024-11-14 17:01:12 +01:00
parent f91e34644e
commit 7828dc0705
12 changed files with 148 additions and 74 deletions


@ -2282,6 +2282,13 @@ fmax-errors=
Common Joined RejectNegative UInteger Var(flag_max_errors)
-fmax-errors=<number> Maximum number of errors to report.
fmalloc-dce
Common Var(flag_malloc_dce,2) Init(2) Optimization
Allow removal of malloc and free pairs when allocated block is unused.
fmalloc-dce=
Common Joined RejectNegative UInteger Var(flag_malloc_dce) Optimization IntegerRange(0, 2)
fmem-report
Common Var(mem_report)
Report on permanent memory allocation.


@ -947,6 +947,12 @@ UrlSuffix(gcc/Optimize-Options.html#index-fmath-errno)
fmax-errors=
UrlSuffix(gcc/Warning-Options.html#index-fmax-errors) LangUrlSuffix_D(gdc/Warnings.html#index-fmax-errors)
fmalloc-dce
UrlSuffix(gcc/Optimize-Options.html#index-fmalloc-dce)
fmalloc-dce=
UrlSuffix(gcc/Optimize-Options.html#index-fmalloc-dce)
fmem-report
UrlSuffix(gcc/Developer-Options.html#index-fmem-report)


@ -585,7 +585,7 @@ Objective-C and Objective-C++ Dialects}.
-fipa-bit-cp -fipa-vrp -fipa-pta -fipa-profile -fipa-pure-const
-fipa-reference -fipa-reference-addressable
-fipa-stack-alignment -fipa-icf -fira-algorithm=@var{algorithm}
-flate-combine-instructions -flive-patching=@var{level}
-flate-combine-instructions -flifetime-dse -flive-patching=@var{level}
-fira-region=@var{region} -fira-hoist-pressure
-fira-loop-pressure -fno-ira-share-save-slots
-fno-ira-share-spill-slots
@ -595,7 +595,7 @@ Objective-C and Objective-C++ Dialects}.
-floop-block -floop-interchange -floop-strip-mine
-floop-unroll-and-jam -floop-nest-optimize
-floop-parallelize-all -flra-remat -flto -flto-compression-level
-flto-partition=@var{alg} -fmerge-all-constants
-flto-partition=@var{alg} -fmalloc-dce -fmerge-all-constants
-fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves
-fmove-loop-invariants -fmove-loop-stores -fno-branch-count-reg
-fno-defer-pop -fno-fp-int-builtin-inexact -fno-function-cse
@ -14567,8 +14567,12 @@ affected by @option{-flimit-function-alignment}
@opindex fno-allocation-dce
@opindex fallocation-dce
@item -fno-allocation-dce
Do not remove unused C++ allocations in dead code elimination.
Do not remove unused C++ allocations (using operator @code{new} and operator @code{delete})
in dead code elimination.
See also @option{-fmalloc-dce}.
@opindex fallow-store-data-races
@item -fallow-store-data-races
@ -15441,6 +15445,18 @@ number of iterations).
Enabled by @option{-O3}, @option{-fprofile-use}, and @option{-fauto-profile}.
@opindex fno-malloc-dce
@opindex fmalloc-dce
@item -fmalloc-dce
Control whether calls to @code{malloc} (and its variants such as @code{calloc}
or @code{strdup}) can be optimized away provided the return value is used only
as an argument to @code{free} or compared with @code{NULL}.  With
@option{-fmalloc-dce=1}, only calls to @code{free} are allowed; with
@option{-fmalloc-dce=2}, comparisons with the @code{NULL} pointer are also
considered safe to remove.
The default is @option{-fmalloc-dce=2}.  See also @option{-fallocation-dce}.
@opindex fmove-loop-invariants
@item -fmove-loop-invariants
Enables the loop invariant motion pass in the RTL loop optimizer. Enabled


@ -1,6 +1,6 @@
/* { dg-do run } */
/* { dg-require-effective-target c99_runtime } */
/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm" } */
/* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -lm -fno-allocation-dce" } */
/* { dg-additional-options "-DLARGE_LONG_DOUBLE" { target large_long_double } } */
/* { dg-additional-options "-DGNU_EXTENSION" { target pow10 } } */
/* { dg-add-options ieee } */


@ -1,5 +1,5 @@
/* { dg-do compile } */
/* { dg-options "-O -fdump-tree-dom2 -fdelete-null-pointer-checks" } */
/* { dg-options "-O -fdump-tree-dom2 -fdelete-null-pointer-checks -fno-allocation-dce" } */
/* { dg-skip-if "" keeps_null_pointer_checks } */
// See pr19476-5.C for a version without including <new>.


@ -1,5 +1,5 @@
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-optimized -fdelete-null-pointer-checks" } */
/* { dg-options "-O2 -fdump-tree-optimized -fdelete-null-pointer-checks -fno-allocation-dce" } */
/* { dg-skip-if "" keeps_null_pointer_checks } */
#include <new>


@ -1,5 +1,5 @@
/* { dg-do compile } */
/* { dg-options "-O3 -fcheck-new -fdump-tree-optimized" } */
/* { dg-options "-O3 -fcheck-new -fdump-tree-optimized -fno-allocation-dce" } */
#include <new>


@ -1,5 +1,5 @@
/* { dg-do compile } */
/* { dg-options "-O3 -fno-delete-null-pointer-checks -fdump-tree-optimized" } */
/* { dg-options "-O3 -fno-delete-null-pointer-checks -fdump-tree-optimized -fno-allocation-dce" } */
#include <new>


@ -1,4 +1,4 @@
/* { dg-additional-options "-O3 -fsanitize=undefined" } */
/* { dg-additional-options "-O3 -fsanitize=undefined -fno-malloc-dce" } */
void memory_exhausted();
void memcheck(void *ptr) {


@ -29,10 +29,5 @@ void test6(void)
Assume p was non-NULL for test2.
For test5, it doesn't matter if p is NULL or non-NULL. */
/* { dg-final { scan-tree-dump-times "free" 0 "optimized" { xfail *-*-* } } } */
/* { dg-final { scan-tree-dump-times "malloc" 0 "optimized" { xfail *-*-* } } } */
/* But make sure we don't partially optimize for now. */
/* { dg-final { scan-tree-dump-times "free" 3 "optimized" } } */
/* { dg-final { scan-tree-dump-times "malloc" 3 "optimized" } } */
/* { dg-final { scan-tree-dump-times "free" 0 "optimized" } } */
/* { dg-final { scan-tree-dump-times "malloc" 0 "optimized" } } */


@ -1,4 +1,6 @@
! { dg-do run { target i?86-*-linux* x86_64-*-linux* } }
! disable DCE so the allocations are not optimized away
! { dg-additional-options "-fno-malloc-dce" }
! { dg-additional-sources set_vm_limit.c }
!
! This test calls set_vm_limit to set an artificially low address space


@ -240,6 +240,60 @@ mark_operand_necessary (tree op)
worklist.safe_push (stmt);
}
/* Return true if STMT is a call to an allocation function that can be
optimized out if the memory block is never used for anything other
than a NULL pointer check or free.
If NON_NULL_CHECK is false, we can further assume that the return value
is never checked to be non-NULL. */
static bool
is_removable_allocation_p (gcall *stmt, bool non_null_check)
{
tree callee = gimple_call_fndecl (stmt);
if (callee != NULL_TREE
&& fndecl_built_in_p (callee, BUILT_IN_NORMAL))
switch (DECL_FUNCTION_CODE (callee))
{
case BUILT_IN_MALLOC:
case BUILT_IN_ALIGNED_ALLOC:
case BUILT_IN_CALLOC:
CASE_BUILT_IN_ALLOCA:
case BUILT_IN_STRDUP:
case BUILT_IN_STRNDUP:
return non_null_check ? flag_malloc_dce > 1 : flag_malloc_dce;
case BUILT_IN_GOMP_ALLOC:
return true;
default:;
}
if (callee != NULL_TREE
&& flag_allocation_dce
&& gimple_call_from_new_or_delete (stmt)
&& DECL_IS_REPLACEABLE_OPERATOR_NEW_P (callee))
return true;
return false;
}
/* Return true if STMT is a conditional
if (ptr != NULL) or if (ptr == NULL)
where ptr was returned by a removable allocation function. */
static bool
checks_return_value_of_removable_allocation_p (gimple *stmt)
{
gcall *def_stmt;
return gimple_code (stmt) == GIMPLE_COND
&& (gimple_cond_code (stmt) == EQ_EXPR
|| gimple_cond_code (stmt) == NE_EXPR)
&& integer_zerop (gimple_cond_rhs (stmt))
&& TREE_CODE (gimple_cond_lhs (stmt)) == SSA_NAME
&& (def_stmt = dyn_cast <gcall *>
(SSA_NAME_DEF_STMT (gimple_cond_lhs (stmt))))
&& is_removable_allocation_p (def_stmt, true);
}
/* Mark STMT as necessary if it obviously is. Add it to the worklist if
it can make other statements necessary.
@ -271,38 +325,23 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
case GIMPLE_CALL:
{
gcall *call = as_a <gcall *> (stmt);
/* Never elide a noreturn call we pruned control-flow for. */
if ((gimple_call_flags (stmt) & ECF_NORETURN)
&& gimple_call_ctrl_altering_p (stmt))
if ((gimple_call_flags (call) & ECF_NORETURN)
&& gimple_call_ctrl_altering_p (call))
{
mark_stmt_necessary (stmt, true);
mark_stmt_necessary (call, true);
return;
}
tree callee = gimple_call_fndecl (stmt);
if (callee != NULL_TREE
&& fndecl_built_in_p (callee, BUILT_IN_NORMAL))
switch (DECL_FUNCTION_CODE (callee))
{
case BUILT_IN_MALLOC:
case BUILT_IN_ALIGNED_ALLOC:
case BUILT_IN_CALLOC:
CASE_BUILT_IN_ALLOCA:
case BUILT_IN_STRDUP:
case BUILT_IN_STRNDUP:
case BUILT_IN_GOMP_ALLOC:
return;
default:;
}
if (callee != NULL_TREE
&& flag_allocation_dce
&& DECL_IS_REPLACEABLE_OPERATOR_NEW_P (callee))
if (is_removable_allocation_p (call, false))
return;
/* For __cxa_atexit calls, don't mark as necessary right away. */
if (is_removable_cxa_atexit_call (stmt))
if (is_removable_cxa_atexit_call (call))
return;
/* IFN_GOACC_LOOP calls are necessary in that they are used to
@ -311,9 +350,9 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
survive from aggressive loop removal for it has loop exit and
is assumed to be finite. Therefore, we need to explicitly mark
these calls. (An example is libgomp.oacc-c-c++-common/pr84955.c) */
if (gimple_call_internal_p (stmt, IFN_GOACC_LOOP))
if (gimple_call_internal_p (call, IFN_GOACC_LOOP))
{
mark_stmt_necessary (stmt, true);
mark_stmt_necessary (call, true);
return;
}
break;
@ -667,6 +706,8 @@ mark_all_reaching_defs_necessary_1 (ao_ref *ref ATTRIBUTE_UNUSED,
case BUILT_IN_ALIGNED_ALLOC:
case BUILT_IN_CALLOC:
CASE_BUILT_IN_ALLOCA:
case BUILT_IN_STRDUP:
case BUILT_IN_STRNDUP:
case BUILT_IN_FREE:
case BUILT_IN_GOMP_ALLOC:
case BUILT_IN_GOMP_FREE:
@ -891,19 +932,11 @@ propagate_necessity (bool aggressive)
{
tree ptr = gimple_call_arg (stmt, 0);
gcall *def_stmt;
tree def_callee;
/* If the pointer we free is defined by an allocation
function do not add the call to the worklist. */
if (TREE_CODE (ptr) == SSA_NAME
&& (def_stmt = dyn_cast <gcall *> (SSA_NAME_DEF_STMT (ptr)))
&& (def_callee = gimple_call_fndecl (def_stmt))
&& ((DECL_BUILT_IN_CLASS (def_callee) == BUILT_IN_NORMAL
&& (DECL_FUNCTION_CODE (def_callee) == BUILT_IN_ALIGNED_ALLOC
|| DECL_FUNCTION_CODE (def_callee) == BUILT_IN_MALLOC
|| DECL_FUNCTION_CODE (def_callee) == BUILT_IN_CALLOC
|| DECL_FUNCTION_CODE (def_callee) == BUILT_IN_GOMP_ALLOC))
|| (DECL_IS_REPLACEABLE_OPERATOR_NEW_P (def_callee)
&& gimple_call_from_new_or_delete (def_stmt))))
&& is_removable_allocation_p (def_stmt, false))
{
if (is_delete_operator
&& !valid_new_delete_pair_p (def_stmt, stmt))
@ -925,6 +958,9 @@ propagate_necessity (bool aggressive)
}
}
if (checks_return_value_of_removable_allocation_p (stmt))
continue;
FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, SSA_OP_USE)
mark_operand_necessary (use);
@ -1379,7 +1415,6 @@ eliminate_unnecessary_stmts (bool aggressive)
basic_block bb;
gimple_stmt_iterator gsi, psi;
gimple *stmt;
tree call;
auto_vec<edge> to_remove_edges;
if (dump_file && (dump_flags & TDF_DETAILS))
@ -1448,6 +1483,25 @@ eliminate_unnecessary_stmts (bool aggressive)
gimple_set_plf (stmt, STMT_NECESSARY, false);
}
}
/* A conditional checking that the return value of an allocation is
non-NULL can be turned into a constant if the allocation itself
is unnecessary. */
if (gimple_plf (stmt, STMT_NECESSARY)
&& gimple_code (stmt) == GIMPLE_COND
&& TREE_CODE (gimple_cond_lhs (stmt)) == SSA_NAME)
{
gimple *def_stmt = SSA_NAME_DEF_STMT (gimple_cond_lhs (stmt));
if (!gimple_nop_p (def_stmt)
&& !gimple_plf (def_stmt, STMT_NECESSARY))
{
gcc_checking_assert
(checks_return_value_of_removable_allocation_p (stmt));
gimple_cond_set_lhs (as_a <gcond *>(stmt),
build_one_cst
(TREE_TYPE (gimple_cond_rhs (stmt))));
update_stmt (stmt);
}
}
/* If GSI is not necessary then remove it. */
if (!gimple_plf (stmt, STMT_NECESSARY))
@ -1482,11 +1536,11 @@ eliminate_unnecessary_stmts (bool aggressive)
remove_dead_stmt (&gsi, bb, to_remove_edges);
continue;
}
else if (is_gimple_call (stmt))
else if (gcall *call_stmt = dyn_cast <gcall *> (stmt))
{
tree name = gimple_call_lhs (stmt);
tree name = gimple_call_lhs (call_stmt);
notice_special_calls (as_a <gcall *> (stmt));
notice_special_calls (call_stmt);
/* When LHS of var = call (); is dead, simplify it into
call (); saving one operand. */
@ -1496,36 +1550,30 @@ eliminate_unnecessary_stmts (bool aggressive)
/* Avoid doing so for allocation calls which we
did not mark as necessary, it will confuse the
special logic we apply to malloc/free pair removal. */
&& (!(call = gimple_call_fndecl (stmt))
|| ((DECL_BUILT_IN_CLASS (call) != BUILT_IN_NORMAL
|| (DECL_FUNCTION_CODE (call) != BUILT_IN_ALIGNED_ALLOC
&& DECL_FUNCTION_CODE (call) != BUILT_IN_MALLOC
&& DECL_FUNCTION_CODE (call) != BUILT_IN_CALLOC
&& !ALLOCA_FUNCTION_CODE_P
(DECL_FUNCTION_CODE (call))))
&& !DECL_IS_REPLACEABLE_OPERATOR_NEW_P (call))))
&& !is_removable_allocation_p (call_stmt, false))
{
something_changed = true;
if (dump_file && (dump_flags & TDF_DETAILS))
{
fprintf (dump_file, "Deleting LHS of call: ");
print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
print_gimple_stmt (dump_file, call_stmt, 0, TDF_SLIM);
fprintf (dump_file, "\n");
}
gimple_call_set_lhs (stmt, NULL_TREE);
maybe_clean_or_replace_eh_stmt (stmt, stmt);
update_stmt (stmt);
gimple_call_set_lhs (call_stmt, NULL_TREE);
maybe_clean_or_replace_eh_stmt (call_stmt, call_stmt);
update_stmt (call_stmt);
release_ssa_name (name);
/* GOMP_SIMD_LANE (unless three argument) or ASAN_POISON
without lhs is not needed. */
if (gimple_call_internal_p (stmt))
switch (gimple_call_internal_fn (stmt))
if (gimple_call_internal_p (call_stmt))
switch (gimple_call_internal_fn (call_stmt))
{
case IFN_GOMP_SIMD_LANE:
if (gimple_call_num_args (stmt) >= 3
&& !integer_nonzerop (gimple_call_arg (stmt, 2)))
if (gimple_call_num_args (call_stmt) >= 3
&& !integer_nonzerop
(gimple_call_arg (call_stmt, 2)))
break;
/* FALLTHRU */
case IFN_ASAN_POISON:
@ -1535,8 +1583,8 @@ eliminate_unnecessary_stmts (bool aggressive)
break;
}
}
else if (gimple_call_internal_p (stmt))
switch (gimple_call_internal_fn (stmt))
else if (gimple_call_internal_p (call_stmt))
switch (gimple_call_internal_fn (call_stmt))
{
case IFN_ADD_OVERFLOW:
maybe_optimize_arith_overflow (&gsi, PLUS_EXPR);
@ -1548,11 +1596,11 @@ eliminate_unnecessary_stmts (bool aggressive)
maybe_optimize_arith_overflow (&gsi, MULT_EXPR);
break;
case IFN_UADDC:
if (integer_zerop (gimple_call_arg (stmt, 2)))
if (integer_zerop (gimple_call_arg (call_stmt, 2)))
maybe_optimize_arith_overflow (&gsi, PLUS_EXPR);
break;
case IFN_USUBC:
if (integer_zerop (gimple_call_arg (stmt, 2)))
if (integer_zerop (gimple_call_arg (call_stmt, 2)))
maybe_optimize_arith_overflow (&gsi, MINUS_EXPR);
break;
default: