Linux kernel source tree
Dave Chinner 1c00f93686 mm: lift gfp_kmemleak_mask() to gfp.h
Patch series "mm: fix nested allocation context filtering".

This patchset is the followup to the comment I made earlier today:

https://lore.kernel.org/linux-xfs/ZjAyIWUzDipofHFJ@dread.disaster.area/

Tl;dr: Memory allocations that are done inside the public memory
allocation API need to obey the reclaim recursion constraints placed on
the allocation by the original caller, including the "don't track
recursion for this allocation" case defined by __GFP_NOLOCKDEP.

These nested allocations are generally in debug code that is tracking
something about the allocation (kmemleak, KASAN, etc) and so are
allocating private kernel objects that only that debug system will use.

Neither the page-owner code nor the stack depot code gets this right.  They
also clear GFP_ZONEMASK as a separate operation, which is completely
redundant because the constraint filter applied immediately after
guarantees that GFP_ZONEMASK bits are cleared.

kmemleak gets this filtering right.  It preserves the allocation
constraints for deadlock prevention and clears all other context flags
whilst also ensuring that the nested allocation will fail quickly,
silently and without depleting emergency kernel reserves if there is no
memory available.

This can be made much more robust, immune to whack-a-mole games and the
code greatly simplified by lifting gfp_kmemleak_mask() to
include/linux/gfp.h and using that everywhere.  Also document it so that
there is no excuse for not knowing about it when writing new debug code
that nests allocations.

Tested with lockdep, KASAN + page_owner=on and kmemleak=on over multiple
fstests runs with XFS.


This patch (of 3):

Any "internal" nested allocation done from within an allocation context
needs to obey the high level allocation gfp_mask constraints.  This is
necessary for debug code like KASAN, kmemleak, lockdep, etc that allocate
memory for saving stack traces and other information during memory
allocation.  If they don't obey things like __GFP_NOLOCKDEP or
__GFP_NOWARN, they produce false positive failure detections.

kmemleak gets this right by using gfp_kmemleak_mask() to pass through the
relevant context flags to the nested allocation to ensure that the
allocation follows the constraints of the caller context.

KASAN was recently found to be missing __GFP_NOLOCKDEP due to stack depot
allocations, and even more recently the page owner tracking code was also
found to be missing __GFP_NOLOCKDEP support.

We also don't want KASAN or lockdep to drive the system into OOM
kill territory by exhausting emergency reserves.  This is something that
kmemleak also gets right by adding (__GFP_NORETRY | __GFP_NOMEMALLOC |
__GFP_NOWARN) to the allocation mask.

Hence it is clear that we need to define a common nested allocation filter
mask for these sorts of third party nested allocations used in debug code.
So to start this process, lift gfp_kmemleak_mask() to gfp.h and rename it
to gfp_nested_mask(), and convert the kmemleak callers to use it.
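
As a rough illustration of the filter described above, the following is a
userspace sketch, not the kernel source.  The flag names mirror the kernel's,
but the bit values are made up for this example, and the exact set of
preserved bits (GFP_KERNEL | GFP_ATOMIC | __GFP_NOLOCKDEP) is an assumption
modelled on the behaviour this changelog attributes to kmemleak.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t gfp_t;	/* stand-in for the kernel's gfp_t */

/* Illustrative bit values only; the kernel's real values differ. */
#define __GFP_DMA		((gfp_t)1u << 0)
#define __GFP_HIGHMEM		((gfp_t)1u << 1)
#define __GFP_HIGH		((gfp_t)1u << 2)
#define __GFP_IO		((gfp_t)1u << 3)
#define __GFP_FS		((gfp_t)1u << 4)
#define __GFP_DIRECT_RECLAIM	((gfp_t)1u << 5)
#define __GFP_KSWAPD_RECLAIM	((gfp_t)1u << 6)
#define __GFP_NOWARN		((gfp_t)1u << 7)
#define __GFP_NORETRY		((gfp_t)1u << 8)
#define __GFP_NOMEMALLOC	((gfp_t)1u << 9)
#define __GFP_ZERO		((gfp_t)1u << 10)
#define __GFP_NOLOCKDEP		((gfp_t)1u << 11)

/* Composite masks, reduced to the bits this model defines. */
#define __GFP_RECLAIM	(__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_KERNEL	(__GFP_RECLAIM | __GFP_IO | __GFP_FS)
#define GFP_NOFS	(__GFP_RECLAIM | __GFP_IO)
#define GFP_ATOMIC	(__GFP_HIGH | __GFP_KSWAPD_RECLAIM)
#define GFP_ZONEMASK	(__GFP_DMA | __GFP_HIGHMEM)

/*
 * Keep only the caller's reclaim and lockdep constraints, then force the
 * nested allocation to fail fast, quietly, and without touching emergency
 * reserves.  Zone bits are not part of the kept set, so they are cleared
 * without a separate "&= ~GFP_ZONEMASK" step.
 */
static inline gfp_t gfp_nested_mask(gfp_t flags)
{
	return (flags & (GFP_KERNEL | GFP_ATOMIC | __GFP_NOLOCKDEP)) |
	       (__GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
}
```

Under these assumptions, a caller passing GFP_NOFS | __GFP_NOLOCKDEP |
__GFP_ZERO | __GFP_HIGHMEM would hand its debug hook a nested mask that
still forbids filesystem recursion and lockdep tracking, but carries no
zone or zeroing bits.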

Link: https://lkml.kernel.org/r/20240430054604.4169568-1-david@fromorbit.com
Link: https://lkml.kernel.org/r/20240430054604.4169568-2-david@fromorbit.com
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-19 14:40:44 -07:00
arch Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
block Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
certs kbuild: use $(src) instead of $(srctree)/$(src) for source directory 2024-05-10 04:34:52 +09:00
crypto net-accept-more-20240515 2024-05-18 10:32:39 -07:00
Documentation Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
drivers Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
fs Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
include mm: lift gfp_kmemleak_mask() to gfp.h 2024-05-19 14:40:44 -07:00
init Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
io_uring The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
ipc Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
kernel Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
lib Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
LICENSES
mm mm: lift gfp_kmemleak_mask() to gfp.h 2024-05-19 14:40:44 -07:00
net The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
rust The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
samples Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
scripts Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
security The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
sound The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
tools Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
usr kbuild: use $(src) instead of $(srctree)/$(src) for source directory 2024-05-10 04:34:52 +09:00
virt The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
.clang-format
.cocciconfig
.editorconfig
.get_maintainer.ignore
.gitattributes
.gitignore
.mailmap Another not-too-busy cycle for documentation, including: 2024-05-13 10:51:53 -07:00
.rustfmt.toml
COPYING
CREDITS MAINTAINERS: Drop Gustavo Pimentel as PCI DWC Maintainer 2024-03-27 13:41:02 -05:00
Kbuild
Kconfig
MAINTAINERS The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
Makefile Kbuild updates for v6.10 2024-05-18 12:39:20 -07:00
README README: Fix spelling 2024-03-18 03:36:32 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.