slab fixes for 6.1-rc1

-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEjUuTAak14xi+SF7M4CHKc/GJqRAFAmM6/BMACgkQ4CHKc/GJ
 qRBqBAgAh+5JdVkYBxW4MvGEolRw0RDIBNwEwmyJI7WeAegL8FaGI3jmA5Kcww4c
 yA+lL/jcS9zQ/qwwHHoCqZoCLDFa43oiDMjSW4MI6oZpV+T6lx5uaH5kXBKsmxy5
 2dONP7kYG/eFfBGB6F9qQOLJnCz0CXeY7+O99D1Nldx0yKKUVCK0krb018p5oI6a
 RTVRASSVuEGkxvJGo4BbIR1H40s1BKTyRO9eZCKEHSanYM5SVXdBy9GTh5VQWTPk
 WLwvXmd0DehZzlPrgg3PMVPBTNGO/yplWibugWyzUqGcPIhQPk6Z76aWE4vojI2q
 f0w+86BYR2U7SBV2ZaNrGrxk/PZJyg==
 =aDgU
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fixes from Vlastimil Babka:

 - The "common kmalloc v4" series [1] by Hyeonggon Yoo.

   While the plan after LPC is to try again to see whether it's possible
   to get rid of SLOB and SLAB (and, if any critical aspect of those
   cannot be achieved with SLUB today, to modify SLUB accordingly), that
   will take a while even if there are no objections.

   Meanwhile this is a nice cleanup, and some parts (e.g. the tracepoint
   changes) will be useful even if we end up with a single slab
   implementation in the future:

      - Improves the mm/slab_common.c wrappers to allow deleting
        duplicated code between SLAB and SLUB.

      - Large kmalloc() allocations in SLAB are passed to the page
        allocator, as in SLUB, reducing the number of kmalloc caches.

      - Removes the {kmem_cache_alloc,kmalloc}_node tracepoint variants;
        a node id parameter is added to the non-_node variants instead, as
        sketched below.
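
        A rough before/after sketch of the call sites (argument lists taken
        from the hunks further below; this is an illustration, not a
        complete excerpt of any single patch):

            /* before: separate *_node tracepoint variants */
            trace_kmalloc_node(caller, ret, s, size, s->size, gfp, node);
            trace_kmem_cache_alloc_node(_RET_IP_, ret, s, s->object_size,
                                        s->size, gfp, node);

            /* after: one tracepoint each, with the node id
             * (or NUMA_NO_NODE) passed as a parameter */
            trace_kmalloc(caller, ret, size, s->size, gfp, node);
            trace_kmem_cache_alloc(_RET_IP_, ret, s, gfp, node);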

 - Addition of kmalloc_size_roundup()

   The first two patches from a series by Kees Cook [2] that introduce
   kmalloc_size_roundup(). This will allow merging per-subsystem patches
   that use the new function, and ultimately stop (ab)using ksize() in a
   way that causes ongoing trouble for debugging functionality and static
   checkers.
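
   A minimal sketch of the intended pattern (the helper below is
   hypothetical; only kmalloc_size_roundup() and kmalloc() are real APIs):
   round the request up to the full kmalloc bucket before allocating,
   instead of probing the slack with ksize() after the allocation.

       #include <linux/slab.h>

       /* Hypothetical helper: allocate at least @len bytes and report how
        * much of the underlying kmalloc bucket may safely be used. */
       static void *alloc_full_bucket(size_t len, size_t *avail, gfp_t gfp)
       {
               size_t full = kmalloc_size_roundup(len);
               void *buf;

               if (!full)
                       return NULL;

               buf = kmalloc(full, gfp);
               if (buf)
                       *avail = full;
               return buf;
       }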

 - Wasted kmalloc() memory tracking in debugfs alloc_traces

   A patch from Feng Tang that enhances the existing debugfs
   alloc_traces file for kmalloc caches with information about how much
   space is wasted by allocations that need less space than the
   particular kmalloc cache provides.

 - My series [3] to fix validation races for caches with debugging
   enabled:

      - By decoupling the debug cache operations more from the non-debug
        fastpaths, extra locking simplifications became possible and were
        done afterwards.

      - Additional cleanup of PREEMPT_RT specific code on top, by Thomas
        Gleixner.

      - A late fix for slab page leaks caused by the series, by Feng
        Tang.

 - Smaller fixes and cleanups:

      - Unneeded variable removals, by ye xingchen

      - A cleanup removing a BUG_ON() in create_unique_id(), by Chao Yu

Link: https://lore.kernel.org/all/20220817101826.236819-1-42.hyeyoo@gmail.com/ [1]
Link: https://lore.kernel.org/all/20220923202822.2667581-1-keescook@chromium.org/ [2]
Link: https://lore.kernel.org/all/20220823170400.26546-1-vbabka@suse.cz/ [3]

* tag 'slab-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (30 commits)
  mm/slub: fix a slab missed to be freed problem
  slab: Introduce kmalloc_size_roundup()
  slab: Remove __malloc attribute from realloc functions
  mm/slub: clean up create_unique_id()
  mm/slub: enable debugging memory wasting of kmalloc
  slub: Make PREEMPT_RT support less convoluted
  mm/slub: simplify __cmpxchg_double_slab() and slab_[un]lock()
  mm/slub: convert object_map_lock to non-raw spinlock
  mm/slub: remove slab_lock() usage for debug operations
  mm/slub: restrict sysfs validation to debug caches and make it safe
  mm/sl[au]b: check if large object is valid in __ksize()
  mm/slab_common: move declaration of __ksize() to mm/slab.h
  mm/slab_common: drop kmem_alloc & avoid dereferencing fields when not using
  mm/slab_common: unify NUMA and UMA version of tracepoints
  mm/sl[au]b: cleanup kmem_cache_alloc[_node]_trace()
  mm/sl[au]b: generalize kmalloc subsystem
  mm/slub: move free_debug_processing() further
  mm/sl[au]b: introduce common alloc/free functions without tracepoint
  mm/slab: kmalloc: pass requests larger than order-1 page to page allocator
  mm/slab_common: cleanup kmalloc_large()
  ...
Commit 52abb27abf (Linus Torvalds, 2022-10-10 10:21:22 -07:00)
11 changed files with 879 additions and 946 deletions

@@ -400,21 +400,30 @@ information:
   allocated objects. The output is sorted by frequency of each trace.
 
   Information in the output:
-  Number of objects, allocating function, minimal/average/maximal jiffies since alloc,
-  pid range of the allocating processes, cpu mask of allocating cpus, and stack trace.
+  Number of objects, allocating function, possible memory wastage of
+  kmalloc objects(total/per-object), minimal/average/maximal jiffies
+  since alloc, pid range of the allocating processes, cpu mask of
+  allocating cpus, numa node mask of origins of memory, and stack trace.
 
   Example:::
 
-  1085 populate_error_injection_list+0x97/0x110 age=166678/166680/166682 pid=1 cpus=1::
-	__slab_alloc+0x6d/0x90
-	kmem_cache_alloc_trace+0x2eb/0x300
-	populate_error_injection_list+0x97/0x110
-	init_error_injection+0x1b/0x71
-	do_one_initcall+0x5f/0x2d0
-	kernel_init_freeable+0x26f/0x2d7
-	kernel_init+0xe/0x118
-	ret_from_fork+0x22/0x30
+   338 pci_alloc_dev+0x2c/0xa0 waste=521872/1544 age=290837/291891/293509 pid=1 cpus=106 nodes=0-1
+	__kmem_cache_alloc_node+0x11f/0x4e0
+	kmalloc_trace+0x26/0xa0
+	pci_alloc_dev+0x2c/0xa0
+	pci_scan_single_device+0xd2/0x150
+	pci_scan_slot+0xf7/0x2d0
+	pci_scan_child_bus_extend+0x4e/0x360
+	acpi_pci_root_create+0x32e/0x3b0
+	pci_acpi_scan_root+0x2b9/0x2d0
+	acpi_pci_root_add.cold.11+0x110/0xb0a
+	acpi_bus_attach+0x262/0x3f0
+	device_for_each_child+0xb7/0x110
+	acpi_dev_for_each_child+0x77/0xa0
+	acpi_bus_attach+0x108/0x3f0
+	device_for_each_child+0xb7/0x110
+	acpi_dev_for_each_child+0x77/0xa0
+	acpi_bus_attach+0x108/0x3f0
 
   2. free_traces::


@@ -35,7 +35,8 @@
 /*
  * Note: do not use this directly. Instead, use __alloc_size() since it is conditionally
- * available and includes other attributes.
+ * available and includes other attributes. For GCC < 9.1, __alloc_size__ gets undefined
+ * in compiler-gcc.h, due to misbehaviors.
  *
  * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-alloc_005fsize-function-attribute
  * clang: https://clang.llvm.org/docs/AttributeReference.html#alloc-size


@@ -271,14 +271,16 @@ struct ftrace_likely_data {
 /*
  * Any place that could be marked with the "alloc_size" attribute is also
- * a place to be marked with the "malloc" attribute. Do this as part of the
- * __alloc_size macro to avoid redundant attributes and to avoid missing a
- * __malloc marking.
+ * a place to be marked with the "malloc" attribute, except those that may
+ * be performing a _reallocation_, as that may alias the existing pointer.
+ * For these, use __realloc_size().
  */
 #ifdef __alloc_size__
 # define __alloc_size(x, ...)	__alloc_size__(x, ## __VA_ARGS__) __malloc
+# define __realloc_size(x, ...)	__alloc_size__(x, ## __VA_ARGS__)
 #else
 # define __alloc_size(x, ...)	__malloc
+# define __realloc_size(x, ...)
 #endif
 
 #ifndef asm_volatile_goto


@ -29,6 +29,8 @@
#define SLAB_RED_ZONE ((slab_flags_t __force)0x00000400U) #define SLAB_RED_ZONE ((slab_flags_t __force)0x00000400U)
/* DEBUG: Poison objects */ /* DEBUG: Poison objects */
#define SLAB_POISON ((slab_flags_t __force)0x00000800U) #define SLAB_POISON ((slab_flags_t __force)0x00000800U)
/* Indicate a kmalloc slab */
#define SLAB_KMALLOC ((slab_flags_t __force)0x00001000U)
/* Align objs on cache lines */ /* Align objs on cache lines */
#define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x00002000U) #define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x00002000U)
/* Use GFP_DMA memory */ /* Use GFP_DMA memory */
@ -184,11 +186,25 @@ int kmem_cache_shrink(struct kmem_cache *s);
/* /*
* Common kmalloc functions provided by all allocators * Common kmalloc functions provided by all allocators
*/ */
void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags) __alloc_size(2); void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags) __realloc_size(2);
void kfree(const void *objp); void kfree(const void *objp);
void kfree_sensitive(const void *objp); void kfree_sensitive(const void *objp);
size_t __ksize(const void *objp); size_t __ksize(const void *objp);
/**
* ksize - Report actual allocation size of associated object
*
* @objp: Pointer returned from a prior kmalloc()-family allocation.
*
* This should not be used for writing beyond the originally requested
* allocation size. Either use krealloc() or round up the allocation size
* with kmalloc_size_roundup() prior to allocation. If this is used to
* access beyond the originally requested allocation size, UBSAN_BOUNDS
* and/or FORTIFY_SOURCE may trip, since they only know about the
* originally allocated size via the __alloc_size attribute.
*/
size_t ksize(const void *objp); size_t ksize(const void *objp);
#ifdef CONFIG_PRINTK #ifdef CONFIG_PRINTK
bool kmem_valid_obj(void *object); bool kmem_valid_obj(void *object);
void kmem_dump_obj(void *object); void kmem_dump_obj(void *object);
@ -243,27 +259,17 @@ static inline unsigned int arch_slab_minalign(void)
#ifdef CONFIG_SLAB #ifdef CONFIG_SLAB
/* /*
* The largest kmalloc size supported by the SLAB allocators is * SLAB and SLUB directly allocates requests fitting in to an order-1 page
* 32 megabyte (2^25) or the maximum allocatable page order if that is * (PAGE_SIZE*2). Larger requests are passed to the page allocator.
* less than 32 MB.
*
* WARNING: Its not easy to increase this value since the allocators have
* to do various tricks to work around compiler limitations in order to
* ensure proper constant folding.
*/ */
#define KMALLOC_SHIFT_HIGH ((MAX_ORDER + PAGE_SHIFT - 1) <= 25 ? \ #define KMALLOC_SHIFT_HIGH (PAGE_SHIFT + 1)
(MAX_ORDER + PAGE_SHIFT - 1) : 25) #define KMALLOC_SHIFT_MAX (MAX_ORDER + PAGE_SHIFT - 1)
#define KMALLOC_SHIFT_MAX KMALLOC_SHIFT_HIGH
#ifndef KMALLOC_SHIFT_LOW #ifndef KMALLOC_SHIFT_LOW
#define KMALLOC_SHIFT_LOW 5 #define KMALLOC_SHIFT_LOW 5
#endif #endif
#endif #endif
#ifdef CONFIG_SLUB #ifdef CONFIG_SLUB
/*
* SLUB directly allocates requests fitting in to an order-1 page
* (PAGE_SIZE*2). Larger requests are passed to the page allocator.
*/
#define KMALLOC_SHIFT_HIGH (PAGE_SHIFT + 1) #define KMALLOC_SHIFT_HIGH (PAGE_SHIFT + 1)
#define KMALLOC_SHIFT_MAX (MAX_ORDER + PAGE_SHIFT - 1) #define KMALLOC_SHIFT_MAX (MAX_ORDER + PAGE_SHIFT - 1)
#ifndef KMALLOC_SHIFT_LOW #ifndef KMALLOC_SHIFT_LOW
@ -415,10 +421,6 @@ static __always_inline unsigned int __kmalloc_index(size_t size,
if (size <= 512 * 1024) return 19; if (size <= 512 * 1024) return 19;
if (size <= 1024 * 1024) return 20; if (size <= 1024 * 1024) return 20;
if (size <= 2 * 1024 * 1024) return 21; if (size <= 2 * 1024 * 1024) return 21;
if (size <= 4 * 1024 * 1024) return 22;
if (size <= 8 * 1024 * 1024) return 23;
if (size <= 16 * 1024 * 1024) return 24;
if (size <= 32 * 1024 * 1024) return 25;
if (!IS_ENABLED(CONFIG_PROFILE_ALL_BRANCHES) && size_is_constant) if (!IS_ENABLED(CONFIG_PROFILE_ALL_BRANCHES) && size_is_constant)
BUILD_BUG_ON_MSG(1, "unexpected size in kmalloc_index()"); BUILD_BUG_ON_MSG(1, "unexpected size in kmalloc_index()");
@ -428,6 +430,7 @@ static __always_inline unsigned int __kmalloc_index(size_t size,
/* Will never be reached. Needed because the compiler may complain */ /* Will never be reached. Needed because the compiler may complain */
return -1; return -1;
} }
static_assert(PAGE_SHIFT <= 20);
#define kmalloc_index(s) __kmalloc_index(s, true) #define kmalloc_index(s) __kmalloc_index(s, true)
#endif /* !CONFIG_SLOB */ #endif /* !CONFIG_SLOB */
@ -456,42 +459,22 @@ static __always_inline void kfree_bulk(size_t size, void **p)
kmem_cache_free_bulk(NULL, size, p); kmem_cache_free_bulk(NULL, size, p);
} }
#ifdef CONFIG_NUMA
void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment
__alloc_size(1); __alloc_size(1);
void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t flags, int node) __assume_slab_alignment void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t flags, int node) __assume_slab_alignment
__malloc; __malloc;
#else
static __always_inline __alloc_size(1) void *__kmalloc_node(size_t size, gfp_t flags, int node)
{
return __kmalloc(size, flags);
}
static __always_inline void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t flags, int node)
{
return kmem_cache_alloc(s, flags);
}
#endif
#ifdef CONFIG_TRACING #ifdef CONFIG_TRACING
extern void *kmem_cache_alloc_trace(struct kmem_cache *s, gfp_t flags, size_t size) void *kmalloc_trace(struct kmem_cache *s, gfp_t flags, size_t size)
__assume_slab_alignment __alloc_size(3); __assume_kmalloc_alignment __alloc_size(3);
#ifdef CONFIG_NUMA void *kmalloc_node_trace(struct kmem_cache *s, gfp_t gfpflags,
extern void *kmem_cache_alloc_node_trace(struct kmem_cache *s, gfp_t gfpflags, int node, size_t size) __assume_kmalloc_alignment
int node, size_t size) __assume_slab_alignment
__alloc_size(4); __alloc_size(4);
#else
static __always_inline __alloc_size(4) void *kmem_cache_alloc_node_trace(struct kmem_cache *s,
gfp_t gfpflags, int node, size_t size)
{
return kmem_cache_alloc_trace(s, gfpflags, size);
}
#endif /* CONFIG_NUMA */
#else /* CONFIG_TRACING */ #else /* CONFIG_TRACING */
static __always_inline __alloc_size(3) void *kmem_cache_alloc_trace(struct kmem_cache *s, /* Save a function call when CONFIG_TRACING=n */
gfp_t flags, size_t size) static __always_inline __alloc_size(3)
void *kmalloc_trace(struct kmem_cache *s, gfp_t flags, size_t size)
{ {
void *ret = kmem_cache_alloc(s, flags); void *ret = kmem_cache_alloc(s, flags);
@ -499,7 +482,8 @@ static __always_inline __alloc_size(3) void *kmem_cache_alloc_trace(struct kmem_
return ret; return ret;
} }
static __always_inline void *kmem_cache_alloc_node_trace(struct kmem_cache *s, gfp_t gfpflags, static __always_inline __alloc_size(4)
void *kmalloc_node_trace(struct kmem_cache *s, gfp_t gfpflags,
int node, size_t size) int node, size_t size)
{ {
void *ret = kmem_cache_alloc_node(s, gfpflags, node); void *ret = kmem_cache_alloc_node(s, gfpflags, node);
@ -509,25 +493,11 @@ static __always_inline void *kmem_cache_alloc_node_trace(struct kmem_cache *s, g
} }
#endif /* CONFIG_TRACING */ #endif /* CONFIG_TRACING */
extern void *kmalloc_order(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment void *kmalloc_large(size_t size, gfp_t flags) __assume_page_alignment
__alloc_size(1); __alloc_size(1);
#ifdef CONFIG_TRACING void *kmalloc_large_node(size_t size, gfp_t flags, int node) __assume_page_alignment
extern void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) __alloc_size(1);
__assume_page_alignment __alloc_size(1);
#else
static __always_inline __alloc_size(1) void *kmalloc_order_trace(size_t size, gfp_t flags,
unsigned int order)
{
return kmalloc_order(size, flags, order);
}
#endif
static __always_inline __alloc_size(1) void *kmalloc_large(size_t size, gfp_t flags)
{
unsigned int order = get_order(size);
return kmalloc_order_trace(size, flags, order);
}
/** /**
* kmalloc - allocate memory * kmalloc - allocate memory
@ -597,7 +567,7 @@ static __always_inline __alloc_size(1) void *kmalloc(size_t size, gfp_t flags)
if (!index) if (!index)
return ZERO_SIZE_PTR; return ZERO_SIZE_PTR;
return kmem_cache_alloc_trace( return kmalloc_trace(
kmalloc_caches[kmalloc_type(flags)][index], kmalloc_caches[kmalloc_type(flags)][index],
flags, size); flags, size);
#endif #endif
@ -605,23 +575,35 @@ static __always_inline __alloc_size(1) void *kmalloc(size_t size, gfp_t flags)
return __kmalloc(size, flags); return __kmalloc(size, flags);
} }
#ifndef CONFIG_SLOB
static __always_inline __alloc_size(1) void *kmalloc_node(size_t size, gfp_t flags, int node) static __always_inline __alloc_size(1) void *kmalloc_node(size_t size, gfp_t flags, int node)
{ {
#ifndef CONFIG_SLOB if (__builtin_constant_p(size)) {
if (__builtin_constant_p(size) && unsigned int index;
size <= KMALLOC_MAX_CACHE_SIZE) {
unsigned int i = kmalloc_index(size);
if (!i) if (size > KMALLOC_MAX_CACHE_SIZE)
return kmalloc_large_node(size, flags, node);
index = kmalloc_index(size);
if (!index)
return ZERO_SIZE_PTR; return ZERO_SIZE_PTR;
return kmem_cache_alloc_node_trace( return kmalloc_node_trace(
kmalloc_caches[kmalloc_type(flags)][i], kmalloc_caches[kmalloc_type(flags)][index],
flags, node, size); flags, node, size);
} }
#endif
return __kmalloc_node(size, flags, node); return __kmalloc_node(size, flags, node);
} }
#else
static __always_inline __alloc_size(1) void *kmalloc_node(size_t size, gfp_t flags, int node)
{
if (__builtin_constant_p(size) && size > KMALLOC_MAX_CACHE_SIZE)
return kmalloc_large_node(size, flags, node);
return __kmalloc_node(size, flags, node);
}
#endif
/** /**
* kmalloc_array - allocate memory for an array. * kmalloc_array - allocate memory for an array.
@ -647,7 +629,7 @@ static inline __alloc_size(1, 2) void *kmalloc_array(size_t n, size_t size, gfp_
* @new_size: new size of a single member of the array * @new_size: new size of a single member of the array
* @flags: the type of memory to allocate (see kmalloc) * @flags: the type of memory to allocate (see kmalloc)
*/ */
static inline __alloc_size(2, 3) void * __must_check krealloc_array(void *p, static inline __realloc_size(2, 3) void * __must_check krealloc_array(void *p,
size_t new_n, size_t new_n,
size_t new_size, size_t new_size,
gfp_t flags) gfp_t flags)
@ -671,6 +653,12 @@ static inline __alloc_size(1, 2) void *kcalloc(size_t n, size_t size, gfp_t flag
return kmalloc_array(n, size, flags | __GFP_ZERO); return kmalloc_array(n, size, flags | __GFP_ZERO);
} }
void *__kmalloc_node_track_caller(size_t size, gfp_t flags, int node,
unsigned long caller) __alloc_size(1);
#define kmalloc_node_track_caller(size, flags, node) \
__kmalloc_node_track_caller(size, flags, node, \
_RET_IP_)
/* /*
* kmalloc_track_caller is a special version of kmalloc that records the * kmalloc_track_caller is a special version of kmalloc that records the
* calling function of the routine calling it for slab leak tracking instead * calling function of the routine calling it for slab leak tracking instead
@ -679,9 +667,9 @@ static inline __alloc_size(1, 2) void *kcalloc(size_t n, size_t size, gfp_t flag
* allocator where we care about the real place the memory allocation * allocator where we care about the real place the memory allocation
* request comes from. * request comes from.
*/ */
extern void *__kmalloc_track_caller(size_t size, gfp_t flags, unsigned long caller);
#define kmalloc_track_caller(size, flags) \ #define kmalloc_track_caller(size, flags) \
__kmalloc_track_caller(size, flags, _RET_IP_) __kmalloc_node_track_caller(size, flags, \
NUMA_NO_NODE, _RET_IP_)
static inline __alloc_size(1, 2) void *kmalloc_array_node(size_t n, size_t size, gfp_t flags, static inline __alloc_size(1, 2) void *kmalloc_array_node(size_t n, size_t size, gfp_t flags,
int node) int node)
@ -700,21 +688,6 @@ static inline __alloc_size(1, 2) void *kcalloc_node(size_t n, size_t size, gfp_t
return kmalloc_array_node(n, size, flags | __GFP_ZERO, node); return kmalloc_array_node(n, size, flags | __GFP_ZERO, node);
} }
#ifdef CONFIG_NUMA
extern void *__kmalloc_node_track_caller(size_t size, gfp_t flags, int node,
unsigned long caller) __alloc_size(1);
#define kmalloc_node_track_caller(size, flags, node) \
__kmalloc_node_track_caller(size, flags, node, \
_RET_IP_)
#else /* CONFIG_NUMA */
#define kmalloc_node_track_caller(size, flags, node) \
kmalloc_track_caller(size, flags)
#endif /* CONFIG_NUMA */
/* /*
* Shortcuts * Shortcuts
*/ */
@ -774,11 +747,28 @@ static inline __alloc_size(1, 2) void *kvcalloc(size_t n, size_t size, gfp_t fla
} }
extern void *kvrealloc(const void *p, size_t oldsize, size_t newsize, gfp_t flags) extern void *kvrealloc(const void *p, size_t oldsize, size_t newsize, gfp_t flags)
__alloc_size(3); __realloc_size(3);
extern void kvfree(const void *addr); extern void kvfree(const void *addr);
extern void kvfree_sensitive(const void *addr, size_t len); extern void kvfree_sensitive(const void *addr, size_t len);
unsigned int kmem_cache_size(struct kmem_cache *s); unsigned int kmem_cache_size(struct kmem_cache *s);
/**
* kmalloc_size_roundup - Report allocation bucket size for the given size
*
* @size: Number of bytes to round up from.
*
* This returns the number of bytes that would be available in a kmalloc()
* allocation of @size bytes. For example, a 126 byte request would be
* rounded up to the next sized kmalloc bucket, 128 bytes. (This is strictly
* for the general-purpose kmalloc()-based allocations, and is not for the
* pre-sized kmem_cache_alloc()-based allocations.)
*
* Use this to kmalloc() the full bucket size ahead of time instead of using
* ksize() to query the size after an allocation.
*/
size_t kmalloc_size_roundup(size_t size);
void __init kmem_cache_init_late(void); void __init kmem_cache_init_late(void);
#if defined(CONFIG_SMP) && defined(CONFIG_SLAB) #if defined(CONFIG_SMP) && defined(CONFIG_SLAB)


@ -9,73 +9,15 @@
#include <linux/tracepoint.h> #include <linux/tracepoint.h>
#include <trace/events/mmflags.h> #include <trace/events/mmflags.h>
DECLARE_EVENT_CLASS(kmem_alloc, TRACE_EVENT(kmem_cache_alloc,
TP_PROTO(unsigned long call_site, TP_PROTO(unsigned long call_site,
const void *ptr, const void *ptr,
struct kmem_cache *s, struct kmem_cache *s,
size_t bytes_req,
size_t bytes_alloc,
gfp_t gfp_flags),
TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags),
TP_STRUCT__entry(
__field( unsigned long, call_site )
__field( const void *, ptr )
__field( size_t, bytes_req )
__field( size_t, bytes_alloc )
__field( unsigned long, gfp_flags )
__field( bool, accounted )
),
TP_fast_assign(
__entry->call_site = call_site;
__entry->ptr = ptr;
__entry->bytes_req = bytes_req;
__entry->bytes_alloc = bytes_alloc;
__entry->gfp_flags = (__force unsigned long)gfp_flags;
__entry->accounted = IS_ENABLED(CONFIG_MEMCG_KMEM) ?
((gfp_flags & __GFP_ACCOUNT) ||
(s && s->flags & SLAB_ACCOUNT)) : false;
),
TP_printk("call_site=%pS ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s accounted=%s",
(void *)__entry->call_site,
__entry->ptr,
__entry->bytes_req,
__entry->bytes_alloc,
show_gfp_flags(__entry->gfp_flags),
__entry->accounted ? "true" : "false")
);
DEFINE_EVENT(kmem_alloc, kmalloc,
TP_PROTO(unsigned long call_site, const void *ptr, struct kmem_cache *s,
size_t bytes_req, size_t bytes_alloc, gfp_t gfp_flags),
TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags)
);
DEFINE_EVENT(kmem_alloc, kmem_cache_alloc,
TP_PROTO(unsigned long call_site, const void *ptr, struct kmem_cache *s,
size_t bytes_req, size_t bytes_alloc, gfp_t gfp_flags),
TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags)
);
DECLARE_EVENT_CLASS(kmem_alloc_node,
TP_PROTO(unsigned long call_site,
const void *ptr,
struct kmem_cache *s,
size_t bytes_req,
size_t bytes_alloc,
gfp_t gfp_flags, gfp_t gfp_flags,
int node), int node),
TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags, node), TP_ARGS(call_site, ptr, s, gfp_flags, node),
TP_STRUCT__entry( TP_STRUCT__entry(
__field( unsigned long, call_site ) __field( unsigned long, call_site )
@ -90,13 +32,13 @@ DECLARE_EVENT_CLASS(kmem_alloc_node,
TP_fast_assign( TP_fast_assign(
__entry->call_site = call_site; __entry->call_site = call_site;
__entry->ptr = ptr; __entry->ptr = ptr;
__entry->bytes_req = bytes_req; __entry->bytes_req = s->object_size;
__entry->bytes_alloc = bytes_alloc; __entry->bytes_alloc = s->size;
__entry->gfp_flags = (__force unsigned long)gfp_flags; __entry->gfp_flags = (__force unsigned long)gfp_flags;
__entry->node = node; __entry->node = node;
__entry->accounted = IS_ENABLED(CONFIG_MEMCG_KMEM) ? __entry->accounted = IS_ENABLED(CONFIG_MEMCG_KMEM) ?
((gfp_flags & __GFP_ACCOUNT) || ((gfp_flags & __GFP_ACCOUNT) ||
(s && s->flags & SLAB_ACCOUNT)) : false; (s->flags & SLAB_ACCOUNT)) : false;
), ),
TP_printk("call_site=%pS ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s node=%d accounted=%s", TP_printk("call_site=%pS ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s node=%d accounted=%s",
@ -109,22 +51,44 @@ DECLARE_EVENT_CLASS(kmem_alloc_node,
__entry->accounted ? "true" : "false") __entry->accounted ? "true" : "false")
); );
DEFINE_EVENT(kmem_alloc_node, kmalloc_node, TRACE_EVENT(kmalloc,
TP_PROTO(unsigned long call_site, const void *ptr, TP_PROTO(unsigned long call_site,
struct kmem_cache *s, size_t bytes_req, size_t bytes_alloc, const void *ptr,
gfp_t gfp_flags, int node), size_t bytes_req,
size_t bytes_alloc,
gfp_t gfp_flags,
int node),
TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags, node) TP_ARGS(call_site, ptr, bytes_req, bytes_alloc, gfp_flags, node),
);
DEFINE_EVENT(kmem_alloc_node, kmem_cache_alloc_node, TP_STRUCT__entry(
__field( unsigned long, call_site )
__field( const void *, ptr )
__field( size_t, bytes_req )
__field( size_t, bytes_alloc )
__field( unsigned long, gfp_flags )
__field( int, node )
),
TP_PROTO(unsigned long call_site, const void *ptr, TP_fast_assign(
struct kmem_cache *s, size_t bytes_req, size_t bytes_alloc, __entry->call_site = call_site;
gfp_t gfp_flags, int node), __entry->ptr = ptr;
__entry->bytes_req = bytes_req;
__entry->bytes_alloc = bytes_alloc;
__entry->gfp_flags = (__force unsigned long)gfp_flags;
__entry->node = node;
),
TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags, node) TP_printk("call_site=%pS ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s node=%d accounted=%s",
(void *)__entry->call_site,
__entry->ptr,
__entry->bytes_req,
__entry->bytes_alloc,
show_gfp_flags(__entry->gfp_flags),
__entry->node,
(IS_ENABLED(CONFIG_MEMCG_KMEM) &&
(__entry->gfp_flags & (__force unsigned long)__GFP_ACCOUNT)) ? "true" : "false")
); );
TRACE_EVENT(kfree, TRACE_EVENT(kfree,
@ -149,20 +113,20 @@ TRACE_EVENT(kfree,
TRACE_EVENT(kmem_cache_free, TRACE_EVENT(kmem_cache_free,
TP_PROTO(unsigned long call_site, const void *ptr, const char *name), TP_PROTO(unsigned long call_site, const void *ptr, const struct kmem_cache *s),
TP_ARGS(call_site, ptr, name), TP_ARGS(call_site, ptr, s),
TP_STRUCT__entry( TP_STRUCT__entry(
__field( unsigned long, call_site ) __field( unsigned long, call_site )
__field( const void *, ptr ) __field( const void *, ptr )
__string( name, name ) __string( name, s->name )
), ),
TP_fast_assign( TP_fast_assign(
__entry->call_site = call_site; __entry->call_site = call_site;
__entry->ptr = ptr; __entry->ptr = ptr;
__assign_str(name, name); __assign_str(name, s->name);
), ),
TP_printk("call_site=%pS ptr=%p name=%s", TP_printk("call_site=%pS ptr=%p name=%s",


@@ -86,6 +86,7 @@ static int get_stack_skipnr(const unsigned long stack_entries[], int num_entries
 		/* Also the *_bulk() variants by only checking prefixes. */
 		if (str_has_prefix(buf, ARCH_FUNC_PREFIX "kfree") ||
 		    str_has_prefix(buf, ARCH_FUNC_PREFIX "kmem_cache_free") ||
+		    str_has_prefix(buf, ARCH_FUNC_PREFIX "__kmem_cache_free") ||
 		    str_has_prefix(buf, ARCH_FUNC_PREFIX "__kmalloc") ||
 		    str_has_prefix(buf, ARCH_FUNC_PREFIX "kmem_cache_alloc"))
 			goto found;

mm/slab.c (297 changed lines)

@ -3181,84 +3181,46 @@ must_grow:
} }
static __always_inline void * static __always_inline void *
slab_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid, size_t orig_size, __do_cache_alloc(struct kmem_cache *cachep, gfp_t flags, int nodeid)
unsigned long caller)
{ {
unsigned long save_flags; void *objp = NULL;
void *ptr;
int slab_node = numa_mem_id(); int slab_node = numa_mem_id();
struct obj_cgroup *objcg = NULL;
bool init = false;
flags &= gfp_allowed_mask; if (nodeid == NUMA_NO_NODE) {
cachep = slab_pre_alloc_hook(cachep, NULL, &objcg, 1, flags); if (current->mempolicy || cpuset_do_slab_mem_spread()) {
if (unlikely(!cachep)) objp = alternate_node_alloc(cachep, flags);
return NULL; if (objp)
ptr = kfence_alloc(cachep, orig_size, flags);
if (unlikely(ptr))
goto out_hooks;
local_irq_save(save_flags);
if (nodeid == NUMA_NO_NODE)
nodeid = slab_node;
if (unlikely(!get_node(cachep, nodeid))) {
/* Node not bootstrapped yet */
ptr = fallback_alloc(cachep, flags);
goto out; goto out;
} }
if (nodeid == slab_node) {
/* /*
* Use the locally cached objects if possible. * Use the locally cached objects if possible.
* However ____cache_alloc does not allow fallback * However ____cache_alloc does not allow fallback
* to other nodes. It may fail while we still have * to other nodes. It may fail while we still have
* objects on other nodes available. * objects on other nodes available.
*/ */
ptr = ____cache_alloc(cachep, flags); objp = ____cache_alloc(cachep, flags);
if (ptr) nodeid = slab_node;
} else if (nodeid == slab_node) {
objp = ____cache_alloc(cachep, flags);
} else if (!get_node(cachep, nodeid)) {
/* Node not bootstrapped yet */
objp = fallback_alloc(cachep, flags);
goto out; goto out;
} }
/* ___cache_alloc_node can fall back to other nodes */
ptr = ____cache_alloc_node(cachep, flags, nodeid);
out:
local_irq_restore(save_flags);
ptr = cache_alloc_debugcheck_after(cachep, flags, ptr, caller);
init = slab_want_init_on_alloc(flags, cachep);
out_hooks:
slab_post_alloc_hook(cachep, objcg, flags, 1, &ptr, init);
return ptr;
}
static __always_inline void *
__do_cache_alloc(struct kmem_cache *cache, gfp_t flags)
{
void *objp;
if (current->mempolicy || cpuset_do_slab_mem_spread()) {
objp = alternate_node_alloc(cache, flags);
if (objp)
goto out;
}
objp = ____cache_alloc(cache, flags);
/* /*
* We may just have run out of memory on the local node. * We may just have run out of memory on the local node.
* ____cache_alloc_node() knows how to locate memory on other nodes * ____cache_alloc_node() knows how to locate memory on other nodes
*/ */
if (!objp) if (!objp)
objp = ____cache_alloc_node(cache, flags, numa_mem_id()); objp = ____cache_alloc_node(cachep, flags, nodeid);
out: out:
return objp; return objp;
} }
#else #else
static __always_inline void * static __always_inline void *
__do_cache_alloc(struct kmem_cache *cachep, gfp_t flags) __do_cache_alloc(struct kmem_cache *cachep, gfp_t flags, int nodeid __maybe_unused)
{ {
return ____cache_alloc(cachep, flags); return ____cache_alloc(cachep, flags);
} }
@ -3266,8 +3228,8 @@ __do_cache_alloc(struct kmem_cache *cachep, gfp_t flags)
#endif /* CONFIG_NUMA */ #endif /* CONFIG_NUMA */
static __always_inline void * static __always_inline void *
slab_alloc(struct kmem_cache *cachep, struct list_lru *lru, gfp_t flags, slab_alloc_node(struct kmem_cache *cachep, struct list_lru *lru, gfp_t flags,
size_t orig_size, unsigned long caller) int nodeid, size_t orig_size, unsigned long caller)
{ {
unsigned long save_flags; unsigned long save_flags;
void *objp; void *objp;
@ -3284,7 +3246,7 @@ slab_alloc(struct kmem_cache *cachep, struct list_lru *lru, gfp_t flags,
goto out; goto out;
local_irq_save(save_flags); local_irq_save(save_flags);
objp = __do_cache_alloc(cachep, flags); objp = __do_cache_alloc(cachep, flags, nodeid);
local_irq_restore(save_flags); local_irq_restore(save_flags);
objp = cache_alloc_debugcheck_after(cachep, flags, objp, caller); objp = cache_alloc_debugcheck_after(cachep, flags, objp, caller);
prefetchw(objp); prefetchw(objp);
@ -3295,6 +3257,14 @@ out:
return objp; return objp;
} }
static __always_inline void *
slab_alloc(struct kmem_cache *cachep, struct list_lru *lru, gfp_t flags,
size_t orig_size, unsigned long caller)
{
return slab_alloc_node(cachep, lru, flags, NUMA_NO_NODE, orig_size,
caller);
}
/* /*
* Caller needs to acquire correct kmem_cache_node's list_lock * Caller needs to acquire correct kmem_cache_node's list_lock
* @list: List of detached free slabs should be freed by caller * @list: List of detached free slabs should be freed by caller
@ -3470,8 +3440,7 @@ void *__kmem_cache_alloc_lru(struct kmem_cache *cachep, struct list_lru *lru,
{ {
void *ret = slab_alloc(cachep, lru, flags, cachep->object_size, _RET_IP_); void *ret = slab_alloc(cachep, lru, flags, cachep->object_size, _RET_IP_);
trace_kmem_cache_alloc(_RET_IP_, ret, cachep, trace_kmem_cache_alloc(_RET_IP_, ret, cachep, flags, NUMA_NO_NODE);
cachep->object_size, cachep->size, flags);
return ret; return ret;
} }
@ -3521,7 +3490,8 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
local_irq_disable(); local_irq_disable();
for (i = 0; i < size; i++) { for (i = 0; i < size; i++) {
void *objp = kfence_alloc(s, s->object_size, flags) ?: __do_cache_alloc(s, flags); void *objp = kfence_alloc(s, s->object_size, flags) ?:
__do_cache_alloc(s, flags, NUMA_NO_NODE);
if (unlikely(!objp)) if (unlikely(!objp))
goto error; goto error;
@ -3548,23 +3518,6 @@ error:
} }
EXPORT_SYMBOL(kmem_cache_alloc_bulk); EXPORT_SYMBOL(kmem_cache_alloc_bulk);
#ifdef CONFIG_TRACING
void *
kmem_cache_alloc_trace(struct kmem_cache *cachep, gfp_t flags, size_t size)
{
void *ret;
ret = slab_alloc(cachep, NULL, flags, size, _RET_IP_);
ret = kasan_kmalloc(cachep, ret, size, flags);
trace_kmalloc(_RET_IP_, ret, cachep,
size, cachep->size, flags);
return ret;
}
EXPORT_SYMBOL(kmem_cache_alloc_trace);
#endif
#ifdef CONFIG_NUMA
/** /**
* kmem_cache_alloc_node - Allocate an object on the specified node * kmem_cache_alloc_node - Allocate an object on the specified node
* @cachep: The cache to allocate from. * @cachep: The cache to allocate from.
@ -3580,65 +3533,21 @@ EXPORT_SYMBOL(kmem_cache_alloc_trace);
*/ */
void *kmem_cache_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid) void *kmem_cache_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid)
{ {
void *ret = slab_alloc_node(cachep, flags, nodeid, cachep->object_size, _RET_IP_); void *ret = slab_alloc_node(cachep, NULL, flags, nodeid, cachep->object_size, _RET_IP_);
trace_kmem_cache_alloc_node(_RET_IP_, ret, cachep, trace_kmem_cache_alloc(_RET_IP_, ret, cachep, flags, nodeid);
cachep->object_size, cachep->size,
flags, nodeid);
return ret; return ret;
} }
EXPORT_SYMBOL(kmem_cache_alloc_node); EXPORT_SYMBOL(kmem_cache_alloc_node);
#ifdef CONFIG_TRACING void *__kmem_cache_alloc_node(struct kmem_cache *cachep, gfp_t flags,
void *kmem_cache_alloc_node_trace(struct kmem_cache *cachep, int nodeid, size_t orig_size,
gfp_t flags, unsigned long caller)
int nodeid,
size_t size)
{ {
void *ret; return slab_alloc_node(cachep, NULL, flags, nodeid,
orig_size, caller);
ret = slab_alloc_node(cachep, flags, nodeid, size, _RET_IP_);
ret = kasan_kmalloc(cachep, ret, size, flags);
trace_kmalloc_node(_RET_IP_, ret, cachep,
size, cachep->size,
flags, nodeid);
return ret;
} }
EXPORT_SYMBOL(kmem_cache_alloc_node_trace);
#endif
static __always_inline void *
__do_kmalloc_node(size_t size, gfp_t flags, int node, unsigned long caller)
{
struct kmem_cache *cachep;
void *ret;
if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
return NULL;
cachep = kmalloc_slab(size, flags);
if (unlikely(ZERO_OR_NULL_PTR(cachep)))
return cachep;
ret = kmem_cache_alloc_node_trace(cachep, flags, node, size);
ret = kasan_kmalloc(cachep, ret, size, flags);
return ret;
}
void *__kmalloc_node(size_t size, gfp_t flags, int node)
{
return __do_kmalloc_node(size, flags, node, _RET_IP_);
}
EXPORT_SYMBOL(__kmalloc_node);
void *__kmalloc_node_track_caller(size_t size, gfp_t flags,
int node, unsigned long caller)
{
return __do_kmalloc_node(size, flags, node, caller);
}
EXPORT_SYMBOL(__kmalloc_node_track_caller);
#endif /* CONFIG_NUMA */
#ifdef CONFIG_PRINTK #ifdef CONFIG_PRINTK
void __kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab) void __kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab)
@ -3662,45 +3571,25 @@ void __kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab)
} }
#endif #endif
/** static __always_inline
* __do_kmalloc - allocate memory void __do_kmem_cache_free(struct kmem_cache *cachep, void *objp,
* @size: how many bytes of memory are required.
* @flags: the type of memory to allocate (see kmalloc).
* @caller: function caller for debug tracking of the caller
*
* Return: pointer to the allocated memory or %NULL in case of error
*/
static __always_inline void *__do_kmalloc(size_t size, gfp_t flags,
unsigned long caller) unsigned long caller)
{ {
struct kmem_cache *cachep; unsigned long flags;
void *ret;
if (unlikely(size > KMALLOC_MAX_CACHE_SIZE)) local_irq_save(flags);
return NULL; debug_check_no_locks_freed(objp, cachep->object_size);
cachep = kmalloc_slab(size, flags); if (!(cachep->flags & SLAB_DEBUG_OBJECTS))
if (unlikely(ZERO_OR_NULL_PTR(cachep))) debug_check_no_obj_freed(objp, cachep->object_size);
return cachep; __cache_free(cachep, objp, caller);
ret = slab_alloc(cachep, NULL, flags, size, caller); local_irq_restore(flags);
ret = kasan_kmalloc(cachep, ret, size, flags);
trace_kmalloc(caller, ret, cachep,
size, cachep->size, flags);
return ret;
} }
void *__kmalloc(size_t size, gfp_t flags) void __kmem_cache_free(struct kmem_cache *cachep, void *objp,
unsigned long caller)
{ {
return __do_kmalloc(size, flags, _RET_IP_); __do_kmem_cache_free(cachep, objp, caller);
} }
EXPORT_SYMBOL(__kmalloc);
void *__kmalloc_track_caller(size_t size, gfp_t flags, unsigned long caller)
{
return __do_kmalloc(size, flags, caller);
}
EXPORT_SYMBOL(__kmalloc_track_caller);
/** /**
* kmem_cache_free - Deallocate an object * kmem_cache_free - Deallocate an object
@ -3712,34 +3601,38 @@ EXPORT_SYMBOL(__kmalloc_track_caller);
*/ */
void kmem_cache_free(struct kmem_cache *cachep, void *objp) void kmem_cache_free(struct kmem_cache *cachep, void *objp)
{ {
unsigned long flags;
cachep = cache_from_obj(cachep, objp); cachep = cache_from_obj(cachep, objp);
if (!cachep) if (!cachep)
return; return;
trace_kmem_cache_free(_RET_IP_, objp, cachep->name); trace_kmem_cache_free(_RET_IP_, objp, cachep);
local_irq_save(flags); __do_kmem_cache_free(cachep, objp, _RET_IP_);
debug_check_no_locks_freed(objp, cachep->object_size);
if (!(cachep->flags & SLAB_DEBUG_OBJECTS))
debug_check_no_obj_freed(objp, cachep->object_size);
__cache_free(cachep, objp, _RET_IP_);
local_irq_restore(flags);
} }
EXPORT_SYMBOL(kmem_cache_free); EXPORT_SYMBOL(kmem_cache_free);
void kmem_cache_free_bulk(struct kmem_cache *orig_s, size_t size, void **p) void kmem_cache_free_bulk(struct kmem_cache *orig_s, size_t size, void **p)
{ {
struct kmem_cache *s;
size_t i;
local_irq_disable(); local_irq_disable();
for (i = 0; i < size; i++) { for (int i = 0; i < size; i++) {
void *objp = p[i]; void *objp = p[i];
struct kmem_cache *s;
if (!orig_s) /* called via kfree_bulk */ if (!orig_s) {
s = virt_to_cache(objp); struct folio *folio = virt_to_folio(objp);
else
/* called via kfree_bulk */
if (!folio_test_slab(folio)) {
local_irq_enable();
free_large_kmalloc(folio, objp);
local_irq_disable();
continue;
}
s = folio_slab(folio)->slab_cache;
} else {
s = cache_from_obj(orig_s, objp); s = cache_from_obj(orig_s, objp);
}
if (!s) if (!s)
continue; continue;
@ -3755,39 +3648,6 @@ void kmem_cache_free_bulk(struct kmem_cache *orig_s, size_t size, void **p)
} }
EXPORT_SYMBOL(kmem_cache_free_bulk); EXPORT_SYMBOL(kmem_cache_free_bulk);
/**
* kfree - free previously allocated memory
* @objp: pointer returned by kmalloc.
*
* If @objp is NULL, no operation is performed.
*
* Don't free memory not originally allocated by kmalloc()
* or you will run into trouble.
*/
void kfree(const void *objp)
{
struct kmem_cache *c;
unsigned long flags;
trace_kfree(_RET_IP_, objp);
if (unlikely(ZERO_OR_NULL_PTR(objp)))
return;
local_irq_save(flags);
kfree_debugcheck(objp);
c = virt_to_cache(objp);
if (!c) {
local_irq_restore(flags);
return;
}
debug_check_no_locks_freed(objp, c->object_size);
debug_check_no_obj_freed(objp, c->object_size);
__cache_free(c, (void *)objp, _RET_IP_);
local_irq_restore(flags);
}
EXPORT_SYMBOL(kfree);
/* /*
* This initializes kmem_cache_node or resizes various caches for all nodes. * This initializes kmem_cache_node or resizes various caches for all nodes.
*/ */
@ -4190,28 +4050,3 @@ void __check_heap_object(const void *ptr, unsigned long n,
usercopy_abort("SLAB object", cachep->name, to_user, offset, n); usercopy_abort("SLAB object", cachep->name, to_user, offset, n);
} }
#endif /* CONFIG_HARDENED_USERCOPY */ #endif /* CONFIG_HARDENED_USERCOPY */
/**
* __ksize -- Uninstrumented ksize.
* @objp: pointer to the object
*
* Unlike ksize(), __ksize() is uninstrumented, and does not provide the same
* safety checks as ksize() with KASAN instrumentation enabled.
*
* Return: size of the actual memory used by @objp in bytes
*/
size_t __ksize(const void *objp)
{
struct kmem_cache *c;
size_t size;
BUG_ON(!objp);
if (unlikely(objp == ZERO_SIZE_PTR))
return 0;
c = virt_to_cache(objp);
size = c ? c->object_size : 0;
return size;
}
EXPORT_SYMBOL(__ksize);


@@ -273,6 +273,11 @@ void create_kmalloc_caches(slab_flags_t);
 /* Find the kmalloc slab corresponding for a certain size */
 struct kmem_cache *kmalloc_slab(size_t, gfp_t);
+
+void *__kmem_cache_alloc_node(struct kmem_cache *s, gfp_t gfpflags,
+			      int node, size_t orig_size,
+			      unsigned long caller);
+void __kmem_cache_free(struct kmem_cache *s, void *x, unsigned long caller);
 #endif
 
 gfp_t kmalloc_fix_flags(gfp_t flags);
@@ -658,8 +663,13 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
 	print_tracking(cachep, x);
 	return cachep;
 }
+
+void free_large_kmalloc(struct folio *folio, void *object);
+
 #endif /* CONFIG_SLOB */
 
+size_t __ksize(const void *objp);
+
 static inline size_t slab_ksize(const struct kmem_cache *s)
 {
 #ifndef CONFIG_SLUB


@ -511,13 +511,9 @@ EXPORT_SYMBOL(kmem_cache_destroy);
*/ */
int kmem_cache_shrink(struct kmem_cache *cachep) int kmem_cache_shrink(struct kmem_cache *cachep)
{ {
int ret;
kasan_cache_shrink(cachep); kasan_cache_shrink(cachep);
ret = __kmem_cache_shrink(cachep);
return ret; return __kmem_cache_shrink(cachep);
} }
EXPORT_SYMBOL(kmem_cache_shrink); EXPORT_SYMBOL(kmem_cache_shrink);
@ -665,7 +661,8 @@ struct kmem_cache *__init create_kmalloc_cache(const char *name,
if (!s) if (!s)
panic("Out of memory when creating slab %s\n", name); panic("Out of memory when creating slab %s\n", name);
create_boot_cache(s, name, size, flags, useroffset, usersize); create_boot_cache(s, name, size, flags | SLAB_KMALLOC, useroffset,
usersize);
kasan_cache_create_kmalloc(s); kasan_cache_create_kmalloc(s);
list_add(&s->list, &slab_caches); list_add(&s->list, &slab_caches);
s->refcount = 1; s->refcount = 1;
@ -737,6 +734,26 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
return kmalloc_caches[kmalloc_type(flags)][index]; return kmalloc_caches[kmalloc_type(flags)][index];
} }
size_t kmalloc_size_roundup(size_t size)
{
struct kmem_cache *c;
/* Short-circuit the 0 size case. */
if (unlikely(size == 0))
return 0;
/* Short-circuit saturated "too-large" case. */
if (unlikely(size == SIZE_MAX))
return SIZE_MAX;
/* Above the smaller buckets, size is a multiple of page size. */
if (size > KMALLOC_MAX_CACHE_SIZE)
return PAGE_SIZE << get_order(size);
/* The flags don't matter since size_index is common to all. */
c = kmalloc_slab(size, GFP_KERNEL);
return c ? c->object_size : 0;
}
EXPORT_SYMBOL(kmalloc_size_roundup);
#ifdef CONFIG_ZONE_DMA #ifdef CONFIG_ZONE_DMA
#define KMALLOC_DMA_NAME(sz) .name[KMALLOC_DMA] = "dma-kmalloc-" #sz, #define KMALLOC_DMA_NAME(sz) .name[KMALLOC_DMA] = "dma-kmalloc-" #sz,
#else #else
@ -760,8 +777,8 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
/* /*
* kmalloc_info[] is to make slub_debug=,kmalloc-xx option work at boot time. * kmalloc_info[] is to make slub_debug=,kmalloc-xx option work at boot time.
* kmalloc_index() supports up to 2^25=32MB, so the final entry of the table is * kmalloc_index() supports up to 2^21=2MB, so the final entry of the table is
* kmalloc-32M. * kmalloc-2M.
*/ */
const struct kmalloc_info_struct kmalloc_info[] __initconst = { const struct kmalloc_info_struct kmalloc_info[] __initconst = {
INIT_KMALLOC_INFO(0, 0), INIT_KMALLOC_INFO(0, 0),
@ -785,11 +802,7 @@ const struct kmalloc_info_struct kmalloc_info[] __initconst = {
INIT_KMALLOC_INFO(262144, 256k), INIT_KMALLOC_INFO(262144, 256k),
INIT_KMALLOC_INFO(524288, 512k), INIT_KMALLOC_INFO(524288, 512k),
INIT_KMALLOC_INFO(1048576, 1M), INIT_KMALLOC_INFO(1048576, 1M),
INIT_KMALLOC_INFO(2097152, 2M), INIT_KMALLOC_INFO(2097152, 2M)
INIT_KMALLOC_INFO(4194304, 4M),
INIT_KMALLOC_INFO(8388608, 8M),
INIT_KMALLOC_INFO(16777216, 16M),
INIT_KMALLOC_INFO(33554432, 32M)
}; };
/* /*
@ -902,6 +915,155 @@ void __init create_kmalloc_caches(slab_flags_t flags)
/* Kmalloc array is now usable */ /* Kmalloc array is now usable */
slab_state = UP; slab_state = UP;
} }
void free_large_kmalloc(struct folio *folio, void *object)
{
unsigned int order = folio_order(folio);
if (WARN_ON_ONCE(order == 0))
pr_warn_once("object pointer: 0x%p\n", object);
kmemleak_free(object);
kasan_kfree_large(object);
mod_lruvec_page_state(folio_page(folio, 0), NR_SLAB_UNRECLAIMABLE_B,
-(PAGE_SIZE << order));
__free_pages(folio_page(folio, 0), order);
}
static void *__kmalloc_large_node(size_t size, gfp_t flags, int node);
static __always_inline
void *__do_kmalloc_node(size_t size, gfp_t flags, int node, unsigned long caller)
{
struct kmem_cache *s;
void *ret;
if (unlikely(size > KMALLOC_MAX_CACHE_SIZE)) {
ret = __kmalloc_large_node(size, flags, node);
trace_kmalloc(_RET_IP_, ret, size,
PAGE_SIZE << get_order(size), flags, node);
return ret;
}
s = kmalloc_slab(size, flags);
if (unlikely(ZERO_OR_NULL_PTR(s)))
return s;
ret = __kmem_cache_alloc_node(s, flags, node, size, caller);
ret = kasan_kmalloc(s, ret, size, flags);
trace_kmalloc(_RET_IP_, ret, size, s->size, flags, node);
return ret;
}
void *__kmalloc_node(size_t size, gfp_t flags, int node)
{
return __do_kmalloc_node(size, flags, node, _RET_IP_);
}
EXPORT_SYMBOL(__kmalloc_node);
void *__kmalloc(size_t size, gfp_t flags)
{
return __do_kmalloc_node(size, flags, NUMA_NO_NODE, _RET_IP_);
}
EXPORT_SYMBOL(__kmalloc);
void *__kmalloc_node_track_caller(size_t size, gfp_t flags,
int node, unsigned long caller)
{
return __do_kmalloc_node(size, flags, node, caller);
}
EXPORT_SYMBOL(__kmalloc_node_track_caller);
/**
* kfree - free previously allocated memory
* @object: pointer returned by kmalloc.
*
* If @object is NULL, no operation is performed.
*
* Don't free memory not originally allocated by kmalloc()
* or you will run into trouble.
*/
void kfree(const void *object)
{
struct folio *folio;
struct slab *slab;
struct kmem_cache *s;
trace_kfree(_RET_IP_, object);
if (unlikely(ZERO_OR_NULL_PTR(object)))
return;
folio = virt_to_folio(object);
if (unlikely(!folio_test_slab(folio))) {
free_large_kmalloc(folio, (void *)object);
return;
}
slab = folio_slab(folio);
s = slab->slab_cache;
__kmem_cache_free(s, (void *)object, _RET_IP_);
}
EXPORT_SYMBOL(kfree);
/**
* __ksize -- Report full size of underlying allocation
* @objp: pointer to the object
*
* This should only be used internally to query the true size of allocations.
* It is not meant to be a way to discover the usable size of an allocation
* after the fact. Instead, use kmalloc_size_roundup(). Using memory beyond
* the originally requested allocation size may trigger KASAN, UBSAN_BOUNDS,
* and/or FORTIFY_SOURCE.
*
* Return: size of the actual memory used by @objp in bytes
*/
size_t __ksize(const void *object)
{
struct folio *folio;
if (unlikely(object == ZERO_SIZE_PTR))
return 0;
folio = virt_to_folio(object);
if (unlikely(!folio_test_slab(folio))) {
if (WARN_ON(folio_size(folio) <= KMALLOC_MAX_CACHE_SIZE))
return 0;
if (WARN_ON(object != folio_address(folio)))
return 0;
return folio_size(folio);
}
return slab_ksize(folio_slab(folio)->slab_cache);
}
#ifdef CONFIG_TRACING
void *kmalloc_trace(struct kmem_cache *s, gfp_t gfpflags, size_t size)
{
void *ret = __kmem_cache_alloc_node(s, gfpflags, NUMA_NO_NODE,
size, _RET_IP_);
trace_kmalloc(_RET_IP_, ret, size, s->size, gfpflags, NUMA_NO_NODE);
ret = kasan_kmalloc(s, ret, size, gfpflags);
return ret;
}
EXPORT_SYMBOL(kmalloc_trace);
void *kmalloc_node_trace(struct kmem_cache *s, gfp_t gfpflags,
int node, size_t size)
{
void *ret = __kmem_cache_alloc_node(s, gfpflags, node, size, _RET_IP_);
trace_kmalloc(_RET_IP_, ret, size, s->size, gfpflags, node);
ret = kasan_kmalloc(s, ret, size, gfpflags);
return ret;
}
EXPORT_SYMBOL(kmalloc_node_trace);
#endif /* !CONFIG_TRACING */
#endif /* !CONFIG_SLOB */ #endif /* !CONFIG_SLOB */
gfp_t kmalloc_fix_flags(gfp_t flags) gfp_t kmalloc_fix_flags(gfp_t flags)
@ -921,37 +1083,50 @@ gfp_t kmalloc_fix_flags(gfp_t flags)
* directly to the page allocator. We use __GFP_COMP, because we will need to * directly to the page allocator. We use __GFP_COMP, because we will need to
* know the allocation order to free the pages properly in kfree. * know the allocation order to free the pages properly in kfree.
*/ */
void *kmalloc_order(size_t size, gfp_t flags, unsigned int order)
static void *__kmalloc_large_node(size_t size, gfp_t flags, int node)
{ {
void *ret = NULL;
struct page *page; struct page *page;
void *ptr = NULL;
unsigned int order = get_order(size);
if (unlikely(flags & GFP_SLAB_BUG_MASK)) if (unlikely(flags & GFP_SLAB_BUG_MASK))
flags = kmalloc_fix_flags(flags); flags = kmalloc_fix_flags(flags);
flags |= __GFP_COMP; flags |= __GFP_COMP;
page = alloc_pages(flags, order); page = alloc_pages_node(node, flags, order);
if (likely(page)) { if (page) {
ret = page_address(page); ptr = page_address(page);
mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE_B, mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE_B,
PAGE_SIZE << order); PAGE_SIZE << order);
} }
ret = kasan_kmalloc_large(ret, size, flags);
/* As ret might get tagged, call kmemleak hook after KASAN. */
kmemleak_alloc(ret, size, 1, flags);
return ret;
}
EXPORT_SYMBOL(kmalloc_order);
#ifdef CONFIG_TRACING ptr = kasan_kmalloc_large(ptr, size, flags);
void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) /* As ptr might get tagged, call kmemleak hook after KASAN. */
kmemleak_alloc(ptr, size, 1, flags);
return ptr;
}
void *kmalloc_large(size_t size, gfp_t flags)
{ {
void *ret = kmalloc_order(size, flags, order); void *ret = __kmalloc_large_node(size, flags, NUMA_NO_NODE);
trace_kmalloc(_RET_IP_, ret, NULL, size, PAGE_SIZE << order, flags);
trace_kmalloc(_RET_IP_, ret, size, PAGE_SIZE << get_order(size),
flags, NUMA_NO_NODE);
return ret; return ret;
} }
EXPORT_SYMBOL(kmalloc_order_trace); EXPORT_SYMBOL(kmalloc_large);
#endif
void *kmalloc_large_node(size_t size, gfp_t flags, int node)
{
void *ret = __kmalloc_large_node(size, flags, node);
trace_kmalloc(_RET_IP_, ret, size, PAGE_SIZE << get_order(size),
flags, node);
return ret;
}
EXPORT_SYMBOL(kmalloc_large_node);
#ifdef CONFIG_SLAB_FREELIST_RANDOM #ifdef CONFIG_SLAB_FREELIST_RANDOM
/* Randomize a generic freelist */ /* Randomize a generic freelist */
@ -1150,8 +1325,8 @@ module_init(slab_proc_init);
#endif /* CONFIG_SLAB || CONFIG_SLUB_DEBUG */ #endif /* CONFIG_SLAB || CONFIG_SLUB_DEBUG */
static __always_inline void *__do_krealloc(const void *p, size_t new_size, static __always_inline __realloc_size(2) void *
gfp_t flags) __do_krealloc(const void *p, size_t new_size, gfp_t flags)
{ {
void *ret; void *ret;
size_t ks; size_t ks;
@ -1283,8 +1458,6 @@ EXPORT_SYMBOL(ksize);
/* Tracepoints definitions. */ /* Tracepoints definitions. */
EXPORT_TRACEPOINT_SYMBOL(kmalloc); EXPORT_TRACEPOINT_SYMBOL(kmalloc);
EXPORT_TRACEPOINT_SYMBOL(kmem_cache_alloc); EXPORT_TRACEPOINT_SYMBOL(kmem_cache_alloc);
EXPORT_TRACEPOINT_SYMBOL(kmalloc_node);
EXPORT_TRACEPOINT_SYMBOL(kmem_cache_alloc_node);
EXPORT_TRACEPOINT_SYMBOL(kfree); EXPORT_TRACEPOINT_SYMBOL(kfree);
EXPORT_TRACEPOINT_SYMBOL(kmem_cache_free); EXPORT_TRACEPOINT_SYMBOL(kmem_cache_free);


@ -507,8 +507,7 @@ __do_kmalloc_node(size_t size, gfp_t gfp, int node, unsigned long caller)
*m = size; *m = size;
ret = (void *)m + minalign; ret = (void *)m + minalign;
trace_kmalloc_node(caller, ret, NULL, trace_kmalloc(caller, ret, size, size + minalign, gfp, node);
size, size + minalign, gfp, node);
} else { } else {
unsigned int order = get_order(size); unsigned int order = get_order(size);
@ -516,8 +515,7 @@ __do_kmalloc_node(size_t size, gfp_t gfp, int node, unsigned long caller)
gfp |= __GFP_COMP; gfp |= __GFP_COMP;
ret = slob_new_pages(gfp, order, node); ret = slob_new_pages(gfp, order, node);
trace_kmalloc_node(caller, ret, NULL, trace_kmalloc(caller, ret, size, PAGE_SIZE << order, gfp, node);
size, PAGE_SIZE << order, gfp, node);
} }
kmemleak_alloc(ret, size, 1, gfp); kmemleak_alloc(ret, size, 1, gfp);
@ -530,20 +528,12 @@ void *__kmalloc(size_t size, gfp_t gfp)
} }
EXPORT_SYMBOL(__kmalloc); EXPORT_SYMBOL(__kmalloc);
void *__kmalloc_track_caller(size_t size, gfp_t gfp, unsigned long caller)
{
return __do_kmalloc_node(size, gfp, NUMA_NO_NODE, caller);
}
EXPORT_SYMBOL(__kmalloc_track_caller);
#ifdef CONFIG_NUMA
void *__kmalloc_node_track_caller(size_t size, gfp_t gfp, void *__kmalloc_node_track_caller(size_t size, gfp_t gfp,
int node, unsigned long caller) int node, unsigned long caller)
{ {
return __do_kmalloc_node(size, gfp, node, caller); return __do_kmalloc_node(size, gfp, node, caller);
} }
EXPORT_SYMBOL(__kmalloc_node_track_caller); EXPORT_SYMBOL(__kmalloc_node_track_caller);
#endif
void kfree(const void *block) void kfree(const void *block)
{ {
@ -574,6 +564,20 @@ void kfree(const void *block)
} }
EXPORT_SYMBOL(kfree); EXPORT_SYMBOL(kfree);
size_t kmalloc_size_roundup(size_t size)
{
/* Short-circuit the 0 size case. */
if (unlikely(size == 0))
return 0;
/* Short-circuit saturated "too-large" case. */
if (unlikely(size == SIZE_MAX))
return SIZE_MAX;
return ALIGN(size, ARCH_KMALLOC_MINALIGN);
}
EXPORT_SYMBOL(kmalloc_size_roundup);
/* can't use ksize for kmem_cache_alloc memory, only kmalloc */ /* can't use ksize for kmem_cache_alloc memory, only kmalloc */
size_t __ksize(const void *block) size_t __ksize(const void *block)
{ {
@ -594,7 +598,6 @@ size_t __ksize(const void *block)
m = (unsigned int *)(block - align); m = (unsigned int *)(block - align);
return SLOB_UNITS(*m) * SLOB_UNIT; return SLOB_UNITS(*m) * SLOB_UNIT;
} }
EXPORT_SYMBOL(__ksize);
int __kmem_cache_create(struct kmem_cache *c, slab_flags_t flags) int __kmem_cache_create(struct kmem_cache *c, slab_flags_t flags)
{ {
@ -602,6 +605,9 @@ int __kmem_cache_create(struct kmem_cache *c, slab_flags_t flags)
/* leave room for rcu footer at the end of object */ /* leave room for rcu footer at the end of object */
c->size += sizeof(struct slob_rcu); c->size += sizeof(struct slob_rcu);
} }
/* Actual size allocated */
c->size = SLOB_UNITS(c->size) * SLOB_UNIT;
c->flags = flags; c->flags = flags;
return 0; return 0;
} }
@ -616,14 +622,10 @@ static void *slob_alloc_node(struct kmem_cache *c, gfp_t flags, int node)
if (c->size < PAGE_SIZE) { if (c->size < PAGE_SIZE) {
b = slob_alloc(c->size, flags, c->align, node, 0); b = slob_alloc(c->size, flags, c->align, node, 0);
trace_kmem_cache_alloc_node(_RET_IP_, b, NULL, c->object_size, trace_kmem_cache_alloc(_RET_IP_, b, c, flags, node);
SLOB_UNITS(c->size) * SLOB_UNIT,
flags, node);
} else { } else {
b = slob_new_pages(flags, get_order(c->size), node); b = slob_new_pages(flags, get_order(c->size), node);
trace_kmem_cache_alloc_node(_RET_IP_, b, NULL, c->object_size, trace_kmem_cache_alloc(_RET_IP_, b, c, flags, node);
PAGE_SIZE << get_order(c->size),
flags, node);
} }
if (b && c->ctor) { if (b && c->ctor) {
@ -647,7 +649,7 @@ void *kmem_cache_alloc_lru(struct kmem_cache *cachep, struct list_lru *lru, gfp_
return slob_alloc_node(cachep, flags, NUMA_NO_NODE); return slob_alloc_node(cachep, flags, NUMA_NO_NODE);
} }
EXPORT_SYMBOL(kmem_cache_alloc_lru); EXPORT_SYMBOL(kmem_cache_alloc_lru);
#ifdef CONFIG_NUMA
void *__kmalloc_node(size_t size, gfp_t gfp, int node) void *__kmalloc_node(size_t size, gfp_t gfp, int node)
{ {
return __do_kmalloc_node(size, gfp, node, _RET_IP_); return __do_kmalloc_node(size, gfp, node, _RET_IP_);
@ -659,7 +661,6 @@ void *kmem_cache_alloc_node(struct kmem_cache *cachep, gfp_t gfp, int node)
return slob_alloc_node(cachep, gfp, node); return slob_alloc_node(cachep, gfp, node);
} }
EXPORT_SYMBOL(kmem_cache_alloc_node); EXPORT_SYMBOL(kmem_cache_alloc_node);
#endif
static void __kmem_cache_free(void *b, int size) static void __kmem_cache_free(void *b, int size)
{ {
@ -680,7 +681,7 @@ static void kmem_rcu_free(struct rcu_head *head)
void kmem_cache_free(struct kmem_cache *c, void *b) void kmem_cache_free(struct kmem_cache *c, void *b)
{ {
kmemleak_free_recursive(b, c->flags); kmemleak_free_recursive(b, c->flags);
trace_kmem_cache_free(_RET_IP_, b, c->name); trace_kmem_cache_free(_RET_IP_, b, c);
if (unlikely(c->flags & SLAB_TYPESAFE_BY_RCU)) { if (unlikely(c->flags & SLAB_TYPESAFE_BY_RCU)) {
struct slob_rcu *slob_rcu; struct slob_rcu *slob_rcu;
slob_rcu = b + (c->size - sizeof(struct slob_rcu)); slob_rcu = b + (c->size - sizeof(struct slob_rcu));

mm/slub.c (861 changed lines): diff not shown (too large to display)