sst-linux/kernel
Ran Xiaokai 8f4d099504 tracing/osnoise: Fix possible recursive locking for cpus_read_lock()
commit 7e6b3fcc9c5294aeafed0dbe1a09a1bc899bd0f2 upstream.

Lockdep reports this deadlock log:

osnoise: could not start sampling thread
============================================
WARNING: possible recursive locking detected
--------------------------------------------
       CPU0
       ----
  lock(cpu_hotplug_lock);
  lock(cpu_hotplug_lock);

 Call Trace:
  <TASK>
  print_deadlock_bug+0x282/0x3c0
  __lock_acquire+0x1610/0x29a0
  lock_acquire+0xcb/0x2d0
  cpus_read_lock+0x49/0x120
  stop_per_cpu_kthreads+0x7/0x60
  start_kthread+0x103/0x120
  osnoise_hotplug_workfn+0x5e/0x90
  process_one_work+0x44f/0xb30
  worker_thread+0x33e/0x5e0
  kthread+0x206/0x3b0
  ret_from_fork+0x31/0x50
  ret_from_fork_asm+0x11/0x20
  </TASK>

This is the deadlock scenario:
osnoise_hotplug_workfn()
  guard(cpus_read_lock)();      // first lock call
  start_kthread(cpu)
    if (IS_ERR(kthread)) {
      stop_per_cpu_kthreads(); {
        cpus_read_lock();      // second lock call. Cause the AA deadlock
      }
    }

It is not necessary to call stop_per_cpu_kthreads() which stops osnoise
kthread for every other CPUs in the system if a failure occurs during
hotplug of a certain CPU.
For start_per_cpu_kthreads(), if the start_kthread() call fails,
this function calls stop_per_cpu_kthreads() to handle the error.
Therefore, similarly, there is no need to call stop_per_cpu_kthreads()
again within start_kthread().
So just remove stop_per_cpu_kthreads() from start_kthread to solve this issue.

Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/20250321095249.2739397-1-ranxiaokai627@163.com
Fixes: c8895e271f ("trace/osnoise: Support hotplug operations")
Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-10 14:33:43 +02:00
..
bpf bpf: skip non exist keys in generic_map_lookup_batch 2025-03-07 16:56:38 +01:00
cgroup cgroup: fix race between fork and cgroup.kill 2025-02-21 13:50:04 +01:00
configs
debug kdb: Do not assume write() callback available 2025-02-21 13:50:09 +01:00
dma dma-debug: fix a possible deadlock on radix_lock 2024-12-14 19:54:43 +01:00
entry entry: Respect changes to system call number by trace_sys_enter() 2024-04-03 15:19:44 +02:00
events perf/ring_buffer: Allow the EPOLLRDNORM flag for poll 2025-04-10 14:33:31 +02:00
futex futex: Don't include process MM in futex key on no-MMU 2023-11-20 11:51:50 +01:00
gcov gcov: add support for GCC 14 2024-06-27 13:46:22 +02:00
irq genirq: Make handle_enforce_irqctx() unconditionally available 2025-02-21 13:48:57 +01:00
kcsan kcsan: Turn report_filterlist_lock into a raw_spinlock 2024-12-14 19:54:38 +01:00
livepatch livepatch: Fix missing newline character in klp_resolve_symbols() 2023-11-20 11:52:10 +01:00
locking locking/semaphore: Use wake_q to wake up processes outside lock critical section 2025-04-10 14:33:39 +02:00
module module: Fix KCOV-ignored file name 2024-10-17 15:21:27 +02:00
power PM: hibernate: Add error handling for syscore_suspend() 2025-02-21 13:49:22 +01:00
printk printk: Fix signed integer overflow when defining LOG_BUF_LEN_MAX 2025-02-21 13:49:30 +01:00
rcu Revert "rcu-tasks: Fix access non-existent percpu rtpcp variable in rcu_tasks_need_gpcb()" 2025-01-02 10:30:55 +01:00
sched sched/deadline: Use online cpus for validating runtime 2025-04-10 14:33:39 +02:00
time hrtimers: Mark is_migration_base() with __always_inline 2025-03-28 21:58:50 +01:00
trace tracing/osnoise: Fix possible recursive locking for cpus_read_lock() 2025-04-10 14:33:43 +02:00
.gitignore .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
acct.c acct: block access to kernel internal filesystems 2025-03-07 16:56:39 +01:00
async.c async: Introduce async_schedule_dev_nocall() 2024-01-31 16:17:00 -08:00
audit_fsnotify.c audit: fix potential double free on error path from fsnotify_add_inode_mark 2022-08-22 18:50:06 -04:00
audit_tree.c
audit_watch.c audit: don't WARN_ON_ONCE(!current->mm) in audit_exe_compare() 2023-11-28 17:07:08 +00:00
audit.c audit: Send netlink ACK before setting connection in auditd_set 2024-02-05 20:12:47 +00:00
audit.h audit: remove selinux_audit_rule_update() declaration 2022-09-07 11:30:15 -04:00
auditfilter.c ima: Avoid blocking in RCU read-side critical section 2024-07-11 12:47:16 +02:00
auditsc.c audit,io_uring: io_uring openat triggers audit reference count underflow 2023-10-25 12:03:04 +02:00
backtracetest.c
bounds.c bounds: Use the right number of bits for power-of-two CONFIG_NR_CPUS 2024-05-02 16:29:32 +02:00
capability.c xfs: don't generate selinux audit messages for capability testing 2022-03-09 10:32:06 -08:00
cfi.c cfi: Switch to -fsanitize=kcfi 2022-09-26 10:13:13 -07:00
compat.c sched_getaffinity: don't assume 'cpumask_size()' is fully initialized 2023-04-06 12:10:40 +02:00
configs.c
context_tracking.c context_tracking: Fix noinstr vs KASAN 2023-03-10 09:33:45 +01:00
cpu_pm.c
cpu.c hrtimers: Handle CPU state correctly on hotplug 2025-01-23 17:17:15 +01:00
crash_core.c vmcoreinfo: add kallsyms_num_syms symbol 2022-08-28 14:02:44 -07:00
crash_dump.c crash_dump: Remove no longer used saved_max_pfn 2020-04-15 11:21:54 +02:00
cred.c cred: switch to using atomic_long_t 2023-12-20 17:00:20 +01:00
delayacct.c delayacct: support re-entrance detection of thrashing accounting 2022-09-26 19:46:07 -07:00
dma.c proc: introduce proc_create_single{,_data} 2018-05-16 07:23:35 +02:00
exec_domain.c
exit.c mm: optimize the redundant loop of mm_update_owner_next() 2024-07-11 12:47:13 +02:00
extable.c context_tracking: Take NMI eqs entrypoints over RCU 2022-07-05 13:32:59 -07:00
fail_function.c kernel/fail_function: fix memory leak with using debugfs_lookup() 2023-03-11 13:55:39 +01:00
fork.c posix-cpu-timers: Clear TICK_DEP_BIT_POSIX_TIMER on clone 2024-11-14 13:15:16 +01:00
freezer.c
gen_kheaders.sh kheaders: Ignore silly-rename files 2025-01-23 17:17:11 +01:00
groups.c security: Add LSM hook to setgroups() syscall 2022-07-15 18:21:49 +00:00
hung_task.c
iomem.c mm/nvdimm: add is_ioremap_addr and use that to check ioremap address 2019-07-12 11:05:40 -07:00
irq_work.c irq_work: use kasan_record_aux_stack_noalloc() record callstack 2022-04-15 14:49:55 -07:00
jump_label.c jump_label: Fix static_key_slow_dec() yet again 2024-10-17 15:21:29 +02:00
kallsyms_internal.h kallsyms: Reduce the memory occupied by kallsyms_seqs_of_names[] 2023-10-25 12:03:16 +02:00
kallsyms.c kallsyms: Add helper kallsyms_on_each_match_symbol() 2023-10-25 12:03:16 +02:00
kcmp.c Merge branch 'exec-update-lock-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-12-15 19:36:48 -08:00
Kconfig.freezer treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
Kconfig.hz treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
Kconfig.locks locking/rwlock: Provide RT variant 2021-08-17 17:50:51 +02:00
Kconfig.preempt
kcov.c kcov: mark in_softirq_really() as __always_inline 2025-01-09 13:30:05 +01:00
kexec_core.c kexec: fix a memory leak in crash_shrink_memory() 2023-07-19 16:21:08 +02:00
kexec_elf.c kexec: initialize ELF lowest address to ULONG_MAX 2025-04-10 14:33:36 +02:00
kexec_file.c kexec: support purgatories with .text.hot sections 2023-06-21 16:00:55 +02:00
kexec_internal.h
kexec.c kernel: kexec: copy user-array safely 2023-11-28 17:06:57 +00:00
kheaders.c kheaders: Use array declaration instead of char 2023-05-11 23:03:02 +09:00
kmod.c modules: add CONFIG_MODPROBE_PATH 2021-05-07 00:26:33 -07:00
kprobes.c kprobes: Fix to check symbol prefixes correctly 2024-08-14 13:52:54 +02:00
ksysfs.c
kthread.c kthread: unpark only parked kthread 2024-10-17 15:22:28 +02:00
latencytop.c latencytop: use the last element of latency_record of system 2022-09-11 21:55:12 -07:00
Makefile kernel/numa.c: Move logging out of numa.h 2024-06-12 11:03:16 +02:00
module_signature.c module: harden ELF info handling 2021-01-19 10:24:45 +01:00
notifier.c
nsproxy.c
numa.c kernel/numa.c: Move logging out of numa.h 2024-06-12 11:03:16 +02:00
padata.c padata: avoid UAF for reorder_work 2025-02-21 13:49:09 +01:00
panic.c panic: Flush kernel log buffer at the end 2024-04-13 13:04:54 +02:00
params.c
pid_namespace.c pid: Replace struct pid 1-element array with flex-array 2024-08-29 17:30:18 +02:00
pid.c pid: Replace struct pid 1-element array with flex-array 2024-08-29 17:30:18 +02:00
profile.c profiling: remove profile=sleep support 2024-08-14 13:52:50 +02:00
ptrace.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
range.c
reboot.c kernel/reboot: emergency_restart: Set correct system_state 2023-11-28 17:07:13 +00:00
regset.c
relay.c relayfs: fix out-of-bounds access in relay_file_read 2023-05-11 23:03:03 +09:00
resource_kunit.c
resource.c resource: fix region_intersects() vs add_memory_driver_managed() 2024-10-17 15:21:55 +02:00
rseq.c rseq: Use pr_warn_once() when deprecated/unknown ABI flags are encountered 2022-11-14 09:58:32 +01:00
scftorture.c scftorture: Forgive memory-allocation failure if KASAN 2023-09-23 11:11:00 +02:00
scs.c kasan, vmalloc: only tag normal vmalloc allocations 2022-03-24 19:06:48 -07:00
seccomp.c seccomp: Add wait_killable semantic to seccomp user notifier 2022-05-03 14:11:58 -07:00
signal.c signal: restore the override_rlimit logic 2024-11-14 13:15:18 +01:00
smp.c smp: Add missing destroy_work_on_stack() call in smp_call_on_cpu() 2024-09-12 11:10:24 +02:00
smpboot.c smpboot: use atomic_try_cmpxchg in cpu_wait_death and cpu_report_death 2022-09-11 21:55:10 -07:00
smpboot.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
softirq.c softirq: Allow raising SCHED_SOFTIRQ from SMP-call-function on RT kernel 2025-02-01 18:30:06 +01:00
stackleak.c stackleak: add on/off stack variants 2022-05-08 01:33:09 -07:00
stacktrace.c
static_call_inline.c x86/static-call: provide a way to do very early static-call updates 2024-12-19 18:08:58 +01:00
static_call.c
stop_machine.c
sys_ni.c syscalls: fix compat_sys_io_pgetevents_time64 usage 2024-07-05 09:31:59 +02:00
sys.c hrtimer: Use and report correct timerslack values for realtime tasks 2025-03-28 21:58:48 +01:00
sysctl-test.c
sysctl.c
task_work.c task_work: Introduce task_work_cancel() again 2024-08-03 08:49:34 +02:00
taskstats.c
torture.c torture: Fix hang during kthread shutdown phase 2023-03-10 09:34:07 +01:00
tracepoint.c tracepoint: Optimize the critical region of mutex_lock in tracepoint_module_coming() 2022-09-26 13:01:18 -04:00
tsacct.c
ucount.c ucounts: fix counter leak in inc_rlimit_get_ucounts() 2024-11-14 13:15:19 +01:00
uid16.c fs: add do_fchownat(), ksys_fchown() helpers and ksys_{,l}chown() wrappers 2018-04-02 20:15:59 +02:00
uid16.h
umh.c freezer,umh: Fix call_usermode_helper_exec() vs SIGKILL 2023-02-22 12:59:50 +01:00
up.c
user_namespace.c ucounts: Split rlimit and ucount values and max values 2022-10-09 16:24:05 -07:00
user-return-notifier.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
user.c
usermode_driver.c blob_to_mnt(): kern_unmount() is needed to undo kern_mount() 2022-05-19 23:25:47 -04:00
utsname_sysctl.c
utsname.c uts: Use generic ns_common::count 2020-08-19 14:13:20 +02:00
watch_queue.c watch_queue: fix pipe accounting mismatch 2025-04-10 14:33:30 +02:00
watchdog_hld.c watchdog/perf: properly initialize the turbo mode timestamp and rearm counter 2024-08-03 08:49:42 +02:00
watchdog.c watchdog: move softlockup_panic back to early_param 2023-11-28 17:07:09 +00:00
workqueue_internal.h
workqueue.c workqueue: Improve scalability of workqueue watchdog touch 2024-09-12 11:10:27 +02:00