sst-linux/drivers/soc
Saranya R 0a566a79ac soc: qcom: pdr: Fix the potential deadlock
commit 2eeb03ad9f42dfece63051be2400af487ddb96d2 upstream.

When a client process A calls pdr_add_lookup() to add a lookup for a
service, it schedules the locator work. Later, a process B receives a new
server packet indicating that the locator is up and calls
pdr_locator_new_server(), which eventually sets
pdr->locator_init_complete to true. Process A sees this, takes the list
lock and queries the domain list, but the query times out: the response
is queued to the same qmi->wq, which is an ordered workqueue, and process
B cannot complete the new server request work because it is blocked on
the list lock.

Fix it by removing the unnecessary list iteration: the locator work
already iterates the list, so avoid doing it here and just call
schedule_work().

       Process A                        Process B

                                     process_scheduled_works()
pdr_add_lookup()                      qmi_data_ready_work()
 process_scheduled_works()             pdr_locator_new_server()
                                         pdr->locator_init_complete=true;
   pdr_locator_work()
    mutex_lock(&pdr->list_lock);

     pdr_locate_service()                  mutex_lock(&pdr->list_lock);

      pdr_get_domain_list()
       pr_err("PDR: %s get domain list
               txn wait failed: %d\n",
               req->service_name,
               ret);

Timeout error log due to deadlock:

"
 PDR: tms/servreg get domain list txn wait failed: -110
 PDR: service lookup for msm/adsp/sensor_pd:tms/servreg failed: -110
"
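
The locking pattern described above can be sketched in userspace. This is
only an illustrative model, not the kernel code: struct pdr_model, both
new_server_*() helpers and the work_scheduled counter are hypothetical
stand-ins, and pthread_mutex_trylock() is used so that the blocking case
is reported instead of actually hanging.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the PDR handle (not the kernel struct). */
struct pdr_model {
	pthread_mutex_t lock;       /* protects locator_init_complete */
	pthread_mutex_t list_lock;  /* protects the list of lookups */
	bool locator_init_complete;
	int work_scheduled;         /* counts modelled schedule_work() calls */
};

#define PDR_MODEL_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER, false, 0 }

/*
 * Old shape: the new-server handler walks the lookup list and therefore
 * needs list_lock. Returns false where a real mutex_lock() would have
 * blocked because the locator work already holds list_lock.
 */
static bool new_server_old(struct pdr_model *pdr)
{
	pthread_mutex_lock(&pdr->lock);
	pdr->locator_init_complete = true;
	pthread_mutex_unlock(&pdr->lock);

	if (pthread_mutex_trylock(&pdr->list_lock) == EBUSY)
		return false;           /* would block: handler never finishes */

	pdr->work_scheduled++;          /* models the per-entry schedule_work() */
	pthread_mutex_unlock(&pdr->list_lock);
	return true;
}

/*
 * Fixed shape: no list iteration and no list_lock in the handler; it only
 * marks the locator as up and queues the locator work.
 */
static bool new_server_fixed(struct pdr_model *pdr)
{
	pthread_mutex_lock(&pdr->lock);
	pdr->locator_init_complete = true;
	pthread_mutex_unlock(&pdr->lock);

	pdr->work_scheduled++;          /* models schedule_work() */
	return true;
}
```

With list_lock held by the caller (playing the role of process A inside
the locator work), new_server_old() reports that it would block, while
new_server_fixed() completes and queues the work.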

Thanks to Bjorn and Johan for letting me know that this commit also fixes
an audio regression when using the in-kernel pd-mapper as that makes it
easier to hit this race. [1]

Link: https://lore.kernel.org/lkml/Zqet8iInnDhnxkT9@hovoldconsulting.com/ # [1]
Fixes: fbe639b44a ("soc: qcom: Introduce Protection Domain Restart helpers")
CC: stable@vger.kernel.org
Reviewed-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
Tested-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
Tested-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Saranya R <quic_sarar@quicinc.com>
Co-developed-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250212163720.1577876-1-mukesh.ojha@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-28 21:59:01 +01:00
actions
amlogic
apple
aspeed
atmel soc: atmel: fix device_node release in atmel_soc_device_init() 2025-02-21 13:49:09 +01:00
bcm
canaan
dove
fsl soc: fsl: rcpm: fix missing of_node_put() in copy_ippdexpcr1_setting() 2024-12-14 19:54:04 +01:00
fujitsu
gemini
imx soc: imx8m: Unregister cpufreq and soc dev in cleanup path 2025-03-28 21:58:59 +01:00
ixp4xx
lantiq
litex
mediatek soc: mediatek: mtk-devapc: Fix leaking IO map on driver remove 2025-03-07 16:56:32 +01:00
microchip
pxa
qcom soc: qcom: pdr: Fix the potential deadlock 2025-03-28 21:59:01 +01:00
renesas
rockchip
samsung
sifive
sunxi
tegra
ti pmdomain: ti-sci: Add missing of_node_put() for args.np 2024-12-14 19:53:22 +01:00
ux500
versatile
xilinx drivers: soc: xilinx: add the missing kfree in xlnx_add_cb_for_suspend() 2024-12-14 19:53:20 +01:00
Kconfig
Makefile