From 4301eb1c3ea67c16f07f56789cdbf56a87b9e950 Mon Sep 17 00:00:00 2001 From: Jonathan Maple Date: Thu, 18 Dec 2025 16:44:45 -0500 Subject: [PATCH 01/16] net_sched: hfsc: Fix a UAF vulnerability in class handling jira VULN-67699 cve CVE-2025-37797 commit-author Cong Wang commit 3df275ef0a6ae181e8428a6589ef5d5231e58b5c This patch fixes a Use-After-Free vulnerability in the HFSC qdisc class handling. The issue occurs due to a time-of-check/time-of-use condition in hfsc_change_class() when working with certain child qdiscs like netem or codel. The vulnerability works as follows: 1. hfsc_change_class() checks if a class has packets (q.qlen != 0) 2. It then calls qdisc_peek_len(), which for certain qdiscs (e.g., codel, netem) might drop packets and empty the queue 3. The code continues assuming the queue is still non-empty, adding the class to vttree 4. This breaks HFSC scheduler assumptions that only non-empty classes are in vttree 5. Later, when the class is destroyed, this can lead to a Use-After-Free The fix adds a second queue length check after qdisc_peek_len() to verify the queue wasn't emptied. Fixes: 21f4d5cc25ec ("net_sched/hfsc: fix curve activation in hfsc_change_class()") Reported-by: Gerrard Tai Reviewed-by: Konstantin Khlebnikov Signed-off-by: Cong Wang Reviewed-by: Jamal Hadi Salim Link: https://patch.msgid.link/20250417184732.943057-2-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski (cherry picked from commit 3df275ef0a6ae181e8428a6589ef5d5231e58b5c) Signed-off-by: Jonathan Maple --- net/sched/sch_hfsc.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c index 1dd1473f78376..b31e9c1df9d8b 100644 --- a/net/sched/sch_hfsc.c +++ b/net/sched/sch_hfsc.c @@ -964,6 +964,7 @@ hfsc_change_class(struct Qdisc *sch, u32 classid, u32 parentid, if (cl != NULL) { int old_flags; + int len = 0; if (parentid) { if (cl->cl_parent && @@ -994,9 +995,13 @@ hfsc_change_class(struct Qdisc *sch, u32 classid, u32 parentid, if (usc != NULL) hfsc_change_usc(cl, usc, cur_time); + if (cl->qdisc->q.qlen != 0) + len = qdisc_peek_len(cl->qdisc); + /* Check queue length again since some qdisc implementations + * (e.g., netem/codel) might empty the queue during the peek + * operation. + */ if (cl->qdisc->q.qlen != 0) { - int len = qdisc_peek_len(cl->qdisc); - if (cl->cl_flags & HFSC_RSC) { if (old_flags & HFSC_RSC) update_ed(cl, len); From 6740c30b9472490c013d7fbf2f43f9e1c4b826b5 Mon Sep 17 00:00:00 2001 From: Jonathan Maple Date: Thu, 18 Dec 2025 16:44:49 -0500 Subject: [PATCH 02/16] net_sched: ets: Fix double list add in class with netem as child qdisc jira VULN-73372 cve CVE-2025-37914 commit-author Victor Nogueira commit 1a6d0c00fa07972384b0c308c72db091d49988b6 As described in Gerrard's report [1], there are use cases where a netem child qdisc will make the parent qdisc's enqueue callback reentrant. In the case of ets, there won't be a UAF, but the code will add the same classifier to the list twice, which will cause memory corruption. In addition to checking for qlen being zero, this patch checks whether the class was already added to the active_list (cl_is_active) before doing the addition to cater for the reentrant case. [1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/ Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs") Acked-by: Jamal Hadi Salim Signed-off-by: Victor Nogueira Link: https://patch.msgid.link/20250425220710.3964791-4-victor@mojatatu.com Signed-off-by: Jakub Kicinski (cherry picked from commit 1a6d0c00fa07972384b0c308c72db091d49988b6) Signed-off-by: Jonathan Maple --- net/sched/sch_ets.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/net/sched/sch_ets.c b/net/sched/sch_ets.c index d733934935533..8b30a09f359ff 100644 --- a/net/sched/sch_ets.c +++ b/net/sched/sch_ets.c @@ -74,6 +74,11 @@ static const struct nla_policy ets_class_policy[TCA_ETS_MAX + 1] = { [TCA_ETS_QUANTA_BAND] = { .type = NLA_U32 }, }; +static bool cl_is_active(struct ets_class *cl) +{ + return !list_empty(&cl->alist); +} + static int ets_quantum_parse(struct Qdisc *sch, const struct nlattr *attr, unsigned int *quantum, struct netlink_ext_ack *extack) @@ -421,7 +426,6 @@ static int ets_qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct ets_sched *q = qdisc_priv(sch); struct ets_class *cl; int err = 0; - bool first; cl = ets_classify(skb, sch, &err); if (!cl) { @@ -431,7 +435,6 @@ static int ets_qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch, return err; } - first = !cl->qdisc->q.qlen; err = qdisc_enqueue(skb, cl->qdisc, to_free); if (unlikely(err != NET_XMIT_SUCCESS)) { if (net_xmit_drop_count(err)) { @@ -441,7 +444,7 @@ static int ets_qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch, return err; } - if (first && !ets_class_is_strict(q, cl)) { + if (!cl_is_active(cl) && !ets_class_is_strict(q, cl)) { list_add_tail(&cl->alist, &q->active); cl->deficit = cl->quantum; } From 12abecbe379b170f75003d34ebac6b9e95462ddc Mon Sep 17 00:00:00 2001 From: Jonathan Maple Date: Thu, 18 Dec 2025 16:45:00 -0500 Subject: [PATCH 03/16] net: ch9200: fix uninitialised access during mii_nway_restart jira VULN-71593 cve CVE-2025-38086 commit-author Qasim Ijaz commit 9ad0452c0277b816a435433cca601304cfac7c21 In mii_nway_restart() the code attempts to call mii->mdio_read which is ch9200_mdio_read(). ch9200_mdio_read() utilises a local buffer called "buff", which is initialised with control_read(). However "buff" is conditionally initialised inside control_read(): if (err == size) { memcpy(data, buf, size); } If the condition of "err == size" is not met, then "buff" remains uninitialised. Once this happens the uninitialised "buff" is accessed and returned during ch9200_mdio_read(): return (buff[0] | buff[1] << 8); The problem stems from the fact that ch9200_mdio_read() ignores the return value of control_read(), leading to uinit-access of "buff". To fix this we should check the return value of control_read() and return early on error. Reported-by: syzbot Closes: https://syzkaller.appspot.com/bug?extid=3361c2d6f78a3e0892f9 Tested-by: syzbot Fixes: 4a476bd6d1d9 ("usbnet: New driver for QinHeng CH9200 devices") Cc: stable@vger.kernel.org Signed-off-by: Qasim Ijaz Link: https://patch.msgid.link/20250526183607.66527-1-qasdev00@gmail.com Signed-off-by: Jakub Kicinski (cherry picked from commit 9ad0452c0277b816a435433cca601304cfac7c21) Signed-off-by: Jonathan Maple --- drivers/net/usb/ch9200.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/net/usb/ch9200.c b/drivers/net/usb/ch9200.c index d7f3b70d54775..421853410fc27 100644 --- a/drivers/net/usb/ch9200.c +++ b/drivers/net/usb/ch9200.c @@ -178,6 +178,7 @@ static int ch9200_mdio_read(struct net_device *netdev, int phy_id, int loc) { struct usbnet *dev = netdev_priv(netdev); unsigned char buff[2]; + int ret; netdev_dbg(netdev, "%s phy_id:%02x loc:%02x\n", __func__, phy_id, loc); @@ -185,8 +186,10 @@ static int ch9200_mdio_read(struct net_device *netdev, int phy_id, int loc) if (phy_id != 0) return -ENODEV; - control_read(dev, REQUEST_READ, 0, loc * 2, buff, 0x02, - CONTROL_TIMEOUT_MS); + ret = control_read(dev, REQUEST_READ, 0, loc * 2, buff, 0x02, + CONTROL_TIMEOUT_MS); + if (ret < 0) + return ret; return (buff[0] | buff[1] << 8); } From c788266a1a41ea093975f2047683d2f14a777998 Mon Sep 17 00:00:00 2001 From: Jonathan Maple Date: Thu, 18 Dec 2025 16:44:55 -0500 Subject: [PATCH 04/16] crypto: algif_hash - fix double free in hash_accept jira VULN-70978 cve CVE-2025-38079 commit-author Ivan Pravdin commit b2df03ed4052e97126267e8c13ad4204ea6ba9b6 If accept(2) is called on socket type algif_hash with MSG_MORE flag set and crypto_ahash_import fails, sk2 is freed. However, it is also freed in af_alg_release, leading to slab-use-after-free error. Fixes: fe869cdb89c9 ("crypto: algif_hash - User-space interface for hash operations") Cc: Signed-off-by: Ivan Pravdin Signed-off-by: Herbert Xu (cherry picked from commit b2df03ed4052e97126267e8c13ad4204ea6ba9b6) Signed-off-by: Jonathan Maple --- crypto/algif_hash.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c index 765a6ba281ac8..41c6c67c97243 100644 --- a/crypto/algif_hash.c +++ b/crypto/algif_hash.c @@ -268,10 +268,6 @@ static int hash_accept(struct socket *sock, struct socket *newsock, int flags, return err; err = crypto_ahash_import(&ctx2->req, state); - if (err) { - sock_orphan(sk2); - sock_put(sk2); - } return err; } From 98961539436d69e50036b75e2a19caaa4b6f628c Mon Sep 17 00:00:00 2001 From: Jonathan Maple Date: Thu, 18 Dec 2025 16:45:03 -0500 Subject: [PATCH 05/16] wifi: rtw88: fix the 'para' buffer size to avoid reading out of bounds jira VULN-71887 cve CVE-2025-38159 commit-author Alexey Kodanev commit 4c2c372de2e108319236203cce6de44d70ae15cd Set the size to 6 instead of 2, since 'para' array is passed to 'rtw_fw_bt_wifi_control(rtwdev, para[0], ¶[1])', which reads 5 bytes: void rtw_fw_bt_wifi_control(struct rtw_dev *rtwdev, u8 op_code, u8 *data) { ... SET_BT_WIFI_CONTROL_DATA1(h2c_pkt, *data); SET_BT_WIFI_CONTROL_DATA2(h2c_pkt, *(data + 1)); ... SET_BT_WIFI_CONTROL_DATA5(h2c_pkt, *(data + 4)); Detected using the static analysis tool - Svace. Fixes: 4136214f7c46 ("rtw88: add BT co-existence support") Signed-off-by: Alexey Kodanev Signed-off-by: Ping-Ke Shih Link: https://patch.msgid.link/20250513121304.124141-1-aleksei.kodanev@bell-sw.com (cherry picked from commit 4c2c372de2e108319236203cce6de44d70ae15cd) Signed-off-by: Jonathan Maple --- drivers/net/wireless/realtek/rtw88/coex.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/wireless/realtek/rtw88/coex.c b/drivers/net/wireless/realtek/rtw88/coex.c index cac053f485c3b..9dc7137aa2bfd 100644 --- a/drivers/net/wireless/realtek/rtw88/coex.c +++ b/drivers/net/wireless/realtek/rtw88/coex.c @@ -309,7 +309,7 @@ static void rtw_coex_tdma_timer_base(struct rtw_dev *rtwdev, u8 type) { struct rtw_coex *coex = &rtwdev->coex; struct rtw_coex_stat *coex_stat = &coex->stat; - u8 para[2] = {0}; + u8 para[6] = {}; u8 times; u16 tbtt_interval = coex_stat->wl_beacon_interval; From 4801881eb4c3417b30770883ee31196e29e29e5a Mon Sep 17 00:00:00 2001 From: Jonathan Maple Date: Thu, 18 Dec 2025 16:45:07 -0500 Subject: [PATCH 06/16] sch_hfsc: make hfsc_qlen_notify() idempotent jira VULN-71948 cve CVE-2025-38177 commit-author Cong Wang commit 51eb3b65544c9efd6a1026889ee5fb5aa62da3bb hfsc_qlen_notify() is not idempotent either and not friendly to its callers, like fq_codel_dequeue(). Let's make it idempotent to ease qdisc_tree_reduce_backlog() callers' life: 1. update_vf() decreases cl->cl_nactive, so we can check whether it is non-zero before calling it. 2. eltree_remove() always removes RB node cl->el_node, but we can use RB_EMPTY_NODE() + RB_CLEAR_NODE() to make it safe. Reported-by: Gerrard Tai Signed-off-by: Cong Wang Reviewed-by: Simon Horman Link: https://patch.msgid.link/20250403211033.166059-4-xiyou.wangcong@gmail.com Acked-by: Jamal Hadi Salim Signed-off-by: Paolo Abeni (cherry picked from commit 51eb3b65544c9efd6a1026889ee5fb5aa62da3bb) Signed-off-by: Jonathan Maple --- net/sched/sch_hfsc.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c index b31e9c1df9d8b..f61a8cec81595 100644 --- a/net/sched/sch_hfsc.c +++ b/net/sched/sch_hfsc.c @@ -209,7 +209,10 @@ eltree_insert(struct hfsc_class *cl) static inline void eltree_remove(struct hfsc_class *cl) { - rb_erase(&cl->el_node, &cl->sched->eligible); + if (!RB_EMPTY_NODE(&cl->el_node)) { + rb_erase(&cl->el_node, &cl->sched->eligible); + RB_CLEAR_NODE(&cl->el_node); + } } static inline void @@ -1229,7 +1232,8 @@ hfsc_qlen_notify(struct Qdisc *sch, unsigned long arg) /* vttree is now handled in update_vf() so that update_vf(cl, 0, 0) * needs to be called explicitly to remove a class from vttree. */ - update_vf(cl, 0, 0); + if (cl->cl_nactive) + update_vf(cl, 0, 0); if (cl->cl_flags & HFSC_RSC) eltree_remove(cl); } From 0732e6efc15ac399939e2cf3a10264ea60dd044b Mon Sep 17 00:00:00 2001 From: Jonathan Maple Date: Thu, 18 Dec 2025 16:45:10 -0500 Subject: [PATCH 07/16] i40e: fix MMIO write access to an invalid page in i40e_clear_hw jira VULN-72063 cve CVE-2025-38200 commit-author Kyungwook Boo commit 015bac5daca978448f2671478c553ce1f300c21e When the device sends a specific input, an integer underflow can occur, leading to MMIO write access to an invalid page. Prevent the integer underflow by changing the type of related variables. Signed-off-by: Kyungwook Boo Link: https://lore.kernel.org/lkml/ffc91764-1142-4ba2-91b6-8c773f6f7095@gmail.com/T/ Reviewed-by: Przemek Kitszel Reviewed-by: Simon Horman Reviewed-by: Aleksandr Loktionov Tested-by: Rinitha S (A Contingent worker at Intel) Signed-off-by: Tony Nguyen (cherry picked from commit 015bac5daca978448f2671478c553ce1f300c21e) Signed-off-by: Jonathan Maple --- drivers/net/ethernet/intel/i40e/i40e_common.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c index 2819e261a126b..0e30c70adaab2 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_common.c +++ b/drivers/net/ethernet/intel/i40e/i40e_common.c @@ -1209,10 +1209,11 @@ i40e_status i40e_pf_reset(struct i40e_hw *hw) void i40e_clear_hw(struct i40e_hw *hw) { u32 num_queues, base_queue; - u32 num_pf_int; - u32 num_vf_int; + s32 num_pf_int; + s32 num_vf_int; u32 num_vfs; - u32 i, j; + s32 i; + u32 j; u32 val; u32 eol = 0x7ff; From ab1ad02ba463f453aa5d26141fc79ec617e27131 Mon Sep 17 00:00:00 2001 From: Jonathan Maple Date: Thu, 18 Dec 2025 16:45:15 -0500 Subject: [PATCH 08/16] scsi: lpfc: Use memcpy() for BIOS version jira VULN-72457 cve CVE-2025-38332 commit-author Daniel Wagner commit ae82eaf4aeea060bb736c3e20c0568b67c701d7d The strlcat() with FORTIFY support is triggering a panic because it thinks the target buffer will overflow although the correct target buffer size is passed in. Anyway, instead of memset() with 0 followed by a strlcat(), just use memcpy() and ensure that the resulting buffer is NULL terminated. BIOSVersion is only used for the lpfc_printf_log() which expects a properly terminated string. Signed-off-by: Daniel Wagner Link: https://lore.kernel.org/r/20250409-fix-lpfc-bios-str-v1-1-05dac9e51e13@kernel.org Reviewed-by: Justin Tee Signed-off-by: Martin K. Petersen (cherry picked from commit ae82eaf4aeea060bb736c3e20c0568b67c701d7d) Signed-off-by: Jonathan Maple --- drivers/scsi/lpfc/lpfc_sli.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c index 35ac2c240dc16..fe43bfe117eb6 100644 --- a/drivers/scsi/lpfc/lpfc_sli.c +++ b/drivers/scsi/lpfc/lpfc_sli.c @@ -5946,9 +5946,9 @@ lpfc_sli4_get_ctl_attr(struct lpfc_hba *phba) phba->sli4_hba.flash_id = bf_get(lpfc_cntl_attr_flash_id, cntl_attr); phba->sli4_hba.asic_rev = bf_get(lpfc_cntl_attr_asic_rev, cntl_attr); - memset(phba->BIOSVersion, 0, sizeof(phba->BIOSVersion)); - strlcat(phba->BIOSVersion, (char *)cntl_attr->bios_ver_str, + memcpy(phba->BIOSVersion, cntl_attr->bios_ver_str, sizeof(phba->BIOSVersion)); + phba->BIOSVersion[sizeof(phba->BIOSVersion) - 1] = '\0'; lpfc_printf_log(phba, KERN_INFO, LOG_SLI, "3086 lnk_type:%d, lnk_numb:%d, bios_ver:%s, " From 6af506d230799e84338f283f135b6ba123c426d1 Mon Sep 17 00:00:00 2001 From: Jonathan Maple Date: Thu, 18 Dec 2025 16:45:20 -0500 Subject: [PATCH 09/16] vsock: Fix transport_* TOCTOU jira VULN-80683 cve CVE-2025-38461 commit-author Michal Luczaj commit 687aa0c5581b8d4aa87fd92973e4ee576b550cdf Transport assignment may race with module unload. Protect new_transport from becoming a stale pointer. This also takes care of an insecure call in vsock_use_local_transport(); add a lockdep assert. BUG: unable to handle page fault for address: fffffbfff8056000 Oops: Oops: 0000 [#1] SMP KASAN RIP: 0010:vsock_assign_transport+0x366/0x600 Call Trace: vsock_connect+0x59c/0xc40 __sys_connect+0xe8/0x100 __x64_sys_connect+0x6e/0xc0 do_syscall_64+0x92/0x1c0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") Reviewed-by: Stefano Garzarella Signed-off-by: Michal Luczaj Link: https://patch.msgid.link/20250703-vsock-transports-toctou-v4-2-98f0eb530747@rbox.co Signed-off-by: Jakub Kicinski (cherry picked from commit 687aa0c5581b8d4aa87fd92973e4ee576b550cdf) Signed-off-by: Jonathan Maple --- net/vmw_vsock/af_vsock.c | 28 +++++++++++++++++++++++----- 1 file changed, 23 insertions(+), 5 deletions(-) diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index ab1f9e3d5c352..53757fdc7de2c 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -398,6 +398,8 @@ EXPORT_SYMBOL_GPL(vsock_enqueue_accept); static bool vsock_use_local_transport(unsigned int remote_cid) { + lockdep_assert_held(&vsock_register_mutex); + if (!transport_local) return false; @@ -455,6 +457,8 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk) remote_flags = vsk->remote_addr.svm_flags; + mutex_lock(&vsock_register_mutex); + switch (sk->sk_type) { case SOCK_DGRAM: new_transport = transport_dgram; @@ -469,12 +473,15 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk) new_transport = transport_h2g; break; default: - return -ESOCKTNOSUPPORT; + ret = -ESOCKTNOSUPPORT; + goto err; } if (vsk->transport) { - if (vsk->transport == new_transport) - return 0; + if (vsk->transport == new_transport) { + ret = 0; + goto err; + } /* transport->release() must be called with sock lock acquired. * This path can only be taken during vsock_stream_connect(), @@ -489,8 +496,16 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk) /* We increase the module refcnt to prevent the transport unloading * while there are open sockets assigned to it. */ - if (!new_transport || !try_module_get(new_transport->module)) - return -ENODEV; + if (!new_transport || !try_module_get(new_transport->module)) { + ret = -ENODEV; + goto err; + } + + /* It's safe to release the mutex after a successful try_module_get(). + * Whichever transport `new_transport` points at, it won't go away until + * the last module_put() below or in vsock_deassign_transport(). + */ + mutex_unlock(&vsock_register_mutex); ret = new_transport->init(vsk, psk); if (ret) { @@ -501,6 +516,9 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk) vsk->transport = new_transport; return 0; +err: + mutex_unlock(&vsock_register_mutex); + return ret; } EXPORT_SYMBOL_GPL(vsock_assign_transport); From d073069bc50634a968d6fc9840f91aec5f89e446 Mon Sep 17 00:00:00 2001 From: Jonathan Maple Date: Thu, 18 Dec 2025 16:45:23 -0500 Subject: [PATCH 10/16] tipc: Fix use-after-free in tipc_conn_close(). jira VULN-80317 cve CVE-2025-38464 commit-author Kuniyuki Iwashima commit 667eeab4999e981c96b447a4df5f20bdf5c26f13 syzbot reported a null-ptr-deref in tipc_conn_close() during netns dismantle. [0] tipc_topsrv_stop() iterates tipc_net(net)->topsrv->conn_idr and calls tipc_conn_close() for each tipc_conn. The problem is that tipc_conn_close() is called after releasing the IDR lock. At the same time, there might be tipc_conn_recv_work() running and it could call tipc_conn_close() for the same tipc_conn and release its last ->kref. Once we release the IDR lock in tipc_topsrv_stop(), there is no guarantee that the tipc_conn is alive. Let's hold the ref before releasing the lock and put the ref after tipc_conn_close() in tipc_topsrv_stop(). [0]: BUG: KASAN: use-after-free in tipc_conn_close+0x122/0x140 net/tipc/topsrv.c:165 Read of size 8 at addr ffff888099305a08 by task kworker/u4:3/435 CPU: 0 PID: 435 Comm: kworker/u4:3 Not tainted 4.19.204-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1fc/0x2ef lib/dump_stack.c:118 print_address_description.cold+0x54/0x219 mm/kasan/report.c:256 kasan_report_error.cold+0x8a/0x1b9 mm/kasan/report.c:354 kasan_report mm/kasan/report.c:412 [inline] __asan_report_load8_noabort+0x88/0x90 mm/kasan/report.c:433 tipc_conn_close+0x122/0x140 net/tipc/topsrv.c:165 tipc_topsrv_stop net/tipc/topsrv.c:701 [inline] tipc_topsrv_exit_net+0x27b/0x5c0 net/tipc/topsrv.c:722 ops_exit_list+0xa5/0x150 net/core/net_namespace.c:153 cleanup_net+0x3b4/0x8b0 net/core/net_namespace.c:553 process_one_work+0x864/0x1570 kernel/workqueue.c:2153 worker_thread+0x64c/0x1130 kernel/workqueue.c:2296 kthread+0x33f/0x460 kernel/kthread.c:259 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415 Allocated by task 23: kmem_cache_alloc_trace+0x12f/0x380 mm/slab.c:3625 kmalloc include/linux/slab.h:515 [inline] kzalloc include/linux/slab.h:709 [inline] tipc_conn_alloc+0x43/0x4f0 net/tipc/topsrv.c:192 tipc_topsrv_accept+0x1b5/0x280 net/tipc/topsrv.c:470 process_one_work+0x864/0x1570 kernel/workqueue.c:2153 worker_thread+0x64c/0x1130 kernel/workqueue.c:2296 kthread+0x33f/0x460 kernel/kthread.c:259 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415 Freed by task 23: __cache_free mm/slab.c:3503 [inline] kfree+0xcc/0x210 mm/slab.c:3822 tipc_conn_kref_release net/tipc/topsrv.c:150 [inline] kref_put include/linux/kref.h:70 [inline] conn_put+0x2cd/0x3a0 net/tipc/topsrv.c:155 process_one_work+0x864/0x1570 kernel/workqueue.c:2153 worker_thread+0x64c/0x1130 kernel/workqueue.c:2296 kthread+0x33f/0x460 kernel/kthread.c:259 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415 The buggy address belongs to the object at ffff888099305a00 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 8 bytes inside of 512-byte region [ffff888099305a00, ffff888099305c00) The buggy address belongs to the page: page:ffffea000264c140 count:1 mapcount:0 mapping:ffff88813bff0940 index:0x0 flags: 0xfff00000000100(slab) raw: 00fff00000000100 ffffea00028b6b88 ffffea0002cd2b08 ffff88813bff0940 raw: 0000000000000000 ffff888099305000 0000000100000006 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888099305900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888099305980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff888099305a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888099305a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888099305b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Fixes: c5fa7b3cf3cb ("tipc: introduce new TIPC server infrastructure") Reported-by: syzbot+d333febcf8f4bc5f6110@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=27169a847a70550d17be Signed-off-by: Kuniyuki Iwashima Reviewed-by: Tung Nguyen Link: https://patch.msgid.link/20250702014350.692213-1-kuniyu@google.com Signed-off-by: Jakub Kicinski (cherry picked from commit 667eeab4999e981c96b447a4df5f20bdf5c26f13) Signed-off-by: Jonathan Maple --- net/tipc/topsrv.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/tipc/topsrv.c b/net/tipc/topsrv.c index 2cb7c0ec2d2e4..98a23154c7a6d 100644 --- a/net/tipc/topsrv.c +++ b/net/tipc/topsrv.c @@ -691,8 +691,10 @@ static void tipc_topsrv_stop(struct net *net) for (id = 0; srv->idr_in_use; id++) { con = idr_find(&srv->conn_idr, id); if (con) { + conn_get(con); spin_unlock_bh(&srv->idr_lock); tipc_conn_close(con); + conn_put(con); spin_lock_bh(&srv->idr_lock); } } From 9a6cd3855edd698756e2b25ae609104b30830670 Mon Sep 17 00:00:00 2001 From: Jonathan Maple Date: Thu, 18 Dec 2025 16:45:32 -0500 Subject: [PATCH 11/16] do_change_type(): refuse to operate on unmounted/not ours mounts jira VULN-98608 cve CVE-2025-38498 commit-author Al Viro commit 12f147ddd6de7382dad54812e65f3f08d05809fc Ensure that propagation settings can only be changed for mounts located in the caller's mount namespace. This change aligns permission checking with the rest of mount(2). Reviewed-by: Christian Brauner Fixes: 07b20889e305 ("beginning of the shared-subtree proper") Reported-by: "Orlando, Noah" Signed-off-by: Al Viro (cherry picked from commit 12f147ddd6de7382dad54812e65f3f08d05809fc) Signed-off-by: Jonathan Maple --- fs/namespace.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/namespace.c b/fs/namespace.c index 35fa33e108c87..793d39bd7edac 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2290,6 +2290,10 @@ static int do_change_type(struct path *path, int ms_flags) return -EINVAL; namespace_lock(); + if (!check_mnt(mnt)) { + err = -EINVAL; + goto out_unlock; + } if (type == MS_SHARED) { err = invent_group_ids(mnt, recurse); if (err) From 25b5b1dce9b72f88008952eb8fff6cf1db657bb9 Mon Sep 17 00:00:00 2001 From: Jonathan Maple Date: Wed, 31 Dec 2025 13:09:03 -0500 Subject: [PATCH 12/16] move_mount: allow to add a mount into an existing group jira VULN-98608 cve-bf CVE-2025-38498 commit-author Pavel Tikhomirov commit 9ffb14ef61bab83fa818736bf3e7e6b6e182e8e2 Previously a sharing group (shared and master ids pair) can be only inherited when mount is created via bindmount. This patch adds an ability to add an existing private mount into an existing sharing group. With this functionality one can first create the desired mount tree from only private mounts (without the need to care about undesired mount propagation or mount creation order implied by sharing group dependencies), and next then setup any desired mount sharing between those mounts in tree as needed. This allows CRIU to restore any set of mount namespaces, mount trees and sharing group trees for a container. We have many issues with restoring mounts in CRIU related to sharing groups and propagation: - reverse sharing groups vs mount tree order requires complex mounts reordering which mostly implies also using some temporary mounts (please see https://lkml.org/lkml/2021/3/23/569 for more info) - mount() syscall creates tons of mounts due to propagation - mount re-parenting due to propagation - "Mount Trap" due to propagation - "Non Uniform" propagation, meaning that with different tricks with mount order and temporary children-"lock" mounts one can create mount trees which can't be restored without those tricks (see https://www.linuxplumbersconf.org/event/7/contributions/640/) With this new functionality we can resolve all the problems with propagation at once. Link: https://lore.kernel.org/r/20210715100714.120228-1-ptikhomirov@virtuozzo.com Cc: Eric W. Biederman Cc: Alexander Viro Cc: Christian Brauner Cc: Mattias Nissler Cc: Aleksa Sarai Cc: Andrei Vagin Cc: linux-fsdevel@vger.kernel.org Cc: linux-api@vger.kernel.org Cc: lkml Co-developed-by: Andrei Vagin Acked-by: Christian Brauner Signed-off-by: Pavel Tikhomirov Signed-off-by: Andrei Vagin Signed-off-by: Christian Brauner (cherry picked from commit 9ffb14ef61bab83fa818736bf3e7e6b6e182e8e2) Signed-off-by: Jonathan Maple --- fs/namespace.c | 77 +++++++++++++++++++++++++++++++++++++- include/uapi/linux/mount.h | 3 +- 2 files changed, 78 insertions(+), 2 deletions(-) diff --git a/fs/namespace.c b/fs/namespace.c index 793d39bd7edac..b0f8baa9426c4 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2634,6 +2634,78 @@ static bool check_for_nsfs_mounts(struct mount *subtree) return ret; } +static int do_set_group(struct path *from_path, struct path *to_path) +{ + struct mount *from, *to; + int err; + + from = real_mount(from_path->mnt); + to = real_mount(to_path->mnt); + + namespace_lock(); + + err = -EINVAL; + /* To and From must be mounted */ + if (!is_mounted(&from->mnt)) + goto out; + if (!is_mounted(&to->mnt)) + goto out; + + err = -EPERM; + /* We should be allowed to modify mount namespaces of both mounts */ + if (!ns_capable(from->mnt_ns->user_ns, CAP_SYS_ADMIN)) + goto out; + if (!ns_capable(to->mnt_ns->user_ns, CAP_SYS_ADMIN)) + goto out; + + err = -EINVAL; + /* To and From paths should be mount roots */ + if (from_path->dentry != from_path->mnt->mnt_root) + goto out; + if (to_path->dentry != to_path->mnt->mnt_root) + goto out; + + /* Setting sharing groups is only allowed across same superblock */ + if (from->mnt.mnt_sb != to->mnt.mnt_sb) + goto out; + + /* From mount root should be wider than To mount root */ + if (!is_subdir(to->mnt.mnt_root, from->mnt.mnt_root)) + goto out; + + /* From mount should not have locked children in place of To's root */ + if (has_locked_children(from, to->mnt.mnt_root)) + goto out; + + /* Setting sharing groups is only allowed on private mounts */ + if (IS_MNT_SHARED(to) || IS_MNT_SLAVE(to)) + goto out; + + /* From should not be private */ + if (!IS_MNT_SHARED(from) && !IS_MNT_SLAVE(from)) + goto out; + + if (IS_MNT_SLAVE(from)) { + struct mount *m = from->mnt_master; + + list_add(&to->mnt_slave, &m->mnt_slave_list); + to->mnt_master = m; + } + + if (IS_MNT_SHARED(from)) { + to->mnt_group_id = from->mnt_group_id; + list_add(&to->mnt_share, &from->mnt_share); + lock_mount_hash(); + set_mnt_shared(to); + unlock_mount_hash(); + } + + err = 0; +out: + namespace_unlock(); + return err; +} + static int do_move_mount(struct path *old_path, struct path *new_path) { struct path parent_path = {.mnt = NULL, .dentry = NULL}; @@ -3587,7 +3659,10 @@ SYSCALL_DEFINE5(move_mount, if (ret < 0) goto out_to; - ret = do_move_mount(&from_path, &to_path); + if (flags & MOVE_MOUNT_SET_GROUP) + ret = do_set_group(&from_path, &to_path); + else + ret = do_move_mount(&from_path, &to_path); out_to: path_put(&to_path); diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h index 96a0240f23fed..535ca707dfd7a 100644 --- a/include/uapi/linux/mount.h +++ b/include/uapi/linux/mount.h @@ -70,7 +70,8 @@ #define MOVE_MOUNT_T_SYMLINKS 0x00000010 /* Follow symlinks on to path */ #define MOVE_MOUNT_T_AUTOMOUNTS 0x00000020 /* Follow automounts on to path */ #define MOVE_MOUNT_T_EMPTY_PATH 0x00000040 /* Empty to path permitted */ -#define MOVE_MOUNT__MASK 0x00000077 +#define MOVE_MOUNT_SET_GROUP 0x00000100 /* Set sharing group instead */ +#define MOVE_MOUNT__MASK 0x00000177 /* * fsopen() flags. From 0a6a12488825a82c814fbd6c69dd7923b2d616a0 Mon Sep 17 00:00:00 2001 From: Jonathan Maple Date: Wed, 31 Dec 2025 13:09:24 -0500 Subject: [PATCH 13/16] use uniform permission checks for all mount propagation changes jira VULN-98608 cve-bf CVE-2025-38498 commit-author Al Viro commit cffd0441872e7f6b1fce5e78fb1c99187a291330 do_change_type() and do_set_group() are operating on different aspects of the same thing - propagation graph. The latter asks for mounts involved to be mounted in namespace(s) the caller has CAP_SYS_ADMIN for. The former is a mess - originally it didn't even check that mount *is* mounted. That got fixed, but the resulting check turns out to be too strict for userland - in effect, we check that mount is in our namespace, having already checked that we have CAP_SYS_ADMIN there. What we really need (in both cases) is * only touch mounts that are mounted. That's a must-have constraint - data corruption happens if it get violated. * don't allow to mess with a namespace unless you already have enough permissions to do so (i.e. CAP_SYS_ADMIN in its userns). That's an equivalent of what do_set_group() does; let's extract that into a helper (may_change_propagation()) and use it in both do_set_group() and do_change_type(). Fixes: 12f147ddd6de "do_change_type(): refuse to operate on unmounted/not ours mounts" Acked-by: Andrei Vagin Reviewed-by: Pavel Tikhomirov Tested-by: Pavel Tikhomirov Reviewed-by: Christian Brauner Signed-off-by: Al Viro (cherry picked from commit cffd0441872e7f6b1fce5e78fb1c99187a291330) Signed-off-by: Jonathan Maple --- fs/namespace.c | 34 ++++++++++++++++++++-------------- 1 file changed, 20 insertions(+), 14 deletions(-) diff --git a/fs/namespace.c b/fs/namespace.c index b0f8baa9426c4..306fc2778ec62 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2254,6 +2254,19 @@ static int graft_tree(struct mount *mnt, struct mount *p, struct mountpoint *mp) return attach_recursive_mnt(mnt, p, mp, NULL); } +static int may_change_propagation(const struct mount *m) +{ + struct mnt_namespace *ns = m->mnt_ns; + + // it must be mounted in some namespace + if (IS_ERR_OR_NULL(ns)) // is_mounted() + return -EINVAL; + // and the caller must be admin in userns of that namespace + if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN)) + return -EPERM; + return 0; +} + /* * Sanity check the flags to change_mnt_propagation. */ @@ -2290,10 +2303,10 @@ static int do_change_type(struct path *path, int ms_flags) return -EINVAL; namespace_lock(); - if (!check_mnt(mnt)) { - err = -EINVAL; + err = may_change_propagation(mnt); + if (err) goto out_unlock; - } + if (type == MS_SHARED) { err = invent_group_ids(mnt, recurse); if (err) @@ -2644,18 +2657,11 @@ static int do_set_group(struct path *from_path, struct path *to_path) namespace_lock(); - err = -EINVAL; - /* To and From must be mounted */ - if (!is_mounted(&from->mnt)) - goto out; - if (!is_mounted(&to->mnt)) - goto out; - - err = -EPERM; - /* We should be allowed to modify mount namespaces of both mounts */ - if (!ns_capable(from->mnt_ns->user_ns, CAP_SYS_ADMIN)) + err = may_change_propagation(from); + if (err) goto out; - if (!ns_capable(to->mnt_ns->user_ns, CAP_SYS_ADMIN)) + err = may_change_propagation(to); + if (err) goto out; err = -EINVAL; From be6f10c46a2abe4a3f3251c848cbfd745109c9a8 Mon Sep 17 00:00:00 2001 From: Jonathan Maple Date: Thu, 18 Dec 2025 16:45:28 -0500 Subject: [PATCH 14/16] net/sched: sch_qfq: Fix race condition on qfq_aggregate jira VULN-89291 cve CVE-2025-38477 commit-author Xiang Mei commit 5e28d5a3f774f118896aec17a3a20a9c5c9dfc64 A race condition can occur when 'agg' is modified in qfq_change_agg (called during qfq_enqueue) while other threads access it concurrently. For example, qfq_dump_class may trigger a NULL dereference, and qfq_delete_class may cause a use-after-free. This patch addresses the issue by: 1. Moved qfq_destroy_class into the critical section. 2. Added sch_tree_lock protection to qfq_dump_class and qfq_dump_class_stats. Fixes: 462dbc9101ac ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost") Signed-off-by: Xiang Mei Reviewed-by: Cong Wang Signed-off-by: David S. Miller (cherry picked from commit 5e28d5a3f774f118896aec17a3a20a9c5c9dfc64) Signed-off-by: Jonathan Maple --- net/sched/sch_qfq.c | 30 +++++++++++++++++++++--------- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c index 4dd99cd300ebe..e616c2240e75a 100644 --- a/net/sched/sch_qfq.c +++ b/net/sched/sch_qfq.c @@ -406,7 +406,7 @@ static int qfq_change_class(struct Qdisc *sch, u32 classid, u32 parentid, bool existing = false; struct nlattr *tb[TCA_QFQ_MAX + 1]; struct qfq_aggregate *new_agg = NULL; - u32 weight, lmax, inv_w; + u32 weight, lmax, inv_w, old_weight, old_lmax; int err; int delta_w; @@ -442,12 +442,16 @@ static int qfq_change_class(struct Qdisc *sch, u32 classid, u32 parentid, inv_w = ONE_FP / weight; weight = ONE_FP / inv_w; - if (cl != NULL && - lmax == cl->agg->lmax && - weight == cl->agg->class_weight) - return 0; /* nothing to change */ + if (cl != NULL) { + sch_tree_lock(sch); + old_weight = cl->agg->class_weight; + old_lmax = cl->agg->lmax; + sch_tree_unlock(sch); + if (lmax == old_lmax && weight == old_weight) + return 0; /* nothing to change */ + } - delta_w = weight - (cl ? cl->agg->class_weight : 0); + delta_w = weight - (cl ? old_weight : 0); if (q->wsum + delta_w > QFQ_MAX_WSUM) { pr_notice("qfq: total weight out of range (%d + %u)\n", @@ -550,10 +554,10 @@ static int qfq_delete_class(struct Qdisc *sch, unsigned long arg, qdisc_purge_queue(cl->qdisc); qdisc_class_hash_remove(&q->clhash, &cl->common); + qfq_destroy_class(sch, cl); sch_tree_unlock(sch); - qfq_destroy_class(sch, cl); return 0; } @@ -620,6 +624,7 @@ static int qfq_dump_class(struct Qdisc *sch, unsigned long arg, { struct qfq_class *cl = (struct qfq_class *)arg; struct nlattr *nest; + u32 class_weight, lmax; tcm->tcm_parent = TC_H_ROOT; tcm->tcm_handle = cl->common.classid; @@ -628,8 +633,13 @@ static int qfq_dump_class(struct Qdisc *sch, unsigned long arg, nest = nla_nest_start_noflag(skb, TCA_OPTIONS); if (nest == NULL) goto nla_put_failure; - if (nla_put_u32(skb, TCA_QFQ_WEIGHT, cl->agg->class_weight) || - nla_put_u32(skb, TCA_QFQ_LMAX, cl->agg->lmax)) + + sch_tree_lock(sch); + class_weight = cl->agg->class_weight; + lmax = cl->agg->lmax; + sch_tree_unlock(sch); + if (nla_put_u32(skb, TCA_QFQ_WEIGHT, class_weight) || + nla_put_u32(skb, TCA_QFQ_LMAX, lmax)) goto nla_put_failure; return nla_nest_end(skb, nest); @@ -646,8 +656,10 @@ static int qfq_dump_class_stats(struct Qdisc *sch, unsigned long arg, memset(&xstats, 0, sizeof(xstats)); + sch_tree_lock(sch); xstats.weight = cl->agg->class_weight; xstats.lmax = cl->agg->lmax; + sch_tree_unlock(sch); if (gnet_stats_copy_basic(d, NULL, &cl->bstats, true) < 0 || gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 || From e86585584025132edcef585eadb6da754a7553cd Mon Sep 17 00:00:00 2001 From: Jonathan Maple Date: Wed, 31 Dec 2025 12:56:10 -0500 Subject: [PATCH 15/16] net/sched: sch_qfq: Avoid triggering might_sleep in atomic context in qfq_delete_class jira VULN-89291 cve-bf CVE-2025-38477 commit-author Xiang Mei commit cf074eca0065bc5142e6004ae236bb35a2687fdf might_sleep could be trigger in the atomic context in qfq_delete_class. qfq_destroy_class was moved into atomic context locked by sch_tree_lock to avoid a race condition bug on qfq_aggregate. However, might_sleep could be triggered by qfq_destroy_class, which introduced sleeping in atomic context (path: qfq_destroy_class->qdisc_put->__qdisc_destroy->lockdep_unregister_key ->might_sleep). Considering the race is on the qfq_aggregate objects, keeping qfq_rm_from_agg in the lock but moving the left part out can solve this issue. Fixes: 5e28d5a3f774 ("net/sched: sch_qfq: Fix race condition on qfq_aggregate") Reported-by: Dan Carpenter Signed-off-by: Xiang Mei Link: https://patch.msgid.link/4a04e0cc-a64b-44e7-9213-2880ed641d77@sabinyo.mountain Reviewed-by: Cong Wang Reviewed-by: Dan Carpenter Link: https://patch.msgid.link/20250717230128.159766-1-xmei5@asu.edu Signed-off-by: Paolo Abeni (cherry picked from commit cf074eca0065bc5142e6004ae236bb35a2687fdf) Signed-off-by: Jonathan Maple --- net/sched/sch_qfq.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c index e616c2240e75a..116eb7e7679fb 100644 --- a/net/sched/sch_qfq.c +++ b/net/sched/sch_qfq.c @@ -533,9 +533,6 @@ static int qfq_change_class(struct Qdisc *sch, u32 classid, u32 parentid, static void qfq_destroy_class(struct Qdisc *sch, struct qfq_class *cl) { - struct qfq_sched *q = qdisc_priv(sch); - - qfq_rm_from_agg(q, cl); gen_kill_estimator(&cl->rate_est); qdisc_put(cl->qdisc); kfree(cl); @@ -554,10 +551,11 @@ static int qfq_delete_class(struct Qdisc *sch, unsigned long arg, qdisc_purge_queue(cl->qdisc); qdisc_class_hash_remove(&q->clhash, &cl->common); - qfq_destroy_class(sch, cl); + qfq_rm_from_agg(q, cl); sch_tree_unlock(sch); + qfq_destroy_class(sch, cl); return 0; } @@ -1507,6 +1505,7 @@ static void qfq_destroy_qdisc(struct Qdisc *sch) for (i = 0; i < q->clhash.hashsize; i++) { hlist_for_each_entry_safe(cl, next, &q->clhash.hash[i], common.hnode) { + qfq_rm_from_agg(q, cl); qfq_destroy_class(sch, cl); } } From 4e163429b2e26338591aadc654bde76da33f5688 Mon Sep 17 00:00:00 2001 From: Jonathan Maple Date: Wed, 31 Dec 2025 15:09:27 -0500 Subject: [PATCH 16/16] fix propagation graph breakage by MOVE_MOUNT_SET_GROUP move_mount(2) jira VULN-98608 cve-bf CVE-2025-38498 commit-author Al Viro commit d8cc0362f918d020ca1340d7694f07062dc30f36 9ffb14ef61ba "move_mount: allow to add a mount into an existing group" breaks assertions on ->mnt_share/->mnt_slave. For once, the data structures in question are actually documented. Documentation/filesystem/sharedsubtree.rst: All vfsmounts in a peer group have the same ->mnt_master. If it is non-NULL, they form a contiguous (ordered) segment of slave list. do_set_group() puts a mount into the same place in propagation graph as the old one. As the result, if old mount gets events from somewhere and is not a pure event sink, new one needs to be placed next to the old one in the slave list the old one's on. If it is a pure event sink, we only need to make sure the new one doesn't end up in the middle of some peer group. "move_mount: allow to add a mount into an existing group" ends up putting the new one in the beginning of list; that's definitely not going to be in the middle of anything, so that's fine for case when old is not marked shared. In case when old one _is_ marked shared (i.e. is not a pure event sink), that breaks the assumptions of propagation graph iterators. Put the new mount next to the old one on the list - that does the right thing in "old is marked shared" case and is just as correct as the current behaviour if old is not marked shared (kudos to Pavel for pointing that out - my original suggested fix changed behaviour in the "nor marked" case, which complicated things for no good reason). Reviewed-by: Christian Brauner Fixes: 9ffb14ef61ba ("move_mount: allow to add a mount into an existing group") Signed-off-by: Al Viro (cherry picked from commit d8cc0362f918d020ca1340d7694f07062dc30f36) Signed-off-by: Jonathan Maple --- fs/namespace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/namespace.c b/fs/namespace.c index 306fc2778ec62..673a550eadbe1 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2694,7 +2694,7 @@ static int do_set_group(struct path *from_path, struct path *to_path) if (IS_MNT_SLAVE(from)) { struct mount *m = from->mnt_master; - list_add(&to->mnt_slave, &m->mnt_slave_list); + list_add(&to->mnt_slave, &from->mnt_slave); to->mnt_master = m; }