[LTS 9.2] CVE-2025-38022, CVE-2025-38129#1387
Open
pvts-mat wants to merge 4 commits into
…" problem

jira VULN-71153
cve CVE-2025-38022
commit-author Zhu Yanjun <yanjun.zhu@linux.dev>
commit d0706bf
upstream-diff Used linux-5.15.y backport
  5629064f92f0de6d6b3572055cd35361c3ad953c for the clean pick. In LTS 9.2
  the `ib_device_notify_register()' function, where the `kobject_uevent()'
  call was moved in the upstream fix, doesn't exist. It was introduced in
  9cbed5a ("RDMA/nldev: Add support for RDMA monitoring"), with locks put
  in place in 1d6a9e7 ("RDMA/core: Fix use-after-free when rename device
  name"). Wrapping the `kobject_uevent()' call in
  `down_read(&devices_rwsem)' and `up_read(&devices_rwsem)' results in
  the same protection.

Call Trace:
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:408 [inline]
 print_report+0xc3/0x670 mm/kasan/report.c:521
 kasan_report+0xe0/0x110 mm/kasan/report.c:634
 strlen+0x93/0xa0 lib/string.c:420
 __fortify_strlen include/linux/fortify-string.h:268 [inline]
 get_kobj_path_length lib/kobject.c:118 [inline]
 kobject_get_path+0x3f/0x2a0 lib/kobject.c:158
 kobject_uevent_env+0x289/0x1870 lib/kobject_uevent.c:545
 ib_register_device drivers/infiniband/core/device.c:1472 [inline]
 ib_register_device+0x8cf/0xe00 drivers/infiniband/core/device.c:1393
 rxe_register_device+0x275/0x320 drivers/infiniband/sw/rxe/rxe_verbs.c:1552
 rxe_net_add+0x8e/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:550
 rxe_newlink+0x70/0x190 drivers/infiniband/sw/rxe/rxe.c:225
 nldev_newlink+0x3a3/0x680 drivers/infiniband/core/nldev.c:1796
 rdma_nl_rcv_msg+0x387/0x6e0 drivers/infiniband/core/netlink.c:195
 rdma_nl_rcv_skb.constprop.0.isra.0+0x2e5/0x450
 netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
 netlink_unicast+0x53a/0x7f0 net/netlink/af_netlink.c:1339
 netlink_sendmsg+0x8d1/0xdd0 net/netlink/af_netlink.c:1883
 sock_sendmsg_nosec net/socket.c:712 [inline]
 __sock_sendmsg net/socket.c:727 [inline]
 ____sys_sendmsg+0xa95/0xc70 net/socket.c:2566
 ___sys_sendmsg+0x134/0x1d0 net/socket.c:2620
 __sys_sendmsg+0x16d/0x220 net/socket.c:2652
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0x260 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

This problem is similar to the problem that the
commit 1d6a9e7 ("RDMA/core: Fix use-after-free when rename device name")
fixes.

The root cause is: the function ib_device_rename() renames the name with
lock. But in the function kobject_uevent(), this name is accessed without
lock protection at the same time.

The solution is to add the lock protection when this name is accessed in
the function kobject_uevent().

Fixes: 779e0bf ("RDMA/core: Do not indicate device ready when device enablement fails")
Link: https://patch.msgid.link/r/20250506151008.75701-1-yanjun.zhu@linux.dev
Reported-by: syzbot+e2ce9e275ecc70a30b72@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e2ce9e275ecc70a30b72
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
(cherry picked from commit 5629064f92f0de6d6b3572055cd35361c3ad953c)
Signed-off-by: Marcin Wcisło <marcin.wcislo@conclusive.pl>
jira VULN-71838
cve-pre CVE-2025-38129
commit-author Qingfang DENG <qingfang.deng@siflower.com.cn>
commit 542bcea

We use BH context only for synchronization, so we don't care if it's
actually serving softirq or not.

As a side note, in case of threaded NAPI, in_serving_softirq() will
return false because it's in process context with BH off, making
page_pool_recycle_in_cache() unreachable.

Signed-off-by: Qingfang DENG <qingfang.deng@siflower.com.cn>
Tested-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 542bcea)
Signed-off-by: Marcin Wcisło <marcin.wcislo@conclusive.pl>
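The difference between the two predicates can be illustrated with a small userspace model of the softirq field in the preempt counter. This is only a sketch under stated assumptions: the constants follow the layout in include/linux/preempt.h as I understand it, the model ignores PREEMPT_RT and every other field of the counter, and all `model_*` names are hypothetical stand-ins rather than kernel API.

```c
#include <stdbool.h>

/*
 * Simplified model of the softirq bits of the kernel's preempt counter.
 * Constants follow include/linux/preempt.h; the model_* helpers are
 * hypothetical and exist only for this illustration.
 */
#define SOFTIRQ_SHIFT          8
#define SOFTIRQ_OFFSET         (1U << SOFTIRQ_SHIFT)
#define SOFTIRQ_MASK           (0xffU << SOFTIRQ_SHIFT)
#define SOFTIRQ_DISABLE_OFFSET (2 * SOFTIRQ_OFFSET)

static unsigned int model_preempt_count;

/* BH disabled from process context (e.g. threaded NAPI): not serving. */
static void model_local_bh_disable(void)
{
	model_preempt_count += SOFTIRQ_DISABLE_OFFSET;
}

static void model_local_bh_enable(void)
{
	model_preempt_count -= SOFTIRQ_DISABLE_OFFSET;
}

/* Entering and leaving actual softirq processing. */
static void model_softirq_enter(void)
{
	model_preempt_count += SOFTIRQ_OFFSET;
}

static void model_softirq_exit(void)
{
	model_preempt_count -= SOFTIRQ_OFFSET;
}

/* True when serving a softirq OR when BH is merely disabled. */
static bool model_in_softirq(void)
{
	return (model_preempt_count & SOFTIRQ_MASK) != 0;
}

/* True only while a softirq handler is actually running. */
static bool model_in_serving_softirq(void)
{
	return (model_preempt_count & SOFTIRQ_OFFSET) != 0;
}
```

In this model, calling `model_local_bh_disable()` from process context makes `model_in_softirq()` true while `model_in_serving_softirq()` stays false, which is the threaded NAPI situation the message describes.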
jira VULN-71838
cve-pre CVE-2025-38129
commit-author Yunsheng Lin <linyunsheng@huawei.com>
commit 368d3cb

page_pool_ring_[un]lock() use in_softirq() to decide which
spin lock variant to use, and when they are called in the
context with in_softirq() being false, spin_lock_bh() is
called in page_pool_ring_lock() while spin_unlock() is
called in page_pool_ring_unlock(), because spin_lock_bh()
has disabled the softirq in page_pool_ring_lock(), which
causes inconsistency for spin lock pair calling.

This patch fixes it by returning in_softirq state from
page_pool_producer_lock(), and use it to decide which
spin lock variant to use in page_pool_producer_unlock().

As pool->ring has both producer and consumer lock, so
rename it to page_pool_producer_[un]lock() to reflect
the actual usage. Also move them to page_pool.c as they
are only used there, and remove the 'inline' as the
compiler may have better idea to do inlining or not.

Fixes: 7886244 ("net: page_pool: Add bulk support for ptr_ring")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Link: https://lore.kernel.org/r/20230522031714.5089-1-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 368d3cb)
Signed-off-by: Marcin Wcisło <marcin.wcislo@conclusive.pl>
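The mismatch described above can be reproduced in a minimal userspace model. The sketch assumes that the only relevant state is the BH-disable depth, so the ring lock itself is left out; every name in it is a hypothetical stand-in and not the code from page_pool.c.

```c
#include <stdbool.h>

/* Hypothetical model: only the BH-disable depth is tracked. */
static int model_bh_depth;

static bool model_in_softirq(void)     { return model_bh_depth > 0; }
static void model_spin_lock(void)      { /* lock itself not modelled */ }
static void model_spin_unlock(void)    { /* lock itself not modelled */ }
static void model_spin_lock_bh(void)   { model_bh_depth++; }
static void model_spin_unlock_bh(void) { model_bh_depth--; }

/* Old scheme: lock and unlock each sample the context on their own. */
static void model_ring_lock(void)
{
	if (model_in_softirq())
		model_spin_lock();
	else
		model_spin_lock_bh();
}

static void model_ring_unlock(void)
{
	/* After spin_lock_bh() this is true, so the wrong branch runs. */
	if (model_in_softirq())
		model_spin_unlock();
	else
		model_spin_unlock_bh();
}

/* New scheme: sample once while locking, reuse the answer to unlock. */
static bool model_producer_lock(void)
{
	bool in_softirq = model_in_softirq();

	if (in_softirq)
		model_spin_lock();
	else
		model_spin_lock_bh();
	return in_softirq;
}

static void model_producer_unlock(bool in_softirq)
{
	if (in_softirq)
		model_spin_unlock();
	else
		model_spin_unlock_bh();
}

/* BH depth left behind by one lock/unlock pair from process context. */
static int model_leak_old(void)
{
	model_bh_depth = 0;
	model_ring_lock();
	model_ring_unlock();
	return model_bh_depth;
}

static int model_leak_new(void)
{
	model_bh_depth = 0;
	model_producer_unlock(model_producer_lock());
	return model_bh_depth;
}
```

With the old pair the model leaves BH disabled after the unlock, while the new pair restores the depth to zero.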
jira VULN-71838
cve CVE-2025-38129
commit-author Dong Chenchen <dongchenchen2@huawei.com>
commit 271683b
upstream-diff Used linux-6.1.y backport
  1a8c0b61d4cb55c5440583ec9e7f86a730369e32 for a clean pick. It accounts
  for the similarly non-backported commit 4dec64c ("page_pool: convert to
  use netmem") and deals with the same context conflicts in
  `page_pool_release()'.

syzbot reported a uaf in page_pool_recycle_in_ring:

BUG: KASAN: slab-use-after-free in lock_release+0x151/0xa30 kernel/locking/lockdep.c:5862
Read of size 8 at addr ffff8880286045a0 by task syz.0.284/6943

CPU: 0 UID: 0 PID: 6943 Comm: syz.0.284 Not tainted 6.13.0-rc3-syzkaller-gdfa94ce54f41 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:378 [inline]
 print_report+0x169/0x550 mm/kasan/report.c:489
 kasan_report+0x143/0x180 mm/kasan/report.c:602
 lock_release+0x151/0xa30 kernel/locking/lockdep.c:5862
 __raw_spin_unlock_bh include/linux/spinlock_api_smp.h:165 [inline]
 _raw_spin_unlock_bh+0x1b/0x40 kernel/locking/spinlock.c:210
 spin_unlock_bh include/linux/spinlock.h:396 [inline]
 ptr_ring_produce_bh include/linux/ptr_ring.h:164 [inline]
 page_pool_recycle_in_ring net/core/page_pool.c:707 [inline]
 page_pool_put_unrefed_netmem+0x748/0xb00 net/core/page_pool.c:826
 page_pool_put_netmem include/net/page_pool/helpers.h:323 [inline]
 page_pool_put_full_netmem include/net/page_pool/helpers.h:353 [inline]
 napi_pp_put_page+0x149/0x2b0 net/core/skbuff.c:1036
 skb_pp_recycle net/core/skbuff.c:1047 [inline]
 skb_free_head net/core/skbuff.c:1094 [inline]
 skb_release_data+0x6c4/0x8a0 net/core/skbuff.c:1125
 skb_release_all net/core/skbuff.c:1190 [inline]
 __kfree_skb net/core/skbuff.c:1204 [inline]
 sk_skb_reason_drop+0x1c9/0x380 net/core/skbuff.c:1242
 kfree_skb_reason include/linux/skbuff.h:1263 [inline]
 __skb_queue_purge_reason include/linux/skbuff.h:3343 [inline]

root cause is:

page_pool_recycle_in_ring
  ptr_ring_produce
    spin_lock(&r->producer_lock);
    WRITE_ONCE(r->queue[r->producer++], ptr)
      //recycle last page to pool
                                page_pool_release
                                  page_pool_scrub
                                    page_pool_empty_ring
                                      ptr_ring_consume
                                      page_pool_return_page //release all page
                                  __page_pool_destroy
                                     free_percpu(pool->recycle_stats);
                                     free(pool) //free

     spin_unlock(&r->producer_lock); //pool->ring uaf read
  recycle_stat_inc(pool, ring);

page_pool can be free while page pool recycle the last page in ring.
Add producer-lock barrier to page_pool_release to prevent the page
pool from being free before all pages have been recycled.

recycle_stat_inc() is empty when CONFIG_PAGE_POOL_STATS is not
enabled, which will trigger Wempty-body build warning. Add definition
for pool stat macro to fix warning.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/netdev/20250513083123.3514193-1-dongchenchen2@huawei.com
Fixes: ff7d6b2 ("page_pool: refurbish version of page_pool code")
Reported-by: syzbot+204a4382fcb3311f3858@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=204a4382fcb3311f3858
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250527114152.3119109-1-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 1a8c0b61d4cb55c5440583ec9e7f86a730369e32)
Signed-off-by: Marcin Wcisło <marcin.wcislo@conclusive.pl>
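The "producer-lock barrier" idea can be sketched in userspace with a pthread mutex standing in for the ring's producer lock. This is a rough analogue under stated assumptions, not the kernel change itself: `struct model_pool` and all `model_*` functions are hypothetical, and the producer is assumed to touch the pool only while holding the lock.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a pool with a producer lock. */
struct model_pool {
	pthread_mutex_t producer_lock;
	atomic_bool producer_started; /* set once the producer owns the lock */
	atomic_bool producer_active;  /* true while the pool is being used */
	int ring_slot;
};

/* Recycling side: every access to the pool happens under the lock. */
static void *model_recycle(void *arg)
{
	struct model_pool *pool = arg;

	pthread_mutex_lock(&pool->producer_lock);
	atomic_store(&pool->producer_active, true);
	atomic_store(&pool->producer_started, true);
	for (int i = 0; i < 1000; i++)
		sched_yield();          /* widen the race window */
	pool->ring_slot = 42;           /* "recycle the last page" */
	atomic_store(&pool->producer_active, false);
	pthread_mutex_unlock(&pool->producer_lock);
	return NULL;
}

/*
 * Release side: taking and dropping the lock acts as a barrier, so the
 * caller gets past it only after an in-flight producer has left.
 */
static bool model_release_barrier(struct model_pool *pool)
{
	pthread_mutex_lock(&pool->producer_lock);
	pthread_mutex_unlock(&pool->producer_lock);
	return !atomic_load(&pool->producer_active);
}

/* Returns 1 if the pool was idle once the barrier had been passed. */
static int model_demo(void)
{
	struct model_pool pool = { .ring_slot = 0 };
	pthread_t tid;
	bool idle;

	pthread_mutex_init(&pool.producer_lock, NULL);
	atomic_init(&pool.producer_started, false);
	atomic_init(&pool.producer_active, false);

	if (pthread_create(&tid, NULL, model_recycle, &pool) != 0)
		return 0;
	/* Wait until the producer is inside its critical section. */
	while (!atomic_load(&pool.producer_started))
		sched_yield();

	idle = model_release_barrier(&pool); /* freeing would be safe now */
	pthread_join(tid, NULL);
	pthread_mutex_destroy(&pool.producer_lock);
	return idle && pool.ring_slot == 42;
}
```

The empty lock/unlock pair does no work of its own; its only purpose is to wait out a producer that is still inside its critical section before the pool is torn down.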
[LTS 9.2]
Commits
CVE-2025-38129
0:
1:
This commit introduces the page_pool_producer_lock()/page_pool_producer_unlock() functions, on which the CVE-2025-38129 fix is based.
2:
This prerequisite may not have been strictly necessary, as the in_softirq()/in_serving_softirq() branching in page_pool_recycle_in_ring() is discarded in the CVE-2025-38129 fix anyway, but it simplifies patch backporting and is a bugfix worth backporting in itself.
This solution is the same as in linux-6.1.y, with which LTS 9.2 has a very similar net/core/page_pool.c history (=: same commit, ~: backported commit):
CVE-2025-38022
Since the fix looks completely different from upstream, here's a breakdown of the issue. The fix message:
The ib_device_rename() function:
kernel-src-tree/drivers/infiniband/core/device.c
Lines 385 to 426 in d0706bf
The lock is
kernel-src-tree/drivers/infiniband/core/device.c
Line 391 in d0706bf
and
kernel-src-tree/drivers/infiniband/core/device.c
Line 424 in d0706bf
The renaming appears in
kernel-src-tree/drivers/infiniband/core/device.c
Line 420 in d0706bf
This is a function pointer in the struct ib_client structure:
kernel-src-tree/include/rdma/ib_verbs.h
Line 2854 in d0706bf
The only instance found in the codebase with this pointer set to non-null is srp_client:
kernel-src-tree/drivers/infiniband/ulp/srp/ib_srp.c
Lines 153 to 158 in d0706bf
Definition:
kernel-src-tree/drivers/infiniband/ulp/srp/ib_srp.c
Lines 3986 to 3998 in d0706bf
Renaming takes place in device_rename(), using kobject_rename():
kernel-src-tree/drivers/base/core.c
Line 4526 in d0706bf
This function is part of the kernel's library for drivers. It operates on kobjects, the core mechanism for reference-counted objects that organizes system components into sysfs hierarchies and enables device hotplug notifications. This is most likely the free after which the use occurred in kobject_uevent():
kernel-src-tree/lib/kobject.c
Line 524 in d0706bf
Following the KASAN stack trace from d0706bf's message, it can be seen that the UAF occurred in this line of the kobject_uevent_env() function (to which kobject_uevent() delegates with no additional logic):
kernel-src-tree/lib/kobject_uevent.c
Line 545 in d0706bf
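The stack trace shows the faulting strlen() being reached through get_kobj_path_length(). The sketch below is a pared-down userspace model of that walk, written from my reading of lib/kobject.c; `struct model_kobject` and `model_path_length()` are hypothetical names, and the real function operates on struct kobject.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, pared-down kobject: just a name and a parent link. */
struct model_kobject {
	const char *name;
	const struct model_kobject *parent;
};

/*
 * Modelled on get_kobj_path_length(): walk up the parent chain and add
 * strlen(name) + 1 (for the '/') per level, plus one byte for the
 * terminating NUL. Each strlen() dereferences a name pointer, which is
 * where a name freed by a concurrent rename would be read.
 */
static size_t model_path_length(const struct model_kobject *kobj)
{
	size_t length = 1;

	do {
		if (kobj->name == NULL)
			return 0;
		length += strlen(kobj->name) + 1;
		kobj = kobj->parent;
	} while (kobj);

	return length;
}
```

Nothing in the walk itself guards the name pointers, so in this model any protection against a concurrent rename has to come from the caller.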
Upstream, the function was called here:
kernel-src-tree/drivers/infiniband/core/device.c
Lines 1471 to 1472 in 4bcc063
The d0706bf fix moved it inside the ib_device_notify_register() call appearing in the next line:
kernel-src-tree/drivers/infiniband/core/device.c
Line 1474 in 4bcc063
kernel-src-tree/drivers/infiniband/core/device.c
Lines 1355 to 1356 in d0706bf
between
kernel-src-tree/drivers/infiniband/core/device.c
Line 1353 in d0706bf
and
kernel-src-tree/drivers/infiniband/core/device.c
Line 1375 in d0706bf
In LTS 9.2 this function doesn't exist. It was introduced in 9cbed5a, with locks put in place in 1d6a9e7. The latter is the commit referenced in the CVE-2025-38022 fix commit's message.
Wrapping the kobject_uevent() call in down_read(&devices_rwsem) and up_read(&devices_rwsem) effectively gives the same protection as was achieved upstream. The renaming function ib_device_rename() in LTS 9.2 doesn't differ much from upstream and uses the same locks:
kernel-src-tree/drivers/infiniband/core/device.c
Lines 402 to 442 in 17e4221
kABI check: passed
Boot test: passed
boot-test.log
Kselftests: passed relative
Reference
kselftests–ciqlts9_2–run1.log
Patch
kselftests–ciqlts9_2-CVE-batch-40–run1.log
kselftests–ciqlts9_2-CVE-batch-40–run2.log
Comparison
The test results for the reference and the patch are the same.
full-test-results-comparison.log