| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
devlink: rate: Unset parent pointer in devl_rate_nodes_destroy
The function devl_rate_nodes_destroy is documented to "Unset parent for
all rate objects". However, it was only calling the driver-specific
`rate_leaf_parent_set` or `rate_node_parent_set` ops and decrementing
the parent's refcount, without actually setting the
`devlink_rate->parent` pointer to NULL.
This leaves a dangling pointer in the `devlink_rate` struct, which cause
refcount error in netdevsim[1] and mlx5[2]. In addition, this is
inconsistent with the behavior of `devlink_nl_rate_parent_node_set`,
where the parent pointer is correctly cleared.
This patch fixes the issue by explicitly setting `devlink_rate->parent`
to NULL after notifying the driver, thus fulfilling the function's
documented behavior for all rate objects.
[1]
repro steps:
echo 1 > /sys/bus/netdevsim/new_device
devlink dev eswitch set netdevsim/netdevsim1 mode switchdev
echo 1 > /sys/bus/netdevsim/devices/netdevsim1/sriov_numvfs
devlink port function rate add netdevsim/netdevsim1/test_node
devlink port function rate set netdevsim/netdevsim1/128 parent test_node
echo 1 > /sys/bus/netdevsim/del_device
dmesg:
refcount_t: decrement hit 0; leaking memory.
WARNING: CPU: 8 PID: 1530 at lib/refcount.c:31 refcount_warn_saturate+0x42/0xe0
CPU: 8 UID: 0 PID: 1530 Comm: bash Not tainted 6.18.0-rc4+ #1 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:refcount_warn_saturate+0x42/0xe0
Call Trace:
<TASK>
devl_rate_leaf_destroy+0x8d/0x90
__nsim_dev_port_del+0x6c/0x70 [netdevsim]
nsim_dev_reload_destroy+0x11c/0x140 [netdevsim]
nsim_drv_remove+0x2b/0xb0 [netdevsim]
device_release_driver_internal+0x194/0x1f0
bus_remove_device+0xc6/0x130
device_del+0x159/0x3c0
device_unregister+0x1a/0x60
del_device_store+0x111/0x170 [netdevsim]
kernfs_fop_write_iter+0x12e/0x1e0
vfs_write+0x215/0x3d0
ksys_write+0x5f/0xd0
do_syscall_64+0x55/0x10f0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
[2]
devlink dev eswitch set pci/0000:08:00.0 mode switchdev
devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 1000
devlink port function rate add pci/0000:08:00.0/group1
devlink port function rate set pci/0000:08:00.0/32768 parent group1
modprobe -r mlx5_ib mlx5_fwctl mlx5_core
dmesg:
refcount_t: decrement hit 0; leaking memory.
WARNING: CPU: 7 PID: 16151 at lib/refcount.c:31 refcount_warn_saturate+0x42/0xe0
CPU: 7 UID: 0 PID: 16151 Comm: bash Not tainted 6.17.0-rc7_for_upstream_min_debug_2025_10_02_12_44 #1 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
RIP: 0010:refcount_warn_saturate+0x42/0xe0
Call Trace:
<TASK>
devl_rate_leaf_destroy+0x8d/0x90
mlx5_esw_offloads_devlink_port_unregister+0x33/0x60 [mlx5_core]
mlx5_esw_offloads_unload_rep+0x3f/0x50 [mlx5_core]
mlx5_eswitch_unload_sf_vport+0x40/0x90 [mlx5_core]
mlx5_sf_esw_event+0xc4/0x120 [mlx5_core]
notifier_call_chain+0x33/0xa0
blocking_notifier_call_chain+0x3b/0x50
mlx5_eswitch_disable_locked+0x50/0x110 [mlx5_core]
mlx5_eswitch_disable+0x63/0x90 [mlx5_core]
mlx5_unload+0x1d/0x170 [mlx5_core]
mlx5_uninit_one+0xa2/0x130 [mlx5_core]
remove_one+0x78/0xd0 [mlx5_core]
pci_device_remove+0x39/0xa0
device_release_driver_internal+0x194/0x1f0
unbind_store+0x99/0xa0
kernfs_fop_write_iter+0x12e/0x1e0
vfs_write+0x215/0x3d0
ksys_write+0x5f/0xd0
do_syscall_64+0x53/0x1f0
entry_SYSCALL_64_after_hwframe+0x4b/0x53 |
| In the Linux kernel, the following vulnerability has been resolved:
PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV
Before disabling SR-IOV via config space accesses to the parent PF,
sriov_disable() first removes the PCI devices representing the VFs.
Since commit 9d16947b7583 ("PCI: Add global pci_lock_rescan_remove()")
such removal operations are serialized against concurrent remove and
rescan using the pci_rescan_remove_lock. No such locking was ever added
in sriov_disable() however. In particular when commit 18f9e9d150fc
("PCI/IOV: Factor out sriov_add_vfs()") factored out the PCI device
removal into sriov_del_vfs() there was still no locking around the
pci_iov_remove_virtfn() calls.
On s390 the lack of serialization in sriov_disable() may cause double
remove and list corruption with the below (amended) trace being observed:
PSW: 0704c00180000000 0000000c914e4b38 (klist_put+56)
GPRS: 000003800313fb48 0000000000000000 0000000100000001 0000000000000001
00000000f9b520a8 0000000000000000 0000000000002fbd 00000000f4cc9480
0000000000000001 0000000000000000 0000000000000000 0000000180692828
00000000818e8000 000003800313fe2c 000003800313fb20 000003800313fad8
#0 [3800313fb20] device_del at c9158ad5c
#1 [3800313fb88] pci_remove_bus_device at c915105ba
#2 [3800313fbd0] pci_iov_remove_virtfn at c9152f198
#3 [3800313fc28] zpci_iov_remove_virtfn at c90fb67c0
#4 [3800313fc60] zpci_bus_remove_device at c90fb6104
#5 [3800313fca0] __zpci_event_availability at c90fb3dca
#6 [3800313fd08] chsc_process_sei_nt0 at c918fe4a2
#7 [3800313fd60] crw_collect_info at c91905822
#8 [3800313fe10] kthread at c90feb390
#9 [3800313fe68] __ret_from_fork at c90f6aa64
#10 [3800313fe98] ret_from_fork at c9194f3f2.
This is because in addition to sriov_disable() removing the VFs, the
platform also generates hot-unplug events for the VFs. This being the
reverse operation to the hotplug events generated by sriov_enable() and
handled via pdev->no_vf_scan. And while the event processing takes
pci_rescan_remove_lock and checks whether the struct pci_dev still exists,
the lack of synchronization makes this checking racy.
Other races may also be possible of course though given that this lack of
locking persisted so long observable races seem very rare. Even on s390 the
list corruption was only observed with certain devices since the platform
events are only triggered by config accesses after the removal, so as long
as the removal finished synchronously they would not race. Either way the
locking is missing so fix this by adding it to the sriov_del_vfs() helper.
Just like PCI rescan-remove, locking is also missing in sriov_add_vfs()
including for the error case where pci_stop_and_remove_bus_device() is
called without the PCI rescan-remove lock being held. Even in the non-error
case, adding new PCI devices and buses should be serialized via the PCI
rescan-remove lock. Add the necessary locking. |
| In the Linux kernel, the following vulnerability has been resolved:
vfat: fix missing sb_min_blocksize() return value checks
When emulating an nvme device on qemu with both logical_block_size and
physical_block_size set to 8 KiB, but without format, a kernel panic
was triggered during the early boot stage while attempting to mount a
vfat filesystem.
[95553.682035] EXT4-fs (nvme0n1): unable to set blocksize
[95553.684326] EXT4-fs (nvme0n1): unable to set blocksize
[95553.686501] EXT4-fs (nvme0n1): unable to set blocksize
[95553.696448] ISOFS: unsupported/invalid hardware sector size 8192
[95553.697117] ------------[ cut here ]------------
[95553.697567] kernel BUG at fs/buffer.c:1582!
[95553.697984] Oops: invalid opcode: 0000 [#1] SMP NOPTI
[95553.698602] CPU: 0 UID: 0 PID: 7212 Comm: mount Kdump: loaded Not tainted 6.18.0-rc2+ #38 PREEMPT(voluntary)
[95553.699511] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[95553.700534] RIP: 0010:folio_alloc_buffers+0x1bb/0x1c0
[95553.701018] Code: 48 8b 15 e8 93 18 02 65 48 89 35 e0 93 18 02 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d 31 d2 31 c9 31 f6 31 ff c3 cc cc cc cc <0f> 0b 90 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f
[95553.702648] RSP: 0018:ffffd1b0c676f990 EFLAGS: 00010246
[95553.703132] RAX: ffff8cfc4176d820 RBX: 0000000000508c48 RCX: 0000000000000001
[95553.703805] RDX: 0000000000002000 RSI: 0000000000000000 RDI: 0000000000000000
[95553.704481] RBP: ffffd1b0c676f9c8 R08: 0000000000000000 R09: 0000000000000000
[95553.705148] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[95553.705816] R13: 0000000000002000 R14: fffff8bc8257e800 R15: 0000000000000000
[95553.706483] FS: 000072ee77315840(0000) GS:ffff8cfdd2c8d000(0000) knlGS:0000000000000000
[95553.707248] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[95553.707782] CR2: 00007d8f2a9e5a20 CR3: 0000000039d0c006 CR4: 0000000000772ef0
[95553.708439] PKRU: 55555554
[95553.708734] Call Trace:
[95553.709015] <TASK>
[95553.709266] __getblk_slow+0xd2/0x230
[95553.709641] ? find_get_block_common+0x8b/0x530
[95553.710084] bdev_getblk+0x77/0xa0
[95553.710449] __bread_gfp+0x22/0x140
[95553.710810] fat_fill_super+0x23a/0xfc0
[95553.711216] ? __pfx_setup+0x10/0x10
[95553.711580] ? __pfx_vfat_fill_super+0x10/0x10
[95553.712014] vfat_fill_super+0x15/0x30
[95553.712401] get_tree_bdev_flags+0x141/0x1e0
[95553.712817] get_tree_bdev+0x10/0x20
[95553.713177] vfat_get_tree+0x15/0x20
[95553.713550] vfs_get_tree+0x2a/0x100
[95553.713910] vfs_cmd_create+0x62/0xf0
[95553.714273] __do_sys_fsconfig+0x4e7/0x660
[95553.714669] __x64_sys_fsconfig+0x20/0x40
[95553.715062] x64_sys_call+0x21ee/0x26a0
[95553.715453] do_syscall_64+0x80/0x670
[95553.715816] ? __fs_parse+0x65/0x1e0
[95553.716172] ? fat_parse_param+0x103/0x4b0
[95553.716587] ? vfs_parse_fs_param_source+0x21/0xa0
[95553.717034] ? __do_sys_fsconfig+0x3d9/0x660
[95553.717548] ? __x64_sys_fsconfig+0x20/0x40
[95553.717957] ? x64_sys_call+0x21ee/0x26a0
[95553.718360] ? do_syscall_64+0xb8/0x670
[95553.718734] ? __x64_sys_fsconfig+0x20/0x40
[95553.719141] ? x64_sys_call+0x21ee/0x26a0
[95553.719545] ? do_syscall_64+0xb8/0x670
[95553.719922] ? x64_sys_call+0x1405/0x26a0
[95553.720317] ? do_syscall_64+0xb8/0x670
[95553.720702] ? __x64_sys_close+0x3e/0x90
[95553.721080] ? x64_sys_call+0x1b5e/0x26a0
[95553.721478] ? do_syscall_64+0xb8/0x670
[95553.721841] ? irqentry_exit+0x43/0x50
[95553.722211] ? exc_page_fault+0x90/0x1b0
[95553.722681] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[95553.723166] RIP: 0033:0x72ee774f3afe
[95553.723562] Code: 73 01 c3 48 8b 0d 0a 33 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 49 89 ca b8 af 01 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d da 32 0f 00 f7 d8 64 89 01 48
[95553.725188] RSP: 002b:00007ffe97148978 EFLAGS: 00000246 ORIG_RAX: 00000000000001af
[95553.725892] RAX: ffffffffffffffda RBX:
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: Clean up only new IRQ glue on request_irq() failure
The mlx5_irq_alloc() function can inadvertently free the entire rmap
and end up in a crash[1] when the other threads tries to access this,
when request_irq() fails due to exhausted IRQ vectors. This commit
modifies the cleanup to remove only the specific IRQ mapping that was
just added.
This prevents removal of other valid mappings and ensures precise
cleanup of the failed IRQ allocation's associated glue object.
Note: This error is observed when both fwctl and rds configs are enabled.
[1]
mlx5_core 0000:05:00.0: Successfully registered panic handler for port 1
mlx5_core 0000:05:00.0: mlx5_irq_alloc:293:(pid 66740): Failed to
request irq. err = -28
infiniband mlx5_0: mlx5_ib_test_wc:290:(pid 66740): Error -28 while
trying to test write-combining support
mlx5_core 0000:05:00.0: Successfully unregistered panic handler for port 1
mlx5_core 0000:06:00.0: Successfully registered panic handler for port 1
mlx5_core 0000:06:00.0: mlx5_irq_alloc:293:(pid 66740): Failed to
request irq. err = -28
infiniband mlx5_0: mlx5_ib_test_wc:290:(pid 66740): Error -28 while
trying to test write-combining support
mlx5_core 0000:06:00.0: Successfully unregistered panic handler for port 1
mlx5_core 0000:03:00.0: mlx5_irq_alloc:293:(pid 28895): Failed to
request irq. err = -28
mlx5_core 0000:05:00.0: mlx5_irq_alloc:293:(pid 28895): Failed to
request irq. err = -28
general protection fault, probably for non-canonical address
0xe277a58fde16f291: 0000 [#1] SMP NOPTI
RIP: 0010:free_irq_cpu_rmap+0x23/0x7d
Call Trace:
<TASK>
? show_trace_log_lvl+0x1d6/0x2f9
? show_trace_log_lvl+0x1d6/0x2f9
? mlx5_irq_alloc.cold+0x5d/0xf3 [mlx5_core]
? __die_body.cold+0x8/0xa
? die_addr+0x39/0x53
? exc_general_protection+0x1c4/0x3e9
? dev_vprintk_emit+0x5f/0x90
? asm_exc_general_protection+0x22/0x27
? free_irq_cpu_rmap+0x23/0x7d
mlx5_irq_alloc.cold+0x5d/0xf3 [mlx5_core]
irq_pool_request_vector+0x7d/0x90 [mlx5_core]
mlx5_irq_request+0x2e/0xe0 [mlx5_core]
mlx5_irq_request_vector+0xad/0xf7 [mlx5_core]
comp_irq_request_pci+0x64/0xf0 [mlx5_core]
create_comp_eq+0x71/0x385 [mlx5_core]
? mlx5e_open_xdpsq+0x11c/0x230 [mlx5_core]
mlx5_comp_eqn_get+0x72/0x90 [mlx5_core]
? xas_load+0x8/0x91
mlx5_comp_irqn_get+0x40/0x90 [mlx5_core]
mlx5e_open_channel+0x7d/0x3c7 [mlx5_core]
mlx5e_open_channels+0xad/0x250 [mlx5_core]
mlx5e_open_locked+0x3e/0x110 [mlx5_core]
mlx5e_open+0x23/0x70 [mlx5_core]
__dev_open+0xf1/0x1a5
__dev_change_flags+0x1e1/0x249
dev_change_flags+0x21/0x5c
do_setlink+0x28b/0xcc4
? __nla_parse+0x22/0x3d
? inet6_validate_link_af+0x6b/0x108
? cpumask_next+0x1f/0x35
? __snmp6_fill_stats64.constprop.0+0x66/0x107
? __nla_validate_parse+0x48/0x1e6
__rtnl_newlink+0x5ff/0xa57
? kmem_cache_alloc_trace+0x164/0x2ce
rtnl_newlink+0x44/0x6e
rtnetlink_rcv_msg+0x2bb/0x362
? __netlink_sendskb+0x4c/0x6c
? netlink_unicast+0x28f/0x2ce
? rtnl_calcit.isra.0+0x150/0x146
netlink_rcv_skb+0x5f/0x112
netlink_unicast+0x213/0x2ce
netlink_sendmsg+0x24f/0x4d9
__sock_sendmsg+0x65/0x6a
____sys_sendmsg+0x28f/0x2c9
? import_iovec+0x17/0x2b
___sys_sendmsg+0x97/0xe0
__sys_sendmsg+0x81/0xd8
do_syscall_64+0x35/0x87
entry_SYSCALL_64_after_hwframe+0x6e/0x0
RIP: 0033:0x7fc328603727
Code: c3 66 90 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 0b ed
ff ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 2e 00 00 00 0f 05 <48> 3d 00
f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 44 ed ff ff 48
RSP: 002b:00007ffe8eb3f1a0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fc328603727
RDX: 0000000000000000 RSI: 00007ffe8eb3f1f0 RDI: 000000000000000d
RBP: 00007ffe8eb3f1f0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
R13: 00000000000
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: Fix IPsec cleanup over MPV device
When we do mlx5e_detach_netdev() we eventually disable blocking events
notifier, among those events are IPsec MPV events from IB to core.
So before disabling those blocking events, make sure to also unregister
the devcom device and mark all this device operations as complete,
in order to prevent the other device from using invalid netdev
during future devcom events which could cause the trace below.
BUG: kernel NULL pointer dereference, address: 0000000000000010
PGD 146427067 P4D 146427067 PUD 146488067 PMD 0
Oops: Oops: 0000 [#1] SMP
CPU: 1 UID: 0 PID: 7735 Comm: devlink Tainted: GW 6.12.0-rc6_for_upstream_min_debug_2024_11_08_00_46 #1
Tainted: [W]=WARN
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:mlx5_devcom_comp_set_ready+0x5/0x40 [mlx5_core]
Code: 00 01 48 83 05 23 32 1e 00 01 41 b8 ed ff ff ff e9 60 ff ff ff 48 83 05 00 32 1e 00 01 eb e3 66 0f 1f 44 00 00 0f 1f 44 00 00 <48> 8b 47 10 48 83 05 5f 32 1e 00 01 48 8b 50 40 48 85 d2 74 05 40
RSP: 0018:ffff88811a5c35f8 EFLAGS: 00010206
RAX: ffff888106e8ab80 RBX: ffff888107d7e200 RCX: ffff88810d6f0a00
RDX: ffff88810d6f0a00 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffff88811a17e620 R08: 0000000000000040 R09: 0000000000000000
R10: ffff88811a5c3618 R11: 0000000de85d51bd R12: ffff88811a17e600
R13: ffff88810d6f0a00 R14: 0000000000000000 R15: ffff8881034bda80
FS: 00007f27bdf89180(0000) GS:ffff88852c880000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 000000010f159005 CR4: 0000000000372eb0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? __die+0x20/0x60
? page_fault_oops+0x150/0x3e0
? exc_page_fault+0x74/0x130
? asm_exc_page_fault+0x22/0x30
? mlx5_devcom_comp_set_ready+0x5/0x40 [mlx5_core]
mlx5e_devcom_event_mpv+0x42/0x60 [mlx5_core]
mlx5_devcom_send_event+0x8c/0x170 [mlx5_core]
blocking_event+0x17b/0x230 [mlx5_core]
notifier_call_chain+0x35/0xa0
blocking_notifier_call_chain+0x3d/0x60
mlx5_blocking_notifier_call_chain+0x22/0x30 [mlx5_core]
mlx5_core_mp_event_replay+0x12/0x20 [mlx5_core]
mlx5_ib_bind_slave_port+0x228/0x2c0 [mlx5_ib]
mlx5_ib_stage_init_init+0x664/0x9d0 [mlx5_ib]
? idr_alloc_cyclic+0x50/0xb0
? __kmalloc_cache_noprof+0x167/0x340
? __kmalloc_noprof+0x1a7/0x430
__mlx5_ib_add+0x34/0xd0 [mlx5_ib]
mlx5r_probe+0xe9/0x310 [mlx5_ib]
? kernfs_add_one+0x107/0x150
? __mlx5_ib_add+0xd0/0xd0 [mlx5_ib]
auxiliary_bus_probe+0x3e/0x90
really_probe+0xc5/0x3a0
? driver_probe_device+0x90/0x90
__driver_probe_device+0x80/0x160
driver_probe_device+0x1e/0x90
__device_attach_driver+0x7d/0x100
bus_for_each_drv+0x80/0xd0
__device_attach+0xbc/0x1f0
bus_probe_device+0x86/0xa0
device_add+0x62d/0x830
__auxiliary_device_add+0x3b/0xa0
? auxiliary_device_init+0x41/0x90
add_adev+0xd1/0x150 [mlx5_core]
mlx5_rescan_drivers_locked+0x21c/0x300 [mlx5_core]
esw_mode_change+0x6c/0xc0 [mlx5_core]
mlx5_devlink_eswitch_mode_set+0x21e/0x640 [mlx5_core]
devlink_nl_eswitch_set_doit+0x60/0xe0
genl_family_rcv_msg_doit+0xd0/0x120
genl_rcv_msg+0x180/0x2b0
? devlink_get_from_attrs_lock+0x170/0x170
? devlink_nl_eswitch_get_doit+0x290/0x290
? devlink_nl_pre_doit_port_optional+0x50/0x50
? genl_family_rcv_msg_dumpit+0xf0/0xf0
netlink_rcv_skb+0x54/0x100
genl_rcv+0x24/0x40
netlink_unicast+0x1fc/0x2d0
netlink_sendmsg+0x1e4/0x410
__sock_sendmsg+0x38/0x60
? sockfd_lookup_light+0x12/0x60
__sys_sendto+0x105/0x160
? __sys_recvmsg+0x4e/0x90
__x64_sys_sendto+0x20/0x30
do_syscall_64+0x4c/0x100
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7f27bc91b13a
Code: bb 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 05 fa 96 2c 00 45 89 c9 4c 63 d1 48 63 ff 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
mm: prevent poison consumption when splitting THP
When performing memory error injection on a THP (Transparent Huge Page)
mapped to userspace on an x86 server, the kernel panics with the following
trace. The expected behavior is to terminate the affected process instead
of panicking the kernel, as the x86 Machine Check code can recover from an
in-userspace #MC.
mce: [Hardware Error]: CPU 0: Machine Check Exception: f Bank 3: bd80000000070134
mce: [Hardware Error]: RIP 10:<ffffffff8372f8bc> {memchr_inv+0x4c/0xf0}
mce: [Hardware Error]: TSC afff7bbff88a ADDR 1d301b000 MISC 80 PPIN 1e741e77539027db
mce: [Hardware Error]: PROCESSOR 0:d06d0 TIME 1758093249 SOCKET 0 APIC 0 microcode 80000320
mce: [Hardware Error]: Run the above through 'mcelog --ascii'
mce: [Hardware Error]: Machine check: Data load in unrecoverable area of kernel
Kernel panic - not syncing: Fatal local machine check
The root cause of this panic is that handling a memory failure triggered
by an in-userspace #MC necessitates splitting the THP. The splitting
process employs a mechanism, implemented in
try_to_map_unused_to_zeropage(), which reads the pages in the THP to
identify zero-filled pages. However, reading the pages in the THP results
in a second in-kernel #MC, occurring before the initial memory_failure()
completes, ultimately leading to a kernel panic. See the kernel panic
call trace on the two #MCs.
First Machine Check occurs // [1]
memory_failure() // [2]
try_to_split_thp_page()
split_huge_page()
split_huge_page_to_list_to_order()
__folio_split() // [3]
remap_page()
remove_migration_ptes()
remove_migration_pte()
try_to_map_unused_to_zeropage() // [4]
memchr_inv() // [5]
Second Machine Check occurs // [6]
Kernel panic
[1] Triggered by accessing a hardware-poisoned THP in userspace, which is
typically recoverable by terminating the affected process.
[2] Call folio_set_has_hwpoisoned() before try_to_split_thp_page().
[3] Pass the RMP_USE_SHARED_ZEROPAGE remap flag to remap_page().
[4] Try to map the unused THP to zeropage.
[5] Re-access pages in the hw-poisoned THP in the kernel.
[6] Triggered in-kernel, leading to a panic kernel.
In Step[2], memory_failure() sets the poisoned flag on the page in the THP
by TestSetPageHWPoison() before calling try_to_split_thp_page().
As suggested by David Hildenbrand, fix this panic by not accessing to the
poisoned page in the THP during zeropage identification, while continuing
to scan unaffected pages in the THP for possible zeropage mapping. This
prevents a second in-kernel #MC that would cause kernel panic in Step[4].
Thanks to Andrew Zaborowski for his initial work on fixing this issue. |
| In the Linux kernel, the following vulnerability has been resolved:
drm/panthor: Fix kernel panic on partial unmap of a GPU VA region
This commit address a kernel panic issue that can happen if Userspace
tries to partially unmap a GPU virtual region (aka drm_gpuva).
The VM_BIND interface allows partial unmapping of a BO.
Panthor driver pre-allocates memory for the new drm_gpuva structures
that would be needed for the map/unmap operation, done using drm_gpuvm
layer. It expected that only one new drm_gpuva would be needed on umap
but a partial unmap can require 2 new drm_gpuva and that's why it
ended up doing a NULL pointer dereference causing a kernel panic.
Following dump was seen when partial unmap was exercised.
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000078
Mem abort info:
ESR = 0x0000000096000046
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x06: level 2 translation fault
Data abort info:
ISV = 0, ISS = 0x00000046, ISS2 = 0x00000000
CM = 0, WnR = 1, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=000000088a863000
[000000000000078] pgd=080000088a842003, p4d=080000088a842003, pud=0800000884bf5003, pmd=0000000000000000
Internal error: Oops: 0000000096000046 [#1] PREEMPT SMP
<snip>
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : panthor_gpuva_sm_step_remap+0xe4/0x330 [panthor]
lr : panthor_gpuva_sm_step_remap+0x6c/0x330 [panthor]
sp : ffff800085d43970
x29: ffff800085d43970 x28: ffff00080363e440 x27: ffff0008090c6000
x26: 0000000000000030 x25: ffff800085d439f8 x24: ffff00080d402000
x23: ffff800085d43b60 x22: ffff800085d439e0 x21: ffff00080abdb180
x20: 0000000000000000 x19: 0000000000000000 x18: 0000000000000010
x17: 6e656c202c303030 x16: 3666666666646466 x15: 393d61766f69202c
x14: 312d3d7361203a70 x13: 303030323d6e656c x12: ffff80008324bf58
x11: 0000000000000003 x10: 0000000000000002 x9 : ffff8000801a6a9c
x8 : ffff00080360b300 x7 : 0000000000000000 x6 : 000000088aa35fc7
x5 : fff1000080000000 x4 : ffff8000842ddd30 x3 : 0000000000000001
x2 : 0000000100000000 x1 : 0000000000000001 x0 : 0000000000000078
Call trace:
panthor_gpuva_sm_step_remap+0xe4/0x330 [panthor]
op_remap_cb.isra.22+0x50/0x80
__drm_gpuvm_sm_unmap+0x10c/0x1c8
drm_gpuvm_sm_unmap+0x40/0x60
panthor_vm_exec_op+0xb4/0x3d0 [panthor]
panthor_vm_bind_exec_sync_op+0x154/0x278 [panthor]
panthor_ioctl_vm_bind+0x160/0x4a0 [panthor]
drm_ioctl_kernel+0xbc/0x138
drm_ioctl+0x240/0x500
__arm64_sys_ioctl+0xb0/0xf8
invoke_syscall+0x4c/0x110
el0_svc_common.constprop.1+0x98/0xf8
do_el0_svc+0x24/0x38
el0_svc+0x40/0xf8
el0t_64_sync_handler+0xa0/0xc8
el0t_64_sync+0x174/0x178 |
| In the Linux kernel, the following vulnerability has been resolved:
virtio-net: zero unused hash fields
When GSO tunnel is negotiated virtio_net_hdr_tnl_from_skb() tries to
initialize the tunnel metadata but forget to zero unused rxhash
fields. This may leak information to another side. Fixing this by
zeroing the unused hash fields. |
| In the Linux kernel, the following vulnerability has been resolved:
platform/x86: alienware-wmi-wmax: Fix NULL pointer dereference in sleep handlers
Devices without the AWCC interface don't initialize `awcc`. Add a check
before dereferencing it in sleep handlers. |
| LogStare Collector contains an incorrect authorization vulnerability in UserRegistration. If exploited, a non-administrative user may create a new user account by sending a crafted HTTP request. |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: ath9k: verify the expected usb_endpoints are present
The bug arises when a USB device claims to be an ATH9K but doesn't
have the expected endpoints. (In this case there was an interrupt
endpoint where the driver expected a bulk endpoint.) The kernel
needs to be able to handle such devices without getting an internal error.
usb 1-1: BOGUS urb xfer, pipe 3 != type 1
WARNING: CPU: 3 PID: 500 at drivers/usb/core/urb.c:493 usb_submit_urb+0xce2/0x1430 drivers/usb/core/urb.c:493
Modules linked in:
CPU: 3 PID: 500 Comm: kworker/3:2 Not tainted 5.10.135-syzkaller #0
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Workqueue: events request_firmware_work_func
RIP: 0010:usb_submit_urb+0xce2/0x1430 drivers/usb/core/urb.c:493
Call Trace:
ath9k_hif_usb_alloc_rx_urbs drivers/net/wireless/ath/ath9k/hif_usb.c:908 [inline]
ath9k_hif_usb_alloc_urbs+0x75e/0x1010 drivers/net/wireless/ath/ath9k/hif_usb.c:1019
ath9k_hif_usb_dev_init drivers/net/wireless/ath/ath9k/hif_usb.c:1109 [inline]
ath9k_hif_usb_firmware_cb+0x142/0x530 drivers/net/wireless/ath/ath9k/hif_usb.c:1242
request_firmware_work_func+0x12e/0x240 drivers/base/firmware_loader/main.c:1097
process_one_work+0x9af/0x1600 kernel/workqueue.c:2279
worker_thread+0x61d/0x12f0 kernel/workqueue.c:2425
kthread+0x3b4/0x4a0 kernel/kthread.c:313
ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:299
Found by Linux Verification Center (linuxtesting.org) with Syzkaller. |
| In the Linux kernel, the following vulnerability has been resolved:
slimbus: qcom-ngd: cleanup in probe error path
Add proper error path in probe() to cleanup resources previously
acquired/allocated to fix warnings visible during probe deferral:
notifier callback qcom_slim_ngd_ssr_notify already registered
WARNING: CPU: 6 PID: 70 at kernel/notifier.c:28 notifier_chain_register+0x5c/0x90
Modules linked in:
CPU: 6 PID: 70 Comm: kworker/u16:1 Not tainted 6.0.0-rc3-next-20220830 #380
Call trace:
notifier_chain_register+0x5c/0x90
srcu_notifier_chain_register+0x44/0x90
qcom_register_ssr_notifier+0x38/0x4c
qcom_slim_ngd_ctrl_probe+0xd8/0x400
platform_probe+0x6c/0xe0
really_probe+0xbc/0x2d4
__driver_probe_device+0x78/0xe0
driver_probe_device+0x3c/0x12c
__device_attach_driver+0xb8/0x120
bus_for_each_drv+0x78/0xd0
__device_attach+0xa8/0x1c0
device_initial_probe+0x18/0x24
bus_probe_device+0xa0/0xac
deferred_probe_work_func+0x88/0xc0
process_one_work+0x1d4/0x320
worker_thread+0x2cc/0x44c
kthread+0x110/0x114
ret_from_fork+0x10/0x20 |
| In the Linux kernel, the following vulnerability has been resolved:
md: Replace snprintf with scnprintf
Current code produces a warning as shown below when total characters
in the constituent block device names plus the slashes exceeds 200.
snprintf() returns the number of characters generated from the given
input, which could cause the expression “200 – len” to wrap around
to a large positive number. Fix this by using scnprintf() instead,
which returns the actual number of characters written into the buffer.
[ 1513.267938] ------------[ cut here ]------------
[ 1513.267943] WARNING: CPU: 15 PID: 37247 at <snip>/lib/vsprintf.c:2509 vsnprintf+0x2c8/0x510
[ 1513.267944] Modules linked in: <snip>
[ 1513.267969] CPU: 15 PID: 37247 Comm: mdadm Not tainted 5.4.0-1085-azure #90~18.04.1-Ubuntu
[ 1513.267969] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 05/09/2022
[ 1513.267971] RIP: 0010:vsnprintf+0x2c8/0x510
<-snip->
[ 1513.267982] Call Trace:
[ 1513.267986] snprintf+0x45/0x70
[ 1513.267990] ? disk_name+0x71/0xa0
[ 1513.267993] dump_zones+0x114/0x240 [raid0]
[ 1513.267996] ? _cond_resched+0x19/0x40
[ 1513.267998] raid0_run+0x19e/0x270 [raid0]
[ 1513.268000] md_run+0x5e0/0xc50
[ 1513.268003] ? security_capable+0x3f/0x60
[ 1513.268005] do_md_run+0x19/0x110
[ 1513.268006] md_ioctl+0x195e/0x1f90
[ 1513.268007] blkdev_ioctl+0x91f/0x9f0
[ 1513.268010] block_ioctl+0x3d/0x50
[ 1513.268012] do_vfs_ioctl+0xa9/0x640
[ 1513.268014] ? __fput+0x162/0x260
[ 1513.268016] ksys_ioctl+0x75/0x80
[ 1513.268017] __x64_sys_ioctl+0x1a/0x20
[ 1513.268019] do_syscall_64+0x5e/0x200
[ 1513.268021] entry_SYSCALL_64_after_hwframe+0x44/0xa9 |
| In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix extent map use-after-free when handling missing device in read_one_chunk
Store the error code before freeing the extent_map. Though it's
reference counted structure, in that function it's the first and last
allocation so this would lead to a potential use-after-free.
The error can happen eg. when chunk is stored on a missing device and
the degraded mount option is missing.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216721 |
| In the Linux kernel, the following vulnerability has been resolved:
iommu/omap: Fix buffer overflow in debugfs
There are two issues here:
1) The "len" variable needs to be checked before the very first write.
Otherwise if omap2_iommu_dump_ctx() with "bytes" less than 32 it is a
buffer overflow.
2) The snprintf() function returns the number of bytes that *would* have
been copied if there were enough space. But we want to know the
number of bytes which were *actually* copied so use scnprintf()
instead. |
| In the Linux kernel, the following vulnerability has been resolved:
lockd: set other missing fields when unlocking files
vfs_lock_file() expects the struct file_lock to be fully initialised by
the caller. Re-exported NFSv3 has been seen to Oops if the fl_file field
is NULL. |
| In the Linux kernel, the following vulnerability has been resolved:
drm/amdkfd: Fix double release compute pasid
If kfd_process_device_init_vm returns failure after vm is converted to
compute vm and vm->pasid set to compute pasid, KFD will not take
pdd->drm_file reference. As a result, drm close file handler maybe
called to release the compute pasid before KFD process destroy worker to
release the same pasid and set vm->pasid to zero, this generates below
WARNING backtrace and NULL pointer access.
Add helper amdgpu_amdkfd_gpuvm_set_vm_pasid and call it at the last step
of kfd_process_device_init_vm, to ensure vm pasid is the original pasid
if acquiring vm failed or is the compute pasid with pdd->drm_file
reference taken to avoid double release same pasid.
amdgpu: Failed to create process VM object
ida_free called for id=32770 which is not allocated.
WARNING: CPU: 57 PID: 72542 at ../lib/idr.c:522 ida_free+0x96/0x140
RIP: 0010:ida_free+0x96/0x140
Call Trace:
amdgpu_pasid_free_delayed+0xe1/0x2a0 [amdgpu]
amdgpu_driver_postclose_kms+0x2d8/0x340 [amdgpu]
drm_file_free.part.13+0x216/0x270 [drm]
drm_close_helper.isra.14+0x60/0x70 [drm]
drm_release+0x6e/0xf0 [drm]
__fput+0xcc/0x280
____fput+0xe/0x20
task_work_run+0x96/0xc0
do_exit+0x3d0/0xc10
BUG: kernel NULL pointer dereference, address: 0000000000000000
RIP: 0010:ida_free+0x76/0x140
Call Trace:
amdgpu_pasid_free_delayed+0xe1/0x2a0 [amdgpu]
amdgpu_driver_postclose_kms+0x2d8/0x340 [amdgpu]
drm_file_free.part.13+0x216/0x270 [drm]
drm_close_helper.isra.14+0x60/0x70 [drm]
drm_release+0x6e/0xf0 [drm]
__fput+0xcc/0x280
____fput+0xe/0x20
task_work_run+0x96/0xc0
do_exit+0x3d0/0xc10 |
| In the Linux kernel, the following vulnerability has been resolved:
mtd: core: fix possible resource leak in init_mtd()
I got the error report while inject fault in init_mtd():
sysfs: cannot create duplicate filename '/devices/virtual/bdi/mtd-0'
Call Trace:
<TASK>
dump_stack_lvl+0x67/0x83
sysfs_warn_dup+0x60/0x70
sysfs_create_dir_ns+0x109/0x120
kobject_add_internal+0xce/0x2f0
kobject_add+0x98/0x110
device_add+0x179/0xc00
device_create_groups_vargs+0xf4/0x100
device_create+0x7b/0xb0
bdi_register_va.part.13+0x58/0x2d0
bdi_register+0x9b/0xb0
init_mtd+0x62/0x171 [mtd]
do_one_initcall+0x6c/0x3c0
do_init_module+0x58/0x222
load_module+0x268e/0x27d0
__do_sys_finit_module+0xd5/0x140
do_syscall_64+0x37/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
</TASK>
kobject_add_internal failed for mtd-0 with -EEXIST, don't try to register
things with the same name in the same directory.
Error registering mtd class or bdi: -17
If init_mtdchar() fails in init_mtd(), mtd_bdi will not be unregistered,
as a result, we can't load the mtd module again, to fix this by calling
bdi_unregister(mtd_bdi) after out_procfs label. |
| In the Linux kernel, the following vulnerability has been resolved:
ASoC: sof_es8336: fix possible use-after-free in sof_es8336_remove()
sof_es8336_remove() calls cancel_delayed_work(). However, that
function does not wait until the work function finishes. This
means that the callback function may still be running after
the driver's remove function has finished, which would result
in a use-after-free.
Fix by calling cancel_delayed_work_sync(), which ensures that
the work is properly cancelled, no longer running, and unable
to re-schedule itself. |
| In the Linux kernel, the following vulnerability has been resolved:
s390/cio: fix out-of-bounds access on cio_ignore free
The channel-subsystem-driver scans for newly available devices whenever
device-IDs are removed from the cio_ignore list using a command such as:
echo free >/proc/cio_ignore
Since an I/O device scan might interfer with running I/Os, commit
172da89ed0ea ("s390/cio: avoid excessive path-verification requests")
introduced an optimization to exclude online devices from the scan.
The newly added check for online devices incorrectly assumes that
an I/O-subchannel's drvdata points to a struct io_subchannel_private.
For devices that are bound to a non-default I/O subchannel driver, such
as the vfio_ccw driver, this results in an out-of-bounds read access
during each scan.
Fix this by changing the scan logic to rely on a driver-independent
online indication. For this we can use struct subchannel->config.ena,
which is the driver's requested subchannel-enabled state. Since I/Os
can only be started on enabled subchannels, this matches the intent
of the original optimization of not scanning devices where I/O might
be running. |