| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
nilfs2: avoid having an active sc_timer before freeing sci
Because kthread_stop did not stop sc_task properly and returned -EINTR,
the sc_timer was not properly closed, ultimately causing the problem [1]
reported by syzbot when freeing sci due to the sc_timer not being closed.
Because the thread sc_task main function nilfs_segctor_thread() returns 0
when it succeeds, when the return value of kthread_stop() is not 0 in
nilfs_segctor_destroy(), we believe that it has not properly closed
sc_timer.
We use timer_shutdown_sync() to sync wait for sc_timer to shutdown, and
set the value of sc_task to NULL under the protection of lock
sc_state_lock, so as to avoid the issue caused by sc_timer not being
properly shutdowned.
[1]
ODEBUG: free active (active state 0) object: 00000000dacb411a object type: timer_list hint: nilfs_construction_timeout
Call trace:
nilfs_segctor_destroy fs/nilfs2/segment.c:2811 [inline]
nilfs_detach_log_writer+0x668/0x8cc fs/nilfs2/segment.c:2877
nilfs_put_super+0x4c/0x12c fs/nilfs2/super.c:509 |
| In the Linux kernel, the following vulnerability has been resolved:
mtdchar: fix integer overflow in read/write ioctls
The "req.start" and "req.len" variables are u64 values that come from the
user at the start of the function. We mask away the high 32 bits of
"req.len" so that's capped at U32_MAX but the "req.start" variable can go
up to U64_MAX which means that the addition can still integer overflow.
Use check_add_overflow() to fix this bug. |
| In the Linux kernel, the following vulnerability has been resolved:
scsi: target: tcm_loop: Fix segfault in tcm_loop_tpg_address_show()
If the allocation of tl_hba->sh fails in tcm_loop_driver_probe() and we
attempt to dereference it in tcm_loop_tpg_address_show() we will get a
segfault, see below for an example. So, check tl_hba->sh before
dereferencing it.
Unable to allocate struct scsi_host
BUG: kernel NULL pointer dereference, address: 0000000000000194
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 8356 Comm: tokio-runtime-w Not tainted 6.6.104.2-4.azl3 #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 09/28/2024
RIP: 0010:tcm_loop_tpg_address_show+0x2e/0x50 [tcm_loop]
...
Call Trace:
<TASK>
configfs_read_iter+0x12d/0x1d0 [configfs]
vfs_read+0x1b5/0x300
ksys_read+0x6f/0xf0
... |
| In the Linux kernel, the following vulnerability has been resolved:
mptcp: Initialise rcv_mss before calling tcp_send_active_reset() in mptcp_do_fastclose().
syzbot reported divide-by-zero in __tcp_select_window() by
MPTCP socket. [0]
We had a similar issue for the bare TCP and fixed in commit
499350a5a6e7 ("tcp: initialize rcv_mss to TCP_MIN_MSS instead
of 0").
Let's apply the same fix to mptcp_do_fastclose().
[0]:
Oops: divide error: 0000 [#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 6068 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
RIP: 0010:__tcp_select_window+0x824/0x1320 net/ipv4/tcp_output.c:3336
Code: ff ff ff 44 89 f1 d3 e0 89 c1 f7 d1 41 01 cc 41 21 c4 e9 a9 00 00 00 e8 ca 49 01 f8 e9 9c 00 00 00 e8 c0 49 01 f8 44 89 e0 99 <f7> 7c 24 1c 41 29 d4 48 bb 00 00 00 00 00 fc ff df e9 80 00 00 00
RSP: 0018:ffffc90003017640 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88807b469e40
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc90003017730 R08: ffff888033268143 R09: 1ffff1100664d028
R10: dffffc0000000000 R11: ffffed100664d029 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 000055557faa0500(0000) GS:ffff888126135000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f64a1912ff8 CR3: 0000000072122000 CR4: 00000000003526f0
Call Trace:
<TASK>
tcp_select_window net/ipv4/tcp_output.c:281 [inline]
__tcp_transmit_skb+0xbc7/0x3aa0 net/ipv4/tcp_output.c:1568
tcp_transmit_skb net/ipv4/tcp_output.c:1649 [inline]
tcp_send_active_reset+0x2d1/0x5b0 net/ipv4/tcp_output.c:3836
mptcp_do_fastclose+0x27e/0x380 net/mptcp/protocol.c:2793
mptcp_disconnect+0x238/0x710 net/mptcp/protocol.c:3253
mptcp_sendmsg_fastopen+0x2f8/0x580 net/mptcp/protocol.c:1776
mptcp_sendmsg+0x1774/0x1980 net/mptcp/protocol.c:1855
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg+0xe5/0x270 net/socket.c:742
__sys_sendto+0x3bd/0x520 net/socket.c:2244
__do_sys_sendto net/socket.c:2251 [inline]
__se_sys_sendto net/socket.c:2247 [inline]
__x64_sys_sendto+0xde/0x100 net/socket.c:2247
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f66e998f749
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffff9acedb8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007f66e9be5fa0 RCX: 00007f66e998f749
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 00007ffff9acee10 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007f66e9be5fa0 R14: 00007f66e9be5fa0 R15: 0000000000000006
</TASK> |
| In the Linux kernel, the following vulnerability has been resolved:
drm/tegra: Add call to put_pid()
Add a call to put_pid() corresponding to get_task_pid().
host1x_memory_context_alloc() does not take ownership of the PID so we
need to free it here to avoid leaking.
[mperttunen@nvidia.com: reword commit message] |
| In the Linux kernel, the following vulnerability has been resolved:
drm/amdgpu: fix gpu page fault after hibernation on PF passthrough
On PF passthrough environment, after hibernate and then resume, coralgemm
will cause gpu page fault.
Mode1 reset happens during hibernate, but partition mode is not restored
on resume, register mmCP_HYP_XCP_CTL and mmCP_PSP_XCP_CTL is not right
after resume. When CP access the MQD BO, wrong stride size is used,
this will cause out of bound access on the MQD BO, resulting page fault.
The fix is to ensure gfx_v9_4_3_switch_compute_partition() is called
when resume from a hibernation.
KFD resume is called separately during a reset recovery or resume from
suspend sequence. Hence it's not required to be called as part of
partition switch.
(cherry picked from commit 5d1b32cfe4a676fe552416cb5ae847b215463a1a) |
| In the Linux kernel, the following vulnerability has been resolved:
pinctrl: s32cc: fix uninitialized memory in s32_pinctrl_desc
s32_pinctrl_desc is allocated with devm_kmalloc(), but not all of its
fields are initialized. Notably, num_custom_params is used in
pinconf_generic_parse_dt_config(), resulting in intermittent allocation
errors, such as the following splat when probing i2c-imx:
WARNING: CPU: 0 PID: 176 at mm/page_alloc.c:4795 __alloc_pages_noprof+0x290/0x300
[...]
Hardware name: NXP S32G3 Reference Design Board 3 (S32G-VNP-RDB3) (DT)
[...]
Call trace:
__alloc_pages_noprof+0x290/0x300 (P)
___kmalloc_large_node+0x84/0x168
__kmalloc_large_node_noprof+0x34/0x120
__kmalloc_noprof+0x2ac/0x378
pinconf_generic_parse_dt_config+0x68/0x1a0
s32_dt_node_to_map+0x104/0x248
dt_to_map_one_config+0x154/0x1d8
pinctrl_dt_to_map+0x12c/0x280
create_pinctrl+0x6c/0x270
pinctrl_get+0xc0/0x170
devm_pinctrl_get+0x50/0xa0
pinctrl_bind_pins+0x60/0x2a0
really_probe+0x60/0x3a0
[...]
__platform_driver_register+0x2c/0x40
i2c_adap_imx_init+0x28/0xff8 [i2c_imx]
[...]
This results in later parse failures that can cause issues in dependent
drivers:
s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c0-pins/i2c0-grp0: could not parse node property
s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c0-pins/i2c0-grp0: could not parse node property
[...]
pca953x 0-0022: failed writing register: -6
i2c i2c-0: IMX I2C adapter registered
s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c2-pins/i2c2-grp0: could not parse node property
s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c2-pins/i2c2-grp0: could not parse node property
i2c i2c-1: IMX I2C adapter registered
s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c4-pins/i2c4-grp0: could not parse node property
s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c4-pins/i2c4-grp0: could not parse node property
i2c i2c-2: IMX I2C adapter registered
Fix this by initializing s32_pinctrl_desc with devm_kzalloc() instead of
devm_kmalloc() in s32_pinctrl_probe(), which sets the previously
uninitialized fields to zero. |
| In the Linux kernel, the following vulnerability has been resolved:
cifs: fix memory leak in smb3_fs_context_parse_param error path
Add proper cleanup of ctx->source and fc->source to the
cifs_parse_mount_err error handler. This ensures that memory allocated
for the source strings is correctly freed on all error paths, matching
the cleanup already performed in the success path by
smb3_cleanup_fs_context_contents().
Pointers are also set to NULL after freeing to prevent potential
double-free issues.
This change fixes a memory leak originally detected by syzbot. The
leak occurred when processing Opt_source mount options if an error
happened after ctx->source and fc->source were successfully
allocated but before the function completed.
The specific leak sequence was:
1. ctx->source = smb3_fs_context_fullpath(ctx, '/') allocates memory
2. fc->source = kstrdup(ctx->source, GFP_KERNEL) allocates more memory
3. A subsequent error jumps to cifs_parse_mount_err
4. The old error handler freed passwords but not the source strings,
causing the memory to leak.
This issue was not addressed by commit e8c73eb7db0a ("cifs: client:
fix memory leak in smb3_fs_context_parse_param"), which only fixed
leaks from repeated fsconfig() calls but not this error path.
Patch updated with minor change suggested by kernel test robot |
| In the Linux kernel, the following vulnerability has been resolved:
drm/radeon: delete radeon_fence_process in is_signaled, no deadlock
Delete the attempt to progress the queue when checking if fence is
signaled. This avoids deadlock.
dma-fence_ops::signaled can be called with the fence lock in unknown
state. For radeon, the fence lock is also the wait queue lock. This can
cause a self deadlock when signaled() tries to make forward progress on
the wait queue. But advancing the queue is unneeded because incorrectly
returning false from signaled() is perfectly acceptable.
(cherry picked from commit 527ba26e50ec2ca2be9c7c82f3ad42998a75d0db) |
| In the Linux kernel, the following vulnerability has been resolved:
scsi: core: Fix a regression triggered by scsi_host_busy()
Commit 995412e23bb2 ("blk-mq: Replace tags->lock with SRCU for tag
iterators") introduced the following regression:
Call trace:
__srcu_read_lock+0x30/0x80 (P)
blk_mq_tagset_busy_iter+0x44/0x300
scsi_host_busy+0x38/0x70
ufshcd_print_host_state+0x34/0x1bc
ufshcd_link_startup.constprop.0+0xe4/0x2e0
ufshcd_init+0x944/0xf80
ufshcd_pltfrm_init+0x504/0x820
ufs_rockchip_probe+0x2c/0x88
platform_probe+0x5c/0xa4
really_probe+0xc0/0x38c
__driver_probe_device+0x7c/0x150
driver_probe_device+0x40/0x120
__driver_attach+0xc8/0x1e0
bus_for_each_dev+0x7c/0xdc
driver_attach+0x24/0x30
bus_add_driver+0x110/0x230
driver_register+0x68/0x130
__platform_driver_register+0x20/0x2c
ufs_rockchip_pltform_init+0x1c/0x28
do_one_initcall+0x60/0x1e0
kernel_init_freeable+0x248/0x2c4
kernel_init+0x20/0x140
ret_from_fork+0x10/0x20
Fix this regression by making scsi_host_busy() check whether the SCSI
host tag set has already been initialized. tag_set->ops is set by
scsi_mq_setup_tags() just before blk_mq_alloc_tag_set() is called. This
fix is based on the assumption that scsi_host_busy() and
scsi_mq_setup_tags() calls are serialized. This is the case in the UFS
driver. |
| In the Linux kernel, the following vulnerability has been resolved:
comedi: multiq3: sanitize config options in multiq3_attach()
Syzbot identified an issue [1] in multiq3_attach() that induces a
task timeout due to open() or COMEDI_DEVCONFIG ioctl operations,
specifically, in the case of multiq3 driver.
This problem arose when syzkaller managed to craft weird configuration
options used to specify the number of channels in encoder subdevice.
If a particularly great number is passed to s->n_chan in
multiq3_attach() via it->options[2], then multiple calls to
multiq3_encoder_reset() at the end of driver-specific attach() method
will be running for minutes, thus blocking tasks and affected devices
as well.
While this issue is most likely not too dangerous for real-life
devices, it still makes sense to sanitize configuration inputs. Enable
a sensible limit on the number of encoder chips (4 chips max, each
with 2 channels) to stop this behaviour from manifesting.
[1] Syzbot crash:
INFO: task syz.2.19:6067 blocked for more than 143 seconds.
...
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5254 [inline]
__schedule+0x17c4/0x4d60 kernel/sched/core.c:6862
__schedule_loop kernel/sched/core.c:6944 [inline]
schedule+0x165/0x360 kernel/sched/core.c:6959
schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:7016
__mutex_lock_common kernel/locking/mutex.c:676 [inline]
__mutex_lock+0x7e6/0x1350 kernel/locking/mutex.c:760
comedi_open+0xc0/0x590 drivers/comedi/comedi_fops.c:2868
chrdev_open+0x4cc/0x5e0 fs/char_dev.c:414
do_dentry_open+0x953/0x13f0 fs/open.c:965
vfs_open+0x3b/0x340 fs/open.c:1097
... |
| In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: hci_sock: Prevent race in socket write iter and sock bind
There is a potential race condition between sock bind and socket write
iter. bind may free the same cmd via mgmt_pending before write iter sends
the cmd, just as syzbot reported in UAF[1].
Here we use hci_dev_lock to synchronize the two, thereby avoiding the
UAF mentioned in [1].
[1]
syzbot reported:
BUG: KASAN: slab-use-after-free in mgmt_pending_remove+0x3b/0x210 net/bluetooth/mgmt_util.c:316
Read of size 8 at addr ffff888077164818 by task syz.0.17/5989
Call Trace:
mgmt_pending_remove+0x3b/0x210 net/bluetooth/mgmt_util.c:316
set_link_security+0x5c2/0x710 net/bluetooth/mgmt.c:1918
hci_mgmt_cmd+0x9c9/0xef0 net/bluetooth/hci_sock.c:1719
hci_sock_sendmsg+0x6ca/0xef0 net/bluetooth/hci_sock.c:1839
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg+0x21c/0x270 net/socket.c:742
sock_write_iter+0x279/0x360 net/socket.c:1195
Allocated by task 5989:
mgmt_pending_add+0x35/0x140 net/bluetooth/mgmt_util.c:296
set_link_security+0x557/0x710 net/bluetooth/mgmt.c:1910
hci_mgmt_cmd+0x9c9/0xef0 net/bluetooth/hci_sock.c:1719
hci_sock_sendmsg+0x6ca/0xef0 net/bluetooth/hci_sock.c:1839
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg+0x21c/0x270 net/socket.c:742
sock_write_iter+0x279/0x360 net/socket.c:1195
Freed by task 5991:
mgmt_pending_free net/bluetooth/mgmt_util.c:311 [inline]
mgmt_pending_foreach+0x30d/0x380 net/bluetooth/mgmt_util.c:257
mgmt_index_removed+0x112/0x2f0 net/bluetooth/mgmt.c:9477
hci_sock_bind+0xbe9/0x1000 net/bluetooth/hci_sock.c:1314 |
| In the Linux kernel, the following vulnerability has been resolved:
io_uring/cmd_net: fix wrong argument types for skb_queue_splice()
If timestamp retriving needs to be retried and the local list of
SKB's already has entries, then it's spliced back into the socket
queue. However, the arguments for the splice helper are transposed,
causing exactly the wrong direction of splicing into the on-stack
list. Fix that up. |
| In the Linux kernel, the following vulnerability has been resolved:
parisc: Avoid crash due to unaligned access in unwinder
Guenter Roeck reported this kernel crash on his emulated B160L machine:
Starting network: udhcpc: started, v1.36.1
Backtrace:
[<104320d4>] unwind_once+0x1c/0x5c
[<10434a00>] walk_stackframe.isra.0+0x74/0xb8
[<10434a6c>] arch_stack_walk+0x28/0x38
[<104e5efc>] stack_trace_save+0x48/0x5c
[<105d1bdc>] set_track_prepare+0x44/0x6c
[<105d9c80>] ___slab_alloc+0xfc4/0x1024
[<105d9d38>] __slab_alloc.isra.0+0x58/0x90
[<105dc80c>] kmem_cache_alloc_noprof+0x2ac/0x4a0
[<105b8e54>] __anon_vma_prepare+0x60/0x280
[<105a823c>] __vmf_anon_prepare+0x68/0x94
[<105a8b34>] do_wp_page+0x8cc/0xf10
[<105aad88>] handle_mm_fault+0x6c0/0xf08
[<10425568>] do_page_fault+0x110/0x440
[<10427938>] handle_interruption+0x184/0x748
[<11178398>] schedule+0x4c/0x190
BUG: spinlock recursion on CPU#0, ifconfig/2420
lock: terminate_lock.2+0x0/0x1c, .magic: dead4ead, .owner: ifconfig/2420, .owner_cpu: 0
While creating the stack trace, the unwinder uses the stack pointer to guess
the previous frame to read the previous stack pointer from memory. The crash
happens, because the unwinder tries to read from unaligned memory and as such
triggers the unalignment trap handler which then leads to the spinlock
recursion and finally to a deadlock.
Fix it by checking the alignment before accessing the memory. |
| In the Linux kernel, the following vulnerability has been resolved:
ASoC: SDCA: bug fix while parsing mipi-sdca-control-cn-list
"struct sdca_control" declares "values" field as integer array.
But the memory allocated to it is of char array. This causes
crash for sdca_parse_function API. This patch addresses the
issue by allocating correct data size. |
| In the Linux kernel, the following vulnerability has been resolved:
s390/pci: Avoid deadlock between PCI error recovery and mlx5 crdump
Do not block PCI config accesses through pci_cfg_access_lock() when
executing the s390 variant of PCI error recovery: Acquire just
device_lock() instead of pci_dev_lock() as powerpc's EEH and
generig PCI AER processing do.
During error recovery testing a pair of tasks was reported to be hung:
mlx5_core 0000:00:00.1: mlx5_health_try_recover:338:(pid 5553): health recovery flow aborted, PCI reads still not working
INFO: task kmcheck:72 blocked for more than 122 seconds.
Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kmcheck state:D stack:0 pid:72 tgid:72 ppid:2 flags:0x00000000
Call Trace:
[<000000065256f030>] __schedule+0x2a0/0x590
[<000000065256f356>] schedule+0x36/0xe0
[<000000065256f572>] schedule_preempt_disabled+0x22/0x30
[<0000000652570a94>] __mutex_lock.constprop.0+0x484/0x8a8
[<000003ff800673a4>] mlx5_unload_one+0x34/0x58 [mlx5_core]
[<000003ff8006745c>] mlx5_pci_err_detected+0x94/0x140 [mlx5_core]
[<0000000652556c5a>] zpci_event_attempt_error_recovery+0xf2/0x398
[<0000000651b9184a>] __zpci_event_error+0x23a/0x2c0
INFO: task kworker/u1664:6:1514 blocked for more than 122 seconds.
Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u1664:6 state:D stack:0 pid:1514 tgid:1514 ppid:2 flags:0x00000000
Workqueue: mlx5_health0000:00:00.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
Call Trace:
[<000000065256f030>] __schedule+0x2a0/0x590
[<000000065256f356>] schedule+0x36/0xe0
[<0000000652172e28>] pci_wait_cfg+0x80/0xe8
[<0000000652172f94>] pci_cfg_access_lock+0x74/0x88
[<000003ff800916b6>] mlx5_vsc_gw_lock+0x36/0x178 [mlx5_core]
[<000003ff80098824>] mlx5_crdump_collect+0x34/0x1c8 [mlx5_core]
[<000003ff80074b62>] mlx5_fw_fatal_reporter_dump+0x6a/0xe8 [mlx5_core]
[<0000000652512242>] devlink_health_do_dump.part.0+0x82/0x168
[<0000000652513212>] devlink_health_report+0x19a/0x230
[<000003ff80075a12>] mlx5_fw_fatal_reporter_err_work+0xba/0x1b0 [mlx5_core]
No kernel log of the exact same error with an upstream kernel is
available - but the very same deadlock situation can be constructed there,
too:
- task: kmcheck
mlx5_unload_one() tries to acquire devlink lock while the PCI error
recovery code has set pdev->block_cfg_access by way of
pci_cfg_access_lock()
- task: kworker
mlx5_crdump_collect() tries to set block_cfg_access through
pci_cfg_access_lock() while devlink_health_report() had acquired
the devlink lock.
A similar deadlock situation can be reproduced by requesting a
crdump with
> devlink health dump show pci/<BDF> reporter fw_fatal
while PCI error recovery is executed on the same <BDF> physical function
by mlx5_core's pci_error_handlers. On s390 this can be injected with
> zpcictl --reset-fw <BDF>
Tests with this patch failed to reproduce that second deadlock situation,
the devlink command is rejected with "kernel answers: Permission denied" -
and we get a kernel log message of:
mlx5_core 1ed0:00:00.1: mlx5_crdump_collect:50:(pid 254382): crdump: failed to lock vsc gw err -5
because the config read of VSC_SEMAPHORE is rejected by the underlying
hardware.
Two prior attempts to address this issue have been discussed and
ultimately rejected [see link], with the primary argument that s390's
implementation of PCI error recovery is imposing restrictions that
neither powerpc's EEH nor PCI AER handling need. Tests show that PCI
error recovery on s390 is running to completion even without blocking
access to PCI config space. |
| In the Linux kernel, the following vulnerability has been resolved:
net: ethernet: ti: netcp: Standardize knav_dma_open_channel to return NULL on error
Make knav_dma_open_channel consistently return NULL on error instead
of ERR_PTR. Currently the header include/linux/soc/ti/knav_dma.h
returns NULL when the driver is disabled, but the driver
implementation does not even return NULL or ERR_PTR on failure,
causing inconsistency in the users. This results in a crash in
netcp_free_navigator_resources as followed (trimmed):
Unhandled fault: alignment exception (0x221) at 0xfffffff2
[fffffff2] *pgd=80000800207003, *pmd=82ffda003, *pte=00000000
Internal error: : 221 [#1] SMP ARM
Modules linked in:
CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.17.0-rc7 #1 NONE
Hardware name: Keystone
PC is at knav_dma_close_channel+0x30/0x19c
LR is at netcp_free_navigator_resources+0x2c/0x28c
[... TRIM...]
Call trace:
knav_dma_close_channel from netcp_free_navigator_resources+0x2c/0x28c
netcp_free_navigator_resources from netcp_ndo_open+0x430/0x46c
netcp_ndo_open from __dev_open+0x114/0x29c
__dev_open from __dev_change_flags+0x190/0x208
__dev_change_flags from netif_change_flags+0x1c/0x58
netif_change_flags from dev_change_flags+0x38/0xa0
dev_change_flags from ip_auto_config+0x2c4/0x11f0
ip_auto_config from do_one_initcall+0x58/0x200
do_one_initcall from kernel_init_freeable+0x1cc/0x238
kernel_init_freeable from kernel_init+0x1c/0x12c
kernel_init from ret_from_fork+0x14/0x38
[... TRIM...]
Standardize the error handling by making the function return NULL on
all error conditions. The API is used in just the netcp_core.c so the
impact is limited.
Note, this change, in effect reverts commit 5b6cb43b4d62 ("net:
ethernet: ti: netcp_core: return error while dma channel open issue"),
but provides a less error prone implementation. |
| In the Linux kernel, the following vulnerability has been resolved:
LoongArch: BPF: Disable trampoline for kernel module function trace
The current LoongArch BPF trampoline implementation is incompatible
with tracing functions in kernel modules. This causes several severe
and user-visible problems:
* The `bpf_selftests/module_attach` test fails consistently.
* Kernel lockup when a BPF program is attached to a module function [1].
* Critical kernel modules like WireGuard experience traffic disruption
when their functions are traced with fentry [2].
Given the severity and the potential for other unknown side-effects, it
is safest to disable the feature entirely for now. This patch prevents
the BPF subsystem from allowing trampoline attachments to kernel module
functions on LoongArch.
This is a temporary mitigation until the core issues in the trampoline
code for kernel module handling can be identified and fixed.
[root@fedora bpf]# ./test_progs -a module_attach -v
bpf_testmod.ko is already unloaded.
Loading bpf_testmod.ko...
Successfully loaded bpf_testmod.ko.
test_module_attach:PASS:skel_open 0 nsec
test_module_attach:PASS:set_attach_target 0 nsec
test_module_attach:PASS:set_attach_target_explicit 0 nsec
test_module_attach:PASS:skel_load 0 nsec
libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
test_module_attach:FAIL:skel_attach skeleton attach failed: -524
Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
Successfully unloaded bpf_testmod.ko.
[1]: https://lore.kernel.org/loongarch/CAK3+h2wDmpC-hP4u4pJY8T-yfKyk4yRzpu2LMO+C13FMT58oqQ@mail.gmail.com/
[2]: https://lore.kernel.org/loongarch/CAK3+h2wYcpc+OwdLDUBvg2rF9rvvyc5amfHT-KcFaK93uoELPg@mail.gmail.com/ |
| In the Linux kernel, the following vulnerability has been resolved:
veth: more robust handing of race to avoid txq getting stuck
Commit dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring to
reduce TX drops") introduced a race condition that can lead to a permanently
stalled TXQ. This was observed in production on ARM64 systems (Ampere Altra
Max).
The race occurs in veth_xmit(). The producer observes a full ptr_ring and
stops the queue (netif_tx_stop_queue()). The subsequent conditional logic,
intended to re-wake the queue if the consumer had just emptied it (if
(__ptr_ring_empty(...)) netif_tx_wake_queue()), can fail. This leads to a
"lost wakeup" where the TXQ remains stopped (QUEUE_STATE_DRV_XOFF) and
traffic halts.
This failure is caused by an incorrect use of the __ptr_ring_empty() API
from the producer side. As noted in kernel comments, this check is not
guaranteed to be correct if a consumer is operating on another CPU. The
empty test is based on ptr_ring->consumer_head, making it reliable only for
the consumer. Using this check from the producer side is fundamentally racy.
This patch fixes the race by adopting the more robust logic from an earlier
version V4 of the patchset, which always flushed the peer:
(1) In veth_xmit(), the racy conditional wake-up logic and its memory barrier
are removed. Instead, after stopping the queue, we unconditionally call
__veth_xdp_flush(rq). This guarantees that the NAPI consumer is scheduled,
making it solely responsible for re-waking the TXQ.
This handles the race where veth_poll() consumes all packets and completes
NAPI *before* veth_xmit() on the producer side has called netif_tx_stop_queue.
The __veth_xdp_flush(rq) will observe rx_notify_masked is false and schedule
NAPI.
(2) On the consumer side, the logic for waking the peer TXQ is moved out of
veth_xdp_rcv() and placed at the end of the veth_poll() function. This
placement is part of fixing the race, as the netif_tx_queue_stopped() check
must occur after rx_notify_masked is potentially set to false during NAPI
completion.
This handles the race where veth_poll() consumes all packets, but haven't
finished (rx_notify_masked is still true). The producer veth_xmit() stops the
TXQ and __veth_xdp_flush(rq) will observe rx_notify_masked is true, meaning
not starting NAPI. Then veth_poll() change rx_notify_masked to false and
stops NAPI. Before exiting veth_poll() will observe TXQ is stopped and wake
it up. |
| In the Linux kernel, the following vulnerability has been resolved:
lib/test_kho: check if KHO is enabled
We must check whether KHO is enabled prior to issuing KHO commands,
otherwise KHO internal data structures are not initialized. |