In the Linux kernel, the following vulnerability has been resolved: perf/x86/amd: Fix crash due to race between amd_pmu_enable_all, perf NMI and throttling amd_pmu_enable_all() does: if (!test_bit(idx, cpuc->active_mask)) continue; amd_pmu_enable_event(cpuc->events[idx]); A perf NMI of another event can come between these two steps. Perf NMI handler internally disables and enables _all_ events, including the one which nmi-intercepted amd_pmu_enable_all() was in process of enabling. If that unintentionally enabled event has very low sampling period and causes immediate successive NMI, causing the event to be throttled, cpuc->events[idx] and cpuc->active_mask gets cleared by x86_pmu_stop(). This will result in amd_pmu_enable_event() getting called with event=NULL when amd_pmu_enable_all() resumes after handling the NMIs. This causes a kernel crash: BUG: kernel NULL pointer dereference, address: 0000000000000198 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page [...] Call Trace: <TASK> amd_pmu_enable_all+0x68/0xb0 ctx_resched+0xd9/0x150 event_function+0xb8/0x130 ? hrtimer_start_range_ns+0x141/0x4a0 ? perf_duration_warn+0x30/0x30 remote_function+0x4d/0x60 __flush_smp_call_function_queue+0xc4/0x500 flush_smp_call_function_queue+0x11d/0x1b0 do_idle+0x18f/0x2d0 cpu_startup_entry+0x19/0x20 start_secondary+0x121/0x160 secondary_startup_64_no_verify+0xe5/0xeb </TASK> amd_pmu_disable_all()/amd_pmu_enable_all() calls inside perf NMI handler were recently added as part of BRS enablement but I'm not sure whether we really need them. We can just disable BRS in the beginning and enable it back while returning from NMI. This will solve the issue by not enabling those events whose active_masks are set but are not yet enabled in hw pmu.
History

Wed, 18 Jun 2025 10:45:00 +0000

Type Values Removed Values Added
Metrics cvssV3_1

{'score': 7.0, 'vector': 'CVSS:3.1/AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:H/A:H'}

cvssV3_1

{'score': 4.1, 'vector': 'CVSS:3.1/AV:L/AC:H/PR:H/UI:N/S:U/C:N/I:N/A:H'}


Sat, 07 Jun 2025 03:00:00 +0000

Type Values Removed Values Added
Metrics cvssV3_1

{'score': 5.5, 'vector': 'CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H'}

cvssV3_1

{'score': 7.0, 'vector': 'CVSS:3.1/AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:H/A:H'}


Fri, 02 May 2025 13:45:00 +0000

Type Values Removed Values Added
References
Metrics threat_severity

None

cvssV3_1

{'score': 5.5, 'vector': 'CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H'}

threat_severity

Moderate


Thu, 01 May 2025 14:30:00 +0000

Type Values Removed Values Added
Description In the Linux kernel, the following vulnerability has been resolved: perf/x86/amd: Fix crash due to race between amd_pmu_enable_all, perf NMI and throttling amd_pmu_enable_all() does: if (!test_bit(idx, cpuc->active_mask)) continue; amd_pmu_enable_event(cpuc->events[idx]); A perf NMI of another event can come between these two steps. Perf NMI handler internally disables and enables _all_ events, including the one which nmi-intercepted amd_pmu_enable_all() was in process of enabling. If that unintentionally enabled event has very low sampling period and causes immediate successive NMI, causing the event to be throttled, cpuc->events[idx] and cpuc->active_mask gets cleared by x86_pmu_stop(). This will result in amd_pmu_enable_event() getting called with event=NULL when amd_pmu_enable_all() resumes after handling the NMIs. This causes a kernel crash: BUG: kernel NULL pointer dereference, address: 0000000000000198 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page [...] Call Trace: <TASK> amd_pmu_enable_all+0x68/0xb0 ctx_resched+0xd9/0x150 event_function+0xb8/0x130 ? hrtimer_start_range_ns+0x141/0x4a0 ? perf_duration_warn+0x30/0x30 remote_function+0x4d/0x60 __flush_smp_call_function_queue+0xc4/0x500 flush_smp_call_function_queue+0x11d/0x1b0 do_idle+0x18f/0x2d0 cpu_startup_entry+0x19/0x20 start_secondary+0x121/0x160 secondary_startup_64_no_verify+0xe5/0xeb </TASK> amd_pmu_disable_all()/amd_pmu_enable_all() calls inside perf NMI handler were recently added as part of BRS enablement but I'm not sure whether we really need them. We can just disable BRS in the beginning and enable it back while returning from NMI. This will solve the issue by not enabling those events whose active_masks are set but are not yet enabled in hw pmu.
Title perf/x86/amd: Fix crash due to race between amd_pmu_enable_all, perf NMI and throttling
References

cve-icon MITRE

Status: PUBLISHED

Assigner: Linux

Published: 2025-05-01T14:09:15.775Z

Updated: 2025-05-04T08:45:14.518Z

Reserved: 2025-04-16T07:17:33.806Z

Link: CVE-2022-49781

cve-icon Vulnrichment

No data.

cve-icon NVD

Status : Awaiting Analysis

Published: 2025-05-01T15:16:01.307

Modified: 2025-05-02T13:53:20.943

Link: CVE-2022-49781

cve-icon Redhat

Severity : Moderate

Publid Date: 2025-05-01T00:00:00Z

Links: CVE-2022-49781 - Bugzilla