TencentOS Server 4: kernel (TSSA-2024:0981)

high Nessus Plugin ID 239342

Synopsis

The remote TencentOS Server 4 host is missing one or more security updates.

Description

The version of Tencent Linux installed on the remote TencentOS Server 4 host is prior to tested version. It is, therefore, affected by multiple vulnerabilities as referenced in the TSSA-2024:0981 advisory.

Package updates are available for TencentOS Server 4 that fix the following vulnerabilities:

CVE-2024-39496:
In the Linux kernel, the following vulnerability has been resolved:

btrfs: zoned: fix use-after-free due to race with dev replace

While loading a zone's info during creation of a block group, we can race with a device replace operation and then trigger a use-after-free on the device that was just replaced (source device of the replace operation).

This happens because at btrfs_load_zone_info() we extract a device from the chunk map into a local variable and then use the device while not under the protection of the device replace rwsem. So if there's a device replace operation happening when we extract the device and that device is the source of the replace operation, we will trigger a use-after-free if before we finish using the device the replace operation finishes and frees the device.

Fix this by enlarging the critical section under the protection of the device replace rwsem so that all uses of the device are done inside the critical section.

CVE-2024-38571:
In the Linux kernel, the following vulnerability has been resolved:

thermal/drivers/tsens: Fix null pointer dereference

compute_intercept_slope() is called from calibrate_8960() (in tsens-8960.c) as compute_intercept_slope(priv, p1, NULL, ONE_PT_CALIB) which lead to null pointer dereference (if DEBUG or DYNAMIC_DEBUG set).
Fix this bug by adding null pointer check.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

CVE-2024-38569:
In the Linux kernel, the following vulnerability has been resolved:

drivers/perf: hisi_pcie: Fix out-of-bound access when valid event group

The perf tool allows users to create event groups through following cmd [1], but the driver does not check whether the array index is out of bounds when writing data to the event_group array. If the number of events in an event_group is greater than HISI_PCIE_MAX_COUNTERS, the memory write overflow of event_group array occurs.

Add array index check to fix the possible array out of bounds violation, and return directly when write new events are written to array bounds.

There are 9 different events in an event_group.
[1] perf stat -e '{pmu/event1/, ... ,pmu/event9/}'

CVE-2024-38568:
In the Linux kernel, the following vulnerability has been resolved:

drivers/perf: hisi: hns3: Fix out-of-bound access when valid event group

The perf tool allows users to create event groups through following cmd [1], but the driver does not check whether the array index is out of bounds when writing data to the event_group array. If the number of events in an event_group is greater than HNS3_PMU_MAX_HW_EVENTS, the memory write overflow of event_group array occurs.

Add array index check to fix the possible array out of bounds violation, and return directly when write new events are written to array bounds.

There are 9 different events in an event_group.
[1] perf stat -e '{pmu/event1/, ... ,pmu/event9/}

CVE-2024-38548:
In the Linux kernel, the following vulnerability has been resolved:

drm: bridge: cdns-mhdp8546: Fix possible null pointer dereference

In cdns_mhdp_atomic_enable(), the return value of drm_mode_duplicate() is assigned to mhdp_state->current_mode, and there is a dereference of it in drm_mode_set_name(), which will lead to a NULL pointer dereference on failure of drm_mode_duplicate().

Fix this bug add a check of mhdp_state->current_mode.

CVE-2024-38547:
In the Linux kernel, the following vulnerability has been resolved:

media: atomisp: ssh_css: Fix a null-pointer dereference in load_video_binaries

The allocation failure of mycs->yuv_scaler_binary in load_video_binaries() is followed with a dereference of mycs->yuv_scaler_binary after the following call chain:

sh_css_pipe_load_binaries() |-> load_video_binaries(mycs->yuv_scaler_binary == NULL) | |-> sh_css_pipe_unload_binaries() |-> unload_video_binaries()

In unload_video_binaries(), it calls to ia_css_binary_unload with argument &pipe->pipe_settings.video.yuv_scaler_binary[i], which refers to the same memory slot as mycs->yuv_scaler_binary. Thus, a null-pointer dereference is triggered.

CVE-2024-38546:
In the Linux kernel, the following vulnerability has been resolved:

drm: vc4: Fix possible null pointer dereference

In vc4_hdmi_audio_init() of_get_address() may return NULL which is later dereferenced. Fix this bug by adding NULL check.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

CVE-2024-38388:
In the Linux kernel, the following vulnerability has been resolved:

ALSA: hda/cs_dsp_ctl: Use private_free for control cleanup

Use the control private_free callback to free the associated data block. This ensures that the memory won't leak, whatever way the control gets destroyed.

The original implementation didn't actually remove the ALSA controls in hda_cs_dsp_control_remove(). It only freed the internal tracking structure. This meant it was possible to remove/unload the amp driver while leaving its ALSA controls still present in the soundcard. Obviously attempting to access them could cause segfaults or at least dereferencing stale pointers.

CVE-2024-36936:
In the Linux kernel, the following vulnerability has been resolved:

efi/unaccepted: touch soft lockup during memory accept

Commit 50e782a86c98 (efi/unaccepted: Fix soft lockups caused by parallel memory acceptance) has released the spinlock so other CPUs can do memory acceptance in parallel and not triggers softlockup on other CPUs.

However the softlock up was intermittent shown up if the memory of the TD guest is large, and the timeout of softlockup is set to 1 second:

RIP: 0010:_raw_spin_unlock_irqrestore Call Trace:
? __hrtimer_run_queues <IRQ> ? hrtimer_interrupt ? watchdog_timer_fn ? __sysvec_apic_timer_interrupt ? __pfx_watchdog_timer_fn ? sysvec_apic_timer_interrupt </IRQ> ? __hrtimer_run_queues <TASK> ? hrtimer_interrupt ? asm_sysvec_apic_timer_interrupt ? _raw_spin_unlock_irqrestore ? __sysvec_apic_timer_interrupt ? sysvec_apic_timer_interrupt accept_memory try_to_accept_memory do_huge_pmd_anonymous_page get_page_from_freelist
__handle_mm_fault
__alloc_pages
__folio_alloc ? __tdx_hypercall handle_mm_fault vma_alloc_folio do_user_addr_fault do_huge_pmd_anonymous_page exc_page_fault ? __do_huge_pmd_anonymous_page asm_exc_page_fault
__handle_mm_fault

When the local irq is enabled at the end of accept_memory(), the softlockup detects that the watchdog on single CPU has not been fed for a while. That is to say, even other CPUs will not be blocked by spinlock, the current CPU might be stunk with local irq disabled for a while, which hurts not only nmi watchdog but also softlockup.

Chao Gao pointed out that the memory accept could be time costly and there was similar report before. Thus to avoid any softlocup detection during this stage, give the softlockup a flag to skip the timeout check at the end of accept_memory(), by invoking touch_softlockup_watchdog().

CVE-2024-36930:
In the Linux kernel, the following vulnerability has been resolved:

spi: fix null pointer dereference within spi_sync

If spi_sync() is called with the non-empty queue and the same spi_message is then reused, the complete callback for the message remains set while the context is cleared, leading to a null pointer dereference when the callback is invoked from spi_finalize_current_message().

With function inlining disabled, the call stack might look like this:

_raw_spin_lock_irqsave from complete_with_flags+0x18/0x58 complete_with_flags from spi_complete+0x8/0xc spi_complete from spi_finalize_current_message+0xec/0x184 spi_finalize_current_message from spi_transfer_one_message+0x2a8/0x474 spi_transfer_one_message from __spi_pump_transfer_message+0x104/0x230
__spi_pump_transfer_message from __spi_transfer_message_noqueue+0x30/0xc4
__spi_transfer_message_noqueue from __spi_sync+0x204/0x248
__spi_sync from spi_sync+0x24/0x3c spi_sync from mcp251xfd_regmap_crc_read+0x124/0x28c [mcp251xfd] mcp251xfd_regmap_crc_read [mcp251xfd] from _regmap_raw_read+0xf8/0x154
_regmap_raw_read from _regmap_bus_read+0x44/0x70
_regmap_bus_read from _regmap_read+0x60/0xd8
_regmap_read from regmap_read+0x3c/0x5c regmap_read from mcp251xfd_alloc_can_err_skb+0x1c/0x54 [mcp251xfd] mcp251xfd_alloc_can_err_skb [mcp251xfd] from mcp251xfd_irq+0x194/0xe70 [mcp251xfd] mcp251xfd_irq [mcp251xfd] from irq_thread_fn+0x1c/0x78 irq_thread_fn from irq_thread+0x118/0x1f4 irq_thread from kthread+0xd8/0xf4 kthread from ret_from_fork+0x14/0x28

Fix this by also setting message->complete to NULL when the transfer is complete.

CVE-2024-36924:
In the Linux kernel, the following vulnerability has been resolved:

scsi: lpfc: Release hbalock before calling lpfc_worker_wake_up()

lpfc_worker_wake_up() calls the lpfc_work_done() routine, which takes the hbalock. Thus, lpfc_worker_wake_up() should not be called while holding the hbalock to avoid potential deadlock.

CVE-2024-36920:
In the Linux kernel, the following vulnerability has been resolved:

scsi: mpi3mr: Avoid memcpy field-spanning write WARNING

When the storcli2 show command is executed for eHBA-9600, mpi3mr driver prints this WARNING message:

memcpy: detected field-spanning write (size 128) of single field bsg_reply_buf->reply_buf at drivers/scsi/mpi3mr/mpi3mr_app.c:1658 (size 1) WARNING: CPU: 0 PID: 12760 at drivers/scsi/mpi3mr/mpi3mr_app.c:1658 mpi3mr_bsg_request+0x6b12/0x7f10 [mpi3mr]

The cause of the WARN is 128 bytes memcpy to the 1 byte size array __u8 replay_buf[1] in the struct mpi3mr_bsg_in_reply_buf. The array is intended to be a flexible length array, so the WARN is a false positive.

To suppress the WARN, remove the constant number '1' from the array declaration and clarify that it has flexible length. Also, adjust the memory allocation size to match the change.

CVE-2024-36914:
In the Linux kernel, the following vulnerability has been resolved:

drm/amd/display: Skip on writeback when it's not applicable

[WHY] dynamic memory safety error detector (KASAN) catches and generates error messages BUG: KASAN: slab-out-of-bounds as writeback connector does not support certain features which are not initialized.

[HOW] Skip them when connector type is DRM_MODE_CONNECTOR_WRITEBACK.

CVE-2024-36913:
In the Linux kernel, the following vulnerability has been resolved:

Drivers: hv: vmbus: Leak pages if set_memory_encrypted() fails

In CoCo VMs it is possible for the untrusted host to cause set_memory_encrypted() or set_memory_decrypted() to fail such that an error is returned and the resulting memory is shared. Callers need to take care to handle these errors to avoid returning decrypted (shared) memory to the page allocator, which could lead to functional or security issues.

VMBus code could free decrypted pages if set_memory_encrypted()/decrypted() fails. Leak the pages if this happens.

CVE-2024-36912:
In the Linux kernel, the following vulnerability has been resolved:

Drivers: hv: vmbus: Track decrypted status in vmbus_gpadl

In CoCo VMs it is possible for the untrusted host to cause set_memory_encrypted() or set_memory_decrypted() to fail such that an error is returned and the resulting memory is shared. Callers need to take care to handle these errors to avoid returning decrypted (shared) memory to the page allocator, which could lead to functional or security issues.

In order to make sure callers of vmbus_establish_gpadl() and vmbus_teardown_gpadl() don't return decrypted/shared pages to allocators, add a field in struct vmbus_gpadl to keep track of the decryption status of the buffers. This will allow the callers to know if they should free or leak the pages.

CVE-2024-36890:
In the Linux kernel, the following vulnerability has been resolved:

mm/slab: make __free(kfree) accept error pointers

Currently, if an automatically freed allocation is an error pointer that will lead to a crash. An example of this is in wm831x_gpio_dbg_show().

171 char *label __free(kfree) = gpiochip_dup_line_label(chip, i);
172 if (IS_ERR(label)) { 173 dev_err(wm831x->dev, Failed to duplicate label\n);
174 continue;
175 }

The auto clean up function should check for error pointers as well, otherwise we're going to keep hitting issues like this.

CVE-2024-36885:
In the Linux kernel, the following vulnerability has been resolved:

drm/nouveau/firmware: Fix SG_DEBUG error with nvkm_firmware_ctor()

Currently, enabling SG_DEBUG in the kernel will cause nouveau to hit a BUG() on startup:

kernel BUG at include/linux/scatterlist.h:187! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 7 PID: 930 Comm: (udev-worker) Not tainted 6.9.0-rc3Lyude-Test+ #30 Hardware name: MSI MS-7A39/A320M GAMING PRO (MS-7A39), BIOS 1.I0 01/22/2019 RIP: 0010:sg_init_one+0x85/0xa0 Code: 69 88 32 01 83 e1 03 f6 c3 03 75 20 a8 01 75 1e 48 09 cb 41 89 54 24 08 49 89 1c 24 41 89 6c 24 0c 5b 5d 41 5c e9 7b b9 88 00 <0f> 0b 0f 0b 0f 0b 48 8b 05 5e 46 9a 01 eb b2 66 66 2e 0f 1f 84 00 RSP: 0018:ffffa776017bf6a0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffffa77600d87000 RCX: 000000000000002b RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffa77680d87000 RBP: 000000000000e000 R08: 0000000000000000 R09: 0000000000000000 R10: ffff98f4c46aa508 R11: 0000000000000000 R12: ffff98f4c46aa508 R13: ffff98f4c46aa008 R14: ffffa77600d4a000 R15: ffffa77600d4a018 FS: 00007feeb5aae980(0000) GS:ffff98f5c4dc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f22cb9a4520 CR3: 00000001043ba000 CR4: 00000000003506f0 Call Trace:
<TASK> ? die+0x36/0x90 ? do_trap+0xdd/0x100 ? sg_init_one+0x85/0xa0 ? do_error_trap+0x65/0x80 ? sg_init_one+0x85/0xa0 ? exc_invalid_op+0x50/0x70 ? sg_init_one+0x85/0xa0 ? asm_exc_invalid_op+0x1a/0x20 ? sg_init_one+0x85/0xa0 nvkm_firmware_ctor+0x14a/0x250 [nouveau] nvkm_falcon_fw_ctor+0x42/0x70 [nouveau] ga102_gsp_booter_ctor+0xb4/0x1a0 [nouveau] r535_gsp_oneinit+0xb3/0x15f0 [nouveau] ? srso_return_thunk+0x5/0x5f ? srso_return_thunk+0x5/0x5f ? nvkm_udevice_new+0x95/0x140 [nouveau] ? srso_return_thunk+0x5/0x5f ? srso_return_thunk+0x5/0x5f ? ktime_get+0x47/0xb0 ? srso_return_thunk+0x5/0x5f nvkm_subdev_oneinit_+0x4f/0x120 [nouveau] nvkm_subdev_init_+0x39/0x140 [nouveau] ? srso_return_thunk+0x5/0x5f nvkm_subdev_init+0x44/0x90 [nouveau] nvkm_device_init+0x166/0x2e0 [nouveau] nvkm_udevice_init+0x47/0x70 [nouveau] nvkm_object_init+0x41/0x1c0 [nouveau] nvkm_ioctl_new+0x16a/0x290 [nouveau] ? __pfx_nvkm_client_child_new+0x10/0x10 [nouveau] ? __pfx_nvkm_udevice_new+0x10/0x10 [nouveau] nvkm_ioctl+0x126/0x290 [nouveau] nvif_object_ctor+0x112/0x190 [nouveau] nvif_device_ctor+0x23/0x60 [nouveau] nouveau_cli_init+0x164/0x640 [nouveau] nouveau_drm_device_init+0x97/0x9e0 [nouveau] ? srso_return_thunk+0x5/0x5f ? pci_update_current_state+0x72/0xb0 ? srso_return_thunk+0x5/0x5f nouveau_drm_probe+0x12c/0x280 [nouveau] ? srso_return_thunk+0x5/0x5f local_pci_probe+0x45/0xa0 pci_device_probe+0xc7/0x270 really_probe+0xe6/0x3a0
__driver_probe_device+0x87/0x160 driver_probe_device+0x1f/0xc0
__driver_attach+0xec/0x1f0 ? __pfx___driver_attach+0x10/0x10 bus_for_each_dev+0x88/0xd0 bus_add_driver+0x116/0x220 driver_register+0x59/0x100 ? __pfx_nouveau_drm_init+0x10/0x10 [nouveau] do_one_initcall+0x5b/0x320 do_init_module+0x60/0x250 init_module_from_file+0x86/0xc0 idempotent_init_module+0x120/0x2b0
__x64_sys_finit_module+0x5e/0xb0 do_syscall_64+0x83/0x160 ? srso_return_thunk+0x5/0x5f entry_SYSCALL_64_after_hwframe+0x71/0x79 RIP: 0033:0x7feeb5cc20cd Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1b cd 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffcf220b2c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 RAX: ffffffffffffffda RBX: 000055fdd2916aa0 RCX: 00007feeb5cc20cd RDX: 0000000000000000 RSI: 000055fdd29161e0 RDI: 0000000000000035 RBP: 00007ffcf220b380 R08: 00007feeb5d8fb20 R09: 00007ffcf220b310 R10: 000055fdd2909dc0 R11: 0000000000000246 R12: 000055
---truncated---

CVE-2024-36882:
In the Linux kernel, the following vulnerability has been resolved:

mm: use memalloc_nofs_save() in page_cache_ra_order()

See commit f2c817bed58d (mm: use memalloc_nofs_save in readahead path), ensure that page_cache_ra_order() do not attempt to reclaim file-backed pages too, or it leads to a deadlock, found issue when test ext4 large folio.

INFO: task DataXceiver for:7494 blocked for more than 120 seconds.
echo 0 > /proc/sys/kernel/hung_task_timeout_secs disables this message.
task:DataXceiver for state:D stack:0 pid:7494 ppid:1 flags:0x00000200 Call trace:
__switch_to+0x14c/0x240
__schedule+0x82c/0xdd0 schedule+0x58/0xf0 io_schedule+0x24/0xa0
__folio_lock+0x130/0x300 migrate_pages_batch+0x378/0x918 migrate_pages+0x350/0x700 compact_zone+0x63c/0xb38 compact_zone_order+0xc0/0x118 try_to_compact_pages+0xb0/0x280
__alloc_pages_direct_compact+0x98/0x248
__alloc_pages+0x510/0x1110 alloc_pages+0x9c/0x130 folio_alloc+0x20/0x78 filemap_alloc_folio+0x8c/0x1b0 page_cache_ra_order+0x174/0x308 ondemand_readahead+0x1c8/0x2b8 page_cache_async_ra+0x68/0xb8 filemap_readahead.isra.0+0x64/0xa8 filemap_get_pages+0x3fc/0x5b0 filemap_splice_read+0xf4/0x280 ext4_file_splice_read+0x2c/0x48 [ext4] vfs_splice_read.part.0+0xa8/0x118 splice_direct_to_actor+0xbc/0x288 do_splice_direct+0x9c/0x108 do_sendfile+0x328/0x468
__arm64_sys_sendfile64+0x8c/0x148 invoke_syscall+0x4c/0x118 el0_svc_common.constprop.0+0xc8/0xf0 do_el0_svc+0x24/0x38 el0_svc+0x4c/0x1f8 el0t_64_sync_handler+0xc0/0xc8 el0t_64_sync+0x188/0x190

CVE-2024-36489:
In the Linux kernel, the following vulnerability has been resolved:

tls: fix missing memory barrier in tls_init

In tls_init(), a write memory barrier is missing, and store-store reordering may cause NULL dereference in tls_{setsockopt,getsockopt}.

CPU0 CPU1
----- ----- // In tls_init() // In tls_ctx_create() ctx = kzalloc() ctx->sk_proto = READ_ONCE(sk->sk_prot) -(1)

// In update_sk_prot() WRITE_ONCE(sk->sk_prot, tls_prots) -(2)

// In sock_common_setsockopt() READ_ONCE(sk->sk_prot)->setsockopt()

// In tls_{setsockopt,getsockopt}() ctx->sk_proto->setsockopt() -(3)

In the above scenario, when (1) and (2) are reordered, (3) can observe the NULL value of ctx->sk_proto, causing NULL dereference.

To fix it, we rely on rcu_assign_pointer() which implies the release barrier semantic. By moving rcu_assign_pointer() after ctx->sk_proto is initialized, we can ensure that ctx->sk_proto are visible when changing sk->sk_prot.

CVE-2024-36484:
In the Linux kernel, the following vulnerability has been resolved:

net: relax socket state check at accept time.

Christoph reported the following splat:

WARNING: CPU: 1 PID: 772 at net/ipv4/af_inet.c:761 __inet_accept+0x1f4/0x4a0 Modules linked in:
CPU: 1 PID: 772 Comm: syz-executor510 Not tainted 6.9.0-rc7-g7da7119fe22b #56 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:__inet_accept+0x1f4/0x4a0 net/ipv4/af_inet.c:759 Code: 04 38 84 c0 0f 85 87 00 00 00 41 c7 04 24 03 00 00 00 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 ec b7 da fd <0f> 0b e9 7f fe ff ff e8 e0 b7 da fd 0f 0b e9 fe fe ff ff 89 d9 80 RSP: 0018:ffffc90000c2fc58 EFLAGS: 00010293 RAX: ffffffff836bdd14 RBX: 0000000000000000 RCX: ffff888104668000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: dffffc0000000000 R08: ffffffff836bdb89 R09: fffff52000185f64 R10: dffffc0000000000 R11: fffff52000185f64 R12: dffffc0000000000 R13: 1ffff92000185f98 R14: ffff88810754d880 R15: ffff8881007b7800 FS: 000000001c772880(0000) GS:ffff88811b280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fb9fcf2e178 CR3: 00000001045d2002 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace:
<TASK> inet_accept+0x138/0x1d0 net/ipv4/af_inet.c:786 do_accept+0x435/0x620 net/socket.c:1929
__sys_accept4_file net/socket.c:1969 [inline]
__sys_accept4+0x9b/0x110 net/socket.c:1999
__do_sys_accept net/socket.c:2016 [inline]
__se_sys_accept net/socket.c:2013 [inline]
__x64_sys_accept+0x7d/0x90 net/socket.c:2013 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x58/0x100 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x4315f9 Code: fd ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 ab b4 fd ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffdb26d9c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002b RAX: ffffffffffffffda RBX: 0000000000400300 RCX: 00000000004315f9 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004 RBP: 00000000006e1018 R08: 0000000000400300 R09: 0000000000400300 R10: 0000000000400300 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000040cdf0 R14: 000000000040ce80 R15: 0000000000000055 </TASK>

The reproducer invokes shutdown() before entering the listener status.
After commit 94062790aedb (tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets), the above causes the child to reach the accept syscall in FIN_WAIT1 status.

Eric noted we can relax the existing assertion in __inet_accept()

CVE-2024-36286:
In the Linux kernel, the following vulnerability has been resolved:

netfilter: nfnetlink_queue: acquire rcu_read_lock() in instance_destroy_rcu()

syzbot reported that nf_reinject() could be called without rcu_read_lock() :

WARNING: suspicious RCU usage 6.9.0-rc7-syzkaller-02060-g5c1672705a1a #0 Not tainted

net/netfilter/nfnetlink_queue.c:263 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1 2 locks held by syz-executor.4/13427:
#0: ffffffff8e334f60 (rcu_callback){....}-{0:0}, at: rcu_lock_acquire include/linux/rcupdate.h:329 [inline] #0: ffffffff8e334f60 (rcu_callback){....}-{0:0}, at: rcu_do_batch kernel/rcu/tree.c:2190 [inline] #0: ffffffff8e334f60 (rcu_callback){....}-{0:0}, at: rcu_core+0xa86/0x1830 kernel/rcu/tree.c:2471 #1: ffff88801ca92958 (&inst->lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline] #1: ffff88801ca92958 (&inst->lock){+.-.}-{2:2}, at: nfqnl_flush net/netfilter/nfnetlink_queue.c:405 [inline] #1: ffff88801ca92958 (&inst->lock){+.-.}-{2:2}, at: instance_destroy_rcu+0x30/0x220 net/netfilter/nfnetlink_queue.c:172

stack backtrace:
CPU: 0 PID: 13427 Comm: syz-executor.4 Not tainted 6.9.0-rc7-syzkaller-02060-g5c1672705a1a #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024 Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 lockdep_rcu_suspicious+0x221/0x340 kernel/locking/lockdep.c:6712 nf_reinject net/netfilter/nfnetlink_queue.c:323 [inline] nfqnl_reinject+0x6ec/0x1120 net/netfilter/nfnetlink_queue.c:397 nfqnl_flush net/netfilter/nfnetlink_queue.c:410 [inline] instance_destroy_rcu+0x1ae/0x220 net/netfilter/nfnetlink_queue.c:172 rcu_do_batch kernel/rcu/tree.c:2196 [inline] rcu_core+0xafd/0x1830 kernel/rcu/tree.c:2471 handle_softirqs+0x2d6/0x990 kernel/softirq.c:554
__do_softirq kernel/softirq.c:588 [inline] invoke_softirq kernel/softirq.c:428 [inline]
__irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637 irq_exit_rcu+0x9/0x30 kernel/softirq.c:649 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline] sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043 </IRQ> <TASK>

CVE-2024-36281:
In the Linux kernel, the following vulnerability has been resolved:

net/mlx5: Use mlx5_ipsec_rx_status_destroy to correctly delete status rules

rx_create no longer allocates a modify_hdr instance that needs to be cleaned up. The mlx5_modify_header_dealloc call will lead to a NULL pointer dereference. A leak in the rules also previously occurred since there are now two rules populated related to status.

BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 109907067 P4D 109907067 PUD 116890067 PMD 0 Oops: 0000 [#1] SMP CPU: 1 PID: 484 Comm: ip Not tainted 6.9.0-rc2-rrameshbabu+ #254 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.16.3-1-1 04/01/2014 RIP: 0010:mlx5_modify_header_dealloc+0xd/0x70 <snip> Call Trace:
<TASK> ? show_regs+0x60/0x70 ? __die+0x24/0x70 ? page_fault_oops+0x15f/0x430 ? free_to_partial_list.constprop.0+0x79/0x150 ? do_user_addr_fault+0x2c9/0x5c0 ? exc_page_fault+0x63/0x110 ? asm_exc_page_fault+0x27/0x30 ? mlx5_modify_header_dealloc+0xd/0x70 rx_create+0x374/0x590 rx_add_rule+0x3ad/0x500 ? rx_add_rule+0x3ad/0x500 ? mlx5_cmd_exec+0x2c/0x40 ? mlx5_create_ipsec_obj+0xd6/0x200 mlx5e_accel_ipsec_fs_add_rule+0x31/0xf0 mlx5e_xfrm_add_state+0x426/0xc00 <snip>

CVE-2024-36270:
In the Linux kernel, the following vulnerability has been resolved:

netfilter: tproxy: bail out if IP has been disabled on the device

syzbot reports:
general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f] [..] RIP: 0010:nf_tproxy_laddr4+0xb7/0x340 net/ipv4/netfilter/nf_tproxy_ipv4.c:62 Call Trace:
nft_tproxy_eval_v4 net/netfilter/nft_tproxy.c:56 [inline] nft_tproxy_eval+0xa9a/0x1a00 net/netfilter/nft_tproxy.c:168

__in_dev_get_rcu() can return NULL, so check for this.

CVE-2024-36033:
In the Linux kernel, the following vulnerability has been resolved:

Bluetooth: qca: fix info leak when fetching board id

Add the missing sanity check when fetching the board id to avoid leaking slab data when later requesting the firmware.

CVE-2024-36031:
In the Linux kernel, the following vulnerability has been resolved:

keys: Fix overwrite of key expiration on instantiation

The expiry time of a key is unconditionally overwritten during instantiation, defaulting to turn it permanent. This causes a problem for DNS resolution as the expiration set by user-space is overwritten to TIME64_MAX, disabling further DNS updates. Fix this by restoring the condition that key_set_expiry is only called when the pre-parser sets a specific expiry.

CVE-2024-36030:
In the Linux kernel, the following vulnerability has been resolved:

octeontx2-af: fix the double free in rvu_npc_freemem()

Clang static checker(scan-build) warning drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c:line 2184, column 2 Attempt to free released memory.

npc_mcam_rsrcs_deinit() has released 'mcam->counters.bmap'. Deleted this redundant kfree() to fix this double free problem.

CVE-2024-36028:
In the Linux kernel, the following vulnerability has been resolved:

mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()

When I did memory failure tests recently, below warning occurs:

DEBUG_LOCKS_WARN_ON(1) WARNING: CPU: 8 PID: 1011 at kernel/locking/lockdep.c:232 __lock_acquire+0xccb/0x1ca0 Modules linked in: mce_inject hwpoison_inject CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 RIP: 0010:__lock_acquire+0xccb/0x1ca0 RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082 RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8 RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0 RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10 R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004 FS: 00007ff9f32aa740(0000) GS:ffffa1ce5fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ff9f3134ba0 CR3: 00000008484e4000 CR4: 00000000000006f0 Call Trace:
<TASK> lock_acquire+0xbe/0x2d0
_raw_spin_lock_irqsave+0x3a/0x60 hugepage_subpool_put_pages.part.0+0xe/0xc0 free_huge_folio+0x253/0x3f0 dissolve_free_huge_page+0x147/0x210
__page_handle_poison+0x9/0x70 memory_failure+0x4e6/0x8c0 hard_offline_page_store+0x55/0xa0 kernfs_fop_write_iter+0x12c/0x1d0 vfs_write+0x380/0x540 ksys_write+0x64/0xe0 do_syscall_64+0xbc/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff9f3114887 RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887 RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001 RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00 </TASK> Kernel panic - not syncing: kernel: panic_on_warn set ...
CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 Call Trace:
<TASK> panic+0x326/0x350 check_panic_on_warn+0x4f/0x50
__warn+0x98/0x190 report_bug+0x18e/0x1a0 handle_bug+0x3d/0x70 exc_invalid_op+0x18/0x70 asm_exc_invalid_op+0x1a/0x20 RIP: 0010:__lock_acquire+0xccb/0x1ca0 RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082 RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8 RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0 RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10 R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004 lock_acquire+0xbe/0x2d0
_raw_spin_lock_irqsave+0x3a/0x60 hugepage_subpool_put_pages.part.0+0xe/0xc0 free_huge_folio+0x253/0x3f0 dissolve_free_huge_page+0x147/0x210
__page_handle_poison+0x9/0x70 memory_failure+0x4e6/0x8c0 hard_offline_page_store+0x55/0xa0 kernfs_fop_write_iter+0x12c/0x1d0 vfs_write+0x380/0x540 ksys_write+0x64/0xe0 do_syscall_64+0xbc/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff9f3114887 RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887 RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001 RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00 </TASK>

After git bisecting and digging into the code, I believe the root cause is that _deferred_list field of folio is unioned with _hugetlb_subpool field.
In __update_and_free_hugetlb_folio(), folio->_deferred_
---truncated---

CVE-2024-36021:
In the Linux kernel, the following vulnerability has been resolved:

net: hns3: fix kernel crash when devlink reload during pf initialization

The devlink reload process will access the hardware resources, but the register operation is done before the hardware is initialized.
So, processing the devlink reload during initialization may lead to kernel crash. This patch fixes this by taking devl_lock during initialization.

CVE-2024-36020:
In the Linux kernel, the following vulnerability has been resolved:

i40e: fix vf may be used uninitialized in this function warning

To fix the regression introduced by commit 52424f974bc5, which causes servers hang in very hard to reproduce conditions with resets races.
Using two sources for the information is the root cause.
In this function before the fix bumping v didn't mean bumping vf pointer. But the code used this variables interchangeably, so stale vf could point to different/not intended vf.

Remove redundant v variable and iterate via single VF pointer across whole function instead to guarantee VF pointer validity.

CVE-2024-36017:
In the Linux kernel, the following vulnerability has been resolved:

rtnetlink: Correct nested IFLA_VF_VLAN_LIST attribute validation

Each attribute inside a nested IFLA_VF_VLAN_LIST is assumed to be a struct ifla_vf_vlan_info so the size of such attribute needs to be at least of sizeof(struct ifla_vf_vlan_info) which is 14 bytes.
The current size validation in do_setvfinfo is against NLA_HDRLEN (4 bytes) which is less than sizeof(struct ifla_vf_vlan_info) so this validation is not enough and a too small attribute might be cast to a struct ifla_vf_vlan_info, this might result in an out of bands read access when accessing the saved (casted) entry in ivvl.

CVE-2024-35996:
In the Linux kernel, the following vulnerability has been resolved:

cpu: Re-enable CPU mitigations by default for !X86 architectures

Rename x86's to CPU_MITIGATIONS, define it in generic code, and force it on for all architectures exception x86. A recent commit to turn mitigations off by default if SPECULATION_MITIGATIONS=n kinda sorta missed that cpu_mitigations is completely generic, whereas SPECULATION_MITIGATIONS is x86-specific.

Rename x86's SPECULATIVE_MITIGATIONS instead of keeping both and have it select CPU_MITIGATIONS, as having two configs for the same thing is unnecessary and confusing. This will also allow x86 to use the knob to manage mitigations that aren't strictly related to speculative execution.

Use another Kconfig to communicate to common code that CPU_MITIGATIONS is already defined instead of having x86's menu depend on the common CPU_MITIGATIONS. This allows keeping a single point of contact for all of x86's mitigations, and it's not clear that other architectures *want* to allow disabling mitigations at compile-time.

CVE-2024-35988:
In the Linux kernel, the following vulnerability has been resolved:

riscv: Fix TASK_SIZE on 64-bit NOMMU

On NOMMU, userspace memory can come from anywhere in physical RAM. The current definition of TASK_SIZE is wrong if any RAM exists above 4G, causing spurious failures in the userspace access routines.

CVE-2024-35987:
In the Linux kernel, the following vulnerability has been resolved:

riscv: Fix loading 64-bit NOMMU kernels past the start of RAM

commit 3335068f8721 (riscv: Use PUD/P4D/PGD pages for the linear mapping) added logic to allow using RAM below the kernel load address.
However, this does not work for NOMMU, where PAGE_OFFSET is fixed to the kernel load address. Since that range of memory corresponds to PFNs below ARCH_PFN_OFFSET, mm initialization runs off the beginning of mem_map and corrupts adjacent kernel memory. Fix this by restoring the previous behavior for NOMMU kernels.

CVE-2024-35981:
In the Linux kernel, the following vulnerability has been resolved:

virtio_net: Do not send RSS key if it is not supported

There is a bug when setting the RSS options in virtio_net that can break the whole machine, getting the kernel into an infinite loop.

Running the following command in any QEMU virtual machine with virtionet will reproduce this problem:

# ethtool -X eth0 hfunc toeplitz

This is how the problem happens:

1) ethtool_set_rxfh() calls virtnet_set_rxfh()

2) virtnet_set_rxfh() calls virtnet_commit_rss_command()

3) virtnet_commit_rss_command() populates 4 entries for the rss scatter-gather

4) Since the command above does not have a key, then the last scatter-gatter entry will be zeroed, since rss_key_size == 0.
sg_buf_size = vi->rss_key_size;

5) This buffer is passed to qemu, but qemu is not happy with a buffer with zero length, and do the following in virtqueue_map_desc() (QEMU function):

if (!sz) { virtio_error(vdev, virtio: zero sized buffers are not allowed);

6) virtio_error() (also QEMU function) set the device as broken

vdev->broken = true;

7) Qemu bails out, and do not repond this crazy kernel.

8) The kernel is waiting for the response to come back (function virtnet_send_command())

9) The kernel is waiting doing the following :

while (!virtqueue_get_buf(vi->cvq, &tmp) && !virtqueue_is_broken(vi->cvq)) cpu_relax();

10) None of the following functions above is true, thus, the kernel loops here forever. Keeping in mind that virtqueue_is_broken() does not look at the qemu `vdev->broken`, so, it never realizes that the vitio is broken at QEMU side.

Fix it by not sending RSS commands if the feature is not available in the device.

CVE-2024-35980:
In the Linux kernel, the following vulnerability has been resolved:

arm64: tlb: Fix TLBI RANGE operand

KVM/arm64 relies on TLBI RANGE feature to flush TLBs when the dirty pages are collected by VMM and the page table entries become write protected during live migration. Unfortunately, the operand passed to the TLBI RANGE instruction isn't correctly sorted out due to the commit 117940aa6e5f (KVM: arm64: Define kvm_tlb_flush_vmid_range()).
It leads to crash on the destination VM after live migration because TLBs aren't flushed completely and some of the dirty pages are missed.

For example, I have a VM where 8GB memory is assigned, starting from 0x40000000 (1GB). Note that the host has 4KB as the base page size.
In the middile of migration, kvm_tlb_flush_vmid_range() is executed to flush TLBs. It passes MAX_TLBI_RANGE_PAGES as the argument to
__kvm_tlb_flush_vmid_range() and __flush_s2_tlb_range_op(). SCALE#3 and NUM#31, corresponding to MAX_TLBI_RANGE_PAGES, isn't supported by __TLBI_RANGE_NUM(). In this specific case, -1 has been returned from __TLBI_RANGE_NUM() for SCALE#3/2/1/0 and rejected by the loop in the __flush_tlb_range_op() until the variable @scale underflows and becomes -9, 0xffff708000040000 is set as the operand. The operand is wrong since it's sorted out by __TLBI_VADDR_RANGE() according to invalid @scale and @num.

Fix it by extending __TLBI_RANGE_NUM() to support the combination of SCALE#3 and NUM#31. With the changes, [-1 31] instead of [-1 30] can be returned from the macro, meaning the TLBs for 0x200000 pages in the above example can be flushed in one shoot with SCALE#3 and NUM#31. The macro TLBI_RANGE_MASK is dropped since no one uses it any more. The comments are also adjusted accordingly.

CVE-2024-35972:
In the Linux kernel, the following vulnerability has been resolved:

bnxt_en: Fix possible memory leak in bnxt_rdma_aux_device_init()

If ulp = kzalloc() fails, the allocated edev will leak because it is not properly assigned and the cleanup path will not be able to free it.
Fix it by assigning it properly immediately after allocation.

CVE-2024-35947:
In the Linux kernel, the following vulnerability has been resolved:

dyndbg: fix old BUG_ON in >control parser

Fix a BUG_ON from 2009. Even if it looks unreachable (I didn't really look), lets make sure by removing it, doing pr_err and return
-EINVAL instead.

CVE-2024-35941:
In the Linux kernel, the following vulnerability has been resolved:

net: skbuff: add overflow debug check to pull/push helpers

syzbot managed to trigger following splat:
BUG: KASAN: use-after-free in __skb_flow_dissect+0x4a3b/0x5e50 Read of size 1 at addr ffff888208a4000e by task a.out/2313 [..]
__skb_flow_dissect+0x4a3b/0x5e50
__skb_get_hash+0xb4/0x400 ip_tunnel_xmit+0x77e/0x26f0 ipip_tunnel_xmit+0x298/0x410 ..

Analysis shows that the skb has a valid ->head, but bogus ->data pointer.

skb->data gets its bogus value via the neigh layer, which does:

1556 __skb_pull(skb, skb_network_offset(skb));

... and the skb was already dodgy at this point:

skb_network_offset(skb) returns a negative value due to an earlier overflow of skb->network_header (u16). __skb_pull thus adjusts skb->data by a huge offset, pointing outside skb->head area.

Allow debug builds to splat when we try to pull/push more than INT_MAX bytes.

After this, the syzkaller reproducer yields a more precise splat before the flow dissector attempts to read off skb->data memory:

WARNING: CPU: 5 PID: 2313 at include/linux/skbuff.h:2653 neigh_connected_output+0x28e/0x400 ip_finish_output2+0xb25/0xed0 iptunnel_xmit+0x4ff/0x870 ipgre_xmit+0x78e/0xbb0

CVE-2024-35918:
In the Linux kernel, the following vulnerability has been resolved:

randomize_kstack: Improve entropy diffusion

The kstack_offset variable was really only ever using the low bits for kernel stack offset entropy. Add a ror32() to increase bit diffusion.

CVE-2024-35917:
In the Linux kernel, the following vulnerability has been resolved:

s390/bpf: Fix bpf_plt pointer arithmetic

Kui-Feng Lee reported a crash on s390x triggered by the dummy_st_ops/dummy_init_ptr_arg test [1]:

[<0000000000000002>] 0x2 [<00000000009d5cde>] bpf_struct_ops_test_run+0x156/0x250 [<000000000033145a>] __sys_bpf+0xa1a/0xd00 [<00000000003319dc>] __s390x_sys_bpf+0x44/0x50 [<0000000000c4382c>] __do_syscall+0x244/0x300 [<0000000000c59a40>] system_call+0x70/0x98

This is caused by GCC moving memcpy() after assignments in bpf_jit_plt(), resulting in NULL pointers being written instead of the return and the target addresses.

Looking at the GCC internals, the reordering is allowed because the alias analysis thinks that the memcpy() destination and the assignments' left-hand-sides are based on different objects: new_plt and bpf_plt_ret/bpf_plt_target respectively, and therefore they cannot alias.

This is in turn due to a violation of the C standard:

When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object ...

From the C's perspective, bpf_plt_ret and bpf_plt are distinct objects and cannot be subtracted. In the practical terms, doing so confuses the GCC's alias analysis.

The code was written this way in order to let the C side know a few offsets defined in the assembly. While nice, this is by no means necessary. Fix the noncompliance by hardcoding these offsets.

[1] https://lore.kernel.org/bpf/[email protected]/

CVE-2024-35903:
In the Linux kernel, the following vulnerability has been resolved:

x86/bpf: Fix IP after emitting call depth accounting

Adjust the IP passed to `emit_patch` so it calculates the correct offset for the CALL instruction if `x86_call_depth_emit_accounting` emits code.
Otherwise we will skip some instructions and most likely crash.

CVE-2024-35876:
In the Linux kernel, the following vulnerability has been resolved:

x86/mce: Make sure to grab mce_sysfs_mutex in set_bank()

Modifying a MCA bank's MCA_CTL bits which control which error types to be reported is done over

/sys/devices/system/machinecheck/ machinecheck0 bank0 bank1 bank10 bank11 ...

sysfs nodes by writing the new bit mask of events to enable.

When the write is accepted, the kernel deletes all current timers and reinits all banks.

Doing that in parallel can lead to initializing a timer which is already armed and in the timer wheel, i.e., in use already:

ODEBUG: init active (active state 0) object: ffff888063a28000 object type: timer_list hint: mce_timer_fn+0x0/0x240 arch/x86/kernel/cpu/mce/core.c:2642 WARNING: CPU: 0 PID: 8120 at lib/debugobjects.c:514 debug_print_object+0x1a0/0x2a0 lib/debugobjects.c:514

Fix that by grabbing the sysfs mutex as the rest of the MCA sysfs code does.

Reported by: Yue Sun <[email protected]> Reported by: xingwei lee <[email protected]>

CVE-2024-35875:
In the Linux kernel, the following vulnerability has been resolved:

x86/coco: Require seeding RNG with RDRAND on CoCo systems

There are few uses of CoCo that don't rely on working cryptography and hence a working RNG. Unfortunately, the CoCo threat model means that the VM host cannot be trusted and may actively work against guests to extract secrets or manipulate computation. Since a malicious host can modify or observe nearly all inputs to guests, the only remaining source of entropy for CoCo guests is RDRAND.

If RDRAND is broken -- due to CPU hardware fault -- the RNG as a whole is meant to gracefully continue on gathering entropy from other sources, but since there aren't other sources on CoCo, this is catastrophic.
This is mostly a concern at boot time when initially seeding the RNG, as after that the consequences of a broken RDRAND are much more theoretical.

So, try at boot to seed the RNG using 256 bits of RDRAND output. If this fails, panic(). This will also trigger if the system is booted without RDRAND, as RDRAND is essential for a safe CoCo boot.

Add this deliberately to be just a CoCo x86 driver feature and not part of the RNG itself. Many device drivers and platforms have some desire to contribute something to the RNG, and add_device_randomness() is specifically meant for this purpose.

Any driver can call it with seed data of any quality, or even garbage quality, and it can only possibly make the quality of the RNG better or have no effect, but can never make it worse.

Rather than trying to build something into the core of the RNG, consider the particular CoCo issue just a CoCo issue, and therefore separate it all out into driver (well, arch/platform) code.

[ bp: Massage commit message. ]

CVE-2024-35872:
In the Linux kernel, the following vulnerability has been resolved:

mm/secretmem: fix GUP-fast succeeding on secretmem folios

folio_is_secretmem() currently relies on secretmem folios being LRU folios, to save some cycles.

However, folios might reside in a folio batch without the LRU flag set, or temporarily have their LRU flag cleared. Consequently, the LRU flag is unreliable for this purpose.

In particular, this is the case when secretmem_fault() allocates a fresh page and calls filemap_add_folio()->folio_add_lru(). The folio might be added to the per-cpu folio batch and won't get the LRU flag set until the batch was drained using e.g., lru_add_drain().

Consequently, folio_is_secretmem() might not detect secretmem folios and GUP-fast can succeed in grabbing a secretmem folio, crashing the kernel when we would later try reading/writing to the folio, because the folio has been unmapped from the directmap.

Fix it by removing that unreliable check.

CVE-2024-35871:
In the Linux kernel, the following vulnerability has been resolved:

riscv: process: Fix kernel gp leakage

childregs represents the registers which are active for the new thread in user context. For a kernel thread, childregs->gp is never used since the kernel gp is not touched by switch_to. For a user mode helper, the gp value can be observed in user space after execve or possibly by other means.

[From the email thread]

The /* Kernel thread */ comment is somewhat inaccurate in that it is also used for user_mode_helper threads, which exec a user process, e.g. /sbin/init or when /proc/sys/kernel/core_pattern is a pipe. Such threads do not have PF_KTHREAD set and are valid targets for ptrace etc. even before they exec.

childregs is the *user* context during syscall execution and it is observable from userspace in at least five ways:

1. kernel_execve does not currently clear integer registers, so the starting register state for PID 1 and other user processes started by the kernel has sp = user stack, gp = kernel __global_pointer$, all other integer registers zeroed by the memset in the patch comment.

This is a bug in its own right, but I'm unwilling to bet that it is the only way to exploit the issue addressed by this patch.

2. ptrace(PTRACE_GETREGSET): you can PTRACE_ATTACH to a user_mode_helper thread before it execs, but ptrace requires SIGSTOP to be delivered which can only happen at user/kernel boundaries.

3. /proc/*/task/*/syscall: this is perfectly happy to read pt_regs for user_mode_helpers before the exec completes, but gp is not one of the registers it returns.

4. PERF_SAMPLE_REGS_USER: LOCKDOWN_PERF normally prevents access to kernel addresses via PERF_SAMPLE_REGS_INTR, but due to this bug kernel addresses are also exposed via PERF_SAMPLE_REGS_USER which is permitted under LOCKDOWN_PERF. I have not attempted to write exploit code.

5. Much of the tracing infrastructure allows access to user registers. I have not attempted to determine which forms of tracing allow access to user registers without already allowing access to kernel registers.

CVE-2024-35842:
In the Linux kernel, the following vulnerability has been resolved:

ASoC: mediatek: sof-common: Add NULL check for normal_link string

It's not granted that all entries of struct sof_conn_stream declare a `normal_link` (a non-SOF, direct link) string, and this is the case for SoCs that support only SOF paths (hence do not support both direct and SOF usecases).

For example, in the case of MT8188 there is no normal_link string in any of the sof_conn_stream entries and there will be more drivers doing that in the future.

To avoid possible NULL pointer KPs, add a NULL check for `normal_link`.

CVE-2024-35818:
In the Linux kernel, the following vulnerability has been resolved:

LoongArch: Define the __io_aw() hook as mmiowb()

Commit fb24ea52f78e0d595852e (drivers: Remove explicit invocations of mmiowb()) remove all mmiowb() in drivers, but it says:

NOTE: mmiowb() has only ever guaranteed ordering in conjunction with spin_unlock(). However, pairing each mmiowb() removal in this patch with the corresponding call to spin_unlock() is not at all trivial, so there is a small chance that this change may regress any drivers incorrectly relying on mmiowb() to order MMIO writes between CPUs using lock-free synchronisation.

The mmio in radeon_ring_commit() is protected by a mutex rather than a spinlock, but in the mutex fastpath it behaves similar to spinlock. We can add mmiowb() calls in the radeon driver but the maintainer says he doesn't like such a workaround, and radeon is not the only example of mutex protected mmio.

So we should extend the mmiowb tracking system from spinlock to mutex, and maybe other locking primitives. This is not easy and error prone, so we solve it in the architectural code, by simply defining the __io_aw() hook as mmiowb(). And we no longer need to override queued_spin_unlock() so use the generic definition.

Without this, we get such an error when run 'glxgears' on weak ordering architectures such as LoongArch:

radeon 0000:04:00.0: ring 0 stalled for more than 10324msec radeon 0000:04:00.0: ring 3 stalled for more than 10240msec radeon 0000:04:00.0: GPU lockup (current fence id 0x000000000001f412 last fence id 0x000000000001f414 on ring 3) radeon 0000:04:00.0: GPU lockup (current fence id 0x000000000000f940 last fence id 0x000000000000f941 on ring 0) radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35) radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35) radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35) radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35) radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35) radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35) radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)

CVE-2024-35804:
In the Linux kernel, the following vulnerability has been resolved:

KVM: x86: Mark target gfn of emulated atomic instruction as dirty

When emulating an atomic access on behalf of the guest, mark the target gfn dirty if the CMPXCHG by KVM is attempted and doesn't fault. This fixes a bug where KVM effectively corrupts guest memory during live migration by writing to guest memory without informing userspace that the page is dirty.

Marking the page dirty got unintentionally dropped when KVM's emulated CMPXCHG was converted to do a user access. Before that, KVM explicitly mapped the guest page into kernel memory, and marked the page dirty during the unmap phase.

Mark the page dirty even if the CMPXCHG fails, as the old data is written back on failure, i.e. the page is still written. The value written is guaranteed to be the same because the operation is atomic, but KVM's ABI is that all writes are dirty logged regardless of the value written. And more importantly, that's what KVM did before the buggy commit.

Huge kudos to the folks on the Cc list (and many others), who did all the actual work of triaging and debugging.

base-commit: 6769ea8da8a93ed4630f1ce64df6aafcaabfce64

CVE-2024-35803:
In the Linux kernel, the following vulnerability has been resolved:

x86/efistub: Call mixed mode boot services on the firmware's stack

Normally, the EFI stub calls into the EFI boot services using the stack that was live when the stub was entered. According to the UEFI spec, this stack needs to be at least 128k in size - this might seem large but all asynchronous processing and event handling in EFI runs from the same stack and so quite a lot of space may be used in practice.

In mixed mode, the situation is a bit different: the bootloader calls the 32-bit EFI stub entry point, which calls the decompressor's 32-bit entry point, where the boot stack is set up, using a fixed allocation of 16k. This stack is still in use when the EFI stub is started in 64-bit mode, and so all calls back into the EFI firmware will be using the decompressor's limited boot stack.

Due to the placement of the boot stack right after the boot heap, any stack overruns have gone unnoticed. However, commit

5c4feadb0011983b (x86/decompressor: Move global symbol references to C code)

moved the definition of the boot heap into C code, and now the boot stack is placed right at the base of BSS, where any overruns will corrupt the end of the .data section.

While it would be possible to work around this by increasing the size of the boot stack, doing so would affect all x86 systems, and mixed mode systems are a tiny (and shrinking) fraction of the x86 installed base.

So instead, record the firmware stack pointer value when entering from the 32-bit firmware, and switch to this stack every time a EFI boot service call is made.

CVE-2024-35802:
null

CVE-2024-35801:
In the Linux kernel, the following vulnerability has been resolved:

x86/fpu: Keep xfd_state in sync with MSR_IA32_XFD

Commit 672365477ae8 (x86/fpu: Update XFD state where required) and commit 8bf26758ca96 (x86/fpu: Add XFD state to fpstate) introduced a per CPU variable xfd_state to keep the MSR_IA32_XFD value cached, in order to avoid unnecessary writes to the MSR.

On CPU hotplug MSR_IA32_XFD is reset to the init_fpstate.xfd, which wipes out any stale state. But the per CPU cached xfd value is not reset, which brings them out of sync.

As a consequence a subsequent xfd_update_state() might fail to update the MSR which in turn can result in XRSTOR raising a #NM in kernel space, which crashes the kernel.

To fix this, introduce xfd_set_state() to write xfd_state together with MSR_IA32_XFD, and use it in all places that set MSR_IA32_XFD.

CVE-2024-35792:
In the Linux kernel, the following vulnerability has been resolved:

crypto: rk3288 - Fix use after free in unprepare

The unprepare call must be carried out before the finalize call as the latter can free the request.

CVE-2024-35791:
In the Linux kernel, the following vulnerability has been resolved:

KVM: SVM: Flush pages under kvm->lock to fix UAF in svm_register_enc_region()

Do the cache flush of converted pages in svm_register_enc_region() before dropping kvm->lock to fix use-after-free issues where region and/or its array of pages could be freed by a different task, e.g. if userspace has
__unregister_enc_region_locked() already queued up for the region.

Note, the obvious alternative of using local variables doesn't fully resolve the bug, as region->pages is also dynamically allocated. I.e. the region structure itself would be fine, but region->pages could be freed.

Flushing multiple pages under kvm->lock is unfortunate, but the entire flow is a rare slow path, and the manual flush is only needed on CPUs that lack coherency for encrypted memory.

CVE-2024-35790:
In the Linux kernel, the following vulnerability has been resolved:

usb: typec: altmodes/displayport: create sysfs nodes as driver's default device attribute group

The DisplayPort driver's sysfs nodes may be present to the userspace before typec_altmode_set_drvdata() completes in dp_altmode_probe. This means that a sysfs read can trigger a NULL pointer error by deferencing dp->hpd in hpd_show or dp->lock in pin_assignment_show, as dev_get_drvdata() returns NULL in those cases.

Remove manual sysfs node creation in favor of adding attribute group as default for devices bound to the driver. The ATTRIBUTE_GROUPS() macro is not used here otherwise the path to the sysfs nodes is no longer compliant with the ABI.

CVE-2024-35787:
In the Linux kernel, the following vulnerability has been resolved:

md/md-bitmap: fix incorrect usage for sb_index

Commit d7038f951828 (md-bitmap: don't use ->index for pages backing the bitmap file) removed page->index from bitmap code, but left wrong code logic for clustered-md. current code never set slot offset for cluster nodes, will sometimes cause crash in clustered env.

Call trace (partly):
md_bitmap_file_set_bit+0x110/0x1d8 [md_mod] md_bitmap_startwrite+0x13c/0x240 [md_mod] raid1_make_request+0x6b0/0x1c08 [raid1] md_handle_request+0x1dc/0x368 [md_mod] md_submit_bio+0x80/0xf8 [md_mod]
__submit_bio+0x178/0x300 submit_bio_noacct_nocheck+0x11c/0x338 submit_bio_noacct+0x134/0x614 submit_bio+0x28/0xdc submit_bh_wbc+0x130/0x1cc submit_bh+0x1c/0x28

CVE-2024-35786:
In the Linux kernel, the following vulnerability has been resolved:

drm/nouveau: fix stale locked mutex in nouveau_gem_ioctl_pushbuf

If VM_BIND is enabled on the client the legacy submission ioctl can't be used, however if a client tries to do so regardless it will return an error. In this case the clients mutex remained unlocked leading to a deadlock inside nouveau_drm_postclose or any other nouveau ioctl call.

CVE-2024-35785:
In the Linux kernel, the following vulnerability has been resolved:

tee: optee: Fix kernel panic caused by incorrect error handling

The error path while failing to register devices on the TEE bus has a bug leading to kernel panic as follows:

[ 15.398930] Unable to handle kernel paging request at virtual address ffff07ed00626d7c [ 15.406913] Mem abort info:
[ 15.409722] ESR = 0x0000000096000005 [ 15.413490] EC = 0x25: DABT (current EL), IL = 32 bits [ 15.418814] SET = 0, FnV = 0 [ 15.421878] EA = 0, S1PTW = 0 [ 15.425031] FSC = 0x05: level 1 translation fault [ 15.429922] Data abort info:
[ 15.432813] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000 [ 15.438310] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 15.443372] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 15.448697] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000d9e3e000 [ 15.455413] [ffff07ed00626d7c] pgd=1800000bffdf9003, p4d=1800000bffdf9003, pud=0000000000000000 [ 15.464146] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP

Commit 7269cba53d90 (tee: optee: Fix supplicant based device enumeration) lead to the introduction of this bug. So fix it appropriately.

CVE-2024-33621:
In the Linux kernel, the following vulnerability has been resolved:

ipvlan: Dont Use skb->sk in ipvlan_process_v{4,6}_outbound

Raw packet from PF_PACKET socket ontop of an IPv6-backed ipvlan device will hit WARN_ON_ONCE() in sk_mc_loop() through sch_direct_xmit() path.

WARNING: CPU: 2 PID: 0 at net/core/sock.c:775 sk_mc_loop+0x2d/0x70 Modules linked in: sch_netem ipvlan rfkill cirrus drm_shmem_helper sg drm_kms_helper CPU: 2 PID: 0 Comm: swapper/2 Kdump: loaded Not tainted 6.9.0+ #279 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:sk_mc_loop+0x2d/0x70 Code: fa 0f 1f 44 00 00 65 0f b7 15 f7 96 a3 4f 31 c0 66 85 d2 75 26 48 85 ff 74 1c RSP: 0018:ffffa9584015cd78 EFLAGS: 00010212 RAX: 0000000000000011 RBX: ffff91e585793e00 RCX: 0000000002c6a001 RDX: 0000000000000000 RSI: 0000000000000040 RDI: ffff91e589c0f000 RBP: ffff91e5855bd100 R08: 0000000000000000 R09: 3d00545216f43d00 R10: ffff91e584fdcc50 R11: 00000060dd8616f4 R12: ffff91e58132d000 R13: ffff91e584fdcc68 R14: ffff91e5869ce800 R15: ffff91e589c0f000 FS: 0000000000000000(0000) GS:ffff91e898100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f788f7c44c0 CR3: 0000000008e1a000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace:
<IRQ> ? __warn (kernel/panic.c:693) ? sk_mc_loop (net/core/sock.c:760) ? report_bug (lib/bug.c:201 lib/bug.c:219) ? handle_bug (arch/x86/kernel/traps.c:239) ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1)) ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621) ? sk_mc_loop (net/core/sock.c:760) ip6_finish_output2 (net/ipv6/ip6_output.c:83 (discriminator 1)) ? nf_hook_slow (net/netfilter/core.c:626) ip6_finish_output (net/ipv6/ip6_output.c:222) ? __pfx_ip6_finish_output (net/ipv6/ip6_output.c:215) ipvlan_xmit_mode_l3 (drivers/net/ipvlan/ipvlan_core.c:602) ipvlan ipvlan_start_xmit (drivers/net/ipvlan/ipvlan_main.c:226) ipvlan dev_hard_start_xmit (net/core/dev.c:3594) sch_direct_xmit (net/sched/sch_generic.c:343)
__qdisc_run (net/sched/sch_generic.c:416) net_tx_action (net/core/dev.c:5286) handle_softirqs (kernel/softirq.c:555)
__irq_exit_rcu (kernel/softirq.c:589) sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1043)

The warning triggers as this:
packet_sendmsg packet_snd //skb->sk is packet sk
__dev_queue_xmit
__dev_xmit_skb //q->enqueue is not NULL
__qdisc_run sch_direct_xmit dev_hard_start_xmit ipvlan_start_xmit ipvlan_xmit_mode_l3 //l3 mode ipvlan_process_outbound //vepa flag ipvlan_process_v6_outbound ip6_local_out
__ip6_finish_output ip6_finish_output2 //multicast packet sk_mc_loop //sk->sk_family is AF_PACKET

Call ip{6}_local_out() with NULL sk in ipvlan as other tunnels to fix this.

CVE-2024-27435:
In the Linux kernel, the following vulnerability has been resolved:

nvme: fix reconnection fail due to reserved tag allocation

We found a issue on production environment while using NVMe over RDMA, admin_q reconnect failed forever while remote target and network is ok.
After dig into it, we found it may caused by a ABBA deadlock due to tag allocation. In my case, the tag was hold by a keep alive request waiting inside admin_q, as we quiesced admin_q while reset ctrl, so the request maked as idle and will not process before reset success. As fabric_q shares tagset with admin_q, while reconnect remote target, we need a tag for connect command, but the only one reserved tag was held by keep alive command which waiting inside admin_q. As a result, we failed to reconnect admin_q forever. In order to fix this issue, I think we should keep two reserved tags for admin queue.

CVE-2024-27433:
In the Linux kernel, the following vulnerability has been resolved:

clk: mediatek: mt7622-apmixedsys: Fix an error handling path in clk_mt8135_apmixed_probe()

'clk_data' is allocated with mtk_devm_alloc_clk_data(). So calling mtk_free_clk_data() explicitly in the remove function would lead to a double-free.

Remove the redundant call.

CVE-2024-27427:
In the Linux kernel, the following vulnerability has been resolved:

netrom: Fix a data-race around sysctl_netrom_transport_timeout

We need to protect the reader reading the sysctl value because the value can be changed concurrently.

CVE-2024-27426:
In the Linux kernel, the following vulnerability has been resolved:

netrom: Fix a data-race around sysctl_netrom_transport_maximum_tries

We need to protect the reader reading the sysctl value because the value can be changed concurrently.

CVE-2024-27425:
In the Linux kernel, the following vulnerability has been resolved:

netrom: Fix a data-race around sysctl_netrom_transport_acknowledge_delay

We need to protect the reader reading the sysctl value because the value can be changed concurrently.

CVE-2024-27424:
In the Linux kernel, the following vulnerability has been resolved:

netrom: Fix a data-race around sysctl_netrom_transport_busy_delay

We need to protect the reader reading the sysctl value because the value can be changed concurrently.

CVE-2024-27423:
In the Linux kernel, the following vulnerability has been resolved:

netrom: Fix a data-race around sysctl_netrom_transport_requested_window_size

We need to protect the reader reading the sysctl value because the value can be changed concurrently.

CVE-2024-27422:
In the Linux kernel, the following vulnerability has been resolved:

netrom: Fix a data-race around sysctl_netrom_transport_no_activity_timeout

We need to protect the reader reading the sysctl value because the value can be changed concurrently.

CVE-2024-27417:
In the Linux kernel, the following vulnerability has been resolved:

ipv6: fix potential struct net leak in inet6_rtm_getaddr()

It seems that if userspace provides a correct IFA_TARGET_NETNSID value but no IFA_ADDRESS and IFA_LOCAL attributes, inet6_rtm_getaddr() returns -EINVAL with an elevated struct net refcount.

CVE-2024-27415:
In the Linux kernel, the following vulnerability has been resolved:

netfilter: bridge: confirm multicast packets before passing them up the stack

conntrack nf_confirm logic cannot handle cloned skbs referencing the same nf_conn entry, which will happen for multicast (broadcast) frames on bridges.

Example:
macvlan0 | br0 / \ ethX ethY

ethX (or Y) receives a L2 multicast or broadcast packet containing an IP packet, flow is not yet in conntrack table.

1. skb passes through bridge and fake-ip (br_netfilter)Prerouting.
-> skb->_nfct now references a unconfirmed entry 2. skb is broad/mcast packet. bridge now passes clones out on each bridge interface.
3. skb gets passed up the stack.
4. In macvlan case, macvlan driver retains clone(s) of the mcast skb and schedules a work queue to send them out on the lower devices.

The clone skb->_nfct is not a copy, it is the same entry as the original skb. The macvlan rx handler then returns RX_HANDLER_PASS.
5. Normal conntrack hooks (in NF_INET_LOCAL_IN) confirm the orig skb.

The Macvlan broadcast worker and normal confirm path will race.

This race will not happen if step 2 already confirmed a clone. In that case later steps perform skb_clone() with skb->_nfct already confirmed (in hash table). This works fine.

But such confirmation won't happen when eb/ip/nftables rules dropped the packets before they reached the nf_confirm step in postrouting.

Pablo points out that nf_conntrack_bridge doesn't allow use of stateful nat, so we can safely discard the nf_conn entry and let inet call conntrack again.

This doesn't work for bridge netfilter: skb could have a nat transformation. Also bridge nf prevents re-invocation of inet prerouting via 'sabotage_in' hook.

Work around this problem by explicit confirmation of the entry at LOCAL_IN time, before upper layer has a chance to clone the unconfirmed entry.

The downside is that this disables NAT and conntrack helpers.

Alternative fix would be to add locking to all code parts that deal with unconfirmed packets, but even if that could be done in a sane way this opens up other problems, for example:

-m physdev --physdev-out eth0 -j SNAT --snat-to 1.2.3.4
-m physdev --physdev-out eth1 -j SNAT --snat-to 1.2.3.5

For multicast case, only one of such conflicting mappings will be created, conntrack only handles 1:1 NAT mappings.

Users should set create a setup that explicitly marks such traffic NOTRACK (conntrack bypass) to avoid this, but we cannot auto-bypass them, ruleset might have accept rules for untracked traffic already, so user-visible behaviour would change.

CVE-2024-27414:
In the Linux kernel, the following vulnerability has been resolved:

rtnetlink: fix error logic of IFLA_BRIDGE_FLAGS writing back

In the commit d73ef2d69c0d (rtnetlink: let rtnl_bridge_setlink checks IFLA_BRIDGE_MODE length), an adjustment was made to the old loop logic in the function `rtnl_bridge_setlink` to enable the loop to also check the length of the IFLA_BRIDGE_MODE attribute. However, this adjustment removed the `break` statement and led to an error logic of the flags writing back at the end of this function.

if (have_flags) memcpy(nla_data(attr), &flags, sizeof(flags));
// attr should point to IFLA_BRIDGE_FLAGS NLA !!!

Before the mentioned commit, the `attr` is granted to be IFLA_BRIDGE_FLAGS.
However, this is not necessarily true fow now as the updated loop will let the attr point to the last NLA, even an invalid NLA which could cause overflow writes.

This patch introduces a new variable `br_flag` to save the NLA pointer that points to IFLA_BRIDGE_FLAGS and uses it to resolve the mentioned error logic.

CVE-2024-27413:
In the Linux kernel, the following vulnerability has been resolved:

efi/capsule-loader: fix incorrect allocation size

gcc-14 notices that the allocation with sizeof(void) on 32-bit architectures is not enough for a 64-bit phys_addr_t:

drivers/firmware/efi/capsule-loader.c: In function 'efi_capsule_open':
drivers/firmware/efi/capsule-loader.c:295:24: error: allocation of insufficient size '4' for type 'phys_addr_t' {aka 'long long unsigned int'} with size '8' [-Werror=alloc-size] 295 | cap_info->phys = kzalloc(sizeof(void *), GFP_KERNEL);
| ^

Use the correct type instead here.

CVE-2024-27412:
In the Linux kernel, the following vulnerability has been resolved:

power: supply: bq27xxx-i2c: Do not free non existing IRQ

The bq27xxx i2c-client may not have an IRQ, in which case client->irq will be 0. bq27xxx_battery_i2c_probe() already has an if (client->irq) check wrapping the request_threaded_irq().

But bq27xxx_battery_i2c_remove() unconditionally calls free_irq(client->irq) leading to:

[ 190.310742] ------------[ cut here ]------------ [ 190.310843] Trying to free already-free IRQ 0 [ 190.310861] WARNING: CPU: 2 PID: 1304 at kernel/irq/manage.c:1893 free_irq+0x1b8/0x310

Followed by a backtrace when unbinding the driver. Add an if (client->irq) to bq27xxx_battery_i2c_remove() mirroring probe() to fix this.

CVE-2024-27409:
In the Linux kernel, the following vulnerability has been resolved:

dmaengine: dw-edma: HDMA: Add sync read before starting the DMA transfer in remote setup

The Linked list element and pointer are not stored in the same memory as the HDMA controller register. If the doorbell register is toggled before the full write of the linked list a race condition error will occur.
In remote setup we can only use a readl to the memory to assure the full write has occurred.

CVE-2024-27408:
In the Linux kernel, the following vulnerability has been resolved:

dmaengine: dw-edma: eDMA: Add sync read before starting the DMA transfer in remote setup

The Linked list element and pointer are not stored in the same memory as the eDMA controller register. If the doorbell register is toggled before the full write of the linked list a race condition error will occur.
In remote setup we can only use a readl to the memory to assure the full write has occurred.

CVE-2024-27407:
In the Linux kernel, the following vulnerability has been resolved:

fs/ntfs3: Fixed overflow check in mi_enum_attr()

CVE-2024-27406:
In the Linux kernel, the following vulnerability has been resolved:

lib/Kconfig.debug: TEST_IOV_ITER depends on MMU

Trying to run the iov_iter unit test on a nommu system such as the qemu kc705-nommu emulation results in a crash.

KTAP version 1 # Subtest: iov_iter # module: kunit_iov_iter 1..9 BUG: failure at mm/nommu.c:318/vmap()! Kernel panic - not syncing: BUG!

The test calls vmap() directly, but vmap() is not supported on nommu systems, causing the crash. TEST_IOV_ITER therefore needs to depend on MMU.

CVE-2024-27404:
In the Linux kernel, the following vulnerability has been resolved:

mptcp: fix data races on remote_id

Similar to the previous patch, address the data race on remote_id, adding the suitable ONCE annotations.

CVE-2024-27403:
In the Linux kernel, the following vulnerability has been resolved:

netfilter: nft_flow_offload: reset dst in route object after setting up flow

dst is transferred to the flow object, route object does not own it anymore. Reset dst in route object, otherwise if flow_offload_add() fails, error path releases dst twice, leading to a refcount underflow.

CVE-2024-27402:
In the Linux kernel, the following vulnerability has been resolved:

phonet/pep: fix racy skb_queue_empty() use

The receive queues are protected by their respective spin-lock, not the socket lock. This could lead to skb_peek() unexpectedly returning NULL or a pointer to an already dequeued socket buffer.

CVE-2024-27400:
In the Linux kernel, the following vulnerability has been resolved:

drm/amdgpu: once more fix the call oder in amdgpu_ttm_move() v2

This reverts drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap. The basic problem here is that after the move the old location is simply not available any more.

Some fixes were suggested, but essentially we should call the move notification before actually moving things because only this way we have the correct order for DMA-buf and VM move notifications as well.

Also rework the statistic handling so that we don't update the eviction counter before the move.

v2: add missing NULL check

CVE-2024-27395:
In the Linux kernel, the following vulnerability has been resolved:

net: openvswitch: Fix Use-After-Free in ovs_ct_exit

Since kfree_rcu, which is called in the hlist_for_each_entry_rcu traversal of ovs_ct_limit_exit, is not part of the RCU read critical section, it is possible that the RCU grace period will pass during the traversal and the key will be free.

To prevent this, it should be changed to hlist_for_each_entry_safe.

CVE-2024-27393:
In the Linux kernel, the following vulnerability has been resolved:

xen-netfront: Add missing skb_mark_for_recycle

Notice that skb_mark_for_recycle() is introduced later than fixes tag in commit 6a5bcd84e886 (page_pool: Allow drivers to hint on SKB recycling).

It is believed that fixes tag were missing a call to page_pool_release_page() between v5.9 to v5.14, after which is should have used skb_mark_for_recycle().
Since v6.6 the call page_pool_release_page() were removed (in commit 535b9c61bdef (net: page_pool: hide page_pool_release_page()) and remaining callers converted (in commit 6bfef2ec0172 (Merge branch 'net-page_pool-remove-page_pool_release_page')).

This leak became visible in v6.8 via commit dba1b8a7ab68 (mm/page_pool: catch page_pool memory leaks).

CVE-2024-27061:
In the Linux kernel, the following vulnerability has been resolved:

crypto: sun8i-ce - Fix use after free in unprepare

sun8i_ce_cipher_unprepare should be called before crypto_finalize_skcipher_request, because client callbacks may immediately free memory, that isn't needed anymore. But it will be used by unprepare after free. Before removing prepare/unprepare callbacks it was handled by crypto engine in crypto_finalize_request.

Usually that results in a pointer dereference problem during a in crypto selftest.
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030 Mem abort info:
ESR = 0x0000000096000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info:
ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=000000004716d000 [0000000000000030] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 0000000096000004 [#1] SMP

This problem is detected by KASAN as well.
================================================================== BUG: KASAN: slab-use-after-free in sun8i_ce_cipher_do_one+0x6e8/0xf80 [sun8i_ce] Read of size 8 at addr ffff00000dcdc040 by task 1c15000.crypto-/373

Hardware name: Pine64 PinePhone (1.2) (DT) Call trace:
dump_backtrace+0x9c/0x128 show_stack+0x20/0x38 dump_stack_lvl+0x48/0x60 print_report+0xf8/0x5d8 kasan_report+0x90/0xd0
__asan_load8+0x9c/0xc0 sun8i_ce_cipher_do_one+0x6e8/0xf80 [sun8i_ce] crypto_pump_work+0x354/0x620 [crypto_engine] kthread_worker_fn+0x244/0x498 kthread+0x168/0x178 ret_from_fork+0x10/0x20

Allocated by task 379:
kasan_save_stack+0x3c/0x68 kasan_set_track+0x2c/0x40 kasan_save_alloc_info+0x24/0x38
__kasan_kmalloc+0xd4/0xd8
__kmalloc+0x74/0x1d0 alg_test_skcipher+0x90/0x1f0 alg_test+0x24c/0x830 cryptomgr_test+0x38/0x60 kthread+0x168/0x178 ret_from_fork+0x10/0x20

Freed by task 379:
kasan_save_stack+0x3c/0x68 kasan_set_track+0x2c/0x40 kasan_save_free_info+0x38/0x60
__kasan_slab_free+0x100/0x170 slab_free_freelist_hook+0xd4/0x1e8
__kmem_cache_free+0x15c/0x290 kfree+0x74/0x100 kfree_sensitive+0x80/0xb0 alg_test_skcipher+0x12c/0x1f0 alg_test+0x24c/0x830 cryptomgr_test+0x38/0x60 kthread+0x168/0x178 ret_from_fork+0x10/0x20

The buggy address belongs to the object at ffff00000dcdc000 which belongs to the cache kmalloc-256 of size 256 The buggy address is located 64 bytes inside of freed 256-byte region [ffff00000dcdc000, ffff00000dcdc100)

CVE-2024-27050:
In the Linux kernel, the following vulnerability has been resolved:

libbpf: Use OPTS_SET() macro in bpf_xdp_query()

When the feature_flags and xdp_zc_max_segs fields were added to the libbpf bpf_xdp_query_opts, the code writing them did not use the OPTS_SET() macro.
This causes libbpf to write to those fields unconditionally, which means that programs compiled against an older version of libbpf (with a smaller size of the bpf_xdp_query_opts struct) will have its stack corrupted by libbpf writing out of bounds.

The patch adding the feature_flags field has an early bail out if the feature_flags field is not part of the opts struct (via the OPTS_HAS) macro, but the patch adding xdp_zc_max_segs does not. For consistency, this fix just changes the assignments to both fields to use the OPTS_SET() macro.

CVE-2024-27048:
In the Linux kernel, the following vulnerability has been resolved:

wifi: brcm80211: handle pmk_op allocation failure

The kzalloc() in brcmf_pmksa_v3_op() will return null if the physical memory has run out. As a result, if we dereference the null value, the null pointer dereference bug will happen.

Return -ENOMEM from brcmf_pmksa_v3_op() if kzalloc() fails for pmk_op.

CVE-2024-27047:
In the Linux kernel, the following vulnerability has been resolved:

net: phy: fix phy_get_internal_delay accessing an empty array

The phy_get_internal_delay function could try to access to an empty array in the case that the driver is calling phy_get_internal_delay without defining delay_values and rx-internal-delay-ps or tx-internal-delay-ps is defined to 0 in the device-tree.
This will lead to unable to handle kernel NULL pointer dereference at virtual address 0. To avoid this kernel oops, the test should be delay >= 0. As there is already delay < 0 test just before, the test could only be size == 0.

CVE-2024-27046:
In the Linux kernel, the following vulnerability has been resolved:

nfp: flower: handle acti_netdevs allocation failure

The kmalloc_array() in nfp_fl_lag_do_work() will return null, if the physical memory has run out. As a result, if we dereference the acti_netdevs, the null pointer dereference bugs will happen.

This patch adds a check to judge whether allocation failure occurs.
If it happens, the delayed work will be rescheduled and try again.

CVE-2024-27030:
In the Linux kernel, the following vulnerability has been resolved:

octeontx2-af: Use separate handlers for interrupts

For PF to AF interrupt vector and VF to AF vector same interrupt handler is registered which is causing race condition.
When two interrupts are raised to two CPUs at same time then two cores serve same event corrupting the data.

CVE-2024-27026:
In the Linux kernel, the following vulnerability has been resolved:

vmxnet3: Fix missing reserved tailroom

Use rbi->len instead of rcd->len for non-dataring packet.

Found issue:
XDP_WARN: xdp_update_frame_from_buff(line:278): Driver BUG: missing reserved tailroom WARNING: CPU: 0 PID: 0 at net/core/xdp.c:586 xdp_warn+0xf/0x20 CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W O 6.5.1 #1 RIP: 0010:xdp_warn+0xf/0x20 ...
? xdp_warn+0xf/0x20 xdp_do_redirect+0x15f/0x1c0 vmxnet3_run_xdp+0x17a/0x400 [vmxnet3] vmxnet3_process_xdp+0xe4/0x760 [vmxnet3] ? vmxnet3_tq_tx_complete.isra.0+0x21e/0x2c0 [vmxnet3] vmxnet3_rq_rx_complete+0x7ad/0x1120 [vmxnet3] vmxnet3_poll_rx_only+0x2d/0xa0 [vmxnet3]
__napi_poll+0x20/0x180 net_rx_action+0x177/0x390

CVE-2024-27024:
In the Linux kernel, the following vulnerability has been resolved:

net/rds: fix WARNING in rds_conn_connect_if_down

If connection isn't established yet, get_mr() will fail, trigger connection after get_mr().

CVE-2024-27020:
In the Linux kernel, the following vulnerability has been resolved:

netfilter: nf_tables: Fix potential data-race in __nft_expr_type_get()

nft_unregister_expr() can concurrent with __nft_expr_type_get(), and there is not any protection when iterate over nf_tables_expressions list in __nft_expr_type_get(). Therefore, there is potential data-race of nf_tables_expressions list entry.

Use list_for_each_entry_rcu() to iterate over nf_tables_expressions list in __nft_expr_type_get(), and use rcu_read_lock() in the caller nft_expr_type_get() to protect the entire type query process.

CVE-2024-27000:
In the Linux kernel, the following vulnerability has been resolved:

serial: mxs-auart: add spinlock around changing cts state

The uart_handle_cts_change() function in serial_core expects the caller to hold uport->lock. For example, I have seen the below kernel splat, when the Bluetooth driver is loaded on an i.MX28 board.

[ 85.119255] ------------[ cut here ]------------ [ 85.124413] WARNING: CPU: 0 PID: 27 at /drivers/tty/serial/serial_core.c:3453 uart_handle_cts_change+0xb4/0xec [ 85.134694] Modules linked in: hci_uart bluetooth ecdh_generic ecc wlcore_sdio configfs [ 85.143314] CPU: 0 PID: 27 Comm: kworker/u3:0 Not tainted 6.6.3-00021-gd62a2f068f92 #1 [ 85.151396] Hardware name: Freescale MXS (Device Tree) [ 85.156679] Workqueue: hci0 hci_power_on [bluetooth] (...) [ 85.191765] uart_handle_cts_change from mxs_auart_irq_handle+0x380/0x3f4 [ 85.198787] mxs_auart_irq_handle from __handle_irq_event_percpu+0x88/0x210 (...)

CVE-2024-26999:
In the Linux kernel, the following vulnerability has been resolved:

serial/pmac_zilog: Remove flawed mitigation for rx irq flood

The mitigation was intended to stop the irq completely. That may be better than a hard lock-up but it turns out that you get a crash anyway if you're using pmac_zilog as a serial console:

ttyPZ0: pmz: rx irq flood ! BUG: spinlock recursion on CPU#0, swapper/0

That's because the pr_err() call in pmz_receive_chars() results in pmz_console_write() attempting to lock a spinlock already locked in pmz_interrupt(). With CONFIG_DEBUG_SPINLOCK=y, this produces a fatal BUG splat. The spinlock in question is the one in struct uart_port.

Even when it's not fatal, the serial port rx function ceases to work.
Also, the iteration limit doesn't play nicely with QEMU, as can be seen in the bug report linked below.

A web search for other reports of the error message pmz: rx irq flood didn't produce anything. So I don't think this code is needed any more.
Remove it.

CVE-2024-26998:
In the Linux kernel, the following vulnerability has been resolved:

serial: core: Clearing the circular buffer before NULLifying it

The circular buffer is NULLified in uart_tty_port_shutdown() under the spin lock. However, the PM or other timer based callbacks may still trigger after this event without knowning that buffer pointer is not valid. Since the serial code is a bit inconsistent in checking the buffer state (some rely on the head-tail positions, some on the buffer pointer), it's better to have both aligned, i.e. buffer pointer to be NULL and head-tail possitions to be the same, meaning it's empty.
This will prevent asynchronous calls to dereference NULL pointer as reported recently in 8250 case:

BUG: kernel NULL pointer dereference, address: 00000cf5 Workqueue: pm pm_runtime_work EIP: serial8250_tx_chars (drivers/tty/serial/8250/8250_port.c:1809) ...
? serial8250_tx_chars (drivers/tty/serial/8250/8250_port.c:1809)
__start_tx (drivers/tty/serial/8250/8250_port.c:1551) serial8250_start_tx (drivers/tty/serial/8250/8250_port.c:1654) serial_port_runtime_suspend (include/linux/serial_core.h:667 drivers/tty/serial/serial_port.c:63)
__rpm_callback (drivers/base/power/runtime.c:393) ? serial_port_remove (drivers/tty/serial/serial_port.c:50) rpm_suspend (drivers/base/power/runtime.c:447)

The proposed change will prevent ->start_tx() to be called during suspend on shut down port.

CVE-2024-26997:
In the Linux kernel, the following vulnerability has been resolved:

usb: dwc2: host: Fix dereference issue in DDMA completion flow.

Fixed variable dereference issue in DDMA completion flow.

CVE-2024-26994:
In the Linux kernel, the following vulnerability has been resolved:

speakup: Avoid crash on very long word

In case a console is set up really large and contains a really long word (> 256 characters), we have to stop before the length of the word buffer.

CVE-2024-26992:
In the Linux kernel, the following vulnerability has been resolved:

KVM: x86/pmu: Disable support for adaptive PEBS

Drop support for virtualizing adaptive PEBS, as KVM's implementation is architecturally broken without an obvious/easy path forward, and because exposing adaptive PEBS can leak host LBRs to the guest, i.e. can leak host kernel addresses to the guest.

Bug #1 is that KVM doesn't account for the upper 32 bits of IA32_FIXED_CTR_CTRL when (re)programming fixed counters, e.g fixed_ctrl_field() drops the upper bits, reprogram_fixed_counters() stores local variables as u8s and truncates the upper bits too, etc.

Bug #2 is that, because KVM _always_ sets precise_ip to a non-zero value for PEBS events, perf will _always_ generate an adaptive record, even if the guest requested a basic record. Note, KVM will also enable adaptive PEBS in individual *counter*, even if adaptive PEBS isn't exposed to the guest, but this is benign as MSR_PEBS_DATA_CFG is guaranteed to be zero, i.e. the guest will only ever see Basic records.

Bug #3 is in perf. intel_pmu_disable_fixed() doesn't clear the upper bits either, i.e. leaves ICL_FIXED_0_ADAPTIVE set, and intel_pmu_enable_fixed() effectively doesn't clear ICL_FIXED_0_ADAPTIVE either. I.e. perf _always_ enables ADAPTIVE counters, regardless of what KVM requests.

Bug #4 is that adaptive PEBS *might* effectively bypass event filters set by the host, as Updated Memory Access Info Group records information that might be disallowed by userspace via KVM_SET_PMU_EVENT_FILTER.

Bug #5 is that KVM doesn't ensure LBR MSRs hold guest values (or at least zeros) when entering a vCPU with adaptive PEBS, which allows the guest to read host LBRs, i.e. host RIPs/addresses, by enabling LBR Entries records.

Disable adaptive PEBS support as an immediate fix due to the severity of the LBR leak in particular, and because fixing all of the bugs will be non-trivial, e.g. not suitable for backporting to stable kernels.

Note! This will break live migration, but trying to make KVM play nice with live migration would be quite complicated, wouldn't be guaranteed to work (i.e. KVM might still kill/confuse the guest), and it's not clear that there are any publicly available VMMs that support adaptive PEBS, let alone live migrate VMs that support adaptive PEBS, e.g. QEMU doesn't support PEBS in any capacity.

CVE-2024-26990:
In the Linux kernel, the following vulnerability has been resolved:

KVM: x86/mmu: Write-protect L2 SPTEs in TDP MMU when clearing dirty status

Check kvm_mmu_page_ad_need_write_protect() when deciding whether to write-protect or clear D-bits on TDP MMU SPTEs, so that the TDP MMU accounts for any role-specific reasons for disabling D-bit dirty logging.

Specifically, TDP MMU SPTEs must be write-protected when the TDP MMU is being used to run an L2 (i.e. L1 has disabled EPT) and PML is enabled.
KVM always disables PML when running L2, even when L1 and L2 GPAs are in the some domain, so failing to write-protect TDP MMU SPTEs will cause writes made by L2 to not be reflected in the dirty log.

[sean: massage shortlog and changelog, tweak ternary op formatting]

CVE-2024-26988:
In the Linux kernel, the following vulnerability has been resolved:

init/main.c: Fix potential static_command_line memory overflow

We allocate memory of size 'xlen + strlen(boot_command_line) + 1' for static_command_line, but the strings copied into static_command_line are extra_command_line and command_line, rather than extra_command_line and boot_command_line.

When strlen(command_line) > strlen(boot_command_line), static_command_line will overflow.

This patch just recovers strlen(command_line) which was miss-consolidated with strlen(boot_command_line) in the commit f5c7310ac73e (init/main: add checks for the return value of memblock_alloc*())

CVE-2024-26986:
In the Linux kernel, the following vulnerability has been resolved:

drm/amdkfd: Fix memory leak in create_process failure

Fix memory leak due to a leaked mmget reference on an error handling code path that is triggered when attempting to create KFD processes while a GPU reset is in progress.

CVE-2024-26984:
In the Linux kernel, the following vulnerability has been resolved:

nouveau: fix instmem race condition around ptr stores

Running a lot of VK CTS in parallel against nouveau, once every few hours you might see something like this crash.

BUG: kernel NULL pointer dereference, address: 0000000000000008 PGD 8000000114e6e067 P4D 8000000114e6e067 PUD 109046067 PMD 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 7 PID: 53891 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27 Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021 RIP: 0010:gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] Code: c7 48 01 c8 49 89 45 58 85 d2 0f 84 95 00 00 00 41 0f b7 46 12 49 8b 7e 08 89 da 42 8d 2c f8 48 8b 47 08 41 83 c7 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7e 08 48 89 d9 48 8d 75 04 48 c1 RSP: 0000:ffffac20c5857838 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 00000000004d8001 RCX: 0000000000000001 RDX: 00000000004d8001 RSI: 00000000000006d8 RDI: ffffa07afe332180 RBP: 00000000000006d8 R08: ffffac20c5857ad0 R09: 0000000000ffff10 R10: 0000000000000001 R11: ffffa07af27e2de0 R12: 000000000000001c R13: ffffac20c5857ad0 R14: ffffa07a96fe9040 R15: 000000000000001c FS: 00007fe395eed7c0(0000) GS:ffffa07e2c980000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000011febe001 CR4: 00000000003706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace:

...

? gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] ? gp100_vmm_pgt_mem+0x37/0x180 [nouveau] nvkm_vmm_iter+0x351/0xa20 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __lock_acquire+0x3ed/0x2170 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_ptes_get_map+0xc2/0x100 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_map_locked+0x224/0x3a0 [nouveau]

Adding any sort of useful debug usually makes it go away, so I hand wrote the function in a line, and debugged the asm.

Every so often pt->memory->ptrs is NULL. This ptrs ptr is set in the nv50_instobj_acquire called from nvkm_kmap.

If Thread A and Thread B both get to nv50_instobj_acquire around the same time, and Thread A hits the refcount_set line, and in lockstep thread B succeeds at refcount_inc_not_zero, there is a chance the ptrs value won't have been stored since refcount_set is unordered. Force a memory barrier here, I picked smp_mb, since we want it on all CPUs and it's write followed by a read.

v2: use paired smp_rmb/smp_wmb.

CVE-2024-26979:
In the Linux kernel, the following vulnerability has been resolved:

drm/vmwgfx: Fix possible null pointer derefence with invalid contexts

vmw_context_cotable can return either an error or a null pointer and its usage sometimes went unchecked. Subsequent code would then try to access either a null pointer or an error value.

The invalid dereferences were only possible with malformed userspace apps which never properly initialized the rendering contexts.

Check the results of vmw_context_cotable to fix the invalid derefs.

Thanks:
ziming zhang(@ezrak1e) from Ant Group Light-Year Security Lab who was the first person to discover it.
Niels De Graef who reported it and helped to track down the poc.

CVE-2024-26978:
In the Linux kernel, the following vulnerability has been resolved:

serial: max310x: fix NULL pointer dereference in I2C instantiation

When trying to instantiate a max14830 device from userspace:

echo max14830 0x60 > /sys/bus/i2c/devices/i2c-2/new_device

we get the following error:

Unable to handle kernel NULL pointer dereference at virtual address...
...
Call trace:
max310x_i2c_probe+0x48/0x170 [max310x] i2c_device_probe+0x150/0x2a0 ...

Add check for validity of devtype to prevent the error, and abort probe with a meaningful error message.

CVE-2024-26976:
In the Linux kernel, the following vulnerability has been resolved:

KVM: Always flush async #PF workqueue when vCPU is being destroyed

Always flush the per-vCPU async #PF workqueue when a vCPU is clearing its completion queue, e.g. when a VM and all its vCPUs is being destroyed.
KVM must ensure that none of its workqueue callbacks is running when the last reference to the KVM _module_ is put. Gifting a reference to the associated VM prevents the workqueue callback from dereferencing freed vCPU/VM memory, but does not prevent the KVM module from being unloaded before the callback completes.

Drop the misguided VM refcount gifting, as calling kvm_put_kvm() from async_pf_execute() if kvm_put_kvm() flushes the async #PF workqueue will result in deadlock. async_pf_execute() can't return until kvm_put_kvm() finishes, and kvm_put_kvm() can't return until async_pf_execute() finishes:

WARNING: CPU: 8 PID: 251 at virt/kvm/kvm_main.c:1435 kvm_put_kvm+0x2d/0x320 [kvm] Modules linked in: vhost_net vhost vhost_iotlb tap kvm_intel kvm irqbypass CPU: 8 PID: 251 Comm: kworker/8:1 Tainted: G W 6.6.0-rc1-e7af8d17224a-x86/gmem-vm #119 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Workqueue: events async_pf_execute [kvm] RIP: 0010:kvm_put_kvm+0x2d/0x320 [kvm] Call Trace:
<TASK> async_pf_execute+0x198/0x260 [kvm] process_one_work+0x145/0x2d0 worker_thread+0x27e/0x3a0 kthread+0xba/0xe0 ret_from_fork+0x2d/0x50 ret_from_fork_asm+0x11/0x20 </TASK>
---[ end trace 0000000000000000 ]--- INFO: task kworker/8:1:251 blocked for more than 120 seconds.
Tainted: G W 6.6.0-rc1-e7af8d17224a-x86/gmem-vm #119 echo 0 > /proc/sys/kernel/hung_task_timeout_secs disables this message.
task:kworker/8:1 state:D stack:0 pid:251 ppid:2 flags:0x00004000 Workqueue: events async_pf_execute [kvm] Call Trace:
<TASK>
__schedule+0x33f/0xa40 schedule+0x53/0xc0 schedule_timeout+0x12a/0x140
__wait_for_common+0x8d/0x1d0
__flush_work.isra.0+0x19f/0x2c0 kvm_clear_async_pf_completion_queue+0x129/0x190 [kvm] kvm_arch_destroy_vm+0x78/0x1b0 [kvm] kvm_put_kvm+0x1c1/0x320 [kvm] async_pf_execute+0x198/0x260 [kvm] process_one_work+0x145/0x2d0 worker_thread+0x27e/0x3a0 kthread+0xba/0xe0 ret_from_fork+0x2d/0x50 ret_from_fork_asm+0x11/0x20 </TASK>

If kvm_clear_async_pf_completion_queue() actually flushes the workqueue, then there's no need to gift async_pf_execute() a reference because all invocations of async_pf_execute() will be forced to complete before the vCPU and its VM are destroyed/freed. And that in turn fixes the module unloading bug as __fput() won't do module_put() on the last vCPU reference until the vCPU has been freed, e.g. if closing the vCPU file also puts the last reference to the KVM module.

Note that kvm_check_async_pf_completion() may also take the work item off the completion queue and so also needs to flush the work queue, as the work will not be seen by kvm_clear_async_pf_completion_queue(). Waiting on the workqueue could theoretically delay a vCPU due to waiting for the work to complete, but that's a very, very small chance, and likely a very small delay. kvm_arch_async_page_present_queued() unconditionally makes a new request, i.e. will effectively delay entering the guest, so the remaining work is really just:

trace_kvm_async_pf_completed(addr, cr2_or_gpa);

__kvm_vcpu_wake_up(vcpu);

mmput(mm);

and mmput() can't drop the last reference to the page tables if the vCPU is still alive, i.e. the vCPU won't get stuck tearing down page tables.

Add a helper to do the flushing, specifically to deal with wakeup all work items, as they aren't actually work items, i.e. are never placed in a workqueue. Trying to flush a bogus workqueue entry rightly makes
__flush_work() complain (kudos to whoever added that sanity check).

Note, commit 5f6de5cbebee (KVM: Prevent module exit until al
---truncated---

CVE-2024-26975:
In the Linux kernel, the following vulnerability has been resolved:

powercap: intel_rapl: Fix a NULL pointer dereference

A NULL pointer dereference is triggered when probing the MMIO RAPL driver on platforms with CPU ID not listed in intel_rapl_common CPU model list.

This is because the intel_rapl_common module still probes on such platforms even if 'defaults_msr' is not set after commit 1488ac990ac8 (powercap: intel_rapl: Allow probing without CPUID match). Thus the MMIO RAPL rp->priv->defaults is NULL when registering to RAPL framework.

Fix the problem by adding sanity check to ensure rp->priv->rapl_defaults is always valid.

CVE-2024-26974:
In the Linux kernel, the following vulnerability has been resolved:

crypto: qat - resolve race condition during AER recovery

During the PCI AER system's error recovery process, the kernel driver may encounter a race condition with freeing the reset_data structure's memory. If the device restart will take more than 10 seconds the function scheduling that restart will exit due to a timeout, and the reset_data structure will be freed. However, this data structure is used for completion notification after the restart is completed, which leads to a UAF bug.

This results in a KFENCE bug notice.

BUG: KFENCE: use-after-free read in adf_device_reset_worker+0x38/0xa0 [intel_qat] Use-after-free read at 0x00000000bc56fddf (in kfence-#142):
adf_device_reset_worker+0x38/0xa0 [intel_qat] process_one_work+0x173/0x340

To resolve this race condition, the memory associated to the container of the work_struct is freed on the worker if the timeout expired, otherwise on the function that schedules the worker.
The timeout detection can be done by checking if the caller is still waiting for completion or not by using completion_done() function.

CVE-2024-26971:
In the Linux kernel, the following vulnerability has been resolved:

clk: qcom: gcc-ipq5018: fix terminating of frequency table arrays

The frequency table arrays are supposed to be terminated with an empty element. Add such entry to the end of the arrays where it is missing in order to avoid possible out-of-bound access when the table is traversed by functions like qcom_find_freq() or qcom_find_freq_floor().

CVE-2024-26970:
In the Linux kernel, the following vulnerability has been resolved:

clk: qcom: gcc-ipq6018: fix terminating of frequency table arrays

The frequency table arrays are supposed to be terminated with an empty element. Add such entry to the end of the arrays where it is missing in order to avoid possible out-of-bound access when the table is traversed by functions like qcom_find_freq() or qcom_find_freq_floor().

Only compile tested.

CVE-2024-26969:
In the Linux kernel, the following vulnerability has been resolved:

clk: qcom: gcc-ipq8074: fix terminating of frequency table arrays

The frequency table arrays are supposed to be terminated with an empty element. Add such entry to the end of the arrays where it is missing in order to avoid possible out-of-bound access when the table is traversed by functions like qcom_find_freq() or qcom_find_freq_floor().

Only compile tested.

CVE-2024-26968:
In the Linux kernel, the following vulnerability has been resolved:

clk: qcom: gcc-ipq9574: fix terminating of frequency table arrays

The frequency table arrays are supposed to be terminated with an empty element. Add such entry to the end of the arrays where it is missing in order to avoid possible out-of-bound access when the table is traversed by functions like qcom_find_freq() or qcom_find_freq_floor().

Only compile tested.

CVE-2024-26966:
In the Linux kernel, the following vulnerability has been resolved:

clk: qcom: mmcc-apq8084: fix terminating of frequency table arrays

The frequency table arrays are supposed to be terminated with an empty element. Add such entry to the end of the arrays where it is missing in order to avoid possible out-of-bound access when the table is traversed by functions like qcom_find_freq() or qcom_find_freq_floor().

Only compile tested.

CVE-2024-26964:
In the Linux kernel, the following vulnerability has been resolved:

usb: xhci: Add error handling in xhci_map_urb_for_dma

Currently xhci_map_urb_for_dma() creates a temporary buffer and copies the SG list to the new linear buffer. But if the kzalloc_node() fails, then the following sg_pcopy_to_buffer() can lead to crash since it tries to memcpy to NULL pointer.

So return -ENOMEM if kzalloc returns null pointer.

CVE-2024-26963:
In the Linux kernel, the following vulnerability has been resolved:

usb: dwc3-am62: fix module unload/reload behavior

As runtime PM is enabled, the module can be runtime suspended when .remove() is called.

Do a pm_runtime_get_sync() to make sure module is active before doing any register operations.

Doing a pm_runtime_put_sync() should disable the refclk so no need to disable it again.

Fixes the below warning at module removel.

[ 39.705310] ------------[ cut here ]------------ [ 39.710004] clk:162:3 already disabled [ 39.713941] WARNING: CPU: 0 PID: 921 at drivers/clk/clk.c:1090 clk_core_disable+0xb0/0xb8

We called of_platform_populate() in .probe() so call the cleanup function of_platform_depopulate() in .remove().
Get rid of the now unnnecessary dwc3_ti_remove_core().
Without this, module re-load doesn't work properly.

CVE-2024-26962:
In the Linux kernel, the following vulnerability has been resolved:

dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape

For raid456, if reshape is still in progress, then IO across reshape position will wait for reshape to make progress. However, for dm-raid, in following cases reshape will never make progress hence IO will hang:

1) the array is read-only;
2) MD_RECOVERY_WAIT is set;
3) MD_RECOVERY_FROZEN is set;

After commit c467e97f079f (md/raid6: use valid sector values to determine if an I/O should wait on the reshape) fix the problem that IO across reshape position doesn't wait for reshape, the dm-raid test shell/lvconvert-raid-reshape.sh start to hang:

[root@fedora ~]# cat /proc/979/stack [<0>] wait_woken+0x7d/0x90 [<0>] raid5_make_request+0x929/0x1d70 [raid456] [<0>] md_handle_request+0xc2/0x3b0 [md_mod] [<0>] raid_map+0x2c/0x50 [dm_raid] [<0>] __map_bio+0x251/0x380 [dm_mod] [<0>] dm_submit_bio+0x1f0/0x760 [dm_mod] [<0>] __submit_bio+0xc2/0x1c0 [<0>] submit_bio_noacct_nocheck+0x17f/0x450 [<0>] submit_bio_noacct+0x2bc/0x780 [<0>] submit_bio+0x70/0xc0 [<0>] mpage_readahead+0x169/0x1f0 [<0>] blkdev_readahead+0x18/0x30 [<0>] read_pages+0x7c/0x3b0 [<0>] page_cache_ra_unbounded+0x1ab/0x280 [<0>] force_page_cache_ra+0x9e/0x130 [<0>] page_cache_sync_ra+0x3b/0x110 [<0>] filemap_get_pages+0x143/0xa30 [<0>] filemap_read+0xdc/0x4b0 [<0>] blkdev_read_iter+0x75/0x200 [<0>] vfs_read+0x272/0x460 [<0>] ksys_read+0x7a/0x170 [<0>] __x64_sys_read+0x1c/0x30 [<0>] do_syscall_64+0xc6/0x230 [<0>] entry_SYSCALL_64_after_hwframe+0x6c/0x74

This is because reshape can't make progress.

For md/raid, the problem doesn't exist because register new sync_thread doesn't rely on the IO to be done any more:

1) If array is read-only, it can switch to read-write by ioctl/sysfs;
2) md/raid never set MD_RECOVERY_WAIT;
3) If MD_RECOVERY_FROZEN is set, mddev_suspend() doesn't hold 'reconfig_mutex', hence it can be cleared and reshape can continue by sysfs api 'sync_action'.

However, I'm not sure yet how to avoid the problem in dm-raid yet. This patch on the one hand make sure raid_message() can't change sync_thread() through raid_message() after presuspend(), on the other hand detect the above 3 cases before wait for IO do be done in dm_suspend(), and let dm-raid requeue those IO.

CVE-2024-26959:
In the Linux kernel, the following vulnerability has been resolved:

Bluetooth: btnxpuart: Fix btnxpuart_close

Fix scheduling while atomic BUG in btnxpuart_close(), properly purge the transmit queue and free the receive skb.

[ 10.973809] BUG: scheduling while atomic: kworker/u9:0/80/0x00000002 ...
[ 10.980740] CPU: 3 PID: 80 Comm: kworker/u9:0 Not tainted 6.8.0-rc7-0.0.0-devel-00005-g61fdfceacf09 #1 [ 10.980751] Hardware name: Toradex Verdin AM62 WB on Dahlia Board (DT) [ 10.980760] Workqueue: hci0 hci_power_off [bluetooth] [ 10.981169] Call trace:
...
[ 10.981363] uart_update_mctrl+0x58/0x78 [ 10.981373] uart_dtr_rts+0x104/0x114 [ 10.981381] tty_port_shutdown+0xd4/0xdc [ 10.981396] tty_port_close+0x40/0xbc [ 10.981407] uart_close+0x34/0x9c [ 10.981414] ttyport_close+0x50/0x94 [ 10.981430] serdev_device_close+0x40/0x50 [ 10.981442] btnxpuart_close+0x24/0x98 [btnxpuart] [ 10.981469] hci_dev_close_sync+0x2d8/0x718 [bluetooth] [ 10.981728] hci_dev_do_close+0x2c/0x70 [bluetooth] [ 10.981862] hci_power_off+0x20/0x64 [bluetooth]

CVE-2024-26953:
In the Linux kernel, the following vulnerability has been resolved:

net: esp: fix bad handling of pages from page_pool

When the skb is reorganized during esp_output (!esp->inline), the pages coming from the original skb fragments are supposed to be released back to the system through put_page. But if the skb fragment pages are originating from a page_pool, calling put_page on them will trigger a page_pool leak which will eventually result in a crash.

This leak can be easily observed when using CONFIG_DEBUG_VM and doing ipsec + gre (non offloaded) forwarding:

BUG: Bad page state in process ksoftirqd/16 pfn:1451b6 page:00000000de2b8d32 refcount:0 mapcount:0 mapping:0000000000000000 index:0x1451b6000 pfn:0x1451b6 flags: 0x200000000000000(node=0|zone=2) page_type: 0xffffffff() raw: 0200000000000000 dead000000000040 ffff88810d23c000 0000000000000000 raw: 00000001451b6000 0000000000000001 00000000ffffffff 0000000000000000 page dumped because: page_pool leak Modules linked in: ip_gre gre mlx5_ib mlx5_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat nf_nat xt_addrtype br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core overlay zram zsmalloc fuse [last unloaded: mlx5_core] CPU: 16 PID: 96 Comm: ksoftirqd/16 Not tainted 6.8.0-rc4+ #22 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace:
<TASK> dump_stack_lvl+0x36/0x50 bad_page+0x70/0xf0 free_unref_page_prepare+0x27a/0x460 free_unref_page+0x38/0x120 esp_ssg_unref.isra.0+0x15f/0x200 esp_output_tail+0x66d/0x780 esp_xmit+0x2c5/0x360 validate_xmit_xfrm+0x313/0x370 ? validate_xmit_skb+0x1d/0x330 validate_xmit_skb_list+0x4c/0x70 sch_direct_xmit+0x23e/0x350
__dev_queue_xmit+0x337/0xba0 ? nf_hook_slow+0x3f/0xd0 ip_finish_output2+0x25e/0x580 iptunnel_xmit+0x19b/0x240 ip_tunnel_xmit+0x5fb/0xb60 ipgre_xmit+0x14d/0x280 [ip_gre] dev_hard_start_xmit+0xc3/0x1c0
__dev_queue_xmit+0x208/0xba0 ? nf_hook_slow+0x3f/0xd0 ip_finish_output2+0x1ca/0x580 ip_sublist_rcv_finish+0x32/0x40 ip_sublist_rcv+0x1b2/0x1f0 ? ip_rcv_finish_core.constprop.0+0x460/0x460 ip_list_rcv+0x103/0x130
__netif_receive_skb_list_core+0x181/0x1e0 netif_receive_skb_list_internal+0x1b3/0x2c0 napi_gro_receive+0xc8/0x200 gro_cell_poll+0x52/0x90
__napi_poll+0x25/0x1a0 net_rx_action+0x28e/0x300
__do_softirq+0xc3/0x276 ? sort_range+0x20/0x20 run_ksoftirqd+0x1e/0x30 smpboot_thread_fn+0xa6/0x130 kthread+0xcd/0x100 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x31/0x50 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork_asm+0x11/0x20 </TASK>

The suggested fix is to introduce a new wrapper (skb_page_unref) that covers page refcounting for page_pool pages as well.

CVE-2024-26946:
In the Linux kernel, the following vulnerability has been resolved:

kprobes/x86: Use copy_from_kernel_nofault() to read from unsafe address

Read from an unsafe address with copy_from_kernel_nofault() in arch_adjust_kprobe_addr() because this function is used before checking the address is in text or not. Syzcaller bot found a bug and reported the case if user specifies inaccessible data area, arch_adjust_kprobe_addr() will cause a kernel panic.

[ mingo: Clarified the comment. ]

CVE-2024-26943:
In the Linux kernel, the following vulnerability has been resolved:

nouveau/dmem: handle kcalloc() allocation failure

The kcalloc() in nouveau_dmem_evict_chunk() will return null if the physical memory has run out. As a result, if we dereference src_pfns, dst_pfns or dma_addrs, the null pointer dereference bugs will happen.

Moreover, the GPU is going away. If the kcalloc() fails, we could not evict all pages mapping a chunk. So this patch adds a __GFP_NOFAIL flag in kcalloc().

Finally, as there is no need to have physically contiguous memory, this patch switches kcalloc() to kvcalloc() in order to avoid failing allocations.

CVE-2024-26938:
In the Linux kernel, the following vulnerability has been resolved:

drm/i915/bios: Tolerate devdata==NULL in intel_bios_encoder_supports_dp_dual_mode()

If we have no VBT, or the VBT didn't declare the encoder in question, we won't have the 'devdata' for the encoder.
Instead of oopsing just bail early.

We won't be able to tell whether the port is DP++ or not, but so be it.

(cherry picked from commit 26410896206342c8a80d2b027923e9ee7d33b733)

CVE-2024-26937:
In the Linux kernel, the following vulnerability has been resolved:

drm/i915/gt: Reset queue_priority_hint on parking

Originally, with strict in order execution, we could complete execution only when the queue was empty. Preempt-to-busy allows replacement of an active request that may complete before the preemption is processed by HW. If that happens, the request is retired from the queue, but the queue_priority_hint remains set, preventing direct submission until after the next CS interrupt is processed.

This preempt-to-busy race can be triggered by the heartbeat, which will also act as the power-management barrier and upon completion allow us to idle the HW. We may process the completion of the heartbeat, and begin parking the engine before the CS event that restores the queue_priority_hint, causing us to fail the assertion that it is MIN.

<3>[ 166.210729] __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1)) <0>[ 166.210781] Dumping ftrace buffer:
<0>[ 166.210795] --------------------------------- ...
<0>[ 167.302811] drm_fdin-1097 2..s1. 165741070us : trace_ports: 0000:00:02.0 rcs0: promote { ccid:20 1217:2 prio 0 } <0>[ 167.302861] drm_fdin-1097 2d.s2. 165741072us : execlists_submission_tasklet: 0000:00:02.0 rcs0:
preempting last=1217:2, prio=0, hint=2147483646 <0>[ 167.302928] drm_fdin-1097 2d.s2. 165741072us : __i915_request_unsubmit: 0000:00:02.0 rcs0:
fence 1217:2, current 0 <0>[ 167.302992] drm_fdin-1097 2d.s2. 165741073us : __i915_request_submit: 0000:00:02.0 rcs0: fence 3:4660, current 4659 <0>[ 167.303044] drm_fdin-1097 2d.s1. 165741076us : execlists_submission_tasklet: 0000:00:02.0 rcs0:
context:3 schedule-in, ccid:40 <0>[ 167.303095] drm_fdin-1097 2d.s1. 165741077us : trace_ports: 0000:00:02.0 rcs0: submit { ccid:40 3:4660* prio 2147483646 } <0>[ 167.303159] kworker/-89 11..... 165741139us : i915_request_retire.part.0: 0000:00:02.0 rcs0:
fence c90:2, current 2 <0>[ 167.303208] kworker/-89 11..... 165741148us : __intel_context_do_unpin: 0000:00:02.0 rcs0:
context:c90 unpin <0>[ 167.303272] kworker/-89 11..... 165741159us : i915_request_retire.part.0: 0000:00:02.0 rcs0:
fence 1217:2, current 2 <0>[ 167.303321] kworker/-89 11..... 165741166us : __intel_context_do_unpin: 0000:00:02.0 rcs0:
context:1217 unpin <0>[ 167.303384] kworker/-89 11..... 165741170us : i915_request_retire.part.0: 0000:00:02.0 rcs0:
fence 3:4660, current 4660 <0>[ 167.303434] kworker/-89 11d..1. 165741172us : __intel_context_retire: 0000:00:02.0 rcs0:
context:1216 retire runtime: { total:56028ns, avg:56028ns } <0>[ 167.303484] kworker/-89 11..... 165741198us : __engine_park: 0000:00:02.0 rcs0: parked <0>[ 167.303534] <idle>-0 5d.H3. 165741207us : execlists_irq_handler: 0000:00:02.0 rcs0:
semaphore yield: 00000040 <0>[ 167.303583] kworker/-89 11..... 165741397us : __intel_context_retire: 0000:00:02.0 rcs0:
context:1217 retire runtime: { total:325575ns, avg:0ns } <0>[ 167.303756] kworker/-89 11..... 165741777us : __intel_context_retire: 0000:00:02.0 rcs0:
context:c90 retire runtime: { total:0ns, avg:0ns } <0>[ 167.303806] kworker/-89 11..... 165742017us : __engine_park: __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1)) <0>[ 167.303811] --------------------------------- <4>[ 167.304722] ------------[ cut here ]------------ <2>[ 167.304725] kernel BUG at drivers/gpu/drm/i915/gt/intel_engine_pm.c:283! <4>[ 167.304731] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI <4>[ 167.304734] CPU: 11 PID: 89 Comm: kworker/11:1 Tainted: G W 6.8.0-rc2-CI_DRM_14193-gc655e0fd2804+ #1 <4>[ 167.304736] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022 <4>[ 167.304738] Workqueue: i915-unordered retire_work_handler [i915] <4>[ 16
---truncated---

CVE-2024-26936:
In the Linux kernel, the following vulnerability has been resolved:

ksmbd: validate request buffer size in smb2_allocate_rsp_buf()

The response buffer should be allocated in smb2_allocate_rsp_buf before validating request. But the fields in payload as well as smb2 header is used in smb2_allocate_rsp_buf(). This patch add simple buffer size validation to avoid potencial out-of-bounds in request buffer.

CVE-2024-26935:
In the Linux kernel, the following vulnerability has been resolved:

scsi: core: Fix unremoved procfs host directory regression

Commit fc663711b944 (scsi: core: Remove the /proc/scsi/${proc_name} directory earlier) fixed a bug related to modules loading/unloading, by adding a call to scsi_proc_hostdir_rm() on scsi_remove_host(). But that led to a potential duplicate call to the hostdir_rm() routine, since it's also called from scsi_host_dev_release(). That triggered a regression report, which was then fixed by commit be03df3d4bfe (scsi: core: Fix a procfs host directory removal regression). The fix just dropped the hostdir_rm() call from dev_release().

But it happens that this proc directory is created on scsi_host_alloc(), and that function pairs with scsi_host_dev_release(), while scsi_remove_host() pairs with scsi_add_host(). In other words, it seems the reason for removing the proc directory on dev_release() was meant to cover cases in which a SCSI host structure was allocated, but the call to scsi_add_host() didn't happen. And that pattern happens to exist in some error paths, for example.

Syzkaller causes that by using USB raw gadget device, error'ing on usb-storage driver, at usb_stor_probe2(). By checking that path, we can see that the BadDevice label leads to a scsi_host_put() after a SCSI host allocation, but there's no call to scsi_add_host() in such path. That leads to messages like this in dmesg (and a leak of the SCSI host proc structure):

usb-storage 4-1:87.51: USB Mass Storage device detected proc_dir_entry 'scsi/usb-storage' already registered WARNING: CPU: 1 PID: 3519 at fs/proc/generic.c:377 proc_register+0x347/0x4e0 fs/proc/generic.c:376

The proper fix seems to still call scsi_proc_hostdir_rm() on dev_release(), but guard that with the state check for SHOST_CREATED; there is even a comment in scsi_host_dev_release() detailing that: such conditional is meant for cases where the SCSI host was allocated but there was no calls to {add,remove}_host(), like the usb-storage case.

This is what we propose here and with that, the error path of usb-storage does not trigger the warning anymore.

CVE-2024-26934:
In the Linux kernel, the following vulnerability has been resolved:

USB: core: Fix deadlock in usb_deauthorize_interface()

Among the attribute file callback routines in drivers/usb/core/sysfs.c, the interface_authorized_store() function is the only one which acquires a device lock on an ancestor device: It calls usb_deauthorize_interface(), which locks the interface's parent USB device.

The will lead to deadlock if another process already owns that lock and tries to remove the interface, whether through a configuration change or because the device has been disconnected. As part of the removal procedure, device_del() waits for all ongoing sysfs attribute callbacks to complete. But usb_deauthorize_interface() can't complete until the device lock has been released, and the lock won't be released until the removal has finished.

The mechanism provided by sysfs to prevent this kind of deadlock is to use the sysfs_break_active_protection() function, which tells sysfs not to wait for the attribute callback.

Reported-and-tested by: Yue Sun <[email protected]> Reported by: xingwei lee <[email protected]>

CVE-2024-26933:
In the Linux kernel, the following vulnerability has been resolved:

USB: core: Fix deadlock in port disable sysfs attribute

The show and store callback routines for the disable sysfs attribute file in port.c acquire the device lock for the port's parent hub device. This can cause problems if another process has locked the hub to remove it or change its configuration:

Removing the hub or changing its configuration requires the hub interface to be removed, which requires the port device to be removed, and device_del() waits until all outstanding sysfs attribute callbacks for the ports have returned. The lock can't be released until then.

But the disable_show() or disable_store() routine can't return until after it has acquired the lock.

The resulting deadlock can be avoided by calling sysfs_break_active_protection(). This will cause the sysfs core not to wait for the attribute's callback routine to return, allowing the removal to proceed. The disadvantage is that after making this call, there is no guarantee that the hub structure won't be deallocated at any moment. To prevent this, we have to acquire a reference to it first by calling hub_get().

CVE-2024-26931:
In the Linux kernel, the following vulnerability has been resolved:

scsi: qla2xxx: Fix command flush on cable pull

System crash due to command failed to flush back to SCSI layer.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI CPU: 27 PID: 793455 Comm: kworker/u130:6 Kdump: loaded Tainted: G OE --------- - - 4.18.0-372.9.1.el8.x86_64 #1 Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 09/03/2021 Workqueue: nvme-wq nvme_fc_connect_ctrl_work [nvme_fc] RIP: 0010:__wake_up_common+0x4c/0x190 Code: 24 10 4d 85 c9 74 0a 41 f6 01 04 0f 85 9d 00 00 00 48 8b 43 08 48 83 c3 08 4c 8d 48 e8 49 8d 41 18 48 39 c3 0f 84 f0 00 00 00 <49> 8b 41 18 89 54 24 08 31 ed 4c 8d 70 e8 45 8b 29 41 f6 c5 04 75 RSP: 0018:ffff95f3e0cb7cd0 EFLAGS: 00010086 RAX: 0000000000000000 RBX: ffff8b08d3b26328 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff8b08d3b26320 RBP: 0000000000000001 R08: 0000000000000000 R09: ffffffffffffffe8 R10: 0000000000000000 R11: ffff95f3e0cb7a60 R12: ffff95f3e0cb7d20 R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff8b2fdf6c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000002f1e410002 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace:
__wake_up_common_lock+0x7c/0xc0 qla_nvme_ls_req+0x355/0x4c0 [qla2xxx] qla2xxx [0000:12:00.1]-f084:3: qlt_free_session_done: se_sess 0000000000000000 / sess ffff8ae1407ca000 from port 21:32:00:02:ac:07:ee:b8 loop_id 0x02 s_id 01:02:00 logout 1 keep 0 els_logo 0 ? __nvme_fc_send_ls_req+0x260/0x380 [nvme_fc] qla2xxx [0000:12:00.1]-207d:3: FCPort 21:32:00:02:ac:07:ee:b8 state transitioned from ONLINE to LOST - portid=010200.
? nvme_fc_send_ls_req.constprop.42+0x1a/0x45 [nvme_fc] qla2xxx [0000:12:00.1]-2109:3: qla2x00_schedule_rport_del 21320002ac07eeb8. rport ffff8ae598122000 roles 1 ? nvme_fc_connect_ctrl_work.cold.63+0x1e3/0xa7d [nvme_fc] qla2xxx [0000:12:00.1]-f084:3: qlt_free_session_done: se_sess 0000000000000000 / sess ffff8ae14801e000 from port 21:32:01:02:ad:f7:ee:b8 loop_id 0x04 s_id 01:02:01 logout 1 keep 0 els_logo 0 ? __switch_to+0x10c/0x450 ? process_one_work+0x1a7/0x360 qla2xxx [0000:12:00.1]-207d:3: FCPort 21:32:01:02:ad:f7:ee:b8 state transitioned from ONLINE to LOST - portid=010201.
? worker_thread+0x1ce/0x390 ? create_worker+0x1a0/0x1a0 qla2xxx [0000:12:00.1]-2109:3: qla2x00_schedule_rport_del 21320102adf7eeb8. rport ffff8ae3b2312800 roles 70 ? kthread+0x10a/0x120 qla2xxx [0000:12:00.1]-2112:3: qla_nvme_unregister_remote_port: unregister remoteport on ffff8ae14801e000 21320102adf7eeb8 ? set_kthread_struct+0x40/0x40 qla2xxx [0000:12:00.1]-2110:3: remoteport_delete of ffff8ae14801e000 21320102adf7eeb8 completed.
? ret_from_fork+0x1f/0x40 qla2xxx [0000:12:00.1]-f086:3: qlt_free_session_done: waiting for sess ffff8ae14801e000 logout

The system was under memory stress where driver was not able to allocate an SRB to carry out error recovery of cable pull. The failure to flush causes upper layer to start modifying scsi_cmnd. When the system frees up some memory, the subsequent cable pull trigger another command flush. At this point the driver access a null pointer when attempting to DMA unmap the SGL.

Add a check to make sure commands are flush back on session tear down to prevent the null pointer access.

CVE-2024-26930:
In the Linux kernel, the following vulnerability has been resolved:

scsi: qla2xxx: Fix double free of the ha->vp_map pointer

Coverity scan reported potential risk of double free of the pointer ha->vp_map. ha->vp_map was freed in qla2x00_mem_alloc(), and again freed in function qla2x00_mem_free(ha).

Assign NULL to vp_map and kfree take care of NULL.

CVE-2024-26929:
In the Linux kernel, the following vulnerability has been resolved:

scsi: qla2xxx: Fix double free of fcport

The server was crashing after LOGO because fcport was getting freed twice.

-----------[ cut here ]----------- kernel BUG at mm/slub.c:371! invalid opcode: 0000 1 SMP PTI CPU: 35 PID: 4610 Comm: bash Kdump: loaded Tainted: G OE --------- - - 4.18.0-425.3.1.el8.x86_64 #1 Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 09/03/2021 RIP: 0010:set_freepointer.part.57+0x0/0x10 RSP: 0018:ffffb07107027d90 EFLAGS: 00010246 RAX: ffff9cb7e3150000 RBX: ffff9cb7e332b9c0 RCX: ffff9cb7e3150400 RDX: 0000000000001f37 RSI: 0000000000000000 RDI: ffff9cb7c0005500 RBP: fffff693448c5400 R08: 0000000080000000 R09: 0000000000000009 R10: 0000000000000000 R11: 0000000000132af0 R12: ffff9cb7c0005500 R13: ffff9cb7e3150000 R14: ffffffffc06990e0 R15: ffff9cb7ea85ea58 FS: 00007ff6b79c2740(0000) GS:ffff9cb8f7ec0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055b426b7d700 CR3: 0000000169c18002 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace:
kfree+0x238/0x250 qla2x00_els_dcmd_sp_free+0x20/0x230 [qla2xxx] ? qla24xx_els_dcmd_iocb+0x607/0x690 [qla2xxx] qla2x00_issue_logo+0x28c/0x2a0 [qla2xxx] ? qla2x00_issue_logo+0x28c/0x2a0 [qla2xxx] ? kernfs_fop_write+0x11e/0x1a0

Remove one of the free calls and add check for valid fcport. Also use function qla2x00_free_fcport() instead of kfree().

CVE-2024-26908:
In the Linux kernel, the following vulnerability has been resolved:

x86/xen: Add some null pointer checking to smp.c

kasprintf() returns a pointer to dynamically allocated memory which can be NULL upon failure. Ensure the allocation was successful by checking the pointer validity.

CVE-2024-26906:
In the Linux kernel, the following vulnerability has been resolved:

x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault()

When trying to use copy_from_kernel_nofault() to read vsyscall page through a bpf program, the following oops was reported:

BUG: unable to handle page fault for address: ffffffffff600000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 3231067 P4D 3231067 PUD 3233067 PMD 3235067 PTE 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 20390 Comm: test_progs ...... 6.7.0+ #58 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ......
RIP: 0010:copy_from_kernel_nofault+0x6f/0x110 ......
Call Trace:
<TASK> ? copy_from_kernel_nofault+0x6f/0x110 bpf_probe_read_kernel+0x1d/0x50 bpf_prog_2061065e56845f08_do_probe_read+0x51/0x8d trace_call_bpf+0xc5/0x1c0 perf_call_bpf_enter.isra.0+0x69/0xb0 perf_syscall_enter+0x13e/0x200 syscall_trace_enter+0x188/0x1c0 do_syscall_64+0xb5/0xe0 entry_SYSCALL_64_after_hwframe+0x6e/0x76 </TASK> ......
---[ end trace 0000000000000000 ]---

The oops is triggered when:

1) A bpf program uses bpf_probe_read_kernel() to read from the vsyscall page and invokes copy_from_kernel_nofault() which in turn calls
__get_user_asm().

2) Because the vsyscall page address is not readable from kernel space, a page fault exception is triggered accordingly.

3) handle_page_fault() considers the vsyscall page address as a user space address instead of a kernel space address. This results in the fix-up setup by bpf not being applied and a page_fault_oops() is invoked due to SMAP.

Considering handle_page_fault() has already considered the vsyscall page address as a userspace address, fix the problem by disallowing vsyscall page read for copy_from_kernel_nofault().

CVE-2024-26892:
In the Linux kernel, the following vulnerability has been resolved:

wifi: mt76: mt7921e: fix use-after-free in free_irq()

From commit a304e1b82808 ([PATCH] Debug shared irqs), there is a test to make sure the shared irq handler should be able to handle the unexpected event after deregistration. For this case, let's apply MT76_REMOVED flag to indicate the device was removed and do not run into the resource access anymore.

BUG: KASAN: use-after-free in mt7921_irq_handler+0xd8/0x100 [mt7921e] Read of size 8 at addr ffff88824a7d3b78 by task rmmod/11115 CPU: 28 PID: 11115 Comm: rmmod Tainted: G W L 5.17.0 #10 Hardware name: Micro-Star International Co., Ltd. MS-7D73/MPG B650I EDGE WIFI (MS-7D73), BIOS 1.81 01/05/2024 Call Trace:
<TASK> dump_stack_lvl+0x6f/0xa0 print_address_description.constprop.0+0x1f/0x190 ? mt7921_irq_handler+0xd8/0x100 [mt7921e] ? mt7921_irq_handler+0xd8/0x100 [mt7921e] kasan_report.cold+0x7f/0x11b ? mt7921_irq_handler+0xd8/0x100 [mt7921e] mt7921_irq_handler+0xd8/0x100 [mt7921e] free_irq+0x627/0xaa0 devm_free_irq+0x94/0xd0 ? devm_request_any_context_irq+0x160/0x160 ? kobject_put+0x18d/0x4a0 mt7921_pci_remove+0x153/0x190 [mt7921e] pci_device_remove+0xa2/0x1d0
__device_release_driver+0x346/0x6e0 driver_detach+0x1ef/0x2c0 bus_remove_driver+0xe7/0x2d0 ? __check_object_size+0x57/0x310 pci_unregister_driver+0x26/0x250
__do_sys_delete_module+0x307/0x510 ? free_module+0x6a0/0x6a0 ? fpregs_assert_state_consistent+0x4b/0xb0 ? rcu_read_lock_sched_held+0x10/0x70 ? syscall_enter_from_user_mode+0x20/0x70 ? trace_hardirqs_on+0x1c/0x130 do_syscall_64+0x5c/0x80 ? trace_hardirqs_on_prepare+0x72/0x160 ? do_syscall_64+0x68/0x80 ? trace_hardirqs_on_prepare+0x72/0x160 entry_SYSCALL_64_after_hwframe+0x44/0xae

CVE-2024-26889:
In the Linux kernel, the following vulnerability has been resolved:

Bluetooth: hci_core: Fix possible buffer overflow

struct hci_dev_info has a fixed size name[8] field so in the event that hdev->name is bigger than that strcpy would attempt to write past its size, so this fixes this problem by switching to use strscpy.

CVE-2024-26888:
In the Linux kernel, the following vulnerability has been resolved:

Bluetooth: msft: Fix memory leak

Fix leaking buffer allocated to send MSFT_OP_LE_MONITOR_ADVERTISEMENT.

CVE-2024-26886:
In the Linux kernel, the following vulnerability has been resolved:

Bluetooth: af_bluetooth: Fix deadlock

Attemting to do sock_lock on .recvmsg may cause a deadlock as shown bellow, so instead of using sock_sock this uses sk_receive_queue.lock on bt_sock_ioctl to avoid the UAF:

INFO: task kworker/u9:1:121 blocked for more than 30 seconds.
Not tainted 6.7.6-lemon #183 Workqueue: hci0 hci_rx_work Call Trace:
<TASK>
__schedule+0x37d/0xa00 schedule+0x32/0xe0
__lock_sock+0x68/0xa0 ? __pfx_autoremove_wake_function+0x10/0x10 lock_sock_nested+0x43/0x50 l2cap_sock_recv_cb+0x21/0xa0 l2cap_recv_frame+0x55b/0x30a0 ? psi_task_switch+0xeb/0x270 ? finish_task_switch.isra.0+0x93/0x2a0 hci_rx_work+0x33a/0x3f0 process_one_work+0x13a/0x2f0 worker_thread+0x2f0/0x410 ? __pfx_worker_thread+0x10/0x10 kthread+0xe0/0x110 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2c/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK>

CVE-2024-26882:
In the Linux kernel, the following vulnerability has been resolved:

net: ip_tunnel: make sure to pull inner header in ip_tunnel_rcv()

Apply the same fix than ones found in :

8d975c15c0cd (ip6_tunnel: make sure to pull inner header in __ip6_tnl_rcv()) 1ca1ba465e55 (geneve: make sure to pull inner header in geneve_rx())

We have to save skb->network_header in a temporary variable in order to be able to recompute the network_header pointer after a pskb_inet_may_pull() call.

pskb_inet_may_pull() makes sure the needed headers are in skb->head.

syzbot reported:
BUG: KMSAN: uninit-value in __INET_ECN_decapsulate include/net/inet_ecn.h:253 [inline] BUG: KMSAN: uninit-value in INET_ECN_decapsulate include/net/inet_ecn.h:275 [inline] BUG: KMSAN: uninit-value in IP_ECN_decapsulate include/net/inet_ecn.h:302 [inline] BUG: KMSAN: uninit-value in ip_tunnel_rcv+0xed9/0x2ed0 net/ipv4/ip_tunnel.c:409
__INET_ECN_decapsulate include/net/inet_ecn.h:253 [inline] INET_ECN_decapsulate include/net/inet_ecn.h:275 [inline] IP_ECN_decapsulate include/net/inet_ecn.h:302 [inline] ip_tunnel_rcv+0xed9/0x2ed0 net/ipv4/ip_tunnel.c:409
__ipgre_rcv+0x9bc/0xbc0 net/ipv4/ip_gre.c:389 ipgre_rcv net/ipv4/ip_gre.c:411 [inline] gre_rcv+0x423/0x19f0 net/ipv4/ip_gre.c:447 gre_rcv+0x2a4/0x390 net/ipv4/gre_demux.c:163 ip_protocol_deliver_rcu+0x264/0x1300 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x2b8/0x440 net/ipv4/ip_input.c:233 NF_HOOK include/linux/netfilter.h:314 [inline] ip_local_deliver+0x21f/0x490 net/ipv4/ip_input.c:254 dst_input include/net/dst.h:461 [inline] ip_rcv_finish net/ipv4/ip_input.c:449 [inline] NF_HOOK include/linux/netfilter.h:314 [inline] ip_rcv+0x46f/0x760 net/ipv4/ip_input.c:569
__netif_receive_skb_one_core net/core/dev.c:5534 [inline]
__netif_receive_skb+0x1a6/0x5a0 net/core/dev.c:5648 netif_receive_skb_internal net/core/dev.c:5734 [inline] netif_receive_skb+0x58/0x660 net/core/dev.c:5793 tun_rx_batched+0x3ee/0x980 drivers/net/tun.c:1556 tun_get_user+0x53b9/0x66e0 drivers/net/tun.c:2009 tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2055 call_write_iter include/linux/fs.h:2087 [inline] new_sync_write fs/read_write.c:497 [inline] vfs_write+0xb6b/0x1520 fs/read_write.c:590 ksys_write+0x20f/0x4c0 fs/read_write.c:643
__do_sys_write fs/read_write.c:655 [inline]
__se_sys_write fs/read_write.c:652 [inline]
__x64_sys_write+0x93/0xd0 fs/read_write.c:652 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b

Uninit was created at:
__alloc_pages+0x9a6/0xe00 mm/page_alloc.c:4590 alloc_pages_mpol+0x62b/0x9d0 mm/mempolicy.c:2133 alloc_pages+0x1be/0x1e0 mm/mempolicy.c:2204 skb_page_frag_refill+0x2bf/0x7c0 net/core/sock.c:2909 tun_build_skb drivers/net/tun.c:1686 [inline] tun_get_user+0xe0a/0x66e0 drivers/net/tun.c:1826 tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2055 call_write_iter include/linux/fs.h:2087 [inline] new_sync_write fs/read_write.c:497 [inline] vfs_write+0xb6b/0x1520 fs/read_write.c:590 ksys_write+0x20f/0x4c0 fs/read_write.c:643
__do_sys_write fs/read_write.c:655 [inline]
__se_sys_write fs/read_write.c:652 [inline]
__x64_sys_write+0x93/0xd0 fs/read_write.c:652 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b

CVE-2024-26881:
In the Linux kernel, the following vulnerability has been resolved:

net: hns3: fix kernel crash when 1588 is received on HIP08 devices

The HIP08 devices does not register the ptp devices, so the hdev->ptp is NULL, but the hardware can receive 1588 messages, and set the HNS3_RXD_TS_VLD_B bit, so, if match this case, the access of hdev->ptp->flags will cause a kernel crash:

[ 5888.946472] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018 [ 5888.946475] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018 ...
[ 5889.266118] pc : hclge_ptp_get_rx_hwts+0x40/0x170 [hclge] [ 5889.272612] lr : hclge_ptp_get_rx_hwts+0x34/0x170 [hclge] [ 5889.279101] sp : ffff800012c3bc50 [ 5889.283516] x29: ffff800012c3bc50 x28: ffff2040002be040 [ 5889.289927] x27: ffff800009116484 x26: 0000000080007500 [ 5889.296333] x25: 0000000000000000 x24: ffff204001c6f000 [ 5889.302738] x23: ffff204144f53c00 x22: 0000000000000000 [ 5889.309134] x21: 0000000000000000 x20: ffff204004220080 [ 5889.315520] x19: ffff204144f53c00 x18: 0000000000000000 [ 5889.321897] x17: 0000000000000000 x16: 0000000000000000 [ 5889.328263] x15: 0000004000140ec8 x14: 0000000000000000 [ 5889.334617] x13: 0000000000000000 x12: 00000000010011df [ 5889.340965] x11: bbfeff4d22000000 x10: 0000000000000000 [ 5889.347303] x9 : ffff800009402124 x8 : 0200f78811dfbb4d [ 5889.353637] x7 : 2200000000191b01 x6 : ffff208002a7d480 [ 5889.359959] x5 : 0000000000000000 x4 : 0000000000000000 [ 5889.366271] x3 : 0000000000000000 x2 : 0000000000000000 [ 5889.372567] x1 : 0000000000000000 x0 : ffff20400095c080 [ 5889.378857] Call trace:
[ 5889.382285] hclge_ptp_get_rx_hwts+0x40/0x170 [hclge] [ 5889.388304] hns3_handle_bdinfo+0x324/0x410 [hns3] [ 5889.394055] hns3_handle_rx_bd+0x60/0x150 [hns3] [ 5889.399624] hns3_clean_rx_ring+0x84/0x170 [hns3] [ 5889.405270] hns3_nic_common_poll+0xa8/0x220 [hns3] [ 5889.411084] napi_poll+0xcc/0x264 [ 5889.415329] net_rx_action+0xd4/0x21c [ 5889.419911] __do_softirq+0x130/0x358 [ 5889.424484] irq_exit+0x134/0x154 [ 5889.428700] __handle_domain_irq+0x88/0xf0 [ 5889.433684] gic_handle_irq+0x78/0x2c0 [ 5889.438319] el1_irq+0xb8/0x140 [ 5889.442354] arch_cpu_idle+0x18/0x40 [ 5889.446816] default_idle_call+0x5c/0x1c0 [ 5889.451714] cpuidle_idle_call+0x174/0x1b0 [ 5889.456692] do_idle+0xc8/0x160 [ 5889.460717] cpu_startup_entry+0x30/0xfc [ 5889.465523] secondary_start_kernel+0x158/0x1ec [ 5889.470936] Code: 97ffab78 f9411c14 91408294 f9457284 (f9400c80) [ 5889.477950] SMP: stopping secondary CPUs [ 5890.514626] SMP: failed to stop secondary CPUs 0-69,71-95 [ 5890.522951] Starting crashdump kernel...

CVE-2024-26849:
In the Linux kernel, the following vulnerability has been resolved:

netlink: add nla be16/32 types to minlen array

BUG: KMSAN: uninit-value in nla_validate_range_unsigned lib/nlattr.c:222 [inline] BUG: KMSAN: uninit-value in nla_validate_int_range lib/nlattr.c:336 [inline] BUG: KMSAN: uninit-value in validate_nla lib/nlattr.c:575 [inline] BUG: KMSAN: uninit-value in __nla_validate_parse+0x2e20/0x45c0 lib/nlattr.c:631 nla_validate_range_unsigned lib/nlattr.c:222 [inline] nla_validate_int_range lib/nlattr.c:336 [inline] validate_nla lib/nlattr.c:575 [inline] ...

The message in question matches this policy:

[NFTA_TARGET_REV] = NLA_POLICY_MAX(NLA_BE32, 255),

but because NLA_BE32 size in minlen array is 0, the validation code will read past the malformed (too small) attribute.

Note: Other attributes, e.g. BITFIELD32, SINT, UINT.. are also missing:
those likely should be added too.

CVE-2024-26842:
In the Linux kernel, the following vulnerability has been resolved:

scsi: ufs: core: Fix shift issue in ufshcd_clear_cmd()

When task_tag >= 32 (in MCQ mode) and sizeof(unsigned int) == 4, 1U << task_tag will out of bounds for a u32 mask. Fix this up to prevent SHIFT_ISSUE (bitwise shifts that are out of bounds for their data type).

[name:debug_monitors&]Unexpected kernel BRK exception at EL1 [name:traps&]Internal error: BRK handler: 00000000f2005514 [#1] PREEMPT SMP [name:mediatek_cpufreq_hw&]cpufreq stop DVFS log done [name:mrdump&]Kernel Offset: 0x1ba5800000 from 0xffffffc008000000 [name:mrdump&]PHYS_OFFSET: 0x80000000 [name:mrdump&]pstate: 22400005 (nzCv daif +PAN -UAO) [name:mrdump&]pc : [0xffffffdbaf52bb2c] ufshcd_clear_cmd+0x280/0x288 [name:mrdump&]lr : [0xffffffdbaf52a774] ufshcd_wait_for_dev_cmd+0x3e4/0x82c [name:mrdump&]sp : ffffffc0081471b0 <snip> Workqueue: ufs_eh_wq_0 ufshcd_err_handler Call trace:
dump_backtrace+0xf8/0x144 show_stack+0x18/0x24 dump_stack_lvl+0x78/0x9c dump_stack+0x18/0x44 mrdump_common_die+0x254/0x480 [mrdump] ipanic_die+0x20/0x30 [mrdump] notify_die+0x15c/0x204 die+0x10c/0x5f8 arm64_notify_die+0x74/0x13c do_debug_exception+0x164/0x26c el1_dbg+0x64/0x80 el1h_64_sync_handler+0x3c/0x90 el1h_64_sync+0x68/0x6c ufshcd_clear_cmd+0x280/0x288 ufshcd_wait_for_dev_cmd+0x3e4/0x82c ufshcd_exec_dev_cmd+0x5bc/0x9ac ufshcd_verify_dev_init+0x84/0x1c8 ufshcd_probe_hba+0x724/0x1ce0 ufshcd_host_reset_and_restore+0x260/0x574 ufshcd_reset_and_restore+0x138/0xbd0 ufshcd_err_handler+0x1218/0x2f28 process_one_work+0x5fc/0x1140 worker_thread+0x7d8/0xe20 kthread+0x25c/0x468 ret_from_fork+0x10/0x20

CVE-2024-26841:
In the Linux kernel, the following vulnerability has been resolved:

LoongArch: Update cpu_sibling_map when disabling nonboot CPUs

Update cpu_sibling_map when disabling nonboot CPUs by defining & calling clear_cpu_sibling_map(), otherwise we get such errors on SMT systems:

jump label: negative count! WARNING: CPU: 6 PID: 45 at kernel/jump_label.c:263 __static_key_slow_dec_cpuslocked+0xec/0x100 CPU: 6 PID: 45 Comm: cpuhp/6 Not tainted 6.8.0-rc5+ #1340 pc 90000000004c302c ra 90000000004c302c tp 90000001005bc000 sp 90000001005bfd20 a0 000000000000001b a1 900000000224c278 a2 90000001005bfb58 a3 900000000224c280 a4 900000000224c278 a5 90000001005bfb50 a6 0000000000000001 a7 0000000000000001 t0 ce87a4763eb5234a t1 ce87a4763eb5234a t2 0000000000000000 t3 0000000000000000 t4 0000000000000006 t5 0000000000000000 t6 0000000000000064 t7 0000000000001964 t8 000000000009ebf6 u0 9000000001f2a068 s9 0000000000000000 s0 900000000246a2d8 s1 ffffffffffffffff s2 ffffffffffffffff s3 90000000021518c0 s4 0000000000000040 s5 9000000002151058 s6 9000000009828e40 s7 00000000000000b4 s8 0000000000000006 ra: 90000000004c302c __static_key_slow_dec_cpuslocked+0xec/0x100 ERA: 90000000004c302c __static_key_slow_dec_cpuslocked+0xec/0x100 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) PRMD: 00000004 (PPLV0 +PIE -PWE) EUEN: 00000000 (-FPE -SXE -ASXE -BTE) ECFG: 00071c1c (LIE=2-4,10-12 VS=7) ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0) PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV) CPU: 6 PID: 45 Comm: cpuhp/6 Not tainted 6.8.0-rc5+ #1340 Stack : 0000000000000000 900000000203f258 900000000179afc8 90000001005bc000 90000001005bf980 0000000000000000 90000001005bf988 9000000001fe0be0 900000000224c280 900000000224c278 90000001005bf8c0 0000000000000001 0000000000000001 ce87a4763eb5234a 0000000007f38000 90000001003f8cc0 0000000000000000 0000000000000006 0000000000000000 4c206e6f73676e6f 6f4c203a656d616e 000000000009ec99 0000000007f38000 0000000000000000 900000000214b000 9000000001fe0be0 0000000000000004 0000000000000000 0000000000000107 0000000000000009 ffffffffffafdabe 00000000000000b4 0000000000000006 90000000004c302c 9000000000224528 00005555939a0c7c 00000000000000b0 0000000000000004 0000000000000000 0000000000071c1c ...
Call Trace:
[<9000000000224528>] show_stack+0x48/0x1a0 [<900000000179afc8>] dump_stack_lvl+0x78/0xa0 [<9000000000263ed0>] __warn+0x90/0x1a0 [<90000000017419b8>] report_bug+0x1b8/0x280 [<900000000179c564>] do_bp+0x264/0x420 [<90000000004c302c>] __static_key_slow_dec_cpuslocked+0xec/0x100 [<90000000002b4d7c>] sched_cpu_deactivate+0x2fc/0x300 [<9000000000266498>] cpuhp_invoke_callback+0x178/0x8a0 [<9000000000267f70>] cpuhp_thread_fun+0xf0/0x240 [<90000000002a117c>] smpboot_thread_fn+0x1dc/0x2e0 [<900000000029a720>] kthread+0x140/0x160 [<9000000000222288>] ret_from_kernel_thread+0xc/0xa4

CVE-2024-26839:
In the Linux kernel, the following vulnerability has been resolved:

IB/hfi1: Fix a memleak in init_credit_return

When dma_alloc_coherent fails to allocate dd->cr_base[i].va, init_credit_return should deallocate dd->cr_base and dd->cr_base[i] that allocated before. Or those resources would be never freed and a memleak is triggered.

CVE-2024-26838:
In the Linux kernel, the following vulnerability has been resolved:

RDMA/irdma: Fix KASAN issue with tasklet

KASAN testing revealed the following issue assocated with freeing an IRQ.

[50006.466686] Call Trace:
[50006.466691] <IRQ> [50006.489538] dump_stack+0x5c/0x80 [50006.493475] print_address_description.constprop.6+0x1a/0x150 [50006.499872] ? irdma_sc_process_ceq+0x483/0x790 [irdma] [50006.505742] ? irdma_sc_process_ceq+0x483/0x790 [irdma] [50006.511644] kasan_report.cold.11+0x7f/0x118 [50006.516572] ? irdma_sc_process_ceq+0x483/0x790 [irdma] [50006.522473] irdma_sc_process_ceq+0x483/0x790 [irdma] [50006.528232] irdma_process_ceq+0xb2/0x400 [irdma] [50006.533601] ? irdma_hw_flush_wqes_callback+0x370/0x370 [irdma] [50006.540298] irdma_ceq_dpc+0x44/0x100 [irdma] [50006.545306] tasklet_action_common.isra.14+0x148/0x2c0 [50006.551096] __do_softirq+0x1d0/0xaf8 [50006.555396] irq_exit_rcu+0x219/0x260 [50006.559670] irq_exit+0xa/0x20 [50006.563320] smp_apic_timer_interrupt+0x1bf/0x690 [50006.568645] apic_timer_interrupt+0xf/0x20 [50006.573341] </IRQ>

The issue is that a tasklet could be pending on another core racing the delete of the irq.

Fix by insuring any scheduled tasklet is killed after deleting the irq.

CVE-2024-26833:
In the Linux kernel, the following vulnerability has been resolved:

drm/amd/display: Fix memory leak in dm_sw_fini()

After destroying dmub_srv, the memory associated with it is not freed, causing a memory leak:

unreferenced object 0xffff896302b45800 (size 1024):
comm (udev-worker), pid 222, jiffies 4294894636 hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 6265fd77):
[<ffffffff993495ed>] kmalloc_trace+0x29d/0x340 [<ffffffffc0ea4a94>] dm_dmub_sw_init+0xb4/0x450 [amdgpu] [<ffffffffc0ea4e55>] dm_sw_init+0x15/0x2b0 [amdgpu] [<ffffffffc0ba8557>] amdgpu_device_init+0x1417/0x24e0 [amdgpu] [<ffffffffc0bab285>] amdgpu_driver_load_kms+0x15/0x190 [amdgpu] [<ffffffffc0ba09c7>] amdgpu_pci_probe+0x187/0x4e0 [amdgpu] [<ffffffff9968fd1e>] local_pci_probe+0x3e/0x90 [<ffffffff996918a3>] pci_device_probe+0xc3/0x230 [<ffffffff99805872>] really_probe+0xe2/0x480 [<ffffffff99805c98>] __driver_probe_device+0x78/0x160 [<ffffffff99805daf>] driver_probe_device+0x1f/0x90 [<ffffffff9980601e>] __driver_attach+0xce/0x1c0 [<ffffffff99803170>] bus_for_each_dev+0x70/0xc0 [<ffffffff99804822>] bus_add_driver+0x112/0x210 [<ffffffff99807245>] driver_register+0x55/0x100 [<ffffffff990012d1>] do_one_initcall+0x41/0x300

Fix this by freeing dmub_srv after destroying it.

CVE-2024-26829:
In the Linux kernel, the following vulnerability has been resolved:

media: ir_toy: fix a memleak in irtoy_tx

When irtoy_command fails, buf should be freed since it is allocated by irtoy_tx, or there is a memleak.

CVE-2024-26823:
In the Linux kernel, the following vulnerability has been resolved:

irqchip/gic-v3-its: Restore quirk probing for ACPI-based systems

While refactoring the way the ITSs are probed, the handling of quirks applicable to ACPI-based platforms was lost. As a result, systems such as HIP07 lose their GICv4 functionnality, and some other may even fail to boot, unless they are configured to boot with DT.

Move the enabling of quirks into its_probe_one(), making it common to all firmware implementations.

CVE-2024-26818:
In the Linux kernel, the following vulnerability has been resolved:

tools/rtla: Fix clang warning about mount_point var size

clang is reporting this warning:

$ make HOSTCC=clang CC=clang LLVM_IAS=1 [...] clang -O -g -DVERSION=\6.8.0-rc3\ -flto=auto -fexceptions
-fstack-protector-strong -fasynchronous-unwind-tables
-fstack-clash-protection -Wall -Werror=format-security
-Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS $(pkg-config --cflags libtracefs) -c -o src/utils.o src/utils.c

src/utils.c:548:66: warning: 'fscanf' may overflow; destination buffer in argument 3 has size 1024, but the corresponding specifier may require size 1025 [-Wfortify-source] 548 | while (fscanf(fp, %*s % STR(MAX_PATH) s %99s %*s %*d %*d\n, mount_point, type) == 2) { | ^

Increase mount_point variable size to MAX_PATH+1 to avoid the overflow.

CVE-2024-26816:
In the Linux kernel, the following vulnerability has been resolved:

x86, relocs: Ignore relocations in .notes section

When building with CONFIG_XEN_PV=y, .text symbols are emitted into the .notes section so that Xen can find the startup_xen entry point.
This information is used prior to booting the kernel, so relocations are not useful. In fact, performing relocations against the .notes section means that the KASLR base is exposed since /sys/kernel/notes is world-readable.

To avoid leaking the KASLR base without breaking unprivileged tools that are expecting to read /sys/kernel/notes, skip performing relocations in the .notes section. The values readable in .notes are then identical to those found in System.map.

CVE-2024-26814:
In the Linux kernel, the following vulnerability has been resolved:

vfio/fsl-mc: Block calling interrupt handler without trigger

The eventfd_ctx trigger pointer of the vfio_fsl_mc_irq object is initially NULL and may become NULL if the user sets the trigger eventfd to -1. The interrupt handler itself is guaranteed that trigger is always valid between request_irq() and free_irq(), but the loopback testing mechanisms to invoke the handler function need to test the trigger. The triggering and setting ioctl paths both make use of igate and are therefore mutually exclusive.

The vfio-fsl-mc driver does not make use of irqfds, nor does it support any sort of masking operations, therefore unlike vfio-pci and vfio-platform, the flow can remain essentially unchanged.

CVE-2024-26812:
In the Linux kernel, the following vulnerability has been resolved:

vfio/pci: Create persistent INTx handler

A vulnerability exists where the eventfd for INTx signaling can be deconfigured, which unregisters the IRQ handler but still allows eventfds to be signaled with a NULL context through the SET_IRQS ioctl or through unmask irqfd if the device interrupt is pending.

Ideally this could be solved with some additional locking; the igate mutex serializes the ioctl and config space accesses, and the interrupt handler is unregistered relative to the trigger, but the irqfd path runs asynchronous to those. The igate mutex cannot be acquired from the atomic context of the eventfd wake function. Disabling the irqfd relative to the eventfd registration is potentially incompatible with existing userspace.

As a result, the solution implemented here moves configuration of the INTx interrupt handler to track the lifetime of the INTx context object and irq_type configuration, rather than registration of a particular trigger eventfd. Synchronization is added between the ioctl path and eventfd_signal() wrapper such that the eventfd trigger can be dynamically updated relative to in-flight interrupts or irqfd callbacks.

CVE-2024-26810:
In the Linux kernel, the following vulnerability has been resolved:

vfio/pci: Lock external INTx masking ops

Mask operations through config space changes to DisINTx may race INTx configuration changes via ioctl. Create wrappers that add locking for paths outside of the core interrupt code.

In particular, irq_type is updated holding igate, therefore testing is_intx() requires holding igate. For example clearing DisINTx from config space can otherwise race changes of the interrupt configuration.

This aligns interfaces which may trigger the INTx eventfd into two camps, one side serialized by igate and the other only enabled while INTx is configured. A subsequent patch introduces synchronization for the latter flows.

CVE-2024-26799:
In the Linux kernel, the following vulnerability has been resolved:

ASoC: qcom: Fix uninitialized pointer dmactl

In the case where __lpass_get_dmactl_handle is called and the driver id dai_id is invalid the pointer dmactl is not being assigned a value, and dmactl contains a garbage value since it has not been initialized and so the null check may not work. Fix this to initialize dmactl to NULL. One could argue that modern compilers will set this to zero, but it is useful to keep this initialized as per the same way in functions
__lpass_platform_codec_intf_init and lpass_cdc_dma_daiops_hw_params.

Cleans up clang scan build warning:
sound/soc/qcom/lpass-cdc-dma.c:275:7: warning: Branch condition evaluates to a garbage value [core.uninitialized.Branch]

CVE-2024-26795:
In the Linux kernel, the following vulnerability has been resolved:

riscv: Sparse-Memory/vmemmap out-of-bounds fix

Offset vmemmap so that the first page of vmemmap will be mapped to the first page of physical memory in order to ensure that vmemmap's bounds will be respected during pfn_to_page()/page_to_pfn() operations.
The conversion macros will produce correct SV39/48/57 addresses for every possible/valid DRAM_BASE inside the physical memory limits.

v2:Address Alex's comments

CVE-2024-26789:
In the Linux kernel, the following vulnerability has been resolved:

crypto: arm64/neonbs - fix out-of-bounds access on short input

The bit-sliced implementation of AES-CTR operates on blocks of 128 bytes, and will fall back to the plain NEON version for tail blocks or inputs that are shorter than 128 bytes to begin with.

It will call straight into the plain NEON asm helper, which performs all memory accesses in granules of 16 bytes (the size of a NEON register).
For this reason, the associated plain NEON glue code will copy inputs shorter than 16 bytes into a temporary buffer, given that this is a rare occurrence and it is not worth the effort to work around this in the asm code.

The fallback from the bit-sliced NEON version fails to take this into account, potentially resulting in out-of-bounds accesses. So clone the same workaround, and use a temp buffer for short in/outputs.

CVE-2024-26734:
In the Linux kernel, the following vulnerability has been resolved:

devlink: fix possible use-after-free and memory leaks in devlink_init()

The pernet operations structure for the subsystem must be registered before registering the generic netlink family.

Make an unregister in case of unsuccessful registration.

CVE-2024-26733:
In the Linux kernel, the following vulnerability has been resolved:

arp: Prevent overflow in arp_req_get().

syzkaller reported an overflown write in arp_req_get(). [0]

When ioctl(SIOCGARP) is issued, arp_req_get() looks up an neighbour entry and copies neigh->ha to struct arpreq.arp_ha.sa_data.

The arp_ha here is struct sockaddr, not struct sockaddr_storage, so the sa_data buffer is just 14 bytes.

In the splat below, 2 bytes are overflown to the next int field, arp_flags. We initialise the field just after the memcpy(), so it's not a problem.

However, when dev->addr_len is greater than 22 (e.g. MAX_ADDR_LEN), arp_netmask is overwritten, which could be set as htonl(0xFFFFFFFFUL) in arp_ioctl() before calling arp_req_get().

To avoid the overflow, let's limit the max length of memcpy().

Note that commit b5f0de6df6dc (net: dev: Convert sa_data to flexible array in struct sockaddr) just silenced syzkaller.

[0]:
memcpy: detected field-spanning write (size 16) of single field r->arp_ha.sa_data at net/ipv4/arp.c:1128 (size 14) WARNING: CPU: 0 PID: 144638 at net/ipv4/arp.c:1128 arp_req_get+0x411/0x4a0 net/ipv4/arp.c:1128 Modules linked in:
CPU: 0 PID: 144638 Comm: syz-executor.4 Not tainted 6.1.74 #31 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-5 04/01/2014 RIP: 0010:arp_req_get+0x411/0x4a0 net/ipv4/arp.c:1128 Code: fd ff ff e8 41 42 de fb b9 0e 00 00 00 4c 89 fe 48 c7 c2 20 6d ab 87 48 c7 c7 80 6d ab 87 c6 05 25 af 72 04 01 e8 5f 8d ad fb <0f> 0b e9 6c fd ff ff e8 13 42 de fb be 03 00 00 00 4c 89 e7 e8 a6 RSP: 0018:ffffc900050b7998 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff88803a815000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff8641a44a RDI: 0000000000000001 RBP: ffffc900050b7a98 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 203a7970636d656d R12: ffff888039c54000 R13: 1ffff92000a16f37 R14: ffff88803a815084 R15: 0000000000000010 FS: 00007f172bf306c0(0000) GS:ffff88805aa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f172b3569f0 CR3: 0000000057f12005 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace:
<TASK> arp_ioctl+0x33f/0x4b0 net/ipv4/arp.c:1261 inet_ioctl+0x314/0x3a0 net/ipv4/af_inet.c:981 sock_do_ioctl+0xdf/0x260 net/socket.c:1204 sock_ioctl+0x3ef/0x650 net/socket.c:1321 vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:870 [inline]
__se_sys_ioctl fs/ioctl.c:856 [inline]
__x64_sys_ioctl+0x18e/0x220 fs/ioctl.c:856 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x37/0x90 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x64/0xce RIP: 0033:0x7f172b262b8d Code: 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f172bf300b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007f172b3abf80 RCX: 00007f172b262b8d RDX: 0000000020000000 RSI: 0000000000008954 RDI: 0000000000000003 RBP: 00007f172b2d3493 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000000000b R14: 00007f172b3abf80 R15: 00007f172bf10000 </TASK>

CVE-2024-26731:
In the Linux kernel, the following vulnerability has been resolved:

bpf, sockmap: Fix NULL pointer dereference in sk_psock_verdict_data_ready()

syzbot reported the following NULL pointer dereference issue [1]:

BUG: kernel NULL pointer dereference, address: 0000000000000000 [...] RIP: 0010:0x0 [...] Call Trace:
<TASK> sk_psock_verdict_data_ready+0x232/0x340 net/core/skmsg.c:1230 unix_stream_sendmsg+0x9b4/0x1230 net/unix/af_unix.c:2293 sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667 do_syscall_64+0xf9/0x240 entry_SYSCALL_64_after_hwframe+0x6f/0x77

If sk_psock_verdict_data_ready() and sk_psock_stop_verdict() are called concurrently, psock->saved_data_ready can be NULL, causing the above issue.

This patch fixes this issue by calling the appropriate data ready function using the sk_psock_data_ready() helper and protecting it from concurrency with sk->sk_callback_lock.

CVE-2024-26723:
In the Linux kernel, the following vulnerability has been resolved:

lan966x: Fix crash when adding interface under a lag

There is a crash when adding one of the lan966x interfaces under a lag interface. The issue can be reproduced like this:
ip link add name bond0 type bond miimon 100 mode balance-xor ip link set dev eth0 master bond0

The reason is because when adding a interface under the lag it would go through all the ports and try to figure out which other ports are under that lag interface. And the issue is that lan966x can have ports that are NULL pointer as they are not probed. So then iterating over these ports it would just crash as they are NULL pointers.
The fix consists in actually checking for NULL pointers before accessing something from the ports. Like we do in other places.

CVE-2024-26722:
In the Linux kernel, the following vulnerability has been resolved:

ASoC: rt5645: Fix deadlock in rt5645_jack_detect_work()

There is a path in rt5645_jack_detect_work(), where rt5645->jd_mutex is left locked forever. That may lead to deadlock when rt5645_jack_detect_work() is called for the second time.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

CVE-2024-26708:
In the Linux kernel, the following vulnerability has been resolved:

mptcp: really cope with fastopen race

Fastopen and PM-trigger subflow shutdown can race, as reported by syzkaller.

In my first attempt to close such race, I missed the fact that the subflow status can change again before the subflow_state_change callback is invoked.

Address the issue additionally copying with all the states directly reachable from TCP_FIN_WAIT1.

CVE-2024-26707:
In the Linux kernel, the following vulnerability has been resolved:

net: hsr: remove WARN_ONCE() in send_hsr_supervision_frame()

Syzkaller reported [1] hitting a warning after failing to allocate resources for skb in hsr_init_skb(). Since a WARN_ONCE() call will not help much in this case, it might be prudent to switch to netdev_warn_once(). At the very least it will suppress syzkaller reports such as [1].

Just in case, use netdev_warn_once() in send_prp_supervision_frame() for similar reasons.

[1] HSR: Could not send supervision frame WARNING: CPU: 1 PID: 85 at net/hsr/hsr_device.c:294 send_hsr_supervision_frame+0x60a/0x810 net/hsr/hsr_device.c:294 RIP: 0010:send_hsr_supervision_frame+0x60a/0x810 net/hsr/hsr_device.c:294 ...
Call Trace:
<IRQ> hsr_announce+0x114/0x370 net/hsr/hsr_device.c:382 call_timer_fn+0x193/0x590 kernel/time/timer.c:1700 expire_timers kernel/time/timer.c:1751 [inline]
__run_timers+0x764/0xb20 kernel/time/timer.c:2022 run_timer_softirq+0x58/0xd0 kernel/time/timer.c:2035
__do_softirq+0x21a/0x8de kernel/softirq.c:553 invoke_softirq kernel/softirq.c:427 [inline]
__irq_exit_rcu kernel/softirq.c:632 [inline] irq_exit_rcu+0xb7/0x120 kernel/softirq.c:644 sysvec_apic_timer_interrupt+0x95/0xb0 arch/x86/kernel/apic/apic.c:1076 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:649 ...

This issue is also found in older kernels (at least up to 5.10).

CVE-2024-26705:
In the Linux kernel, the following vulnerability has been resolved:

parisc: BTLB: Fix crash when setting up BTLB at CPU bringup

When using hotplug and bringing up a 32-bit CPU, ask the firmware about the BTLB information to set up the static (block) TLB entries.

For that write access to the static btlb_info struct is needed, but since it is marked __ro_after_init the kernel segfaults with missing write permissions.

Fix the crash by dropping the __ro_after_init annotation.

CVE-2024-26698:
In the Linux kernel, the following vulnerability has been resolved:

hv_netvsc: Fix race condition between netvsc_probe and netvsc_remove

In commit ac5047671758 (hv_netvsc: Disable NAPI before closing the VMBus channel), napi_disable was getting called for all channels, including all subchannels without confirming if they are enabled or not.

This caused hv_netvsc getting hung at napi_disable, when netvsc_probe() has finished running but nvdev->subchan_work has not started yet.
netvsc_subchan_work() -> rndis_set_subchannel() has not created the sub-channels and because of that netvsc_sc_open() is not running.
netvsc_remove() calls cancel_work_sync(&nvdev->subchan_work), for which netvsc_subchan_work did not run.

netif_napi_add() sets the bit NAPI_STATE_SCHED because it ensures NAPI cannot be scheduled. Then netvsc_sc_open() -> napi_enable will clear the NAPIF_STATE_SCHED bit, so it can be scheduled. napi_disable() does the opposite.

Now during netvsc_device_remove(), when napi_disable is called for those subchannels, napi_disable gets stuck on infinite msleep.

This fix addresses this problem by ensuring that napi_disable() is not getting called for non-enabled NAPI struct.
But netif_napi_del() is still necessary for these non-enabled NAPI struct for cleanup purpose.

Call trace:
[ 654.559417] task:modprobe state:D stack: 0 pid: 2321 ppid: 1091 flags:0x00004002 [ 654.568030] Call Trace:
[ 654.571221] <TASK> [ 654.573790] __schedule+0x2d6/0x960 [ 654.577733] schedule+0x69/0xf0 [ 654.581214] schedule_timeout+0x87/0x140 [ 654.585463] ? __bpf_trace_tick_stop+0x20/0x20 [ 654.590291] msleep+0x2d/0x40 [ 654.593625] napi_disable+0x2b/0x80 [ 654.597437] netvsc_device_remove+0x8a/0x1f0 [hv_netvsc] [ 654.603935] rndis_filter_device_remove+0x194/0x1c0 [hv_netvsc] [ 654.611101] ? do_wait_intr+0xb0/0xb0 [ 654.615753] netvsc_remove+0x7c/0x120 [hv_netvsc] [ 654.621675] vmbus_remove+0x27/0x40 [hv_vmbus]

CVE-2024-26694:
In the Linux kernel, the following vulnerability has been resolved:

wifi: iwlwifi: fix double-free bug

The storage for the TLV PC register data wasn't done like all the other storage in the drv->fw area, which is cleared at the end of deallocation. Therefore, the freeing must also be done differently, explicitly NULL'ing it out after the free, since otherwise there's a nasty double-free bug here if a file fails to load after this has been parsed, and we get another free later (e.g. because no other file exists.) Fix that by adding the missing NULL assignment.

CVE-2024-26693:
In the Linux kernel, the following vulnerability has been resolved:

wifi: iwlwifi: mvm: fix a crash when we run out of stations

A DoS tool that injects loads of authentication frames made our AP crash. The iwl_mvm_is_dup() function couldn't find the per-queue dup_data which was not allocated.

The root cause for that is that we ran out of stations in the firmware and we didn't really add the station to the firmware, yet we didn't return an error to mac80211.
Mac80211 was thinking that we have the station and because of that, sta_info::uploaded was set to 1. This allowed ieee80211_find_sta_by_ifaddr() to return a valid station object, but that ieee80211_sta didn't have any iwl_mvm_sta object initialized and that caused the crash mentioned earlier when we got Rx on that station.

CVE-2024-26691:
In the Linux kernel, the following vulnerability has been resolved:

KVM: arm64: Fix circular locking dependency

The rule inside kvm enforces that the vcpu->mutex is taken *inside* kvm->lock. The rule is violated by the pkvm_create_hyp_vm() which acquires the kvm->lock while already holding the vcpu->mutex lock from kvm_vcpu_ioctl(). Avoid the circular locking dependency altogether by protecting the hyp vm handle with the config_lock, much like we already do for other forms of VM-scoped data.

CVE-2024-26690:
In the Linux kernel, the following vulnerability has been resolved:

net: stmmac: protect updates of 64-bit statistics counters

As explained by a comment in <linux/u64_stats_sync.h>, write side of struct u64_stats_sync must ensure mutual exclusion, or one seqcount update could be lost on 32-bit platforms, thus blocking readers forever. Such lockups have been observed in real world after stmmac_xmit() on one CPU raced with stmmac_napi_poll_tx() on another CPU.

To fix the issue without introducing a new lock, split the statics into three parts:

1. fields updated only under the tx queue lock, 2. fields updated only during NAPI poll, 3. fields updated only from interrupt context,

Updates to fields in the first two groups are already serialized through other locks. It is sufficient to split the existing struct u64_stats_sync so that each group has its own.

Note that tx_set_ic_bit is updated from both contexts. Split this counter so that each context gets its own, and calculate their sum to get the total value in stmmac_get_ethtool_stats().

For the third group, multiple interrupts may be processed by different CPUs at the same time, but interrupts on the same CPU will not nest. Move fields from this group to a newly created per-cpu struct stmmac_pcpu_stats.

CVE-2024-26681:
In the Linux kernel, the following vulnerability has been resolved:

netdevsim: avoid potential loop in nsim_dev_trap_report_work()

Many syzbot reports include the following trace [1]

If nsim_dev_trap_report_work() can not grab the mutex, it should rearm itself at least one jiffie later.

[1] Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0 CPU: 0 PID: 32383 Comm: kworker/0:2 Not tainted 6.8.0-rc2-syzkaller-00031-g861c0981648f #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023 Workqueue: events nsim_dev_trap_report_work RIP: 0010:bytes_is_nonzero mm/kasan/generic.c:89 [inline] RIP: 0010:memory_is_nonzero mm/kasan/generic.c:104 [inline] RIP: 0010:memory_is_poisoned_n mm/kasan/generic.c:129 [inline] RIP: 0010:memory_is_poisoned mm/kasan/generic.c:161 [inline] RIP: 0010:check_region_inline mm/kasan/generic.c:180 [inline] RIP: 0010:kasan_check_range+0x101/0x190 mm/kasan/generic.c:189 Code: 07 49 39 d1 75 0a 45 3a 11 b8 01 00 00 00 7c 0b 44 89 c2 e8 21 ed ff ff 83 f0 01 5b 5d 41 5c c3 48 85 d2 74 4f 48 01 ea eb 09 <48> 83 c0 01 48 39 d0 74 41 80 38 00 74 f2 eb b6 41 bc 08 00 00 00 RSP: 0018:ffffc90012dcf998 EFLAGS: 00000046 RAX: fffffbfff258af1e RBX: fffffbfff258af1f RCX: ffffffff8168eda3 RDX: fffffbfff258af1f RSI: 0000000000000004 RDI: ffffffff92c578f0 RBP: fffffbfff258af1e R08: 0000000000000000 R09: fffffbfff258af1e R10: ffffffff92c578f3 R11: ffffffff8acbcbc0 R12: 0000000000000002 R13: ffff88806db38400 R14: 1ffff920025b9f42 R15: ffffffff92c578e8 FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c00994e078 CR3: 000000002c250000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace:
<NMI> </NMI> <TASK> instrument_atomic_read include/linux/instrumented.h:68 [inline] atomic_read include/linux/atomic/atomic-instrumented.h:32 [inline] queued_spin_is_locked include/asm-generic/qspinlock.h:57 [inline] debug_spin_unlock kernel/locking/spinlock_debug.c:101 [inline] do_raw_spin_unlock+0x53/0x230 kernel/locking/spinlock_debug.c:141
__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:150 [inline]
_raw_spin_unlock_irqrestore+0x22/0x70 kernel/locking/spinlock.c:194 debug_object_activate+0x349/0x540 lib/debugobjects.c:726 debug_work_activate kernel/workqueue.c:578 [inline] insert_work+0x30/0x230 kernel/workqueue.c:1650
__queue_work+0x62e/0x11d0 kernel/workqueue.c:1802
__queue_delayed_work+0x1bf/0x270 kernel/workqueue.c:1953 queue_delayed_work_on+0x106/0x130 kernel/workqueue.c:1989 queue_delayed_work include/linux/workqueue.h:563 [inline] schedule_delayed_work include/linux/workqueue.h:677 [inline] nsim_dev_trap_report_work+0x9c0/0xc80 drivers/net/netdevsim/dev.c:842 process_one_work+0x886/0x15d0 kernel/workqueue.c:2633 process_scheduled_works kernel/workqueue.c:2706 [inline] worker_thread+0x8b9/0x1290 kernel/workqueue.c:2787 kthread+0x2c6/0x3a0 kernel/kthread.c:388 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242 </TASK>

CVE-2024-26680:
In the Linux kernel, the following vulnerability has been resolved:

net: atlantic: Fix DMA mapping for PTP hwts ring

Function aq_ring_hwts_rx_alloc() maps extra AQ_CFG_RXDS_DEF bytes for PTP HWTS ring but then generic aq_ring_free() does not take this into account.
Create and use a specific function to free HWTS ring to fix this issue.

Trace:
[ 215.351607] ------------[ cut here ]------------ [ 215.351612] DMA-API: atlantic 0000:4b:00.0: device driver frees DMA memory with different size [device address=0x00000000fbdd0000] [map size=34816 bytes] [unmap size=32768 bytes] [ 215.351635] WARNING: CPU: 33 PID: 10759 at kernel/dma/debug.c:988 check_unmap+0xa6f/0x2360 ...
[ 215.581176] Call Trace:
[ 215.583632] <TASK> [ 215.585745] ? show_trace_log_lvl+0x1c4/0x2df [ 215.590114] ? show_trace_log_lvl+0x1c4/0x2df [ 215.594497] ? debug_dma_free_coherent+0x196/0x210 [ 215.599305] ? check_unmap+0xa6f/0x2360 [ 215.603147] ? __warn+0xca/0x1d0 [ 215.606391] ? check_unmap+0xa6f/0x2360 [ 215.610237] ? report_bug+0x1ef/0x370 [ 215.613921] ? handle_bug+0x3c/0x70 [ 215.617423] ? exc_invalid_op+0x14/0x50 [ 215.621269] ? asm_exc_invalid_op+0x16/0x20 [ 215.625480] ? check_unmap+0xa6f/0x2360 [ 215.629331] ? mark_lock.part.0+0xca/0xa40 [ 215.633445] debug_dma_free_coherent+0x196/0x210 [ 215.638079] ? __pfx_debug_dma_free_coherent+0x10/0x10 [ 215.643242] ? slab_free_freelist_hook+0x11d/0x1d0 [ 215.648060] dma_free_attrs+0x6d/0x130 [ 215.651834] aq_ring_free+0x193/0x290 [atlantic] [ 215.656487] aq_ptp_ring_free+0x67/0x110 [atlantic] ...
[ 216.127540] ---[ end trace 6467e5964dd2640b ]--- [ 216.132160] DMA-API: Mapped at:
[ 216.132162] debug_dma_alloc_coherent+0x66/0x2f0 [ 216.132165] dma_alloc_attrs+0xf5/0x1b0 [ 216.132168] aq_ring_hwts_rx_alloc+0x150/0x1f0 [atlantic] [ 216.132193] aq_ptp_ring_alloc+0x1bb/0x540 [atlantic] [ 216.132213] aq_nic_init+0x4a1/0x760 [atlantic]

CVE-2024-26679:
In the Linux kernel, the following vulnerability has been resolved:

inet: read sk->sk_family once in inet_recv_error()

inet_recv_error() is called without holding the socket lock.

IPv6 socket could mutate to IPv4 with IPV6_ADDRFORM socket option and trigger a KCSAN warning.

CVE-2024-26634:
In the Linux kernel, the following vulnerability has been resolved:

net: fix removing a namespace with conflicting altnames

Mark reports a BUG() when a net namespace is removed.

kernel BUG at net/core/dev.c:11520!

Physical interfaces moved outside of init_net get refunded to init_net when that namespace disappears. The main interface name may get overwritten in the process if it would have conflicted. We need to also discard all conflicting altnames.
Recent fixes addressed ensuring that altnames get moved with the main interface, which surfaced this problem.

CVE-2024-26592:
In the Linux kernel, the following vulnerability has been resolved: ksmbd: fix UAF issue in ksmbd_tcp_new_connection() The race is between the handling of a new TCP connection and its disconnection.
It leads to UAF on `struct tcp_transport` in ksmbd_tcp_new_connection() function.

CVE-2024-25742:
In the Linux kernel before 6.9, an untrusted hypervisor can inject virtual interrupt 29 (#VC) at any point in time and can trigger its handler. This affects AMD SEV-SNP and AMD SEV-ES.

CVE-2023-52881:
In the Linux kernel, the following vulnerability has been resolved:

tcp: do not accept ACK of bytes we never sent

This patch is based on a detailed report and ideas from Yepeng Pan and Christian Rossow.

ACK seq validation is currently following RFC 5961 5.2 guidelines:

The ACK value is considered acceptable only if it is in the range of ((SND.UNA - MAX.SND.WND) <= SEG.ACK <= SND.NXT). All incoming segments whose ACK value doesn't satisfy the above condition MUST be discarded and an ACK sent back. It needs to be noted that RFC 793 on page 72 (fifth check) says: If the ACK is a duplicate (SEG.ACK < SND.UNA), it can be ignored. If the ACK acknowledges something not yet sent (SEG.ACK > SND.NXT) then send an ACK, drop the segment, and return. The ignored above implies that the processing of the incoming data segment continues, which means the ACK value is treated as acceptable. This mitigation makes the ACK check more stringent since any ACK < SND.UNA wouldn't be accepted, instead only ACKs that are in the range ((SND.UNA - MAX.SND.WND) <= SEG.ACK <= SND.NXT) get through.

This can be refined for new (and possibly spoofed) flows, by not accepting ACK for bytes that were never sent.

This greatly improves TCP security at a little cost.

I added a Fixes: tag to make sure this patch will reach stable trees, even if the 'blamed' patch was adhering to the RFC.

tp->bytes_acked was added in linux-4.2

Following packetdrill test (courtesy of Yepeng Pan) shows the issue at hand:

0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1024) = 0

// ---------------- Handshake ------------------- //

// when window scale is set to 14 the window size can be extended to // 65535 * (2^14) = 1073725440. Linux would accept an ACK packet // with ack number in (Server_ISN+1-1073725440. Server_ISN+1) // ,though this ack number acknowledges some data never // sent by the server.

+0 < S 0:0(0) win 65535 <mss 1400,nop,wscale 14> +0 > S. 0:0(0) ack 1 <...> +0 < . 1:1(0) ack 1 win 65535 +0 accept(3, ..., ...) = 4

// For the established connection, we send an ACK packet, // the ack packet uses ack number 1 - 1073725300 + 2^32, // where 2^32 is used to wrap around.
// Note: we used 1073725300 instead of 1073725440 to avoid possible // edge cases.
// 1 - 1073725300 + 2^32 = 3221241997

// Oops, old kernels happily accept this packet.
+0 < . 1:1001(1000) ack 3221241997 win 65535

// After the kernel fix the following will be replaced by a challenge ACK, // and prior malicious frame would be dropped.
+0 > . 1:1(0) ack 1001

CVE-2023-52872:
In the Linux kernel, the following vulnerability has been resolved:

tty: n_gsm: fix race condition in status line change on dead connections

gsm_cleanup_mux() cleans up the gsm by closing all DLCIs, stopping all timers, removing the virtual tty devices and clearing the data queues.
This procedure, however, may cause subsequent changes of the virtual modem status lines of a DLCI. More data is being added the outgoing data queue and the deleted kick timer is restarted to handle this. At this point many resources have already been removed by the cleanup procedure. Thus, a kernel panic occurs.

Fix this by proving in gsm_modem_update() that the cleanup procedure has not been started and the mux is still alive.

Note that writing to a virtual tty is already protected by checks against the DLCI specific connection state.

CVE-2023-52871:
In the Linux kernel, the following vulnerability has been resolved:

soc: qcom: llcc: Handle a second device without data corruption

Usually there is only one llcc device. But if there were a second, even a failed probe call would modify the global drv_data pointer. So check if drv_data is valid before overwriting it.

CVE-2023-52870:
In the Linux kernel, the following vulnerability has been resolved:

clk: mediatek: clk-mt6765: Add check for mtk_alloc_clk_data

Add the check for the return value of mtk_alloc_clk_data() in order to avoid NULL pointer dereference.

CVE-2023-52866:
In the Linux kernel, the following vulnerability has been resolved:

HID: uclogic: Fix user-memory-access bug in uclogic_params_ugee_v2_init_event_hooks()

When CONFIG_HID_UCLOGIC=y and CONFIG_KUNIT_ALL_TESTS=y, launch kernel and then the below user-memory-access bug occurs.

In hid_test_uclogic_params_cleanup_event_hooks(),it call uclogic_params_ugee_v2_init_event_hooks() with the first arg=NULL, so when it calls uclogic_params_ugee_v2_has_battery(), the hid_get_drvdata() will access hdev->dev with hdev=NULL, which will cause below user-memory-access.

So add a fake_device with quirks member and call hid_set_drvdata() to assign hdev->dev->driver_data which avoids the null-ptr-def bug for drvdata->quirks in uclogic_params_ugee_v2_has_battery(). After applying this patch, the below user-memory-access bug never occurs.

general protection fault, probably for non-canonical address 0xdffffc0000000329: 0000 [#1] PREEMPT SMP KASAN KASAN: probably user-memory-access in range [0x0000000000001948-0x000000000000194f] CPU: 5 PID: 2189 Comm: kunit_try_catch Tainted: G B W N 6.6.0-rc2+ #30 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:uclogic_params_ugee_v2_init_event_hooks+0x87/0x600 Code: f3 f3 65 48 8b 14 25 28 00 00 00 48 89 54 24 60 31 d2 48 89 fa c7 44 24 30 00 00 00 00 48 c7 44 24 28 02 f8 02 01 48 c1 ea 03 <80> 3c 02 00 0f 85 2c 04 00 00 48 8b 9d 48 19 00 00 48 b8 00 00 00 RSP: 0000:ffff88810679fc88 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000004 RCX: 0000000000000000 RDX: 0000000000000329 RSI: ffff88810679fd88 RDI: 0000000000001948 RBP: 0000000000000000 R08: 0000000000000000 R09: ffffed1020f639f0 R10: ffff888107b1cf87 R11: 0000000000000400 R12: 1ffff11020cf3f92 R13: ffff88810679fd88 R14: ffff888100b97b08 R15: ffff8881030bb080 FS: 0000000000000000(0000) GS:ffff888119e80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000005286001 CR4: 0000000000770ee0 DR0: ffffffff8fdd6cf4 DR1: ffffffff8fdd6cf5 DR2: ffffffff8fdd6cf6 DR3: ffffffff8fdd6cf7 DR6: 00000000fffe0ff0 DR7: 0000000000000600 PKRU: 55555554 Call Trace:
<TASK> ? die_addr+0x3d/0xa0 ? exc_general_protection+0x144/0x220 ? asm_exc_general_protection+0x22/0x30 ? uclogic_params_ugee_v2_init_event_hooks+0x87/0x600 ? sched_clock_cpu+0x69/0x550 ? uclogic_parse_ugee_v2_desc_gen_params+0x70/0x70 ? load_balance+0x2950/0x2950 ? rcu_trc_cmpxchg_need_qs+0x67/0xa0 hid_test_uclogic_params_cleanup_event_hooks+0x9e/0x1a0 ? uclogic_params_ugee_v2_init_event_hooks+0x600/0x600 ? __switch_to+0x5cf/0xe60 ? migrate_enable+0x260/0x260 ? __kthread_parkme+0x83/0x150 ? kunit_try_run_case_cleanup+0xe0/0xe0 kunit_generic_run_threadfn_adapter+0x4a/0x90 ? kunit_try_catch_throw+0x80/0x80 kthread+0x2b5/0x380 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x2d/0x70 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork_asm+0x11/0x20 </TASK> Modules linked in:
Dumping ftrace buffer:
(ftrace buffer empty)
---[ end trace 0000000000000000 ]--- RIP: 0010:uclogic_params_ugee_v2_init_event_hooks+0x87/0x600 Code: f3 f3 65 48 8b 14 25 28 00 00 00 48 89 54 24 60 31 d2 48 89 fa c7 44 24 30 00 00 00 00 48 c7 44 24 28 02 f8 02 01 48 c1 ea 03 <80> 3c 02 00 0f 85 2c 04 00 00 48 8b 9d 48 19 00 00 48 b8 00 00 00 RSP: 0000:ffff88810679fc88 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000004 RCX: 0000000000000000 RDX: 0000000000000329 RSI: ffff88810679fd88 RDI: 0000000000001948 RBP: 0000000000000000 R08: 0000000000000000 R09: ffffed1020f639f0 R10: ffff888107b1cf87 R11: 0000000000000400 R12: 1ffff11020cf3f92 R13: ffff88810679fd88 R14: ffff888100b97b08 R15: ffff8881030bb080 FS: 0000000000000000(0000) GS:ffff888119e80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000005286001 CR4: 0000000000770ee0 DR0: ffffffff8fdd6cf4 DR1:
---truncated---

CVE-2023-52863:
In the Linux kernel, the following vulnerability has been resolved:

hwmon: (axi-fan-control) Fix possible NULL pointer dereference

axi_fan_control_irq_handler(), dependent on the private axi_fan_control_data structure, might be called before the hwmon device is registered. That will cause an Unable to handle kernel NULL pointer dereference error.

CVE-2023-52862:
In the Linux kernel, the following vulnerability has been resolved:

drm/amd/display: Fix null pointer dereference in error message

This patch fixes a null pointer dereference in the error message that is printed when the Display Core (DC) fails to initialize. The original message includes the DC version number, which is undefined if the DC is not initialized.

CVE-2023-52861:
In the Linux kernel, the following vulnerability has been resolved:

drm: bridge: it66121: Fix invalid connector dereference

Fix the NULL pointer dereference when no monitor is connected, and the sound card is opened from userspace.

Instead return an empty buffer (of zeroes) as the EDID information to the sound framework if there is no connector attached.

CVE-2023-52860:
In the Linux kernel, the following vulnerability has been resolved:

drivers/perf: hisi: use cpuhp_state_remove_instance_nocalls() for hisi_hns3_pmu uninit process

When tearing down a 'hisi_hns3' PMU, we mistakenly run the CPU hotplug callbacks after the device has been unregistered, leading to fireworks when we try to execute empty function callbacks within the driver:

| Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 | CPU: 0 PID: 15 Comm: cpuhp/0 Tainted: G W O 5.12.0-rc4+ #1 | Hardware name: , BIOS KpxxxFPGA 1P B600 V143 04/22/2021 | pstate: 80400009 (Nzcv daif +PAN -UAO -TCO BTYPE=--) | pc : perf_pmu_migrate_context+0x98/0x38c | lr : perf_pmu_migrate_context+0x94/0x38c | | Call trace:
| perf_pmu_migrate_context+0x98/0x38c | hisi_hns3_pmu_offline_cpu+0x104/0x12c [hisi_hns3_pmu]

Use cpuhp_state_remove_instance_nocalls() instead of cpuhp_state_remove_instance() so that the notifiers don't execute after the PMU device has been unregistered.

[will: Rewrote commit message]

CVE-2023-52859:
In the Linux kernel, the following vulnerability has been resolved:

perf: hisi: Fix use-after-free when register pmu fails

When we fail to register the uncore pmu, the pmu context may not been allocated. The error handing will call cpuhp_state_remove_instance() to call uncore pmu offline callback, which migrate the pmu context.
Since that's liable to lead to some kind of use-after-free.

Use cpuhp_state_remove_instance_nocalls() instead of cpuhp_state_remove_instance() so that the notifiers don't execute after the PMU device has been failed to register.

CVE-2023-52857:
In the Linux kernel, the following vulnerability has been resolved:

drm/mediatek: Fix coverity issue with unintentional integer overflow

1. Instead of multiplying 2 variable of different types. Change to assign a value of one variable and then multiply the other variable.

2. Add a int variable for multiplier calculation instead of calculating different types multiplier with dma_addr_t variable directly.

CVE-2023-52856:
In the Linux kernel, the following vulnerability has been resolved:

drm/bridge: lt8912b: Fix crash on bridge detach

The lt8912b driver, in its bridge detach function, calls drm_connector_unregister() and drm_connector_cleanup().

drm_connector_unregister() should be called only for connectors explicitly registered with drm_connector_register(), which is not the case in lt8912b.

The driver's drm_connector_funcs.destroy hook is set to drm_connector_cleanup().

Thus the driver should not call either drm_connector_unregister() nor drm_connector_cleanup() in its lt8912_bridge_detach(), as they cause a crash on bridge detach:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Mem abort info:
ESR = 0x0000000096000006 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x06: level 2 translation fault Data abort info:
ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=00000000858f3000 [0000000000000000] pgd=0800000085918003, p4d=0800000085918003, pud=0800000085431003, pmd=0000000000000000 Internal error: Oops: 0000000096000006 [#1] PREEMPT SMP Modules linked in: tidss(-) display_connector lontium_lt8912b tc358768 panel_lvds panel_simple drm_dma_helper drm_kms_helper drm drm_panel_orientation_quirks CPU: 3 PID: 462 Comm: rmmod Tainted: G W 6.5.0-rc2+ #2 Hardware name: Toradex Verdin AM62 on Verdin Development Board (DT) pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : drm_connector_cleanup+0x78/0x2d4 [drm] lr : lt8912_bridge_detach+0x54/0x6c [lontium_lt8912b] sp : ffff800082ed3a90 x29: ffff800082ed3a90 x28: ffff0000040c1940 x27: 0000000000000000 x26: 0000000000000000 x25: dead000000000122 x24: dead000000000122 x23: dead000000000100 x22: ffff000003fb6388 x21: 0000000000000000 x20: 0000000000000000 x19: ffff000003fb6260 x18: fffffffffffe56e8 x17: 0000000000000000 x16: 0010000000000000 x15: 0000000000000038 x14: 0000000000000000 x13: ffff800081914b48 x12: 000000000000040e x11: 000000000000015a x10: ffff80008196ebb8 x9 : ffff800081914b48 x8 : 00000000ffffefff x7 : ffff0000040c1940 x6 : ffff80007aa649d0 x5 : 0000000000000000 x4 : 0000000000000001 x3 : ffff80008159e008 x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000 Call trace:
drm_connector_cleanup+0x78/0x2d4 [drm] lt8912_bridge_detach+0x54/0x6c [lontium_lt8912b] drm_bridge_detach+0x44/0x84 [drm] drm_encoder_cleanup+0x40/0xb8 [drm] drmm_encoder_alloc_release+0x1c/0x30 [drm] drm_managed_release+0xac/0x148 [drm] drm_dev_put.part.0+0x88/0xb8 [drm] devm_drm_dev_init_release+0x14/0x24 [drm] devm_action_release+0x14/0x20 release_nodes+0x5c/0x90 devres_release_all+0x8c/0xe0 device_unbind_cleanup+0x18/0x68 device_release_driver_internal+0x208/0x23c driver_detach+0x4c/0x94 bus_remove_driver+0x70/0xf4 driver_unregister+0x30/0x60 platform_driver_unregister+0x14/0x20 tidss_platform_driver_exit+0x18/0xb2c [tidss]
__arm64_sys_delete_module+0x1a0/0x2b4 invoke_syscall+0x48/0x110 el0_svc_common.constprop.0+0x60/0x10c do_el0_svc_compat+0x1c/0x40 el0_svc_compat+0x40/0xac el0t_32_sync_handler+0xb0/0x138 el0t_32_sync+0x194/0x198 Code: 9104a276 f2fbd5b7 aa0203e1 91008af8 (f85c0420)

CVE-2023-52855:
In the Linux kernel, the following vulnerability has been resolved:

usb: dwc2: fix possible NULL pointer dereference caused by driver concurrency

In _dwc2_hcd_urb_enqueue(), urb->hcpriv = NULL is executed without holding the lock hsotg->lock. In _dwc2_hcd_urb_dequeue():

spin_lock_irqsave(&hsotg->lock, flags);
...
if (!urb->hcpriv) { dev_dbg(hsotg->dev, ## urb->hcpriv is NULL ##\n);
goto out;
} rc = dwc2_hcd_urb_dequeue(hsotg, urb->hcpriv); // Use urb->hcpriv ...
out:
spin_unlock_irqrestore(&hsotg->lock, flags);

When _dwc2_hcd_urb_enqueue() and _dwc2_hcd_urb_dequeue() are concurrently executed, the NULL check of urb->hcpriv can be executed before urb->hcpriv = NULL. After urb->hcpriv is NULL, it can be used in the function call to dwc2_hcd_urb_dequeue(), which can cause a NULL pointer dereference.

This possible bug is found by an experimental static analysis tool developed by myself. This tool analyzes the locking APIs to extract function pairs that can be concurrently executed, and then analyzes the instructions in the paired functions to identify possible concurrency bugs including data races and atomicity violations. The above possible bug is reported, when my tool analyzes the source code of Linux 6.5.

To fix this possible bug, urb->hcpriv = NULL should be executed with holding the lock hsotg->lock. After using this patch, my tool never reports the possible bug, with the kernelconfiguration allyesconfig for x86_64. Because I have no associated hardware, I cannot test the patch in runtime testing, and just verify it according to the code logic.

CVE-2023-52853:
In the Linux kernel, the following vulnerability has been resolved:

hid: cp2112: Fix duplicate workqueue initialization

Previously the cp2112 driver called INIT_DELAYED_WORK within cp2112_gpio_irq_startup, resulting in duplicate initilizations of the workqueue on subsequent IRQ startups following an initial request. This resulted in a warning in set_work_data in workqueue.c, as well as a rare NULL dereference within process_one_work in workqueue.c.

Initialize the workqueue within _probe instead.

CVE-2023-52851:
In the Linux kernel, the following vulnerability has been resolved:

IB/mlx5: Fix init stage error handling to avoid double free of same QP and UAF

In the unlikely event that workqueue allocation fails and returns NULL in mlx5_mkey_cache_init(), delete the call to mlx5r_umr_resource_cleanup() (which frees the QP) in mlx5_ib_stage_post_ib_reg_umr_init(). This will avoid attempted double free of the same QP when __mlx5_ib_add() does its cleanup.

Resolves a splat:

Syzkaller reported a UAF in ib_destroy_qp_user

workqueue: Failed to create a rescuer kthread for wq mkey_cache: -EINTR infiniband mlx5_0: mlx5_mkey_cache_init:981:(pid 1642):
failed to create work queue infiniband mlx5_0: mlx5_ib_stage_post_ib_reg_umr_init:4075:(pid 1642):
mr cache init failed -12 ================================================================== BUG: KASAN: slab-use-after-free in ib_destroy_qp_user (drivers/infiniband/core/verbs.c:2073) Read of size 8 at addr ffff88810da310a8 by task repro_upstream/1642

Call Trace:
<TASK> kasan_report (mm/kasan/report.c:590) ib_destroy_qp_user (drivers/infiniband/core/verbs.c:2073) mlx5r_umr_resource_cleanup (drivers/infiniband/hw/mlx5/umr.c:198)
__mlx5_ib_add (drivers/infiniband/hw/mlx5/main.c:4178) mlx5r_probe (drivers/infiniband/hw/mlx5/main.c:4402) ...
</TASK>

Allocated by task 1642:
__kmalloc (./include/linux/kasan.h:198 mm/slab_common.c:1026 mm/slab_common.c:1039) create_qp (./include/linux/slab.h:603 ./include/linux/slab.h:720 ./include/rdma/ib_verbs.h:2795 drivers/infiniband/core/verbs.c:1209) ib_create_qp_kernel (drivers/infiniband/core/verbs.c:1347) mlx5r_umr_resource_init (drivers/infiniband/hw/mlx5/umr.c:164) mlx5_ib_stage_post_ib_reg_umr_init (drivers/infiniband/hw/mlx5/main.c:4070)
__mlx5_ib_add (drivers/infiniband/hw/mlx5/main.c:4168) mlx5r_probe (drivers/infiniband/hw/mlx5/main.c:4402) ...

Freed by task 1642:
__kmem_cache_free (mm/slub.c:1826 mm/slub.c:3809 mm/slub.c:3822) ib_destroy_qp_user (drivers/infiniband/core/verbs.c:2112) mlx5r_umr_resource_cleanup (drivers/infiniband/hw/mlx5/umr.c:198) mlx5_ib_stage_post_ib_reg_umr_init (drivers/infiniband/hw/mlx5/main.c:4076 drivers/infiniband/hw/mlx5/main.c:4065)
__mlx5_ib_add (drivers/infiniband/hw/mlx5/main.c:4168) mlx5r_probe (drivers/infiniband/hw/mlx5/main.c:4402) ...

CVE-2023-52850:
In the Linux kernel, the following vulnerability has been resolved:

media: hantro: Check whether reset op is defined before use

The i.MX8MM/N/P does not define the .reset op since reset of the VPU is done by genpd. Check whether the .reset op is defined before calling it to avoid NULL pointer dereference.

Note that the Fixes tag is set to the commit which removed the reset op from i.MX8M Hantro G2 implementation, this is because before this commit all the implementations did define the .reset op.

CVE-2023-52849:
In the Linux kernel, the following vulnerability has been resolved:

cxl/mem: Fix shutdown order

Ira reports that removing cxl_mock_mem causes a crash with the following trace:

BUG: kernel NULL pointer dereference, address: 0000000000000044 [..] RIP: 0010:cxl_region_decode_reset+0x7f/0x180 [cxl_core] [..] Call Trace:
<TASK> cxl_region_detach+0xe8/0x210 [cxl_core] cxl_decoder_kill_region+0x27/0x40 [cxl_core] cxld_unregister+0x29/0x40 [cxl_core] devres_release_all+0xb8/0x110 device_unbind_cleanup+0xe/0x70 device_release_driver_internal+0x1d2/0x210 bus_remove_device+0xd7/0x150 device_del+0x155/0x3e0 device_unregister+0x13/0x60 devm_release_action+0x4d/0x90 ? __pfx_unregister_port+0x10/0x10 [cxl_core] delete_endpoint+0x121/0x130 [cxl_core] devres_release_all+0xb8/0x110 device_unbind_cleanup+0xe/0x70 device_release_driver_internal+0x1d2/0x210 bus_remove_device+0xd7/0x150 device_del+0x155/0x3e0 ? lock_release+0x142/0x290 cdev_device_del+0x15/0x50 cxl_memdev_unregister+0x54/0x70 [cxl_core]

This crash is due to the clearing out the cxl_memdev's driver context (@cxlds) before the subsystem is done with it. This is ultimately due to the region(s), that this memdev is a member, being torn down and expecting to be able to de-reference @cxlds, like here:

static int cxl_region_decode_reset(struct cxl_region *cxlr, int count) ...
if (cxlds->rcd) goto endpoint_reset;
...

Fix it by keeping the driver context valid until memdev-device unregistration, and subsequently the entire stack of related dependencies, unwinds.

CVE-2023-52848:
In the Linux kernel, the following vulnerability has been resolved:

f2fs: fix to drop meta_inode's page cache in f2fs_put_super()

syzbot reports a kernel bug as below:

F2FS-fs (loop1): detect filesystem reference count leak during umount, type: 10, count: 1 kernel BUG at fs/f2fs/super.c:1639! CPU: 0 PID: 15451 Comm: syz-executor.1 Not tainted 6.5.0-syzkaller-09338-ge0152e7481c6 #0 RIP: 0010:f2fs_put_super+0xce1/0xed0 fs/f2fs/super.c:1639 Call Trace:
generic_shutdown_super+0x161/0x3c0 fs/super.c:693 kill_block_super+0x3b/0x70 fs/super.c:1646 kill_f2fs_super+0x2b7/0x3d0 fs/f2fs/super.c:4879 deactivate_locked_super+0x9a/0x170 fs/super.c:481 deactivate_super+0xde/0x100 fs/super.c:514 cleanup_mnt+0x222/0x3d0 fs/namespace.c:1254 task_work_run+0x14d/0x240 kernel/task_work.c:179 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline] exit_to_user_mode_loop kernel/entry/common.c:171 [inline] exit_to_user_mode_prepare+0x210/0x240 kernel/entry/common.c:204
__syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline] syscall_exit_to_user_mode+0x1d/0x60 kernel/entry/common.c:296 do_syscall_64+0x44/0xb0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x63/0xcd

In f2fs_put_super(), it tries to do sanity check on dirty and IO reference count of f2fs, once there is any reference count leak, it will trigger panic.

The root case is, during f2fs_put_super(), if there is any IO error in f2fs_wait_on_all_pages(), we missed to truncate meta_inode's page cache later, result in panic, fix this case.

CVE-2023-52847:
In the Linux kernel, the following vulnerability has been resolved:

media: bttv: fix use after free error due to btv->timeout timer

There may be some a race condition between timer function bttv_irq_timeout and bttv_remove. The timer is setup in probe and there is no timer_delete operation in remove function. When it hit kfree btv, the function might still be invoked, which will cause use after free bug.

This bug is found by static analysis, it may be false positive.

Fix it by adding del_timer_sync invoking to the remove function.

cpu0 cpu1 bttv_probe
->timer_setup
->bttv_set_dma
->mod_timer;
bttv_remove
->kfree(btv);
->bttv_irq_timeout
->USE btv

CVE-2023-52844:
In the Linux kernel, the following vulnerability has been resolved:

media: vidtv: psi: Add check for kstrdup

Add check for the return value of kstrdup() and return the error if it fails in order to avoid NULL pointer dereference.

CVE-2023-52841:
In the Linux kernel, the following vulnerability has been resolved:

media: vidtv: mux: Add check and kfree for kstrdup

Add check for the return value of kstrdup() and return the error if it fails in order to avoid NULL pointer dereference.
Moreover, use kfree() in the later error handling in order to avoid memory leak.

CVE-2023-52840:
In the Linux kernel, the following vulnerability has been resolved:

Input: synaptics-rmi4 - fix use after free in rmi_unregister_function()

The put_device() calls rmi_release_function() which frees fn so the dereference on the next line fn->num_of_irqs is a use after free.
Move the put_device() to the end to fix this.

CVE-2023-52839:
In the Linux kernel, the following vulnerability has been resolved:

drivers: perf: Do not broadcast to other cpus when starting a counter

This command:

$ perf record -e cycles:k -e instructions:k -c 10000 -m 64M dd if=/dev/zero of=/dev/null count=1000

gives rise to this kernel warning:

[ 444.364395] WARNING: CPU: 0 PID: 104 at kernel/smp.c:775 smp_call_function_many_cond+0x42c/0x436 [ 444.364515] Modules linked in:
[ 444.364657] CPU: 0 PID: 104 Comm: perf-exec Not tainted 6.6.0-rc6-00051-g391df82e8ec3-dirty #73 [ 444.364771] Hardware name: riscv-virtio,qemu (DT) [ 444.364868] epc : smp_call_function_many_cond+0x42c/0x436 [ 444.364917] ra : on_each_cpu_cond_mask+0x20/0x32 [ 444.364948] epc : ffffffff8009f9e0 ra : ffffffff8009fa5a sp : ff20000000003800 [ 444.364966] gp : ffffffff81500aa0 tp : ff60000002b83000 t0 : ff200000000038c0 [ 444.364982] t1 : ffffffff815021f0 t2 : 000000000000001f s0 : ff200000000038b0 [ 444.364998] s1 : ff60000002c54d98 a0 : ff60000002a73940 a1 : 0000000000000000 [ 444.365013] a2 : 0000000000000000 a3 : 0000000000000003 a4 : 0000000000000100 [ 444.365029] a5 : 0000000000010100 a6 : 0000000000f00000 a7 : 0000000000000000 [ 444.365044] s2 : 0000000000000000 s3 : ffffffffffffffff s4 : ff60000002c54d98 [ 444.365060] s5 : ffffffff81539610 s6 : ffffffff80c20c48 s7 : 0000000000000000 [ 444.365075] s8 : 0000000000000000 s9 : 0000000000000001 s10: 0000000000000001 [ 444.365090] s11: ffffffff80099394 t3 : 0000000000000003 t4 : 00000000eac0c6e6 [ 444.365104] t5 : 0000000400000000 t6 : ff60000002e010d0 [ 444.365120] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [ 444.365226] [<ffffffff8009f9e0>] smp_call_function_many_cond+0x42c/0x436 [ 444.365295] [<ffffffff8009fa5a>] on_each_cpu_cond_mask+0x20/0x32 [ 444.365311] [<ffffffff806e90dc>] pmu_sbi_ctr_start+0x7a/0xaa [ 444.365327] [<ffffffff806e880c>] riscv_pmu_start+0x48/0x66 [ 444.365339] [<ffffffff8012111a>] perf_adjust_freq_unthr_context+0x196/0x1ac [ 444.365356] [<ffffffff801237aa>] perf_event_task_tick+0x78/0x8c [ 444.365368] [<ffffffff8003faf4>] scheduler_tick+0xe6/0x25e [ 444.365383] [<ffffffff8008a042>] update_process_times+0x80/0x96 [ 444.365398] [<ffffffff800991ec>] tick_sched_handle+0x26/0x52 [ 444.365410] [<ffffffff800993e4>] tick_sched_timer+0x50/0x98 [ 444.365422] [<ffffffff8008a6aa>] __hrtimer_run_queues+0x126/0x18a [ 444.365433] [<ffffffff8008b350>] hrtimer_interrupt+0xce/0x1da [ 444.365444] [<ffffffff806cdc60>] riscv_timer_interrupt+0x30/0x3a [ 444.365457] [<ffffffff8006afa6>] handle_percpu_devid_irq+0x80/0x114 [ 444.365470] [<ffffffff80065b82>] generic_handle_domain_irq+0x1c/0x2a [ 444.365483] [<ffffffff8045faec>] riscv_intc_irq+0x2e/0x46 [ 444.365497] [<ffffffff808a9c62>] handle_riscv_irq+0x4a/0x74 [ 444.365521] [<ffffffff808aa760>] do_irq+0x7c/0x7e [ 444.365796] ---[ end trace 0000000000000000 ]---

That's because the fix in commit 3fec323339a4 (drivers: perf: Fix panic in riscv SBI mmap support) was wrong since there is no need to broadcast to other cpus when starting a counter, that's only needed in mmap when the counters could have already been started on other cpus, so simply remove this broadcast.

CVE-2023-52838:
In the Linux kernel, the following vulnerability has been resolved:

fbdev: imsttfb: fix a resource leak in probe

I've re-written the error handling but the bug is that if init_imstt() fails we need to call iounmap(par->cmap_regs).

CVE-2023-52837:
In the Linux kernel, the following vulnerability has been resolved:

nbd: fix uaf in nbd_open

Commit 4af5f2e03013 (nbd: use blk_mq_alloc_disk and blk_cleanup_disk) cleans up disk by blk_cleanup_disk() and it won't set disk->private_data as NULL as before. UAF may be triggered in nbd_open() if someone tries to open nbd device right after nbd_put() since nbd has been free in nbd_dev_remove().

Fix this by implementing ->free_disk and free private data in it.

CVE-2023-52833:
In the Linux kernel, the following vulnerability has been resolved:

Bluetooth: btusb: Add date->evt_skb is NULL check

fix crash because of null pointers

[ 6104.969662] BUG: kernel NULL pointer dereference, address: 00000000000000c8 [ 6104.969667] #PF: supervisor read access in kernel mode [ 6104.969668] #PF: error_code(0x0000) - not-present page [ 6104.969670] PGD 0 P4D 0 [ 6104.969673] Oops: 0000 [#1] SMP NOPTI [ 6104.969684] RIP: 0010:btusb_mtk_hci_wmt_sync+0x144/0x220 [btusb] [ 6104.969688] RSP: 0018:ffffb8d681533d48 EFLAGS: 00010246 [ 6104.969689] RAX: 0000000000000000 RBX: ffff8ad560bb2000 RCX: 0000000000000006 [ 6104.969691] RDX: 0000000000000000 RSI: ffffb8d681533d08 RDI: 0000000000000000 [ 6104.969692] RBP: ffffb8d681533d70 R08: 0000000000000001 R09: 0000000000000001 [ 6104.969694] R10: 0000000000000001 R11: 00000000fa83b2da R12: ffff8ad461d1d7c0 [ 6104.969695] R13: 0000000000000000 R14: ffff8ad459618c18 R15: ffffb8d681533d90 [ 6104.969697] FS: 00007f5a1cab9d40(0000) GS:ffff8ad578200000(0000) knlGS:00000 [ 6104.969699] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6104.969700] CR2: 00000000000000c8 CR3: 000000018620c001 CR4: 0000000000760ef0 [ 6104.969701] PKRU: 55555554 [ 6104.969702] Call Trace:
[ 6104.969708] btusb_mtk_shutdown+0x44/0x80 [btusb] [ 6104.969732] hci_dev_do_close+0x470/0x5c0 [bluetooth] [ 6104.969748] hci_rfkill_set_block+0x56/0xa0 [bluetooth] [ 6104.969753] rfkill_set_block+0x92/0x160 [ 6104.969755] rfkill_fop_write+0x136/0x1e0 [ 6104.969759] __vfs_write+0x18/0x40 [ 6104.969761] vfs_write+0xdf/0x1c0 [ 6104.969763] ksys_write+0xb1/0xe0 [ 6104.969765] __x64_sys_write+0x1a/0x20 [ 6104.969769] do_syscall_64+0x51/0x180 [ 6104.969771] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 6104.969773] RIP: 0033:0x7f5a21f18fef [ 6104.9] RSP: 002b:00007ffeefe39010 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [ 6104.969780] RAX: ffffffffffffffda RBX: 000055c10a7560a0 RCX: 00007f5a21f18fef [ 6104.969781] RDX: 0000000000000008 RSI: 00007ffeefe39060 RDI: 0000000000000012 [ 6104.969782] RBP: 00007ffeefe39060 R08: 0000000000000000 R09: 0000000000000017 [ 6104.969784] R10: 00007ffeefe38d97 R11: 0000000000000293 R12: 0000000000000002 [ 6104.969785] R13: 00007ffeefe39220 R14: 00007ffeefe391a0 R15: 000055c10a72acf0

CVE-2023-52826:
In the Linux kernel, the following vulnerability has been resolved:

drm/panel/panel-tpo-tpg110: fix a possible null pointer dereference

In tpg110_get_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a NULL pointer dereference on failure of drm_mode_duplicate(). Add a check to avoid npd.

CVE-2023-52825:
In the Linux kernel, the following vulnerability has been resolved:

drm/amdkfd: Fix a race condition of vram buffer unref in svm code

prange->svm_bo unref can happen in both mmu callback and a callback after migrate to system ram. Both are async call in different tasks. Sync svm_bo unref operation to avoid random use-after-free.

CVE-2023-52822:
In the Linux kernel, the following vulnerability has been resolved:

drm: vmwgfx_surface.c: copy user-array safely

Currently, there is no overflow-check with memdup_user().

Use the new function memdup_array_user() instead of memdup_user() for duplicating the user-space array safely.

CVE-2023-52821:
In the Linux kernel, the following vulnerability has been resolved:

drm/panel: fix a possible null pointer dereference

In versatile_panel_get_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a NULL pointer dereference on failure of drm_mode_duplicate(). Add a check to avoid npd.

CVE-2023-52820:
In the Linux kernel, the following vulnerability has been resolved:

drm_lease.c: copy user-array safely

Currently, there is no overflow-check with memdup_user().

Use the new function memdup_array_user() instead of memdup_user() for duplicating the user-space array safely.

CVE-2023-52806:
In the Linux kernel, the following vulnerability has been resolved:

ALSA: hda: Fix possible null-ptr-deref when assigning a stream

While AudioDSP drivers assign streams exclusively of HOST or LINK type, nothing blocks a user to attempt to assign a COUPLED stream. As supplied substream instance may be a stub, what is the case when code-loading, such scenario ends with null-ptr-deref.

CVE-2023-52805:
In the Linux kernel, the following vulnerability has been resolved:

jfs: fix array-index-out-of-bounds in diAlloc

Currently there is not check against the agno of the iag while allocating new inodes to avoid fragmentation problem. Added the check which is required.

CVE-2023-52799:
In the Linux kernel, the following vulnerability has been resolved:

jfs: fix array-index-out-of-bounds in dbFindLeaf

Currently while searching for dmtree_t for sufficient free blocks there is an array out of bounds while getting element in tp->dm_stree. To add the required check for out of bound we first need to determine the type of dmtree. Thus added an extra parameter to dbFindLeaf so that the type of tree can be determined and the required check can be applied.

CVE-2023-52795:
In the Linux kernel, the following vulnerability has been resolved:

vhost-vdpa: fix use after free in vhost_vdpa_probe()

The put_device() calls vhost_vdpa_release_dev() which calls ida_simple_remove() and frees v. So this call to ida_simple_remove() is a use after free and a double free.

CVE-2023-52792:
In the Linux kernel, the following vulnerability has been resolved:

cxl/region: Do not try to cleanup after cxl_region_setup_targets() fails

Commit 5e42bcbc3fef (cxl/region: decrement ->nr_targets on error in cxl_region_attach()) tried to avoid 'eiw' initialization errors when
->nr_targets exceeded 16, by just decrementing ->nr_targets when cxl_region_setup_targets() failed.

Commit 86987c766276 (cxl/region: Cleanup target list on attach error) extended that cleanup to also clear cxled->pos and p->targets[pos]. The initialization error was incidentally fixed separately by:
Commit 8d4285425714 (cxl/region: Fix port setup uninitialized variable warnings) which was merged a few days after 5e42bcbc3fef.

But now the original cleanup when cxl_region_setup_targets() fails prevents endpoint and switch decoder resources from being reused:

1) the cleanup does not set the decoder's region to NULL, which results in future dpa_size_store() calls returning -EBUSY 2) the decoder is not properly freed, which results in future commit errors associated with the upstream switch

Now that the initialization errors were fixed separately, the proper cleanup for this case is to just return immediately. Then the resources associated with this target get cleanup up as normal when the failed region is deleted.

The ->nr_targets decrement in the error case also helped prevent a p->targets[] array overflow, so add a new check to prevent against that overflow.

Tested by trying to create an invalid region for a 2 switch * 2 endpoint topology, and then following up with creating a valid region.

CVE-2023-52791:
In the Linux kernel, the following vulnerability has been resolved:

i2c: core: Run atomic i2c xfer when !preemptible

Since bae1d3a05a8b, i2c transfers are non-atomic if preemption is disabled. However, non-atomic i2c transfers require preemption (e.g. in wait_for_completion() while waiting for the DMA).

panic() calls preempt_disable_notrace() before calling emergency_restart(). Therefore, if an i2c device is used for the restart, the xfer should be atomic. This avoids warnings like:

[ 12.667612] WARNING: CPU: 1 PID: 1 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x33c/0x6b0 [ 12.676926] Voluntary context switch within RCU read-side critical section! ...
[ 12.742376] schedule_timeout from wait_for_completion_timeout+0x90/0x114 [ 12.749179] wait_for_completion_timeout from tegra_i2c_wait_completion+0x40/0x70 ...
[ 12.994527] atomic_notifier_call_chain from machine_restart+0x34/0x58 [ 13.001050] machine_restart from panic+0x2a8/0x32c

Use !preemptible() instead, which is basically the same check as pre-v5.2.

CVE-2023-52784:
In the Linux kernel, the following vulnerability has been resolved:

bonding: stop the device in bond_setup_by_slave()

Commit 9eed321cde22 (net: lapbether: only support ethernet devices) has been able to keep syzbot away from net/lapb, until today.

In the following splat [1], the issue is that a lapbether device has been created on a bonding device without members. Then adding a non ARPHRD_ETHER member forced the bonding master to change its type.

The fix is to make sure we call dev_close() in bond_setup_by_slave() so that the potential linked lapbether devices (or any other devices having assumptions on the physical device) are removed.

A similar bug has been addressed in commit 40baec225765 (bonding: fix panic on non-ARPHRD_ETHER enslave failure)

[1] skbuff: skb_under_panic: text:ffff800089508810 len:44 put:40 head:ffff0000c78e7c00 data:ffff0000c78e7bea tail:0x16 end:0x140 dev:bond0 kernel BUG at net/core/skbuff.c:192 ! Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP Modules linked in:
CPU: 0 PID: 6007 Comm: syz-executor383 Not tainted 6.6.0-rc3-syzkaller-gbf6547d8715b #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/04/2023 pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : skb_panic net/core/skbuff.c:188 [inline] pc : skb_under_panic+0x13c/0x140 net/core/skbuff.c:202 lr : skb_panic net/core/skbuff.c:188 [inline] lr : skb_under_panic+0x13c/0x140 net/core/skbuff.c:202 sp : ffff800096a06aa0 x29: ffff800096a06ab0 x28: ffff800096a06ba0 x27: dfff800000000000 x26: ffff0000ce9b9b50 x25: 0000000000000016 x24: ffff0000c78e7bea x23: ffff0000c78e7c00 x22: 000000000000002c x21: 0000000000000140 x20: 0000000000000028 x19: ffff800089508810 x18: ffff800096a06100 x17: 0000000000000000 x16: ffff80008a629a3c x15: 0000000000000001 x14: 1fffe00036837a32 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000201 x10: 0000000000000000 x9 : cb50b496c519aa00 x8 : cb50b496c519aa00 x7 : 0000000000000001 x6 : 0000000000000001 x5 : ffff800096a063b8 x4 : ffff80008e280f80 x3 : ffff8000805ad11c x2 : 0000000000000001 x1 : 0000000100000201 x0 : 0000000000000086 Call trace:
skb_panic net/core/skbuff.c:188 [inline] skb_under_panic+0x13c/0x140 net/core/skbuff.c:202 skb_push+0xf0/0x108 net/core/skbuff.c:2446 ip6gre_header+0xbc/0x738 net/ipv6/ip6_gre.c:1384 dev_hard_header include/linux/netdevice.h:3136 [inline] lapbeth_data_transmit+0x1c4/0x298 drivers/net/wan/lapbether.c:257 lapb_data_transmit+0x8c/0xb0 net/lapb/lapb_iface.c:447 lapb_transmit_buffer+0x178/0x204 net/lapb/lapb_out.c:149 lapb_send_control+0x220/0x320 net/lapb/lapb_subr.c:251
__lapb_disconnect_request+0x9c/0x17c net/lapb/lapb_iface.c:326 lapb_device_event+0x288/0x4e0 net/lapb/lapb_iface.c:492 notifier_call_chain+0x1a4/0x510 kernel/notifier.c:93 raw_notifier_call_chain+0x3c/0x50 kernel/notifier.c:461 call_netdevice_notifiers_info net/core/dev.c:1970 [inline] call_netdevice_notifiers_extack net/core/dev.c:2008 [inline] call_netdevice_notifiers net/core/dev.c:2022 [inline]
__dev_close_many+0x1b8/0x3c4 net/core/dev.c:1508 dev_close_many+0x1e0/0x470 net/core/dev.c:1559 dev_close+0x174/0x250 net/core/dev.c:1585 lapbeth_device_event+0x2e4/0x958 drivers/net/wan/lapbether.c:466 notifier_call_chain+0x1a4/0x510 kernel/notifier.c:93 raw_notifier_call_chain+0x3c/0x50 kernel/notifier.c:461 call_netdevice_notifiers_info net/core/dev.c:1970 [inline] call_netdevice_notifiers_extack net/core/dev.c:2008 [inline] call_netdevice_notifiers net/core/dev.c:2022 [inline]
__dev_close_many+0x1b8/0x3c4 net/core/dev.c:1508 dev_close_many+0x1e0/0x470 net/core/dev.c:1559 dev_close+0x174/0x250 net/core/dev.c:1585 bond_enslave+0x2298/0x30cc drivers/net/bonding/bond_main.c:2332 bond_do_ioctl+0x268/0xc64 drivers/net/bonding/bond_main.c:4539 dev_ifsioc+0x754/0x9ac dev_ioctl+0x4d8/0xd34 net/core/dev_ioctl.c:786 sock_do_ioctl+0x1d4/0x2d0 net/socket.c:1217 sock_ioctl+0x4e8/0x834 net/socket.c:1322 vfs_ioctl fs/ioctl.c:51 [inline]
__do_
---truncated---

CVE-2023-52760:
In the Linux kernel, the following vulnerability has been resolved:

gfs2: Fix slab-use-after-free in gfs2_qd_dealloc

In gfs2_put_super(), whether withdrawn or not, the quota should be cleaned up by gfs2_quota_cleanup().

Otherwise, struct gfs2_sbd will be freed before gfs2_qd_dealloc (rcu callback) has run for all gfs2_quota_data objects, resulting in use-after-free.

Also, gfs2_destroy_threads() and gfs2_quota_cleanup() is already called by gfs2_make_fs_ro(), so in gfs2_put_super(), after calling gfs2_make_fs_ro(), there is no need to call them again.

CVE-2023-52757:
In the Linux kernel, the following vulnerability has been resolved:

smb: client: fix potential deadlock when releasing mids

All release_mid() callers seem to hold a reference of @mid so there is no need to call kref_put(&mid->refcount, __release_mid) under @server->mid_lock spinlock. If they don't, then an use-after-free bug would have occurred anyways.

By getting rid of such spinlock also fixes a potential deadlock as shown below

CPU 0 CPU 1
------------------------------------------------------------------ cifs_demultiplex_thread() cifs_debug_data_proc_show() release_mid() spin_lock(&server->mid_lock);
spin_lock(&cifs_tcp_ses_lock) spin_lock(&server->mid_lock)
__release_mid() smb2_find_smb_tcon() spin_lock(&cifs_tcp_ses_lock) *deadlock*

CVE-2023-52755:
In the Linux kernel, the following vulnerability has been resolved:

ksmbd: fix slab out of bounds write in smb_inherit_dacl()

slab out-of-bounds write is caused by that offsets is bigger than pntsd allocation size. This patch add the check to validate 3 offsets using allocation size.

CVE-2023-52752:
In the Linux kernel, the following vulnerability has been resolved:

smb: client: fix use-after-free bug in cifs_debug_data_proc_show()

Skip SMB sessions that are being teared down (e.g. @ses->ses_status == SES_EXITING) in cifs_debug_data_proc_show() to avoid use-after-free in @ses.

This fixes the following GPF when reading from /proc/fs/cifs/DebugData while mounting and umounting

[ 816.251274] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6d81: 0000 [#1] PREEMPT SMP NOPTI ...
[ 816.260138] Call Trace:
[ 816.260329] <TASK> [ 816.260499] ? die_addr+0x36/0x90 [ 816.260762] ? exc_general_protection+0x1b3/0x410 [ 816.261126] ? asm_exc_general_protection+0x26/0x30 [ 816.261502] ? cifs_debug_tcon+0xbd/0x240 [cifs] [ 816.261878] ? cifs_debug_tcon+0xab/0x240 [cifs] [ 816.262249] cifs_debug_data_proc_show+0x516/0xdb0 [cifs] [ 816.262689] ? seq_read_iter+0x379/0x470 [ 816.262995] seq_read_iter+0x118/0x470 [ 816.263291] proc_reg_read_iter+0x53/0x90 [ 816.263596] ? srso_alias_return_thunk+0x5/0x7f [ 816.263945] vfs_read+0x201/0x350 [ 816.264211] ksys_read+0x75/0x100 [ 816.264472] do_syscall_64+0x3f/0x90 [ 816.264750] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 816.265135] RIP: 0033:0x7fd5e669d381

CVE-2023-52751:
In the Linux kernel, the following vulnerability has been resolved:

smb: client: fix use-after-free in smb2_query_info_compound()

The following UAF was triggered when running fstests generic/072 with KASAN enabled against Windows Server 2022 and mount options 'multichannel,max_channels=2,vers=3.1.1,mfsymlinks,noperm'

BUG: KASAN: slab-use-after-free in smb2_query_info_compound+0x423/0x6d0 [cifs] Read of size 8 at addr ffff888014941048 by task xfs_io/27534

CPU: 0 PID: 27534 Comm: xfs_io Not tainted 6.6.0-rc7 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 Call Trace:
dump_stack_lvl+0x4a/0x80 print_report+0xcf/0x650 ? srso_alias_return_thunk+0x5/0x7f ? srso_alias_return_thunk+0x5/0x7f ? __phys_addr+0x46/0x90 kasan_report+0xda/0x110 ? smb2_query_info_compound+0x423/0x6d0 [cifs] ? smb2_query_info_compound+0x423/0x6d0 [cifs] smb2_query_info_compound+0x423/0x6d0 [cifs] ? __pfx_smb2_query_info_compound+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0x7f ? __stack_depot_save+0x39/0x480 ? kasan_save_stack+0x33/0x60 ? kasan_set_track+0x25/0x30 ? ____kasan_slab_free+0x126/0x170 smb2_queryfs+0xc2/0x2c0 [cifs] ? __pfx_smb2_queryfs+0x10/0x10 [cifs] ? __pfx___lock_acquire+0x10/0x10 smb311_queryfs+0x210/0x220 [cifs] ? __pfx_smb311_queryfs+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0x7f ? __lock_acquire+0x480/0x26c0 ? lock_release+0x1ed/0x640 ? srso_alias_return_thunk+0x5/0x7f ? do_raw_spin_unlock+0x9b/0x100 cifs_statfs+0x18c/0x4b0 [cifs] statfs_by_dentry+0x9b/0xf0 fd_statfs+0x4e/0xb0
__do_sys_fstatfs+0x7f/0xe0 ? __pfx___do_sys_fstatfs+0x10/0x10 ? srso_alias_return_thunk+0x5/0x7f ? lockdep_hardirqs_on_prepare+0x136/0x200 ? srso_alias_return_thunk+0x5/0x7f do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Allocated by task 27534:
kasan_save_stack+0x33/0x60 kasan_set_track+0x25/0x30
__kasan_kmalloc+0x8f/0xa0 open_cached_dir+0x71b/0x1240 [cifs] smb2_query_info_compound+0x5c3/0x6d0 [cifs] smb2_queryfs+0xc2/0x2c0 [cifs] smb311_queryfs+0x210/0x220 [cifs] cifs_statfs+0x18c/0x4b0 [cifs] statfs_by_dentry+0x9b/0xf0 fd_statfs+0x4e/0xb0
__do_sys_fstatfs+0x7f/0xe0 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Freed by task 27534:
kasan_save_stack+0x33/0x60 kasan_set_track+0x25/0x30 kasan_save_free_info+0x2b/0x50
____kasan_slab_free+0x126/0x170 slab_free_freelist_hook+0xd0/0x1e0
__kmem_cache_free+0x9d/0x1b0 open_cached_dir+0xff5/0x1240 [cifs] smb2_query_info_compound+0x5c3/0x6d0 [cifs] smb2_queryfs+0xc2/0x2c0 [cifs]

This is a race between open_cached_dir() and cached_dir_lease_break() where the cache entry for the open directory handle receives a lease break while creating it. And before returning from open_cached_dir(), we put the last reference of the new @cfid because of !@cfid->has_lease.

Besides the UAF, while running xfstests a lot of missed lease breaks have been noticed in tests that run several concurrent statfs(2) calls on those cached fids

CIFS: VFS: \\w22-root1.gandalf.test No task to wake, unknown frame...
CIFS: VFS: \\w22-root1.gandalf.test Cmd: 18 Err: 0x0 Flags: 0x1...
CIFS: VFS: \\w22-root1.gandalf.test smb buf 00000000715bfe83 len 108 CIFS: VFS: Dump pending requests:
CIFS: VFS: \\w22-root1.gandalf.test No task to wake, unknown frame...
CIFS: VFS: \\w22-root1.gandalf.test Cmd: 18 Err: 0x0 Flags: 0x1...
CIFS: VFS: \\w22-root1.gandalf.test smb buf 000000005aa7316e len 108 ...

To fix both, in open_cached_dir() ensure that @cfid->has_lease is set right before sending out compounded request so that any potential lease break will be get processed by demultiplex thread while we're still caching @cfid. And, if open failed for some reason, re-check @cfid->has_lease to decide whether or not put lease reference.

CVE-2023-52748:
In the Linux kernel, the following vulnerability has been resolved:

f2fs: avoid format-overflow warning

With gcc and W=1 option, there's a warning like this:

fs/f2fs/compress.c: In function f2fs_init_page_array_cache':
fs/f2fs/compress.c:1984:47: error: %u' directive writing between 1 and 7 bytes into a region of size between 5 and 8 [-Werror=format-overflow=] 1984 | sprintf(slab_name, f2fs_page_array_entry-%u:%u, MAJOR(dev), MINOR(dev));
| ^~

String f2fs_page_array_entry-%u:%u can up to 35. The first %u can up to 4 and the second %u can up to 7, so total size is 24 + 4 + 7 = 35.
slab_name's size should be 35 rather than 32.

CVE-2023-52699:
In the Linux kernel, the following vulnerability has been resolved:

sysv: don't call sb_bread() with pointers_lock held

syzbot is reporting sleep in atomic context in SysV filesystem [1], for sb_bread() is called with rw_spinlock held.

A write_lock(&pointers_lock) => read_lock(&pointers_lock) deadlock bug and a sb_bread() with write_lock(&pointers_lock) bug were introduced by Replace BKL for chain locking with sysvfs-private rwlock in Linux 2.5.12.

Then, [PATCH] err1-40: sysvfs locking fix in Linux 2.6.8 fixed the former bug by moving pointers_lock lock to the callers, but instead introduced a sb_bread() with read_lock(&pointers_lock) bug (which made this problem easier to hit).

Al Viro suggested that why not to do like get_branch()/get_block()/ find_shared() in Minix filesystem does. And doing like that is almost a revert of [PATCH] err1-40: sysvfs locking fix except that get_branch() from with find_shared() is called without write_lock(&pointers_lock).

CVE-2023-52697:
In the Linux kernel, the following vulnerability has been resolved:

ASoC: Intel: sof_sdw_rt_sdca_jack_common: ctx->headset_codec_dev = NULL

sof_sdw_rt_sdca_jack_exit() are used by different codecs, and some of them use the same dai name.
For example, rt712 and rt713 both use rt712-sdca-aif1 and sof_sdw_rt_sdca_jack_exit().
As a result, sof_sdw_rt_sdca_jack_exit() will be called twice by mc_dailink_exit_loop(). Set ctx->headset_codec_dev = NULL; after put_device(ctx->headset_codec_dev); to avoid ctx->headset_codec_dev being put twice.

CVE-2023-52694:
In the Linux kernel, the following vulnerability has been resolved:

drm/bridge: tpd12s015: Drop buggy __exit annotation for remove function

With tpd12s015_remove() marked with __exit this function is discarded when the driver is compiled as a built-in. The result is that when the driver unbinds there is no cleanup done which results in resource leakage or worse.

CVE-2023-52693:
In the Linux kernel, the following vulnerability has been resolved:

ACPI: video: check for error while searching for backlight device parent

If acpi_get_parent() called in acpi_video_dev_register_backlight() fails, for example, because acpi_ut_acquire_mutex() fails inside acpi_get_parent), this can lead to incorrect (uninitialized) acpi_parent handle being passed to acpi_get_pci_dev() for detecting the parent pci device.

Check acpi_get_parent() result and set parent device only in case of success.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

CVE-2023-52691:
In the Linux kernel, the following vulnerability has been resolved:

drm/amd/pm: fix a double-free in si_dpm_init

When the allocation of adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk.entries fails, amdgpu_free_extended_power_table is called to free some fields of adev.
However, when the control flow returns to si_dpm_sw_init, it goes to label dpm_failed and calls si_dpm_fini, which calls amdgpu_free_extended_power_table again and free those fields again. Thus a double-free is triggered.

CVE-2023-52687:
In the Linux kernel, the following vulnerability has been resolved:

crypto: safexcel - Add error handling for dma_map_sg() calls

Macro dma_map_sg() may return 0 on error. This patch enables checks in case of the macro failure and ensures unmapping of previously mapped buffers with dma_unmap_sg().

Found by Linux Verification Center (linuxtesting.org) with static analysis tool SVACE.

CVE-2023-52685:
In the Linux kernel, the following vulnerability has been resolved:

pstore: ram_core: fix possible overflow in persistent_ram_init_ecc()

In persistent_ram_init_ecc(), on 64-bit arches DIV_ROUND_UP() will return 64-bit value since persistent_ram_zone::buffer_size has type size_t which is derived from the 64-bit *unsigned long*, while the ecc_blocks variable this value gets assigned to has (always 32-bit) *int* type. Even if that value fits into *int* type, an overflow is still possible when calculating the size_t typed ecc_total variable further below since there's no cast to any 64-bit type before multiplication. Declaring the ecc_blocks variable as *size_t* should fix this mess...

Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool.

CVE-2023-52683:
In the Linux kernel, the following vulnerability has been resolved:

ACPI: LPIT: Avoid u32 multiplication overflow

In lpit_update_residency() there is a possibility of overflow in multiplication, if tsc_khz is large enough (> UINT_MAX/1000).

Change multiplication to mul_u32_u32().

Found by Linux Verification Center (linuxtesting.org) with SVACE.

CVE-2023-52681:
In the Linux kernel, the following vulnerability has been resolved:

efivarfs: Free s_fs_info on unmount

Now that we allocate a s_fs_info struct on fs context creation, we should ensure that we free it again when the superblock goes away.

CVE-2023-52678:
In the Linux kernel, the following vulnerability has been resolved:

drm/amdkfd: Confirm list is non-empty before utilizing list_first_entry in kfd_topology.c

Before using list_first_entry, make sure to check that list is not empty, if list is empty return -ENODATA.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c:1347 kfd_create_indirect_link_prop() warn: can 'gpu_link' even be NULL? drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c:1428 kfd_add_peer_prop() warn: can 'iolink1' even be NULL? drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c:1433 kfd_add_peer_prop() warn: can 'iolink2' even be NULL?

CVE-2023-52672:
In the Linux kernel, the following vulnerability has been resolved:

pipe: wakeup wr_wait after setting max_usage

Commit c73be61cede5 (pipe: Add general notification queue support) a regression was introduced that would lock up resized pipes under certain conditions. See the reproducer in [1].

The commit resizing the pipe ring size was moved to a different function, doing that moved the wakeup for pipe->wr_wait before actually raising pipe->max_usage. If a pipe was full before the resize occured it would result in the wakeup never actually triggering pipe_write.

Set @max_usage and @nr_accounted before waking writers if this isn't a watch queue.

[Christian Brauner <[email protected]>: rewrite to account for watch queues]

CVE-2023-52671:
In the Linux kernel, the following vulnerability has been resolved:

drm/amd/display: Fix hang/underflow when transitioning to ODM4:1

[Why] Under some circumstances, disabling an OPTC and attempting to reclaim its OPP(s) for a different OPTC could cause a hang/underflow due to OPPs not being properly disconnected from the disabled OPTC.

[How] Ensure that all OPPs are unassigned from an OPTC when it gets disabled.

CVE-2023-52670:
In the Linux kernel, the following vulnerability has been resolved:

rpmsg: virtio: Free driver_override when rpmsg_remove()

Free driver_override when rpmsg_remove(), otherwise the following memory leak will occur:

unreferenced object 0xffff0000d55d7080 (size 128):
comm kworker/u8:2, pid 56, jiffies 4294893188 (age 214.272s) hex dump (first 32 bytes):
72 70 6d 73 67 5f 6e 73 00 00 00 00 00 00 00 00 rpmsg_ns........
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<000000009c94c9c1>] __kmem_cache_alloc_node+0x1f8/0x320 [<000000002300d89b>] __kmalloc_node_track_caller+0x44/0x70 [<00000000228a60c3>] kstrndup+0x4c/0x90 [<0000000077158695>] driver_set_override+0xd0/0x164 [<000000003e9c4ea5>] rpmsg_register_device_override+0x98/0x170 [<000000001c0c89a8>] rpmsg_ns_register_device+0x24/0x30 [<000000008bbf8fa2>] rpmsg_probe+0x2e0/0x3ec [<00000000e65a68df>] virtio_dev_probe+0x1c0/0x280 [<00000000443331cc>] really_probe+0xbc/0x2dc [<00000000391064b1>] __driver_probe_device+0x78/0xe0 [<00000000a41c9a5b>] driver_probe_device+0xd8/0x160 [<000000009c3bd5df>] __device_attach_driver+0xb8/0x140 [<0000000043cd7614>] bus_for_each_drv+0x7c/0xd4 [<000000003b929a36>] __device_attach+0x9c/0x19c [<00000000a94e0ba8>] device_initial_probe+0x14/0x20 [<000000003c999637>] bus_probe_device+0xa0/0xac

CVE-2023-52666:
In the Linux kernel, the following vulnerability has been resolved:

ksmbd: fix potential circular locking issue in smb2_set_ea()

smb2_set_ea() can be called in parent inode lock range.
So add get_write argument to smb2_set_ea() not to call nested mnt_want_write().

CVE-2023-52662:
In the Linux kernel, the following vulnerability has been resolved:

drm/vmwgfx: fix a memleak in vmw_gmrid_man_get_node

When ida_alloc_max fails, resources allocated before should be freed, including *res allocated by kmalloc and ttm_resource_init.

CVE-2023-52661:
In the Linux kernel, the following vulnerability has been resolved:

drm/tegra: rgb: Fix missing clk_put() in the error handling paths of tegra_dc_rgb_probe()

If clk_get_sys(..., pll_d2_out0) fails, the clk_get_sys() call must be undone.

Add the missing clk_put and a new 'put_pll_d_out0' label in the error handling path, and use it.

CVE-2023-52660:
In the Linux kernel, the following vulnerability has been resolved:

media: rkisp1: Fix IRQ handling due to shared interrupts

The driver requests the interrupts as IRQF_SHARED, so the interrupt handlers can be called at any time. If such a call happens while the ISP is powered down, the SoC will hang as the driver tries to access the ISP registers.

This can be reproduced even without the platform sharing the IRQ line:
Enable CONFIG_DEBUG_SHIRQ and unload the driver, and the board will hang.

Fix this by adding a new field, 'irqs_enabled', which is used to bail out from the interrupt handler when the ISP is not operational.

CVE-2023-52659:
In the Linux kernel, the following vulnerability has been resolved:

x86/mm: Ensure input to pfn_to_kaddr() is treated as a 64-bit type

On 64-bit platforms, the pfn_to_kaddr() macro requires that the input value is 64 bits in order to ensure that valid address bits don't get lost when shifting that input by PAGE_SHIFT to calculate the physical address to provide a virtual address for.

One such example is in pvalidate_pages() (used by SEV-SNP guests), where the GFN in the struct used for page-state change requests is a 40-bit bit-field, so attempts to pass this GFN field directly into pfn_to_kaddr() ends up causing guest crashes when dealing with addresses above the 1TB range due to the above.

Fix this issue with SEV-SNP guests, as well as any similar cases that might cause issues in current/future code, by using an inline function, instead of a macro, so that the input is implicitly cast to the expected 64-bit input type prior to performing the shift operation.

While it might be argued that the issue is on the caller side, other archs/macros have taken similar approaches to deal with instances like this, such as ARM explicitly casting the input to phys_addr_t:

e48866647b48 (ARM: 8396/1: use phys_addr_t in pfn_to_kaddr())

A C inline function is even better though.

[ mingo: Refined the changelog some more & added __always_inline. ]

CVE-2023-52657:
In the Linux kernel, the following vulnerability has been resolved:

Revert drm/amd/pm: resolve reboot exception for si oland

This reverts commit e490d60a2f76bff636c68ce4fe34c1b6c34bbd86.

This causes hangs on SI when DC is enabled and errors on driver reboot and power off cycles.

CVE-2023-52652:
In the Linux kernel, the following vulnerability has been resolved:

NTB: fix possible name leak in ntb_register_device()

If device_register() fails in ntb_register_device(), the device name allocated by dev_set_name() should be freed. As per the comment in device_register(), callers should use put_device() to give up the reference in the error path. So fix this by calling put_device() in the error path so that the name can be freed in kobject_cleanup().

As a result of this, put_device() in the error path of ntb_register_device() is removed and the actual error is returned.

[mani: reworded commit message]

CVE-2023-52649:
In the Linux kernel, the following vulnerability has been resolved:

drm/vkms: Avoid reading beyond LUT array

When the floor LUT index (drm_fixp2int(lut_index) is the last index of the array the ceil LUT index will point to an entry beyond the array. Make sure we guard against it and use the value of the floor LUT index.

v3:
- Drop bits from commit description that didn't contribute anything of value

CVE-2023-52648:
In the Linux kernel, the following vulnerability has been resolved:

drm/vmwgfx: Unmap the surface before resetting it on a plane state

Switch to a new plane state requires unreferencing of all held surfaces.
In the work required for mob cursors the mapped surfaces started being cached but the variable indicating whether the surface is currently mapped was not being reset. This leads to crashes as the duplicated state, incorrectly, indicates the that surface is mapped even when no surface is present. That's because after unreferencing the surface it's perfectly possible for the plane to be backed by a bo instead of a surface.

Reset the surface mapped flag when unreferencing the plane state surface to fix null derefs in cleanup. Fixes crashes in KDE KWin 6.0 on Wayland:

Oops: 0000 [#1] PREEMPT SMP PTI CPU: 4 PID: 2533 Comm: kwin_wayland Not tainted 6.7.0-rc3-vmwgfx #2 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 RIP: 0010:vmw_du_cursor_plane_cleanup_fb+0x124/0x140 [vmwgfx] Code: 00 00 00 75 3a 48 83 c4 10 5b 5d c3 cc cc cc cc 48 8b b3 a8 00 00 00 48 c7 c7 99 90 43 c0 e8 93 c5 db ca 48 8b 83 a8 00 00 00 <48> 8b 78 28 e8 e3 f> RSP: 0018:ffffb6b98216fa80 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff969d84cdcb00 RCX: 0000000000000027 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff969e75f21600 RBP: ffff969d4143dc50 R08: 0000000000000000 R09: ffffb6b98216f920 R10: 0000000000000003 R11: ffff969e7feb3b10 R12: 0000000000000000 R13: 0000000000000000 R14: 000000000000027b R15: ffff969d49c9fc00 FS: 00007f1e8f1b4180(0000) GS:ffff969e75f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000028 CR3: 0000000104006004 CR4: 00000000003706f0 Call Trace:
<TASK> ? __die+0x23/0x70 ? page_fault_oops+0x171/0x4e0 ? exc_page_fault+0x7f/0x180 ? asm_exc_page_fault+0x26/0x30 ? vmw_du_cursor_plane_cleanup_fb+0x124/0x140 [vmwgfx] drm_atomic_helper_cleanup_planes+0x9b/0xc0 commit_tail+0xd1/0x130 drm_atomic_helper_commit+0x11a/0x140 drm_atomic_commit+0x97/0xd0 ? __pfx___drm_printfn_info+0x10/0x10 drm_atomic_helper_update_plane+0xf5/0x160 drm_mode_cursor_universal+0x10e/0x270 drm_mode_cursor_common+0x102/0x230 ? __pfx_drm_mode_cursor2_ioctl+0x10/0x10 drm_ioctl_kernel+0xb2/0x110 drm_ioctl+0x26d/0x4b0 ? __pfx_drm_mode_cursor2_ioctl+0x10/0x10 ? __pfx_drm_ioctl+0x10/0x10 vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
__x64_sys_ioctl+0x94/0xd0 do_syscall_64+0x61/0xe0 ? __x64_sys_ioctl+0xaf/0xd0 ? syscall_exit_to_user_mode+0x2b/0x40 ? do_syscall_64+0x70/0xe0 ? __x64_sys_ioctl+0xaf/0xd0 ? syscall_exit_to_user_mode+0x2b/0x40 ? do_syscall_64+0x70/0xe0 ? exc_page_fault+0x7f/0x180 entry_SYSCALL_64_after_hwframe+0x6e/0x76 RIP: 0033:0x7f1e93f279ed Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff f> RSP: 002b:00007ffca0faf600 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 000055db876ed2c0 RCX: 00007f1e93f279ed RDX: 00007ffca0faf6c0 RSI: 00000000c02464bb RDI: 0000000000000015 RBP: 00007ffca0faf650 R08: 000055db87184010 R09: 0000000000000007 R10: 000055db886471a0 R11: 0000000000000246 R12: 00007ffca0faf6c0 R13: 00000000c02464bb R14: 0000000000000015 R15: 00007ffca0faf790 </TASK> Modules linked in: snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_ine> CR2: 0000000000000028
---[ end trace 0000000000000000 ]--- RIP: 0010:vmw_du_cursor_plane_cleanup_fb+0x124/0x140 [vmwgfx] Code: 00 00 00 75 3a 48 83 c4 10 5b 5d c3 cc cc cc cc 48 8b b3 a8 00 00 00 48 c7 c7 99 90 43 c0 e8 93 c5 db ca 48 8b 83 a8 00 00 00 <48> 8b 78 28 e8 e3 f> RSP: 0018:ffffb6b98216fa80 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff969d84cdcb00 RCX: 0000000000000027 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff969e75f21600 RBP: ffff969d4143
---truncated---

CVE-2023-52642:
In the Linux kernel, the following vulnerability has been resolved:

media: rc: bpf attach/detach requires write permission

Note that bpf attach/detach also requires CAP_NET_ADMIN.

CVE-2023-52638:
In the Linux kernel, the following vulnerability has been resolved:

can: j1939: prevent deadlock by changing j1939_socks_lock to rwlock

The following 3 locks would race against each other, causing the deadlock situation in the Syzbot bug report:

- j1939_socks_lock
- active_session_list_lock
- sk_session_queue_lock

A reasonable fix is to change j1939_socks_lock to an rwlock, since in the rare situations where a write lock is required for the linked list that j1939_socks_lock is protecting, the code does not attempt to acquire any more locks. This would break the circular lock dependency, where, for example, the current thread already locks j1939_socks_lock and attempts to acquire sk_session_queue_lock, and at the same time, another thread attempts to acquire j1939_socks_lock while holding sk_session_queue_lock.

NOTE: This patch along does not fix the unregister_netdevice bug reported by Syzbot; instead, it solves a deadlock situation to prepare for one or more further patches to actually fix the Syzbot bug, which appears to be a reference counting problem within the j1939 codebase.

[mkl: remove unrelated newline change]

CVE-2023-52616:
In the Linux kernel, the following vulnerability has been resolved:

crypto: lib/mpi - Fix unexpected pointer access in mpi_ec_init

When the mpi_ec_ctx structure is initialized, some fields are not cleared, causing a crash when referencing the field when the structure was released. Initially, this issue was ignored because memory for mpi_ec_ctx is allocated with the __GFP_ZERO flag.
For example, this error will be triggered when calculating the Za value for SM2 separately.

CVE-2023-52585:
In the Linux kernel, the following vulnerability has been resolved:

drm/amdgpu: Fix possible NULL dereference in amdgpu_ras_query_error_status_helper()

Return invalid error code -EINVAL for invalid block id.

Fixes the below:

drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1183 amdgpu_ras_query_error_status_helper() error: we previously assumed 'info' could be null (see line 1176)

CVE-2023-52530:
In the Linux kernel, the following vulnerability has been resolved:

wifi: mac80211: fix potential key use-after-free

When ieee80211_key_link() is called by ieee80211_gtk_rekey_add() but returns 0 due to KRACK protection (identical key reinstall), ieee80211_gtk_rekey_add() will still return a pointer into the key, in a potential use-after-free. This normally doesn't happen since it's only called by iwlwifi in case of WoWLAN rekey offload which has its own KRACK protection, but still better to fix, do that by returning an error code and converting that to success on the cfg80211 boundary only, leaving the error for bad callers of ieee80211_gtk_rekey_add().

CVE-2023-52514:
In the Linux kernel, the following vulnerability has been resolved:

x86/reboot: VMCLEAR active VMCSes before emergency reboot

VMCLEAR active VMCSes before any emergency reboot, not just if the kernel may kexec into a new kernel after a crash. Per Intel's SDM, the VMX architecture doesn't require the CPU to flush the VMCS cache on INIT. If an emergency reboot doesn't RESET CPUs, cached VMCSes could theoretically be kept and only be written back to memory after the new kernel is booted, i.e. could effectively corrupt memory after reboot.

Opportunistically remove the setting of the global pointer to NULL to make checkpatch happy.

CVE-2023-52463:
In the Linux kernel, the following vulnerability has been resolved:

efivarfs: force RO when remounting if SetVariable is not supported

If SetVariable at runtime is not supported by the firmware we never assign a callback for that function. At the same time mount the efivarfs as RO so no one can call that. However, we never check the permission flags when someone remounts the filesystem as RW. As a result this leads to a crash looking like this:

$ mount -o remount,rw /sys/firmware/efi/efivars $ efi-updatevar -f PK.auth PK

[ 303.279166] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 303.280482] Mem abort info:
[ 303.280854] ESR = 0x0000000086000004 [ 303.281338] EC = 0x21: IABT (current EL), IL = 32 bits [ 303.282016] SET = 0, FnV = 0 [ 303.282414] EA = 0, S1PTW = 0 [ 303.282821] FSC = 0x04: level 0 translation fault [ 303.283771] user pgtable: 4k pages, 48-bit VAs, pgdp=000000004258c000 [ 303.284913] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 [ 303.286076] Internal error: Oops: 0000000086000004 [#1] PREEMPT SMP [ 303.286936] Modules linked in: qrtr tpm_tis tpm_tis_core crct10dif_ce arm_smccc_trng rng_core drm fuse ip_tables x_tables ipv6 [ 303.288586] CPU: 1 PID: 755 Comm: efi-updatevar Not tainted 6.3.0-rc1-00108-gc7d0c4695c68 #1 [ 303.289748] Hardware name: Unknown Unknown Product/Unknown Product, BIOS 2023.04-00627-g88336918701d 04/01/2023 [ 303.291150] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 303.292123] pc : 0x0 [ 303.292443] lr : efivar_set_variable_locked+0x74/0xec [ 303.293156] sp : ffff800008673c10 [ 303.293619] x29: ffff800008673c10 x28: ffff0000037e8000 x27: 0000000000000000 [ 303.294592] x26: 0000000000000800 x25: ffff000002467400 x24: 0000000000000027 [ 303.295572] x23: ffffd49ea9832000 x22: ffff0000020c9800 x21: ffff000002467000 [ 303.296566] x20: 0000000000000001 x19: 00000000000007fc x18: 0000000000000000 [ 303.297531] x17: 0000000000000000 x16: 0000000000000000 x15: 0000aaaac807ab54 [ 303.298495] x14: ed37489f673633c0 x13: 71c45c606de13f80 x12: 47464259e219acf4 [ 303.299453] x11: ffff000002af7b01 x10: 0000000000000003 x9 : 0000000000000002 [ 303.300431] x8 : 0000000000000010 x7 : ffffd49ea8973230 x6 : 0000000000a85201 [ 303.301412] x5 : 0000000000000000 x4 : ffff0000020c9800 x3 : 00000000000007fc [ 303.302370] x2 : 0000000000000027 x1 : ffff000002467400 x0 : ffff000002467000 [ 303.303341] Call trace:
[ 303.303679] 0x0 [ 303.303938] efivar_entry_set_get_size+0x98/0x16c [ 303.304585] efivarfs_file_write+0xd0/0x1a4 [ 303.305148] vfs_write+0xc4/0x2e4 [ 303.305601] ksys_write+0x70/0x104 [ 303.306073] __arm64_sys_write+0x1c/0x28 [ 303.306622] invoke_syscall+0x48/0x114 [ 303.307156] el0_svc_common.constprop.0+0x44/0xec [ 303.307803] do_el0_svc+0x38/0x98 [ 303.308268] el0_svc+0x2c/0x84 [ 303.308702] el0t_64_sync_handler+0xf4/0x120 [ 303.309293] el0t_64_sync+0x190/0x194 [ 303.309794] Code: ???????? ???????? ???????? ???????? (????????) [ 303.310612] ---[ end trace 0000000000000000 ]---

Fix this by adding a .reconfigure() function to the fs operations which we can use to check the requested flags and deny anything that's not RO if the firmware doesn't implement SetVariable at runtime.

CVE-2022-48865:
In the Linux kernel, the following vulnerability has been resolved:

tipc: fix kernel panic when enabling bearer

When enabling a bearer on a node, a kernel panic is observed:

[ 4.498085] RIP: 0010:tipc_mon_prep+0x4e/0x130 [tipc] ...
[ 4.520030] Call Trace:
[ 4.520689] <IRQ> [ 4.521236] tipc_link_build_proto_msg+0x375/0x750 [tipc] [ 4.522654] tipc_link_build_state_msg+0x48/0xc0 [tipc] [ 4.524034] __tipc_node_link_up+0xd7/0x290 [tipc] [ 4.525292] tipc_rcv+0x5da/0x730 [tipc] [ 4.526346] ? __netif_receive_skb_core+0xb7/0xfc0 [ 4.527601] tipc_l2_rcv_msg+0x5e/0x90 [tipc] [ 4.528737] __netif_receive_skb_list_core+0x20b/0x260 [ 4.530068] netif_receive_skb_list_internal+0x1bf/0x2e0 [ 4.531450] ? dev_gro_receive+0x4c2/0x680 [ 4.532512] napi_complete_done+0x6f/0x180 [ 4.533570] virtnet_poll+0x29c/0x42e [virtio_net] ...

The node in question is receiving activate messages in another thread after changing bearer status to allow message sending/ receiving in current thread:

thread 1 | thread 2
-------- | -------- | tipc_enable_bearer() | test_and_set_bit_lock() | tipc_bearer_xmit_skb() | | tipc_l2_rcv_msg() | tipc_rcv() | __tipc_node_link_up() | tipc_link_build_state_msg() | tipc_link_build_proto_msg() | tipc_mon_prep() | { | ...
| // null-pointer dereference | u16 gen = mon->dom_gen;
| ...
| } // Not being executed yet | tipc_mon_create() | { | ... | // allocate | mon = kzalloc(); | ... | } |

Monitoring pointer in thread 2 is dereferenced before monitoring data is allocated in thread 1. This causes kernel panic.

This commit fixes it by allocating the monitoring data before enabling the bearer to receive messages.

CVE-2022-48860:
In the Linux kernel, the following vulnerability has been resolved:

ethernet: Fix error handling in xemaclite_of_probe

This node pointer is returned by of_parse_phandle() with refcount incremented in this function. Calling of_node_put() to avoid the refcount leak. As the remove function do.

CVE-2022-48856:
In the Linux kernel, the following vulnerability has been resolved:

gianfar: ethtool: Fix refcount leak in gfar_get_ts_info

The of_find_compatible_node() function returns a node pointer with refcount incremented, We should use of_node_put() on it when done Add the missing of_node_put() to release the refcount.

CVE-2022-48855:
In the Linux kernel, the following vulnerability has been resolved:

sctp: fix kernel-infoleak for SCTP sockets

syzbot reported a kernel infoleak [1] of 4 bytes.

After analysis, it turned out r->idiag_expires is not initialized if inet_sctp_diag_fill() calls inet_diag_msg_common_fill()

Make sure to clear idiag_timer/idiag_retrans/idiag_expires and let inet_diag_msg_sctpasoc_fill() fill them again if needed.

[1]

BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:121 [inline] BUG: KMSAN: kernel-infoleak in copyout lib/iov_iter.c:154 [inline] BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x6ef/0x25a0 lib/iov_iter.c:668 instrument_copy_to_user include/linux/instrumented.h:121 [inline] copyout lib/iov_iter.c:154 [inline]
_copy_to_iter+0x6ef/0x25a0 lib/iov_iter.c:668 copy_to_iter include/linux/uio.h:162 [inline] simple_copy_to_iter+0xf3/0x140 net/core/datagram.c:519
__skb_datagram_iter+0x2d5/0x11b0 net/core/datagram.c:425 skb_copy_datagram_iter+0xdc/0x270 net/core/datagram.c:533 skb_copy_datagram_msg include/linux/skbuff.h:3696 [inline] netlink_recvmsg+0x669/0x1c80 net/netlink/af_netlink.c:1977 sock_recvmsg_nosec net/socket.c:948 [inline] sock_recvmsg net/socket.c:966 [inline]
__sys_recvfrom+0x795/0xa10 net/socket.c:2097
__do_sys_recvfrom net/socket.c:2115 [inline]
__se_sys_recvfrom net/socket.c:2111 [inline]
__x64_sys_recvfrom+0x19d/0x210 net/socket.c:2111 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x44/0xae

Uninit was created at:
slab_post_alloc_hook mm/slab.h:737 [inline] slab_alloc_node mm/slub.c:3247 [inline]
__kmalloc_node_track_caller+0xe0c/0x1510 mm/slub.c:4975 kmalloc_reserve net/core/skbuff.c:354 [inline]
__alloc_skb+0x545/0xf90 net/core/skbuff.c:426 alloc_skb include/linux/skbuff.h:1158 [inline] netlink_dump+0x3e5/0x16c0 net/netlink/af_netlink.c:2248
__netlink_dump_start+0xcf8/0xe90 net/netlink/af_netlink.c:2373 netlink_dump_start include/linux/netlink.h:254 [inline] inet_diag_handler_cmd+0x2e7/0x400 net/ipv4/inet_diag.c:1341 sock_diag_rcv_msg+0x24a/0x620 netlink_rcv_skb+0x40c/0x7e0 net/netlink/af_netlink.c:2494 sock_diag_rcv+0x63/0x80 net/core/sock_diag.c:277 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x1093/0x1360 net/netlink/af_netlink.c:1343 netlink_sendmsg+0x14d9/0x1720 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:705 [inline] sock_sendmsg net/socket.c:725 [inline] sock_write_iter+0x594/0x690 net/socket.c:1061 do_iter_readv_writev+0xa7f/0xc70 do_iter_write+0x52c/0x1500 fs/read_write.c:851 vfs_writev fs/read_write.c:924 [inline] do_writev+0x645/0xe00 fs/read_write.c:967
__do_sys_writev fs/read_write.c:1040 [inline]
__se_sys_writev fs/read_write.c:1037 [inline]
__x64_sys_writev+0xe5/0x120 fs/read_write.c:1037 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x44/0xae

Bytes 68-71 of 2508 are uninitialized Memory access of size 2508 starts at ffff888114f9b000 Data copied to user address 00007f7fe09ff2e0

CPU: 1 PID: 3478 Comm: syz-executor306 Not tainted 5.17.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

CVE-2022-48765:
In the Linux kernel, the following vulnerability has been resolved:

KVM: LAPIC: Also cancel preemption timer during SET_LAPIC

The below warning is splatting during guest reboot.

------------[ cut here ]------------ WARNING: CPU: 0 PID: 1931 at arch/x86/kvm/x86.c:10322 kvm_arch_vcpu_ioctl_run+0x874/0x880 [kvm] CPU: 0 PID: 1931 Comm: qemu-system-x86 Tainted: G I 5.17.0-rc1+ #5 RIP: 0010:kvm_arch_vcpu_ioctl_run+0x874/0x880 [kvm] Call Trace:
<TASK> kvm_vcpu_ioctl+0x279/0x710 [kvm]
__x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fd39797350b

This can be triggered by not exposing tsc-deadline mode and doing a reboot in the guest. The lapic_shutdown() function which is called in sys_reboot path will not disarm the flying timer, it just masks LVTT. lapic_shutdown() clears APIC state w/ LVT_MASKED and timer-mode bit is 0, this can trigger timer-mode switch between tsc-deadline and oneshot/periodic, which can result in preemption timer be cancelled in apic_update_lvtt(). However, We can't depend on this when not exposing tsc-deadline mode and oneshot/periodic modes emulated by preemption timer. Qemu will synchronise states around reset, let's cancel preemption timer under KVM_SET_LAPIC.

CVE-2022-48763:
In the Linux kernel, the following vulnerability has been resolved:

KVM: x86: Forcibly leave nested virt when SMM state is toggled

Forcibly leave nested virtualization operation if userspace toggles SMM state via KVM_SET_VCPU_EVENTS or KVM_SYNC_X86_EVENTS. If userspace forces the vCPU out of SMM while it's post-VMXON and then injects an SMI, vmx_enter_smm() will overwrite vmx->nested.smm.vmxon and end up with both vmxon=false and smm.vmxon=false, but all other nVMX state allocated.

Don't attempt to gracefully handle the transition as (a) most transitions are nonsencial, e.g. forcing SMM while L2 is running, (b) there isn't sufficient information to handle all transitions, e.g. SVM wants access to the SMRAM save state, and (c) KVM_SET_VCPU_EVENTS must precede KVM_SET_NESTED_STATE during state restore as the latter disallows putting the vCPU into L2 if SMM is active, and disallows tagging the vCPU as being post-VMXON in SMM if SMM is not active.

Abuse of KVM_SET_VCPU_EVENTS manifests as a WARN and memory leak in nVMX due to failure to free vmcs01's shadow VMCS, but the bug goes far beyond just a memory leak, e.g. toggling SMM on while L2 is active puts the vCPU in an architecturally impossible state.

WARNING: CPU: 0 PID: 3606 at free_loaded_vmcs arch/x86/kvm/vmx/vmx.c:2665 [inline] WARNING: CPU: 0 PID: 3606 at free_loaded_vmcs+0x158/0x1a0 arch/x86/kvm/vmx/vmx.c:2656 Modules linked in:
CPU: 1 PID: 3606 Comm: syz-executor725 Not tainted 5.17.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:free_loaded_vmcs arch/x86/kvm/vmx/vmx.c:2665 [inline] RIP: 0010:free_loaded_vmcs+0x158/0x1a0 arch/x86/kvm/vmx/vmx.c:2656 Code: <0f> 0b eb b3 e8 8f 4d 9f 00 e9 f7 fe ff ff 48 89 df e8 92 4d 9f 00 Call Trace:
<TASK> kvm_arch_vcpu_destroy+0x72/0x2f0 arch/x86/kvm/x86.c:11123 kvm_vcpu_destroy arch/x86/kvm/../../../virt/kvm/kvm_main.c:441 [inline] kvm_destroy_vcpus+0x11f/0x290 arch/x86/kvm/../../../virt/kvm/kvm_main.c:460 kvm_free_vcpus arch/x86/kvm/x86.c:11564 [inline] kvm_arch_destroy_vm+0x2e8/0x470 arch/x86/kvm/x86.c:11676 kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:1217 [inline] kvm_put_kvm+0x4fa/0xb00 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1250 kvm_vm_release+0x3f/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1273
__fput+0x286/0x9f0 fs/file_table.c:311 task_work_run+0xdd/0x1a0 kernel/task_work.c:164 exit_task_work include/linux/task_work.h:32 [inline] do_exit+0xb29/0x2a30 kernel/exit.c:806 do_group_exit+0xd2/0x2f0 kernel/exit.c:935 get_signal+0x4b0/0x28c0 kernel/signal.c:2862 arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:868 handle_signal_work kernel/entry/common.c:148 [inline] exit_to_user_mode_loop kernel/entry/common.c:172 [inline] exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:207
__syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline] syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300 do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x44/0xae </TASK>

CVE-2022-48757:
In the Linux kernel, the following vulnerability has been resolved:

net: fix information leakage in /proc/net/ptype

In one net namespace, after creating a packet socket without binding it to a device, users in other net namespaces can observe the new `packet_type` added by this packet socket by reading `/proc/net/ptype` file. This is minor information leakage as packet socket is namespace aware.

Add a net pointer in `packet_type` to keep the net namespace of of corresponding packet socket. In `ptype_seq_show`, this net pointer must be checked when it is not NULL.

CVE-2022-48754:
In the Linux kernel, the following vulnerability has been resolved:

phylib: fix potential use-after-free

Commit bafbdd527d56 (phylib: Add device reset GPIO support) added call to phy_device_reset(phydev) after the put_device() call in phy_detach().

The comment before the put_device() call says that the phydev might go away with put_device().

Fix potential use-after-free by calling phy_device_reset() before put_device().

CVE-2022-48751:
In the Linux kernel, the following vulnerability has been resolved:

net/smc: Transitional solution for clcsock race issue

We encountered a crash in smc_setsockopt() and it is caused by accessing smc->clcsock after clcsock was released.

BUG: kernel NULL pointer dereference, address: 0000000000000020 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 50309 Comm: nginx Kdump: loaded Tainted: G E 5.16.0-rc4+ #53 RIP: 0010:smc_setsockopt+0x59/0x280 [smc] Call Trace:
<TASK>
__sys_setsockopt+0xfc/0x190
__x64_sys_setsockopt+0x20/0x30 do_syscall_64+0x34/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f16ba83918e </TASK>

This patch tries to fix it by holding clcsock_release_lock and checking whether clcsock has already been released before access.

In case that a crash of the same reason happens in smc_getsockopt() or smc_switch_to_fallback(), this patch also checkes smc->clcsock in them too. And the caller of smc_switch_to_fallback() will identify whether fallback succeeds according to the return value.

CVE-2022-48731:
In the Linux kernel, the following vulnerability has been resolved:

mm/kmemleak: avoid scanning potential huge holes

When using devm_request_free_mem_region() and devm_memremap_pages() to add ZONE_DEVICE memory, if requested free mem region's end pfn were huge(e.g., 0x400000000), the node_end_pfn() will be also huge (see move_pfn_range_to_zone()). Thus it creates a huge hole between node_start_pfn() and node_end_pfn().

We found on some AMD APUs, amdkfd requested such a free mem region and created a huge hole. In such a case, following code snippet was just doing busy test_bit() looping on the huge hole.

for (pfn = start_pfn; pfn < end_pfn; pfn++) { struct page *page = pfn_to_online_page(pfn);
if (!page) continue;
...
}

So we got a soft lockup:

watchdog: BUG: soft lockup - CPU#6 stuck for 26s! [bash:1221] CPU: 6 PID: 1221 Comm: bash Not tainted 5.15.0-custom #1 RIP: 0010:pfn_to_online_page+0x5/0xd0 Call Trace:
? kmemleak_scan+0x16a/0x440 kmemleak_write+0x306/0x3a0 ? common_file_perm+0x72/0x170 full_proxy_write+0x5c/0x90 vfs_write+0xb9/0x260 ksys_write+0x67/0xe0
__x64_sys_write+0x1a/0x20 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae

I did some tests with the patch.

(1) amdgpu module unloaded

before the patch:

real 0m0.976s user 0m0.000s sys 0m0.968s

after the patch:

real 0m0.981s user 0m0.000s sys 0m0.973s

(2) amdgpu module loaded

before the patch:

real 0m35.365s user 0m0.000s sys 0m35.354s

after the patch:

real 0m1.049s user 0m0.000s sys 0m1.042s

CVE-2022-48712:
In the Linux kernel, the following vulnerability has been resolved:

ext4: fix error handling in ext4_fc_record_modified_inode()

Current code does not fully takes care of krealloc() error case, which could lead to silent memory corruption or a kernel bug. This patch fixes that.

Also it cleans up some duplicated error handling logic from various functions in fast_commit.c file.

CVE-2022-48702:
In the Linux kernel, the following vulnerability has been resolved:

ALSA: emu10k1: Fix out of bounds access in snd_emu10k1_pcm_channel_alloc()

The voice allocator sometimes begins allocating from near the end of the array and then wraps around, however snd_emu10k1_pcm_channel_alloc() accesses the newly allocated voices as if it never wrapped around.

This results in out of bounds access if the first voice has a high enough index so that first_voice + requested_voice_count > NUM_G (64).
The more voices are requested, the more likely it is for this to occur.

This was initially discovered using PipeWire, however it can be reproduced by calling aplay multiple times with 16 channels:
aplay -r 48000 -D plughw:CARD=Live,DEV=3 -c 16 /dev/zero

UBSAN: array-index-out-of-bounds in sound/pci/emu10k1/emupcm.c:127:40 index 65 is out of range for type 'snd_emu10k1_voice [64]' CPU: 1 PID: 31977 Comm: aplay Tainted: G W IOE 6.0.0-rc2-emu10k1+ #7 Hardware name: ASUSTEK COMPUTER INC P5W DH Deluxe/P5W DH Deluxe, BIOS 3002 07/22/2010 Call Trace:
<TASK> dump_stack_lvl+0x49/0x63 dump_stack+0x10/0x16 ubsan_epilogue+0x9/0x3f
__ubsan_handle_out_of_bounds.cold+0x44/0x49 snd_emu10k1_playback_hw_params+0x3bc/0x420 [snd_emu10k1] snd_pcm_hw_params+0x29f/0x600 [snd_pcm] snd_pcm_common_ioctl+0x188/0x1410 [snd_pcm] ? exit_to_user_mode_prepare+0x35/0x170 ? do_syscall_64+0x69/0x90 ? syscall_exit_to_user_mode+0x26/0x50 ? do_syscall_64+0x69/0x90 ? exit_to_user_mode_prepare+0x35/0x170 snd_pcm_ioctl+0x27/0x40 [snd_pcm]
__x64_sys_ioctl+0x95/0xd0 do_syscall_64+0x5c/0x90 ? do_syscall_64+0x69/0x90 ? do_syscall_64+0x69/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd

CVE-2022-48687:
In the Linux kernel, the following vulnerability has been resolved:

ipv6: sr: fix out-of-bounds read when setting HMAC data.

The SRv6 layer allows defining HMAC data that can later be used to sign IPv6 Segment Routing Headers. This configuration is realised via netlink through four attributes: SEG6_ATTR_HMACKEYID, SEG6_ATTR_SECRET, SEG6_ATTR_SECRETLEN and SEG6_ATTR_ALGID. Because the SECRETLEN attribute is decoupled from the actual length of the SECRET attribute, it is possible to provide invalid combinations (e.g., secret = , secretlen = 64). This case is not checked in the code and with an appropriately crafted netlink message, an out-of-bounds read of up to 64 bytes (max secret length) can occur past the skb end pointer and into skb_shared_info:

Breakpoint 1, seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208 208 memcpy(hinfo->secret, secret, slen);
(gdb) bt #0 seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208 #1 0xffffffff81e012e9 in genl_family_rcv_msg_doit (skb=skb@entry=0xffff88800b1f9f00, nlh=nlh@entry=0xffff88800b1b7600, extack=extack@entry=0xffffc90000ba7af0, ops=ops@entry=0xffffc90000ba7a80, hdrlen=4, net=0xffffffff84237580 <init_net>, family=<optimized out>, family=<optimized out>) at net/netlink/genetlink.c:731 #2 0xffffffff81e01435 in genl_family_rcv_msg (extack=0xffffc90000ba7af0, nlh=0xffff88800b1b7600, skb=0xffff88800b1f9f00, family=0xffffffff82fef6c0 <seg6_genl_family>) at net/netlink/genetlink.c:775 #3 genl_rcv_msg (skb=0xffff88800b1f9f00, nlh=0xffff88800b1b7600, extack=0xffffc90000ba7af0) at net/netlink/genetlink.c:792 #4 0xffffffff81dfffc3 in netlink_rcv_skb (skb=skb@entry=0xffff88800b1f9f00, cb=cb@entry=0xffffffff81e01350 <genl_rcv_msg>) at net/netlink/af_netlink.c:2501 #5 0xffffffff81e00919 in genl_rcv (skb=0xffff88800b1f9f00) at net/netlink/genetlink.c:803 #6 0xffffffff81dff6ae in netlink_unicast_kernel (ssk=0xffff888010eec800, skb=0xffff88800b1f9f00, sk=0xffff888004aed000) at net/netlink/af_netlink.c:1319 #7 netlink_unicast (ssk=ssk@entry=0xffff888010eec800, skb=skb@entry=0xffff88800b1f9f00, portid=portid@entry=0, nonblock=<optimized out>) at net/netlink/af_netlink.c:1345 #8 0xffffffff81dff9a4 in netlink_sendmsg (sock=<optimized out>, msg=0xffffc90000ba7e48, len=<optimized out>) at net/netlink/af_netlink.c:1921 ...
(gdb) p/x ((struct sk_buff *)0xffff88800b1f9f00)->head + ((struct sk_buff *)0xffff88800b1f9f00)->end $1 = 0xffff88800b1b76c0 (gdb) p/x secret $2 = 0xffff88800b1b76c0 (gdb) p slen $3 = 64 '@'

The OOB data can then be read back from userspace by dumping HMAC state. This commit fixes this by ensuring SECRETLEN cannot exceed the actual length of SECRET.

CVE-2022-48673:
In the Linux kernel, the following vulnerability has been resolved:

net/smc: Fix possible access to freed memory in link clear

After modifying the QP to the Error state, all RX WR would be completed with WC in IB_WC_WR_FLUSH_ERR status. Current implementation does not wait for it is done, but destroy the QP and free the link group directly.
So there is a risk that accessing the freed memory in tasklet context.

Here is a crash example:

BUG: unable to handle page fault for address: ffffffff8f220860 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD f7300e067 P4D f7300e067 PUD f7300f063 PMD 8c4e45063 PTE 800ffff08c9df060 Oops: 0002 [#1] SMP PTI CPU: 1 PID: 0 Comm: swapper/1 Kdump: loaded Tainted: G S OE 5.10.0-0607+ #23 Hardware name: Inspur NF5280M4/YZMB-00689-101, BIOS 4.1.20 07/09/2018 RIP: 0010:native_queued_spin_lock_slowpath+0x176/0x1b0 Code: f3 90 48 8b 32 48 85 f6 74 f6 eb d5 c1 ee 12 83 e0 03 83 ee 01 48 c1 e0 05 48 63 f6 48 05 00 c8 02 00 48 03 04 f5 00 09 98 8e <48> 89 10 8b 42 08 85 c0 75 09 f3 90 8b 42 08 85 c0 74 f7 48 8b 32 RSP: 0018:ffffb3b6c001ebd8 EFLAGS: 00010086 RAX: ffffffff8f220860 RBX: 0000000000000246 RCX: 0000000000080000 RDX: ffff91db1f86c800 RSI: 000000000000173c RDI: ffff91db62bace00 RBP: ffff91db62bacc00 R08: 0000000000000000 R09: c00000010000028b R10: 0000000000055198 R11: ffffb3b6c001ea58 R12: ffff91db80e05010 R13: 000000000000000a R14: 0000000000000006 R15: 0000000000000040 FS: 0000000000000000(0000) GS:ffff91db1f840000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffff8f220860 CR3: 00000001f9580004 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace:
<IRQ>
_raw_spin_lock_irqsave+0x30/0x40 mlx5_ib_poll_cq+0x4c/0xc50 [mlx5_ib] smc_wr_rx_tasklet_fn+0x56/0xa0 [smc] tasklet_action_common.isra.21+0x66/0x100
__do_softirq+0xd5/0x29c asm_call_irq_on_stack+0x12/0x20 </IRQ> do_softirq_own_stack+0x37/0x40 irq_exit_rcu+0x9d/0xa0 sysvec_call_function_single+0x34/0x80 asm_sysvec_call_function_single+0x12/0x20

CVE-2022-48657:
In the Linux kernel, the following vulnerability has been resolved:

arm64: topology: fix possible overflow in amu_fie_setup()

cpufreq_get_hw_max_freq() returns max frequency in kHz as *unsigned int*, while freq_inv_set_max_ratio() gets passed this frequency in Hz as 'u64'.
Multiplying max frequency by 1000 can potentially result in overflow -- multiplying by 1000ULL instead should avoid that...

Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool.

CVE-2022-48654:
In the Linux kernel, the following vulnerability has been resolved:

netfilter: nfnetlink_osf: fix possible bogus match in nf_osf_find()

nf_osf_find() incorrectly returns true on mismatch, this leads to copying uninitialized memory area in nft_osf which can be used to leak stale kernel stack data to userspace.

CVE-2022-48653:
In the Linux kernel, the following vulnerability has been resolved:

ice: Don't double unplug aux on peer initiated reset

In the IDC callback that is accessed when the aux drivers request a reset, the function to unplug the aux devices is called. This function is also called in the ice_prepare_for_reset function. This double call is causing a scheduling while atomic BUG.

[ 662.676430] ice 0000:4c:00.0 rocep76s0: cqp opcode = 0x1 maj_err_code = 0xffff min_err_code = 0x8003

[ 662.676609] ice 0000:4c:00.0 rocep76s0: [Modify QP Cmd Error][op_code=8] status=-29 waiting=1 completion_err=1 maj=0xffff min=0x8003

[ 662.815006] ice 0000:4c:00.0 rocep76s0: ICE OICR event notification: oicr = 0x10000003

[ 662.815014] ice 0000:4c:00.0 rocep76s0: critical PE Error, GLPE_CRITERR=0x00011424

[ 662.815017] ice 0000:4c:00.0 rocep76s0: Requesting a reset

[ 662.815475] BUG: scheduling while atomic: swapper/37/0/0x00010002

[ 662.815475] BUG: scheduling while atomic: swapper/37/0/0x00010002 [ 662.815477] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill 8021q garp mrp stp llc vfat fat rpcrdma intel_rapl_msr intel_rapl_common sunrpc i10nm_edac rdma_ucm nfit ib_srpt libnvdimm ib_isert iscsi_target_mod x86_pkg_temp_thermal intel_powerclamp coretemp target_core_mod snd_hda_intel ib_iser snd_intel_dspcfg libiscsi snd_intel_sdw_acpi scsi_transport_iscsi kvm_intel iTCO_wdt rdma_cm snd_hda_codec kvm iw_cm ipmi_ssif iTCO_vendor_support snd_hda_core irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hwdep snd_seq snd_seq_device rapl snd_pcm snd_timer isst_if_mbox_pci pcspkr isst_if_mmio irdma intel_uncore idxd acpi_ipmi joydev isst_if_common snd mei_me idxd_bus ipmi_si soundcore i2c_i801 mei ipmi_devintf i2c_smbus i2c_ismt ipmi_msghandler acpi_power_meter acpi_pad rv(OE) ib_uverbs ib_cm ib_core xfs libcrc32c ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm_ttm_helpe r ttm [ 662.815546] nvme nvme_core ice drm crc32c_intel i40e t10_pi wmi pinctrl_emmitsburg dm_mirror dm_region_hash dm_log dm_mod fuse [ 662.815557] Preemption disabled at:
[ 662.815558] [<0000000000000000>] 0x0 [ 662.815563] CPU: 37 PID: 0 Comm: swapper/37 Kdump: loaded Tainted: G S OE 5.17.1 #2 [ 662.815566] Hardware name: Intel Corporation D50DNP/D50DNP, BIOS SE5C6301.86B.6624.D18.2111021741 11/02/2021 [ 662.815568] Call Trace:
[ 662.815572] <IRQ> [ 662.815574] dump_stack_lvl+0x33/0x42 [ 662.815581] __schedule_bug.cold.147+0x7d/0x8a [ 662.815588] __schedule+0x798/0x990 [ 662.815595] schedule+0x44/0xc0 [ 662.815597] schedule_preempt_disabled+0x14/0x20 [ 662.815600] __mutex_lock.isra.11+0x46c/0x490 [ 662.815603] ? __ibdev_printk+0x76/0xc0 [ib_core] [ 662.815633] device_del+0x37/0x3d0 [ 662.815639] ice_unplug_aux_dev+0x1a/0x40 [ice] [ 662.815674] ice_schedule_reset+0x3c/0xd0 [ice] [ 662.815693] irdma_iidc_event_handler.cold.7+0xb6/0xd3 [irdma] [ 662.815712] ? bitmap_find_next_zero_area_off+0x45/0xa0 [ 662.815719] ice_send_event_to_aux+0x54/0x70 [ice] [ 662.815741] ice_misc_intr+0x21d/0x2d0 [ice] [ 662.815756] __handle_irq_event_percpu+0x4c/0x180 [ 662.815762] handle_irq_event_percpu+0xf/0x40 [ 662.815764] handle_irq_event+0x34/0x60 [ 662.815766] handle_edge_irq+0x9a/0x1c0 [ 662.815770] __common_interrupt+0x62/0x100 [ 662.815774] common_interrupt+0xb4/0xd0 [ 662.815779] </IRQ> [ 662.815780] <TASK> [ 662.815780] asm_common_interrupt+0x1e/0x40 [ 662.815785] RIP: 0010:cpuidle_enter_state+0xd6/0x380 [ 662.815789] Code: 49 89 c4 0f 1f 44 00 00 31 ff e8 65 d7 95 ff 45 84 ff 74 12 9c 58 f6 c4 02 0f 85 64 02 00 00 31 ff e8 ae c5 9c ff fb 45 85 f6 <0f> 88 12 01 00 00 49 63 d6 4c 2b 24 24 48 8d 04 52 48 8d 04 82 49 [ 662.815791] RSP: 0018:ff2c2c4f18edbe80 EFLAGS: 00000202 [ 662.815793] RAX: ff280805df140000 RBX: 0000000000000002 RCX: 000000000000001f [ 662.815795] RDX: 0000009a52da2d08 R
---truncated---

CVE-2022-48652:
In the Linux kernel, the following vulnerability has been resolved:

ice: Fix crash by keep old cfg when update TCs more than queues

There are problems if allocated queues less than Traffic Classes.

Commit a632b2a4c920 (ice: ethtool: Prohibit improper channel config for DCB) already disallow setting less queues than TCs.

Another case is if we first set less queues, and later update more TCs config due to LLDP, ice_vsi_cfg_tc() will failed but left dirty num_txq/rxq and tc_cfg in vsi, that will cause invalid pointer access.

[ 95.968089] ice 0000:3b:00.1: More TCs defined than queues/rings allocated.
[ 95.968092] ice 0000:3b:00.1: Trying to use more Rx queues (8), than were allocated (1)! [ 95.968093] ice 0000:3b:00.1: Failed to config TC for VSI index: 0 [ 95.969621] general protection fault: 0000 [#1] SMP NOPTI [ 95.969705] CPU: 1 PID: 58405 Comm: lldpad Kdump: loaded Tainted: G U W O --------- -t - 4.18.0 #1 [ 95.969867] Hardware name: O.E.M/BC11SPSCB10, BIOS 8.23 12/30/2021 [ 95.969992] RIP: 0010:devm_kmalloc+0xa/0x60 [ 95.970052] Code: 5c ff ff ff 31 c0 5b 5d 41 5c c3 b8 f4 ff ff ff eb f4 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 89 d1 <8b> 97 60 02 00 00 48 8d 7e 18 48 39 f7 72 3f 55 89 ce 53 48 8b 4c [ 95.970344] RSP: 0018:ffffc9003f553888 EFLAGS: 00010206 [ 95.970425] RAX: dead000000000200 RBX: ffffea003c425b00 RCX: 00000000006080c0 [ 95.970536] RDX: 00000000006080c0 RSI: 0000000000000200 RDI: dead000000000200 [ 95.970648] RBP: dead000000000200 R08: 00000000000463c0 R09: ffff888ffa900000 [ 95.970760] R10: 0000000000000000 R11: 0000000000000002 R12: ffff888ff6b40100 [ 95.970870] R13: ffff888ff6a55018 R14: 0000000000000000 R15: ffff888ff6a55460 [ 95.970981] FS: 00007f51b7d24700(0000) GS:ffff88903ee80000(0000) knlGS:0000000000000000 [ 95.971108] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 95.971197] CR2: 00007fac5410d710 CR3: 0000000f2c1de002 CR4: 00000000007606e0 [ 95.971309] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 95.971419] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 95.971530] PKRU: 55555554 [ 95.971573] Call Trace:
[ 95.971622] ice_setup_rx_ring+0x39/0x110 [ice] [ 95.971695] ice_vsi_setup_rx_rings+0x54/0x90 [ice] [ 95.971774] ice_vsi_open+0x25/0x120 [ice] [ 95.971843] ice_open_internal+0xb8/0x1f0 [ice] [ 95.971919] ice_ena_vsi+0x4f/0xd0 [ice] [ 95.971987] ice_dcb_ena_dis_vsi.constprop.5+0x29/0x90 [ice] [ 95.972082] ice_pf_dcb_cfg+0x29a/0x380 [ice] [ 95.972154] ice_dcbnl_setets+0x174/0x1b0 [ice] [ 95.972220] dcbnl_ieee_set+0x89/0x230 [ 95.972279] ? dcbnl_ieee_del+0x150/0x150 [ 95.972341] dcb_doit+0x124/0x1b0 [ 95.972392] rtnetlink_rcv_msg+0x243/0x2f0 [ 95.972457] ? dcb_doit+0x14d/0x1b0 [ 95.972510] ? __kmalloc_node_track_caller+0x1d3/0x280 [ 95.972591] ? rtnl_calcit.isra.31+0x100/0x100 [ 95.972661] netlink_rcv_skb+0xcf/0xf0 [ 95.972720] netlink_unicast+0x16d/0x220 [ 95.972781] netlink_sendmsg+0x2ba/0x3a0 [ 95.975891] sock_sendmsg+0x4c/0x50 [ 95.979032] ___sys_sendmsg+0x2e4/0x300 [ 95.982147] ? kmem_cache_alloc+0x13e/0x190 [ 95.985242] ? __wake_up_common_lock+0x79/0x90 [ 95.988338] ? __check_object_size+0xac/0x1b0 [ 95.991440] ? _copy_to_user+0x22/0x30 [ 95.994539] ? move_addr_to_user+0xbb/0xd0 [ 95.997619] ? __sys_sendmsg+0x53/0x80 [ 96.000664] __sys_sendmsg+0x53/0x80 [ 96.003747] do_syscall_64+0x5b/0x1d0 [ 96.006862] entry_SYSCALL_64_after_hwframe+0x65/0xca

Only update num_txq/rxq when passed check, and restore tc_cfg if setup queue map failed.

CVE-2022-48651:
In the Linux kernel, the following vulnerability has been resolved:

ipvlan: Fix out-of-bound bugs caused by unset skb->mac_header

If an AF_PACKET socket is used to send packets through ipvlan and the default xmit function of the AF_PACKET socket is changed from dev_queue_xmit() to packet_direct_xmit() via setsockopt() with the option name of PACKET_QDISC_BYPASS, the skb->mac_header may not be reset and remains as the initial value of 65535, this may trigger slab-out-of-bounds bugs as following:

================================================================= UG: KASAN: slab-out-of-bounds in ipvlan_xmit_mode_l2+0xdb/0x330 [ipvlan] PU: 2 PID: 1768 Comm: raw_send Kdump: loaded Not tainted 6.0.0-rc4+ #6 ardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 all Trace:
print_address_description.constprop.0+0x1d/0x160 print_report.cold+0x4f/0x112 kasan_report+0xa3/0x130 ipvlan_xmit_mode_l2+0xdb/0x330 [ipvlan] ipvlan_start_xmit+0x29/0xa0 [ipvlan]
__dev_direct_xmit+0x2e2/0x380 packet_direct_xmit+0x22/0x60 packet_snd+0x7c9/0xc40 sock_sendmsg+0x9a/0xa0
__sys_sendto+0x18a/0x230
__x64_sys_sendto+0x74/0x90 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The root cause is:
1. packet_snd() only reset skb->mac_header when sock->type is SOCK_RAW and skb->protocol is not specified as in packet_parse_headers()

2. packet_direct_xmit() doesn't reset skb->mac_header as dev_queue_xmit()

In this case, skb->mac_header is 65535 when ipvlan_xmit_mode_l2() is called. So when ipvlan_xmit_mode_l2() gets mac header with eth_hdr() which use skb->head + skb->mac_header, out-of-bound access occurs.

This patch replaces eth_hdr() with skb_eth_hdr() in ipvlan_xmit_mode_l2() and reset mac header in multicast to solve this out-of-bound bug.

CVE-2022-48648:
In the Linux kernel, the following vulnerability has been resolved:

sfc: fix null pointer dereference in efx_hard_start_xmit

Trying to get the channel from the tx_queue variable here is wrong because we can only be here if tx_queue is NULL, so we shouldn't dereference it. As the above comment in the code says, this is very unlikely to happen, but it's wrong anyway so let's fix it.

I hit this issue because of a different bug that caused tx_queue to be NULL. If that happens, this is the error message that we get here:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [...] RIP: 0010:efx_hard_start_xmit+0x153/0x170 [sfc]

CVE-2022-3903:
An incorrect read request flaw was found in the Infrared Transceiver USB driver in the Linux kernel. This issue occurs when a user attaches a malicious USB device. A local user could use this flaw to starve the resources, causing denial of service or potentially crashing the system.

CVE-2022-3577:
An out-of-bounds memory write flaw was found in the Linux kernel's Kid-friendly Wired Controller driver.
This flaw allows a local user to crash or potentially escalate their privileges on the system. It is in bigben_probe of drivers/hid/hid-bigbenff.c. The reason is incorrect assumption - bigben devices all have inputs. However, malicious devices can break this assumption, leaking to out-of-bound write.

CVE-2021-47599:
In the Linux kernel, the following vulnerability has been resolved:

btrfs: use latest_dev in btrfs_show_devname

The test case btrfs/238 reports the warning below:

WARNING: CPU: 3 PID: 481 at fs/btrfs/super.c:2509 btrfs_show_devname+0x104/0x1e8 [btrfs] CPU: 2 PID: 1 Comm: systemd Tainted: G W O 5.14.0-rc1-custom #72 Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 Call trace:
btrfs_show_devname+0x108/0x1b4 [btrfs] show_mountinfo+0x234/0x2c4 m_show+0x28/0x34 seq_read_iter+0x12c/0x3c4 vfs_read+0x29c/0x2c8 ksys_read+0x80/0xec
__arm64_sys_read+0x28/0x34 invoke_syscall+0x50/0xf8 do_el0_svc+0x88/0x138 el0_svc+0x2c/0x8c el0t_64_sync_handler+0x84/0xe4 el0t_64_sync+0x198/0x19c

Reason:
While btrfs_prepare_sprout() moves the fs_devices::devices into fs_devices::seed_list, the btrfs_show_devname() searches for the devices and found none, leading to the warning as in above.

Fix:
latest_dev is updated according to the changes to the device list.
That means we could use the latest_dev->name to show the device name in /proc/self/mounts, the pointer will be always valid as it's assigned before the device is deleted from the list in remove or replace.
The RCU protection is sufficient as the device structure is freed after synchronization.

CVE-2021-47585:
In the Linux kernel, the following vulnerability has been resolved:

btrfs: fix memory leak in __add_inode_ref()

Line 1169 (#3) allocates a memory chunk for victim_name by kmalloc(), but when the function returns in line 1184 (#4) victim_name allocated by line 1169 (#3) is not freed, which will lead to a memory leak.
There is a similar snippet of code in this function as allocating a memory chunk for victim_name in line 1104 (#1) as well as releasing the memory in line 1116 (#2).

We should kfree() victim_name when the return value of backref_in_log() is less than zero and before the function returns in line 1184 (#4).

1057 static inline int __add_inode_ref(struct btrfs_trans_handle *trans, 1058 struct btrfs_root *root, 1059 struct btrfs_path *path, 1060 struct btrfs_root *log_root, 1061 struct btrfs_inode *dir, 1062 struct btrfs_inode *inode, 1063 u64 inode_objectid, u64 parent_objectid, 1064 u64 ref_index, char *name, int namelen, 1065 int *search_done) 1066 {

1104 victim_name = kmalloc(victim_name_len, GFP_NOFS);
// #1: kmalloc (victim_name-1) 1105 if (!victim_name) 1106 return -ENOMEM;

1112 ret = backref_in_log(log_root, &search_key, 1113 parent_objectid, victim_name, 1114 victim_name_len);
1115 if (ret < 0) { 1116 kfree(victim_name); // #2: kfree (victim_name-1) 1117 return ret;
1118 } else if (!ret) {

1169 victim_name = kmalloc(victim_name_len, GFP_NOFS);
// #3: kmalloc (victim_name-2) 1170 if (!victim_name) 1171 return -ENOMEM;

1180 ret = backref_in_log(log_root, &search_key, 1181 parent_objectid, victim_name, 1182 victim_name_len);
1183 if (ret < 0) { 1184 return ret; // #4: missing kfree (victim_name-2) 1185 } else if (!ret) {

1241 return 0;
1242 }

CVE-2021-47579:
In the Linux kernel, the following vulnerability has been resolved:

ovl: fix warning in ovl_create_real()

Syzbot triggered the following warning in ovl_workdir_create() -> ovl_create_real():

if (!err && WARN_ON(!newdentry->d_inode)) {

The reason is that the cgroup2 filesystem returns from mkdir without instantiating the new dentry.

Weird filesystems such as this will be rejected by overlayfs at a later stage during setup, but to prevent such a warning, call ovl_mkdir_real() directly from ovl_workdir_create() and reject this case early.

CVE-2021-47577:
In the Linux kernel, the following vulnerability has been resolved:

io-wq: check for wq exit after adding new worker task_work

We check IO_WQ_BIT_EXIT before attempting to create a new worker, and wq exit cancels pending work if we have any. But it's possible to have a race between the two, where creation checks exit finding it not set, but we're in the process of exiting. The exit side will cancel pending creation task_work, but there's a gap where we add task_work after we've canceled existing creations at exit time.

Fix this by checking the EXIT bit post adding the creation task_work.
If it's set, run the same cancelation that exit does.

CVE-2021-47526:
In the Linux kernel, the following vulnerability has been resolved:

serial: liteuart: Fix NULL pointer dereference in ->remove()

drvdata has to be set in _probe() - otherwise platform_get_drvdata() causes null pointer dereference BUG in _remove().

CVE-2021-47522:
In the Linux kernel, the following vulnerability has been resolved:

HID: bigbenff: prevent null pointer dereference

When emulating the device through uhid, there is a chance we don't have output reports and so report_field is null.

CVE-2021-47502:
In the Linux kernel, the following vulnerability has been resolved:

ASoC: codecs: wcd934x: handle channel mappping list correctly

Currently each channel is added as list to dai channel list, however there is danger of adding same channel to multiple dai channel list which endups corrupting the other list where its already added.

This patch ensures that the channel is actually free before adding to the dai channel list and also ensures that the channel is on the list before deleting it.

This check was missing previously, and we did not hit this issue as we were testing very simple usecases with sequence of amixer commands.

CVE-2021-47492:
In the Linux kernel, the following vulnerability has been resolved:

mm, thp: bail out early in collapse_file for writeback page

Currently collapse_file does not explicitly check PG_writeback, instead, page_has_private and try_to_release_page are used to filter writeback pages. This does not work for xfs with blocksize equal to or larger than pagesize, because in such case xfs has no page->private.

This makes collapse_file bail out early for writeback page. Otherwise, xfs end_page_writeback will panic as follows.

page:fffffe00201bcc80 refcount:0 mapcount:0 mapping:ffff0003f88c86a8 index:0x0 pfn:0x84ef32 aops:xfs_address_space_operations [xfs] ino:30000b7 dentry name:libtest.so flags: 0x57fffe0000008027(locked|referenced|uptodate|active|writeback) raw: 57fffe0000008027 ffff80001b48bc28 ffff80001b48bc28 ffff0003f88c86a8 raw: 0000000000000000 0000000000000000 00000000ffffffff ffff0000c3e9a000 page dumped because: VM_BUG_ON_PAGE(((unsigned int) page_ref_count(page) + 127u <= 127u)) page->mem_cgroup:ffff0000c3e9a000
------------[ cut here ]------------ kernel BUG at include/linux/mm.h:1212! Internal error: Oops - BUG: 0 [#1] SMP Modules linked in:
BUG: Bad page state in process khugepaged pfn:84ef32 xfs(E) page:fffffe00201bcc80 refcount:0 mapcount:0 mapping:0 index:0x0 pfn:0x84ef32 libcrc32c(E) rfkill(E) aes_ce_blk(E) crypto_simd(E) ...
CPU: 25 PID: 0 Comm: swapper/25 Kdump: loaded Tainted: ...
pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--) Call trace:
end_page_writeback+0x1c0/0x214 iomap_finish_page_writeback+0x13c/0x204 iomap_finish_ioend+0xe8/0x19c iomap_writepage_end_bio+0x38/0x50 bio_endio+0x168/0x1ec blk_update_request+0x278/0x3f0 blk_mq_end_request+0x34/0x15c virtblk_request_done+0x38/0x74 [virtio_blk] blk_done_softirq+0xc4/0x110
__do_softirq+0x128/0x38c
__irq_exit_rcu+0x118/0x150 irq_exit+0x1c/0x30
__handle_domain_irq+0x8c/0xf0 gic_handle_irq+0x84/0x108 el1_irq+0xcc/0x180 arch_cpu_idle+0x18/0x40 default_idle_call+0x4c/0x1a0 cpuidle_idle_call+0x168/0x1e0 do_idle+0xb4/0x104 cpu_startup_entry+0x30/0x9c secondary_start_kernel+0x104/0x180 Code: d4210000 b0006161 910c8021 94013f4d (d4210000)
---[ end trace 4a88c6a074082f8c ]--- Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt

CVE-2021-47470:
In the Linux kernel, the following vulnerability has been resolved:

mm, slub: fix potential use-after-free in slab_debugfs_fops

When sysfs_slab_add failed, we shouldn't call debugfs_slab_add() for s because s will be freed soon. And slab_debugfs_fops will use s later leading to a use-after-free.

CVE-2021-47467:
In the Linux kernel, the following vulnerability has been resolved:

kunit: fix reference count leak in kfree_at_end

The reference counting issue happens in the normal path of kfree_at_end(). When kunit_alloc_and_get_resource() is invoked, the function forgets to handle the returned resource object, whose refcount increased inside, causing a refcount leak.

Fix this issue by calling kunit_alloc_resource() instead of kunit_alloc_and_get_resource().

Fixed the following when applying:
Shuah Khan <[email protected]>

CHECK: Alignment should match open parenthesis + kunit_alloc_resource(test, NULL, kfree_res_free, GFP_KERNEL, (void *)to_free);

CVE-2021-47448:
In the Linux kernel, the following vulnerability has been resolved:

mptcp: fix possible stall on recvmsg()

recvmsg() can enter an infinite loop if the caller provides the MSG_WAITALL, the data present in the receive queue is not sufficient to fulfill the request, and no more data is received by the peer.

When the above happens, mptcp_wait_data() will always return with no wait, as the MPTCP_DATA_READY flag checked by such function is set and never cleared in such code path.

Leveraging the above syzbot was able to trigger an RCU stall:

rcu: INFO: rcu_preempt self-detected stall on CPU rcu: 0-...!: (10499 ticks this GP) idle=0af/1/0x4000000000000000 softirq=10678/10678 fqs=1 (t=10500 jiffies g=13089 q=109) rcu: rcu_preempt kthread starved for 10497 jiffies! g13089 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1 rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt state:R running task stack:28696 pid: 14 ppid: 2 flags:0x00004000 Call Trace:
context_switch kernel/sched/core.c:4955 [inline]
__schedule+0x940/0x26f0 kernel/sched/core.c:6236 schedule+0xd3/0x270 kernel/sched/core.c:6315 schedule_timeout+0x14a/0x2a0 kernel/time/timer.c:1881 rcu_gp_fqs_loop+0x186/0x810 kernel/rcu/tree.c:1955 rcu_gp_kthread+0x1de/0x320 kernel/rcu/tree.c:2128 kthread+0x405/0x4f0 kernel/kthread.c:327 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 rcu: Stack dump where RCU GP kthread last ran:
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1 CPU: 1 PID: 8510 Comm: syz-executor827 Not tainted 5.15.0-rc2-next-20210920-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:bytes_is_nonzero mm/kasan/generic.c:84 [inline] RIP: 0010:memory_is_nonzero mm/kasan/generic.c:102 [inline] RIP: 0010:memory_is_poisoned_n mm/kasan/generic.c:128 [inline] RIP: 0010:memory_is_poisoned mm/kasan/generic.c:159 [inline] RIP: 0010:check_region_inline mm/kasan/generic.c:180 [inline] RIP: 0010:kasan_check_range+0xc8/0x180 mm/kasan/generic.c:189 Code: 38 00 74 ed 48 8d 50 08 eb 09 48 83 c0 01 48 39 d0 74 7a 80 38 00 74 f2 48 89 c2 b8 01 00 00 00 48 85 d2 75 56 5b 5d 41 5c c3 <48> 85 d2 74 5e 48 01 ea eb 09 48 83 c0 01 48 39 d0 74 50 80 38 00 RSP: 0018:ffffc9000cd676c8 EFLAGS: 00000283 RAX: ffffed100e9a110e RBX: ffffed100e9a110f RCX: ffffffff88ea062a RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff888074d08870 RBP: ffffed100e9a110e R08: 0000000000000001 R09: ffff888074d08877 R10: ffffed100e9a110e R11: 0000000000000000 R12: ffff888074d08000 R13: ffff888074d08000 R14: ffff888074d08088 R15: ffff888074d08000 FS: 0000555556d8e300(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 S: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000180 CR3: 0000000068909000 CR4: 00000000001506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace:
instrument_atomic_read_write include/linux/instrumented.h:101 [inline] test_and_clear_bit include/asm-generic/bitops/instrumented-atomic.h:83 [inline] mptcp_release_cb+0x14a/0x210 net/mptcp/protocol.c:3016 release_sock+0xb4/0x1b0 net/core/sock.c:3204 mptcp_wait_data net/mptcp/protocol.c:1770 [inline] mptcp_recvmsg+0xfd1/0x27b0 net/mptcp/protocol.c:2080 inet6_recvmsg+0x11b/0x5e0 net/ipv6/af_inet6.c:659 sock_recvmsg_nosec net/socket.c:944 [inline]
____sys_recvmsg+0x527/0x600 net/socket.c:2626
___sys_recvmsg+0x127/0x200 net/socket.c:2670 do_recvmmsg+0x24d/0x6d0 net/socket.c:2764
__sys_recvmmsg net/socket.c:2843 [inline]
__do_sys_recvmmsg net/socket.c:2866 [inline]
__se_sys_recvmmsg net/socket.c:2859 [inline]
__x64_sys_recvmmsg+0x20b/0x260 net/socket.c:2859 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fc200d2
---truncated---

CVE-2021-47443:
In the Linux kernel, the following vulnerability has been resolved:

NFC: digital: fix possible memory leak in digital_tg_listen_mdaa()

'params' is allocated in digital_tg_listen_mdaa(), but not free when digital_send_cmd() failed, which will cause memory leak. Fix it by freeing 'params' if digital_send_cmd() return failed.

CVE-2021-47442:
In the Linux kernel, the following vulnerability has been resolved:

NFC: digital: fix possible memory leak in digital_in_send_sdd_req()

'skb' is allocated in digital_in_send_sdd_req(), but not free when digital_in_send_cmd() failed, which will cause memory leak. Fix it by freeing 'skb' if digital_in_send_cmd() return failed.

CVE-2021-47439:
In the Linux kernel, the following vulnerability has been resolved:

net: dsa: microchip: Added the condition for scheduling ksz_mib_read_work

When the ksz module is installed and removed using rmmod, kernel crashes with null pointer dereferrence error. During rmmod, ksz_switch_remove function tries to cancel the mib_read_workqueue using cancel_delayed_work_sync routine and unregister switch from dsa.

During dsa_unregister_switch it calls ksz_mac_link_down, which in turn reschedules the workqueue since mib_interval is non-zero.
Due to which queue executed after mib_interval and it tries to access dp->slave. But the slave is unregistered in the ksz_switch_remove function. Hence kernel crashes.

To avoid this crash, before canceling the workqueue, resetted the mib_interval to 0.

v1 -> v2:
-Removed the if condition in ksz_mib_read_work

CVE-2021-47432:
In the Linux kernel, the following vulnerability has been resolved:

lib/generic-radix-tree.c: Don't overflow in peek()

When we started spreading new inode numbers throughout most of the 64 bit inode space, that triggered some corner case bugs, in particular some integer overflows related to the radix tree code. Oops.

CVE-2021-47430:
In the Linux kernel, the following vulnerability has been resolved:

x86/entry: Clear X86_FEATURE_SMAP when CONFIG_X86_SMAP=n

Commit

3c73b81a9164 (x86/entry, selftests: Further improve user entry sanity checks)

added a warning if AC is set when in the kernel.

Commit

662a0221893a3d (x86/entry: Fix AC assertion)

changed the warning to only fire if the CPU supports SMAP.

However, the warning can still trigger on a machine that supports SMAP but where it's disabled in the kernel config and when running the syscall_nt selftest, for example:

------------[ cut here ]------------ WARNING: CPU: 0 PID: 49 at irqentry_enter_from_user_mode CPU: 0 PID: 49 Comm: init Tainted: G T 5.15.0-rc4+ #98 e6202628ee053b4f310759978284bd8bb0ce6905 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 RIP: 0010:irqentry_enter_from_user_mode ...
Call Trace:
? irqentry_enter ? exc_general_protection ? asm_exc_general_protection ? asm_exc_general_protectio

IS_ENABLED(CONFIG_X86_SMAP) could be added to the warning condition, but even this would not be enough in case SMAP is disabled at boot time with the nosmap parameter.

To be consistent with nosmap behaviour, clear X86_FEATURE_SMAP when !CONFIG_X86_SMAP.

Found using entry-fuzz + satrandconfig.

[ bp: Massage commit message. ]

CVE-2021-47407:
In the Linux kernel, the following vulnerability has been resolved:

KVM: x86: Handle SRCU initialization failure during page track init

Check the return of init_srcu_struct(), which can fail due to OOM, when initializing the page track mechanism. Lack of checking leads to a NULL pointer deref found by a modified syzkaller.

[Move the call towards the beginning of kvm_arch_init_vm. - Paolo]

CVE-2021-47406:
In the Linux kernel, the following vulnerability has been resolved:

ext4: add error checking to ext4_ext_replay_set_iblocks()

If the call to ext4_map_blocks() fails due to an corrupted file system, ext4_ext_replay_set_iblocks() can get stuck in an infinite loop. This could be reproduced by running generic/526 with a file system that has inline_data and fast_commit enabled. The system will repeatedly log to the console:

EXT4-fs warning (device dm-3): ext4_block_to_path:105: block 1074800922 > max in inode 131076

and the stack that it gets stuck in is:

ext4_block_to_path+0xe3/0x130 ext4_ind_map_blocks+0x93/0x690 ext4_map_blocks+0x100/0x660 skip_hole+0x47/0x70 ext4_ext_replay_set_iblocks+0x223/0x440 ext4_fc_replay_inode+0x29e/0x3b0 ext4_fc_replay+0x278/0x550 do_one_pass+0x646/0xc10 jbd2_journal_recover+0x14a/0x270 jbd2_journal_load+0xc4/0x150 ext4_load_journal+0x1f3/0x490 ext4_fill_super+0x22d4/0x2c00

With this patch, generic/526 still fails, but system is no longer locking up in a tight loop. It's likely the root casue is that fast_commit replay is corrupting file systems with inline_data, and we probably need to add better error handling in the fast commit replay code path beyond what is done here, which essentially just breaks the infinite loop without reporting the to the higher levels of the code.

CVE-2021-47390:
In the Linux kernel, the following vulnerability has been resolved:

KVM: x86: Fix stack-out-of-bounds memory access from ioapic_write_indirect()

KASAN reports the following issue:

BUG: KASAN: stack-out-of-bounds in kvm_make_vcpus_request_mask+0x174/0x440 [kvm] Read of size 8 at addr ffffc9001364f638 by task qemu-kvm/4798

CPU: 0 PID: 4798 Comm: qemu-kvm Tainted: G X --------- --- Hardware name: AMD Corporation DAYTONA_X/DAYTONA_X, BIOS RYM0081C 07/13/2020 Call Trace:
dump_stack+0xa5/0xe6 print_address_description.constprop.0+0x18/0x130 ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
__kasan_report.cold+0x7f/0x114 ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm] kasan_report+0x38/0x50 kasan_check_range+0xf5/0x1d0 kvm_make_vcpus_request_mask+0x174/0x440 [kvm] kvm_make_scan_ioapic_request_mask+0x84/0xc0 [kvm] ? kvm_arch_exit+0x110/0x110 [kvm] ? sched_clock+0x5/0x10 ioapic_write_indirect+0x59f/0x9e0 [kvm] ? static_obj+0xc0/0xc0 ? __lock_acquired+0x1d2/0x8c0 ? kvm_ioapic_eoi_inject_work+0x120/0x120 [kvm]

The problem appears to be that 'vcpu_bitmap' is allocated as a single long on stack and it should really be KVM_MAX_VCPUS long. We also seem to clear the lower 16 bits of it with bitmap_zero() for no particular reason (my guess would be that 'bitmap' and 'vcpu_bitmap' variables in kvm_bitmap_or_dest_vcpus() caused the confusion: while the later is indeed 16-bit long, the later should accommodate all possible vCPUs).

CVE-2021-47389:
In the Linux kernel, the following vulnerability has been resolved:

KVM: SVM: fix missing sev_decommission in sev_receive_start

DECOMMISSION the current SEV context if binding an ASID fails after RECEIVE_START. Per AMD's SEV API, RECEIVE_START generates a new guest context and thus needs to be paired with DECOMMISSION:

The RECEIVE_START command is the only command other than the LAUNCH_START command that generates a new guest context and guest handle.

The missing DECOMMISSION can result in subsequent SEV launch failures, as the firmware leaks memory and might not able to allocate more SEV guest contexts in the future.

Note, LAUNCH_START suffered the same bug, but was previously fixed by commit 934002cd660b (KVM: SVM: Call SEV Guest Decommission if ASID binding fails).

CVE-2021-47366:
In the Linux kernel, the following vulnerability has been resolved:

afs: Fix corruption in reads at fpos 2G-4G from an OpenAFS server

AFS-3 has two data fetch RPC variants, FS.FetchData and FS.FetchData64, and Linux's afs client switches between them when talking to a non-YFS server if the read size, the file position or the sum of the two have the upper 32 bits set of the 64-bit value.

This is a problem, however, since the file position and length fields of FS.FetchData are *signed* 32-bit values.

Fix this by capturing the capability bits obtained from the fileserver when it's sent an FS.GetCapabilities RPC, rather than just discarding them, and then picking out the VICED_CAPABILITY_64BITFILES flag. This can then be used to decide whether to use FS.FetchData or FS.FetchData64 - and also FS.StoreData or FS.StoreData64 - rather than using upper_32_bits() to switch on the parameter values.

This capabilities flag could also be used to limit the maximum size of the file, but all servers must be checked for that.

Note that the issue does not exist with FS.StoreData - that uses *unsigned* 32-bit values. It's also not a problem with Auristor servers as its YFS.FetchData64 op uses unsigned 64-bit values.

This can be tested by cloning a git repo through an OpenAFS client to an OpenAFS server and then doing git status on it from a Linux afs client[1]. Provided the clone has a pack file that's in the 2G-4G range, the git status will show errors like:

error: packfile .git/objects/pack/pack-5e813c51d12b6847bbc0fcd97c2bca66da50079c.pack does not match index error: packfile .git/objects/pack/pack-5e813c51d12b6847bbc0fcd97c2bca66da50079c.pack does not match index

This can be observed in the server's FileLog with something like the following appearing:

Sun Aug 29 19:31:39 2021 SRXAFS_FetchData, Fid = 2303380852.491776.3263114, Host 192.168.11.201:7001, Id 1001 Sun Aug 29 19:31:39 2021 CheckRights: len=0, for host=192.168.11.201:7001 Sun Aug 29 19:31:39 2021 FetchData_RXStyle: Pos 18446744071815340032, Len 3154 Sun Aug 29 19:31:39 2021 FetchData_RXStyle: file size 2400758866 ...
Sun Aug 29 19:31:40 2021 SRXAFS_FetchData returns 5

Note the file position of 18446744071815340032. This is the requested file position sign-extended.

CVE-2021-47365:
In the Linux kernel, the following vulnerability has been resolved:

afs: Fix page leak

There's a loop in afs_extend_writeback() that adds extra pages to a write we want to make to improve the efficiency of the writeback by making it larger. This loop stops, however, if we hit a page we can't write back from immediately, but it doesn't get rid of the page ref we speculatively acquired.

This was caused by the removal of the cleanup loop when the code switched from using find_get_pages_contig() to xarray scanning as the latter only gets a single page at a time, not a batch.

Fix this by putting the page on a ref on an early break from the loop.
Unfortunately, we can't just add that page to the pagevec we're employing as we'll go through that and add those pages to the RPC call.

This was found by the generic/074 test. It leaks ~4GiB of RAM each time it is run - which can be observed with top.

CVE-2021-47359:
In the Linux kernel, the following vulnerability has been resolved:

cifs: Fix soft lockup during fsstress

Below traces are observed during fsstress and system got hung.
[ 130.698396] watchdog: BUG: soft lockup - CPU#6 stuck for 26s!

CVE-2021-47353:
In the Linux kernel, the following vulnerability has been resolved:

udf: Fix NULL pointer dereference in udf_symlink function

In function udf_symlink, epos.bh is assigned with the value returned by udf_tgetblk. The function udf_tgetblk is defined in udf/misc.c and returns the value of sb_getblk function that could be NULL.
Then, epos.bh is used without any check, causing a possible NULL pointer dereference when sb_getblk fails.

This fix adds a check to validate the value of epos.bh.

CVE-2021-47351:
In the Linux kernel, the following vulnerability has been resolved:

ubifs: Fix races between xattr_{set|get} and listxattr operations

UBIFS may occur some problems with concurrent xattr_{set|get} and listxattr operations, such as assertion failure, memory corruption, stale xattr value[1].

Fix it by importing a new rw-lock in @ubifs_inode to serilize write operations on xattr, concurrent read operations are still effective, just like ext4.

[1] https://lore.kernel.org/linux-mtd/[email protected]

CVE-2021-47340:
In the Linux kernel, the following vulnerability has been resolved:

jfs: fix GPF in diFree

Avoid passing inode with JFS_SBI(inode->i_sb)->ipimap == NULL to diFree()[1]. GFP will appear:

struct inode *ipimap = JFS_SBI(ip->i_sb)->ipimap;
struct inomap *imap = JFS_IP(ipimap)->i_imap;

JFS_IP() will return invalid pointer when ipimap == NULL

Call Trace:
diFree+0x13d/0x2dc0 fs/jfs/jfs_imap.c:853 [1] jfs_evict_inode+0x2c9/0x370 fs/jfs/inode.c:154 evict+0x2ed/0x750 fs/inode.c:578 iput_final fs/inode.c:1654 [inline] iput.part.0+0x3fe/0x820 fs/inode.c:1680 iput+0x58/0x70 fs/inode.c:1670

CVE-2021-47339:
In the Linux kernel, the following vulnerability has been resolved:

media: v4l2-core: explicitly clear ioctl input data

As seen from a recent syzbot bug report, mistakes in the compat ioctl implementation can lead to uninitialized kernel stack data getting used as input for driver ioctl handlers.

The reported bug is now fixed, but it's possible that other related bugs are still present or get added in the future. As the drivers need to check user input already, the possible impact is fairly low, but it might still cause an information leak.

To be on the safe side, always clear the entire ioctl buffer before calling the conversion handler functions that are meant to initialize them.

CVE-2021-47335:
In the Linux kernel, the following vulnerability has been resolved:

f2fs: fix to avoid racing on fsync_entry_slab by multi filesystem instances

As syzbot reported, there is an use-after-free issue during f2fs recovery:

Use-after-free write at 0xffff88823bc16040 (in kfence-#10):
kmem_cache_destroy+0x1f/0x120 mm/slab_common.c:486 f2fs_recover_fsync_data+0x75b0/0x8380 fs/f2fs/recovery.c:869 f2fs_fill_super+0x9393/0xa420 fs/f2fs/super.c:3945 mount_bdev+0x26c/0x3a0 fs/super.c:1367 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x86/0x270 fs/super.c:1497 do_new_mount fs/namespace.c:2905 [inline] path_mount+0x196f/0x2be0 fs/namespace.c:3235 do_mount fs/namespace.c:3248 [inline]
__do_sys_mount fs/namespace.c:3456 [inline]
__se_sys_mount+0x2f9/0x3b0 fs/namespace.c:3433 do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae

The root cause is multi f2fs filesystem instances can race on accessing global fsync_entry_slab pointer, result in use-after-free issue of slab cache, fixes to init/destroy this slab cache only once during module init/destroy procedure to avoid this issue.

CVE-2021-47332:
In the Linux kernel, the following vulnerability has been resolved:

ALSA: usx2y: Don't call free_pages_exact() with NULL address

Unlike some other functions, we can't pass NULL pointer to free_pages_exact(). Add a proper NULL check for avoiding possible Oops.

CVE-2021-47322:
In the Linux kernel, the following vulnerability has been resolved:

NFSv4: Fix an Oops in pnfs_mark_request_commit() when doing O_DIRECT

Fix an Oopsable condition in pnfs_mark_request_commit() when we're putting a set of writes on the commit list to reschedule them after a failed pNFS attempt.

CVE-2021-47320:
In the Linux kernel, the following vulnerability has been resolved:

nfs: fix acl memory leak of posix_acl_create()

When looking into another nfs xfstests report, I found acl and default_acl in nfs3_proc_create() and nfs3_proc_mknod() error paths are possibly leaked. Fix them in advance.

CVE-2021-47316:
In the Linux kernel, the following vulnerability has been resolved:

nfsd: fix NULL dereference in nfs3svc_encode_getaclres

In error cases the dentry may be NULL.

Before 20798dfe249a, the encoder also checked dentry and d_really_is_positive(dentry), but that looks like overkill to me--zero status should be enough to guarantee a positive dentry.

This isn't the first time we've seen an error-case NULL dereference hidden in the initialization of a local variable in an xdr encoder. But I went back through the other recent rewrites and didn't spot any similar bugs.

CVE-2021-47311:
In the Linux kernel, the following vulnerability has been resolved:

net: qcom/emac: fix UAF in emac_remove

adpt is netdev private data and it cannot be used after free_netdev() call. Using adpt after free_netdev() can cause UAF bug. Fix it by moving free_netdev() at the end of the function.

CVE-2021-47310:
In the Linux kernel, the following vulnerability has been resolved:

net: ti: fix UAF in tlan_remove_one

priv is netdev private data and it cannot be used after free_netdev() call. Using priv after free_netdev() can cause UAF bug. Fix it by moving free_netdev() at the end of the function.

CVE-2021-47306:
In the Linux kernel, the following vulnerability has been resolved:

net: fddi: fix UAF in fza_probe

fp is netdev private data and it cannot be used after free_netdev() call. Using fp after free_netdev() can cause UAF bug. Fix it by moving free_netdev() after error message.

TURBOchannel adapter)

CVE-2021-47301:
In the Linux kernel, the following vulnerability has been resolved:

igb: Fix use-after-free error during reset

Cleans the next descriptor to watch (next_to_watch) when cleaning the TX ring.

Failure to do so can cause invalid memory accesses. If igb_poll() runs while the controller is reset this can lead to the driver try to free a skb that was already freed.

(The crash is harder to reproduce with the igb driver, but the same potential problem exists as the code is identical to igc)

CVE-2021-47292:
In the Linux kernel, the following vulnerability has been resolved:

io_uring: fix memleak in io_init_wq_offload()

I got memory leak report when doing fuzz test:

BUG: memory leak unreferenced object 0xffff888107310a80 (size 96):
comm syz-executor.6, pid 4610, jiffies 4295140240 (age 20.135s) hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N..........
backtrace:
[<000000001974933b>] kmalloc include/linux/slab.h:591 [inline] [<000000001974933b>] kzalloc include/linux/slab.h:721 [inline] [<000000001974933b>] io_init_wq_offload fs/io_uring.c:7920 [inline] [<000000001974933b>] io_uring_alloc_task_context+0x466/0x640 fs/io_uring.c:7955 [<0000000039d0800d>] __io_uring_add_tctx_node+0x256/0x360 fs/io_uring.c:9016 [<000000008482e78c>] io_uring_add_tctx_node fs/io_uring.c:9052 [inline] [<000000008482e78c>] __do_sys_io_uring_enter fs/io_uring.c:9354 [inline] [<000000008482e78c>] __se_sys_io_uring_enter fs/io_uring.c:9301 [inline] [<000000008482e78c>] __x64_sys_io_uring_enter+0xabc/0xc20 fs/io_uring.c:9301 [<00000000b875f18f>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<00000000b875f18f>] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 [<000000006b0a8484>] entry_SYSCALL_64_after_hwframe+0x44/0xae

CPU0 CPU1 io_uring_enter io_uring_enter io_uring_add_tctx_node io_uring_add_tctx_node
__io_uring_add_tctx_node __io_uring_add_tctx_node io_uring_alloc_task_context io_uring_alloc_task_context io_init_wq_offload io_init_wq_offload hash = kzalloc hash = kzalloc ctx->hash_map = hash ctx->hash_map = hash <- one of the hash is leaked

When calling io_uring_enter() in parallel, the 'hash_map' will be leaked, add uring_lock to protect 'hash_map'.

CVE-2021-47281:
In the Linux kernel, the following vulnerability has been resolved:

ALSA: seq: Fix race of snd_seq_timer_open()

The timer instance per queue is exclusive, and snd_seq_timer_open() should have managed the concurrent accesses. It looks as if it's checking the already existing timer instance at the beginning, but it's not right, because there is no protection, hence any later concurrent call of snd_seq_timer_open() may override the timer instance easily. This may result in UAF, as the leftover timer instance can keep running while the queue itself gets closed, as spotted by syzkaller recently.

For avoiding the race, add a proper check at the assignment of tmr->timeri again, and return -EBUSY if it's been already registered.

CVE-2021-47264:
In the Linux kernel, the following vulnerability has been resolved:

ASoC: core: Fix Null-point-dereference in fmt_single_name()

Check the return value of devm_kstrdup() in case of Null-point-dereference.

CVE-2021-47262:
In the Linux kernel, the following vulnerability has been resolved:

KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message

Use the __string() machinery provided by the tracing subystem to make a copy of the string literals consumed by the nested VM-Enter failed tracepoint. A complete copy is necessary to ensure that the tracepoint can't outlive the data/memory it consumes and deference stale memory.

Because the tracepoint itself is defined by kvm, if kvm-intel and/or kvm-amd are built as modules, the memory holding the string literals defined by the vendor modules will be freed when the module is unloaded, whereas the tracepoint and its data in the ring buffer will live until kvm is unloaded (or indefinitely if kvm is built-in).

This bug has existed since the tracepoint was added, but was recently exposed by a new check in tracing to detect exactly this type of bug.

fmt: '%s%s ' current_buffer: ' vmx_dirty_log_t-140127 [003] .... kvm_nested_vmenter_failed: ' WARNING: CPU: 3 PID: 140134 at kernel/trace/trace.c:3759 trace_check_vprintf+0x3be/0x3e0 CPU: 3 PID: 140134 Comm: less Not tainted 5.13.0-rc1-ce2e73ce600a-req #184 Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014 RIP: 0010:trace_check_vprintf+0x3be/0x3e0 Code: <0f> 0b 44 8b 4c 24 1c e9 a9 fe ff ff c6 44 02 ff 00 49 8b 97 b0 20 RSP: 0018:ffffa895cc37bcb0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffffa895cc37bd08 RCX: 0000000000000027 RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff9766cfad74f8 RBP: ffffffffc0a041d4 R08: ffff9766cfad74f0 R09: ffffa895cc37bad8 R10: 0000000000000001 R11: 0000000000000001 R12: ffffffffc0a041d4 R13: ffffffffc0f4dba8 R14: 0000000000000000 R15: ffff976409f2c000 FS: 00007f92fa200740(0000) GS:ffff9766cfac0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000559bd11b0000 CR3: 000000019fbaa002 CR4: 00000000001726e0 Call Trace:
trace_event_printf+0x5e/0x80 trace_raw_output_kvm_nested_vmenter_failed+0x3a/0x60 [kvm] print_trace_line+0x1dd/0x4e0 s_show+0x45/0x150 seq_read_iter+0x2d5/0x4c0 seq_read+0x106/0x150 vfs_read+0x98/0x180 ksys_read+0x5f/0xe0 do_syscall_64+0x40/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae

CVE-2021-47260:
In the Linux kernel, the following vulnerability has been resolved:

NFS: Fix a potential NULL dereference in nfs_get_client()

None of the callers are expecting NULL returns from nfs_get_client() so this code will lead to an Oops. It's better to return an error pointer. I expect that this is dead code so hopefully no one is affected.

CVE-2021-47259:
In the Linux kernel, the following vulnerability has been resolved:

NFS: Fix use-after-free in nfs4_init_client()

KASAN reports a use-after-free when attempting to mount two different exports through two different NICs that belong to the same server.

Olga was able to hit this with kernels starting somewhere between 5.7 and 5.10, but I traced the patch that introduced the clear_bit() call to 4.13. So something must have changed in the refcounting of the clp pointer to make this call to nfs_put_client() the very last one.

CVE-2021-47255:
In the Linux kernel, the following vulnerability has been resolved:

kvm: LAPIC: Restore guard to prevent illegal APIC register access

Per the SDM, any access that touches bytes 4 through 15 of an APIC register may cause undefined behavior and must not be executed.
Worse, such an access in kvm_lapic_reg_read can result in a leak of kernel stack contents. Prior to commit 01402cf81051 (kvm: LAPIC:
write down valid APIC registers), such an access was explicitly disallowed. Restore the guard that was removed in that commit.

CVE-2021-47254:
In the Linux kernel, the following vulnerability has been resolved:

gfs2: Fix use-after-free in gfs2_glock_shrink_scan

The GLF_LRU flag is checked under lru_lock in gfs2_glock_remove_from_lru() to remove the glock from the lru list in __gfs2_glock_put().

On the shrink scan path, the same flag is cleared under lru_lock but because of cond_resched_lock(&lru_lock) in gfs2_dispose_glock_lru(), progress on the put side can be made without deleting the glock from the lru list.

Keep GLF_LRU across the race window opened by cond_resched_lock(&lru_lock) to ensure correct behavior on both sides - clear GLF_LRU after list_del under lru_lock.

CVE-2021-47230:
In the Linux kernel, the following vulnerability has been resolved:

KVM: x86: Immediately reset the MMU context when the SMM flag is cleared

Immediately reset the MMU context when the vCPU's SMM flag is cleared so that the SMM flag in the MMU role is always synchronized with the vCPU's flag. If RSM fails (which isn't correctly emulated), KVM will bail without calling post_leave_smm() and leave the MMU in a bad state.

The bad MMU role can lead to a NULL pointer dereference when grabbing a shadow page's rmap for a page fault as the initial lookups for the gfn will happen with the vCPU's SMM flag (=0), whereas the rmap lookup will use the shadow page's SMM flag, which comes from the MMU (=1). SMM has an entirely different set of memslots, and so the initial lookup can find a memslot (SMM=0) and then explode on the rmap memslot lookup (SMM=1).

general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 1 PID: 8410 Comm: syz-executor382 Not tainted 5.13.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__gfn_to_rmap arch/x86/kvm/mmu/mmu.c:935 [inline] RIP: 0010:gfn_to_rmap+0x2b0/0x4d0 arch/x86/kvm/mmu/mmu.c:947 Code: <42> 80 3c 20 00 74 08 4c 89 ff e8 f1 79 a9 00 4c 89 fb 4d 8b 37 44 RSP: 0018:ffffc90000ffef98 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff888015b9f414 RCX: ffff888019669c40 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001 RBP: 0000000000000001 R08: ffffffff811d9cdb R09: ffffed10065a6002 R10: ffffed10065a6002 R11: 0000000000000000 R12: dffffc0000000000 R13: 0000000000000003 R14: 0000000000000001 R15: 0000000000000000 FS: 000000000124b300(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000028e31000 CR4: 00000000001526e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace:
rmap_add arch/x86/kvm/mmu/mmu.c:965 [inline] mmu_set_spte+0x862/0xe60 arch/x86/kvm/mmu/mmu.c:2604
__direct_map arch/x86/kvm/mmu/mmu.c:2862 [inline] direct_page_fault+0x1f74/0x2b70 arch/x86/kvm/mmu/mmu.c:3769 kvm_mmu_do_page_fault arch/x86/kvm/mmu.h:124 [inline] kvm_mmu_page_fault+0x199/0x1440 arch/x86/kvm/mmu/mmu.c:5065 vmx_handle_exit+0x26/0x160 arch/x86/kvm/vmx/vmx.c:6122 vcpu_enter_guest+0x3bdd/0x9630 arch/x86/kvm/x86.c:9428 vcpu_run+0x416/0xc20 arch/x86/kvm/x86.c:9494 kvm_arch_vcpu_ioctl_run+0x4e8/0xa40 arch/x86/kvm/x86.c:9722 kvm_vcpu_ioctl+0x70f/0xbb0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3460 vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:1069 [inline]
__se_sys_ioctl+0xfb/0x170 fs/ioctl.c:1055 do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x440ce9

CVE-2021-47228:
In the Linux kernel, the following vulnerability has been resolved:

x86/ioremap: Map EFI-reserved memory as encrypted for SEV

Some drivers require memory that is marked as EFI boot services data. In order for this memory to not be re-used by the kernel after ExitBootServices(), efi_mem_reserve() is used to preserve it by inserting a new EFI memory descriptor and marking it with the EFI_MEMORY_RUNTIME attribute.

Under SEV, memory marked with the EFI_MEMORY_RUNTIME attribute needs to be mapped encrypted by Linux, otherwise the kernel might crash at boot like below:

EFI Variables Facility v0.08 2004-May-17 general protection fault, probably for non-canonical address 0x3597688770a868b2: 0000 [#1] SMP NOPTI CPU: 13 PID: 1 Comm: swapper/0 Not tainted 5.12.4-2-default #1 openSUSE Tumbleweed Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:efi_mokvar_entry_next [...] Call Trace:
efi_mokvar_sysfs_init ? efi_mokvar_table_init do_one_initcall ? __kmalloc kernel_init_freeable ? rest_init kernel_init ret_from_fork

Expand the __ioremap_check_other() function to additionally check for this other type of boot data reserved at runtime and indicate that it should be mapped encrypted for an SEV guest.

[ bp: Massage commit message. ]

CVE-2021-47227:
In the Linux kernel, the following vulnerability has been resolved:

x86/fpu: Prevent state corruption in __fpu__restore_sig()

The non-compacted slowpath uses __copy_from_user() and copies the entire user buffer into the kernel buffer, verbatim. This means that the kernel buffer may now contain entirely invalid state on which XRSTOR will #GP.
validate_user_xstate_header() can detect some of that corruption, but that leaves the onus on callers to clear the buffer.

Prior to XSAVES support, it was possible just to reinitialize the buffer, completely, but with supervisor states that is not longer possible as the buffer clearing code split got it backwards. Fixing that is possible but not corrupting the state in the first place is more robust.

Avoid corruption of the kernel XSAVE buffer by using copy_user_to_xstate() which validates the XSAVE header contents before copying the actual states to the kernel. copy_user_to_xstate() was previously only called for compacted-format kernel buffers, but it works for both compacted and non-compacted forms.

Using it for the non-compacted form is slower because of multiple
__copy_from_user() operations, but that cost is less important than robust code in an already slow path.

[ Changelog polished by Dave Hansen ]

CVE-2021-47226:
In the Linux kernel, the following vulnerability has been resolved:

x86/fpu: Invalidate FPU state after a failed XRSTOR from a user buffer

Both Intel and AMD consider it to be architecturally valid for XRSTOR to fail with #PF but nonetheless change the register state. The actual conditions under which this might occur are unclear [1], but it seems plausible that this might be triggered if one sibling thread unmaps a page and invalidates the shared TLB while another sibling thread is executing XRSTOR on the page in question.

__fpu__restore_sig() can execute XRSTOR while the hardware registers are preserved on behalf of a different victim task (using the fpu_fpregs_owner_ctx mechanism), and, in theory, XRSTOR could fail but modify the registers.

If this happens, then there is a window in which __fpu__restore_sig() could schedule out and the victim task could schedule back in without reloading its own FPU registers. This would result in part of the FPU state that __fpu__restore_sig() was attempting to load leaking into the victim task's user-visible state.

Invalidate preserved FPU registers on XRSTOR failure to prevent this situation from corrupting any state.

[1] Frequent readers of the errata lists might imagine complex microarchitectural conditions.

CVE-2021-47217:
In the Linux kernel, the following vulnerability has been resolved:

x86/hyperv: Fix NULL deref in set_hv_tscchange_cb() if Hyper-V setup fails

Check for a valid hv_vp_index array prior to derefencing hv_vp_index when setting Hyper-V's TSC change callback. If Hyper-V setup failed in hyperv_init(), the kernel will still report that it's running under Hyper-V, but will have silently disabled nearly all functionality.

BUG: kernel NULL pointer dereference, address: 0000000000000010 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc2+ #75 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:set_hv_tscchange_cb+0x15/0xa0 Code: <8b> 04 82 8b 15 12 17 85 01 48 c1 e0 20 48 0d ee 00 01 00 f6 c6 08 ...
Call Trace:
kvm_arch_init+0x17c/0x280 kvm_init+0x31/0x330 vmx_init+0xba/0x13a do_one_initcall+0x41/0x1c0 kernel_init_freeable+0x1f2/0x23b kernel_init+0x16/0x120 ret_from_fork+0x22/0x30

CVE-2021-47212:
In the Linux kernel, the following vulnerability has been resolved:

net/mlx5: Update error handler for UCTX and UMEM

In the fast unload flow, the device state is set to internal error, which indicates that the driver started the destroy process.
In this case, when a destroy command is being executed, it should return MLX5_CMD_STAT_OK.
Fix MLX5_CMD_OP_DESTROY_UCTX and MLX5_CMD_OP_DESTROY_UMEM to return OK instead of EIO.

This fixes a call trace in the umem release process - [ 2633.536695] Call Trace:
[ 2633.537518] ib_uverbs_remove_one+0xc3/0x140 [ib_uverbs] [ 2633.538596] remove_client_context+0x8b/0xd0 [ib_core] [ 2633.539641] disable_device+0x8c/0x130 [ib_core] [ 2633.540615] __ib_unregister_device+0x35/0xa0 [ib_core] [ 2633.541640] ib_unregister_device+0x21/0x30 [ib_core] [ 2633.542663] __mlx5_ib_remove+0x38/0x90 [mlx5_ib] [ 2633.543640] auxiliary_bus_remove+0x1e/0x30 [auxiliary] [ 2633.544661] device_release_driver_internal+0x103/0x1f0 [ 2633.545679] bus_remove_device+0xf7/0x170 [ 2633.546640] device_del+0x181/0x410 [ 2633.547606] mlx5_rescan_drivers_locked.part.10+0x63/0x160 [mlx5_core] [ 2633.548777] mlx5_unregister_device+0x27/0x40 [mlx5_core] [ 2633.549841] mlx5_uninit_one+0x21/0xc0 [mlx5_core] [ 2633.550864] remove_one+0x69/0xe0 [mlx5_core] [ 2633.551819] pci_device_remove+0x3b/0xc0 [ 2633.552731] device_release_driver_internal+0x103/0x1f0 [ 2633.553746] unbind_store+0xf6/0x130 [ 2633.554657] kernfs_fop_write+0x116/0x190 [ 2633.555567] vfs_write+0xa5/0x1a0 [ 2633.556407] ksys_write+0x4f/0xb0 [ 2633.557233] do_syscall_64+0x5b/0x1a0 [ 2633.558071] entry_SYSCALL_64_after_hwframe+0x65/0xca [ 2633.559018] RIP: 0033:0x7f9977132648 [ 2633.559821] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 55 6f 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55 [ 2633.562332] RSP: 002b:00007fffb1a83888 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 2633.563472] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f9977132648 [ 2633.564541] RDX: 000000000000000c RSI: 000055b90546e230 RDI: 0000000000000001 [ 2633.565596] RBP: 000055b90546e230 R08: 00007f9977406860 R09: 00007f9977a54740 [ 2633.566653] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f99774056e0 [ 2633.567692] R13: 000000000000000c R14: 00007f9977400880 R15: 000000000000000c [ 2633.568725] ---[ end trace 10b4fe52945e544d ]---

CVE-2021-47211:
In the Linux kernel, the following vulnerability has been resolved:

ALSA: usb-audio: fix null pointer dereference on pointer cs_desc

The pointer cs_desc return from snd_usb_find_clock_source could be null, so there is a potential null pointer dereference issue.
Fix this by adding a null check before dereference.

CVE-2021-47207:
In the Linux kernel, the following vulnerability has been resolved:

ALSA: gus: fix null pointer dereference on pointer block

The pointer block return from snd_gf1_dma_next_block could be null, so there is a potential null pointer dereference issue.
Fix this by adding a null check before dereference.

CVE-2021-47201:
In the Linux kernel, the following vulnerability has been resolved:

iavf: free q_vectors before queues in iavf_disable_vf

iavf_free_queues() clears adapter->num_active_queues, which iavf_free_q_vectors() relies on, so swap the order of these two function calls in iavf_disable_vf(). This resolves a panic encountered when the interface is disabled and then later brought up again after PF communication is restored.

CVE-2021-47194:
In the Linux kernel, the following vulnerability has been resolved:

cfg80211: call cfg80211_stop_ap when switch from P2P_GO type

If the userspace tools switch from NL80211_IFTYPE_P2P_GO to NL80211_IFTYPE_ADHOC via send_msg(NL80211_CMD_SET_INTERFACE), it does not call the cleanup cfg80211_stop_ap(), this leads to the initialization of in-use data. For example, this path re-init the sdata->assigned_chanctx_list while it is still an element of assigned_vifs list, and makes that linked list corrupt.

CVE-2021-47103:
In the Linux kernel, the following vulnerability has been resolved:

inet: fully convert sk->sk_rx_dst to RCU rules

syzbot reported various issues around early demux, one being included in this changelog [1]

sk->sk_rx_dst is using RCU protection without clearly documenting it.

And following sequences in tcp_v4_do_rcv()/tcp_v6_do_rcv() are not following standard RCU rules.

[a] dst_release(dst);
[b] sk->sk_rx_dst = NULL;

They look wrong because a delete operation of RCU protected pointer is supposed to clear the pointer before the call_rcu()/synchronize_rcu() guarding actual memory freeing.

In some cases indeed, dst could be freed before [b] is done.

We could cheat by clearing sk_rx_dst before calling dst_release(), but this seems the right time to stick to standard RCU annotations and debugging facilities.

[1] BUG: KASAN: use-after-free in dst_check include/net/dst.h:470 [inline] BUG: KASAN: use-after-free in tcp_v4_early_demux+0x95b/0x960 net/ipv4/tcp_ipv4.c:1792 Read of size 2 at addr ffff88807f1cb73a by task syz-executor.5/9204

CPU: 0 PID: 9204 Comm: syz-executor.5 Not tainted 5.16.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_address_description.constprop.0.cold+0x8d/0x320 mm/kasan/report.c:247
__kasan_report mm/kasan/report.c:433 [inline] kasan_report.cold+0x83/0xdf mm/kasan/report.c:450 dst_check include/net/dst.h:470 [inline] tcp_v4_early_demux+0x95b/0x960 net/ipv4/tcp_ipv4.c:1792 ip_rcv_finish_core.constprop.0+0x15de/0x1e80 net/ipv4/ip_input.c:340 ip_list_rcv_finish.constprop.0+0x1b2/0x6e0 net/ipv4/ip_input.c:583 ip_sublist_rcv net/ipv4/ip_input.c:609 [inline] ip_list_rcv+0x34e/0x490 net/ipv4/ip_input.c:644
__netif_receive_skb_list_ptype net/core/dev.c:5508 [inline]
__netif_receive_skb_list_core+0x549/0x8e0 net/core/dev.c:5556
__netif_receive_skb_list net/core/dev.c:5608 [inline] netif_receive_skb_list_internal+0x75e/0xd80 net/core/dev.c:5699 gro_normal_list net/core/dev.c:5853 [inline] gro_normal_list net/core/dev.c:5849 [inline] napi_complete_done+0x1f1/0x880 net/core/dev.c:6590 virtqueue_napi_complete drivers/net/virtio_net.c:339 [inline] virtnet_poll+0xca2/0x11b0 drivers/net/virtio_net.c:1557
__napi_poll+0xaf/0x440 net/core/dev.c:7023 napi_poll net/core/dev.c:7090 [inline] net_rx_action+0x801/0xb40 net/core/dev.c:7177
__do_softirq+0x29b/0x9c2 kernel/softirq.c:558 invoke_softirq kernel/softirq.c:432 [inline]
__irq_exit_rcu+0x123/0x180 kernel/softirq.c:637 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649 common_interrupt+0x52/0xc0 arch/x86/kernel/irq.c:240 asm_common_interrupt+0x1e/0x40 arch/x86/include/asm/idtentry.h:629 RIP: 0033:0x7f5e972bfd57 Code: 39 d1 73 14 0f 1f 80 00 00 00 00 48 8b 50 f8 48 83 e8 08 48 39 ca 77 f3 48 39 c3 73 3e 48 89 13 48 8b 50 f8 48 89 38 49 8b 0e <48> 8b 3e 48 83 c3 08 48 83 c6 08 eb bc 48 39 d1 72 9e 48 39 d0 73 RSP: 002b:00007fff8a413210 EFLAGS: 00000283 RAX: 00007f5e97108990 RBX: 00007f5e97108338 RCX: ffffffff81d3aa45 RDX: ffffffff81d3aa45 RSI: 00007f5e97108340 RDI: ffffffff81d3aa45 RBP: 00007f5e97107eb8 R08: 00007f5e97108d88 R09: 0000000093c2e8d9 R10: 0000000000000000 R11: 0000000000000000 R12: 00007f5e97107eb0 R13: 00007f5e97108338 R14: 00007f5e97107ea8 R15: 0000000000000019 </TASK>

Allocated by task 13:
kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38 kasan_set_track mm/kasan/common.c:46 [inline] set_alloc_info mm/kasan/common.c:434 [inline]
__kasan_slab_alloc+0x90/0xc0 mm/kasan/common.c:467 kasan_slab_alloc include/linux/kasan.h:259 [inline] slab_post_alloc_hook mm/slab.h:519 [inline] slab_alloc_node mm/slub.c:3234 [inline] slab_alloc mm/slub.c:3242 [inline] kmem_cache_alloc+0x202/0x3a0 mm/slub.c:3247 dst_alloc+0x146/0x1f0 net/core/dst.c:92 rt_dst_alloc+0x73/0x430 net/ipv4/route.c:1613 ip_route_input_slow+0x1817/0x3a20 net/ipv4/route.c:234
---truncated---

CVE-2021-47091:
In the Linux kernel, the following vulnerability has been resolved:

mac80211: fix locking in ieee80211_start_ap error path

We need to hold the local->mtx to release the channel context, as even encoded by the lockdep_assert_held() there. Fix it.

CVE-2021-47076:
In the Linux kernel, the following vulnerability has been resolved:

RDMA/rxe: Return CQE error if invalid lkey was supplied

RXE is missing update of WQE status in LOCAL_WRITE failures. This caused the following kernel panic if someone sent an atomic operation with an explicitly wrong lkey.

[leonro@vm ~]$ mkt test test_atomic_invalid_lkey (tests.test_atomic.AtomicTest) ...
WARNING: CPU: 5 PID: 263 at drivers/infiniband/sw/rxe/rxe_comp.c:740 rxe_completer+0x1a6d/0x2e30 [rdma_rxe] Modules linked in: crc32_generic rdma_rxe ip6_udp_tunnel udp_tunnel rdma_ucm rdma_cm ib_umad ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core mlx5_core ptp pps_core CPU: 5 PID: 263 Comm: python3 Not tainted 5.13.0-rc1+ #2936 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:rxe_completer+0x1a6d/0x2e30 [rdma_rxe] Code: 03 0f 8e 65 0e 00 00 3b 93 10 06 00 00 0f 84 82 0a 00 00 4c 89 ff 4c 89 44 24 38 e8 2d 74 a9 e1 4c 8b 44 24 38 e9 1c f5 ff ff <0f> 0b e9 0c e8 ff ff b8 05 00 00 00 41 bf 05 00 00 00 e9 ab e7 ff RSP: 0018:ffff8880158af090 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff888016a78000 RCX: ffffffffa0cf1652 RDX: 1ffff9200004b442 RSI: 0000000000000004 RDI: ffffc9000025a210 RBP: dffffc0000000000 R08: 00000000ffffffea R09: ffff88801617740b R10: ffffed1002c2ee81 R11: 0000000000000007 R12: ffff88800f3b63e8 R13: ffff888016a78008 R14: ffffc9000025a180 R15: 000000000000000c FS: 00007f88b622a740(0000) GS:ffff88806d540000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f88b5a1fa10 CR3: 000000000d848004 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace:
rxe_do_task+0x130/0x230 [rdma_rxe] rxe_rcv+0xb11/0x1df0 [rdma_rxe] rxe_loopback+0x157/0x1e0 [rdma_rxe] rxe_responder+0x5532/0x7620 [rdma_rxe] rxe_do_task+0x130/0x230 [rdma_rxe] rxe_rcv+0x9c8/0x1df0 [rdma_rxe] rxe_loopback+0x157/0x1e0 [rdma_rxe] rxe_requester+0x1efd/0x58c0 [rdma_rxe] rxe_do_task+0x130/0x230 [rdma_rxe] rxe_post_send+0x998/0x1860 [rdma_rxe] ib_uverbs_post_send+0xd5f/0x1220 [ib_uverbs] ib_uverbs_write+0x847/0xc80 [ib_uverbs] vfs_write+0x1c5/0x840 ksys_write+0x176/0x1d0 do_syscall_64+0x3f/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae

CVE-2021-47069:
In the Linux kernel, the following vulnerability has been resolved:

ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry

do_mq_timedreceive calls wq_sleep with a stack local address. The sender (do_mq_timedsend) uses this address to later call pipelined_send.

This leads to a very hard to trigger race where a do_mq_timedreceive call might return and leave do_mq_timedsend to rely on an invalid address, causing the following crash:

RIP: 0010:wake_q_add_safe+0x13/0x60 Call Trace:
__x64_sys_mq_timedsend+0x2a9/0x490 do_syscall_64+0x80/0x680 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f5928e40343

The race occurs as:

1. do_mq_timedreceive calls wq_sleep with the address of `struct ext_wait_queue` on function stack (aliased as `ewq_addr` here) - it holds a valid `struct ext_wait_queue *` as long as the stack has not been overwritten.

2. `ewq_addr` gets added to info->e_wait_q[RECV].list in wq_add, and do_mq_timedsend receives it via wq_get_first_waiter(info, RECV) to call
__pipelined_op.

3. Sender calls __pipelined_op::smp_store_release(&this->state, STATE_READY). Here is where the race window begins. (`this` is `ewq_addr`.)

4. If the receiver wakes up now in do_mq_timedreceive::wq_sleep, it will see `state == STATE_READY` and break.

5. do_mq_timedreceive returns, and `ewq_addr` is no longer guaranteed to be a `struct ext_wait_queue *` since it was on do_mq_timedreceive's stack. (Although the address may not get overwritten until another function happens to touch it, which means it can persist around for an indefinite time.)

6. do_mq_timedsend::__pipelined_op() still believes `ewq_addr` is a `struct ext_wait_queue *`, and uses it to find a task_struct to pass to the wake_q_add_safe call. In the lucky case where nothing has overwritten `ewq_addr` yet, `ewq_addr->task` is the right task_struct.
In the unlucky case, __pipelined_op::wake_q_add_safe gets handed a bogus address as the receiver's task_struct causing the crash.

do_mq_timedsend::__pipelined_op() should not dereference `this` after setting STATE_READY, as the receiver counterpart is now free to return.
Change __pipelined_op to call wake_q_add_safe on the receiver's task_struct returned by get_task_struct, instead of dereferencing `this` which sits on the receiver's stack.

As Manfred pointed out, the race potentially also exists in ipc/msg.c::expunge_all and ipc/sem.c::wake_up_sem_queue_prepare. Fix those in the same way.

CVE-2021-3923:
A flaw was found in the Linux kernel's implementation of RDMA over infiniband. An attacker with a privileged local account can leak kernel stack information when issuing commands to the /dev/infiniband/rdma_cm device node. While this access is unlikely to leak sensitive user information, it can be further used to defeat existing kernel protection mechanisms.

CVE-2021-38208:
net/nfc/llcp_sock.c in the Linux kernel before 5.12.10 allows local unprivileged users to cause a denial of service (NULL pointer dereference and BUG) by making a getsockname call after a certain type of failure of a bind call.

CVE-2021-38207:
drivers/net/ethernet/xilinx/ll_temac_main.c in the Linux kernel before 5.12.13 allows remote attackers to cause a denial of service (buffer overflow and lockup) by sending heavy network traffic for about ten minutes.

CVE-2021-38205:
drivers/net/ethernet/xilinx/xilinx_emaclite.c in the Linux kernel before 5.13.3 makes it easier for attackers to defeat an ASLR protection mechanism because it prints a kernel pointer (i.e., the real IOMEM pointer).

CVE-2021-3635:
A flaw was found in the Linux kernel netfilter implementation in versions prior to 5.5-rc7. A user with root (CAP_SYS_ADMIN) access is able to panic the system when issuing netfilter netflow commands.

CVE-2021-3428:
A flaw was found in the Linux kernel. A denial of service problem is identified if an extent tree is corrupted in a crafted ext4 filesystem in fs/ext4/extents.c in ext4_es_cache_extent. Fabricating an integer overflow, A local attacker with a special user privilege may cause a system crash problem which can lead to an availability threat.

CVE-2020-36322:
An issue was discovered in the FUSE filesystem implementation in the Linux kernel before 5.10.6, aka CID-5d069dbe8aaf. fuse_do_getattr() calls make_bad_inode() in inappropriate situations, causing a system crash. NOTE: the original fix for this vulnerability was incomplete, and its incompleteness is tracked as CVE-2021-28950.

CVE-2020-35513:
A flaw incorrect umask during file or directory modification in the Linux kernel NFS (network file system) functionality was found in the way user create and delete object using NFSv4.2 or newer if both simultaneously accessing the NFS by the other process that is not using new NFSv4.2. A user with access to the NFS could use this flaw to starve the resources causing denial of service.

CVE-2020-14381:
A flaw was found in the Linux kernel's futex implementation. This flaw allows a local attacker to corrupt system memory or escalate their privileges when creating a futex on a filesystem that is about to be unmounted. The highest threat from this vulnerability is to confidentiality, integrity, as well as system availability.

Tenable has extracted the preceding description block directly from the Tencent Linux security advisory.

Note that Nessus has not tested for these issues but has instead relied only on the application's self-reported version number.

Solution

Update the affected packages.

Plugin Details

Severity: High

ID: 239342

File Name: tencentos_TSSA_2024_0981.nasl

Version: 1.1

Type: local

Family: Tencent Local Security Checks

Published: 6/16/2025

Updated: 6/16/2025

Supported Sensors: Nessus

Vulnerability Information

CPE: p-cpe:/a:tencent:tencentos_server:kernel, cpe:/o:tencent:tencentos_server:4

Required KB Items: Host/local_checks_enabled, Host/cpu, Host/etc/os-release, Host/TencentOS/rpm-list

Exploit Ease: No known exploits are available

Patch Publication Date: 1/17/2025

Vulnerability Publication Date: 1/17/2025

Detections

Analytics