Description of problem:
When an IDXD MDEV device is re-enabled/re-created, the following warning appears in the host kernel dmesg:

[ 1045.764961] vdcm_vidxd_create, idxd migration region init successfully!!!
[ 1045.764964] ------------[ cut here ]------------
[ 1045.764965] list_add corruption. prev->next should be next (ffff994b9d7399e8), but was 0000000000000000. (prev=ffff995bc7f10020).
[ 1045.764970] WARNING: CPU: 237 PID: 20847 at lib/list_debug.c:28 __list_add_valid+0x5b/0x90
[ 1045.764971] Modules linked in: xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat x_tables nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink br_netfilter bridge stp llc overlay sunrpc nls_iso8859_1 iax_crypto idxd_mdev vfio_pci vfio_virqfd mdev intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio iTCO_wdt iTCO_vendor_support snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core joydev input_leds snd_hwdep snd_seq snd_seq_device snd_pcm intel_cstate efi_pstore snd_timer idxd pcspkr isst_if_mbox_pci mei_me snd isst_if_mmio isst_if_common i2c_i801 mei idxd_bus soundcore i2c_ismt i2c_smbus ipmi_ssif acpi_pad acpi_power_meter evbug mac_hid sch_fq_codel xfs
[ 1045.764995] libcrc32c ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec hid_generic drm_ttm_helper usbmouse usbkbd ttm usbhid ahci drm igc hid libahci wmi ipmi_si ipmi_devintf ipmi_msghandler msr autofs4
[ 1045.765004] CPU: 237 PID: 20847 Comm: dsa_iax_enable. Tainted: G S W 5.10.134+ #415
[ 1045.765005] Hardware name: Intel Corporation ArcherCity/ArcherCity, BIOS EGSDREL1.SYS.0101.D66.2304161335 04/16/2023
[ 1045.765007] RIP: 0010:__list_add_valid+0x5b/0x90
[ 1045.765008] Code: c7 c7 d8 69 60 a5 48 89 c2 e8 6e cf 5d 00 0f 0b 31 c0 5d c3 cc cc cc cc 48 89 c1 4c 89 c6 48 c7 c7 28 6a 60 a5 e8 52 cf 5d 00 <0f> 0b 31 c0 5d c3 cc cc cc cc 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7
[ 1045.765009] RSP: 0018:ffffbfbade927b98 EFLAGS: 00010282
[ 1045.765010] RAX: 0000000000000000 RBX: ffff994b9d7399c0 RCX: 0000000000000027
[ 1045.765010] RDX: 0000000000000027 RSI: ffff996aff4a05e0 RDI: ffff996aff4a05e8
[ 1045.765011] RBP: ffffbfbade927b98 R08: 0000000000000000 R09: c0000000fffeffff
[ 1045.765012] R10: 0000000000000001 R11: ffffbfbade927950 R12: ffff995bc7f10000
[ 1045.765013] R13: ffff995bc7f10020 R14: ffff995bc7f10020 R15: ffff994b9d7399e8
[ 1045.765014] FS: 00007f2544cd2740(0000) GS:ffff996aff480000(0000) knlGS:0000000000000000
[ 1045.765014] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1045.765015] CR2: 00007f2544687770 CR3: 00000030efe62001 CR4: 0000000000770ee0
[ 1045.765016] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1045.765017] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 1045.765017] PKRU: 55555554
[ 1045.765018] Call Trace:
[ 1045.765020] vfio_assign_device_set+0x8d/0x1b0
[ 1045.765022] ? vprintk_func+0x62/0x110
[ 1045.765023] vfio_register_group_dev+0x261/0x410
[ 1045.765026] idxd_vdcm_probe+0x408/0x500 [idxd_mdev]
[ 1045.765028] mdev_probe+0x3e/0xb0 [mdev]
[ 1045.765029] really_probe+0x1d7/0x3f0
[ 1045.765030] __driver_probe_device+0x112/0x190
[ 1045.765031] device_driver_attach+0x29/0x60
[ 1045.765032] mdev_device_create+0x1eb/0x290 [mdev]
[ 1045.765033] create_store+0x98/0xc0 [mdev]
[ 1045.765035] ? kernfs_fop_write_iter+0xea/0x1d0
[ 1045.765036] mdev_type_attr_store+0x14/0x30 [mdev]
[ 1045.765037] sysfs_kf_write+0x38/0x50
[ 1045.765038] kernfs_fop_write_iter+0x146/0x1d0
[ 1045.765040] new_sync_write+0x114/0x1b0
[ 1045.765041] vfs_write+0x18d/0x260
[ 1045.765042] ksys_write+0x61/0xe0
[ 1045.765043] __x64_sys_write+0x1a/0x20
[ 1045.765044] do_syscall_64+0x34/0x90
[ 1045.765046] entry_SYSCALL_64_after_hwframe+0x61/0xc6
[ 1045.765047] RIP: 0033:0x7f25443e75c8
[ 1045.765048] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 d5 3f 2a 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
[ 1045.765048] RSP: 002b:00007ffea8a362c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 1045.765049] RAX: ffffffffffffffda RBX: 0000000000000025 RCX: 00007f25443e75c8
[ 1045.765050] RDX: 0000000000000025 RSI: 0000556fab240380 RDI: 0000000000000001
[ 1045.765050] RBP: 0000556fab240380 R08: 000000000000000a R09: 00007f2544447740
[ 1045.765051] R10: 000000000000000a R11: 0000000000000246 R12: 00007f25446876e0
[ 1045.765051] R13: 0000000000000025 R14: 00007f2544682880 R15: 0000000000000025
[ 1045.765052] ---[ end trace 684b846c78e8efe3 ]---

How reproducible:

Steps to Reproduce:
1. Create IDXD MDEV device
2. Assign one MDEV device to VM with vIOMMU
3. Start VM and run some DSA test
4. Shutdown VM and delete created MDEV device
5. re-create IDXD MDEV device.
(In reply to Fengqian from comment #0)
> Steps to Reproduce:
> 1. Create IDXD MDEV device
> 2. Assign one MDEV device to VM with vIOMMU
> 3. Start VM and run some DSA test
> 4. Shutdown VM and delete created MDEV device
> 5. re-create IDXD MDEV device.

To reproduce this warning, the kernel must be built with CONFIG_DEBUG_LIST enabled.
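For readers unfamiliar with the "list_add corruption" message, here is a minimal userspace sketch of the kind of sanity check that CONFIG_DEBUG_LIST performs before inserting into a doubly linked list. It is only a model: `fake_dev`, `list_add_tail_checked()` and the two scenario functions are hypothetical names, and the assumption that a stale, zeroed entry was left on the list is one possible way to get `prev->next == NULL`, not a root cause confirmed in this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Model of the debug check: both neighbours must still point at each other. */
static bool list_add_valid(struct list_head *new, struct list_head *prev,
                           struct list_head *next)
{
    if (next->prev != prev || new == prev || new == next)
        return false;
    if (prev->next != next) {
        fprintf(stderr, "list_add corruption. prev->next should be next (%p), "
                "but was %p. (prev=%p).\n",
                (void *)next, (void *)prev->next, (void *)prev);
        return false;
    }
    return true;
}

/* Append to the tail only if the check passes; report the result. */
static bool list_add_tail_checked(struct list_head *new, struct list_head *head)
{
    struct list_head *prev = head->prev;

    if (!list_add_valid(new, prev, head))
        return false;
    head->prev = new;
    new->next = head;
    new->prev = prev;
    prev->next = new;
    return true;
}

static void list_del_entry(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
}

/* Hypothetical device object that embeds its list node. */
struct fake_dev {
    int id;
    struct list_head node;
};

/* Assumed failure mode: the old entry is wiped without being unlinked. */
bool recreate_after_stale_entry(void)
{
    struct list_head set;
    struct fake_dev old_dev = { .id = 1 }, new_dev = { .id = 2 };
    bool ok;

    init_list_head(&set);
    ok = list_add_tail_checked(&old_dev.node, &set);
    assert(ok);
    memset(&old_dev, 0, sizeof(old_dev)); /* set.prev still points here */
    return list_add_tail_checked(&new_dev.node, &set);
}

/* Same sequence, but the old entry is unlinked before it is wiped. */
bool recreate_after_clean_removal(void)
{
    struct list_head set;
    struct fake_dev old_dev = { .id = 1 }, new_dev = { .id = 2 };
    bool ok;

    init_list_head(&set);
    ok = list_add_tail_checked(&old_dev.node, &set);
    assert(ok);
    list_del_entry(&old_dev.node);
    memset(&old_dev, 0, sizeof(old_dev));
    return list_add_tail_checked(&new_dev.node, &set);
}
```

Calling `recreate_after_stale_entry()` from a `main()` prints a message of the same shape as the one in the dmesg above and returns false, while `recreate_after_clean_removal()` returns true. Without such a check the insertion would proceed silently on the inconsistent list, which would explain why the warning is only visible with CONFIG_DEBUG_LIST enabled.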
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/1978