Bug 28352 - Hygon: Support fine-grained shared memory management for CSV3 VM
Summary: Hygon: Support fine-grained shared memory management for CSV3 VM
Status: NEW
Alias: None
Product: ANCK 5.10 Dev
Classification: ANCK
Component: X86
Version: unspecified
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: Guanjun
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-12-22 13:36 UTC by wojiaohanliyang
Modified: 2025-12-22 13:47 UTC

See Also:


Description wojiaohanliyang hygon_group 2025-12-22 13:36:43 UTC
Description of problem:

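num_online_nodes() returns only the number of online NUMA nodes; the IDs of those nodes are not necessarily contiguous or zero-based. The csv3_shared_memory array, which is tracked via the sysfs path /sys/kernel/mm/csv3_cma/mem_info, should therefore be of size MAX_NUMNODES so that any valid node ID can index it.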
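
To illustrate the sizing problem, here is a minimal userspace sketch, not kernel code and not taken from the patch. DEMO_MAX_NUMNODES, the online-node bitmask, and all demo_* names are hypothetical stand-ins for MAX_NUMNODES and the kernel's node mask. It shows that with nodes 1 and 3 online, an array sized by the online count (2) cannot be indexed safely by node ID, while one sized by the maximum node count can.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's MAX_NUMNODES. */
#define DEMO_MAX_NUMNODES 8

/* Hypothetical per-node accounting record. */
struct demo_node_mem {
    unsigned long shared_bytes;
};

/* Sized by the maximum possible node ID, not by the online count. */
static struct demo_node_mem demo_mem_info[DEMO_MAX_NUMNODES];

/* Count the online nodes in a bitmask (similar in spirit to num_online_nodes()). */
static unsigned int demo_num_online(unsigned int online_mask)
{
    unsigned int n = 0;

    for (int nid = 0; nid < DEMO_MAX_NUMNODES; nid++)
        if (online_mask & (1u << nid))
            n++;
    return n;
}

/* True if every online node ID is a valid index into an array of 'len' entries. */
static bool demo_index_safe(unsigned int online_mask, size_t len)
{
    for (int nid = 0; nid < DEMO_MAX_NUMNODES; nid++)
        if ((online_mask & (1u << nid)) && (size_t)nid >= len)
            return false;
    return true;
}

/* Charge bytes to a node; rejects node IDs outside the array. */
static bool demo_charge(int nid, unsigned long bytes)
{
    if (nid < 0 || nid >= DEMO_MAX_NUMNODES)
        return false;
    demo_mem_info[nid].shared_bytes += bytes;
    return true;
}
```

For example, with the mask (1u << 1) | (1u << 3), demo_num_online() reports 2 nodes, yet node ID 3 would overrun a two-entry array.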

The shared memory regions of a CSV3 virtual machine may be backed by either standard 4 KB small pages or 2 MB large pages (hugetlb or Transparent Huge Pages, THP). KVM tracks all physical pages that back the shared memory regions of CSV3 virtual machines, but it does not record whether each physical page is a small page or a large page.
When a memory region of the virtual machine transitions from shared back to private, KVM or QEMU should release the corresponding physical pages. Because KVM currently does not know whether those pages are small or large, releasing them blindly is risky: if a single large page contains multiple small shared memory regions, releasing it prematurely could corrupt the regions that are still shared.
This patch introduces fine-grained tracking of how shared memory regions in CSV3 virtual machines use physical pages. A physical page is released only when it no longer contains any shared memory region.
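
The release rule described above can be sketched in userspace C as follows. This is only an illustration under assumed names, not the data structure used by the patch: struct demo_large_page and the demo_* helpers are hypothetical. Each 2 MB page keeps one bit per 4 KB subpage plus a count of shared subpages, and the page becomes releasable only when that count drops to zero.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SUBPAGES 512 /* number of 4 KB subpages in one 2 MB page */

/* Hypothetical tracking record for one 2 MB physical page. */
struct demo_large_page {
    uint64_t shared_bitmap[DEMO_SUBPAGES / 64]; /* one bit per 4 KB subpage */
    unsigned int shared_count;                  /* number of bits currently set */
};

/* Mark one 4 KB subpage as shared (private -> shared transition). */
static void demo_mark_shared(struct demo_large_page *lp, unsigned int idx)
{
    uint64_t bit;

    if (idx >= DEMO_SUBPAGES)
        return;
    bit = 1ULL << (idx % 64);
    if (!(lp->shared_bitmap[idx / 64] & bit)) {
        lp->shared_bitmap[idx / 64] |= bit;
        lp->shared_count++;
    }
}

/*
 * Mark one subpage as private again (shared -> private transition).
 * Returns true only when no shared subpage remains, i.e. when the
 * whole large page may be released.
 */
static bool demo_mark_private(struct demo_large_page *lp, unsigned int idx)
{
    uint64_t bit;

    if (idx >= DEMO_SUBPAGES)
        return false;
    bit = 1ULL << (idx % 64);
    if (lp->shared_bitmap[idx / 64] & bit) {
        lp->shared_bitmap[idx / 64] &= ~bit;
        lp->shared_count--;
    }
    return lp->shared_count == 0;
}
```

With two subpages shared, converting the first back to private returns false (the large page must stay), and converting the second returns true.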
Additionally, this patch adds a new interface for interaction with QEMU. When a memory region transitions from private to shared, QEMU can now directly pin (lock) the physical pages backing that shared region. When the virtual machine later accesses the shared memory and triggers a nested page fault (#NPF), the #NPF handler can immediately retrieve the already-pinned physical page, which speeds up #NPF resolution. This reduces contention when many CSV3 virtual machines handle #NPFs concurrently and would otherwise all try to pin pages at the same time, and so improves guest responsiveness.
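
The fast path described above can be pictured with the following userspace sketch. It does not show the actual KVM/QEMU interface, which is defined in the PR; the table layout, demo_prepin(), demo_handle_npf(), and the fake slow path are all hypothetical. The point is only that a fault on a pre-pinned guest frame is resolved by a lookup, while a fault on any other frame falls back to pinning inside the handler.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_TABLE_SIZE 64

/* Hypothetical record of one page pinned ahead of time. */
struct demo_pin_entry {
    bool valid;
    uint64_t gfn; /* guest frame number */
    uint64_t pfn; /* host frame number, already pinned */
};

struct demo_pin_table {
    struct demo_pin_entry slots[DEMO_TABLE_SIZE];
    unsigned long slow_path_pins; /* faults that had to pin in the handler */
};

/* Record a pre-pinned page at private -> shared time (linear probing). */
static bool demo_prepin(struct demo_pin_table *t, uint64_t gfn, uint64_t pfn)
{
    for (size_t i = 0; i < DEMO_TABLE_SIZE; i++) {
        struct demo_pin_entry *e = &t->slots[(gfn + i) % DEMO_TABLE_SIZE];

        if (!e->valid || e->gfn == gfn) {
            e->valid = true;
            e->gfn = gfn;
            e->pfn = pfn;
            return true;
        }
    }
    return false; /* table full */
}

/* Look up a pre-pinned page; stops at the first empty slot. */
static bool demo_lookup(const struct demo_pin_table *t, uint64_t gfn, uint64_t *pfn)
{
    for (size_t i = 0; i < DEMO_TABLE_SIZE; i++) {
        const struct demo_pin_entry *e = &t->slots[(gfn + i) % DEMO_TABLE_SIZE];

        if (!e->valid)
            return false;
        if (e->gfn == gfn) {
            *pfn = e->pfn;
            return true;
        }
    }
    return false;
}

/* Hypothetical slow path: stands in for pinning inside the fault handler. */
static uint64_t demo_slow_pin(struct demo_pin_table *t, uint64_t gfn)
{
    t->slow_path_pins++;
    return gfn + 0x1000; /* arbitrary fake mapping, for the demo only */
}

/* Fault handler sketch: use the pre-pinned page if there is one. */
static uint64_t demo_handle_npf(struct demo_pin_table *t, uint64_t gfn)
{
    uint64_t pfn;

    if (demo_lookup(t, gfn, &pfn))
        return pfn;
    return demo_slow_pin(t, gfn);
}
```

In this sketch, a fault on a frame registered with demo_prepin() leaves slow_path_pins unchanged, which mirrors the reduced work in the fault path.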

Comment 1 小龙 admin 2025-12-22 13:47:57 UTC
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/6244