Description of problem:
In our case there are two cpu/cpuacct/cpuset cgroups, A and A/B. The cfs_quota of both A and B is 1600000 (cfs_period = 100000). All tasks run in cgroup B. After running for a period of time, we found that the idle metric (in cpuacct.proc_stat) of A becomes a really large number, then overflows, and finally becomes a large number again.

Version-Release number of selected component (if applicable):

How reproducible:
yield.c:

#include <sched.h>
#include <pthread.h>

/* Each worker thread spins on sched_yield() forever. */
static void *yield_worker(void *args)
{
	for ( ; ; )
		sched_yield();
	return NULL;
}

int main(int argc, const char *argv[])
{
	pthread_t tid[256];
	int i;

	/* Spawn 256 yielding threads, then block on the first one. */
	for (i = 0; i < 256; i++)
		pthread_create(&tid[i], NULL, yield_worker, NULL);

	pthread_join(tid[0], NULL);
	return 0;
}

Steps to Reproduce:
1. Prepare the cpu cgroups:
   mkdir /sys/fs/cgroup/cpu/A
   echo 0-31 > /sys/fs/cgroup/cpu/A/cpuset.cpus
   echo 0 > /sys/fs/cgroup/cpu/A/cpuset.mems
   echo 50000 > /sys/fs/cgroup/cpu/A/cpu.cfs_quota_us
   mkdir /sys/fs/cgroup/cpu/A/B
   echo 0-31 > /sys/fs/cgroup/cpu/A/B/cpuset.cpus
   echo 0 > /sys/fs/cgroup/cpu/A/B/cpuset.mems
   echo 50000 > /sys/fs/cgroup/cpu/A/B/cpu.cfs_quota_us
2. Wait a long time (maybe a few days); we found that older cgroups reproduce the problem more easily.
3. Run the test case:
   cgexec -g cpu:/A/B ./yield
4. Show the results:
   cat /sys/fs/cgroup/cpu/A/cpuacct.proc_stat ; sleep 5; cat /sys/fs/cgroup/cpu/A/cpuacct.proc_stat
   cat /sys/fs/cgroup/cpu/A/B/cpuacct.proc_stat ; sleep 5; cat /sys/fs/cgroup/cpu/A/B/cpuacct.proc_stat

Actual results:
A (two reads, 5 seconds apart):
idle 2625715746
steal 49438

idle 3406279083
steal 50520

B (two reads, 5 seconds apart):
idle 137536530
steal 847084

idle 137536530
steal 862591

Expected results:
The idle metric of A should not increase too much in this case.

Additional info:
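The reproducer can be built with an ordinary pthread link, for example gcc -pthread -o yield yield.c (the build command is our assumption; it is not part of the original report).

The sketch below makes the "should not increase too much" expectation easier to check by printing the idle delta of cgroup A over a 5-second interval. The path and the "idle <value>" field layout are assumptions based on the output shown in Actual results.

#!/bin/sh
# Minimal sketch (our addition, not from the original report): sample the
# idle field of cgroup A twice, 5 seconds apart, and print the difference.
# Assumes cpuacct.proc_stat contains an "idle <value>" line as shown above.
STAT=/sys/fs/cgroup/cpu/A/cpuacct.proc_stat

idle1=$(awk '$1 == "idle" { print $2; exit }' "$STAT")
sleep 5
idle2=$(awk '$1 == "idle" { print $2; exit }' "$STAT")

echo "idle delta over 5s: $((idle2 - idle1))"

In the failing case shown above, A's idle grows by roughly 780 million over 5 seconds while B's idle does not change at all.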
The PR Link: https://gitee.com/anolis/cloud-kernel/pulls/2491