394 – [ANCK 4.19-devel] Host panic when running offline virtualized machine for a long time

Status:	CONFIRMED

Alias:	None

Product:	ANCK 4.19 Dev
Classification:	ANCK
Component:	sched
Sub Component:
Version:	4.19-026.x
Hardware:	All Linux

Importance:	P3-Medium S3-normal
Target Milestone:	---
Assignee:	CruzZhao
QA Contact:	shuming

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2022-01-25 17:33 UTC by baka233
Modified:	2022-01-29 15:30 UTC
CC List:	0 users

See Also:

Description baka233 2022-01-25 17:33:32 UTC

Description of problem:

When an offline virtualized container runs for a long time, the host kernel may occasionally crash with a divide-by-zero error in __cpuacct_get_usage_result.


Version-Release number of selected component (if applicable):


How reproducible:

Run an offline virtualized workload and repeatedly run `cat cpuacct.proc_stat_show` for a long time. The panic is intermittent and may not occur on every run.

Steps to Reproduce:


Actual results:
The host panics if the offline virtualized container runs for a long time.

Expected results:
The host works normally.

Additional info:
This bug is caused by a race condition when reading the per_cpu `kcpustats` variable: tick_user and tick_guest can be read in an inconsistent state, which makes `tick_user - tick_guest` negative.
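The arithmetic hazard described above can be sketched in user-space C. This is only an illustration, not the ANCK code: the struct and both function names are hypothetical, and it assumes the counters are unsigned 64-bit values, so a "negative" difference wraps around to a huge number. How that value leads to the divide-by-zero in __cpuacct_get_usage_result is not stated in this report. Clamping the difference is one possible mitigation, not a confirmed fix.

```c
#include <stdint.h>

/* Hypothetical snapshot of two per-CPU counters; field names follow the report. */
struct tick_snapshot {
    uint64_t tick_user;   /* user ticks (assumed to include guest ticks) */
    uint64_t tick_guest;  /* guest ticks */
};

/*
 * Naive difference. If the two fields were read at different moments,
 * tick_guest can be larger than tick_user and the unsigned subtraction
 * wraps around instead of going negative.
 */
uint64_t user_minus_guest_naive(const struct tick_snapshot *s)
{
    return s->tick_user - s->tick_guest;
}

/*
 * Clamped difference. An inconsistent snapshot is reported as zero
 * instead of a wrapped-around value.
 */
uint64_t user_minus_guest_clamped(const struct tick_snapshot *s)
{
    return s->tick_user >= s->tick_guest ? s->tick_user - s->tick_guest : 0;
}
```

With a consistent snapshot both helpers return the same value. They differ only when tick_guest is ahead of tick_user, which is the case the race produces.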

Comment 1 CruzZhao alibaba_cloud_group 2022-01-29 15:30:01 UTC

This is a rich container problem; Xunlei Pang may be able to help.