Bug 3521 - [Anolis8.8][ANCK-4.19.91][aarch64][dvd.iso][FT2500] anolis-8.8-aarch64-dvd.iso -> ANCK (4.19.91): after installing with the "Server" software selection, the checklist test that triggers a crash via echo c > /proc/sysrq-trigger does not generate a vmcore
Summary: [Anolis8.8][ANCK-4.19.91][aarch64][dvd.iso][FT2500]anolis-8.8-aarch64-dvd.iso...
Status: VERIFIED WONTFIX
Alias: None
Product: Anolis OS 8
Classification: Anolis OS
Component: kernel - anck-4.19
Version: 8.8
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: xiangzao
QA Contact: shuming
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-12-20 10:41 UTC by yunhe123
Modified: 2023-01-16 10:34 UTC
CC List: 8 users

See Also:


Attachments
kdump service running normally (193.77 KB, image/jpeg)
2022-12-20 10:41 UTC, yunhe123

Description yunhe123 alibaba_cloud_group 2022-12-20 10:41:47 UTC
Created attachment 557
kdump service running normally

Description of problem:
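After installing anolis-8.8-aarch64-dvd.iso with the ANCK (4.19.91) kernel and the "Server" software selection, the checklist test triggers a crash via echo c > /proc/sysrq-trigger, but no vmcore is generated. Details follow (two crashkernel settings were tried):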
# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.19.91-26.6.an8.aarch64 root=/dev/mapper/ao00-root ro crashkernel=512M rd.lvm.lv=ao00/root rd.lvm.lv=ao00/swap cgroup.memory=nokmem

# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.19.91-26.6.an8.aarch64 root=/dev/mapper/ao00-root ro crashkernel=1024M,high rd.lvm.lv=ao00/root rd.lvm.lv=ao00/swap cgroup.memory=nokmem
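
The two command lines above differ only in their crashkernel= value. As a minimal sketch for comparing such attempts, the helper below pulls that value out of a kernel command line. The function name crashkernel_arg is hypothetical and is not part of kexec-tools; it only does string matching and does not confirm that memory was actually reserved.

```shell
# Hypothetical helper: print the value of crashkernel= from a kernel command line.
# Reads the command line from $1, or from /proc/cmdline when no argument is given.
crashkernel_arg() {
    local cmdline="${1:-$(cat /proc/cmdline)}"
    local word
    # Rely on word splitting: kernel arguments are separated by whitespace.
    for word in $cmdline; do
        case "$word" in
            crashkernel=*) printf '%s\n' "${word#crashkernel=}" ;;
        esac
    done
}
```

Called without arguments on the running system, it reads /proc/cmdline; called with a string, it can be used to check a saved command line from a report like this one.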


See the attachment for a screenshot showing the kdump service running normally.
# echo c > /proc/sysrq-trigger
# cd /var/crash/
[root@localhost crash]# ll
total 0


Version information:
# cat /etc/os-release
NAME="Anolis OS"
VERSION="8.8"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="8.8"
PLATFORM_ID="platform:an8"
PRETTY_NAME="Anolis OS 8.8"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"

Kernel information:
# uname -r
4.19.91-26.6.an8.aarch64

Memory information:
# free -h
              total        used        free      shared  buff/cache   available
Mem:          503Gi       1.6Gi       501Gi       9.0Mi       264Mi       500Gi
Swap:         4.0Gi          0B       4.0Gi



Version-Release number of selected component (if applicable):


How reproducible:
Always reproducible

Steps to Reproduce:
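1. Install the anolis-8.8-aarch64-dvd.iso image on a Phytium physical machine, choose the 4.19.91 kernel and the "Server" software selection, then boot the installed system.
2. Change the crashkernel value to 512M and then to 1024M; the kdump service runs normally in both cases.
3. Run the checklist test: trigger a crash with echo c > /proc/sysrq-trigger. No vmcore is generated.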
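
Before triggering the crash in step 3, it can help to confirm that a crash kernel is loaded and that memory is reserved for it. The sketch below reads the kernel's kexec_crash_loaded and kexec_crash_size files under /sys/kernel. The function name kdump_ready and its optional directory argument are assumptions added for illustration. In this report the kdump service was already healthy, so a passing check would only rule out the first-kernel side and point the investigation at the capture kernel's boot.

```shell
# Hypothetical pre-check to run before "echo c > /proc/sysrq-trigger".
# $1 optionally overrides the sysfs directory (default /sys/kernel) for dry runs.
kdump_ready() {
    local sysdir="${1:-/sys/kernel}"
    local loaded size
    # Fall back to 0 when a file is missing or unreadable.
    loaded=$(cat "$sysdir/kexec_crash_loaded" 2>/dev/null) || loaded=0
    size=$(cat "$sysdir/kexec_crash_size" 2>/dev/null) || size=0
    echo "crash kernel loaded: $loaded, reserved bytes: $size"
    # Ready only if a crash kernel is loaded and memory is actually reserved.
    [ "$loaded" = "1" ] && [ "$size" -gt 0 ]
}
```

A typical use would be `kdump_ready || echo "kdump is not armed"`, run as root on the installed system.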

Actual results:
Triggering a crash with echo c > /proc/sysrq-trigger does not generate a vmcore.

Expected results:
Triggering a crash with echo c > /proc/sysrq-trigger generates a vmcore normally.

Additional info:
Comment 1 yunhe123 alibaba_cloud_group 2023-01-03 16:32:01 UTC
After applying the changes described in bug https://bugzilla.openanolis.cn/show_bug.cgi?id=3554, I verified that a vmcore is generated normally.


……
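The fix from bug 3554 is as follows:
For this issue, there are two reasons the vmcore was not generated.

The first is that the LVM logical volume was not found. The errors were: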
[   10.438424] dracut-initqueue[421]: Scanning devices nvme0n1p3 nvme1n1p1  for LVM logical volumes ao00/root
[   10.500609] dracut-initqueue[446]: inactive '/dev/ao/swap' [4.00 GiB] inherit
[   10.556198] dracut-initqueue[446]: inactive '/dev/ao/home' [3.56 TiB] inherit
[   10.612197] dracut-initqueue[446]: inactive '/dev/ao/root' [70.00 GiB] inherit
[   10.668256] dracut-initqueue[448]: Volume group "ao00" not found
[   10.716078] dracut-initqueue[448]: Cannot process volume group ao00
This can be resolved by installing the system on the NVMe disk rather than on the sda disk.
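
In the log above, dracut is asked for volume group ao00 while the scan lists volumes under /dev/ao. One way to surface such a mismatch is to list the volume groups that the kernel command line requests and compare them with what LVM reports. This is only a diagnostic sketch and not the fix from bug 3554; the function name requested_vgs is hypothetical.

```shell
# Hypothetical helper: print the unique volume group names referenced by
# rd.lvm.lv=<vg>/<lv> arguments in a kernel command line ($1, default /proc/cmdline).
requested_vgs() {
    local cmdline="${1:-$(cat /proc/cmdline)}"
    local word lv
    for word in $cmdline; do
        case "$word" in
            rd.lvm.lv=*/*)
                lv="${word#rd.lvm.lv=}"      # strip the argument name
                printf '%s\n' "${lv%%/*}"    # keep only the VG part before "/"
                ;;
        esac
    done | sort -u
}
```

On the running system, the output can be compared by eye with `vgs --noheadings -o vg_name`, which lists the volume groups LVM actually sees.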

The second is that the crash-kernel parameter nr_cpus in /etc/sysconfig/kdump is set to 1. It should be set to 2; otherwise the following errors appear:
[  744.428762] nvme nvme2: I/O 57 QID 1 timeout, completion polled
[  744.429611] nvme nvme0: I/O 9 QID 1 timeout, completion polled
[  744.430372] nvme nvme1: I/O 25 QID 1 timeout, completion polled

After both issues are resolved, rc2 generates the vmcore normally; this is not an rc2 kernel problem.
Comment 2 xiangzao alibaba_cloud_group 2023-01-03 16:36:33 UTC
Verified that this is not a kernel issue; setting the status to WONTFIX.