Created attachment 557
kdump service running normally

Description of problem:
After installing from anolis-8.8-aarch64-dvd.iso with the ANCK (4.19.91) kernel and the "Server" software selection, checklist testing triggers a crash with echo c >/proc/sysrq-trigger, but no vmcore is generated. Details:

# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.19.91-26.6.an8.aarch64 root=/dev/mapper/ao00-root ro crashkernel=512M rd.lvm.lv=ao00/root rd.lvm.lv=ao00/swap cgroup.memory=nokmem

# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.19.91-26.6.an8.aarch64 root=/dev/mapper/ao00-root ro crashkernel=1024M,high rd.lvm.lv=ao00/root rd.lvm.lv=ao00/swap cgroup.memory=nokmem

See the attachment for a screenshot showing the kdump service running normally.

# echo c >/proc/sysrq-trigger
# cd /var/crash/
[root@localhost crash]# ll
total 0

Version information:
# cat /etc/os-release
NAME="Anolis OS"
VERSION="8.8"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="8.8"
PLATFORM_ID="platform:an8"
PRETTY_NAME="Anolis OS 8.8"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"

Kernel information:
# uname -r
4.19.91-26.6.an8.aarch64

Memory information:
# free -h
              total        used        free      shared  buff/cache   available
Mem:          503Gi       1.6Gi       501Gi       9.0Mi       264Mi       500Gi
Swap:         4.0Gi          0B       4.0Gi

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. On a Phytium physical machine, install the anolis-8.8-aarch64-dvd.iso image, choosing the 4.19.91 kernel and the "Server" software selection, then boot the installed system.
2. Change the crashkernel value to 512M and then to 1024M; in both cases the kdump service runs normally.
3. Run the checklist test: trigger a crash with echo c >/proc/sysrq-trigger. No vmcore is generated.

Actual results:
Triggering a crash with echo c >/proc/sysrq-trigger does not generate a vmcore.

Expected results:
Triggering a crash with echo c >/proc/sysrq-trigger generates a vmcore.

Additional info:
After applying the changes described in bug https://bugzilla.openanolis.cn/show_bug.cgi?id=3554, I verified that the vmcore is generated normally.
……
The fix from bug 3554 is as follows:

In this issue there are two reasons why the vmcore was not generated.

The first is that the LVM logical volume was not found. The error is:
[ 10.438424] dracut-initqueue[421]: Scanning devices nvme0n1p3 nvme1n1p1 for LVM logical volumes ao00/root
[ 10.500609] dracut-initqueue[446]: inactive '/dev/ao/swap' [4.00 GiB] inherit
[ 10.556198] dracut-initqueue[446]: inactive '/dev/ao/home' [3.56 TiB] inherit
[ 10.612197] dracut-initqueue[446]: inactive '/dev/ao/root' [70.00 GiB] inherit
[ 10.668256] dracut-initqueue[448]: Volume group "ao00" not found
[ 10.716078] dracut-initqueue[448]: Cannot process volume group ao00
This can be resolved by installing the system on the nvme disk rather than on the sda disk.

The second is that the nr_cpus parameter for the crash kernel in /etc/sysconfig/kdump is set to 1. It should be set to 2; otherwise the following errors appear:
[ 744.428762] nvme nvme2: I/O 57 QID 1 timeout, completion polled
[ 744.429611] nvme nvme0: I/O 9 QID 1 timeout, completion polled
[ 744.430372] nvme nvme1: I/O 25 QID 1 timeout, completion polled

With both problems resolved, rc2 generates the vmcore normally, so this is not an rc2 kernel problem.
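For reference, the nr_cpus change could be scripted roughly as below. This is only a sketch: the helper name set_kdump_nr_cpus is hypothetical, and it assumes that nr_cpus appears inside the KDUMP_COMMANDLINE_APPEND line of /etc/sysconfig/kdump, as in the stock RHEL-family kdump configuration. That has not been confirmed for this machine. It also assumes GNU sed.

```shell
# Hypothetical helper (not part of kexec-tools): set nr_cpus=<n> in a kdump
# sysconfig file. Assumes nr_cpus sits on the KDUMP_COMMANDLINE_APPEND line.
set_kdump_nr_cpus() {
    conf="$1"   # path to the config file, e.g. /etc/sysconfig/kdump
    n="$2"      # desired nr_cpus value, e.g. 2
    # Restrict the substitution to the KDUMP_COMMANDLINE_APPEND line and
    # replace an existing nr_cpus=<digits> token with the new value.
    sed -i -E "/^KDUMP_COMMANDLINE_APPEND=/ s/nr_cpus=[0-9]+/nr_cpus=${n}/" "$conf"
}
```

On the affected system this would presumably be called as set_kdump_nr_cpus /etc/sysconfig/kdump 2, followed by a restart of the kdump service (for example systemctl restart kdump) so that the crash kernel is reloaded with the new command line.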
Verified that this is not a kernel issue; setting the status to wontfix.