[Summary]
On Anolis8 with the kernel-debug ck-4.19 kernel on aarch64, during internal nightly testing, the blktests case block/024 fails.

[Steps to reproduce]
Partition the disk beforehand, then run:
git clone https://github.com/osandov/blktests.git
cd blktests/
make
make install prefix=./run_blktests
cd ./run_blktests
export TEST_DEVS=/dev/nvme0n1p1
cd blktests/
./check block/024

[Expected result]
The case passes.

[Actual result]
[root@l57f12084 blktests]# ./check block/024
block/024 (do I/O faster than a jiffy and check iostats times) [failed]
    runtime  ...  2.839s
    --- tests/block/024.out 2022-11-17 22:19:44.767827410 +0800
    +++ /root/blktests/run_blktests/blktests/results/nodev/block/024.out.bad 2022-11-17 22:36:02.571827410 +0800
    @@ -6,5 +6,5 @@
     read 1 s
     write 1 s
     read 2 s
    -write 3 s
    +write 2 s
     Test complete

[Test environment]
[root@l57f12084 blktests]# uname -r
4.19.91-540.git.31394c95a.an8.aarch64+debug
[root@l57f12084 blktests]# cat /etc/redhat-release
Anolis OS release 8.6
[root@l57f12084 blktests]# free -g
              total        used        free      shared  buff/cache   available
Mem:            657          20         635           0           1         633
Swap:             1           0           1
[root@l57f12084 blktests]# df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        329G     0  329G   0% /dev
tmpfs           329G     0  329G   0% /dev/shm
tmpfs           329G   11M  329G   1% /run
tmpfs           329G     0  329G   0% /sys/fs/cgroup
/dev/sda2        49G  5.3G   42G  12% /
/dev/sda1      1022M  6.7M 1016M   1% /boot/efi
tmpfs            66G     0   66G   0% /run/user/0
[root@l57f12084 blktests]# lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              96
On-line CPU(s) list: 0-95
Thread(s) per core:  1
Core(s) per socket:  48
Socket(s):           2
NUMA node(s):        1
Vendor ID:           HiSilicon
BIOS Vendor ID:      HiSilicon
Model:               0
Model name:          Kunpeng-920
BIOS Model name:     HUAWEI Kunpeng 920 5250
Stepping:            0x1
CPU max MHz:         2600.0000
CPU min MHz:         200.0000
BogoMIPS:            200.00
L1d cache:           64K
L1i cache:           64K
L2 cache:            512K
L3 cache:            24576K
NUMA node0 CPU(s):   0-95
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
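Earlier versions show the same failure, so this is not a newly introduced regression.
The test passes on the latest 4.19 kernel; the failure has not been reproduced there yet.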
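The problem has been located; the fix is still being worked out.
Root cause: the block/024 case starts two dd processes in the background so that they write to null_blk in parallel, which allows their I/Os to be merged. Occasionally too many I/Os are merged, so fewer write I/Os are counted and the accumulated write time comes out low. The total write time printed by the test case then falls below the expected value (write 2 s instead of write 3 s) and the test fails.
This is not a kernel problem. The fix is to add a wait between the two background dd write commands in the blktests block/024 test case.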
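To illustrate the shape of that fix, here is a minimal sketch, not the actual block/024 source. The function name run_serialized_writes is hypothetical, and an ordinary file stands in for the null_blk device so the snippet can run without root. The point is only the ordering: the first background dd is waited for before the second one starts, so the two writers never overlap and their I/Os cannot be merged with each other.

```shell
#!/bin/sh
# Hypothetical helper (not from blktests): start two background dd writers,
# but wait for the first to finish before launching the second.
# $1 is the write target; a plain file is assumed here in place of null_blk.
run_serialized_writes() {
    target=$1

    # First writer: one 4 KiB block at offset 0, started in the background.
    dd if=/dev/zero of="$target" bs=4096 count=1 conv=notrunc 2>/dev/null &
    # The added wait: block until the first writer has completed.
    wait $!

    # Second writer: one 4 KiB block at offset 4096, also in the background.
    dd if=/dev/zero of="$target" bs=4096 count=1 seek=1 conv=notrunc 2>/dev/null &
    wait $!
}
```

Without the first wait, both dd processes would be in flight at the same time, which is the condition the root-cause comment identifies as leading to merged I/Os.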
already fixed