Bug 4593 - [Anolis 23][ANCK-5.10-14-rc2] Mounting a GlusterFS distributed file system fails: Mounting glusterfs on /data/ failed.
Summary: [Anolis 23][ANCK-5.10-14-rc2] Mounting a GlusterFS distributed file system fails: Mounting glusterfs on /data...
Status: IN_PROGRESS
Alias: None
Product: Anolis OS 23
Classification: Anolis OS
Component: BaseOS Packages
Version: 23.0
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: happy_orange
QA Contact: bolong_tbl
Reported: 2023-03-23 11:35 UTC by Banana
Modified: 2023-06-15 12:18 UTC

Description Banana alibaba_cloud_group 2023-03-23 11:35:25 UTC
[Problem description]: Mounting the GlusterFS distributed file system fails with an error:
[root@qibo-anck014-an23-g6r-1 ~]# mount.glusterfs node1:dis-volume /data/
grep: warning: stray \ before -
Fatal glibc error: malloc assertion failure in sysmalloc: (old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)
/usr/sbin/mount.glusterfs: line 125: 795870 Aborted                 (core dumped) $cmd_line
Mounting glusterfs on /data/ failed.

[Environment]:
Machine type: ECS

[Kernel information]:
Client (mounting side):
[root@qibo-anck014-an23-g6r-1 ~]# uname -r
5.10.134-14_rc2.1.an23.aarch64

Server 1 (running GlusterFS):
[root@iZbp14fzsphchtmh4wcau9Z ~]# uname -r
5.10.134-13.2_alpha1.an23.x86_64

Server 2 (running GlusterFS):
[root@iZbp14fzsphchtmh4wcauaZ ~]# uname -r
5.10.134-13.2_alpha1.an23.x86_64

[Operating system information]:
Client (mounting side):
[root@qibo-anck014-an23-g6r-1 ~]# cat /etc/os-release
NAME="Anolis OS"
VERSION="23"
ID="anolis"
VERSION_ID="23"
PLATFORM_ID="platform:an23"
PRETTY_NAME="Anolis OS 23"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"
BUG_REPORT_URL="https://bugzilla.openanolis.cn/"

Server 1 (running GlusterFS):
[root@iZbp14fzsphchtmh4wcau9Z ~]# uname -r
5.10.134-13.2_alpha1.an23.x86_64
[root@iZbp14fzsphchtmh4wcau9Z ~]# cat /etc/os-release
NAME="Anolis OS"
VERSION="23"
ID="anolis"
VERSION_ID="23"
PLATFORM_ID="platform:an23"
PRETTY_NAME="Anolis OS 23"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"
BUG_REPORT_URL="https://bugzilla.openanolis.cn/"

Server 2 (running GlusterFS):
[root@iZbp14fzsphchtmh4wcauaZ ~]# cat /etc/os-release
NAME="Anolis OS"
VERSION="23"
ID="anolis"
VERSION_ID="23"
PLATFORM_ID="platform:an23"
PRETTY_NAME="Anolis OS 23"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"
BUG_REPORT_URL="https://bugzilla.openanolis.cn/"

[Reproducibility]: always

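[Steps to reproduce]:

Prerequisites (install and deploy glusterd):
1. On the client, install the packages: yum install glusterfs glusterfs-fuse
2. On the servers, install the packages: yum -y install glusterfs glusterfs-server glusterfs-fuse glusterfs-rdma
3. On all machines, configure /etc/hosts in the following form:
Server1_ip node1    (for example: 192.168.1.1 node1)
Server2_ip node2
4. On the servers, mount the data disk, for example mount /dev/vdb1 on /data:
mount /dev/vdb1 /data/
5. On the servers and the client, run systemctl restart glusterd to restart the service.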

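Test steps (create and mount):
1. On a server, run the following command to add the peer and form the cluster (running it on one machine is enough; the local node does not need to be probed):
gluster peer probe node1
2. On a server, create the distributed volume (running it on one machine is enough):
gluster volume create dis-volume node1:/data node2:/data force
3. On a server, run gluster volume list to view the volume list.
4. On a server, run gluster volume start dis-volume to start the new volume.
5. On a server, run gluster volume info dis-volume to view the details of the distributed volume.
6. On the client, create the mount directory, for example: mkdir /data
7. On the client, run mount.glusterfs node1:dis-volume /data to mount the distributed file system.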
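
For convenience, the test steps above can be collected into two shell functions. This is only a sketch under the assumptions of this report (host names node1 and node2, volume dis-volume, brick path /data). The run helper and the DRY_RUN variable are hypothetical additions for previewing the commands and are not part of gluster; mkdir -p is used instead of plain mkdir so that a rerun does not fail when /data already exists.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Steps 1 to 5: run on one server only.
server_create_volume() {
    run gluster peer probe node1 &&
    run gluster volume create dis-volume node1:/data node2:/data force &&
    run gluster volume list &&
    run gluster volume start dis-volume &&
    run gluster volume info dis-volume
}

# Steps 6 and 7: run on the client.
client_mount_volume() {
    run mkdir -p /data &&
    run mount.glusterfs node1:dis-volume /data
}
```

Calling DRY_RUN=1 client_mount_volume prints the two client commands without touching the system; calling client_mount_volume without DRY_RUN executes them.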
Comment 1 Banana alibaba_cloud_group 2023-03-23 11:40:41 UTC
The mount also fails on Anolis 8. From what I could find, the cause may be that the gluster version is too old:

Commands run:
[2023-03-22 16:37:34]  [root@iZbp11lagjmwtlkrtvp6hgZ ~]# mount.glusterfs node1:dis-volume
[2023-03-22 16:38:04]  ERROR: Server name/volume name unspecified cannot proceed further..
[2023-03-22 16:38:04]  Please specify correct format
[2023-03-22 16:38:04]  Usage:
[2023-03-22 16:38:04]  man 8 /usr/sbin/mount.glusterfs
[2023-03-22 16:38:04]  [root@iZbp11lagjmwtlkrtvp6hgZ ~]# mkdir /opt/test
[2023-03-22 16:38:15]  [root@iZbp11lagjmwtlkrtvp6hgZ ~]# mount.glusterfs node1:dis-volume /opt/test/
[2023-03-22 16:38:22]  Mounting glusterfs on /opt/test/ failed.
[2023-03-22 16:39:04]  [root@iZbp11lagjmwtlkrtvp6hgZ ~]# tac /var/log/messages
[2023-03-22 16:39:26]  Mar 22 16:39:04 iZbp11lagjmwtlkrtvp6hgZ systemd[1]: opt-test.mount: Succeeded.
[2023-03-22 16:39:26]  Mar 22 16:38:22 iZbp11lagjmwtlkrtvp6hgZ systemd-udevd[531]: Network interface NamePolicy= disabled on kernel command line, ignoring.
[2023-03-22 16:39:26]  Mar 22 16:37:34 iZbp11lagjmwtlkrtvp6hgZ systemd[1]: Started Dynamically Generate Message Of The Day.
[2023-03-22 16:39:26]  Mar 22 16:37:33 iZbp11lagjmwtlkrtvp6hgZ systemd[1]: Starting Dynamically Generate Message Of The Day...


Log output:
[root@iZbp11lagjmwtlkrtvp6hgZ ~]# cat /var/log/glusterfs/opt-test.log
[2023-03-22 08:40:02.827114] I [MSGID: 100030] [glusterfsd.c:2868:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 6.0 (args: /usr/sbin/glusterfs --process-name fuse --volfile-server=node1 --volfile-id=dis-volume /opt/test)
[2023-03-22 08:40:02.827646] I [glusterfsd.c:2577:daemonize] 0-glusterfs: Pid of current running process is 32735
[2023-03-22 08:40:02.832939] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2023-03-22 08:40:02.833171] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0
[2023-03-22 08:40:45.085347] I [glusterfsd-mgmt.c:2451:mgmt_rpc_notify] 0-glusterfsd-mgmt: disconnected from remote-host: node1
[2023-03-22 08:40:45.085378] I [glusterfsd-mgmt.c:2471:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2023-03-22 08:40:45.085531] W [glusterfsd.c:1583:cleanup_and_exit] (-->/lib64/libgfrpc.so.0(+0xf953) [0x7f6e66c0f953] -->/usr/sbin/glusterfs(+0x137c7) [0x5556742917c7] -->/usr/sbin/glusterfs(cleanup_and_exit+0x58) [0x5556742865c8] ) 0-: received signum (1), shutting down
[2023-03-22 08:40:45.085553] I [fuse-bridge.c:6966:fini] 0-fuse: Unmounting '/opt/test'.
[2023-03-22 08:40:45.085753] I [fuse-bridge.c:6971:fini] 0-fuse: Closing fuse connection to '/opt/test'.
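
On a longer client log it can help to pull out only the lines that explain the failure. The following is a minimal sketch, assuming the log format shown above, in which the third whitespace-separated field is the severity letter. The function name gluster_log_highlights is hypothetical and not part of glusterfs; the two message patterns are taken from the log in this comment.

```shell
#!/bin/sh
# Read a glusterfs log on stdin and keep:
#   - lines whose severity letter is W (warning), E (error) or C (critical)
#   - the informational lines that report a lost volfile server connection
gluster_log_highlights() {
    awk '$3 ~ /^[WEC]$/ || /disconnected from remote-host/ || /Exhausted all volfile servers/'
}
```

Usage: gluster_log_highlights < /var/log/glusterfs/opt-test.log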
Comment 2 扣肉 2023-06-15 12:12:31 UTC
11.0 is currently the latest glusterfs release.

The error looks very similar to this one:
https://bugzilla.suse.com/show_bug.cgi?id=1209702
Comment 3 扣肉 2023-06-15 12:18:41 UTC
Judging from the build output, the x86-64 build references tcmalloc_minimal and should not hit this error; the aarch64 build is not linked against tcmalloc_minimal and may hit it.
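
One way to confirm the linkage difference on each architecture is to inspect the ldd output of the client binary. This is a sketch, not a verified diagnosis: the function name links_tcmalloc_minimal is hypothetical, and the binary path /usr/sbin/glusterfs is taken from the log in comment 1. A library loaded at run time rather than at link time would not appear in ldd output.

```shell
#!/bin/sh
# Read ldd output on stdin and report whether libtcmalloc_minimal is listed.
links_tcmalloc_minimal() {
    if grep -q 'libtcmalloc_minimal'; then
        echo "linked against tcmalloc_minimal"
    else
        echo "not linked against tcmalloc_minimal"
    fi
}
```

Usage: ldd /usr/sbin/glusterfs | links_tcmalloc_minimal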