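Description of problem:
On the sysom web UI, on the "Migration Implementation" (迁移实施) page, when running a migration operation on a single machine to be migrated, or batch-migrating several machines, an operation occasionally starts and its "Migration Status" (迁移状态) then stays at "Running" (运行中) for a long time without updating, even though the corresponding command has already finished on the machine to be upgraded. While in this state, no other operation can be run on that machine; the only way out is to modify the "Migration Status" field directly in the database.

Version-Release number of selected component (if applicable):
Custom code based on the commit below. The task dispatch and status update logic has not been modified:
https://gitee.com/anolis/sysom/commit/0e2c489be1fdec87e8067c68fcff30e8c7c0cc62
Two later updates related to this issue were merged:
https://gitee.com/anolis/sysom/commit/7fd2c54e451e0f878694af76d0dd00f391cd6dd2
https://gitee.com/anolis/sysom/commit/546fb1753cb70fc041320820fc49dbf6831f5765

How reproducible:
Intermittent, in specific environments.

Steps to Reproduce:
1. Add the machines to be upgraded to sysom and open the "Migration Implementation" page.
2. Run a migration task from the "Operation" (操作) drop-down menu, or batch-migrate the added machines.
3. Check whether the "Migration Status" column updates correctly.

Actual results:
The command for a migration task (for example, system backup or risk assessment) finished on the machine to be upgraded a long time ago, but the "Migration Status" of that machine on the sysom page is still "Running".

Expected results:
Once the command for a migration task has finished on the machine to be upgraded, the "Migration Status" of that machine on the sysom page changes from "Running" to a normal state, either "Ready" (就绪中) or "Success" (成功).

Additional info:
For machines stuck at "Running", the job_result field of the corresponding task in the sysom.mig_job table is empty.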
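After working through this with community experts, we found an error in the logs: the connection was reset while the task result was being written to the database, so the result was never written, and as a consequence the migration status was not updated. As a temporary workaround, raise the database connection limit. Specifically, edit /etc/my.cnf and add:

[mysqld]
max_connections=2000

Save and exit, then restart mysql:

systemctl restart mysqld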
(In reply to camel from comment #1)
> After working through this with community experts, we found an error in the
> logs: the connection was reset while the task result was being written to the
> database, so the result was never written, and as a consequence the migration
> status was not updated. As a temporary workaround, raise the database
> connection limit. Specifically, edit /etc/my.cnf and add:
> [mysqld]
> max_connections=2000
> Save and exit, then restart mysql:
> systemctl restart mysqld

After raising the mysql maximum connection count as described above, the stale migration status did not recur during initial use in that environment. In other sysom deployments, however, the migration status still fails to update. The migration-error log file shows that the database connection is still failing, as follows:

1. An environment-preparation yum install task was dispatched. After the yum command finished successfully on the host to be migrated, the following error was reported:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/local/sysom/server/virtualenv/lib64/python3.6/site-packages/pymysql/connections.py", line 756, in _write_bytes
    self._sock.sendall(data)
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/sysom/server/virtualenv/lib64/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/sysom/server/virtualenv/lib64/python3.6/site-packages/django/db/backends/mysql/base.py", line 73, in execute
    return self.cursor.execute(query, args)
  File "/usr/local/sysom/server/virtualenv/lib64/python3.6/site-packages/pymysql/cursors.py", line 148, in execute
    result = self._query(query)
  File "/usr/local/sysom/server/virtualenv/lib64/python3.6/site-packages/pymysql/cursors.py", line 310, in _query
    conn.query(q)
  File "/usr/local/sysom/server/virtualenv/lib64/python3.6/site-packages/pymysql/connections.py", line 547, in query
    self._execute_command(COMMAND.COM_QUERY, sql)
  File "/usr/local/sysom/server/virtualenv/lib64/python3.6/site-packages/pymysql/connections.py", line 814, in _execute_command
    self._write_bytes(packet)
  File "/usr/local/sysom/server/virtualenv/lib64/python3.6/site-packages/pymysql/connections.py", line 760, in _write_bytes
    CR.CR_SERVER_GONE_ERROR, "MySQL server has gone away (%r)" % (e,)
pymysql.err.OperationalError: (2006, "MySQL server has gone away (ConnectionResetError(104, 'Connection reset by peer'))")
2. A migration-assessment leapp command was dispatched. The leapp program finished and the assessment report leapp-report.txt could be retrieved, but the database connection failed with the following error:

[INFO] -- 2023-07-31 19:14:02 -- P_ 6773_T_140342677989120 - <channel:67>: host 10.170.113.101 get file /var/log/bclinux-sysmt/leapp-report.txt to /tmp/migration/imp/10.170.113.101/mig_ass_report.log
[INFO] -- 2023-07-31 19:14:02 -- P_ 6773_T_140342677989120 - <channel:68>: {'code': 0, 'err_msg': '', 'result': '', 'echo': {}, 'job_id': '', 'is_finished': True}
[INFO] -- 2023-07-31 19:14:02 -- P_ 6773_T_140343228966656 - <channel:67>: host 10.170.113.100 get file /var/tmp/state.json to /tmp/migration/imp/10.170.113.100/mig_imp_rate.log
[INFO] -- 2023-07-31 19:14:02 -- P_ 6773_T_140343228966656 - <channel:68>: {'code': 0, 'err_msg': '', 'result': '', 'echo': {}, 'job_id': '', 'is_finished': True}
/usr/local/sysom/server/virtualenv/lib64/python3.6/site-packages/pymysql/_auth.py:8: CryptographyDeprecationWarning: Python 3.6 is no longer supported by the Python core team. Therefore, support for it is deprecated in cryptography and will be removed in a future release.
  from cryptography.hazmat.backends import default_backend
[INFO] -- 2023-07-31 19:14:06 -- P_ 11450_T_140343521056576 - <apps:24>: >>> Migration module loading success
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/local/sysom/server/virtualenv/lib64/python3.6/site-packages/pymysql/connections.py", line 756, in _write_bytes
    self._sock.sendall(data)
BrokenPipeError: [Errno 32] Broken pipe

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/sysom/server/virtualenv/lib64/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/sysom/server/virtualenv/lib64/python3.6/site-packages/django/db/backends/mysql/base.py", line 73, in execute
    return self.cursor.execute(query, args)
  File "/usr/local/sysom/server/virtualenv/lib64/python3.6/site-packages/pymysql/cursors.py", line 148, in execute
    result = self._query(query)
  File "/usr/local/sysom/server/virtualenv/lib64/python3.6/site-packages/pymysql/cursors.py", line 310, in _query
    conn.query(q)
  File "/usr/local/sysom/server/virtualenv/lib64/python3.6/site-packages/pymysql/connections.py", line 547, in query
    self._execute_command(COMMAND.COM_QUERY, sql)
  File "/usr/local/sysom/server/virtualenv/lib64/python3.6/site-packages/pymysql/connections.py", line 814, in _execute_command
    self._write_bytes(packet)
  File "/usr/local/sysom/server/virtualenv/lib64/python3.6/site-packages/pymysql/connections.py", line 760, in _write_bytes
    CR.CR_SERVER_GONE_ERROR, "MySQL server has gone away (%r)" % (e,)
……
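Both tracebacks show the result write failing in a worker thread on a socket that the server had already closed. One possible way to make such a write more tolerant is to discard the dead connection and retry once or twice. The following is only a minimal sketch of that idea and is not taken from the sysom code: `write_with_reconnect`, `StaleConnectionError`, and both callables are hypothetical names. In a Django deployment the exception would presumably be `pymysql.err.OperationalError` or `django.db.OperationalError`, and the reconnect step might be `django.db.close_old_connections()`.

```python
import time


class StaleConnectionError(Exception):
    """Stand-in for the driver's 'MySQL server has gone away' error (assumption)."""


def write_with_reconnect(write, reconnect, retries=2, delay=0.0,
                         retry_on=(StaleConnectionError,)):
    """Call write(); on a stale-connection error, reconnect and try again.

    write     -- hypothetical callable that saves the job result
    reconnect -- hypothetical callable that drops the dead connection
    retries   -- number of extra attempts after the first failure
    """
    for attempt in range(retries + 1):
        try:
            return write()
        except retry_on:
            if attempt == retries:
                raise  # out of retries: let the caller see the error
            reconnect()
            time.sleep(delay)


if __name__ == "__main__":
    state = {"writes": 0}

    def flaky_write():
        # Fails once, as if the first attempt hit a closed socket.
        state["writes"] += 1
        if state["writes"] == 1:
            raise StaleConnectionError("MySQL server has gone away")
        return "saved"

    print(write_with_reconnect(flaky_write, reconnect=lambda: None))
```

Retrying only on the connection error, and re-raising once the attempts are used up, keeps genuine failures visible in the migration-error log instead of hiding them.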
"MySQL server has gone away": this is most likely caused by the database connection timing out. It can be fixed as follows:

1. Edit /etc/my.cnf and add:

[mysqld]
max_connections=2000

Save and exit, then restart mysql:

systemctl restart mysqld

2. Open the file /sysom_server/sysom_migration/conf/commom.py under the sysom installation directory, find DATABASES, and add 'CONN_MAX_AGE': 3600 to the default entry, for example:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'xx',
        'USER': 'xx',
        'PASSWORD': 'xx',
        'HOST': 'xx',
        'PORT': 'xx',
        'CONN_MAX_AGE': 3600,
    }
}

Then restart the sysom_migration service:

supervisorctl restart sysom-migration
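When setting CONN_MAX_AGE, the Django documentation recommends keeping it below the idle timeout of the database server, so that Django retires a connection before the server drops it. The sketch below checks that relationship. `conn_max_age_is_safe` is a hypothetical helper, and the value 28800 is the MySQL default for wait_timeout, which is assumed here to be unchanged; the real value can be read with SHOW VARIABLES LIKE 'wait_timeout'.

```python
def conn_max_age_is_safe(databases, wait_timeout, alias="default"):
    """Return True if Django would retire a connection before MySQL drops it.

    databases    -- a Django-style DATABASES dict
    wait_timeout -- the MySQL server's wait_timeout in seconds
    """
    # Django's default is 0: close the connection at the end of each request.
    max_age = databases[alias].get("CONN_MAX_AGE", 0)
    if max_age is None:
        # None means "keep the connection forever", so MySQL times out first.
        return False
    return max_age < wait_timeout


if __name__ == "__main__":
    DATABASES = {
        "default": {
            "ENGINE": "django.db.backends.mysql",
            "CONN_MAX_AGE": 3600,
        }
    }
    # 28800 is the MySQL default wait_timeout (assumed unchanged here).
    print(conn_max_age_is_safe(DATABASES, wait_timeout=28800))
```

Django applies CONN_MAX_AGE at the start and end of each request, so a long-running worker thread outside the request cycle may still hold a stale connection and could need explicit handling.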