Bug 5438 - [Anolis23 GA][Software compatibility] kubeadm init with Kubernetes 1.27.2 fails to pull images: [ERROR ImagePull]: failed to pull image registry.k8s.io/kube-apiserver:v1.27.0

Status:	CLOSED FIXED

Alias:	None

Product:	Anolis OS 23
Classification:	Anolis OS
Component:	BaseOS Packages
Version:	unspecified
Hardware:	All Linux

Importance:	P3-Medium S3-normal
Target Milestone:	---
Assignee:	happy_orange

Reported:	2023-06-07 14:24 UTC by Janos
Modified:	2023-07-03 16:09 UTC
CC List:	12 users


Description Janos alibaba_cloud_group

2023-06-07 14:24:04 UTC

[Defect description]:
  On a cloud ECS instance, kubeadm init with Kubernetes 1.27.2 fails to pull images: [ERROR ImagePull]: failed to pull image registry.k8s.io/kube-apiserver:v1.27.0


[Reproduction environment]:
Environment: cloud ECS instance
OS: Anolis 23 x86_64/aarch64

# cat /etc/os-release
NAME="Anolis OS"
VERSION="23"
ID="anolis"
VERSION_ID="23"
PLATFORM_ID="platform:an23"
PRETTY_NAME="Anolis OS 23"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"
BUG_REPORT_URL="https://bugzilla.openanolis.cn/"

Kernel:
# uname -r
5.10.134-14.an23.x86_64

Repository information:
Repo-id            : AppStream-Nightly
Repo-name          : AnolisOS-23 - AppStream
Repo-baseurl       : http://mirrors.openanolis.cn/anolis/23/Nightly/AppStream/x86_64/os
Repo-filename      : /etc/yum.repos.d/AnolisOS-Nightly.repo

Repo-id            : BaseOS-Nightly
Repo-name          : AnolisOS-23 - BaseOS
Repo-baseurl       : http://mirrors.openanolis.cn/anolis/23/Nightly/BaseOS/x86_64/os
Repo-filename      : /etc/yum.repos.d/AnolisOS-Nightly.repo


[Steps to reproduce]:
Reference (SIG document): https://openanolis.cn/sig/third_software_compatibility/doc/426352745466167442

# Install containerd
yum install -y containerd
containerd config default > /etc/containerd/config.toml

# Edit the containerd config: enable the systemd cgroup driver and use the Aliyun image mirror
sed -i 's|SystemdCgroup = .*|SystemdCgroup = true|g' /etc/containerd/config.toml
sed -i 's|sandbox_image = .*|sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"|g' /etc/containerd/config.toml

# Start containerd
systemctl daemon-reload
systemctl start containerd

# Install Kubernetes
yum install -y kubernetes kubernetes-kubeadm cri-tools

# Generate the init config file; switch the container runtime to containerd and the image source to the Aliyun mirror
kubeadm config print init-defaults > /root/init.yaml
sed -i 's|/var/run/dockershim.sock|/run/containerd/containerd.sock|g' /root/init.yaml
sed -i "s|k8s.gcr.io|registry.aliyuncs.com/google_containers|g" /root/init.yaml
sed -i "s|advertiseAddress: .*|advertiseAddress: ${ip}|g" /root/init.yaml
sed -i '/serviceSubnet: 10.96.0.0\/12/a\  podSubnet: 10.244.0.0\/16' /root/init.yaml

# Initialize the cluster from the config file
kubeadm init --config=/root/init.yaml
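
Before running init, it can help to confirm that the sed substitutions above actually matched something, because sed exits successfully even when its pattern is absent. The following is a minimal sketch on a throwaway sample file. The sample field values are assumptions, chosen to match the registry.k8s.io image names that appear in the output under the actual result; they are not content captured from /root/init.yaml in this report.

```shell
# Hypothetical sample of the fields the sed commands are meant to touch;
# the values are assumptions, not output captured in this report.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
imageRepository: registry.k8s.io
kubernetesVersion: 1.27.0
EOF

# Same substitution as in the steps above, applied to the sample file.
sed -i "s|k8s.gcr.io|registry.aliyuncs.com/google_containers|g" "$cfg"

# Report whether the mirror actually ended up in the file.
if grep -q 'registry.aliyuncs.com/google_containers' "$cfg"; then
  result="mirror configured"
else
  result="mirror NOT configured"
fi
echo "$result"
rm -f "$cfg"
```

On the affected host, the same grep against /root/init.yaml would show whether the mirror setting took effect, and `kubeadm config images list --config /root/init.yaml` should print the image names kubeadm intends to pull.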


[Expected result]:
Kubernetes initializes successfully.

[Actual result]:
Kubernetes initialization fails with the following output:

I0607 13:53:15.002140   15249 checks.go:828] using image pull policy: IfNotPresent
I0607 13:53:15.013689   15249 checks.go:854] pulling: registry.k8s.io/kube-apiserver:v1.27.0
I0607 13:55:45.855123   15249 checks.go:854] pulling: registry.k8s.io/kube-controller-manager:v1.27.0
I0607 13:58:16.642077   15249 checks.go:854] pulling: registry.k8s.io/kube-scheduler:v1.27.0
I0607 14:00:47.387566   15249 checks.go:854] pulling: registry.k8s.io/kube-proxy:v1.27.0
W0607 14:03:18.133558   15249 checks.go:835] detected that the sandbox image "registry.aliyuncs.com/google_containers/pause:3.6" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.k8s.io/pause:3.9" as the CRI sandbox image.
I0607 14:03:18.143876   15249 checks.go:854] pulling: registry.k8s.io/pause:3.9
I0607 14:05:52.367815   15249 checks.go:854] pulling: registry.k8s.io/etcd:3.5.7-0
I0607 14:08:23.182436   15249 checks.go:854] pulling: registry.k8s.io/coredns/coredns:v1.10.1
[preflight] Some fatal errors occurred:
        [ERROR ImagePull]: failed to pull image registry.k8s.io/kube-apiserver:v1.27.0: output: time="2023-06-07T13:55:45+08:00" level=fatal msg="pulling image: rpc error: code = DeadlineExceeded desc = failed to pull and unpack image \"registry.k8s.io/kube-apiserver:v1.27.0\": failed to resolve reference \"registry.k8s.io/kube-apiserver:v1.27.0\": failed to do request: Head \"https://asia-east1-docker.pkg.dev/v2/k8s-artifacts-prod/images/kube-apiserver/manifests/v1.27.0\": dial tcp 74.125.23.82:443: i/o timeout"
, error: exit status 1
        [ERROR ImagePull]: failed to pull image registry.k8s.io/kube-controller-manager:v1.27.0: output: time="2023-06-07T13:58:16+08:00" level=fatal msg="pulling image: rpc error: code = DeadlineExceeded desc = failed to pull and unpack image \"registry.k8s.io/kube-controller-manager:v1.27.0\": failed to resolve reference \"registry.k8s.io/kube-controller-manager:v1.27.0\": failed to do request: Head \"https://asia-east1-docker.pkg.dev/v2/k8s-artifacts-prod/images/kube-controller-manager/manifests/v1.27.0\": dial tcp 74.125.23.82:443: i/o timeout"
, error: exit status 1
        [ERROR ImagePull]: failed to pull image registry.k8s.io/kube-scheduler:v1.27.0: output: time="2023-06-07T14:00:47+08:00" level=fatal msg="pulling image: rpc error: code = DeadlineExceeded desc = failed to pull and unpack image \"registry.k8s.io/kube-scheduler:v1.27.0\": failed to resolve reference \"registry.k8s.io/kube-scheduler:v1.27.0\": failed to do request: Head \"https://asia-east1-docker.pkg.dev/v2/k8s-artifacts-prod/images/kube-scheduler/manifests/v1.27.0\": dial tcp 142.250.157.82:443: i/o timeout"
, error: exit status 1
        [ERROR ImagePull]: failed to pull image registry.k8s.io/kube-proxy:v1.27.0: output: time="2023-06-07T14:03:18+08:00" level=fatal msg="pulling image: rpc error: code = DeadlineExceeded desc = failed to pull and unpack image \"registry.k8s.io/kube-proxy:v1.27.0\": failed to resolve reference \"registry.k8s.io/kube-proxy:v1.27.0\": failed to do request: Head \"https://asia-east1-docker.pkg.dev/v2/k8s-artifacts-prod/images/kube-proxy/manifests/v1.27.0\": dial tcp 142.250.157.82:443: i/o timeout"
, error: exit status 1
        [ERROR ImagePull]: failed to pull image registry.k8s.io/pause:3.9: output: time="2023-06-07T14:05:52+08:00" level=fatal msg="pulling image: rpc error: code = DeadlineExceeded desc = failed to pull and unpack image \"registry.k8s.io/pause:3.9\": failed to resolve reference \"registry.k8s.io/pause:3.9\": failed to do request: Head \"https://asia-east1-docker.pkg.dev/v2/k8s-artifacts-prod/images/pause/manifests/3.9\": dial tcp 142.250.157.82:443: i/o timeout"
, error: exit status 1
        [ERROR ImagePull]: failed to pull image registry.k8s.io/etcd:3.5.7-0: output: time="2023-06-07T14:08:23+08:00" level=fatal msg="pulling image: rpc error: code = DeadlineExceeded desc = failed to pull and unpack image \"registry.k8s.io/etcd:3.5.7-0\": failed to resolve reference \"registry.k8s.io/etcd:3.5.7-0\": failed to do request: Head \"https://asia-east1-docker.pkg.dev/v2/k8s-artifacts-prod/images/etcd/manifests/3.5.7-0\": dial tcp 142.250.157.82:443: i/o timeout"
, error: exit status 1
        [ERROR ImagePull]: failed to pull image registry.k8s.io/coredns/coredns:v1.10.1: output: time="2023-06-07T14:10:54+08:00" level=fatal msg="pulling image: rpc error: code = DeadlineExceeded desc = failed to pull and unpack image \"registry.k8s.io/coredns/coredns:v1.10.1\": failed to resolve reference \"registry.k8s.io/coredns/coredns:v1.10.1\": failed to do request: Head \"https://asia-east1-docker.pkg.dev/v2/k8s-artifacts-prod/images/coredns/coredns/manifests/v1.10.1\": dial tcp 142.251.170.82:443: i/o timeout"
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
error execution phase preflight
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
        cmd/kubeadm/app/cmd/phases/workflow/runner.go:260
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
        cmd/kubeadm/app/cmd/phases/workflow/runner.go:446
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
        cmd/kubeadm/app/cmd/phases/workflow/runner.go:232
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
        cmd/kubeadm/app/cmd/init.go:111
github.com/spf13/cobra.(*Command).execute
        vendor/github.com/spf13/cobra/command.go:916
github.com/spf13/cobra.(*Command).ExecuteC
        vendor/github.com/spf13/cobra/command.go:1040
github.com/spf13/cobra.(*Command).Execute
        vendor/github.com/spf13/cobra/command.go:968
k8s.io/kubernetes/cmd/kubeadm/app.Run
        cmd/kubeadm/app/kubeadm.go:50
main.main
        cmd/kubeadm/kubeadm.go:25
runtime.main
        /usr/lib/golang/src/runtime/proc.go:250
runtime.goexit
        /usr/lib/golang/src/runtime/asm_amd64.s:1598


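[Root cause analysis]:
According to the output, Kubernetes 1.27.2 needs to download components of version 1.27.0 or later during init, but the Aliyun mirror does not provide components of that version, so Kubernetes falls back to the default official registry. The connection from the ECS instance to the official registry is too slow, so the pulls time out and the component downloads fail.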
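
To see at a glance which images and tags the preflight check tried to pull, the failing references can be extracted from the saved output. This is a minimal sketch, assuming the output was saved to a file; here a temp file holds two shortened sample lines, and the `...` stands for the rest of each message.

```shell
# Two shortened sample lines in the format of the preflight errors above.
log=$(mktemp)
cat > "$log" <<'EOF'
        [ERROR ImagePull]: failed to pull image registry.k8s.io/kube-apiserver:v1.27.0: output: ...
        [ERROR ImagePull]: failed to pull image registry.k8s.io/pause:3.9: output: ...
EOF

# Print only the image reference from each ImagePull error line.
failed_images=$(sed -n 's/.*\[ERROR ImagePull\]: failed to pull image \([^ ]*\): output.*/\1/p' "$log")
echo "$failed_images"
rm -f "$log"
```

The registry host and the tag of each reference (for example v1.27.0) show which source and which version kubeadm was asking for.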

Comment 1 yunmeng365524 2023-06-07 17:27:24 UTC

Please help confirm whether a matching image source is available for the Kubernetes version shipped on Anolis 23.

Comment 2 happy_orange alibaba_cloud_group

2023-06-25 19:05:48 UTC

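/root/init.yaml is configured with version 1.27.0. We recommend changing it to 1.27.2 so that it matches the installed kubernetes version; that way some of the base components will no longer be downloaded. Note that other software, such as etcd, will still be downloaded during the run. That software is expected to be added in release 23.1; for now it is still obtained from the open-source upstream.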
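
A minimal sketch of that version change, assuming the version is held in a top-level `kubernetesVersion:` field of the generated file. The sample content and the temp file are illustrative and not taken from this report.

```shell
# Hypothetical excerpt of an init config; only the version field matters here.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
kind: ClusterConfiguration
kubernetesVersion: 1.27.0
EOF

# Version of the installed kubernetes packages, as stated in this report.
target="1.27.2"

# Rewrite the version field in place, then read it back to confirm.
sed -i "s|^kubernetesVersion: .*|kubernetesVersion: ${target}|" "$cfg"
version_line=$(grep '^kubernetesVersion:' "$cfg")
echo "$version_line"
rm -f "$cfg"
```

On the affected host the same sed expression would be pointed at /root/init.yaml before running kubeadm init.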

Comment 3 Janos alibaba_cloud_group

2023-06-30 10:13:11 UTC

After changing the version number in the default init config to 1.27.2, init was verified to succeed.

Comment 4 yunmeng365524 2023-07-03 16:08:53 UTC

Confirmed. The test case is to be adapted accordingly.