The hosts used are as follows:
Role | HOSTNAME | IP | CPU (cores) | Memory | System disk | CPU architecture | Operating system |
---|---|---|---|---|---|---|---|
Control plane | k8s-master | 192.168.0.101 | 2 | 4G | 64G | x86_64 | openSUSE Leap 15.6 |
Worker node | k8s-worker-1 | 192.168.0.102 | 2 | 4G | 64G | x86_64 | openSUSE Leap 15.6 |
Worker node | k8s-worker-2 | 192.168.0.103 | 4 | 8G | 64G | x86_64 | openSUSE Leap 15.6 |
1) Update the host operating system
zypper ref
zypper up -y
2) Reboot to apply the updates
reboot
Set the hostname on each node (run the matching command on the corresponding host):
hostnamectl set-hostname k8s-master
hostnamectl set-hostname k8s-worker-1
hostnamectl set-hostname k8s-worker-2
Add the host entries on all nodes:
vim /etc/hosts
192.168.0.101 k8s-master
192.168.0.102 k8s-worker-1
192.168.0.103 k8s-worker-2
1) Disable swap temporarily
swapoff -a
2) Disable swap permanently
sed -i '/swap/d' /etc/fstab
3) Verify that swap is disabled
free -h
A Swap line showing 0 means swap is disabled.
swapon --show
Empty output means swap is disabled.
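If you prefer not to delete the fstab entry outright, here is a gentler sketch that comments out the swap line instead (assuming a standard whitespace-separated /etc/fstab; back the file up first):
sed -ri 's/^([^#].*[[:space:]]swap[[:space:]].*)$/# \1/' /etc/fstab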
1) Load the kernel modules temporarily
modprobe overlay
modprobe br_netfilter
2) Configure the modules to load automatically at boot
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
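To confirm the modules are actually loaded, both names should appear in the output of:
lsmod | grep -E 'overlay|br_netfilter'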
1) Create the sysctl configuration file
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
2) Apply the system settings
sysctl --system
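To confirm the settings took effect, each of the following values should be 1:
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward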
1) Open the required ports
firewall-cmd --permanent --add-port=6443/tcp
firewall-cmd --permanent --add-port=2379-2380/tcp
firewall-cmd --permanent --add-port=10250/tcp
firewall-cmd --permanent --add-port=10259/tcp
firewall-cmd --permanent --add-port=10257/tcp
firewall-cmd --permanent --add-port=30000-32767/tcp
(when using Calico CNI)
firewall-cmd --permanent --add-port=179/tcp
firewall-cmd --permanent --add-port=4789/udp
(when using Flannel CNI)
firewall-cmd --permanent --add-port=8472/udp
2) Reload the firewall rules
firewall-cmd --reload
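To confirm which ports are now open:
firewall-cmd --list-ports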
Alternatively, the firewall can be disabled entirely instead of opening individual ports:
1) Stop the firewall
systemctl stop firewalld.service
2) Disable it from starting at boot
systemctl disable firewalld.service
3) Check the firewall status
systemctl status firewalld.service
1) Install containerd
zypper install -y containerd
2) Lock the package (to prevent accidental upgrades)
zypper addlock containerd
1) Generate the default configuration file
mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml > /dev/null
2) Set SystemdCgroup to true
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
Alternatively, edit /etc/containerd/config.toml manually: find SystemdCgroup under [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options] and set it to true.
3) Set the sandbox (pause) image mirror (needed in some regions)
sed -i 's#registry.k8s.io/pause:3.8#registry.aliyuncs.com/google_containers/pause:3.10#' /etc/containerd/config.toml
Alternatively, edit /etc/containerd/config.toml manually: find the registry.k8s.io/pause entry (the sandbox_image setting) under [plugins."io.containerd.grpc.v1.cri"] and point it at a reachable mirror (for example registry.aliyuncs.com/google_containers/pause).
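Before starting containerd, a quick sanity check of the two settings changed above (SystemdCgroup should read true and sandbox_image should point at the mirror you chose):
grep -nE 'SystemdCgroup|sandbox_image' /etc/containerd/config.toml
Then enable and start containerd: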
systemctl enable --now containerd
1) Check the service status
systemctl status containerd --no-pager -l
Check that Active shows running and that the log contains no error messages. A healthy startup looks roughly like this:
● containerd.service - containerd container runtime
Loaded: loaded (/usr/lib/systemd/system/containerd.service; enabled; preset: disabled)
Active: active (running) since Thu 2025-04-17 16:39:32 CST; 3s ago
Docs: https://containerd.io
Process: 25216 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
Main PID: 25217 (containerd)
Tasks: 8
CPU: 48ms
CGroup: /system.slice/containerd.service
└─25217 /usr/sbin/containerd
Apr 17 16:39:32 k8s-master containerd[25217]: time="2025-04-17T16:39:32.031029228+08:00" level=info msg="Start subscribing containerd event"
Apr 17 16:39:32 k8s-master containerd[25217]: time="2025-04-17T16:39:32.031107162+08:00" level=info msg=serving... address=/run/containerd/containerd.sock.ttrpc
Apr 17 16:39:32 k8s-master containerd[25217]: time="2025-04-17T16:39:32.031215624+08:00" level=info msg=serving... address=/run/containerd/containerd.sock
Apr 17 16:39:32 k8s-master containerd[25217]: time="2025-04-17T16:39:32.031159009+08:00" level=info msg="Start recovering state"
Apr 17 16:39:32 k8s-master containerd[25217]: time="2025-04-17T16:39:32.031298388+08:00" level=info msg="Start event monitor"
Apr 17 16:39:32 k8s-master containerd[25217]: time="2025-04-17T16:39:32.031309649+08:00" level=info msg="Start snapshots syncer"
Apr 17 16:39:32 k8s-master containerd[25217]: time="2025-04-17T16:39:32.031315009+08:00" level=info msg="Start cni network conf syncer for default"
Apr 17 16:39:32 k8s-master containerd[25217]: time="2025-04-17T16:39:32.031320830+08:00" level=info msg="Start streaming server"
Apr 17 16:39:32 k8s-master systemd[1]: Started containerd container runtime.
Apr 17 16:39:32 k8s-master containerd[25217]: time="2025-04-17T16:39:32.031548975+08:00" level=info msg="containerd successfully booted in 0.024108s"
2) Check the containerd version
containerd --version
The output depends on the installed version, for example:
containerd github.com/containerd/containerd v1.7.21 472731909fa34bd7bc9c087e4c27943f9835f111
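Optionally, once crictl is available (it is pulled in later as part of cri-tools when kubeadm is installed), you can point it at the containerd socket so crictl commands work without extra flags; a sketch assuming the default socket path:
cat <<EOF | sudo tee /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
EOF
crictl info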
1) Install flannel
zypper install -y flannel
2) Lock the package (to prevent accidental upgrades)
zypper addlock flannel
1) Configure the Kubernetes package repository
cat <<EOF | sudo tee /etc/zypp/repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.32/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.32/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
2) Import the GPG key
rpm --import https://pkgs.k8s.io/core:/stable:/v1.32/rpm/repodata/repomd.xml.key
3) Refresh the repository metadata
zypper ref
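To confirm the repository works and see which package versions it offers:
zypper se -s kubeadm kubelet kubectl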
1) Install kubelet, kubeadm, and kubectl
zypper install -y --allow-downgrade kubelet-1.32.3 kubeadm-1.32.3 kubectl-1.32.3
2) Lock the packages (to prevent accidental upgrades)
zypper addlock kubelet kubeadm kubectl
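Verify the installed versions:
kubeadm version
kubelet --version
kubectl version --client
Then enable the kubelet: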
systemctl enable --now kubelet
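Note that until kubeadm init (or kubeadm join) runs, the kubelet restarts in a loop waiting for its configuration; this is expected. Optionally, the control-plane images can be pre-pulled so that initialization is faster (a sketch using the same version and image repository as the init command below):
kubeadm config images pull --kubernetes-version=v1.32.3 --image-repository=registry.aliyuncs.com/google_containers
Then initialize the control plane: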
kubeadm init --kubernetes-version=v1.32.3 --apiserver-advertise-address=192.168.0.101 --pod-network-cidr=10.244.0.0/16 --cri-socket=unix:///run/containerd/containerd.sock --image-repository=registry.aliyuncs.com/google_containers
where:
--kubernetes-version=v1.32.3: the Kubernetes version to deploy
--apiserver-advertise-address=192.168.0.101: the API server IP
--pod-network-cidr=10.244.0.0/16: the network range used by Pods
--cri-socket=unix:///run/containerd/containerd.sock: the container runtime socket
--image-repository=registry.aliyuncs.com/google_containers: the container image registry; the default is registry.k8s.io (a mirror is needed in some regions)
Other optional flags:
--apiserver-bind-port=[port]: the port the API server binds to, 6443 by default
--control-plane-endpoint=[IP:port]: a shared IP:port for the control plane, used for highly available clusters
--service-dns-domain=[domain]: the service DNS domain, cluster.local by default
--cert-dir=[directory]: the directory where certificates are stored, /etc/kubernetes/pki by default
--certificate-key=[key]: the key used to encrypt certificates transferred between control-plane nodes in a highly available cluster
--upload-certs: upload the control-plane certificates to the cluster, used for highly available clusters
--dry-run: do not actually execute anything, only print what would be done
A successful initialization produces output roughly like the following:
[init] Using Kubernetes version: v1.32.3
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W0417 17:00:34.298232 24312 checks.go:844] detected that the sandbox image "registry.aliyuncs.com/google_containers/pause:3.10" of the container runtime is inconsistent with that used by kubeadm.It is recommended to use "registry.aliyuncs.com/google_containers/pause:3.10" as the CRI sandbox image.
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.0.101]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master localhost] and IPs [192.168.0.101 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master localhost] and IPs [192.168.0.101 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 501.269277ms
[api-check] Waiting for a healthy API server. This can take up to 4m0s
[api-check] The API server is healthy after 6.501735367s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node k8s-master as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node k8s-master as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: 12vgkt.tessqxdvblzfxxg1
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.0.101:6443 --token 123456 \
--discovery-token-ca-cert-hash sha256:123456
Note the kubeadm join command at the end of the output: running it on a worker node joins that node to this cluster.
kubectl reads $HOME/.kube/config by default, so this file has to be set up before kubectl can manage the cluster.
You can simply copy the admin.conf file (default location /etc/kubernetes) and fix its ownership:
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, as root you can point KUBECONFIG directly at it:
export KUBECONFIG=/etc/kubernetes/admin.conf
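At this point kubectl should be able to reach the API server, although the node typically stays NotReady until the CNI plugin is deployed in the next step:
kubectl get nodes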
Until a Pod network add-on (CNI) is deployed, Pods and Services cannot communicate, the CoreDNS Pods stay in the Pending state, and the cluster is not fully functional.
The CNI you deploy must use the same network range as the --pod-network-cidr passed to kubeadm init; by convention Calico uses 192.168.0.0/16 and Flannel uses 10.244.0.0/16, so this cluster uses Flannel.
1) Create the working directory
mkdir -pv /opt/kubernetes/flannel
2) Download the Flannel manifest into that directory
wget -P /opt/kubernetes/flannel https://github.com/flannel-io/flannel/releases/download/v0.26.7/kube-flannel.yml
3) Deploy Flannel
kubectl apply -f /opt/kubernetes/flannel/kube-flannel.yml
4) Check the Flannel deployment status
kubectl get nodes
Check that the node status is normal; a healthy cluster looks like this:
NAME STATUS ROLES AGE VERSION
k8s-master Ready control-plane 21m v1.32.3
kubectl get pods -n kube-system
Check that the coredns Pods (and the other kube-system Pods) are Running; a healthy cluster looks like this:
NAME READY STATUS RESTARTS AGE
coredns-6766b7b6bb-4mlcz 1/1 Running 0 21m
coredns-6766b7b6bb-kr2fz 1/1 Running 0 21m
etcd-k8s-master 1/1 Running 0 21m
kube-apiserver-k8s-master 1/1 Running 0 21m
kube-controller-manager-k8s-master 1/1 Running 0 21m
kube-proxy-khfz9 1/1 Running 0 21m
kube-scheduler-k8s-master 1/1 Running 0 21m
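The Flannel DaemonSet itself runs in its own namespace (kube-flannel for this manifest version), so it can be checked separately:
kubectl get pods -n kube-flannel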
systemctl status containerd --no-pager
systemctl status kubelet --no-pager
kubectl get nodes
kubectl cluster-info
kubectl get pods -n kube-system
kubectl get componentstatuses
kubectl run busybox --image=busybox:1.28 -- sleep 3600
kubectl exec -it busybox -- nslookup kubernetes.default
kubectl get --raw='/healthz'
kubectl -n kube-system exec -it etcd-k8s-master -- etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key endpoint health
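After the DNS check, the temporary busybox Pod created above can be removed:
kubectl delete pod busybox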
After the control plane initializes successfully, it prints a join command; the recommended approach is to run that command directly to add each worker node to the cluster:
kubeadm join 192.168.0.101:6443 --cri-socket=unix:///run/containerd/containerd.sock --token=123456 --discovery-token-ca-cert-hash=sha256:123456
where:
192.168.0.101:6443 is the control-plane address
--token=123456 is the bootstrap token, generated automatically by the control plane
--discovery-token-ca-cert-hash=sha256:123456 is the CA certificate hash, generated automatically by the control plane
Other optional flags:
--node-name=[name]: the node name, defaulting to the hostname
--control-plane: join this node as an additional control-plane node
--certificate-key=[key]: the key used to decrypt the uploaded certificates; it must match the --certificate-key passed to kubeadm init
--apiserver-advertise-address=[IP]: the address the API server on this node advertises, used when joining as a control-plane node in a highly available cluster
--dry-run: do not actually execute anything, only print what would be done
This guide uses containerd as the container runtime; if you use a different runtime, adjust --cri-socket and consult that runtime's documentation.
If initialization or joining goes wrong, you can reset with kubeadm reset (or kubeadm reset -f), delete $HOME/.kube/config, and then initialize again.
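If the token printed by kubeadm init has expired (the default validity is 24 hours), a fresh join command can be generated on the control plane:
kubeadm token create --print-join-command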