It has been a while since I last tinkered with Kubernetes. I recently got hold of enough resources to deploy a small cluster, so I put them to use. The approach taken here is to install a highly available cluster with kubeadm.
1 Version Information
First, the physical server:
Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz x 2 Sockets
80 vCPUs in total:
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63,65,67,69,71,73,75,77,79
196G of memory in total, 128G of which is allocated as 1G HugePages.
786G of disk in total.
Host OS: Ubuntu 18.04 5.4.0-42-generic #46~18.04.1-Ubuntu SMP Fri Jul 10 07:21:24 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Based on these resources, the VMs were sized as follows:
| VM name      | CPU | Memory | Disk | Role       |
|--------------|-----|--------|------|------------|
| Controller-0 | 4   | 8G     | 64G  | Controller |
| Controller-1 | 4   | 8G     | 64G  | Controller |
| Controller-2 | 4   | 8G     | 64G  | Controller |
| Node-0       | 16  | 16G    | 64G  | Node       |
| Node-1       | 16  | 16G    | 64G  | Node       |
| Node-2       | 16  | 16G    | 64G  | Node       |
The VMs are KVM/QEMU guests created with virt-install and virsh. All of them are backed by HugePage memory and have statically pinned vCPUs; I won't go into those details here, but a rough sketch follows.
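For reference only, a controller VM was created along these lines. The disk path, bridge name (br0) and the exact cpuset are illustrative assumptions, not the actual values used:

# Illustrative sketch: hugepage-backed memory plus static vCPU pinning via cpuset
sudo virt-install \
    --name controller-0 \
    --vcpus 4,cpuset=0,2,4,6 \
    --memory 8192 \
    --memorybacking hugepages=on \
    --disk path=/var/lib/libvirt/images/controller-0.qcow2,size=64 \
    --network bridge=br0 \
    --os-variant ubuntu20.04 \
    --graphics none \
    --import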
The OS installed inside the VMs:
Ubuntu 20.04 5.8.0-55-generic #62~20.04.1-Ubuntu SMP Wed Jun 2 08:55:04 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
The Kubernetes version installed is v1.21.1.
The Docker version used is: Docker version 20.10.7, build f0df350.
2 Installation Process
2.1 Kubernetes Cluster Network Planning

- The HA cluster's service address comes from the 10.164.128.x/24 subnet; haproxy and keepalived are deployed on the three controllers to make the API service highly available.
- The default Kubernetes pod network is the 192.168.100.x/24 subnet, implemented with Calico. Pod-to-Pod traffic across worker nodes goes over the Calico network.
- The Data Network is for high-performance data traffic. Multus CNI is used to give Pods multiple network planes; the Data Network can be an ordinary network such as Flannel or MACVLAN, or a high-performance SR-IOV network.
2.2 Preparing the VM Environment
2.2.1 Configuring the System Environment
First, make sure Ubuntu 20.04 is fully up to date. After running "sudo apt upgrade -y", confirm that every node is on the same kernel version, e.g.:
cifangzi@controller-0:~$ uname -a
Linux controller-0 5.8.0-55-generic #62~20.04.1-Ubuntu SMP Wed Jun 2 08:55:04 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Then, on every VM, edit /etc/hostname and /etc/hosts to add the host names, e.g.:
cifangzi@controller-0:~$ cat /etc/hostname
controller-0
cifangzi@controller-0:~$ cat /etc/hosts
...
10.164.128.188 vip.cluster.local
10.164.128.180 controller-0
10.164.128.181 controller-1
10.164.128.182 controller-2
10.164.128.183 node-0
10.164.128.184 node-1
10.164.128.185 node-2
10.164.128.187 node-3
...
Here 10.164.128.188 vip.cluster.local is the highly available virtual address of the API Server; it will be needed later during installation. Reboot after the change and confirm it has taken effect.
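An optional quick check that the host name and the VIP name resolution are in place:

hostnamectl --static            # should print this node's name, e.g. controller-0
getent hosts vip.cluster.local  # should resolve to 10.164.128.188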
Finally, set up the kernel prerequisites: kubeadm requires br_netfilter and the corresponding sysctl settings:
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

# then apply the sysctl settings
sudo sysctl --system
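If the module is not already loaded in the running kernel, load it now and verify the settings (optional):

sudo modprobe br_netfilter
lsmod | grep br_netfilter
sudo sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables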
2.2.2 Installing Docker
The Docker installation follows the official documentation:
https://docs.docker.com/engine/install/ubuntu/
The steps boil down to the following.
First, install the prerequisite packages via apt:
sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg \
    lsb-release
Then download and install the GPG key for the Docker apt archive:
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
Next, add the Docker repository to apt:
echo \
  "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
Finally, install Docker:
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io
Once Docker is installed it still needs to be configured to use systemd as its cgroup driver, as follows:
sudo mkdir /etc/docker
cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF
Because Docker Hub is not reachable from inside China, you can optionally configure a proxy for Docker (skip this step if Docker can already reach Docker Hub and other external registries):
sudo mkdir -p /etc/systemd/system/docker.service.d
vim /etc/systemd/system/docker.service.d/http-proxy.conf
# add the following content
[Service]
Environment="HTTP_PROXY=http://proxy.example.com:80"
Environment="HTTPS_PROXY=https://proxy.example.com:443"
Environment="NO_PROXY=localhost,127.0.0.1,docker-registry.example.com,.corp"
Finally, restart the Docker service:
sudo systemctl daemon-reload
sudo systemctl restart docker
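To confirm that the cgroup driver change took effect (optional check):

sudo docker info --format '{{.CgroupDriver}}'   # should print: systemd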
2.3 Installing Kubernetes
2.3.1 Installing kubeadm, kubelet, and kubectl
This installation follows the official documentation:
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
The basic steps are as follows.
First, add the GPG key needed for the Kubernetes packages:
sudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg
Add the Kubernetes repository to apt:
echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
Then install kubeadm, kubelet, and kubectl:
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
To keep kubelet, kubeadm, and kubectl from being upgraded automatically, hold their versions:
sudo apt-mark hold kubelet kubeadm kubectl
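A quick check of the installed versions (optional):

kubeadm version -o short
kubectl version --client --short
kubelet --version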
2.3.2 Installing the First Control-Plane Node
Since haproxy and keepalived provide the control-plane high availability here, they have to be deployed as static Pods while the control plane is installed, and the haproxy VIP is used as the API Server's service address.
First create the configuration file kubeadm_config.yaml that kubeadm will use, with the following content:
# kubeadm_config.yaml
kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta2
kubernetesVersion: v1.21.1
controlPlaneEndpoint: vip.cluster.local:8443
networking:
  podSubnet: 10.244.0.0/16
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: systemd
Three parameters matter here: controlPlaneEndpoint is the API Server address, in this case the domain vip.cluster.local on port 8443; because Calico serves as the Pod network CNI, podSubnet sets the Pod IP CIDR, here 10.244.0.0/16; and cgroupDriver sets the kubelet's cgroup driver to systemd.
Because haproxy and keepalived run as static Pods, two manifests have to be placed under /etc/kubernetes/manifests/, keepalived.yaml and haproxy.yaml, as follows:
# keepalived.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  name: keepalived
  namespace: kube-system
spec:
  containers:
  - image: osixia/keepalived:2.0.17
    name: keepalived
    resources: {}
    securityContext:
      capabilities:
        add:
        - NET_ADMIN
        - NET_BROADCAST
        - NET_RAW
    volumeMounts:
    - mountPath: /usr/local/etc/keepalived/keepalived.conf
      name: config
    - mountPath: /etc/keepalived/check_apiserver.sh
      name: check
  hostNetwork: true
  volumes:
  - hostPath:
      path: /etc/keepalived/keepalived.conf
    name: config
  - hostPath:
      path: /etc/keepalived/check_apiserver.sh
    name: check
status: {}

# haproxy.yaml
apiVersion: v1
kind: Pod
metadata:
  name: haproxy
  namespace: kube-system
spec:
  containers:
  - image: haproxy:2.1.4
    name: haproxy
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: localhost
        path: /healthz
        port: 8443
        scheme: HTTPS
    volumeMounts:
    - mountPath: /usr/local/etc/haproxy/haproxy.cfg
      name: haproxyconf
      readOnly: true
  hostNetwork: true
  volumes:
  - hostPath:
      path: /etc/haproxy/haproxy.cfg
      type: FileOrCreate
    name: haproxyconf
status: {}
Key points:
In keepalived.yaml, the keepalived Pod mounts two files from the host: the configuration file /etc/keepalived/keepalived.conf and the health-check script /etc/keepalived/check_apiserver.sh.
In haproxy.yaml, the haproxy Pod mounts the host configuration file /etc/haproxy/haproxy.cfg.
Create the keepalived configuration file /etc/keepalived/keepalived.conf:
! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 3
    weight -2
    fall 10
    rise 2
}
vrrp_instance VI_1 {
    state MASTER
    interface ens7
    virtual_router_id 51
    priority 101
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.164.128.188
    }
    track_script {
        check_apiserver
    }
}
This sets the VIP to 10.164.128.188 together with the VRRP parameters, and points the API Server track script at /etc/keepalived/check_apiserver.sh.
Create the track script /etc/keepalived/check_apiserver.sh used by keepalived:
#!/bin/sh
APISERVER_VIP="10.164.128.188"
APISERVER_DEST_PORT=8443

errorExit() {
    echo "*** $*" 1>&2
    exit 1
}

curl --silent --max-time 2 --insecure https://localhost:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://localhost:${APISERVER_DEST_PORT}/"
if ip addr | grep -q ${APISERVER_VIP}; then
    curl --silent --max-time 2 --insecure https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/"
fi
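Make sure the script is executable so keepalived can actually run it:

sudo chmod +x /etc/keepalived/check_apiserver.sh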
Then create the haproxy configuration file /etc/haproxy/haproxy.cfg:
# /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    log /dev/log local0
    log /dev/log local1 notice
    daemon

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option                  http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 1
    timeout http-request    10s
    timeout queue           20s
    timeout connect         5s
    timeout client          20s
    timeout server          20s
    timeout http-keep-alive 10s
    timeout check           10s

#---------------------------------------------------------------------
# apiserver frontend which proxys to the control plane nodes
#---------------------------------------------------------------------
frontend apiserver
    bind *:8443
    mode tcp
    option tcplog
    default_backend apiserver

#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserver
    option httpchk GET /healthz
    http-check expect status 200
    mode tcp
    option ssl-hello-chk
    balance roundrobin
        server srv1 10.164.128.180:6443 check
        server srv2 10.164.128.181:6443 check
        server srv3 10.164.128.182:6443 check
    # [...]
The backend apiserver section lists the addresses of the three controllers that are about to be installed.
Then install Kubernetes with the following command:
sudo kubeadm init --upload-certs --config ./kubeadm_config.yaml
P.S. Kubernetes does not support swap, so turn swap off before installing Kubernetes.
https://github.com/kubernetes/kubeadm/issues/610
Swap can be disabled by commenting out the swap entries in /etc/fstab and rebooting.
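For example, the following turns swap off immediately and comments out the swap entries; the sed expression is only a sketch, so double-check /etc/fstab afterwards:

sudo swapoff -a
sudo sed -i.bak '/\sswap\s/ s/^/#/' /etc/fstab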
If all goes well, the first controller node is now essentially installed.
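To run kubectl as a regular user, copy the admin kubeconfig first; kubeadm init prints these commands at the end of its output:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

The Pods on the first controller should then look like this: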
cifangzi@controller-0:~$ kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-558bd4d5db-6zvpw 0/1 Pending 0 11m
kube-system coredns-558bd4d5db-kb2bh 0/1 Pending 0 11m
kube-system etcd-controller-0 1/1 Running 0 11m
kube-system haproxy-controller-0 1/1 Running 0 11m
kube-system keepalived-controller-0 1/1 Running 0 11m
kube-system kube-apiserver-controller-0 1/1 Running 0 11m
kube-system kube-controller-manager-controller-0 1/1 Running 0 11m
kube-system kube-proxy-2tg9m 1/1 Running 0 11m
kube-system kube-scheduler-controller-0 1/1 Running 0 11m
Before installing the remaining controller nodes and the worker nodes, Calico has to be installed so that Pods on different nodes can reach each other (which is also why the coredns Pods above are still Pending). Helm is used to install Calico, so install Helm first:
curl https://baltocdn.com/helm/signing.asc | sudo apt-key add -
echo "deb https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt-get update
sudo apt-get install helm
Download the Calico Helm chart from the link below, then install Calico:
https://docs.projectcalico.org/getting-started/kubernetes/helm
helm install calico tigera-operator-v3.19.1-2.tgz
After Calico is installed, the following Pods appear:
calico-system calico-kube-controllers-7f58dbcbbd-blg4t 1/1 Running 0 20m
calico-system calico-node-tfqdf 1/1 Running 0 20m
calico-system calico-typha-8d6bdd5d5-fzjpn 1/1 Running 0 20m
tigera-operator tigera-operator-86c4fc874f-g5t9d 1/1 Running 1 20m
2.3.3 Installing the Other Two Control-Plane Nodes
When the first control-plane node finishes installing, kubeadm prints the commands for joining further control-plane and worker nodes. The command for additional control-plane nodes looks like this:
kubeadm join vip.cluster.local:8443 --token llcqaz.lnhg7qwpyieb85y1 \
    --discovery-token-ca-cert-hash sha256:3d057e4477f58e7f99cefcf3eeae39e19f9c7058ff7431742fda6c1910e7b3d8 \
    --control-plane --certificate-key d2be6f1c641cd6244d3a62a57d4c6545dd89b9631a1ace54b57916fc5ebd4478
Because haproxy and keepalived provide the high availability here, the haproxy and keepalived Pods also need to be added on these nodes.
Follow the same procedure as on the first control-plane node: create the haproxy and keepalived configuration files, change the VRRP state in the keepalived configuration to BACKUP (see the sketch after the next command), and create the track script used by keepalived. Then create the haproxy and keepalived manifests and copy them into /etc/kubernetes/manifests/. Finally restart the kubelet with:
sudo systemctl restart kubelet
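For reference, on the additional controllers the keepalived vrrp_instance differs only in its state and priority; everything else stays the same as on the first controller (the values below are illustrative):

vrrp_instance VI_1 {
    state BACKUP            # MASTER on the first controller
    interface ens7
    virtual_router_id 51
    priority 100            # lower than the MASTER priority of 101
    # authentication, virtual_ipaddress and track_script unchanged
}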
Check the node status:
cifangzi@controller-0:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
controller-0 Ready control-plane,master 22h v1.21.1
controller-1 Ready control-plane,master 21h v1.21.1
At this point the coredns Pods should all be Running:
kube-system coredns-558bd4d5db-k8h55 1/1 Running 0 20h
kube-system coredns-558bd4d5db-svp75 1/1 Running 0 20h
The third control-plane node is created the same way.
2.3.4 Installing the Worker Nodes
Worker nodes are easier to install: just run the join command printed when the first node was installed, as shown below:
kubeadm join vip.cluster.local:8443 --token llcqaz.lnhg7qwpyieb85y1 \
    --discovery-token-ca-cert-hash sha256:3d057e4477f58e7f99cefcf3eeae39e19f9c7058ff7431742fda6c1910e7b3d8
Install the remaining worker nodes the same way. Afterwards, check the Pod and node status:
cifangzi@controller-0:~$ kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
calico-system calico-kube-controllers-7f58dbcbbd-blg4t 1/1 Running 0 20h
calico-system calico-node-d94vp 1/1 Running 2 20h
calico-system calico-node-jt269 1/1 Running 2 20h
calico-system calico-node-n2rjj 1/1 Running 2 20h
calico-system calico-node-p49lm 1/1 Running 0 20h
calico-system calico-node-pnq4z 1/1 Running 0 20h
calico-system calico-node-tfqdf 1/1 Running 0 20h
calico-system calico-typha-8d6bdd5d5-fzjpn 1/1 Running 0 20h
calico-system calico-typha-8d6bdd5d5-gtbsw 1/1 Running 0 20h
calico-system calico-typha-8d6bdd5d5-tnjcz 1/1 Running 0 20h
kube-system coredns-558bd4d5db-k8h55 1/1 Running 0 20h
kube-system coredns-558bd4d5db-svp75 1/1 Running 0 20h
kube-system etcd-controller-0 1/1 Running 0 20h
kube-system etcd-controller-1 1/1 Running 0 20h
kube-system etcd-controller-2 1/1 Running 0 20h
kube-system haproxy-controller-0 1/1 Running 0 20h
kube-system haproxy-controller-1 1/1 Running 0 20h
kube-system haproxy-controller-2 1/1 Running 0 20h
kube-system keepalived-controller-0 1/1 Running 0 20h
kube-system keepalived-controller-1 1/1 Running 0 20h
kube-system keepalived-controller-2 1/1 Running 0 20h
kube-system kube-apiserver-controller-0 1/1 Running 0 20h
kube-system kube-apiserver-controller-1 1/1 Running 0 20h
kube-system kube-apiserver-controller-2 1/1 Running 0 20h
kube-system kube-controller-manager-controller-0 1/1 Running 1 20h
kube-system kube-controller-manager-controller-1 1/1 Running 0 20h
kube-system kube-controller-manager-controller-2 1/1 Running 0 20h
kube-system kube-proxy-5pwws 1/1 Running 0 20h
kube-system kube-proxy-b2qrg 1/1 Running 0 20h
kube-system kube-proxy-c22tz 1/1 Running 0 20h
kube-system kube-proxy-hnd7s 1/1 Running 0 20h
kube-system kube-proxy-pkdkp 1/1 Running 0 20h
kube-system kube-proxy-tnbmj 1/1 Running 0 20h
kube-system kube-scheduler-controller-0 1/1 Running 1 20h
kube-system kube-scheduler-controller-1 1/1 Running 0 20h
kube-system kube-scheduler-controller-2 1/1 Running 0 20h
tigera-operator tigera-operator-86c4fc874f-g5t9d 1/1 Running 1 20h
cifangzi@controller-0:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
controller-0 Ready control-plane,master 22h v1.21.1
controller-1 Ready control-plane,master 21h v1.21.1
controller-2 Ready control-plane,master 21h v1.21.1
node0 Ready <none> 21h v1.21.1
node1 Ready <none> 21h v1.21.1
node2 Ready <none> 21h v1.21.1
2.4 Installing Multus CNI
Installing Multus CNI is also straightforward: clone the Multus source with git and apply its DaemonSet manifest.
git clone https://github.com/k8snetworkplumbingwg/multus-cni.git
cd multus-cni
cat ./images/multus-daemonset.yml | kubectl apply -f -
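With Multus in place, additional network planes such as the Data Network are described by NetworkAttachmentDefinition objects. The following is only a sketch of a MACVLAN-based Data Network; the master interface (ens8) and the subnet are assumptions and have to be adapted to the actual environment:

# data-network.yaml (illustrative; master interface and subnet are assumptions)
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: data-network
spec:
  config: '{
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "ens8",
      "mode": "bridge",
      "ipam": {
        "type": "host-local",
        "subnet": "192.168.200.0/24"
      }
    }'

A Pod then attaches to this network, in addition to its default Calico interface, through the annotation k8s.v1.cni.cncf.io/networks: data-network.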
With that, the highly available Kubernetes cluster installation is complete.