Overview
Server planning
This setup uses a three-node etcd cluster, two Master nodes, and two Node nodes, as follows:
Internal IP | Public IP | Hostname | Components | Role | Spec |
---|---|---|---|---|---|
172.16.0.5 | 106.55.0.70 | tx | etcd,kube-apiserver,kube-controller-manager,kube-scheduler,cfssl | Master | 1C2G |
192.168.0.210 | 139.9.181.246 | hw | nginx,kube-apiserver,kube-controller-manager,kube-scheduler | Master | 1C2G |
192.168.0.3 | 114.67.166.108 | jd | etcd,kubelet,kube-proxy,flannel,docker | Node | 2C4G |
10.0.0.149 | 110.43.49.198 | js | etcd,kubelet,kube-proxy,flannel,docker | Node | 2C4G |
Software versions
Kubernetes 1.18.3
Docker 19.03.11
Etcd 3.4.9
Flanneld 0.12.0
Nginx 1.18
Certificates
The etcd cluster and the Kubernetes cluster each use their own separate CA certificate.
Preparation
Set the hostnames (each command runs on the corresponding node)
hostnamectl set-hostname tx
hostnamectl set-hostname hw
hostnamectl set-hostname jd
hostnamectl set-hostname js
Update the hosts file
cat >> /etc/hosts <<EOF
114.67.166.108 jd
139.9.181.246 hw
106.55.0.70 tx
10.0.0.149 js
EOF
Firewall configuration (disabled)
For simplicity the firewall is simply disabled here; you can open the required ports yourself instead. If you are using cloud servers, remember to open the relevant ports in (or disable) the security group as well.
systemctl stop firewalld && systemctl disable firewalld
iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && iptables -P FORWARD ACCEPT
# disable swap; the sed comments out the swap entry in /etc/fstab
swapoff -a
sed -i '/swap/s/^\(.*\)$/#\1/g' /etc/fstab
# disable SELinux: setenforce takes effect immediately, the file edit persists across reboots
setenforce 0
vim /etc/selinux/config
SELINUX=disabled
service dnsmasq stop && systemctl disable dnsmasq
Install the cfssl tools
cfssl is used to generate the certificates Kubernetes needs; for convenience it is installed on the tx node here.
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x cfssl_linux-amd64 cfssljson_linux-amd64 cfssl-certinfo_linux-amd64
mv cfssl_linux-amd64 /usr/local/bin/cfssl
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
mv cfssl-certinfo_linux-amd64 /usr/local/bin/cfssl-certinfo
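As a quick sanity check (optional), confirm the binary is on the PATH and runnable:
# prints the cfssl release, e.g. "Version: 1.2.0"
cfssl version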
Create the installation and log directories
Here the installation directory is /opt/kubernetes:
mkdir -p /opt/kubernetes/{bin,cfg,ssl,logs}
mkdir -p /opt/etcd/{bin,cfg,ssl,data}
bin: binaries
cfg: configuration files
ssl: certificates
data: data, e.g. the etcd data
logs: all logs go into this directory
Download the packages
Unpack the downloaded packages into the installation directory, here /opt/kubernetes/bin, as follows:
Master nodes
[root@tx bin]# pwd
/opt/kubernetes/bin
[root@tx bin]# ll
total 331444
-rwxr-xr-x 1 root root 120668160 Jun 15 15:01 kube-apiserver
-rwxr-xr-x 1 root root 110059520 Jun 15 15:01 kube-controller-manager
-rwxr-xr-x 1 root root 42950656 Jun 15 15:01 kube-scheduler
[root@tx bin]# pwd
/opt/etcd/bin
[root@tx bin]# ll
total 23272
-rwxr-xr-x 1 root root 23827424 Jun 15 15:01 etcd
Node nodes
[root@jd bin]# pwd
/opt/kubernetes/bin
[root@jd bin]# ll
total 211776
-rwxr-xr-x 1 root root 35253112 Jun 15 14:58 flanneld
-rwxr-xr-x 1 root root 113283800 Jun 15 14:59 kubelet
-rwxr-xr-x 1 root root 38379520 Jun 15 14:59 kube-proxy
-rwxr-xr-x 1 root root 2139 Jun 15 14:58 mk-docker-opts.sh
[root@jd bin]# pwd
/opt/etcd/bin
[root@jd bin]# ll
total 23272
-rwxr-xr-x 1 root root 23827424 Jun 15 15:01 etcd
Client
Put etcdctl and kubectl into the /usr/bin directory so they can be invoked from anywhere. The two clients can be placed on any node, or only on the Master nodes.
[root@tx bin]# pwd
/usr/bin
[root@tx bin]# ll
total 132856
# 省略其他文件
-rwxr-xr-x 1 root root 13498880 May 25 20:17 etcdctl
-rwxr-xr-x 1 root root 43115328 May 26 19:38 kubectl
etcd deployment
- Create the etcd CA config file ca-config.json
cat > /opt/etcd/ssl/ca-config.json <<EOF
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "www": {
        "expiry": "87600h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ]
      }
    }
  }
}
EOF
- Create the CA certificate signing request file ca-csr.json
cat > /opt/etcd/ssl/ca-csr.json <<EOF
{
  "CN": "etcd CA",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "ShenZhen",
      "ST": "GuangDong"
    }
  ]
}
EOF
- Generate the CA certificate (run from /opt/etcd/ssl, since the file paths below are relative)
cd /opt/etcd/ssl
cfssl gencert --initca ca-csr.json | cfssljson --bare ca
- Create the etcd server certificate request file
cat > /opt/etcd/ssl/server-csr.json <<EOF
{
  "CN": "etcd",
  "hosts": [
    "106.55.0.70",
    "114.67.166.108",
    "110.43.49.198",
    "172.16.0.5",
    "192.168.0.3",
    "10.0.0.149"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "ShenZhen",
      "ST": "GuangDong"
    }
  ]
}
EOF
Note that hosts here must list the IPs of the three etcd servers.
- Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=www server-csr.json | cfssljson -bare server
Note that the -profile argument must match a profile defined in ca-config.json.
- Check the results; the following files should now exist:
[root@tx ssl]# ll
total 36
-rw-r--r-- 1 root root  287 Jun 16 14:23 ca-config.json
-rw-r--r-- 1 root root  960 Jun 16 14:25 ca.csr
-rw-r--r-- 1 root root  212 Jun 16 14:25 ca-csr.json
-rw------- 1 root root 1675 Jun 16 14:25 ca-key.pem
-rw-r--r-- 1 root root 1273 Jun 16 14:25 ca.pem
-rw-r--r-- 1 root root 1041 Jun 16 14:28 server.csr
-rw-r--r-- 1 root root  347 Jun 16 14:27 server-csr.json
-rw------- 1 root root 1679 Jun 16 14:28 server-key.pem
-rw-r--r-- 1 root root 1371 Jun 16 14:28 server.pem
- Distribute the certificates to the other nodes
scp ./* root@jd:/opt/etcd/ssl
scp ./* root@js:/opt/etcd/ssl
Make sure the certificates on all three nodes are identical (each node has all the files listed above).
- Create the etcd configuration file
cat > /opt/etcd/cfg/etcd.conf <<EOF
#[Member]
ETCD_NAME="etcd-1"
ETCD_DATA_DIR="/opt/etcd/data"
ETCD_LISTEN_PEER_URLS="https://172.16.0.5:2380"
ETCD_LISTEN_CLIENT_URLS="https://172.16.0.5:2379"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://106.55.0.70:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://106.55.0.70:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://106.55.0.70:2380,etcd-2=https://114.67.166.108:2380,etcd-3=https://110.43.49.198:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_ENABLE_V2="true"
EOF
A few parameters deserve attention:
ETCD_NAME: must be unique per node, and must match the name/address pairs in ETCD_INITIAL_CLUSTER
ETCD_LISTEN_PEER_URLS: the current node's internal IP
ETCD_LISTEN_CLIENT_URLS: likewise, the current node's internal IP
ETCD_INITIAL_ADVERTISE_PEER_URLS: the current node's public IP
ETCD_ADVERTISE_CLIENT_URLS: the current node's public IP
ETCD_INITIAL_CLUSTER: the list of cluster member addresses
Pay special attention to ETCD_ENABLE_V2, which turns on the etcd v2 API. etcd 3.4.9 serves the v3 API by default, but the flannel 0.12.0 used here does not support etcd v3, so v2 API support has to be enabled manually; otherwise flannel will not work. When deploying flannel we also need to write flannel's configuration into the etcd cluster, and that write must go through the v2 API, by switching the client with export ETCDCTL_API=2; a minimal sketch follows.
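For illustration, here is a minimal sketch of driving the v2 API from the shell. Note that the v2 client ignores the v3-style ETCDCTL_CACERT/ETCDCTL_CERT/ETCDCTL_KEY variables and takes --ca-file/--cert-file/--key-file flags instead:
# switch etcdctl to the v2 front end
export ETCDCTL_API=2
# list the v2 keyspace root; the TLS flags must be passed explicitly
etcdctl --endpoints=https://106.55.0.70:2379 \
  --ca-file=/opt/etcd/ssl/ca.pem \
  --cert-file=/opt/etcd/ssl/server.pem \
  --key-file=/opt/etcd/ssl/server-key.pem \
  ls /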
- Create the systemd service
cat > /usr/lib/systemd/system/etcd.service <<EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=/opt/etcd/cfg/etcd.conf
ExecStart=/opt/etcd/bin/etcd \\
  --cert-file=/opt/etcd/ssl/server.pem \\
  --key-file=/opt/etcd/ssl/server-key.pem \\
  --peer-cert-file=/opt/etcd/ssl/server.pem \\
  --peer-key-file=/opt/etcd/ssl/server-key.pem \\
  --trusted-ca-file=/opt/etcd/ssl/ca.pem \\
  --peer-trusted-ca-file=/opt/etcd/ssl/ca.pem \\
  --logger=zap
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
EnvironmentFile: the path of the configuration file to apply; etcd reads its environment variables from this file.
- Start (start each of the three nodes separately)
systemctl daemon-reload
systemctl start etcd.service
systemctl enable etcd.service
- Configure etcdctl environment variables, so the certificates and endpoints do not have to be passed on every invocation
cat >> ~/.bashrc <<EOF
export ETCDCTL_ENDPOINTS=https://106.55.0.70:2379,https://114.67.166.108:2379,https://110.43.49.198:2379
export ETCDCTL_CACERT=/opt/etcd/ssl/ca.pem
export ETCDCTL_CERT=/opt/etcd/ssl/server.pem
export ETCDCTL_KEY=/opt/etcd/ssl/server-key.pem
EOF
. ~/.bashrc
- Check cluster health
[root@tx cfg]# etcdctl endpoint health
https://106.55.0.70:2379 is healthy: successfully committed proposal: took = 18.282218ms
https://110.43.49.198:2379 is healthy: successfully committed proposal: took = 26.816091ms
https://114.67.166.108:2379 is healthy: successfully committed proposal: took = 35.442918ms
Master node deployment
kube-apiserver deployment
- Create the Kubernetes CA config file ca-config.json
cat > /opt/kubernetes/ssl/ca-config.json <<EOF
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "kubernetes": {
        "expiry": "87600h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ]
      }
    }
  }
}
EOF
- Create the CA certificate request file
cat > /opt/kubernetes/ssl/ca-csr.json <<EOF
{
  "CN": "kubernetes",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "ShenZhen",
      "ST": "GuangDong",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF
- Generate the CA certificate (run from /opt/kubernetes/ssl)
cd /opt/kubernetes/ssl
cfssl gencert --initca ca-csr.json | cfssljson --bare ca
- Create the server certificate request file
cat > /opt/kubernetes/ssl/server-csr.json <<EOF
{
  "CN": "kubernetes",
  "hosts": [
    "10.254.0.1",
    "127.0.0.1",
    "139.9.181.246",
    "106.55.0.70",
    "192.168.0.210",
    "172.16.0.5",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "ShenZhen",
      "ST": "GuangDong",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF
The hosts attribute must contain the public IPs, internal IPs, and the first ClusterIP (10.254.0.1 here) for all Master nodes.
- Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes server-csr.json | cfssljson -bare server
- Generate the token file used to authenticate kubelets from the Node nodes when they join the cluster
- Generate a token
head -c 16 /dev/urandom | od -An -t x | tr -d ' '
f33e04663a4d7dd00be3df626e7251c8
- Create token.csv
cat > /opt/kubernetes/cfg/token.csv <<EOF
f33e04663a4d7dd00be3df626e7251c8,kubelet-bootstrap,10001,"system:node-bootstrapper"
EOF
In token.csv the first field is the token, the second is a user name, the third is the user's UID, and the fourth is the group the user is bound to when joining k8s (system:node-bootstrapper here).
- Create the kube-apiserver configuration file
cat > /opt/kubernetes/cfg/kube-apiserver.conf <<EOF
KUBE_APISERVER_OPTS="--logtostderr=false \
--v=2 \
--log-dir=/opt/kubernetes/logs \
--etcd-servers=https://106.55.0.70:2379,https://114.67.166.108:2379,https://110.43.49.198:2379 \
--bind-address=172.16.0.5 \
--secure-port=6443 \
--advertise-address=106.55.0.70 \
--allow-privileged=true \
--service-cluster-ip-range=10.254.0.0/24 \
--enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,NodeRestriction \
--authorization-mode=RBAC,Node \
--enable-bootstrap-token-auth=true \
--token-auth-file=/opt/kubernetes/cfg/token.csv \
--service-node-port-range=30000-32767 \
--kubelet-client-certificate=/opt/kubernetes/ssl/server.pem \
--kubelet-client-key=/opt/kubernetes/ssl/server-key.pem \
--tls-cert-file=/opt/kubernetes/ssl/server.pem \
--tls-private-key-file=/opt/kubernetes/ssl/server-key.pem \
--client-ca-file=/opt/kubernetes/ssl/ca.pem \
--service-account-key-file=/opt/kubernetes/ssl/ca-key.pem \
--etcd-cafile=/opt/etcd/ssl/ca.pem \
--etcd-certfile=/opt/etcd/ssl/server.pem \
--etcd-keyfile=/opt/etcd/ssl/server-key.pem \
--audit-log-maxage=30 \
--audit-log-maxbackup=3 \
--audit-log-maxsize=100 \
--audit-log-path=/opt/kubernetes/logs/k8s-audit.log"
EOF
--bind-address: the current node's internal IP
--token-auth-file: the path of the token.csv generated earlier
- Create the systemd service
cat > /usr/lib/systemd/system/kube-apiserver.service <<EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=-/opt/kubernetes/cfg/kube-apiserver.conf
ExecStart=/opt/kubernetes/bin/kube-apiserver \$KUBE_APISERVER_OPTS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
- Copy the certificates generated on the tx node, along with token.csv, kube-apiserver.conf, and kube-apiserver.service, to the same directories on the hw node.
- Start
systemctl daemon-reload
systemctl start kube-apiserver.service
systemctl enable kube-apiserver.service
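A quick smoke test, assuming anonymous auth has not been disabled (the default RBAC rules in 1.18 allow unauthenticated access to /healthz):
# should print "ok" once the apiserver is up
curl -k https://127.0.0.1:6443/healthz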
Deploy Nginx to load-balance kube-apiserver
- Install Nginx
- Install the packages Nginx depends on
yum -y install gcc pcre-devel zlib-devel
- Build and install Nginx; without --prefix it installs into /usr/local/nginx by default
tar -zxvf nginx-1.18.0.tar.gz
cd nginx-1.18.0
./configure --with-stream
make && make install
- Create the systemd service
cat > /usr/lib/systemd/system/nginx.service <<EOF
[Unit]
Description=nginx - high performance web server
After=network.target remote-fs.target nss-lookup.target

[Service]
Type=forking
ExecStart=/usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
ExecReload=/usr/local/nginx/sbin/nginx -s reload
ExecStop=/usr/local/nginx/sbin/nginx -s stop

[Install]
WantedBy=multi-user.target
EOF
- Configure layer-4 load balancing
cat >> /usr/local/nginx/conf/nginx.conf <<EOF
stream {
    upstream apiserver {
        server 139.9.181.246:6443;
        server 106.55.0.70:6443;
    }
    server {
        listen 8443;
        proxy_pass apiserver;
    }
}
EOF
This listens on port 8443 and proxies to port 6443 on 139.9.181.246 and 106.55.0.70.
- Start and verify
systemctl daemon-reload
systemctl start nginx.service
systemctl enable nginx.service
Nginx is deployed on the hw node here. For simplicity it is a single instance, which is not reliable enough for production; there it should be paired with keepalived in an active/standby setup.
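To confirm the proxy actually forwards to an apiserver (a hedged check; run it on the hw node):
# nginx should be listening on 8443
ss -tlnp | grep 8443
# a request through the LB port should reach an apiserver and return "ok"
curl -k https://127.0.0.1:8443/healthz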
Deploy kubectl
- Create the certificate request file
cat > /opt/kubernetes/ssl/admin-csr.json <<EOF
{
  "CN": "admin",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "system:masters",
      "OU": "System"
    }
  ]
}
EOF
- Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin
- Generate the admin.conf kubeconfig and copy it into the .kube directory under the user's home directory
cd /opt/kubernetes/cfg
kubectl config set-cluster kubernetes \
  --certificate-authority=/opt/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=https://139.9.181.246:8443 \
  --kubeconfig=admin.conf
kubectl config set-credentials admin \
  --client-certificate=/opt/kubernetes/ssl/admin.pem \
  --client-key=/opt/kubernetes/ssl/admin-key.pem \
  --embed-certs=true \
  --kubeconfig=admin.conf
kubectl config set-context kubernetes \
  --cluster=kubernetes \
  --user=admin \
  --kubeconfig=admin.conf
kubectl config use-context kubernetes --kubeconfig=admin.conf
mkdir -p /root/.kube
cp admin.conf /root/.kube/config
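With the kubeconfig in place, kubectl talks to the apiservers through the Nginx load balancer; a quick check:
# both calls go through https://139.9.181.246:8443
kubectl cluster-info
kubectl version --short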
kube-controller-manager deployment
- Create the certificate request file
cat > /opt/kubernetes/ssl/kube-controller-manager-csr.json <<EOF
{
  "CN": "system:kube-controller-manager",
  "hosts": [
    "127.0.0.1",
    "139.9.181.246",
    "106.55.0.70",
    "192.168.0.210",
    "172.16.0.5",
    "node01.k8s.com",
    "node02.k8s.com",
    "node03.k8s.com"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "system:kube-controller-manager",
      "OU": "System"
    }
  ]
}
EOF
The hosts attribute must contain the internal and public IPs of the master nodes. CN and O are both set to system:kube-controller-manager; the built-in ClusterRoleBinding system:kube-controller-manager grants kube-controller-manager the permissions it needs to work.
- Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
- Create the kubeconfig file
cd /opt/kubernetes/cfg
kubectl config set-cluster kubernetes \
  --certificate-authority=/opt/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=https://139.9.181.246:8443 \
  --kubeconfig=kube-controller-manager.kubeconfig
kubectl config set-credentials system:kube-controller-manager \
  --client-certificate=/opt/kubernetes/ssl/kube-controller-manager.pem \
  --client-key=/opt/kubernetes/ssl/kube-controller-manager-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-controller-manager.kubeconfig
kubectl config set-context system:kube-controller-manager \
  --cluster=kubernetes \
  --user=system:kube-controller-manager \
  --kubeconfig=kube-controller-manager.kubeconfig
kubectl config use-context system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig
- Create the systemd service (the options are written directly into kube-controller-manager.service here, because putting them in a separate file loaded via EnvironmentFile produced a "no --master specified" style error; the exact cause is unclear)
cat > /usr/lib/systemd/system/kube-controller-manager.service <<EOF
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
ExecStart=/opt/kubernetes/bin/kube-controller-manager \\
  --profiling \\
  --cluster-name=kubernetes \\
  --controllers=*,bootstrapsigner,tokencleaner \\
  --kube-api-qps=1000 \\
  --kube-api-burst=2000 \\
  --leader-elect=true \\
  --use-service-account-credentials \\
  --concurrent-service-syncs=2 \\
  --tls-cert-file=/opt/kubernetes/ssl/kube-controller-manager.pem \\
  --tls-private-key-file=/opt/kubernetes/ssl/kube-controller-manager-key.pem \\
  --authentication-kubeconfig=/opt/kubernetes/cfg/kube-controller-manager.kubeconfig \\
  --authorization-kubeconfig=/opt/kubernetes/cfg/kube-controller-manager.kubeconfig \\
  --client-ca-file=/opt/kubernetes/ssl/ca.pem \\
  --requestheader-client-ca-file=/opt/kubernetes/ssl/ca.pem \\
  --requestheader-extra-headers-prefix="X-Remote-Extra-" \\
  --requestheader-group-headers=X-Remote-Group \\
  --requestheader-username-headers=X-Remote-User \\
  --cluster-signing-cert-file=/opt/kubernetes/ssl/ca.pem \\
  --cluster-signing-key-file=/opt/kubernetes/ssl/ca-key.pem \\
  --experimental-cluster-signing-duration=876000h \\
  --horizontal-pod-autoscaler-sync-period=10s \\
  --concurrent-deployment-syncs=10 \\
  --concurrent-gc-syncs=30 \\
  --node-cidr-mask-size=24 \\
  --service-cluster-ip-range=10.254.0.0/24 \\
  --pod-eviction-timeout=6m \\
  --terminated-pod-gc-threshold=10000 \\
  --root-ca-file=/opt/kubernetes/ssl/ca.pem \\
  --service-account-private-key-file=/opt/kubernetes/ssl/ca-key.pem \\
  --kubeconfig=/opt/kubernetes/cfg/kube-controller-manager.kubeconfig \\
  --logtostderr=false \\
  --log-dir=/opt/kubernetes/logs \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
service-cluster-ip-range: specifies the Service ClusterIP range; it must match the flag of the same name on kube-apiserver.
- Likewise, copy the certificates generated for kube-controller-manager, kube-controller-manager.kubeconfig, and kube-controller-manager.service to the other master node.
- Start
systemctl daemon-reload
systemctl start kube-controller-manager.service
systemctl enable kube-controller-manager.service
kube-scheduler deployment
- Create the certificate request file
cat > /opt/kubernetes/ssl/kube-scheduler-csr.json <<EOF
{
  "CN": "system:kube-scheduler",
  "hosts": [
    "127.0.0.1",
    "139.9.181.246",
    "106.55.0.70",
    "192.168.0.210",
    "172.16.0.5"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "system:kube-scheduler",
      "OU": "System"
    }
  ]
}
EOF
hosts: must list the internal and public IPs of all master nodes.
- Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler
- Create the kubeconfig file
cd /opt/kubernetes/cfg
kubectl config set-cluster kubernetes \
  --certificate-authority=/opt/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=https://139.9.181.246:8443 \
  --kubeconfig=kube-scheduler.kubeconfig
kubectl config set-credentials system:kube-scheduler \
  --client-certificate=/opt/kubernetes/ssl/kube-scheduler.pem \
  --client-key=/opt/kubernetes/ssl/kube-scheduler-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-scheduler.kubeconfig
kubectl config set-context system:kube-scheduler \
  --cluster=kubernetes \
  --user=system:kube-scheduler \
  --kubeconfig=kube-scheduler.kubeconfig
kubectl config use-context system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig
- Create the service configuration file
cat > /opt/kubernetes/cfg/kube-scheduler.conf <<EOF
KUBE_SCHEDULER_ARGS="--bind-address=127.0.0.1 \
--kubeconfig=/opt/kubernetes/cfg/kube-scheduler.kubeconfig \
--leader-elect=true \
--alsologtostderr=true \
--logtostderr=false \
--log-dir=/opt/kubernetes/logs \
--v=2"
EOF
- Create the systemd service
cat > /usr/lib/systemd/system/kube-scheduler.service <<EOF
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
EnvironmentFile=-/opt/kubernetes/cfg/kube-scheduler.conf
ExecStart=/opt/kubernetes/bin/kube-scheduler \$KUBE_SCHEDULER_ARGS
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
- Likewise, copy the certificates generated for kube-scheduler, kube-scheduler.kubeconfig, kube-scheduler.conf, and kube-scheduler.service to the other master node.
- Start
systemctl daemon-reload
systemctl start kube-scheduler
systemctl enable kube-scheduler
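With controller-manager and scheduler running, the control plane can be checked as a whole (componentstatuses is deprecated in 1.18 but still works):
# scheduler, controller-manager, and all etcd members should report Healthy
kubectl get cs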
Node deployment
Install Docker
Docker is installed with yum here. The default yum repositories do not contain docker, so a repository has to be added with yum-config-manager; that command was reported as not found, so yum-utils must be installed first.
- Install yum-utils
yum install -y yum-utils device-mapper-persistent-data lvm2
- Add the docker yum repository
yum-config-manager \
  --add-repo \
  https://download.docker.com/linux/centos/docker-ce.repo
- Install docker-ce and the client
yum install -y docker-ce-19.03.11-3.el7 docker-ce-cli-19.03.11-3.el7 containerd.io
- Configure the Aliyun registry mirror (access to DockerHub from mainland China is slow)
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://gqjyyepn.mirror.aliyuncs.com"]
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
sudo systemctl enable docker
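Optionally confirm the mirror was picked up:
# the Aliyun mirror URL should be listed under "Registry Mirrors"
docker info | grep -A 1 'Registry Mirrors'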
flannel deployment
Flannel provides communication between docker containers across host nodes.
- Write the subnet range assigned to flannel into etcd, so flannel nodes do not hand out conflicting IPs
etcdctl set /coreos.com/network/config '{ "Network": "172.17.0.0/16", "Backend": {"Type": "vxlan"}}'
This must be set with the etcdctl v2 API; it cannot be added with the v3 API's put command (see the v2 client sketch in the etcd section above).
- Create the configuration file
cat > /opt/kubernetes/cfg/flannel.conf <<EOF
FLANNEL_ARGS="--public-ip=114.67.166.108 \
--iface=eth0 \
--etcd-endpoints=https://106.55.0.70:2379,https://110.43.49.198:2379,https://114.67.166.108:2379 \
--etcd-cafile=/opt/etcd/ssl/ca.pem \
--etcd-certfile=/opt/etcd/ssl/server.pem \
--etcd-keyfile=/opt/etcd/ssl/server-key.pem \
--ip-masq=true"
EOF
Note:
--public-ip: the current node's public IP
--iface: the internal IP, or the name of the NIC that carries it
- Create the systemd service
cat > /usr/lib/systemd/system/flanneld.service <<EOF
[Unit]
Description=Flanneld overlay address etcd agent
After=network-online.target network.target
Before=docker.service

[Service]
Type=notify
EnvironmentFile=-/opt/kubernetes/cfg/flannel.conf
ExecStart=/opt/kubernetes/bin/flanneld \$FLANNEL_ARGS
ExecStartPost=/opt/kubernetes/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
The mk-docker-opts.sh script writes the Pod subnet allocated to flannel into the /run/flannel/docker file; when docker starts later it uses the values in that file to configure the docker0 bridge.
- Configure docker to use the flannel-assigned subnet
- Edit the docker.service file
vim /usr/lib/systemd/system/docker.service
A few changes need to be made here:
[root@jd bin]# cat /usr/lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
BindsTo=containerd.service
After=network-online.target firewalld.service containerd.service flanneld.service
Wants=network-online.target
Requires=docker.socket flanneld.service

[Service]
Type=notify
EnvironmentFile=-/run/flannel/docker
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd $DOCKER_NETWORK_OPTIONS -H fd:// --containerd=/run/containerd/containerd.sock
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
# omitted...
Append flanneld.service to the After line in the [Unit] section
Add Requires=flanneld.service below the Wants line
Add EnvironmentFile=-/run/flannel/docker after Type in the [Service] section
Append $DOCKER_NETWORK_OPTIONS to the ExecStart line
- Start the services
systemctl daemon-reload
systemctl start flanneld.service
systemctl enable flanneld.service
systemctl restart docker.service
Check
ip a
# other interfaces omitted
3: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether 72:c6:c5:14:c4:07 brd ff:ff:ff:ff:ff:ff
    inet 10.254.12.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::70c6:c5ff:fe14:c407/64 scope link
       valid_lft forever preferred_lft forever
4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
    link/ether 02:42:34:3d:98:65 brd ff:ff:ff:ff:ff:ff
    inet 10.254.12.1/24 brd 10.254.12.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:34ff:fe3d:9865/64 scope link
       valid_lft forever preferred_lft forever
If flannel.1 and docker0 are in the same subnet, the deployment succeeded.
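A stronger check (a sketch; the container IP below is only a placeholder) is to verify cross-host container connectivity:
# on node jd: start a test container and print its IP
docker run -d --name nettest busybox sleep 3600
docker inspect -f '{{.NetworkSettings.IPAddress}}' nettest
# on node js: ping the IP printed above (10.254.12.2 is only an example)
docker run --rm busybox ping -c 3 10.254.12.2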
kubelet deployment
- Create a bootstrap.kubeconfig file for kubelet. Do this on a master node first and then copy it to the worker nodes, since only the master nodes have kubectl installed
cd /opt/kubernetes/cfg
kubectl config set-cluster kubernetes \
  --certificate-authority=/opt/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=https://139.9.181.246:8443 \
  --kubeconfig=bootstrap.kubeconfig
# set client authentication parameters
kubectl config set-credentials kubelet-bootstrap \
  --token=f33e04663a4d7dd00be3df626e7251c8 \
  --kubeconfig=bootstrap.kubeconfig
# set the context parameters
kubectl config set-context default \
  --cluster=kubernetes \
  --user=kubelet-bootstrap \
  --kubeconfig=bootstrap.kubeconfig
# set the default context
kubectl config use-context default --kubeconfig=bootstrap.kubeconfig
scp bootstrap.kubeconfig root@jd:/opt/kubernetes/cfg
scp bootstrap.kubeconfig root@js:/opt/kubernetes/cfg
- Create the kubelet-config.yml file
cat > /opt/kubernetes/cfg/kubelet-config.yml <<EOF
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
port: 10250
readOnlyPort: 10255
cgroupDriver: cgroupfs
clusterDNS:
- 10.254.0.2
clusterDomain: cluster.local
failSwapOn: false
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: /opt/kubernetes/ssl/ca.pem
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
evictionHard:
  imagefs.available: 15%
  memory.available: 100Mi
  nodefs.available: 10%
  nodefs.inodesFree: 5%
maxOpenFiles: 1000000
maxPods: 110
EOF
- Create the service configuration file
cat > /opt/kubernetes/cfg/kubelet.conf <<EOF
KUBELET_OPTS="--logtostderr=false \
--v=2 \
--log-dir=/opt/kubernetes/logs \
--alsologtostderr=true \
--kubeconfig=/opt/kubernetes/cfg/kubelet.kubeconfig \
--bootstrap-kubeconfig=/opt/kubernetes/cfg/bootstrap.kubeconfig \
--config=/opt/kubernetes/cfg/kubelet-config.yml \
--cert-dir=/opt/kubernetes/ssl \
--pod-infra-container-image=k8s.gcr.io/pause-amd64:3.1"
EOF
kubelet.kubeconfig is generated automatically once the bootstrap CSR is approved.
- Create the systemd service
cat > /usr/lib/systemd/system/kubelet.service <<EOF
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
EnvironmentFile=-/opt/kubernetes/cfg/kubelet.conf
ExecStart=/opt/kubernetes/bin/kubelet \$KUBELET_OPTS
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
- Pulling the pause image from its default registry requires getting past the GFW, so the images are pre-pulled from a domestic mirror and re-tagged
cat > download-images.sh <<EOF
#!/bin/bash
docker pull registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-node:v3.1.3
docker tag registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-node:v3.1.3 quay.io/calico/node:v3.1.3
docker rmi registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-node:v3.1.3
docker pull registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-cni:v3.1.3
docker tag registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-cni:v3.1.3 quay.io/calico/cni:v3.1.3
docker rmi registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-cni:v3.1.3
docker pull registry.cn-hangzhou.aliyuncs.com/liuyi01/pause-amd64:3.1
docker tag registry.cn-hangzhou.aliyuncs.com/liuyi01/pause-amd64:3.1 k8s.gcr.io/pause-amd64:3.1
docker rmi registry.cn-hangzhou.aliyuncs.com/liuyi01/pause-amd64:3.1
docker pull registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-typha:v0.7.4
docker tag registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-typha:v0.7.4 quay.io/calico/typha:v0.7.4
docker rmi registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-typha:v0.7.4
docker pull registry.cn-hangzhou.aliyuncs.com/liuyi01/coredns:1.1.3
docker tag registry.cn-hangzhou.aliyuncs.com/liuyi01/coredns:1.1.3 k8s.gcr.io/coredns:1.1.3
docker rmi registry.cn-hangzhou.aliyuncs.com/liuyi01/coredns:1.1.3
docker pull registry.cn-hangzhou.aliyuncs.com/liuyi01/kubernetes-dashboard-amd64:v1.8.3
docker tag registry.cn-hangzhou.aliyuncs.com/liuyi01/kubernetes-dashboard-amd64:v1.8.3 k8s.gcr.io/kubernetes-dashboard-amd64:v1.8.3
docker rmi registry.cn-hangzhou.aliyuncs.com/liuyi01/kubernetes-dashboard-amd64:v1.8.3
EOF
You can download only the images you need, or run the whole script to pull everything k8s requires.
- Bind the kubelet-bootstrap user to the system cluster role (run this on a master)
kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
- Start
mkdir -p /var/lib/kubelet
systemctl daemon-reload
systemctl start kubelet.service
systemctl enable kubelet.service
- Approve the CSRs on a Master node
kubectl get csr
NAME                                                   AGE   REQUESTOR           CONDITION
node-csr-C4O9_KIek83fXKlhPjsW37KxpzBGl6CSspvsDEiBsPc   18s   kubelet-bootstrap   Pending
node-csr-Ow3aKEezOFC3bGIerrIu_olmsKEb02GNECffcfOYYZY   18s   kubelet-bootstrap   Pending
kubectl certificate approve node-csr-C4O9_KIek83fXKlhPjsW37KxpzBGl6CSspvsDEiBsPc
kubectl certificate approve node-csr-Ow3aKEezOFC3bGIerrIu_olmsKEb02GNECffcfOYYZY
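After the approvals the kubelets finish bootstrapping and the nodes register; names and addresses will differ in your environment:
# nodes move from NotReady to Ready once the network plugin is working
kubectl get nodes -o wide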
kube-proxy deployment
- Create the certificate request file (run on a Master; distribute the results to the Node nodes afterwards)
cat > /opt/kubernetes/ssl/kube-proxy-csr.json <<EOF
{
  "CN": "system:kube-proxy",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF
- Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
scp kube-proxy*.* root@jd:/opt/kubernetes/ssl
scp kube-proxy*.* root@js:/opt/kubernetes/ssl
- Create the kubeconfig file
cd /opt/kubernetes/cfg
kubectl config set-cluster kubernetes \
  --certificate-authority=/opt/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=https://139.9.181.246:8443 \
  --kubeconfig=kube-proxy.kubeconfig
kubectl config set-credentials kube-proxy \
  --client-certificate=/opt/kubernetes/ssl/kube-proxy.pem \
  --client-key=/opt/kubernetes/ssl/kube-proxy-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-proxy.kubeconfig
kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-proxy \
  --kubeconfig=kube-proxy.kubeconfig
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
scp kube-proxy.kubeconfig root@jd:/opt/kubernetes/cfg
scp kube-proxy.kubeconfig root@js:/opt/kubernetes/cfg
- Create the kube-proxy-config.yml file (bindAddress and metricsBindAddress use the current node's internal IP)
cat > /opt/kubernetes/cfg/kube-proxy-config.yml <<EOF
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 192.168.0.3
metricsBindAddress: 192.168.0.3:10249
clientConnection:
  kubeconfig: /opt/kubernetes/cfg/kube-proxy.kubeconfig
clusterCIDR: 10.254.0.0/24
EOF
- Create the service configuration file
cat > /opt/kubernetes/cfg/kube-proxy.conf <<EOF
KUBE_PROXY_OPTS="--logtostderr=false \
--v=2 \
--log-dir=/opt/kubernetes/logs \
--config=/opt/kubernetes/cfg/kube-proxy-config.yml"
EOF
- Create the systemd service
cat > /usr/lib/systemd/system/kube-proxy.service <<EOF
[Unit]
Description=Kubernetes Proxy
After=network.target

[Service]
EnvironmentFile=-/opt/kubernetes/cfg/kube-proxy.conf
ExecStart=/opt/kubernetes/bin/kube-proxy \$KUBE_PROXY_OPTS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
- Start
systemctl daemon-reload
systemctl enable kube-proxy
systemctl start kube-proxy
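To verify kube-proxy came up (a hedged check; 10249 is the metricsBindAddress port configured above):
# the metrics endpoint should respond on the node's internal IP
curl -s http://192.168.0.3:10249/metrics | head
# or check the service state and the logs under /opt/kubernetes/logs
systemctl status kube-proxy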
Dashboard
- Download the yaml file
wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta8/aio/deploy/recommended.yaml
- Change the Service type to NodePort so it can be reached from outside the cluster; the default, ClusterIP, is only reachable inside the cluster
vim recommended.yaml
----
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30001
  type: NodePort
  selector:
    k8s-app: kubernetes-dashboard
----
- Create the Kubernetes resources
kubectl apply -f recommended.yaml
- Check the resources
[root@tx ssl]# kubectl get pod,svc -n kubernetes-dashboard
NAME                                             READY   STATUS    RESTARTS   AGE
pod/dashboard-metrics-scraper-694557449d-pcgxw   1/1     Running   0          46h
pod/kubernetes-dashboard-9774cc786-rpznb         1/1     Running   0          46h

NAME                                TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
service/dashboard-metrics-scraper   ClusterIP   10.254.0.31    <none>        8000/TCP        46h
service/kubernetes-dashboard        NodePort    10.254.0.126   <none>        443:30601/TCP   46h
- Access
Access the dashboard through any Node node's IP plus the port exposed by the kubernetes-dashboard Service; here that port is 30601, so it can be reached at 114.67.166.108:30601. Chrome warns that the connection is not secure, and the Advanced page offers no option to proceed, because we are using a self-signed certificate that Chrome cannot verify; typing thisisunsafe on that page lets you continue.
- Create a ServiceAccount and bind a cluster role
kubectl create serviceaccount dashboard-admin -n kube-system
kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
- Get the token
kubectl describe secrets -n kube-system $(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}')
With the token in hand you can log in to the Dashboard.
CoreDNS
- Download the yaml file. Installing from the official yaml gave me some trouble, so the yaml I used is as follows:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: coredns
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
    addonmanager.kubernetes.io/mode: Reconcile
  name: system:coredns
rules:
- apiGroups:
  - ""
  resources:
  - endpoints
  - services
  - pods
  - namespaces
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
    addonmanager.kubernetes.io/mode: EnsureExists
  name: system:coredns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:coredns
subjects:
- kind: ServiceAccount
  name: coredns
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local. in-addr.arpa ip6.arpa {
            pods insecure
            upstream
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        reload
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "CoreDNS"
spec:
  # replicas: not specified here:
  # 1. In order to make Addon Manager do not reconcile this replicas parameter.
  # 2. Default is 1.
  # 3. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
      annotations:
        seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
    spec:
      serviceAccountName: coredns
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      containers:
      - name: coredns
        image: coredns/coredns:1.6.7
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        args: [ "-conf", "/etc/coredns/Corefile" ]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/coredns
          readOnly: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - all
          readOnlyRootFilesystem: true
      dnsPolicy: Default
      volumes:
      - name: config-volume
        configMap:
          name: coredns
          items:
          - key: Corefile
            path: Corefile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "CoreDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.254.0.2
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
Note
Set the Service's clusterIP to the second address of the cluster IP range; my cluster-ip-range is 10.254.0.0/24, so 10.254.0.2 is used here (it must also match the clusterDNS address in kubelet-config.yml).
- Create the Kubernetes resources
kubectl apply -f coredns.yaml
- Test
kubectl run -it --rm dns-test --image=busybox:1.28.4 sh
nslookup kubernetes
Shell auto-completion
- Install bash-completion
yum install bash-completion
- Configure
source /usr/share/bash-completion/bash_completion
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
NFS
- Install the NFS server and rpcbind
yum -y install nfs-utils rpcbind
- Export the directory
mkdir -p /opt/nfs
echo "/opt/nfs 0.0.0.0/0(rw,sync,all_squash,anonuid=65534,anongid=65534)" >> /etc/exports
- Start the services
systemctl start rpcbind
systemctl start nfs-server
systemctl enable rpcbind
systemctl enable nfs-server
- Verify from a client
yum install -y nfs-utils   # showmount ships with nfs-utils; there is no standalone "showmount" package
showmount -e <nfs-server-ip>
- Create a PV/PVC
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  capacity:
    storage: 4Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    path: /opt/nfs
    server: 110.43.49.198
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: gogs-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
Save this as nfs-pv-pvc.yaml and run kubectl apply -f nfs-pv-pvc.yaml; the PV/PVC are then created:
[root@tx workspace]# kubectl get pv --all-namespaces
NAME     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM              STORAGECLASS   REASON   AGE
nfs-pv   4Gi        RWO            Recycle          Bound    default/gogs-pvc                           4s
[root@tx workspace]# kubectl get pvc --all-namespaces
NAMESPACE   NAME       STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
default     gogs-pvc   Bound    nfs-pv   4Gi        RWO                           10s
Deploy gogs
- Write the yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gogs
  labels:
    app: gogs
spec:
  replicas: 1
  template:
    metadata:
      name: gogs
      labels:
        app: gogs
    spec:
      containers:
      - name: gogs
        image: gogs/gogs:0.11.91
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: gogs-data
          mountPath: /data
        ports:
        - containerPort: 3000
          name: web
          protocol: TCP
        - containerPort: 22
          name: ssh
          protocol: TCP
        resources:
          limits:
            cpu: "500m"
            memory: "500Mi"
          requests:
            cpu: "300m"
            memory: "256Mi"
        livenessProbe:
          httpGet:
            port: 3000
        readinessProbe:
          httpGet:
            port: 3000
      volumes:
      - name: gogs-data
        persistentVolumeClaim:
          claimName: gogs-pvc
      restartPolicy: Always
  selector:
    matchLabels:
      app: gogs
---
apiVersion: v1
kind: Service
metadata:
  name: gogs
spec:
  selector:
    app: gogs
  ports:
  - port: 3000
  type: NodePort
[root@tx workspace]# kubectl get pod
NAME                   READY   STATUS    RESTARTS   AGE
gogs-fb66dbb8f-r4bgp   0/1     Running   2          79s
Although the pod's STATUS is Running, it is not actually usable: READY is 0, because the readiness and liveness probes detected that port 3000 in the container is not being listened on.
Looking at the pod's logs we find mkdir /data/git: permission denied, which is the key. /data is mounted on gogs-pvc, and that PVC was bound to nfs-pv; inspecting the directory the NFS server exports for that PV shows that only root has write permission.
[root@tx workspace]# kubectl logs gogs-779f87bd89-ld9v5
// omitted...
./run: ./setup: line 9: can't create /data/git/.ssh/environment: nonexistent directory
chmod: /data/git/.ssh/environment: No such file or directory
chown: /data: Operation not permitted
chown: /data: Operation not permitted
chown: /data/git/: No such file or directory
chmod: /data: Operation not permitted
chmod: /data/gogs: No such file or directory
chmod: /data/git/: No such file or directory
2020/06/22 13:46:33 [ WARN] Custom config '/data/gogs/conf/app.ini' not found, ignore this if you're running first time
2020/06/22 13:46:33 [FATAL] [...g/setting/setting.go:517 NewContext()] Fail to create '/data/git/.ssh': mkdir /data/git: permission denied
- Grant write permission on the nfs-server directory
[root@js opt]# chmod 777 nfs
- Restart the pod (deleting it makes the Deployment create a new one)
[root@tx ~]# kubectl delete pod gogs-779f87bd89-ld9v5
gogs now works correctly.
nginx-ingress-controller deployment
- Replace the image in the downloaded deploy.yaml with a registry that is reachable; here it is changed as follows:
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/nginx-ingress-controller:0.32.0
- Apply deploy.yaml
kubectl apply -f deploy.yaml
Redis-Cluster deployment
- Redis Cluster officially recommends six redis nodes, forming a cluster of three masters and three replicas, so six NFS storage directories are needed
mkdir -p /opt/nfs/redis/pv{1,2,3,4,5,6}
cd /opt/nfs/redis
chmod 777 ./*
- Expose the six pv directories through the NFS service (and re-export them, as shown after this block)
vim /etc/exports
/opt/nfs/redis/pv1 *(rw,all_squash)
/opt/nfs/redis/pv2 *(rw,all_squash)
/opt/nfs/redis/pv3 *(rw,all_squash)
/opt/nfs/redis/pv4 *(rw,all_squash)
/opt/nfs/redis/pv5 *(rw,all_squash)
/opt/nfs/redis/pv6 *(rw,all_squash)
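Since nfs-server is already running at this point, the new entries in /etc/exports have to be re-exported before clients can see them:
# reload /etc/exports without restarting the NFS server
exportfs -r
# list the active exports to confirm
exportfs -v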
- Create the yaml, which contains the Namespace / PersistentVolume / ConfigMap / Service / StatefulSet resources, as follows:
,如下:#namespaces apiVersion: v1 kind: Namespace metadata: name: redis-cluster-ns --- #PersistentVolume apiVersion: v1 kind: PersistentVolume metadata: name: redis-pv1 namespace: redis-cluster-ns spec: capacity: storage: 1Gi accessModes: - ReadWriteMany nfs: server: 110.43.49.198 path: /opt/nfs/redis/pv1 --- apiVersion: v1 kind: PersistentVolume metadata: name: redis-pv2 namespace: redis-cluster-ns spec: capacity: storage: 1Gi accessModes: - ReadWriteMany nfs: path: /opt/nfs/redis/pv2 server: 110.43.49.198 --- apiVersion: v1 kind: PersistentVolume metadata: name: redis-pv3 namespace: redis-cluster-ns spec: capacity: storage: 1Gi accessModes: - ReadWriteMany nfs: path: /opt/nfs/redis/pv3 server: 110.43.49.198 --- apiVersion: v1 kind: PersistentVolume metadata: name: redis-pv4 namespace: redis-cluster-ns spec: capacity: storage: 1Gi accessModes: - ReadWriteMany nfs: path: /opt/nfs/redis/pv4 server: 110.43.49.198 --- apiVersion: v1 kind: PersistentVolume metadata: name: redis-pv5 namespace: redis-cluster-ns spec: capacity: storage: 1Gi accessModes: - ReadWriteMany nfs: path: /opt/nfs/redis/pv5 server: 110.43.49.198 --- apiVersion: v1 kind: PersistentVolume metadata: name: redis-pv6 namespace: redis-cluster-ns spec: capacity: storage: 1Gi accessModes: - ReadWriteMany nfs: path: /opt/nfs/redis/pv6 server: 110.43.49.198 #ConfigMap --- apiVersion: v1 kind: ConfigMap metadata: name: redis-conf namespace: redis-cluster-ns data: redis.conf: | appendonly yes cluster-enabled yes cluster-config-file /var/lib/redis/nodes.conf cluster-node-timeout 5000 dir /var/lib/redis port 6379 #Service --- apiVersion: v1 kind: Service metadata: name: redis-service namespace: redis-cluster-ns labels: app: redis spec: selector: app: redis appCluster: redis-cluster ports: - name: redis-port port: 6379 clusterIP: None #StatefulSet --- apiVersion: apps/v1 kind: StatefulSet metadata: name: redis-app namespace: redis-cluster-ns spec: selector: matchLabels: app: redis serviceName: "redis-service" replicas: 6 template: metadata: namespace: redis-cluster-ns labels: app: redis appCluster: redis-cluster spec: terminationGracePeriodSeconds: 20 affinity: podAntiAffinity: preferredDuringSchedulingIgnoredDuringExecution: - podAffinityTerm: topologyKey: kubernetes.io/hostname labelSelector: matchExpressions: - key: app operator: In values: - redis weight: 100 containers: - name: redis image: redis # command: redis-server /etc/redis/redis.conf --protected-mode no command: - "redis-server" args: - "/etc/redis/redis.conf" #换行表示空格 - "--protected-mode" - "no" #需要请求的资源限制,100m相当于0.1核cpu resources: requests: cpu: 100m memory: 100Mi #容器暴露的端口 ports: - name: redis containerPort: 6379 protocol: TCP - name: cluster containerPort: 16379 protocol: TCP volumeMounts: - name: redis-conf mountPath: /etc/redis - name: redis-data mountPath: /var/lib/redis volumes: - name: redis-conf configMap: name: redis-conf items: - key: redis.conf path: redis.conf volumeClaimTemplates: - metadata: name: redis-data namespace: redis-cluster-ns spec: accessModes: - ReadWriteMany resources: requests: storage: 1Gi #Service --- apiVersion: v1 kind: Service metadata: name: redis-access-service namespace: redis-cluster-ns labels: app: redis spec: ports: - name: redis-port port: 6379 protocol: TCP targetPort: 6379 selector: app: redis appCluster: redis-cluster type: NodePort
- Create the resources
kubectl apply -f redis-cluster.yaml
- Create a redis-trib container to form the cluster (so far the six redis nodes are only running; they are not yet talking to each other)
kubectl run redis-trib --image=registry.cn-hangzhou.aliyuncs.com/james-public/redis-trib:latest -it -n redis-cluster-ns --restart=Never /bin/sh
- Inside the redis-trib container, run
redis-trib create --replicas 1 \
  `dig +short redis-app-0.redis-service.redis-cluster-ns.svc.cluster.local`:6379 \
  `dig +short redis-app-1.redis-service.redis-cluster-ns.svc.cluster.local`:6379 \
  `dig +short redis-app-2.redis-service.redis-cluster-ns.svc.cluster.local`:6379 \
  `dig +short redis-app-3.redis-service.redis-cluster-ns.svc.cluster.local`:6379 \
  `dig +short redis-app-4.redis-service.redis-cluster-ns.svc.cluster.local`:6379 \
  `dig +short redis-app-5.redis-service.redis-cluster-ns.svc.cluster.local`:6379
- Finally, check the redis-cluster state from any redis node to confirm the cluster was created successfully
[root@tx workspace]# kubectl exec -it redis-app-2 /bin/bash -n redis-cluster-ns
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl kubectl exec [POD] -- [COMMAND] instead.
root@redis-app-2:/data# /usr/local/bin/redis-cli -c
127.0.0.1:6379> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:3
cluster_stats_messages_ping_sent:32092
cluster_stats_messages_pong_sent:31509
cluster_stats_messages_meet_sent:1
cluster_stats_messages_sent:63602
cluster_stats_messages_ping_received:31509
cluster_stats_messages_pong_received:32093
cluster_stats_messages_received:63602
127.0.0.1:6379> cluster nodes
4cf5e475c88ed71959e4957bcd1913230c3d923a 172.17.91.4:6379@16379 myself,master - 0 1593522304000 3 connected 10923-16383
fcab7988b0546b759f6d8e07232843acd4df0017 172.17.39.5:6379@16379 slave fdf280d767db08f4453c425763705df4653dce76 0 1593522306494 4 connected
2b5b5e29bf15608e0e60f1f085bbe96ac070935a 172.17.39.6:6379@16379 slave 4cf5e475c88ed71959e4957bcd1913230c3d923a 0 1593522306000 6 connected
070378f05d3ef0c28e036d8f7b7d61a175d5c2ce 172.17.91.5:6379@16379 slave 64d07d4b62fd249a387bbbdc9bc51033e875fb54 0 1593522305000 5 connected
fdf280d767db08f4453c425763705df4653dce76 172.17.91.3:6379@16379 master - 0 1593522306089 1 connected 0-5460
64d07d4b62fd249a387bbbdc9bc51033e875fb54 172.17.39.4:6379@16379 master - 0 1593522305494 2 connected 5461-10922
127.0.0.1:6379>
Output like the above means the cluster was created successfully.
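As a final functional check (a sketch; which master a key lands on varies), write and read a key through the cluster-aware client from any redis pod:
# -c enables cluster-mode redirects, so the commands succeed from any node
redis-cli -c set hello world
redis-cli -c get hello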
References
https://blog.csdn.net/chen645800876/article/details/105279648/