Building a Kubernetes Cluster Across VPCs (from Binaries)

James · October 30, 2020

Overview

Server planning

The setup uses a three-node etcd cluster, two Master nodes, and two Node (worker) nodes, as follows:

Internal IP      Public IP         Hostname   Components                                                              Role     Spec
172.16.0.5       106.55.0.70       tx         etcd, kube-apiserver, kube-controller-manager, kube-scheduler, cfssl    Master   1C2G
192.168.0.210    139.9.181.246     hw         nginx, kube-apiserver, kube-controller-manager, kube-scheduler          Master   1C2G
192.168.0.3      114.67.166.108    jd         etcd, kubelet, kube-proxy, flannel, docker                              Node     2C4G
10.0.0.149       110.43.49.198     js         etcd, kubelet, kube-proxy, flannel, docker                              Node     2C4G

Software versions

Kubernetes 1.18.3
Docker 19.03.11
Etcd 3.4.9
Flanneld 0.12.0
Nginx 1.18

Certificates

The etcd cluster and the Kubernetes cluster each use their own CA certificate.

Preparation

Set the hostnames

hostnamectl set-hostname tx    # on the tx node
hostnamectl set-hostname hw    # on the hw node
hostnamectl set-hostname jd    # on the jd node
hostnamectl set-hostname js    # on the js node

Edit the hosts file

cat >> /etc/hosts <<EOF
114.67.166.108	jd
139.9.181.246	hw
106.55.0.70		tx
10.0.0.149		js
EOF

Firewall (disabled)

For simplicity the firewall is simply disabled here; you can instead open the required ports yourself. If you are using cloud servers, remember to adjust (or disable) the security groups as well.

systemctl stop firewalld && systemctl disable firewalld

iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && iptables -P FORWARD ACCEPT

swapoff -a

sed -i '/swap/s/^\(.*\)$/#\1/g' /etc/fstab

setenforce 0

vim /etc/selinux/config
SELINUX=disabled

service dnsmasq stop && systemctl disable dnsmasq

Install the cfssl tools

cfssl is used to generate all the certificates Kubernetes needs; for convenience it is installed only on the tx node.

wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64

wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64

wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64

chmod +x cfssl_linux-amd64 cfssljson_linux-amd64 cfssl-certinfo_linux-amd64

mv cfssl_linux-amd64 /usr/local/bin/cfssl

mv cfssljson_linux-amd64 /usr/local/bin/cfssljson

mv cfssl-certinfo_linux-amd64 /usr/bin/cfssl-certinfo

Create the installation and log directories

The installation directory here is /opt/kubernetes:

mkdir -p /opt/kubernetes/{bin,cfg,ssl,logs}

mkdir -p /opt/etcd/{bin,cfg,ssl,data}

bin: binary files

cfg: configuration files

ssl: certificates

data: data, for example etcd's data files

logs: all logs go into this directory

Download the packages

etcd

kubernetes

flannel

Nginx
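
For reference, the release artifacts for the versions listed above can be fetched roughly as follows (my addition; the URLs follow each project's usual release layout, so verify them against the release pages before relying on them):

# assumed download URLs for the versions used in this post
wget https://github.com/etcd-io/etcd/releases/download/v3.4.9/etcd-v3.4.9-linux-amd64.tar.gz
wget https://dl.k8s.io/v1.18.3/kubernetes-server-linux-amd64.tar.gz
wget https://github.com/coreos/flannel/releases/download/v0.12.0/flannel-v0.12.0-linux-amd64.tar.gz
wget http://nginx.org/download/nginx-1.18.0.tar.gz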

Extract the downloaded packages and put the binaries into the installation directory, /opt/kubernetes/bin in my case, as follows:

Master nodes

[root@tx bin]# pwd
/opt/kubernetes/bin
[root@tx bin]# ll
total 331444
-rwxr-xr-x 1 root root 120668160 Jun 15 15:01 kube-apiserver
-rwxr-xr-x 1 root root 110059520 Jun 15 15:01 kube-controller-manager
-rwxr-xr-x 1 root root  42950656 Jun 15 15:01 kube-scheduler
[root@tx bin]# pwd
/opt/etcd/bin
[root@tx bin]# ll
total 23272
-rwxr-xr-x 1 root root 23827424 Jun 15 15:01 etcd

Node nodes

[root@jd bin]# pwd
/opt/kubernetes/bin
[root@jd bin]# ll
total 211776
-rwxr-xr-x 1 root root  35253112 Jun 15 14:58 flanneld
-rwxr-xr-x 1 root root 113283800 Jun 15 14:59 kubelet
-rwxr-xr-x 1 root root  38379520 Jun 15 14:59 kube-proxy
-rwxr-xr-x 1 root root      2139 Jun 15 14:58 mk-docker-opts.sh
[root@jd bin]# pwd
/opt/etcd/bin
[root@jd bin]# ll
total 23272
-rwxr-xr-x 1 root root 23827424 Jun 15 15:01 etcd

Client

Put etcdctl and kubectl into /usr/bin so they can be run from anywhere. These two clients can be placed on every node, or only on the Master nodes.

[root@tx bin]# pwd
/usr/bin
[root@tx bin]# ll
total 132856
# other files omitted
-rwxr-xr-x    1 root root   13498880 May 25 20:17 etcdctl
-rwxr-xr-x    1 root root   43115328 May 26 19:38 kubectl

Deploying etcd

  1. Create the etcd CA config file ca-config.json

    cat > /opt/etcd/ssl/ca-config.json <<EOF
    {
      "signing": {
        "default": {
          "expiry": "87600h"
        },
        "profiles": {
          "www": {
             "expiry": "87600h",
             "usages": [
                "signing",
                "key encipherment",
                "server auth",
                "client auth"
            ]
          }
        }
      }
    }
    EOF
    
  2. Create the CA certificate signing request file ca-csr.json

    cat > /opt/etcd/ssl/ca-csr.json <<EOF
    {
        "CN": "etcd CA",
        "key": {
            "algo": "rsa",
            "size": 2048
        },
        "names": [
            {
                "C": "CN",
                "L": "ShenZhen",
                "ST": "GuangDong"
            }
        ]
    }
    EOF
    
  3. Generate the CA certificate (run in /opt/etcd/ssl, where the JSON files were created)

    cd /opt/etcd/ssl && cfssl gencert --initca ca-csr.json | cfssljson --bare ca
    
  4. Create the etcd server certificate request file

    cat > /opt/etcd/ssl/server-csr.json <<EOF
    {
        "CN": "etcd",
        "hosts": [
        "106.55.0.70",
        "114.67.166.108",
        "110.43.49.198",
        "172.16.0.5",
        "192.168.0.3",
        "10.0.0.149"
        ],
        "key": {
            "algo": "rsa",
            "size": 2048
        },
        "names": [
            {
                "C": "CN",
                "L": "ShenZhen",
                "ST": "GuangDong"
            }
        ]
    }
    EOF
    

    Note that the hosts field must list the IPs of all three etcd servers (both the internal and the public addresses are included here).

  5. Generate the certificates

    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=www server-csr.json | cfssljson -bare server
    

    Note that the -profile argument must match one of the profiles defined in ca-config.json (www here).

  6. Check: the following files should now have been generated:

    [root@tx ssl]# ll
    total 36
    -rw-r--r-- 1 root root  287 Jun 16 14:23 ca-config.json
    -rw-r--r-- 1 root root  960 Jun 16 14:25 ca.csr
    -rw-r--r-- 1 root root  212 Jun 16 14:25 ca-csr.json
    -rw------- 1 root root 1675 Jun 16 14:25 ca-key.pem
    -rw-r--r-- 1 root root 1273 Jun 16 14:25 ca.pem
    -rw-r--r-- 1 root root 1041 Jun 16 14:28 server.csr
    -rw-r--r-- 1 root root  347 Jun 16 14:27 server-csr.json
    -rw------- 1 root root 1679 Jun 16 14:28 server-key.pem
    -rw-r--r-- 1 root root 1371 Jun 16 14:28 server.pem
    
    
  7. Distribute the certificates to the other nodes

    scp ./* root@jd:/opt/etcd/ssl
    
    scp ./* root@js:/opt/etcd/ssl
    

    Make sure all three nodes end up with identical certificates (all the files from step 6).

  8. Create the configuration file

    cat > /opt/etcd/cfg/etcd.conf <<EOF
    #[Member]
    ETCD_NAME="etcd-1"
    ETCD_DATA_DIR="/opt/etcd/data"
    ETCD_LISTEN_PEER_URLS="https://172.16.0.5:2380"
    ETCD_LISTEN_CLIENT_URLS="https://172.16.0.5:2379"
    #[Clustering]
    ETCD_INITIAL_ADVERTISE_PEER_URLS="https://106.55.0.70:2380"
    ETCD_ADVERTISE_CLIENT_URLS="https://106.55.0.70:2379"
    ETCD_INITIAL_CLUSTER="etcd-1=https://106.55.0.70:2380,etcd-2=https://114.67.166.108:2380,etcd-3=https://110.43.49.198:2380"
    ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
    ETCD_INITIAL_CLUSTER_STATE="new"
    ETCD_ENABLE_V2="true"
    EOF
    

    A few of these parameters deserve attention:

    ETCD_NAME: must be unique on each node, and must match the name/address pairs in ETCD_INITIAL_CLUSTER

    ETCD_LISTEN_PEER_URLS: the current node's internal IP

    ETCD_LISTEN_CLIENT_URLS: likewise, the current node's internal IP

    ETCD_INITIAL_ADVERTISE_PEER_URLS: the current node's public IP

    ETCD_ADVERTISE_CLIENT_URLS: the current node's public IP

    ETCD_INITIAL_CLUSTER: the list of cluster members

    Pay particular attention to ETCD_ENABLE_V2, which re-enables the etcd v2 API. etcd 3.4.9 serves the v3 API by default, but flannel 0.12.0 does not support etcd v3, so the v2 API has to be switched on manually or flannel will not work. Likewise, when we later write flannel's configuration into etcd, the client must use the v2 API as well, i.e. export ETCDCTL_API=2.
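
    For comparison, this is a sketch (my addition, not from the original post) of what /opt/etcd/cfg/etcd.conf would look like on the jd node (etcd-2), using the IP plan from the table above; the js node (etcd-3) is analogous with its own addresses:

    # /opt/etcd/cfg/etcd.conf on jd (etcd-2) -- sketch
    ETCD_NAME="etcd-2"
    ETCD_DATA_DIR="/opt/etcd/data"
    ETCD_LISTEN_PEER_URLS="https://192.168.0.3:2380"
    ETCD_LISTEN_CLIENT_URLS="https://192.168.0.3:2379"
    ETCD_INITIAL_ADVERTISE_PEER_URLS="https://114.67.166.108:2380"
    ETCD_ADVERTISE_CLIENT_URLS="https://114.67.166.108:2379"
    ETCD_INITIAL_CLUSTER="etcd-1=https://106.55.0.70:2380,etcd-2=https://114.67.166.108:2380,etcd-3=https://110.43.49.198:2380"
    ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
    ETCD_INITIAL_CLUSTER_STATE="new"
    ETCD_ENABLE_V2="true"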

  9. Create the systemd service

    cat > /usr/lib/systemd/system/etcd.service <<EOF
    [Unit]
    Description=Etcd Server
    After=network.target
    After=network-online.target
    Wants=network-online.target
    [Service]
    Type=notify
    EnvironmentFile=/opt/etcd/cfg/etcd.conf
    ExecStart=/opt/etcd/bin/etcd \\
    --cert-file=/opt/etcd/ssl/server.pem \\
    --key-file=/opt/etcd/ssl/server-key.pem \\
    --peer-cert-file=/opt/etcd/ssl/server.pem \\
    --peer-key-file=/opt/etcd/ssl/server-key.pem \\
    --trusted-ca-file=/opt/etcd/ssl/ca.pem \\
    --peer-trusted-ca-file=/opt/etcd/ssl/ca.pem \\
    --logger=zap
    
    Restart=on-failure
    LimitNOFILE=65536
    [Install]
    WantedBy=multi-user.target
    EOF
    

    EnvironmentFile: the path of the configuration file to load; etcd reads its environment variables from this file

  10. Start etcd (on each of the three nodes)

    systemctl daemon-reload
    
    systemctl start etcd.service
    
    systemctl enable etcd.service
    
  11. Configure etcdctl environment variables so the certificate and endpoint flags do not have to be passed on every invocation

    cat >> ~/.bashrc <<EOF
    export ETCDCTL_ENDPOINTS=https://106.55.0.70:2379,https://114.67.166.108:2379,https://110.43.49.198:2379
    export ETCDCTL_CACERT=/opt/etcd/ssl/ca.pem
    export ETCDCTL_CERT=/opt/etcd/ssl/server.pem
    export ETCDCTL_KEY=/opt/etcd/ssl/server-key.pem
    EOF
    
    . ~/.bashrc
    
  12. Check the cluster health

    [root@tx cfg]# etcdctl endpoint health
    https://106.55.0.70:2379 is healthy: successfully committed proposal: took = 18.282218ms
    https://110.43.49.198:2379 is healthy: successfully committed proposal: took = 26.816091ms
    https://114.67.166.108:2379 is healthy: successfully committed proposal: took = 35.442918ms
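
    Optionally (my addition), list the members as well to confirm that all three nodes joined:

    etcdctl member list -w table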
    

Deploying the Master Nodes

Deploying kube-apiserver

  1. Create the Kubernetes CA config file ca-config.json

    cat > /opt/kubernetes/ssl/ca-config.json <<EOF
    {
      "signing": {
        "default": {
          "expiry": "87600h"
        },
        "profiles": {
          "kubernetes": {
             "expiry": "87600h",
             "usages": [
                "signing",
                "key encipherment",
                "server auth",
                "client auth"
            ]
          }
        }
      }
    }
    EOF
    
  2. Create the CA certificate signing request file

    cat > /opt/kubernetes/ssl/ca-csr.json <<EOF
    {
        "CN": "kubernetes",
        "key": {
            "algo": "rsa",
            "size": 2048
        },
        "names": [
            {
                "C": "CN",
                "L": "ShenZhen",
                "ST": "GuangDong",
                "O": "k8s",
                "OU": "System"
            }
        ]
    }
    EOF
    
  3. Generate the CA certificate (run in /opt/kubernetes/ssl)

    cd /opt/kubernetes/ssl && cfssl gencert --initca ca-csr.json | cfssljson --bare ca
    
  4. Create the server certificate request file

    cat > /opt/kubernetes/ssl/server-csr.json <<EOF
    {
        "CN": "kubernetes",
        "hosts": [
          "10.254.0.1",
          "127.0.0.1",
          "139.9.181.246",
          "106.55.0.70",
          "192.168.0.210",
          "172.16.0.5",
          "kubernetes",
          "kubernetes.default",
          "kubernetes.default.svc",
          "kubernetes.default.svc.cluster",
          "kubernetes.default.svc.cluster.local"
        ],
        "key": {
            "algo": "rsa",
            "size": 2048
        },
        "names": [
            {
                "C": "CN",
                "L": "ShenZhen",
                "ST": "GuangDong",
                "O": "k8s",
                "OU": "System"
            }
        ]
    }
    EOF
    

    The hosts field must contain the public IPs and internal IPs of all Master nodes as well as the cluster IP of the kubernetes service (the first address of the Service CIDR, 10.254.0.1 here).

  5. Generate the certificates

    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes server-csr.json | cfssljson -bare server
    
  6. Generate the token file that kubelets on the Node nodes will use to authenticate when joining the cluster

    • Generate a token

      head -c 16 /dev/urandom | od -An -t x | tr -d ' '
      f33e04663a4d7dd00be3df626e7251c8
      
    • Create token.csv

      cat > /opt/kubernetes/cfg/token.csv <<EOF
      f33e04663a4d7dd00be3df626e7251c8,kubelet-bootstrap,10001,"system:node-bootstrapper"
      EOF
      

      The fields in token.csv are: the token itself, the user name, the user's UID, and the group the user belongs to (system:node-bootstrapper here).

  7. Create the configuration file

    cat > /opt/kubernetes/cfg/kube-apiserver.conf <<EOF
    KUBE_APISERVER_OPTS="--logtostderr=false \
    --v=2 \
    --log-dir=/opt/kubernetes/logs \
    --etcd-servers=https://106.55.0.70:2379,https://114.67.166.108:2379,https://110.43.49.198:2379 \
    --bind-address=172.16.0.5 \
    --secure-port=6443 \
    --advertise-address=106.55.0.70 \
    --allow-privileged=true \
    --service-cluster-ip-range=10.254.0.0/24 \
    --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,NodeRestriction \
    --authorization-mode=RBAC,Node \
    --enable-bootstrap-token-auth=true \
    --token-auth-file=/opt/kubernetes/cfg/token.csv \
    --service-node-port-range=30000-32767 \
    --kubelet-client-certificate=/opt/kubernetes/ssl/server.pem \
    --kubelet-client-key=/opt/kubernetes/ssl/server-key.pem \
    --tls-cert-file=/opt/kubernetes/ssl/server.pem  \
    --tls-private-key-file=/opt/kubernetes/ssl/server-key.pem \
    --client-ca-file=/opt/kubernetes/ssl/ca.pem \
    --service-account-key-file=/opt/kubernetes/ssl/ca-key.pem \
    --etcd-cafile=/opt/etcd/ssl/ca.pem \
    --etcd-certfile=/opt/etcd/ssl/server.pem \
    --etcd-keyfile=/opt/etcd/ssl/server-key.pem \
    --audit-log-maxage=30 \
    --audit-log-maxbackup=3 \
    --audit-log-maxsize=100 \
    --audit-log-path=/opt/kubernetes/logs/k8s-audit.log"
    EOF
    

    --bind-address: the current node's internal IP

    --token-auth-file: the path to the token.csv generated in step 6

  8. Create the systemd service

    cat > /usr/lib/systemd/system/kube-apiserver.service <<EOF
    [Unit]
    Description=Kubernetes API Server
    Documentation=https://github.com/kubernetes/kubernetes
    [Service]
    EnvironmentFile=-/opt/kubernetes/cfg/kube-apiserver.conf
    ExecStart=/opt/kubernetes/bin/kube-apiserver \$KUBE_APISERVER_OPTS
    Restart=on-failure
    [Install]
    WantedBy=multi-user.target
    EOF
    
  9. Copy the certificates generated on the tx node, together with token.csv, kube-apiserver.conf and kube-apiserver.service, to the same directories on the hw node. Remember to change --bind-address to hw's internal IP (192.168.0.210) and --advertise-address to its public IP (139.9.181.246).

  10. Start it

    systemctl daemon-reload

    systemctl start kube-apiserver.service

    systemctl enable kube-apiserver.service
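
    A quick check (my addition, not part of the original steps): in 1.18 the /healthz endpoint on the secure port is reachable without credentials through the built-in system:public-info-viewer role, so as long as anonymous auth has not been disabled the following should print ok:

    curl -k https://172.16.0.5:6443/healthz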

Deploying Nginx to Load-Balance kube-apiserver

  1. Install Nginx

    • Install the packages Nginx depends on

      yum -y install gcc pcre-devel zlib-devel
      
    • Build and install Nginx; without --prefix it is installed to /usr/local/nginx by default

      tar -zxvf nginx-1.18.0.tar.gz
      
      cd nginx-1.18.0
      
      ./configure --with-stream
      
      make && make install
      
  2. Create the systemd service

    cat > /usr/lib/systemd/system/nginx.service <<EOF
    [Unit]
    Description=nginx - high performance web server
    After=network.target remote-fs.target nss-lookup.target
    
    [Service]
    Type=forking
    ExecStart=/usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
    ExecReload=/usr/local/nginx/sbin/nginx -s reload
    ExecStop=/usr/local/nginx/sbin/nginx -s stop
    
    [Install]
    WantedBy=multi-user.target
    EOF
    
  3. Configure layer-4 (stream) load balancing

    cat >> /usr/local/nginx/conf/nginx.conf <<EOF
    stream {
        upstream apiserver {
            server 139.9.181.246:6443;
    		server 106.55.0.70:6443;
        }
        server {
            listen 8443;
            proxy_pass apiserver;
        }
    }
    EOF
    

    Nginx listens on port 8443 and proxies to port 6443 on 139.9.181.246 and 106.55.0.70.

  4. Start and verify

    systemctl daemon-reload
    
    systemctl start nginx.service
    
    systemctl enable nginx.service
    

    Nginx is deployed on the hw node. For simplicity it is a single instance here, which is not reliable enough for production; there it should be paired with keepalived in an active/standby setup.
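
    To verify the stream configuration (my addition), test the config file and confirm Nginx is listening on 8443:

    /usr/local/nginx/sbin/nginx -t

    ss -lntp | grep 8443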

Deploying kubectl

  1. Create the certificate request file

    cat > /opt/kubernetes/ssl/admin-csr.json <<EOF
    {
      "CN": "admin",
      "hosts": [],
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "ST": "GuangDong",
          "L": "ShenZhen",
          "O": "system:masters",
          "OU": "System"
        }
      ]
    }
    EOF
    
  2. Generate the certificate

    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin
    
  3. Generate the admin.conf kubeconfig and copy it into the user's .kube directory

    cd /opt/kubernetes/cfg
    
    kubectl config set-cluster kubernetes \
        --certificate-authority=/opt/kubernetes/ssl/ca.pem \
        --embed-certs=true \
        --server=https://139.9.181.246:8443 \
        --kubeconfig=admin.conf
    
    kubectl config set-credentials admin \
        --client-certificate=/opt/kubernetes/ssl/admin.pem \
        --client-key=/opt/kubernetes/ssl/admin-key.pem \
        --embed-certs=true \
        --kubeconfig=admin.conf
    
    kubectl config set-context kubernetes \
        --cluster=kubernetes \
        --user=admin \
        --kubeconfig=admin.conf
    
    kubectl config use-context kubernetes --kubeconfig=admin.conf
    
    mkdir -p /root/.kube && cp admin.conf /root/.kube/config
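
    With the kubeconfig in place, kubectl talks to the API servers through the Nginx endpoint; a quick check (my addition):

    kubectl cluster-info

    kubectl version --short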
    

Deploying kube-controller-manager

  1. Create the certificate request file

    cat > /opt/kubernetes/ssl/kube-controller-manager-csr.json <<EOF
    {
        "CN": "system:kube-controller-manager",   
        "hosts": [
    	"127.0.0.1",
    	"139.9.181.246",
    	"106.55.0.70",
    	"192.168.0.210",
    	"172.16.0.5",
    	"node01.k8s.com",
    	"node02.k8s.com",
    	"node03.k8s.com"
        ],
        "key": {
            "algo": "rsa",
            "size": 2048
        },
        "names": [
          {
            "C": "CN",
            "ST": "GuangDong",
            "L": "ShenZhen",
            "O": "system:kube-controller-manager",
            "OU": "System"
          }
        ]
    }
    EOF
    

    The hosts field must contain the internal and public IPs of the master nodes

    CN and O are both system:kube-controller-manager; the built-in ClusterRoleBinding system:kube-controller-manager grants kube-controller-manager the permissions it needs

  2. Generate the certificate

    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
    
  3. Create the kubeconfig file

    cd /opt/kubernetes/cfg
    
    kubectl config set-cluster kubernetes \
      --certificate-authority=/opt/kubernetes/ssl/ca.pem \
      --embed-certs=true \
      --server=https://139.9.181.246:8443 \
      --kubeconfig=kube-controller-manager.kubeconfig
    
    kubectl config set-credentials system:kube-controller-manager \
      --client-certificate=/opt/kubernetes/ssl/kube-controller-manager.pem \
      --client-key=/opt/kubernetes/ssl/kube-controller-manager-key.pem \
      --embed-certs=true \
      --kubeconfig=kube-controller-manager.kubeconfig
    
    kubectl config set-context system:kube-controller-manager \
      --cluster=kubernetes \
      --user=system:kube-controller-manager \
      --kubeconfig=kube-controller-manager.kubeconfig
    
    kubectl config use-context system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig
    
  4. Create the systemd service (the flags are written directly into kube-controller-manager.service here, because putting them into a separate file loaded via EnvironmentFile kept producing a "no --master specified" style error whose exact cause I never tracked down)

    cat > /usr/lib/systemd/system/kube-controller-manager.service <<EOF
    [Unit]
    Description=Kubernetes Controller Manager
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    
    [Service]
    ExecStart=/opt/kubernetes/bin/kube-controller-manager   \\
    --profiling   \\
    --cluster-name=kubernetes   \\
    --controllers=*,bootstrapsigner,tokencleaner   \\
    --kube-api-qps=1000   \\
    --kube-api-burst=2000   \\
    --leader-elect=true   \\
    --use-service-account-credentials   \\
    --concurrent-service-syncs=2   \\
    --tls-cert-file=/opt/kubernetes/ssl/kube-controller-manager.pem   \\
    --tls-private-key-file=/opt/kubernetes/ssl/kube-controller-manager-key.pem   \\
    --authentication-kubeconfig=/opt/kubernetes/cfg/kube-controller-manager.kubeconfig   \\
    --authorization-kubeconfig=/opt/kubernetes/cfg/kube-controller-manager.kubeconfig   \\
    --client-ca-file=/opt/kubernetes/ssl/ca.pem   \\
    --requestheader-client-ca-file=/opt/kubernetes/ssl/ca.pem   \\
    --requestheader-extra-headers-prefix="X-Remote-Extra-"   \\
    --requestheader-group-headers=X-Remote-Group   \\
    --requestheader-username-headers=X-Remote-User   \\
    --cluster-signing-cert-file=/opt/kubernetes/ssl/ca.pem   \\
    --cluster-signing-key-file=/opt/kubernetes/ssl/ca-key.pem   \\
    --experimental-cluster-signing-duration=876000h   \\
    --horizontal-pod-autoscaler-sync-period=10s   \\
    --concurrent-deployment-syncs=10   \\
    --concurrent-gc-syncs=30   \\
    --node-cidr-mask-size=24   \\
    --service-cluster-ip-range=10.254.0.0/24   \\
    --pod-eviction-timeout=6m   \\
    --terminated-pod-gc-threshold=10000   \\
    --root-ca-file=/opt/kubernetes/ssl/ca.pem   \\
    --service-account-private-key-file=/opt/kubernetes/ssl/ca-key.pem   \\
    --kubeconfig=/opt/kubernetes/cfg/kube-controller-manager.kubeconfig   \\
    --logtostderr=false   \\
    --log-dir=/opt/kubernetes/logs   \\
    --v=2
    
    Restart=on-failure
    RestartSec=5
    
    [Install]
    WantedBy=multi-user.target
    EOF
    

    service-cluster-ip-range: the Service cluster IP range; it must match the flag of the same name passed to kube-apiserver

  5. Likewise, copy the kube-controller-manager certificates, kube-controller-manager.kubeconfig and kube-controller-manager.service to the other master node.

  6. Start it

    systemctl daemon-reload
    
    systemctl start kube-controller-manager.service
    
    systemctl enable kube-controller-manager.service
    

Deploying kube-scheduler

  1. Create the certificate request file

    cat > /opt/kubernetes/ssl/kube-scheduler-csr.json <<EOF
    {
        "CN": "system:kube-scheduler",
        "hosts": [
    	"127.0.0.1",
    	"139.9.181.246",
    	"106.55.0.70",
    	"192.168.0.210",
    	"172.16.0.5"
        ],
        "key": {
            "algo": "rsa",
            "size": 2048
        },
        "names": [
          {
            "C": "CN",
            "ST": "GuangDong",
            "L": "ShenZhen",
            "O": "system:kube-scheduler",
            "OU": "System"
          }
        ]
    }
    EOF
    

    hosts: must list the internal and public IPs of all master nodes

  2. Generate the certificate

    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler
    
  3. Create the kubeconfig file

    cd /opt/kubernetes/cfg
    
    kubectl config set-cluster kubernetes \
      --certificate-authority=/opt/kubernetes/ssl/ca.pem \
      --embed-certs=true \
      --server=https://139.9.181.246:8443 \
      --kubeconfig=kube-scheduler.kubeconfig
    
    kubectl config set-credentials system:kube-scheduler \
      --client-certificate=/opt/kubernetes/ssl/kube-scheduler.pem \
      --client-key=/opt/kubernetes/ssl/kube-scheduler-key.pem \
      --embed-certs=true \
      --kubeconfig=kube-scheduler.kubeconfig
    
    kubectl config set-context system:kube-scheduler \
      --cluster=kubernetes \
      --user=system:kube-scheduler \
      --kubeconfig=kube-scheduler.kubeconfig
    
    kubectl config use-context system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig 
    
  4. Create the service configuration file

    cat > /opt/kubernetes/cfg/kube-scheduler.conf <<EOF
    KUBE_SCHEDULER_ARGS="--bind-address=127.0.0.1 \
      --kubeconfig=/opt/kubernetes/cfg/kube-scheduler.kubeconfig \
      --leader-elect=true \
      --alsologtostderr=true \
      --logtostderr=false \
      --log-dir=/opt/kubernetes/logs \
      --v=2"
    EOF
    
  5. Create the systemd service

    cat > /usr/lib/systemd/system/kube-scheduler.service <<EOF
    [Unit]
    Description=Kubernetes Scheduler
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    
    [Service]
    EnvironmentFile=-/opt/kubernetes/cfg/kube-scheduler.conf
    ExecStart=/opt/kubernetes/bin/kube-scheduler \$KUBE_SCHEDULER_ARGS
    Restart=on-failure
    RestartSec=5
    
    [Install]
    WantedBy=multi-user.target
    EOF
    
  6. Likewise, copy the kube-scheduler certificates, kube-scheduler.kubeconfig, kube-scheduler.conf and kube-scheduler.service to the other master node.

  7. Start it

    systemctl daemon-reload
    
    systemctl start kube-scheduler
    
    systemctl enable kube-scheduler
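
    Once kube-controller-manager and kube-scheduler are both running, the whole control plane can be sanity-checked from a master node (my addition); scheduler, controller-manager and the etcd members should all report Healthy:

    kubectl get componentstatuses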
    

Deploying the Node (Worker) Nodes

Installing Docker

Docker is installed with yum here. The default yum repositories do not contain Docker, so a repository has to be added with yum-config-manager; that command is missing until yum-utils is installed, hence the first step.

  1. Install yum-utils

    yum install -y yum-utils device-mapper-persistent-data lvm2
    
  2. Add the Docker yum repository

    yum-config-manager \
        --add-repo \
        https://download.docker.com/linux/centos/docker-ce.repo
    
  3. Install docker-ce and the client

    yum install -y docker-ce-19.03.11-3.el7 docker-ce-cli-19.03.11-3.el7 containerd.io
    
  4. Configure the Aliyun registry mirror (access to Docker Hub from mainland China is slow)

    sudo mkdir -p /etc/docker
    
    sudo tee /etc/docker/daemon.json <<-'EOF'
    {
      "registry-mirrors": ["https://gqjyyepn.mirror.aliyuncs.com"]
    }
    EOF
    
    sudo systemctl daemon-reload
    
    sudo systemctl restart docker
    
    sudo systemctl enable docker
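
    One extra check worth doing here (my addition): the kubelet configuration later in this post sets cgroupDriver: cgroupfs, so Docker must be using the same cgroup driver. cgroupfs is Docker's default on CentOS 7 unless you override it in daemon.json:

    docker info --format '{{.CgroupDriver}}'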
    

Deploying flannel

Flannel provides communication between Docker containers running on different hosts.

  1. Write the network that flannel allocates subnets from into etcd, so that the flannel nodes do not hand out conflicting IPs

    ETCDCTL_API=2 etcdctl --endpoints="https://106.55.0.70:2379,https://114.67.166.108:2379,https://110.43.49.198:2379" \
        --ca-file=/opt/etcd/ssl/ca.pem --cert-file=/opt/etcd/ssl/server.pem --key-file=/opt/etcd/ssl/server-key.pem \
        set /coreos.com/network/config  '{ "Network": "172.17.0.0/16", "Backend": {"Type": "vxlan"}}'
    

    This entry must be written with the etcdctl v2 API (etcdctl set); it cannot be added with the v3 put command. Note that the v2 client takes --ca-file/--cert-file/--key-file flags and does not read the v3-style ETCDCTL_CACERT/CERT/KEY variables exported earlier.

  2. Create the configuration file

    cat > /opt/kubernetes/cfg/flannel.conf <<EOF
    FLANNEL_ARGS="--public-ip=114.67.166.108 \
    --iface=eth0 \
    --etcd-endpoints=https://106.55.0.70:2379,https://110.43.49.198:2379,https://114.67.166.108:2379 \
    --etcd-cafile=/opt/etcd/ssl/ca.pem \
    --etcd-certfile=/opt/etcd/ssl/server.pem \
    --etcd-keyfile=/opt/etcd/ssl/server-key.pem \
    --ip-masq=true"
    EOF
    

Note

--public-ip: the current node's public IP

--iface: the internal IP, or the name of the network interface that carries it

  3. Create the systemd service

    cat > /usr/lib/systemd/system/flanneld.service <<EOF
    [Unit]
    Description=Flanneld overlay address etc agent
    After=network-online.target network.target
    Before=docker.service
    
    [Service]
    Type=notify
    EnvironmentFile=-/opt/kubernetes/cfg/flannel.conf
    ExecStart=/opt/kubernetes/bin/flanneld \$FLANNEL_ARGS
    ExecStartPost=/opt/kubernetes/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
    
    Restart=on-failure
    
    [Install]
    WantedBy=multi-user.target
    EOF
    

    The mk-docker-opts.sh script writes the Pod subnet assigned to flannel into /run/flannel/docker; when Docker starts, it uses the values in that file to configure the docker0 bridge

  4. Configure Docker to use the subnet allocated by flannel

    • Edit the docker.service file

      vim /usr/lib/systemd/system/docker.service
      

      Several changes are needed; the edited file looks like this:

      [root@jd bin]# cat /usr/lib/systemd/system/docker.service
      [Unit]
      Description=Docker Application Container Engine
      Documentation=https://docs.docker.com
      BindsTo=containerd.service
      After=network-online.target firewalld.service containerd.service flanneld.service
      Wants=network-online.target
      Requires=docker.socket flanneld.service
      
      [Service]
      Type=notify
      EnvironmentFile=-/run/flannel/docker
      # the default is not to use systemd for cgroups because the delegate issues still
      # exists and systemd currently does not support the cgroup feature set required
      # for containers run by docker
      ExecStart=/usr/bin/dockerd $DOCKER_NETWORK_OPTIONS -H fd:// --containerd=/run/containerd/containerd.sock
      ExecReload=/bin/kill -s HUP $MAINPID
      TimeoutSec=0
      RestartSec=2
      Restart=always
      # ... rest omitted
      
      

      In the [Unit] section, append flanneld.service to the After= line

      Below Wants=, add flanneld.service to Requires=

      In the [Service] section, add EnvironmentFile=-/run/flannel/docker after Type=

      Append $DOCKER_NETWORK_OPTIONS to the ExecStart= line

  5. Start the services

    systemctl daemon-reload
    
    systemctl start flanneld.service
    
    systemctl enable flanneld.service
    
    systemctl restart docker.service
    

    Check:

    ip a
    # other interfaces omitted
    3: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default 
        link/ether 72:c6:c5:14:c4:07 brd ff:ff:ff:ff:ff:ff
        inet 10.254.12.0/32 scope global flannel.1
           valid_lft forever preferred_lft forever
        inet6 fe80::70c6:c5ff:fe14:c407/64 scope link 
           valid_lft forever preferred_lft forever
    4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default 
        link/ether 02:42:34:3d:98:65 brd ff:ff:ff:ff:ff:ff
        inet 10.254.12.1/24 brd 10.254.12.255 scope global docker0
           valid_lft forever preferred_lft forever
        inet6 fe80::42:34ff:fe3d:9865/64 scope link 
           valid_lft forever preferred_lft forever
    
    

    If flannel.1 and docker0 are on the same subnet, the deployment succeeded.
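
    For an end-to-end test of the overlay (my addition, assuming the busybox image can be pulled): start a container on one node, note its IP, and ping it from the other node:

    # on jd
    docker run -d --name flannel-test busybox sleep 3600
    docker inspect -f '{{.NetworkSettings.IPAddress}}' flannel-test

    # on js, substituting the address printed above
    docker run --rm busybox ping -c 3 <ip-from-jd>

    # clean up on jd afterwards
    docker rm -f flannel-test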

Deploying kubelet

  1. Create a bootstrap.kubeconfig file for kubelet. This is done on a master node first and then copied to the worker nodes, because only the master nodes have kubectl installed

    cd /opt/kubernetes/cfg
    
    kubectl config set-cluster kubernetes \
          --certificate-authority=/opt/kubernetes/ssl/ca.pem \
          --embed-certs=true \
          --server=https://139.9.181.246:8443 \
          --kubeconfig=bootstrap.kubeconfig
    
    # 设置客户端认证参数
    kubectl config set-credentials kubelet-bootstrap \
          --token=f33e04663a4d7dd00be3df626e7251c8 \
          --kubeconfig=bootstrap.kubeconfig
    
    # 设置上下文参数
    kubectl config set-context default \
          --cluster=kubernetes \
          --user=kubelet-bootstrap \
          --kubeconfig=bootstrap.kubeconfig
    
    # 设置默认上下文
    kubectl config use-context default --kubeconfig=bootstrap.kubeconfig
    
    scp bootstrap.kubeconfig root@jd:/opt/kubernetes/cfg
    
    scp bootstrap.kubeconfig root@js:/opt/kubernetes/cfg
    
  2. Create the kubelet-config.yml file

    cat > /opt/kubernetes/cfg/kubelet-config.yml <<EOF
    kind: KubeletConfiguration
    apiVersion: kubelet.config.k8s.io/v1beta1
    address: 0.0.0.0
    port: 10250
    readOnlyPort: 10255
    cgroupDriver: cgroupfs
    clusterDNS:
    - 10.254.0.2
    clusterDomain: cluster.local 
    failSwapOn: false
    authentication:
      anonymous:
        enabled: false
      webhook:
        cacheTTL: 2m0s
        enabled: true
      x509:
        clientCAFile: /opt/kubernetes/ssl/ca.pem 
    authorization:
      mode: Webhook
      webhook:
        cacheAuthorizedTTL: 5m0s
        cacheUnauthorizedTTL: 30s
    evictionHard:
      imagefs.available: 15%
      memory.available: 100Mi
      nodefs.available: 10%
      nodefs.inodesFree: 5%
    maxOpenFiles: 1000000
    maxPods: 110
    EOF
    
  3. Create the service configuration file

    cat > /opt/kubernetes/cfg/kubelet.conf <<EOF
    KUBELET_OPTS="--logtostderr=false \
    --v=2 \
    --log-dir=/opt/kubernetes/logs \
    --alsologtostderr=true \
    --kubeconfig=/opt/kubernetes/cfg/kubelet.kubeconfig \
    --bootstrap-kubeconfig=/opt/kubernetes/cfg/bootstrap.kubeconfig \
    --config=/opt/kubernetes/cfg/kubelet-config.yml \
    --cert-dir=/opt/kubernetes/ssl \
    --pod-infra-container-image=k8s.gcr.io/pause-amd64:3.1"
    EOF
    

kubelet.kubeconfig does not need to be created by hand; it is generated automatically once the bootstrap request is approved

  4. Create the systemd service

    cat > /usr/lib/systemd/system/kubelet.service <<EOF
    [Unit]
    Description=Kubernetes Kubelet
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    After=docker.service
    Requires=docker.service
    
    [Service]
    EnvironmentFile=-/opt/kubernetes/cfg/kubelet.conf
    ExecStart=/opt/kubernetes/bin/kubelet \$KUBELET_OPTS
    
    Restart=on-failure
    RestartSec=5
    
    [Install]
    WantedBy=multi-user.target
    EOF
    
  5. Pulling the pause image directly requires access to k8s.gcr.io, so the images are pre-downloaded from a domestic registry and re-tagged

    cat > download-images.sh <<EOF
    #!/bin/bash
    
    docker pull registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-node:v3.1.3
    docker tag registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-node:v3.1.3 quay.io/calico/node:v3.1.3
    docker rmi registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-node:v3.1.3
    
    docker pull registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-cni:v3.1.3
    docker tag registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-cni:v3.1.3 quay.io/calico/cni:v3.1.3
    docker rmi registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-cni:v3.1.3
    
    docker pull registry.cn-hangzhou.aliyuncs.com/liuyi01/pause-amd64:3.1
    docker tag registry.cn-hangzhou.aliyuncs.com/liuyi01/pause-amd64:3.1 k8s.gcr.io/pause-amd64:3.1
    docker rmi registry.cn-hangzhou.aliyuncs.com/liuyi01/pause-amd64:3.1
    
    docker pull registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-typha:v0.7.4
    docker tag registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-typha:v0.7.4 quay.io/calico/typha:v0.7.4
    docker rmi registry.cn-hangzhou.aliyuncs.com/liuyi01/calico-typha:v0.7.4
    
    docker pull registry.cn-hangzhou.aliyuncs.com/liuyi01/coredns:1.1.3
    docker tag registry.cn-hangzhou.aliyuncs.com/liuyi01/coredns:1.1.3 k8s.gcr.io/coredns:1.1.3
    docker rmi registry.cn-hangzhou.aliyuncs.com/liuyi01/coredns:1.1.3
    
    docker pull registry.cn-hangzhou.aliyuncs.com/liuyi01/kubernetes-dashboard-amd64:v1.8.3
    docker tag registry.cn-hangzhou.aliyuncs.com/liuyi01/kubernetes-dashboard-amd64:v1.8.3 k8s.gcr.io/kubernetes-dashboard-amd64:v1.8.3
    docker rmi registry.cn-hangzhou.aliyuncs.com/liuyi01/kubernetes-dashboard-amd64:v1.8.3
    EOF
    

    Pull only the images you need, or run the whole script to fetch everything Kubernetes requires.

  6. Bind the kubelet-bootstrap user to the built-in cluster role (run this on a master)

    kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
    
  7. Start kubelet

    mkdir -p /var/lib/kubelet
    
    systemctl daemon-reload
    
    systemctl start kubelet.service
    
    systemctl enable kubelet.service
    
  8. Approve the CSRs on a Master node

    kubectl get csr
    NAME                                                   AGE   REQUESTOR           CONDITION
    node-csr-C4O9_KIek83fXKlhPjsW37KxpzBGl6CSspvsDEiBsPc   18s   kubelet-bootstrap   Pending
    node-csr-Ow3aKEezOFC3bGIerrIu_olmsKEb02GNECffcfOYYZY   18s   kubelet-bootstrap   Pending
    
    kubectl certificate approve node-csr-C4O9_KIek83fXKlhPjsW37KxpzBGl6CSspvsDEiBsPc
    
    kubectl certificate approve node-csr-Ow3aKEezOFC3bGIerrIu_olmsKEb02GNECffcfOYYZY
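
    Once the CSRs are approved, the kubelets finish bootstrapping and register themselves; they should show up shortly (my addition):

    kubectl get nodes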
    

Deploying kube-proxy

  1. Create the certificate request file (run on a Master, then distribute the results to the Node nodes)

    cat > /opt/kubernetes/ssl/kube-proxy-csr.json <<EOF
    {
      "CN": "system:kube-proxy",
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "ST": "GuangDong",
          "L": "ShenZhen",
          "O": "k8s",
          "OU": "System"
        }
      ]
    }
    EOF
    
  2. Generate the certificate and copy it to the nodes

    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes  kube-proxy-csr.json | cfssljson -bare kube-proxy
    
    scp kube-proxy*.* root@jd:/opt/kubernetes/ssl
    
    scp kube-proxy*.* root@js:/opt/kubernetes/ssl
    
  3. Create the kubeconfig file

    cd /opt/kubernetes/cfg
    
    kubectl config set-cluster kubernetes \
      --certificate-authority=/opt/kubernetes/ssl/ca.pem \
      --embed-certs=true \
      --server=https://139.9.181.246:8443 \
      --kubeconfig=kube-proxy.kubeconfig
    
    kubectl config set-credentials kube-proxy \
      --client-certificate=/opt/kubernetes/ssl/kube-proxy.pem \
      --client-key=/opt/kubernetes/ssl/kube-proxy-key.pem \
      --embed-certs=true \
      --kubeconfig=kube-proxy.kubeconfig
    
    kubectl config set-context default \
      --cluster=kubernetes \
      --user=kube-proxy \
      --kubeconfig=kube-proxy.kubeconfig
    
    kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
    
    scp kube-proxy.kubeconfig root@jd:/opt/kubernetes/cfg
    
    scp kube-proxy.kubeconfig root@js:/opt/kubernetes/cfg
    
  4. Create the kube-proxy-config.yml file (bindAddress and metricsBindAddress are the node's own internal IP; the values below are for the jd node)

    cat > /opt/kubernetes/cfg/kube-proxy-config.yml <<EOF
    kind: KubeProxyConfiguration
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    bindAddress: 192.168.0.3
    metricsBindAddress: 192.168.0.3:10249
    clientConnection:
      kubeconfig: /opt/kubernetes/cfg/kube-proxy.kubeconfig
    clusterCIDR: 10.254.0.0/24
    EOF
    
  5. Create the service configuration file

    cat > /opt/kubernetes/cfg/kube-proxy.conf <<EOF
    KUBE_PROXY_OPTS="--logtostderr=false \
    --v=2 \
    --log-dir=/opt/kubernetes/logs \
    --config=/opt/kubernetes/cfg/kube-proxy-config.yml"
    EOF
    
  6. Create the systemd service

    cat > /usr/lib/systemd/system/kube-proxy.service <<EOF
    [Unit]
    Description=Kubernetes Proxy
    After=network.target
    [Service]
    EnvironmentFile=-/opt/kubernetes/cfg/kube-proxy.conf
    ExecStart=/opt/kubernetes/bin/kube-proxy \$KUBE_PROXY_OPTS
    Restart=on-failure
    LimitNOFILE=65536
    [Install]
    WantedBy=multi-user.target
    EOF
    
  7. Start it

    systemctl daemon-reload
    
    systemctl enable kube-proxy 
    
    systemctl start kube-proxy
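
    To confirm kube-proxy came up (my addition), check that the metrics port from kube-proxy-config.yml is listening on the node:

    ss -lntp | grep 10249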
    

Dashboard

  1. Download the yaml file

    wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta8/aio/deploy/recommended.yaml
    
  2. Change the Service type to NodePort so the dashboard can be reached from outside the cluster; the default ClusterIP is only reachable from inside

    vim recommended.yaml
    ----
    kind: Service
    apiVersion: v1
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
      name: kubernetes-dashboard
      namespace: kubernetes-dashboard
    spec:
      ports:
        - port: 443
          targetPort: 8443
          nodePort: 30001
      type: NodePort
      selector:
        k8s-app: kubernetes-dashboard
    ----
    
  3. Create the Kubernetes resources

    kubectl apply -f recommended.yaml
    
  4. Check the resources

    [root@tx ssl]# kubectl get pod,svc -n kubernetes-dashboard
    NAME                                             READY   STATUS    RESTARTS   AGE
    pod/dashboard-metrics-scraper-694557449d-pcgxw   1/1     Running   0          46h
    pod/kubernetes-dashboard-9774cc786-rpznb         1/1     Running   0          46h
    
    NAME                                TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
    service/dashboard-metrics-scraper   ClusterIP   10.254.0.31    <none>        8000/TCP        46h
    service/kubernetes-dashboard        NodePort    10.254.0.126   <none>        443:30601/TCP   46h
    
    
  5. Access it

    Access the dashboard through any Node's IP plus the NodePort exposed by the kubernetes-dashboard Service, 30601 here, e.g. 114.67.166.108:30601. As the screenshot below shows, Chrome warns that the connection is not secure and offers no option to proceed under Advanced, because the self-signed certificate cannot be verified.

    [Screenshot: Chrome warning that the connection is not secure]

    Typing thisisunsafe on the warning page lets you continue anyway.

  6. Create a ServiceAccount and bind it to a cluster role

    kubectl create serviceaccount dashboard-admin -n kube-system
    
    kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
    
  7. Get the token

    kubectl describe secrets -n kube-system $(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}')
    

    With that token you can log in to the Dashboard.

CoreDNS

  1. Download the yaml file. I ran into problems installing from the official yaml, so the one I actually used is below:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: coredns
      namespace: kube-system
      labels:
          kubernetes.io/cluster-service: "true"
          addonmanager.kubernetes.io/mode: Reconcile
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      labels:
        kubernetes.io/bootstrapping: rbac-defaults
        addonmanager.kubernetes.io/mode: Reconcile
      name: system:coredns
    rules:
    - apiGroups:
      - ""
      resources:
      - endpoints
      - services
      - pods
      - namespaces
      verbs:
      - list
      - watch
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      annotations:
        rbac.authorization.kubernetes.io/autoupdate: "true"
      labels:
        kubernetes.io/bootstrapping: rbac-defaults
        addonmanager.kubernetes.io/mode: EnsureExists
      name: system:coredns
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: system:coredns
    subjects:
    - kind: ServiceAccount
      name: coredns
      namespace: kube-system
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: coredns
      namespace: kube-system
      labels:
          addonmanager.kubernetes.io/mode: EnsureExists
    data:
      Corefile: |
        .:53 {
            errors
            health
            kubernetes cluster.local. in-addr.arpa ip6.arpa {
                pods insecure
                upstream
                fallthrough in-addr.arpa ip6.arpa
            }
            prometheus :9153
            forward . /etc/resolv.conf
            cache 30
            reload
        }
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: coredns
      namespace: kube-system
      labels:
        k8s-app: kube-dns
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: Reconcile
        kubernetes.io/name: "CoreDNS"
    spec:
      # replicas: not specified here:
      # 1. In order to make Addon Manager do not reconcile this replicas parameter.
      # 2. Default is 1.
      # 3. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 1
      selector:
        matchLabels:
          k8s-app: kube-dns
      template:
        metadata:
          labels:
            k8s-app: kube-dns
          annotations:
            seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
        spec:
          serviceAccountName: coredns
          tolerations:
            - key: node-role.kubernetes.io/master
              effect: NoSchedule
            - key: "CriticalAddonsOnly"
              operator: "Exists"
          containers:
          - name: coredns
            image: coredns/coredns:1.6.7 
            imagePullPolicy: IfNotPresent
            resources:
              limits:
                memory: 170Mi
              requests:
                cpu: 100m
                memory: 70Mi
            args: [ "-conf", "/etc/coredns/Corefile" ]
            volumeMounts:
            - name: config-volume
              mountPath: /etc/coredns
              readOnly: true
            ports:
            - containerPort: 53
              name: dns
              protocol: UDP
            - containerPort: 53
              name: dns-tcp
              protocol: TCP
            - containerPort: 9153
              name: metrics
              protocol: TCP
            livenessProbe:
              httpGet:
                path: /health
                port: 8080
                scheme: HTTP
              initialDelaySeconds: 60
              timeoutSeconds: 5
              successThreshold: 1
              failureThreshold: 5
            securityContext:
              allowPrivilegeEscalation: false
              capabilities:
                add:
                - NET_BIND_SERVICE
                drop:
                - all
              readOnlyRootFilesystem: true
          dnsPolicy: Default
          volumes:
            - name: config-volume
              configMap:
                name: coredns
                items:
                - key: Corefile
                  path: Corefile
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: kube-dns
      namespace: kube-system
      annotations:
        prometheus.io/port: "9153"
        prometheus.io/scrape: "true"
      labels:
        k8s-app: kube-dns
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: Reconcile
        kubernetes.io/name: "CoreDNS"
    spec:
      selector:
        k8s-app: kube-dns
      clusterIP: 10.254.0.2
      ports:
      - name: dns
        port: 53
        protocol: UDP
      - name: dns-tcp
        port: 53
        protocol: TCP
    
    

    Note

    Set the Service's clusterIP to the second address of the service CIDR; my cluster-ip-range is 10.254.0.0/24, so 10.254.0.2 is used here

  2. Create the Kubernetes resources

    kubectl apply -f coredns.yaml
    
  3. Test

    kubectl run -it --rm dns-test --image=busybox:1.28.4 sh
    
    nslookup kubernetes
    

Shell Auto-Completion

  1. Install bash-completion

    yum install bash-completion
    
  2. Configure it

    source /usr/share/bash-completion/bash_completion
    
    source <(kubectl completion bash)
    
    echo "source <(kubectl completion bash)" >> ~/.bashrc
    

NFS

  1. Install the NFS server and rpcbind

    yum -y install nfs-utils rpcbind
    
  2. Export a directory

    echo "/opt/nfs 0.0.0.0/0(rw,sync,all_squash,anonuid=65534,anongid=65534)" >> /etc/exports
    
  3. Start the services

    systemctl start rpcbind
    
    systemctl start nfs-server
    
    systemctl enable rpcbind
    
    systemctl enable nfs-server
    
  4. Verify from a client

    yum install -y nfs-utils   # provides the showmount command

    showmount -e <nfs-server-ip>
    
  5. Create a PV and PVC

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: nfs-pv
    spec:
      capacity:
        storage: 4Gi
      accessModes:
        - ReadWriteOnce
      persistentVolumeReclaimPolicy: Recycle
      nfs:
        path: /opt/nfs
        server: 110.43.49.198
    
    ---
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: gogs-pvc
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
    

    Save this as nfs-pv-pvc.yaml and run kubectl apply -f nfs-pv-pvc.yaml; the PV and PVC are created:

    [root@tx workspace]# kubectl get pv --all-namespaces
    NAME     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM              STORAGECLASS   REASON   AGE
    nfs-pv   4Gi        RWO            Recycle          Bound    default/gogs-pvc                           4s
    [root@tx workspace]# kubectl get pvc --all-namespaces
    NAMESPACE   NAME       STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    default     gogs-pvc   Bound    nfs-pv   4Gi        RWO                           10s
    

Deploying gogs

  1. Write the yaml

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: gogs
      labels:
        app: gogs
    spec:
      replicas: 1
      template:
        metadata:
          name: gogs
          labels:
            app: gogs
        spec:
          containers:
            - name: gogs
              image: gogs/gogs:0.11.91
              imagePullPolicy: IfNotPresent
              volumeMounts:
                - name: gogs-data
                  mountPath: /data
              ports:
                - containerPort: 3000
                  name: web
                  protocol: TCP
                - containerPort: 22
                  name: ssh
                  protocol: TCP
              resources:
                limits:
                  cpu: "500m"
                  memory: "500Mi"
                requests:
                  cpu: "300m"
                  memory: "256Mi"
              livenessProbe:
                httpGet:
                  port: 3000
              readinessProbe:
                httpGet:
                  port: 3000
          volumes:
            - name: gogs-data
              persistentVolumeClaim:
                claimName: gogs-pvc
          restartPolicy: Always
      selector:
        matchLabels:
          app: gogs
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: gogs
    spec:
      selector:
        app: gogs
      ports:
        - port: 3000
      type: NodePort
    
    [root@tx workspace]# kubectl get pod
    NAME                   READY   STATUS    RESTARTS   AGE
    gogs-fb66dbb8f-r4bgp   0/1     Running   2          79s
    
    

    The pod's STATUS is Running, but it is not actually usable: READY is 0 because the readiness and liveness probes find that port 3000 inside the container is not being listened on.

    The pod logs show the key line mkdir /data/git: permission denied. /data is mounted from gogs-pvc, that PVC was bound to nfs-pv, and the NFS export behind that PV is only writable by root.

    [root@tx workspace]# kubectl logs gogs-779f87bd89-ld9v5
    // ... omitted
    ./run: ./setup: line 9: can't create /data/git/.ssh/environment: nonexistent directory
    chmod: /data/git/.ssh/environment: No such file or directory
    chown: /data: Operation not permitted
    chown: /data: Operation not permitted
    chown: /data/git/: No such file or directory
    chmod: /data: Operation not permitted
    chmod: /data/gogs: No such file or directory
    chmod: /data/git/: No such file or directory
    2020/06/22 13:46:33 [ WARN] Custom config '/data/gogs/conf/app.ini' not found, ignore this if you're running first time
    2020/06/22 13:46:33 [FATAL] [...g/setting/setting.go:517 NewContext()] Fail to create '/data/git/.ssh': mkdir /data/git: permission denied
    
    
  2. Grant write permission on the NFS export

    [root@js opt]# chmod 777 nfs
    
  3. Restart the pod

    [root@tx ~]# kubectl delete pod gogs-779f87bd89-ld9v5
    

    gogs now runs normally.

Deploying nginx-ingress-controller

  1. Download the yaml file from the official site; make sure to take the Bare-metal variant

  2. Replace the image in the downloaded deploy.yaml with a registry that is actually reachable; here it is changed to the following:

    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/nginx-ingress-controller:0.32.0
    
  3. Apply deploy.yaml

    kubectl apply -f deploy.yaml
    

Deploying a Redis Cluster

  1. The official recommendation for Redis Cluster is six redis nodes forming three masters and three replicas, so six NFS storage directories are created

    mkdir -p /opt/nfs/redis/pv{1,2,3,4,5,6}
    
    cd /opt/nfs/redis
    
    chmod 777 ./*
    
  2. Export the six PV directories over NFS

    vim /etc/exports
    
    
    /opt/nfs/redis/pv1      *(rw,all_squash)
    /opt/nfs/redis/pv2      *(rw,all_squash)
    /opt/nfs/redis/pv3      *(rw,all_squash)
    /opt/nfs/redis/pv4      *(rw,all_squash)
    /opt/nfs/redis/pv5      *(rw,all_squash)
    /opt/nfs/redis/pv6      *(rw,all_squash)
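
    # (my addition) re-export after editing /etc/exports so the new entries take effect immediately
    exportfs -ra
    exportfs -v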
    
    
  3. Create the yaml, which contains the Namespace, PersistentVolumes, ConfigMap, Services and StatefulSet:

    #namespaces
    apiVersion: v1
    kind: Namespace
    metadata:
      name: redis-cluster-ns
    ---
    #PersistentVolume
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: redis-pv1
      namespace: redis-cluster-ns
    spec:
      capacity:
        storage: 1Gi
      accessModes:
        - ReadWriteMany
      nfs:
        server: 110.43.49.198
        path: /opt/nfs/redis/pv1
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: redis-pv2
      namespace: redis-cluster-ns
    spec:
      capacity:
        storage: 1Gi
      accessModes:
        - ReadWriteMany
      nfs:
        path: /opt/nfs/redis/pv2
        server: 110.43.49.198
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: redis-pv3
      namespace: redis-cluster-ns
    spec:
      capacity:
        storage: 1Gi
      accessModes:
        - ReadWriteMany
      nfs:
        path: /opt/nfs/redis/pv3
        server: 110.43.49.198
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: redis-pv4
      namespace: redis-cluster-ns
    spec:
      capacity:
        storage: 1Gi
      accessModes:
        - ReadWriteMany
      nfs:
        path: /opt/nfs/redis/pv4
        server: 110.43.49.198
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: redis-pv5
      namespace: redis-cluster-ns
    spec:
      capacity:
        storage: 1Gi
      accessModes:
        - ReadWriteMany
      nfs:
        path: /opt/nfs/redis/pv5
        server: 110.43.49.198
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: redis-pv6
      namespace: redis-cluster-ns
    spec:
      capacity:
        storage: 1Gi
      accessModes:
        - ReadWriteMany
      nfs:
        path: /opt/nfs/redis/pv6
        server: 110.43.49.198
    #ConfigMap
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: redis-conf
      namespace: redis-cluster-ns
    data:
      redis.conf: |
        appendonly yes
        cluster-enabled yes
        cluster-config-file /var/lib/redis/nodes.conf
        cluster-node-timeout 5000
        dir /var/lib/redis
        port 6379
    #Service
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: redis-service
      namespace: redis-cluster-ns
      labels:
        app: redis
    spec:
      selector:
        app: redis
        appCluster: redis-cluster
      ports:
        - name: redis-port
          port: 6379
      clusterIP: None
    #StatefulSet
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: redis-app
      namespace: redis-cluster-ns
    spec:
      selector:
        matchLabels:
          app: redis
      serviceName: "redis-service"
      replicas: 6
      template:
        metadata:
          namespace: redis-cluster-ns
          labels:
            app: redis
            appCluster: redis-cluster
        spec:
          terminationGracePeriodSeconds: 20
          affinity:
            podAntiAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
                - podAffinityTerm:
                    topologyKey: kubernetes.io/hostname
                    labelSelector:
                      matchExpressions:
                        - key: app
                          operator: In
                          values:
                            - redis
                  weight: 100
          containers:
            - name: redis
              image: redis
              # command: redis-server /etc/redis/redis.conf --protected-mode no
              command:
                - "redis-server"
              args:
                - "/etc/redis/redis.conf"   # each list item is passed as a separate argument
                - "--protected-mode"
                - "no"
              # resource requests; 100m equals 0.1 CPU core
              resources:
                requests:
                  cpu: 100m
                  memory: 100Mi
              # ports exposed by the container
              ports:
                - name: redis
                  containerPort: 6379
                  protocol: TCP
                - name: cluster
                  containerPort: 16379
                  protocol: TCP
              volumeMounts:
                - name: redis-conf
                  mountPath: /etc/redis
                - name: redis-data
                  mountPath: /var/lib/redis
          volumes:
            - name: redis-conf
              configMap:
                name: redis-conf
                items:
                  - key: redis.conf
                    path: redis.conf
      volumeClaimTemplates:
        - metadata:
            name: redis-data
            namespace: redis-cluster-ns
          spec:
            accessModes:
              - ReadWriteMany
            resources:
              requests:
                storage: 1Gi
    #Service
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: redis-access-service
      namespace: redis-cluster-ns
      labels:
        app: redis
    spec:
      ports:
        - name: redis-port
          port: 6379
          protocol: TCP
          targetPort: 6379
      selector:
        app: redis
        appCluster: redis-cluster
      type: NodePort
    
    
  4. Create the resources

    kubectl apply -f redis-cluster.yaml
    
  5. Run a redis-trib container to form the cluster (so far six redis instances are running, but they are not yet clustered together)

    kubectl run redis-trib --image=registry.cn-hangzhou.aliyuncs.com/james-public/redis-trib:latest -it -n redis-cluster-ns --restart=Never /bin/sh
    
  6. Inside the redis-trib container, run

    redis-trib create --replicas 1 \
    `dig +short redis-app-0.redis-service.redis-cluster-ns.svc.cluster.local`:6379 \
    `dig +short redis-app-1.redis-service.redis-cluster-ns.svc.cluster.local`:6379 \
    `dig +short redis-app-2.redis-service.redis-cluster-ns.svc.cluster.local`:6379 \
    `dig +short redis-app-3.redis-service.redis-cluster-ns.svc.cluster.local`:6379 \
    `dig +short redis-app-4.redis-service.redis-cluster-ns.svc.cluster.local`:6379 \
    `dig +short redis-app-5.redis-service.redis-cluster-ns.svc.cluster.local`:6379
    
  7. Finally, check the cluster state from any redis node to confirm the cluster was created

    [root@tx workspace]# kubectl exec -it redis-app-2 /bin/bash -n redis-cluster-ns
    kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl kubectl exec [POD] -- [COMMAND] instead.
    root@redis-app-2:/data# /usr/local/bin/redis-cli -c
    127.0.0.1:6379> cluster info
    cluster_state:ok
    cluster_slots_assigned:16384
    cluster_slots_ok:16384
    cluster_slots_pfail:0
    cluster_slots_fail:0
    cluster_known_nodes:6
    cluster_size:3
    cluster_current_epoch:6
    cluster_my_epoch:3
    cluster_stats_messages_ping_sent:32092
    cluster_stats_messages_pong_sent:31509
    cluster_stats_messages_meet_sent:1
    cluster_stats_messages_sent:63602
    cluster_stats_messages_ping_received:31509
    cluster_stats_messages_pong_received:32093
    cluster_stats_messages_received:63602
    127.0.0.1:6379> cluster nodes
    4cf5e475c88ed71959e4957bcd1913230c3d923a 172.17.91.4:6379@16379 myself,master - 0 1593522304000 3 connected 10923-16383
    fcab7988b0546b759f6d8e07232843acd4df0017 172.17.39.5:6379@16379 slave fdf280d767db08f4453c425763705df4653dce76 0 1593522306494 4 connected
    2b5b5e29bf15608e0e60f1f085bbe96ac070935a 172.17.39.6:6379@16379 slave 4cf5e475c88ed71959e4957bcd1913230c3d923a 0 1593522306000 6 connected
    070378f05d3ef0c28e036d8f7b7d61a175d5c2ce 172.17.91.5:6379@16379 slave 64d07d4b62fd249a387bbbdc9bc51033e875fb54 0 1593522305000 5 connected
    fdf280d767db08f4453c425763705df4653dce76 172.17.91.3:6379@16379 master - 0 1593522306089 1 connected 0-5460
    64d07d4b62fd249a387bbbdc9bc51033e875fb54 172.17.39.4:6379@16379 master - 0 1593522305494 2 connected 5461-10922
    127.0.0.1:6379> 
    
    

    Output like the above means the cluster was created successfully.

References

https://blog.csdn.net/chen645800876/article/details/105279648/

https://gitee.com/admxj/kubernetes-ha-binary

https://blog.51cto.com/flyfish225/2504511

https://github.com/istio/istio.io/issues/4715

https://cloud.tencent.com/developer/article/1392872