
Building a Highly Available Kubernetes Cluster with Keepalived + Nginx

2023.07.03

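Hello everyone. It has been a long time since I last wrote about Kubernetes. I previously wrote an article on building a Kubernetes cluster across VPCs; looking back, it was too superficial, so I decided to rework the material. As a Java developer, I think this is well worth doing.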

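First, a note on why I am not using LVS. I am building this on Tencent Cloud spot instances, and the cloud hosts' HAVIP does not support LVS DR mode. DR mode works by rewriting the destination MAC address of the request, so each real server that receives the request has to bind the VIP to its lo loopback interface and suppress ARP resolution for it (a form of ARP trickery), which breaks the rules of the cloud provider's private network. That is the first reason. The second is that NAT mode also has many limitations and pitfalls on an internal network. Below I explain this as I understand it: first how network NAT works in general, then how NAT mode works in LVS.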

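  • How NAT works:
    When a request targets an IP in another subnet, the sender first looks on the local network for the MAC address of the destination IP. Because the destination is in another subnet, the lookup fails, so the request is sent to the gateway, which forwards it on the sender's behalf. The gateway rewrites the source IP of the request to its own IP (this is SNAT) and passes the request on through other gateways until it reaches the destination server. For the response, the destination server swaps the source and destination IPs. When the response reaches the gateway, the gateway checks whether it translated the original request; if so, it rewrites the destination address to the address of the real host on the local network (this is DNAT).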

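  • How NAT mode works in LVS:
    Prerequisites: NAT mode needs two network interfaces, and the default gateway of every Real Server must point to the Director Server.
    Suppose the architecture is one Director Server with three Real Servers behind it. The Director needs two IPs: a VIP that serves external clients and a DIP on the internal network. When a client sends a request to the VIP, the Director first picks a Real Server from the RS list using the load-balancing algorithm, then rewrites the destination IP of the request to that server's real IP, the RIP (this is DNAT). The RS handles the request normally and, for the response, swaps the source and destination IPs. However, the source IP of the request (that is, the destination IP of the response) is not in the same subnet as the RS, which is why the gateway of the RS must be the Director. The response passes through the Director, which rewrites its source IP to the VIP (this is SNAT) and finally sends it back to the client.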

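Installing and Configuring Keepalived

I will not explain Keepalived in depth, since my explanation would not be complete. In short, it performs master/backup failover for the load balancer; at its core it uses the VRRP protocol to float the VIP automatically.

Installing Keepalived

The simplest and most convenient way to install it is with yum: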

yum install -y keepalived

Keepalived is now installed. Run rpm -ql keepalived to see which files were installed.

Configuration

The Keepalived configuration file

! Configuration File for keepalived

global_defs { # global definitions
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id lb01    # Keepalived router ID, unique within the local network
   vrrp_skip_check_adv_addr
   vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}

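# VRRP instance name: unique within one configuration file, identical on all nodes (master and backup) of the same cluster
vrrp_instance VI_1 {
    state MASTER    # initial state, in upper case: MASTER or BACKUP
    interface eth0    # network interface to bind to
    virtual_router_id 51    # identical on all nodes of one master/backup group, unique within one configuration file
    priority 100    # priority; a gap of 50 between master and backup is enough
    advert_int 1    # heartbeat (advertisement) interval in seconds
    authentication {    # authentication
        auth_type PASS    # authentication type: simple password
        auth_pass 1111    # password
    }
    virtual_ipaddress { # VIP (virtual IP)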
        192.168.123.242/28 dev eth0 label eth0:0
        # 192.168.200.16
        # 192.168.200.17
        # 192.168.200.18
    }
}

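# Define a virtual server: the VIP and the port it listens on
virtual_server 192.168.123.242 8443 {
    delay_loop 6    # interval between health checks of the backend servers, in seconds
    lb_algo rr       # load-balancing algorithm (also written lvs_sched); rr means round robin
    lb_kind NAT        # forwarding mode: DR, NAT or TUN
    nat_mask 255.255.255.0 # netmask of the VIP
    persistence_timeout 50    # session persistence timeout, in seconds
    protocol TCP    # protocol

    # backend real server
    real_server 192.168.201.100 443 {
        weight 1    # weight
        SSL_GET {    # health check method: SSL_GET, HTTP_GET or TCP_CHECK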
            url {
              path /
              digest ff20ad2481f97b1754ef3e12ecd3a9cc
            }
            url {
              path /mrtg/
              digest 9b3a0c85a887a256d6939da88aabd8cd
            }
            connect_timeout 3
            retry 3
            delay_before_retry 3
        }
    }
}

virtual_server 10.10.10.2 1358 {
    delay_loop 6
    lb_algo rr
    lb_kind NAT
    persistence_timeout 50
    protocol TCP

    sorry_server 192.168.200.200 1358

    real_server 192.168.200.2 1358 {
        weight 1
        HTTP_GET {
            url {
              path /testurl/test.jsp
              digest 640205b7b0fc66c1ea91c463fac6334d
            }
            url {
              path /testurl2/test.jsp
              digest 640205b7b0fc66c1ea91c463fac6334d
            }
            url {
              path /testurl3/test.jsp
              digest 640205b7b0fc66c1ea91c463fac6334d
            }
            connect_timeout 3
            retry 3
            delay_before_retry 3
        }
    }

    real_server 192.168.200.3 1358 {
        weight 1
        HTTP_GET {
            url {
              path /testurl/test.jsp
              digest 640205b7b0fc66c1ea91c463fac6334c
            }
            url {
              path /testurl2/test.jsp
              digest 640205b7b0fc66c1ea91c463fac6334c
            }
            connect_timeout 3
            retry 3
            delay_before_retry 3
        }
    }
}

virtual_server 10.10.10.3 1358 {
    delay_loop 3
    lb_algo rr
    lb_kind NAT
    persistence_timeout 50
    protocol TCP

    real_server 192.168.200.4 1358 {
        weight 1
        HTTP_GET {
            url {
              path /testurl/test.jsp
              digest 640205b7b0fc66c1ea91c463fac6334d
            }
            url {
              path /testurl2/test.jsp
              digest 640205b7b0fc66c1ea91c463fac6334d
            }
            url {
              path /testurl3/test.jsp
              digest 640205b7b0fc66c1ea91c463fac6334d
            }
            connect_timeout 3
            retry 3
            delay_before_retry 3
        }
    }

    real_server 192.168.200.5 1358 {
        weight 1
        HTTP_GET {
            url {
              path /testurl/test.jsp
              digest 640205b7b0fc66c1ea91c463fac6334d
            }
            url {
              path /testurl2/test.jsp
              digest 640205b7b0fc66c1ea91c463fac6334d
            }
            url {
              path /testurl3/test.jsp
              digest 640205b7b0fc66c1ea91c463fac6334d
            }
            connect_timeout 3
            retry 3
            delay_before_retry 3
        }
    }
}

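The configuration file above looks long, but it consists of three main blocks:

  • global_defs: global settings, mainly the router ID and email notification details

  • vrrp_instance: VRRP instance settings, that is, the node information of this Keepalived instance together with the VIPs it manages, the master/backup role, the priority and so on

  • virtual_server: mainly manages LVS; not used in this tutorial

With that overview of the configuration file, let's configure Keepalived for real.

Master node configuration: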

! Configuration File for keepalived

global_defs {
   router_id lb01
   vrrp_skip_check_adv_addr
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}
vrrp_script check_nginx {
    script "/usr/local/nginx/script/health_check.sh"
    interval 2
    weight -20
    fall 3
    rise 2
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 3
    nopreempt   # non-preemptive: a recovered master does not take the master role back; requires state BACKUP on every node
    garp_master_delay 1
    garp_master_refresh 5
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    unicast_src_ip 192.168.123.247  # private IP address of this node
    unicast_peer {
        192.168.123.249             # IP address of the peer node
    }
    virtual_ipaddress {
        192.168.123.242/28 dev eth0 label eth0:0
    }
    track_script {
        check_nginx  # name of the monitoring script block
    }
}

Backup node configuration:

! Configuration File for keepalived

global_defs {
   router_id lb02
   vrrp_skip_check_adv_addr
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}
vrrp_script check_nginx {
    script "/usr/local/nginx/script/health_check.sh"
    interval 2
    weight -20
    fall 3
    rise 2
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 50
    advert_int 3
    nopreempt   # non-preemptive: a recovered master does not take the master role back; requires state BACKUP on every node
    garp_master_delay 1
    garp_master_refresh 5
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    unicast_src_ip 192.168.123.249  # private IP address of this node
    unicast_peer {
        192.168.123.247             # IP address of the peer node
    }
    virtual_ipaddress {
        192.168.123.242/28 dev eth0 label eth0:0
    }
    track_script {
        check_nginx  # name of the monitoring script block
    }
}

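Configuration parameters explained:

  • router_id: the router ID; must be unique across the whole local network

  • vrrp_script: defines a script that checks the health of nginx. The script must be made executable. Its job is to check whether the nginx process is still running; if it is not, the script stops the Keepalived process, which triggers the backup node to take over the VIP through VRRP

    The script looks like this: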

    #!/bin/bash
    
    count=$(ps -ef |grep nginx |egrep -cv "grep|$$")
    if [ "$count" -eq 0 ];then
        systemctl stop keepalived
    fi
    
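  • state: both the master and the backup are set to BACKUP in order to get non-preemptive mode. Non-preemptive means that after the master goes down and the backup is promoted, the node that is now master keeps the MASTER role when the original master recovers, so the VIP does not float a second time. The goal is to avoid, as far as possible, intermittent unavailability that is not caused by a node failure. If the designated master is better than the backup in performance or other respects, preemptive mode is worth considering. Note that non-preemptive mode also requires nopreempt

  • virtual_router_id: the virtual router ID; must be the same on master and backup

  • priority: the priority; when master and backup start at the same time, the node with the higher priority is elected master

  • advert_int: the heartbeat interval between master and backup

  • authentication: identity authentication between master and backup

  • unicast_src_ip: because of cloud restrictions, broadcast inside the virtual network is not allowed by default, so VRRP communication uses unicast here; this is the IP address of the current node

  • unicast_peer: the IP of the peer node

  • virtual_ipaddress: the virtual IPs that Keepalived manages; more than one can be listed. Under the hood this is simply the ip addr add command

  • track_script: the script that runs the health check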
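It can help to work through how the weight of the vrrp_script relates to the two priorities above. The sketch below only does the arithmetic. It assumes that keepalived adds a negative weight to a node's base priority while the tracked script reports failure; effective_priority is a hypothetical helper for illustration, not a keepalived command.

```shell
#!/bin/sh
# Sketch only: effective_priority is a hypothetical helper, not part of keepalived.
# Assumption: while a tracked script reports failure, a negative weight is
# added to the node's base priority.
effective_priority() {
    base=$1
    weight=$2
    check=$3   # "ok" or "failed"
    if [ "$check" = "failed" ] && [ "$weight" -lt 0 ]; then
        echo $((base + weight))
    else
        echo "$base"
    fi
}

first_node=$(effective_priority 100 -20 failed)   # priority 100, check failing
second_node=$(effective_priority 50 -20 ok)       # priority 50, check passing
echo "first=$first_node second=$second_node"
```

With priorities 100 and 50, a penalty of 20 still leaves the first node ahead of the second, so under this assumption the weight alone would not move the VIP. That is consistent with the health check script stopping Keepalived outright when nginx is gone.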

Installing and Configuring Nginx

Install the libraries Nginx needs first

yum install -y gcc pcre-devel zlib-devel openssl openssl-devel

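Installing Nginx

Nginx is built from source here. The installation has the following steps:

  1. Download the source

  2. Extract: tar -zxvf nginx-xxx.tar.gz

  3. Configure: ./configure --prefix=/usr/local/nginx --with-http_stub_status_module --with-http_ssl_module --with-file-aio --with-http_realip_module --with-stream. Select only the modules you need. Because Nginx will later also proxy the Ingress Controller of the Kubernetes cluster at layer 4, the --with-stream option must not be left out

  4. Build and install: make && make install; the install step runs only if the build finished without errors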
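After the build it is worth confirming that the binary really includes the stream module. The sketch below is one way to do that: has_stream_module is a hypothetical helper that scans the text printed by nginx -V (which nginx writes to stderr) for the --with-stream option; the binary path assumes the --prefix used above.

```shell
#!/bin/sh
# has_stream_module reads `nginx -V` style text on stdin and succeeds only if
# the build was configured with --with-stream (static or dynamic).
# It deliberately does not match longer options such as --with-stream_ssl_module.
has_stream_module() {
    grep -Eq -- '--with-stream( |=dynamic|$)'
}

NGINX_BIN=/usr/local/nginx/sbin/nginx   # assumed install path (from --prefix)
if [ -x "$NGINX_BIN" ]; then
    if "$NGINX_BIN" -V 2>&1 | has_stream_module; then
        echo "stream module: present"
    else
        echo "stream module: missing, rebuild with --with-stream"
    fi
fi
```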

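Configuration

There are two parts to configure: first the reverse proxy and load balancing, then the Nginx service unit file, start on boot, and the start order of Keepalived.

Reverse proxy and load balancing for the Kubernetes Api Server

First, include the external configuration files from nginx.conf

http {
    # ... other settings omitted
    server {
        # use a port other than 80 so it does not clash with the layer-4 proxy configured later
        listen       8888;
        # ... other settings omitted
    }
    include nginx.conf.d/http/*.conf;
}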
stream {
    log_format  main  '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';
    access_log  /var/log/nginx/stream-access.log  main;
    include nginx.conf.d/stream/*.conf;
}

Configuration file for the Api Server

upstream k8s-apiserver {
    server 192.168.123.244:6443;
    server 192.168.123.245:6443;
    server 192.168.123.246:6443;
}
server {
    listen 6443;
    proxy_pass k8s-apiserver;
}

Configuration file for the Ingress Controller

upstream k8s-ingress-controller-http {
    server 192.168.123.244:30744;
    server 192.168.123.245:30744;
    server 192.168.123.246:30744;
}
server {
    listen 80;
    proxy_pass k8s-ingress-controller-http;
}

upstream k8s-ingress-controller-https {
    server 192.168.123.244:31574;
    server 192.168.123.245:31574;
    server 192.168.123.246:31574;
}
server {
    listen 443;
    proxy_pass k8s-ingress-controller-https;
} 

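Because the Ingress Controller is proxied at layer 4 on ports 80/443, the http block must not listen on 80/443; otherwise the conflict prevents the service from starting.

Configuring nginx.service

To start nginx automatically at boot, set it up as a systemd unit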

[Unit]
Description=The nginx HTTP and reverse proxy server
After=network-online.target remote-fs.target nss-lookup.target
Wants=network-online.target

[Service]
Type=forking
PIDFile=/usr/local/nginx/logs/nginx.pid
ExecStartPre=/usr/bin/rm -f /usr/local/nginx/logs/nginx.pid
ExecStartPre=/usr/local/nginx/sbin/nginx -t
ExecStart=/usr/local/nginx/sbin/nginx
ExecReload=/usr/local/nginx/sbin/nginx -s reload
KillSignal=SIGQUIT
TimeoutStopSec=5
KillMode=process
PrivateTmp=true

[Install]
WantedBy=multi-user.target

Save nginx.service as /usr/lib/systemd/system/nginx.service

Create the log directory

mkdir -p /var/log/nginx

Then start nginx with the following commands:

systemctl daemon-reload
systemctl enable --now nginx.service
systemctl status nginx.service

Starting Keepalived after Nginx

Edit the Keepalived unit file, /usr/lib/systemd/system/keepalived.service, and append nginx.service to the After= line, as follows:

[Unit]
Description=LVS and VRRP High Availability Monitor
After=network-online.target syslog.target nginx.service
Wants=network-online.target

[Service]
Type=forking
PIDFile=/var/run/keepalived.pid
KillMode=process
EnvironmentFile=-/etc/sysconfig/keepalived
ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target

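Deploying the Kubernetes Cluster

Cluster version and host planning

The Kubernetes cluster is built on three servers: three masters and three nodes, with every server acting as both master and node. I am using Tencent Cloud spot instances, and I have to say they are great: a few jiao per hour, far more convenient than hunting for free-tier deals as I used to.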

Hostname  Role         IP               Software
i1        Master/Node  192.168.123.244  etcd/kubectl/kube-apiserver/kube-controller-manager/kube-scheduler/kube-proxy/kubelet/containerd/runc
i2        Master/Node  192.168.123.245  etcd/kubectl/kube-apiserver/kube-controller-manager/kube-scheduler/kube-proxy/kubelet/containerd/runc
i3        Master/Node  192.168.123.246  etcd/kubectl/kube-apiserver/kube-controller-manager/kube-scheduler/kube-proxy/kubelet/containerd/runc

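One point worth making: in Kubernetes, the component that truly forms a highly available cluster is ETCD. All data is stored in ETCD, and ETCD has to keep that data consistent, so it goes through leader election. Api Server, by contrast, is just a client of the ETCD cluster. Strictly speaking it is stateless, and with no data of its own there is no consistency to worry about, so any number of Api Servers can run at the same time. As for the other two master components, Controller Manager and Scheduler: if a Kubernetes cluster has several master nodes (and therefore several controller-managers and Schedulers), only one controller-manager and one Scheduler do the work while the others stay on standby. When the active controller-manager or Scheduler goes down, a standby instance on another node is woken up. This is effectively master/backup failover, and it is how these services achieve high availability.

Summary: strictly speaking, ETCD needs at least 3 nodes to form a minimal cluster, while api server, controller manager and Scheduler need only 2 nodes each to build a highly available Kubernetes. Of course, apiserver is the main entry point of Kubernetes, so the apiservers need to be load balanced, and the load balancer then becomes a single point of failure. That is why we used keepalived earlier to make the load balancer highly available with a master and a backup.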
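The three-node minimum for ETCD follows from majority voting: a cluster of n members needs n/2 + 1 of them (integer division) to agree, so it survives the loss of (n - 1)/2 members. A small sketch of that arithmetic; quorum and fault_tolerance are hypothetical helper names:

```shell
#!/bin/sh
# Majority quorum for a cluster of N members (integer division).
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a majority is still reachable.
fault_tolerance() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 1 2 3 5; do
    echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

Two members tolerate no failure at all, which is why three is the smallest size that gives any fault tolerance.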

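Software versions

As an experienced developer I know not to chase new versions blindly, but as a coder with a touch of OCD I also like running the latest. So let's be conservative and pick 1.24.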

Software    Version  Download
CentOS      8.4
Kubernetes  v1.24.0  v1.24.0
Etcd        v3.5.4   v3.5.4
Containerd  v1.6.1   v1.6.1
Runc        v1.1.0   v1.1.0

Network allocation

Item             CIDR
Node network     192.168.123.0/29
Cluster network  10.96.0.0/16
Pod network      172.18.0.0/16
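Kubernetes assigns the first usable address of the cluster (Service) network to the built-in kubernetes Service, so that address has to appear in the apiserver server certificate later on. The sketch below derives it from a CIDR with plain shell arithmetic; first_host_ip is a hypothetical helper and handles IPv4 only.

```shell
#!/bin/sh
# first_host_ip CIDR
# Prints the network address of an IPv4 CIDR plus one, e.g. the address the
# built-in kubernetes Service receives inside the Service network.
first_host_ip() {
    ip=${1%/*}
    prefix=${1#*/}
    # Split the dotted quad into $1..$4
    oldifs=$IFS; IFS=.; set -- $ip; IFS=$oldifs
    n=$(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    first=$(( (n & mask) + 1 ))
    echo "$(( (first >> 24) & 255 )).$(( (first >> 16) & 255 )).$(( (first >> 8) & 255 )).$(( first & 255 ))"
}

first_host_ip 10.96.0.0/16
```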

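Preparation before installation

Disable the firewall, install the required software, turn off swap

I have collected everything into one script, so later it only needs to be executed. Its content is as follows: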

#! /bin/bash
# disable the firewall
systemctl stop firewalld
systemctl disable firewalld

# disable SELinux
setenforce 0
sed -ri 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

# swap settings
swapoff -a
sed -ri 's/.*swap.*/#&/' /etc/fstab
echo "vm.swappiness=0" >> /etc/sysctl.conf
sysctl -p

# resource limit tuning
ulimit -SHn 65535

cat <<EOF >> /etc/security/limits.conf
* soft nofile 131072
* hard nofile 131072
* soft nproc 655350
* hard nproc 655350
* soft memlock unlimited
* hard memlock unlimited
EOF

# install the ipvs management tools and load the kernel modules
yum -y install ipvsadm ipset sysstat conntrack libseccomp

modprobe -- ip_vs 
modprobe -- ip_vs_rr 
modprobe -- ip_vs_wrr 
modprobe -- ip_vs_sh 
modprobe -- nf_conntrack

cat >/etc/modules-load.d/ipvs.conf <<EOF 
ip_vs 
ip_vs_lc 
ip_vs_wlc 
ip_vs_rr 
ip_vs_wrr 
ip_vs_lblc 
ip_vs_lblcr 
ip_vs_dh 
ip_vs_sh 
ip_vs_fo 
ip_vs_nq 
ip_vs_sed 
ip_vs_ftp 
ip_vs_sh 
nf_conntrack 
ip_tables 
ip_set 
xt_set 
ipt_set 
ipt_rpfilter 
ipt_REJECT 
ipip 
EOF

# load the kernel modules containerd needs
modprobe overlay
modprobe br_netfilter

# load the modules persistently
cat > /etc/modules-load.d/containerd.conf << EOF
overlay
br_netfilter
EOF

# load the modules at boot
systemctl enable --now systemd-modules-load.service

# Linux kernel tuning
cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
fs.may_detach_mounts = 1
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.netfilter.nf_conntrack_max=2310720
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl =15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.ip_conntrack_max = 131072
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
EOF

sysctl --system

# install a few other tools
yum install wget jq psmisc vim net-tools telnet yum-utils device-mapper-persistent-data lvm2 git lrzsz -y

# reboot the server to confirm the modules are still loaded after a restart
reboot -h now

Setting the hostname

Set the hostname of each server according to the plan above

hostnamectl set-hostname i1

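Mapping hostnames to IP addresses

Extra entries such as i4 and i5 can be added here in advance to make it easier to add nodes later; they play no role in this example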

cat >> /etc/hosts << EOF
192.168.123.244 i1
192.168.123.245 i2
192.168.123.246 i3
EOF

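Configuring passwordless login

Most of the later steps are performed on i1, and the files for the other nodes only need to be copied over from i1, so passwordless login is only needed from i1 to the other nodes.

Run the following command on i1: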

ssh-keygen

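After the passphrase prompt (none is set in this example; just press Enter all the way through), two files, id_rsa and id_rsa.pub, are generated in the .ssh directory under the current user's home directory. Copy the content of id_rsa.pub into the ~/.ssh/authorized_keys file on the other nodes.

Note: if several clients need passwordless login, put one key per line.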
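The one-key-per-line rule is easy to get wrong when pasting by hand. As a minimal sketch, the hypothetical helper below appends a public key to an authorized_keys file only if that exact line is not already present; it is meant to run on the target node after id_rsa.pub has been copied there. The stock ssh-copy-id root@i2 command achieves the same result from i1.

```shell
#!/bin/sh
# add_authorized_key PUBKEY_FILE AUTHORIZED_KEYS_FILE
# Appends the key as its own line, skipping it if that line already exists.
add_authorized_key() {
    pubkey=$(cat "$1") || return 1
    auth=$2
    mkdir -p "$(dirname "$auth")"
    touch "$auth"
    chmod 600 "$auth"
    # -x matches whole lines, -F treats the key as a fixed string
    grep -qxF -- "$pubkey" "$auth" || printf '%s\n' "$pubkey" >> "$auth"
}

# Example (paths are assumptions):
#   add_authorized_key /tmp/id_rsa.pub ~/.ssh/authorized_keys
```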

Now test whether the setup works:

ssh root@i2

Creating the working directory

Create a kubernetes directory in the user's home directory; all later steps are carried out in this directory

mkdir ~/kubernetes

Downloading the certificate tools

wget https://github.com/cloudflare/cfssl/releases/download/v1.6.4/cfssl_1.6.4_linux_amd64
wget https://github.com/cloudflare/cfssl/releases/download/v1.6.4/cfssljson_1.6.4_linux_amd64
wget https://github.com/cloudflare/cfssl/releases/download/v1.6.4/cfssl-certinfo_1.6.4_linux_amd64

Make them executable

chmod +x cfssl*

Move the tools into /usr/local/bin

mv cfssl_1.6.4_linux_amd64 /usr/local/bin/cfssl
mv cfssljson_1.6.4_linux_amd64 /usr/local/bin/cfssljson
mv cfssl-certinfo_1.6.4_linux_amd64 /usr/local/bin/cfssl-certinfo

Check that the installation succeeded: cfssl version

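Certificate prerequisites

Creating the CA configuration file

This configuration file is shared by all CAs; it specifies the validity period and the permitted usages of a certificate for each profile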

cat > ca-config.json <<"EOF"
{
  "signing": {
      "default": {
          "expiry": "87600h"
        },
      "profiles": {
          "peer": {
              "usages": [
                  "signing",
                  "key encipherment",
                  "server auth",
                  "client auth"
              ],
              "expiry": "87600h"
          },
          "server": {
              "usages": [
                  "signing",
                  "key encipherment",
                  "server auth"
              ],
              "expiry": "87600h"
          },
          "client": {
              "usages": [
                  "signing",
                  "key encipherment",
                  "client auth"
              ],
              "expiry": "87600h"
          }
      }
  }
}
EOF

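Next we start building the Kubernetes cluster. Before that, I want to do something different from other tutorials. We are used to spoon-fed tutorials that we follow passively: we do what we are told without knowing why. The most important part of setting up Kubernetes is the certificates; or, if not the most important, then at least understanding them gets you sixty percent of the way to a working cluster. They are hard to follow simply because Kubernetes has so many of them, of every kind, and it gets dizzying. Many authors issue all ETCD and Kubernetes certificates from a single CA. I am not saying that is wrong; it works, and most production environments do exactly that. But to make clear what role certificates play in Kubernetes and which certificates a given request needs, I will try to issue a different certificate to every server and client, and to use different CAs.

Sorting out the Kubernetes certificates

This tutorial aims to make the Kubernetes installation flow and the reasoning behind it clear, so the related background may get a little long-winded. Skip whatever you already know.

Cryptography and PKI (can be skipped)

Let's start with a few concepts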

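  • Hash functions: MD5, SHA-1, SHA-256 and so on. The value a hash function computes has a fixed length and is one-way, that is, irreversible.

  • Symmetric encryption: data is encrypted with a key and an algorithm, and can be decrypted with the same key and the same algorithm, so the key is critically important.

  • Asymmetric encryption: encrypt with the public key, decrypt with the private key; sign with the private key, verify with the public key. Encryption and signing rest on essentially the same principle; signing just has one extra step: the data is first hashed to obtain a hash value, called the digest or fingerprint, and the digest is then encrypted, whereas encryption operates on the data directly. Decryption and signature verification mirror each other in the same way.

    You may wonder why a signature needs a digest. The reason is that a signature provides data integrity and non-repudiation. The digest guarantees that the data has not been tampered with, and because the private key is kept strictly secret and only its owner has it, the owner cannot deny data signed with that key.

  • Certificates: a certificate has a single purpose, which is exchanging public keys securely. This is the precondition for secure data transmission; without it there is no security to speak of. For example:

    A wants to send B a secret file. A and B each have their own public/private key pair, and the two sides first need to exchange public keys. Suppose C intercepts the public key that A sends to B and substitutes its own, so B ends up with C's public key; in the same way A also ends up with C's public key, while C holds the public keys of both A and B. When A sends the secret file (encrypting the data with what it believes is B's public key, which is actually C's, and then signing it with A's own private key), C intercepts that data as well. C verifies the signature with A's public key and decrypts the data with its own private key, obtaining the secret file A meant for B. After doing whatever it likes with it, C encrypts the tampered data with B's public key, signs it with its own private key, and sends it to B. Because B holds C's public key, B can verify the signature and decrypt the data C sent. The transmission is therefore not secure.

    The case above has two problems: 1. the public key exchange is not secure; 2. A and B each hold a public key but do not know whom it belongs to. Certificates exist to solve this.
    A certificate carries four essential pieces of information: the owner's identity (name, domain, organization and so on), the owner's public key, the expiry time, and the signature that the CA produced over this data with its private key. Every operating system ships with the root certificates of authoritative CAs, which means the CA public keys are built in. Take https as an example:

    When a client visits an https website, the server sends its certificate to the client. The browser verifies it by checking the signature on the server certificate with the CA root certificate (the CA public key). Once it is confirmed that the certificate was issued by the CA, verification passes, which shows that the public key in the certificate really belongs to this website (the domain is checked too, but that is left aside here). The client then generates a symmetric key, encrypts it with the public key from the server certificate, and sends it to the server. After the server receives the symmetric key, the rest of the data is transmitted securely using symmetric encryption.

  • One-way authentication: https is normally one-way, meaning only the client verifies the server's identity, and the server trusts all clients.

  • Mutual authentication: the client verifies the server's identity and the server also verifies the client's identity, so the client needs a certificate of its own.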
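The sign-then-verify idea described above can be tried out with the openssl command line. The sketch below is only an illustration under assumed file names: it generates a throwaway RSA key pair in a temporary directory, signs the SHA-256 digest of a message with the private key, verifies it with the public key, and then shows that verification fails once the message is changed.

```shell
#!/bin/sh
work=$(mktemp -d)

# Throwaway key pair (file names are assumptions for this demo)
openssl genrsa -out "$work/key.pem" 2048 2>/dev/null
openssl rsa -in "$work/key.pem" -pubout -out "$work/pub.pem" 2>/dev/null

printf 'secret file from A to B\n' > "$work/msg.txt"

# Sign: hash the message with SHA-256, then sign the digest with the private key
openssl dgst -sha256 -sign "$work/key.pem" -out "$work/msg.sig" "$work/msg.txt"

# Verify with the public key; the exit status tells whether the signature matches
verify() {
    openssl dgst -sha256 -verify "$work/pub.pem" \
        -signature "$work/msg.sig" "$work/msg.txt" >/dev/null 2>&1
}

if verify; then original=valid; else original=invalid; fi

# Tamper with the message: the old signature no longer matches the new digest
printf 'tampered\n' >> "$work/msg.txt"
if verify; then tampered=valid; else tampered=invalid; fi

echo "original=$original tampered=$tampered"
rm -rf "$work"
```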

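With these concepts in place, we also need to look at certificate formats:

  1. Pfx: this certificate contains both the public key and the private key. It is typically used as the merchant's certificate when integrating with third-party APIs, such as the UnionPay payment interface. Because it contains the private key, a password is required to extract the key.

  2. Cer: this certificate contains only the public key and is typically used as the API provider's certificate. With only the public key, it can be used to verify signatures and to encrypt data.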

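Sorting out the kubernetes certificates

Certificates are used only for identity authentication. The sender signs with the private key that matches its certificate and sends the result to the client or server. The receiver first checks, using the CA, whether the certificate can be trusted; once it is trusted, the receiver takes the public key out of the certificate and verifies the signature over the whole data, making sure the data from the server or client has not been tampered with, and then reads the identity information in the certificate.

Certificates and CAs that kubernetes needs

To build the k8s cluster, each CA here issues only one set of certificates, which makes it easier to understand mutual authentication and to see which certificates, private keys and CA root certificates belong together. First, let's sort out which requests in k8s need certificates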

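  1. Peer certificate for communication between the nodes of the ETCD cluster

    Certificates and CA required

    • CA and private key used to issue the ETCD peer certificate (etcd-peer-ca.pem)

    • Peer certificate issued by the etcd-peer-ca.pem CA (etcd-peer.pem)

    • Private key file for etcd-peer.pem (etcd-peer-key.pem)

  2. Server certificate with which ETCD serves its clients

    Certificates and CA required

    • CA and private key used to issue the ETCD server certificate (etcd-server-ca.pem)

    • Server certificate issued by the etcd-server-ca.pem CA (etcd-server.pem)

    • Private key file for etcd-server.pem (etcd-server-key.pem)

  3. Client certificate with which etcdctl accesses the ETCD cluster (if needed)

    Certificates and CA required

    • CA and private key used to issue the ETCD client certificate (etcd-client-ca.pem)

    • Client certificate issued by the etcd-client-ca.pem CA (etcd-client-user.pem)

    • Private key file for etcd-client-user.pem (etcd-client-user-key.pem)

  4. Client certificate with which kube-apiserver accesses ETCD

    Certificates and CA required

    • CA and private key used to issue the ETCD client certificate (etcd-client-ca.pem)

    • Client certificate issued by the etcd-client-ca.pem CA (etcd-client-apiserver.pem)

    • Private key file for etcd-client-apiserver.pem (etcd-client-apiserver-key.pem)

  5. Server certificate with which kube-apiserver serves its clients

    Certificates and CA required

    • CA and private key used to issue the apiserver server certificate (apiserver-server-ca.pem)

    • Server certificate issued by the apiserver-server-ca.pem CA (apiserver-server.pem)

    • Private key file for apiserver-server.pem (apiserver-server-key.pem)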

  6. Server certificate with which kube-controller-manager serves its clients

    Certificates and CA required

    • CA and private key used to issue the controller-manager server certificate (controller-manager-server-ca.pem)

    • Server certificate issued by the controller-manager-server-ca.pem CA (controller-manager-server.pem)

    • Private key file for controller-manager-server.pem (controller-manager-server-key.pem)

  7. Client certificate with which kube-scheduler accesses apiserver

    Certificates and CA required

    • CA and private key used to issue apiserver client certificates (apiserver-client-ca.pem)

    • Client certificate issued by the apiserver-client-ca.pem CA (apiserver-client-scheduler.pem)

    • Private key file for apiserver-client-scheduler.pem (apiserver-client-scheduler-key.pem)

  8. Client certificate with which kube-controller-manager accesses apiserver

    Certificates and CA required

    • Client certificate issued by the apiserver-client-ca.pem CA (apiserver-client-controller-manager.pem)

    • Private key file for apiserver-client-controller-manager.pem (apiserver-client-controller-manager-key.pem)

  9. Client certificate with which kube-proxy accesses apiserver

    Certificates and CA required

    • Client certificate issued by the apiserver-client-ca.pem CA (apiserver-client-proxy.pem)

    • Private key file for apiserver-client-proxy.pem (apiserver-client-proxy-key.pem)

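  10. Client certificate with which kubelet accesses apiserver

    This one is a special case. Because worker nodes are hard to keep track of in advance, the Kubernetes team designed the TLS Bootstrapping mechanism to issue kubelet's server and client certificates automatically. The CA used for automatic issuance is always the one given by the --cluster-signing-cert-file option in the kube-controller-manager configuration

    Certificates and CA required

    • A shared CA used for signing certificates inside the cluster (cluster-ca.pem)

    • Client certificate issued by the cluster-ca.pem CA (kubelet-client.pem); issued automatically by controller-manager

  11. Client certificate with which the kubectl administrator accesses apiserver

    Certificates and CA required

    • Client certificate issued by the apiserver-client-ca.pem CA (apiserver-client-admin.pem)

    • Private key file for apiserver-client-admin.pem (apiserver-client-admin-key.pem)

  12. Server certificate with which kubelet serves its clients

    Certificates and CA required

    • Server certificate issued by the cluster-ca.pem CA (kubelet-server.pem); issued automatically by controller-manager

  13. Client certificate with which kube-apiserver accesses kubelet

    Certificates and CA required

    • CA used to issue kubelet client certificates (kubelet-client-ca.pem)

    • Client certificate issued by the kubelet-client-ca.pem CA (kubelet-client-apiserver.pem)

    • Private key file for kubelet-client-apiserver.pem (kubelet-client-apiserver-key.pem)

  14. Client certificate with which ApiServer accesses the API aggregation layer (if needed)

    Certificates and CA required

    • CA used to issue the apiserver client certificate (extra-apiserver-client-ca.pem)

    • Client certificate issued by extra-apiserver-client-ca.pem (extra-proxy-client.pem)

    • Private key for extra-proxy-client.pem (extra-proxy-client-key.pem)

From the analysis above, roughly 14 sets of certificates and several CAs are needed. That is a lot, but every set can be traced to a purpose. Looked at one by one, they make the Kubernetes authentication flow clear and show which components need to communicate securely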

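Note: a few problems may come up when configuring and generating the certificates above. They are critical, so I will go through them one by one:

  1. Certificates generated with the steps above will bring up a working Kubernetes cluster without any trouble, but when you open the ETCD cluster log you will find the following WARNING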

    2021-06-09 11:10:13.022735 I | embed: rejected connection from "127.0.0.1:50898" (error "tls: failed to verify client's certificate: x509: certificate specifies an incompatible key usage", ServerName "")
    WARNING: 2021/06/09 11:10:13 grpc: addrConn.createTransport failed to connect to {127.0.0.1:12379 0 }. Err :connection error: desc = "transport: authentication handshake failed: remote error: tls: bad certificate". Reconnecting...
    

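    It is only a warning, but as a programmer with OCD I could not let it go. After digging through the ETCD issues, I found the following. ETCD v3 serves requests over gRPC, and to support the REST API it uses a gRPC Gateway that forwards REST requests to the gRPC Server. When client authentication is enabled, however, the gRPC Server also authenticates the gRPC Gateway (which simply presents the etcd server certificate), and the CA used for that check is the one set with the --trusted-ca-file option. This creates two requirements. First, the etcd server certificate must support both server and client authentication. Second, because the CA that ETCD uses to verify client certificates and the CA that the gRPC Server uses to verify the gRPC Gateway certificate are set by the same option, they must be the same CA.

    In other words, an ETCD cluster can have at most two CAs (one for mutual authentication between cluster nodes, and one shared by the server and client certificates). The options do not enforce this explicitly, but logically two is the maximum; a single CA of course works as well.

  2. We said that kubelet's certificates are issued by controller-manager. By default, controller-manager only issues kubelet the client certificate it uses to access apiserver, but it can be configured to issue kubelet's server certificate as well (why that is needed is covered later in the Metrics Server section). So now controller-manager issues both the server certificate and the client certificate for kubelet. The catch is that it uses one and the same CA for both, namely the one given by --cluster-signing-cert-file.

    What is wrong with using the same CA?

    If every operation in a Kubernetes cluster used one CA, there would be nothing wrong at all. But to make the mutual authentication flow of Kubernetes certificates clear, we try to use different CAs and different certificates for the demonstration. The point is this: controller-manager issues kubelet a client certificate that is used to access apiserver, so the client CA of kube-apiserver (set with --client-ca-file) must be the same as the cluster CA of controller-manager (set with --cluster-signing-cert-file). And because controller-manager also issues kubelet's server certificate, when apiserver accesses kubelet and needs to verify kubelet's identity (by default apiserver does not authenticate kubelet; it does so only when --kubelet-certificate-authority is set), the CA for kubelet's server certificate must match the one given by --cluster-signing-cert-file of controller-manager. Once these three CAs are merged into one, the client certificates that the earlier analysis assigned to apiserver-client-ca have to be issued by the cluster CA instead.

    In other words, the CAs needed by the whole of Kubernetes are: the CA that issues the Apiserver server certificate, the CA that issues the controller-manager server certificate, the cluster CA (which issues kubelet's server and client certificates), the API aggregation CA, and the CA that Kubelet uses to verify its clients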

Building the ETCD Cluster

Creating the ETCD peer CA

Create the certificate signing request file for the ETCD peer CA

cat > etcd-peer-ca-csr.json <<"EOF"
{
  "CN": "etcd-peer-ca",
  "key": {
      "algo": "rsa",
      "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "ETCD",
      "OU": "System"
    }
  ],
  "ca": {
      "expiry": "87600h"
  }
}
EOF

Generate the certificate and private key of the ETCD peer CA

cfssl gencert -initca etcd-peer-ca-csr.json | cfssljson -bare etcd-peer-ca

Generating the ETCD peer certificate

Create the certificate signing request file for the ETCD peer certificate

cat > etcd-peer-csr.json <<"EOF"
{
  "CN": "etcd-peer",
  "hosts": [
    "127.0.0.1",
    "192.168.123.244",
    "192.168.123.245",
    "192.168.123.246"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [{
    "C": "CN",
    "ST": "GuangDong",
    "L": "ShenZhen",
    "O": "ETCD",
    "OU": "System"
  }]
}
EOF

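Note that hosts lists the domain names or IPs the certificate is issued for, and it must include the IPs of the nodes that run ETCD. When hosts is not empty, the certificate can later be used only on the listed IPs and domain names

Generate the ETCD peer certificate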

cfssl gencert -ca=etcd-peer-ca.pem -ca-key=etcd-peer-ca-key.pem -config=ca-config.json -profile=peer etcd-peer-csr.json | cfssljson  -bare etcd-peer

Generating the ETCD CA

Create the certificate signing request file for the ETCD CA

cat > etcd-ca-csr.json <<"EOF"
{
  "CN": "etcd-ca",
  "key": {
      "algo": "rsa",
      "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "ETCD",
      "OU": "System"
    }
  ],
  "ca": {
      "expiry": "87600h"
  }
}
EOF

Generate the ETCD CA certificate and private key

cfssl gencert -initca etcd-ca-csr.json | cfssljson -bare etcd-ca

Generating the ETCD server certificate

Create the certificate signing request file for the ETCD server certificate

cat > etcd-server-csr.json <<"EOF"
{
  "CN": "etcd-server",
  "hosts": [
    "127.0.0.1",
    "192.168.123.244",
    "192.168.123.245",
    "192.168.123.246"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [{
    "C": "CN",
    "ST": "GuangDong",
    "L": "ShenZhen",
    "O": "ETCD",
    "OU": "System"
  }]
}
EOF

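Generate the certificate and private key (pay special attention here: the profile must be peer, because this certificate has to work both as a server certificate and as a client certificate)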

cfssl gencert -ca=etcd-ca.pem -ca-key=etcd-ca-key.pem -config=ca-config.json -profile=peer etcd-server-csr.json | cfssljson  -bare etcd-server
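Since the etcd server certificate must carry both server and client authentication usages, it can be reassuring to check the generated file before distributing it. The sketch below uses openssl x509 to look for both extended key usages in the certificate text; has_server_and_client_auth is a hypothetical helper, and the file name assumes the command above was run in the current directory.

```shell
#!/bin/sh
# has_server_and_client_auth CERT_FILE
# Succeeds only if the certificate lists both the serverAuth and clientAuth
# extended key usages (shown by openssl as "TLS Web ... Authentication").
has_server_and_client_auth() {
    text=$(openssl x509 -in "$1" -noout -text) || return 1
    echo "$text" | grep -q 'TLS Web Server Authentication' &&
        echo "$text" | grep -q 'TLS Web Client Authentication'
}

if [ -f etcd-server.pem ]; then
    if has_server_and_client_auth etcd-server.pem; then
        echo "etcd-server.pem: server and client auth present"
    else
        echo "etcd-server.pem: a required usage is missing"
    fi
fi
```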

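Generating the ETCD client user certificate

This certificate is used by the command-line tool etcdctl to access the ETCD cluster. Without it, etcdctl can only access the current node through the insecure port

Create the certificate signing request file for the ETCD client user certificate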

cat > etcd-client-user-csr.json <<"EOF"
{
  "CN": "etcd-client-user",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [{
    "C": "CN",
    "ST": "GuangDong",
    "L": "ShenZhen",
    "O": "ETCD",
    "OU": "System"
  }]
}
EOF

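hosts lists no IP addresses or domain names here, which means the certificate can be used to access etcd from any node

Generate the client (terminal) certificate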

cfssl gencert -ca=etcd-ca.pem -ca-key=etcd-ca-key.pem -config=ca-config.json -profile=client etcd-client-user-csr.json | cfssljson  -bare etcd-client-user

Create the directories

mkdir -p /etc/etcd/ssl
mkdir -p /var/lib/etcd/default.etcd

Create the ETCD configuration file

cat >  /etc/etcd/etcd.conf <<"EOF"
#[Member]
ETCD_NAME="etcd1"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.123.244:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.123.244:2379,http://127.0.0.1:2379"

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.123.244:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.123.244:2379"
ETCD_INITIAL_CLUSTER="etcd1=https://192.168.123.244:2380,etcd2=https://192.168.123.245:2380,etcd3=https://192.168.123.246:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF

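Parameters explained:

  • ETCD_DATA_DIR: the directory where ETCD persists its data

  • ETCD_LISTEN_PEER_URLS: the IP and port used for communication between ETCD cluster nodes; the IP binds the listener to a specific network interface

  • ETCD_LISTEN_CLIENT_URLS: the network interface and port on which the current ETCD node serves clients

Create the ETCD service unit file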

cat > /etc/systemd/system/etcd.service <<"EOF"
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=-/etc/etcd/etcd.conf
WorkingDirectory=/var/lib/etcd/
ExecStart=/usr/local/bin/etcd \
  --cert-file=/etc/etcd/ssl/etcd-server.pem \
  --key-file=/etc/etcd/ssl/etcd-server-key.pem \
  --trusted-ca-file=/etc/etcd/ssl/etcd-ca.pem \
  --peer-cert-file=/etc/etcd/ssl/etcd-peer.pem \
  --peer-key-file=/etc/etcd/ssl/etcd-peer-key.pem \
  --peer-trusted-ca-file=/etc/etcd/ssl/etcd-peer-ca.pem \
  --peer-client-cert-auth \
  --client-cert-auth
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

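Parameters explained:

  • --cert-file: the server certificate file with which ETCD serves clients

  • --key-file: the private key that matches --cert-file

  • --trusted-ca-file: the root certificate of the trusted client CA, used to verify client certificates

  • --peer-cert-file: the certificate used between cluster nodes

  • --peer-key-file: the private key that matches --peer-cert-file

  • --peer-trusted-ca-file: the CA of the certificates used between cluster nodes

Distributing files to the other nodes

Copy the binaries, configuration files, certificates and other files the cluster nodes need to the other nodes

  1. Copy etcd/etcdctl/etcdutl into /usr/local/bin on all three nodes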

    cp etcd* /usr/local/bin/
    scp etcd* i2:/usr/local/bin/
    scp etcd* i3:/usr/local/bin/
    
  2. Copy the etcd certificates into /etc/etcd/ssl on all three nodes

    cp etcd-ca-key.pem  etcd-ca.pem  etcd-client-user-key.pem  etcd-client-user.pem  etcd-peer-ca-key.pem  etcd-peer-ca.pem  etcd-peer-key.pem  etcd-peer.pem  etcd-server-key.pem  etcd-server.pem /etc/etcd/ssl/
    scp etcd-ca-key.pem  etcd-ca.pem  etcd-client-user-key.pem  etcd-client-user.pem  etcd-peer-ca-key.pem  etcd-peer-ca.pem  etcd-peer-key.pem  etcd-peer.pem  etcd-server-key.pem  etcd-server.pem i2:/etc/etcd/ssl/
    scp etcd-ca-key.pem  etcd-ca.pem  etcd-client-user-key.pem  etcd-client-user.pem  etcd-peer-ca-key.pem  etcd-peer-ca.pem  etcd-peer-key.pem  etcd-peer.pem  etcd-server-key.pem  etcd-server.pem i3:/etc/etcd/ssl/
    
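  3. Copy the configuration files into /etc/etcd on all three nodes and adjust them on each node. The parameters to change are ETCD_NAME, ETCD_LISTEN_PEER_URLS, ETCD_LISTEN_CLIENT_URLS, ETCD_INITIAL_ADVERTISE_PEER_URLS and ETCD_ADVERTISE_CLIENT_URLS: set the node name and replace the IP address in these parameters with the IP of the current node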

    scp /etc/etcd/etcd.conf i2:/etc/etcd/
    scp /etc/etcd/etcd.conf i3:/etc/etcd/
    scp /etc/systemd/system/etcd.service i2:/etc/systemd/system/
    scp /etc/systemd/system/etcd.service i3:/etc/systemd/system/
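Editing the copied etcd.conf by hand on each node is error-prone, so one option is to render it from the node name and IP. The sketch below is a hypothetical render_etcd_conf helper that prints the same file as above with only the per-node values substituted; the name-to-IP pairing (etcd2 on 192.168.123.245, etcd3 on 192.168.123.246) is taken from ETCD_INITIAL_CLUSTER.

```shell
#!/bin/sh
# render_etcd_conf NAME IP
# Prints an etcd.conf for one member; only NAME and IP differ between nodes.
render_etcd_conf() {
    name=$1
    ip=$2
    cat <<EOF
#[Member]
ETCD_NAME="$name"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://$ip:2380"
ETCD_LISTEN_CLIENT_URLS="https://$ip:2379,http://127.0.0.1:2379"

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://$ip:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://$ip:2379"
ETCD_INITIAL_CLUSTER="etcd1=https://192.168.123.244:2380,etcd2=https://192.168.123.245:2380,etcd3=https://192.168.123.246:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
}

# Example: print the file for the second member
render_etcd_conf etcd2 192.168.123.245
```

On the target node the output would be redirected to /etc/etcd/etcd.conf instead of being printed.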
    

Starting ETCD

systemctl daemon-reload
systemctl enable --now etcd.service
systemctl status etcd

Testing the cluster (health check)

Access the cluster with the user certificate issued by the ETCD CA

ETCDCTL_API=3 etcdctl --write-out=table --cacert=/etc/etcd/ssl/etcd-ca.pem --cert=/etc/etcd/ssl/etcd-client-user.pem --key=/etc/etcd/ssl/etcd-client-user-key.pem --endpoints=https://192.168.123.244:2379,https://192.168.123.245:2379,https://192.168.123.246:2379 endpoint health

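Installing kube-apiserver

Generating the ETCD client certificate for the Api Server

Generate the client certificate that apiserver uses to access the ETCD cluster

Create the certificate signing request file for the etcd client apiserver certificate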

cat > etcd-client-apiserver-csr.json <<"EOF"
{
  "CN": "etcd-client-apiserver",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [{
    "C": "CN",
    "ST": "GuangDong",
    "L": "ShenZhen",
    "O": "Kubernetes",
    "OU": "System"
  }]
}
EOF

Issue the certificate with the ETCD CA, which also acts as the etcd client CA

cfssl gencert -ca=etcd-ca.pem -ca-key=etcd-ca-key.pem -config=ca-config.json -profile=client etcd-client-apiserver-csr.json | cfssljson  -bare etcd-client-apiserver

Generating the Api Server CA

Create the certificate signing request file for the Api Server server CA

cat > apiserver-server-ca-csr.json <<"EOF"
{
  "CN": "apiserver-server-ca",
  "key": {
      "algo": "rsa",
      "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "Kubernetes",
      "OU": "System"
    }
  ],
  "ca": {
      "expiry": "87600h"
  }
}
EOF

Generate the CA certificate and private key

cfssl gencert -initca apiserver-server-ca-csr.json | cfssljson -bare apiserver-server-ca

Generating the Api Server server certificate

Create the certificate signing request file for the Api Server server certificate

cat > apiserver-server-csr.json << "EOF"
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "192.168.123.244",
    "192.168.123.245",
    "192.168.123.246",
    "192.168.123.242",
    "10.96.0.1",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local",
    "*.k8s.dev-james.xyz"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "Kubernetes",
      "OU": "System"
    }
  ]
}
EOF

Note: if apiserver is set up for high availability, the VIP must also be listed in hosts; otherwise the following error occurs:

Unable to connect to the server: x509: certificate is valid for 127.0.0.1, 192.168.123.244, 192.168.123.245, 192.168.123.246, 10.96.0.1, not 192.168.123.242
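To catch a missing address before it shows up as the error above, the certificate can be inspected once it has been generated. The sketch below is a hypothetical cert_covers_ip helper that reads the Subject Alternative Name extension with openssl x509 and checks whether a given IP is listed; apiserver-server.pem is the file name this tutorial uses for the certificate.

```shell
#!/bin/sh
# cert_covers_ip CERT_FILE IP
# Succeeds if IP appears as an "IP Address" entry in the certificate's
# Subject Alternative Name extension.
cert_covers_ip() {
    openssl x509 -in "$1" -noout -text |
        grep -A1 'Subject Alternative Name' |
        tr ',' '\n' |
        sed 's/^[[:space:]]*//' |
        grep -qxF "IP Address:$2"
}

if [ -f apiserver-server.pem ]; then
    for ip in 192.168.123.242 10.96.0.1; do
        if cert_covers_ip apiserver-server.pem "$ip"; then
            echo "$ip: covered"
        else
            echo "$ip: NOT covered"
        fi
    done
fi
```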

Issue the certificate with the API Server CA:

cfssl gencert -ca=apiserver-server-ca.pem -ca-key=apiserver-server-ca-key.pem -config=ca-config.json -profile=server apiserver-server-csr.json | cfssljson  -bare apiserver-server

Generate the cluster CA (API client CA)

Create the CSR file for the cluster CA:

cat > cluster-ca-csr.json <<"EOF"
{
  "CN": "cluster-ca",
  "key": {
      "algo": "rsa",
      "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "Kubernetes",
      "OU": "System"
    }
  ],
  "ca": {
      "expiry": "87600h"
  }
}
EOF

Generate the CA certificate and private key:

cfssl gencert -initca cluster-ca-csr.json | cfssljson -bare cluster-ca

Create the key pair used for Service Accounts:

openssl genrsa -out sa.key 2048
openssl rsa -in sa.key -pubout -out sa.pub
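
To make the roles of the two files concrete: the private key signs, and the public key alone is enough to verify. The sketch below shows only that relationship using a throwaway key pair and a stand-in payload. Real Service Account tokens are JWTs, which this does not reproduce, and the temporary file names are assumptions.

```shell
#!/usr/bin/env bash
# Throwaway key pair in a temp directory; the real sa.key / sa.pub are not touched.
sa_demo_dir=$(mktemp -d)
openssl genrsa -out "$sa_demo_dir/sa.key" 2048 2>/dev/null
openssl rsa -in "$sa_demo_dir/sa.key" -pubout -out "$sa_demo_dir/sa.pub" 2>/dev/null

# Sign a stand-in payload with the private key ...
printf 'demo-token-payload' > "$sa_demo_dir/payload.txt"
openssl dgst -sha256 -sign "$sa_demo_dir/sa.key" \
  -out "$sa_demo_dir/payload.sig" "$sa_demo_dir/payload.txt"

# ... and verify the signature with the public key only.
sa_verify_result=$(openssl dgst -sha256 -verify "$sa_demo_dir/sa.pub" \
  -signature "$sa_demo_dir/payload.sig" "$sa_demo_dir/payload.txt")
echo "$sa_verify_result"
```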

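Generate the Kubelet Client CA

The API server also acts as a client when it calls the kubelet, so it needs its own kubelet client certificate, and issuing that certificate requires a Kubelet Client CA. Create that CA first.

Create the CSR file for the kubelet client CA: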

cat > kubelet-client-ca-csr.json <<"EOF"
{
  "CN": "kubelet-client-ca",
  "key": {
      "algo": "rsa",
      "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "Kubernetes",
      "OU": "System"
    }
  ],
  "ca": {
      "expiry": "87600h"
  }
}
EOF

Generate the CA certificate and private key:

cfssl gencert -initca kubelet-client-ca-csr.json | cfssljson -bare kubelet-client-ca

Generate the API server's kubelet client certificate

Create the CSR file:

cat > kubelet-client-apiserver-csr.json <<"EOF"
{
  "CN": "kubelet-client-apiserver",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [{
    "C": "CN",
    "ST": "GuangDong",
    "L": "ShenZhen",
    "O": "ETCD",
    "OU": "System"
  }]
}
EOF

Generate the certificate and private key:

cfssl gencert -ca=kubelet-client-ca.pem -ca-key=kubelet-client-ca-key.pem -config=ca-config.json -profile=client kubelet-client-apiserver-csr.json | cfssljson  -bare kubelet-client-apiserver

Generate the ExtraApiserver Client CA certificate

Create the CSR file:

cat > extra-apiserver-client-ca-csr.json <<"EOF"
{
  "CN": "extra-apiserver-client-ca",
  "key": {
      "algo": "rsa",
      "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "Kubernetes",
      "OU": "System"
    }
  ],
  "ca": {
      "expiry": "87600h"
  }
}
EOF

Generate the certificate and private key:

cfssl gencert -initca extra-apiserver-client-ca-csr.json | cfssljson -bare extra-apiserver-client-ca

Generate the API server's ExtraApiserver client certificate

The API server uses this certificate when it calls an ExtraApiServer.

Create the CSR file:

cat > extra-proxy-client-csr.json <<"EOF"
{
  "CN": "aggregator",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [{
    "C": "CN",
    "ST": "GuangDong",
    "L": "ShenZhen",
    "O": "system:masters",
    "OU": "System"
  }]
}
EOF

Generate the certificate and private key:

cfssl gencert -ca=extra-apiserver-client-ca.pem -ca-key=extra-apiserver-client-ca-key.pem -config=ca-config.json -profile=client extra-proxy-client-csr.json | cfssljson  -bare extra-proxy-client
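
With this many CAs in play, it helps to remember that a certificate is only accepted by a component configured with the CA that issued it. The sketch below illustrates that with two throwaway CAs created by openssl rather than cfssl. The `make_demo_ca` and `issued_by` helpers and all demo names are hypothetical, not files from this setup.

```shell
#!/usr/bin/env bash
ca_demo_dir=$(mktemp -d)

# Hypothetical helper: create a throwaway self-signed CA named $1.
make_demo_ca() {
  openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj "/CN=$1" \
    -keyout "$ca_demo_dir/$1-key.pem" -out "$ca_demo_dir/$1.pem" 2>/dev/null
}

# Hypothetical helper: succeed if certificate $2 chains to CA file $1.
issued_by() {
  openssl verify -CAfile "$1" "$2" 2>/dev/null | grep -q ': OK$'
}

make_demo_ca demo-cluster-ca
make_demo_ca demo-kubelet-client-ca

# Issue one client certificate from demo-cluster-ca only.
openssl req -new -newkey rsa:2048 -nodes -subj "/O=demo-group/CN=demo-client" \
  -keyout "$ca_demo_dir/client-key.pem" -out "$ca_demo_dir/client.csr" 2>/dev/null
openssl x509 -req -in "$ca_demo_dir/client.csr" -days 1 \
  -CA "$ca_demo_dir/demo-cluster-ca.pem" -CAkey "$ca_demo_dir/demo-cluster-ca-key.pem" \
  -CAcreateserial -out "$ca_demo_dir/client.pem" 2>/dev/null

for ca in demo-cluster-ca demo-kubelet-client-ca; do
  if issued_by "$ca_demo_dir/$ca.pem" "$ca_demo_dir/client.pem"; then
    echo "client.pem is trusted by $ca"
  else
    echo "client.pem is rejected by $ca"
  fi
done
```

The same `openssl verify -CAfile <ca> <cert>` check can be run against the real .pem files to confirm each certificate was signed by the CA you intended.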

Create the directories:

mkdir -p /etc/kubernetes/ssl
mkdir -p /var/log/kubernetes

Create the API server configuration file:

cat > /etc/kubernetes/kube-apiserver.conf <<"EOF"
KUBE_APISERVER_OPTS="--enable-admission-plugins=NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \
  --anonymous-auth=false \
  --bind-address=192.168.123.244 \
  --secure-port=6443 \
  --advertise-address=192.168.123.244 \
  --authorization-mode=Node,RBAC \
  --runtime-config=api/all=true \
  --enable-bootstrap-token-auth=true \
  --service-cluster-ip-range=10.96.0.0/16 \
  --service-node-port-range=30000-32767 \
  --tls-cert-file=/etc/kubernetes/ssl/apiserver-server.pem  \
  --tls-private-key-file=/etc/kubernetes/ssl/apiserver-server-key.pem \
  --client-ca-file=/etc/kubernetes/ssl/cluster-ca.pem \
  --kubelet-certificate-authority=/etc/kubernetes/ssl/cluster-ca.pem \
  --kubelet-client-certificate=/etc/kubernetes/ssl/kubelet-client-apiserver.pem \
  --kubelet-client-key=/etc/kubernetes/ssl/kubelet-client-apiserver-key.pem \
  --service-account-key-file=/etc/kubernetes/ssl/sa.pub \
  --service-account-signing-key-file=/etc/kubernetes/ssl/sa.key  \
  --service-account-issuer=api \
  --etcd-cafile=/etc/etcd/ssl/etcd-ca.pem \
  --etcd-certfile=/etc/kubernetes/ssl/etcd-client-apiserver.pem \
  --etcd-keyfile=/etc/kubernetes/ssl/etcd-client-apiserver-key.pem \
  --etcd-servers=https://192.168.123.244:2379,https://192.168.123.245:2379,https://192.168.123.246:2379 \
  --requestheader-client-ca-file=/etc/kubernetes/ssl/extra-apiserver-client-ca.pem \
  --requestheader-allowed-names=aggregator \
  --requestheader-extra-headers-prefix=X-Remote-Extra- \
  --requestheader-group-headers=X-Remote-Group \
  --requestheader-username-headers=X-Remote-User \
  --proxy-client-cert-file=/etc/kubernetes/ssl/extra-proxy-client.pem \
  --proxy-client-key-file=/etc/kubernetes/ssl/extra-proxy-client-key.pem \
  --allow-privileged=true \
  --apiserver-count=3 \
  --audit-log-maxage=30 \
  --audit-log-maxbackup=3 \
  --audit-log-maxsize=100 \
  --audit-log-path=/var/log/kubernetes/kube-apiserver-audit.log \
  --event-ttl=1h \
  --alsologtostderr=true \
  --logtostderr=false \
  --log-dir=/var/log/kubernetes \
  --v=4"
EOF

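Flag reference

  • --authorization-mode: the authorization modes to enable

  • --enable-bootstrap-token-auth: enables bootstrap token authentication, which TLS Bootstrapping relies on

  • --service-cluster-ip-range: the CIDR range from which Service IPs are allocated

  • --service-node-port-range: the port range reserved for NodePort Services

  • --tls-cert-file: the API server's serving certificate

  • --tls-private-key-file: the private key for the serving certificate

  • --client-ca-file: the client CA root certificate, used to verify client identities

  • --kubelet-certificate-authority: the path to the root certificate of the CA that signs the kubelet serving certificates
    Note: by default the API server does not verify the kubelet's serving certificate. To authenticate this connection, pass --kubelet-certificate-authority so the API server has a root certificate with which to verify the kubelet's serving certificate.

  • --kubelet-client-certificate: the client certificate the API server presents when it calls the kubelet, typically for commands such as kubectl exec and kubectl logs

  • --kubelet-client-key: the private key for the kubelet client certificate

  • --service-account-key-file: the public key used to verify the signature on Service Account tokens

  • --service-account-signing-key-file: the private key used to sign Service Account tokens

  • --etcd-cafile: the etcd server CA root certificate, used to verify the etcd servers' identity

  • --etcd-certfile: the client certificate the API server uses to access the etcd cluster; it must be issued by the etcd client CA

  • --etcd-keyfile: the private key for the etcd client certificate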
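
Since most of these flags point at files, a missing or misnamed file is a common reason for the API server to fail at startup. The sketch below is a small pre-flight check under stated assumptions: `missing_cert_files` is a hypothetical helper, and the demo runs against a throwaway options file rather than the real /etc/kubernetes/kube-apiserver.conf.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every .pem/.key/.pub path referenced in an
# options file that does not exist on this machine.
missing_cert_files() {
  local conf=$1 path
  grep -oE '=/[A-Za-z0-9._/-]+\.(pem|key|pub)' "$conf" | cut -c2- | sort -u |
    while read -r path; do
      [ -f "$path" ] || echo "$path"
    done
}

# Demo against a throwaway options file: one file exists, two do not.
opts_demo_dir=$(mktemp -d)
touch "$opts_demo_dir/present.pem"
cat > "$opts_demo_dir/demo.conf" <<EOF
DEMO_OPTS="--tls-cert-file=$opts_demo_dir/present.pem \\
  --tls-private-key-file=$opts_demo_dir/absent-key.pem \\
  --service-account-key-file=$opts_demo_dir/absent.pub"
EOF

missing_cert_files "$opts_demo_dir/demo.conf"
```

On a real Master node the call would be `missing_cert_files /etc/kubernetes/kube-apiserver.conf`; empty output means every referenced file is in place.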

Create the API server systemd unit file:

cat > /etc/systemd/system/kube-apiserver.service << "EOF"
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
After=etcd.service
Wants=etcd.service

[Service]
EnvironmentFile=-/etc/kubernetes/kube-apiserver.conf
ExecStart=/usr/local/bin/kube-apiserver $KUBE_APISERVER_OPTS
Restart=on-failure
RestartSec=5
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

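Distribute the files to the other nodes

Copy the binaries, configuration files, and certificates needed to run the API server to every Master node in the cluster.

  1. Copy kube-apiserver and the other binaries to /usr/local/bin/ on each node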

    cp kube-apiserver kube-controller-manager kube-scheduler kubelet kube-proxy kubectl /usr/local/bin/
    scp kube-apiserver kube-controller-manager kube-scheduler kubelet kube-proxy kubectl i2:/usr/local/bin/
    scp kube-apiserver kube-controller-manager kube-scheduler kubelet kube-proxy kubectl i3:/usr/local/bin/
    
  2. Copy kube-apiserver.conf to /etc/kubernetes/ on each node, then change --bind-address and --advertise-address to that node's IP address

    scp /etc/kubernetes/kube-apiserver.conf i2:/etc/kubernetes/
    scp /etc/kubernetes/kube-apiserver.conf i3:/etc/kubernetes/
    
  3. Copy kube-apiserver.service to /etc/systemd/system/ on each node

    scp /etc/systemd/system/kube-apiserver.service i2:/etc/systemd/system/
    scp /etc/systemd/system/kube-apiserver.service i3:/etc/systemd/system/
    
  4. Copy the certificate files to /etc/kubernetes/ssl/ on each node

    cp apiserver-server-ca-key.pem  apiserver-server-ca.pem  apiserver-server-key.pem  apiserver-server.pem cluster-ca-key.pem  cluster-ca.pem etcd-client-apiserver-key.pem  etcd-client-apiserver.pem extra-apiserver-client-ca-key.pem  extra-apiserver-client-ca.pem  extra-proxy-client-key.pem  extra-proxy-client.pem kubelet-client-apiserver-key.pem  kubelet-client-apiserver.pem  kubelet-client-ca-key.pem  kubelet-client-ca.pem /etc/kubernetes/ssl/
    scp apiserver-server-ca-key.pem  apiserver-server-ca.pem  apiserver-server-key.pem  apiserver-server.pem cluster-ca-key.pem  cluster-ca.pem etcd-client-apiserver-key.pem  etcd-client-apiserver.pem extra-apiserver-client-ca-key.pem  extra-apiserver-client-ca.pem  extra-proxy-client-key.pem  extra-proxy-client.pem kubelet-client-apiserver-key.pem  kubelet-client-apiserver.pem  kubelet-client-ca-key.pem  kubelet-client-ca.pem i2:/etc/kubernetes/ssl/
    scp apiserver-server-ca-key.pem  apiserver-server-ca.pem  apiserver-server-key.pem  apiserver-server.pem cluster-ca-key.pem  cluster-ca.pem etcd-client-apiserver-key.pem  etcd-client-apiserver.pem extra-apiserver-client-ca-key.pem  extra-apiserver-client-ca.pem  extra-proxy-client-key.pem  extra-proxy-client.pem kubelet-client-apiserver-key.pem  kubelet-client-apiserver.pem  kubelet-client-ca-key.pem  kubelet-client-ca.pem i3:/etc/kubernetes/ssl/
    
    # Copy the Service Account public and private keys into the ssl directory
    cp sa.key sa.pub /etc/kubernetes/ssl/
    scp sa.key sa.pub i2:/etc/kubernetes/ssl/
    scp sa.key sa.pub i3:/etc/kubernetes/ssl/
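
The per-node edit in step 2 is easy to forget or mistype, so it can be scripted. The sketch below assumes GNU sed, and `set_apiserver_ip` is a hypothetical helper; the demo edits a throwaway file that contains only a few of the flags, not the real configuration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: point --bind-address and --advertise-address at one IP.
# Relies on GNU sed for -i and -E.
set_apiserver_ip() {
  local conf=$1 ip=$2
  sed -i -E \
    -e "s/(--bind-address=)[0-9.]+/\1${ip}/" \
    -e "s/(--advertise-address=)[0-9.]+/\1${ip}/" \
    "$conf"
}

# Demo on a throwaway copy containing the two relevant flags.
ip_demo_conf=$(mktemp)
cat > "$ip_demo_conf" <<'EOF'
KUBE_APISERVER_OPTS="--anonymous-auth=false \
  --bind-address=192.168.123.244 \
  --secure-port=6443 \
  --advertise-address=192.168.123.244 \
  --v=4"
EOF

set_apiserver_ip "$ip_demo_conf" 192.168.123.245
grep -E -- '--(bind|advertise)-address' "$ip_demo_conf"
```

On each of the other Master nodes, the helper would be run against /etc/kubernetes/kube-apiserver.conf with that node's own IP.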
    

Start the API server:

systemctl daemon-reload
systemctl enable --now kube-apiserver.service
systemctl status kube-apiserver

Configure kubectl

Generate the admin client certificate for the API server

Create the CSR file for the admin user:

cat > apiserver-client-admin-csr.json << "EOF"
{
  "CN": "admin",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "system:masters",             
      "OU": "System"
    }
  ]
}
EOF

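Note that O must be system:masters here. Kubernetes predefines several user groups, and the system:masters group has full cluster-administrator privileges. The CN becomes the user name, which is admin in this case.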
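
With x509 client certificates, Kubernetes takes the user name from the CN and the group names from the O fields of the subject. The sketch below shows how to read those two values back out of a certificate. It uses a throwaway self-signed certificate with the same O and CN as the CSR above; the variable and file names are assumptions.

```shell
#!/usr/bin/env bash
subj_demo_dir=$(mktemp -d)
# Throwaway self-signed certificate carrying the same O and CN as the admin CSR.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/O=system:masters/CN=admin" \
  -keyout "$subj_demo_dir/demo-key.pem" -out "$subj_demo_dir/demo.pem" 2>/dev/null

# RFC2253 formatting yields a comma-separated "CN=...,O=..." subject line.
subject=$(openssl x509 -in "$subj_demo_dir/demo.pem" -noout -subject -nameopt RFC2253)
subject=${subject#subject=}   # drop the leading label
subject=${subject# }          # older OpenSSL prints a space after the label

cert_user=$(printf '%s\n' "$subject" | tr ',' '\n' | sed -n 's/^CN=//p')
cert_group=$(printf '%s\n' "$subject" | tr ',' '\n' | sed -n 's/^O=//p')
echo "user:  $cert_user"
echo "group: $cert_group"
```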

Generate the certificate and private key. This certificate must be issued by cluster-ca.pem:

cfssl gencert -ca=cluster-ca.pem -ca-key=cluster-ca-key.pem -config=ca-config.json -profile=client apiserver-client-admin-csr.json | cfssljson  -bare apiserver-client-admin

Configure kube.config

Use kubectl to create the kube.config file:

kubectl config set-cluster kubernetes --certificate-authority=apiserver-server-ca.pem --embed-certs=true --server=https://192.168.123.242:6443 --kubeconfig=kube.config
kubectl config set-credentials admin --client-certificate=apiserver-client-admin.pem --client-key=apiserver-client-admin-key.pem --embed-certs=true --kubeconfig=kube.config
kubectl config set-context admin@kubernetes --cluster=kubernetes --user=admin --kubeconfig=kube.config
kubectl config use-context admin@kubernetes --kubeconfig=kube.config

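A brief explanation of the commands above:

  • The first command sets the cluster URL and the CA of the server certificate, which amounts to defining an entry point to the cluster

  • The second command defines a user named admin and attaches its client certificate

  • The third command ties the user to the cluster as a context; a namespace can also be set here to choose which namespace is used by default

  • The fourth command selects the context to use by default

Here is the content of the resulting kube.config file: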

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUQ1RENDQXN5Z0F3SUJBZ0lVSEVkLzM1QS9qMWx2MlNCZDZ6VlJvNkFIMWZrd0RRWUpLb1pJaHZjTkFRRUwKQlFBd2VERUxNQWtHQTFVRUJoTUNRMDR4RWpBUUJnTlZCQWdUQ1VkMVlXNW5SRzl1WnpFUk1BOEdBMVVFQnhNSQpVMmhsYmxwb1pXNHhFekFSQmdOVkJBb1RDa3QxWW1WeWJtVjBaWE14RHpBTkJnTlZCQXNUQmxONWMzUmxiVEVjCk1Cb0dBMVVFQXhNVFlYQnBjMlZ5ZG1WeUxYTmxjblpsY2kxallUQWVGdzB5TXpBMk1Ea3dPRFF3TURCYUZ3MHoKTXpBMk1EWXdPRFF3TURCYU1IZ3hDekFKQmdOVkJBWVRBa05PTVJJd0VBWURWUVFJRXdsSGRXRnVaMFJ2Ym1jeApFVEFQQmdOVkJBY1RDRk5vWlc1YWFHVnVNUk13RVFZRFZRUUtFd3BMZFdKbGNtNWxkR1Z6TVE4d0RRWURWUVFMCkV3WlRlWE4wWlcweEhEQWFCZ05WQkFNVEUyRndhWE5sY25abGNpMXpaWEoyWlhJdFkyRXdnZ0VpTUEwR0NTcUcKU0liM0RRRUJBUVVBQTRJQkR3QXdnZ0VLQW9JQkFRQ3g5bGNkOGIxcHFsZTJqcERNS1VTL044WUY1QnJVU0wyNgpndUh6eWdVRlV1ZW9PbkwyTHljWmZ4bG1uUmxrMVAydnQxRm9oOWgrK1ZpVzNoSEZYNEFhdFdGS1ZZTFJNMHBBCmlwZEZlMlR3ZDJDTnJSSVROME1HVjI3d2JHYnVBNjFxYXV5OWtIUnBWZW1LcXBuMS84U2U2ZTk5cFkvd3NtZHgKaGVlcGpjOGF2YkR0clZBQzc0SmMwZ3hoeTEzT0lZSk9GZWNobmkxcWphM1YvQkMrbGEwVHh3NnNSVGpDMHA5OAp0YURNSEFLQ3hGUFh2aUhWa1VrQVF5YW1RSXRDSkdyeFhlamN1dlpUS3R4NlZ5cWRVUlljRyszYWtlN2tRdVNYCnpsSkdwNGNVNC9MMThmckw1VmJVVVBpWFNDZEdLUStrbDdmTWFSZ1ZOb2UyUDdLSjNKZjNBZ01CQUFHalpqQmsKTUE0R0ExVWREd0VCL3dRRUF3SUJCakFTQmdOVkhSTUJBZjhFQ0RBR0FRSC9BZ0VDTUIwR0ExVWREZ1FXQkJRRwpLeHJ3bDdVK20xMHkxNXpQUDc0bTBDOTcvREFmQmdOVkhTTUVHREFXZ0JRR0t4cndsN1UrbTEweTE1elBQNzRtCjBDOTcvREFOQmdrcWhraUc5dzBCQVFzRkFBT0NBUUVBcGpjdWFSUmt6UVVkNEZtOG9XM292dm9qWTRQdVN3a1kKNGdxY1ZKaXZIQzA0eGVkWktxUjB4VU5URFpKU3J4UVJFS1cyQlZ6aFlsUlRyZWRuSXpGMzZvSk9vZXJqWDAvQQowL1FkR2NpWWJUVnlYNy9yTWJQS25nRlRIMkJFZm9COXQ1eUxvWlFLYjAxTk9OTVhraE9oRjF0UzlweFBCRERuCjhtMnRBNVZ5UGViemQ3T1Y0aTl6M24wS0ZYTTJCSFVDc3g4c1kvMG5ESldzQ3hWTWJ3UGJCY21SZ0tRWU11MkMKZHhPVVZVSmwxR3B6dmpaV29mY1dNbDJqWm1KclFEbS9WclUzczdveTREREY5UWVJNHhOUEdaYkhZSjJNa0F2UwpTNlBjVFB1YmxjYTlFdnFrc1J1cmhSVERCWWVoNXRHVUFFZ3BlbTBoTkh5OGs0YUFrVFU4bkE9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
    server: https://192.168.123.242:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: admin
  name: admin@kubernetes
current-context: admin@kubernetes
kind: Config
preferences: {}
users:
- name: admin
  user:
    client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUQ2VENDQXRHZ0F3SUJBZ0lVRVEvUCtjNkRRdXMxQ3FZNTVKT0dFd1YxRWNjd0RRWUpLb1pJaHZjTkFRRUwKQlFBd2VERUxNQWtHQTFVRUJoTUNRMDR4RWpBUUJnTlZCQWdUQ1VkMVlXNW5SRzl1WnpFUk1BOEdBMVVFQnhNSQpVMmhsYmxwb1pXNHhFekFSQmdOVkJBb1RDa3QxWW1WeWJtVjBaWE14RHpBTkJnTlZCQXNUQmxONWMzUmxiVEVjCk1Cb0dBMVVFQXhNVFlYQnBjMlZ5ZG1WeUxXTnNhV1Z1ZEMxallUQWVGdzB5TXpBMk1Ea3hNREV5TURCYUZ3MHoKTXpBMk1EWXhNREV5TURCYU1HNHhDekFKQmdOVkJBWVRBa05PTVJJd0VBWURWUVFJRXdsSGRXRnVaMFJ2Ym1jeApFVEFQQmdOVkJBY1RDRk5vWlc1YWFHVnVNUmN3RlFZRFZRUUtFdzV6ZVhOMFpXMDZiV0Z6ZEdWeWN6RVBNQTBHCkExVUVDeE1HVTNsemRHVnRNUTR3REFZRFZRUURFd1ZoWkcxcGJqQ0NBU0l3RFFZSktvWklodmNOQVFFQkJRQUQKZ2dFUEFEQ0NBUW9DZ2dFQkFMYkk0MksvTkJEakVHR0lYek82aW1QUk1wZlpyTHJnTko4V0FiVE1SbkFJZVBZRgpVZWVoeDltNjJtM2RlUHcyeTVHV2FDcjY4Wm8yMC9VdWJJb21ZYXdjOElTQlZiMStEQW1CZjVzTjI4ZnZyMW9DCis2ZmhmMzIyQTJpcGFJZ0pQcXAwbGRCUlI0cHZsTjJwWEF4Umo0Y1djTWp0N0tlREhRKzBweFJRb2U4alJQa0IKL1J1b1BMaW9Vc1dycXN2N3kzUzdtK0xwVnFaT3hlVGNTZHNYRVZqRDNmNHVQdXN4YzN1MUVNS2huVFFBWjh6WApoNFZydG8wV2IzUHk2WWhHN2lrcWhaMWM3TW9QOEZjNFNYUlM5WTJCTzhiYWVIelhPWjcvTXBpU0Y3MDdDalBrCmNMVSszYTlIRkppTDZGV2s2di9IRGpERmdRbVNCQUZ4OHFuVkhhTUNBd0VBQWFOMU1ITXdEZ1lEVlIwUEFRSC8KQkFRREFnV2dNQk1HQTFVZEpRUU1NQW9HQ0NzR0FRVUZCd01DTUF3R0ExVWRFd0VCL3dRQ01BQXdIUVlEVlIwTwpCQllFRkJhVEVVTmRId1p5dGd3MzEyWElwK09xVGJVa01COEdBMVVkSXdRWU1CYUFGRjI2ZVFQbTZWcU10V1pBClFtR3ptSzE2bEd5WE1BMEdDU3FHU0liM0RRRUJDd1VBQTRJQkFRQThFVnh2cXRYMitwZ3RheFhRdXUzd1MvRHIKY1pJaWh3TVJJTEVoWktlZS94WU8wd3hPY2dhb0xaWHlLUmRJZzJvS0RVVXpHRUhmUTZ6K1k1alV2U1ZZeUxFZQpsN0NCQnIvbjBCVlB2ajdPL1lVMUVEZzlrOUZzYUZLVE5iQmN3U3h4ZnppbUcrS3VQL0JrNS9GN3liOEJReExxCkdHejBjQ3JicmFwb29RYmZSYm5RTmZkYVJyWWpibVNRVkJ0STNSQzl2RXBmcUZCS1JhRlFnclpaNm9VcG8yVCsKTHlJVG9PQmJOeXpXSjRiUk9uamZ3QVI4RG1PbUc5STNDWENXU3ZpdzNjWXVMOG4vUDZQbkJJbjFFK2pQcHBOVwp6OTc1TENOUzVEUk1FeXpKTDFaRVRENi9zZ2kya3E4MEp0SUFwRXluWFZYVDhiVHRvVEtEN2xmYkMvL3gKLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
    client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb2dJQkFBS0NBUUVBdHNqallyODBFT01RWVloZk03cUtZOUV5bDltc3V1QTBueFlCdE14R2NBaDQ5Z1ZSCjU2SEgyYnJhYmQxNC9EYkxrWlpvS3ZyeG1qYlQ5UzVzaWlaaHJCendoSUZWdlg0TUNZRi9tdzNieCsrdldnTDcKcCtGL2ZiWURhS2xvaUFrK3FuU1YwRkZIaW0rVTNhbGNERkdQaHhad3lPM3NwNE1kRDdTbkZGQ2g3eU5FK1FIOQpHNmc4dUtoU3hhdXF5L3ZMZEx1YjR1bFdwazdGNU54SjJ4Y1JXTVBkL2k0KzZ6RnplN1VRd3FHZE5BQm56TmVICmhXdTJqUlp2Yy9McGlFYnVLU3FGblZ6c3lnL3dWemhKZEZMMWpZRTd4dHA0Zk5jNW52OHltSklYdlRzS00rUncKdFQ3ZHIwY1VtSXZvVmFUcS84Y09NTVdCQ1pJRUFYSHlxZFVkb3dJREFRQUJBb0lCQUI5UFJmUHlRSjdyNWpCdQp4YS84c2h2ckI3bVBKZEZVK202TnZIa1Z6TE1BSUlnejNSWEtWb3RyUUdNMVhyWUZSTldKYUFxRXRjSHV4bHZuCk9keG9PcTdhdmpCVVh6VjRVK09FOVREQUxQZVFqUDdrSit0WDZ4akRoczMweHQwV2lFOTJiUHNrRVJjYmllcDIKU2pncCtHWHhhQnhpOVBpMHN0T3Y1RGJNb1JCdk5kK3JRaU5CaU5YOUZuY0dvN0RWS1FrelVjcy9uT0FvbG0vUgo5MXVNY3BCeGNmVzRjUGhLS3FSQ1JvN2NOM294ZUNJc3g2bE5xT0JJOHdaU3hFL2diWEtGMlBhN3h1K21VbGdnCldYV3JkRUo3Y2g5Si83MHI4UmQ5Tzg0TWxheW5HOEdrVFNJbUppYm5kSlJMemlnRzNLNEVhanZHNkp4QUZTQTQKVjJOSlpRRUNnWUVBeU9Da21MYWVjR1UxZjF5UC92WCtMRGJKblZseWd0cE9aRjh6c1FRVkNnT3MrencrVlREbgo2NHBNNlo3d2phWFR6YXJDVVhtZ0FHWThtSUVtV1d5dkd2V1JyQ2JzL0hkQWVOcTFNbS8wUFJ4K2V6eHBWR01NCklqalJDb0xoQ0VlanZrTkxLemZpWWZ0QXBZUmM3S0FLdUx4SXVoNW05QW9pSFJkWFU2Snhta0VDZ1lFQTZQRkEKcWN5SC9wanBjV1ZBZGZ3Q08wS1cySlBzQ2NIQWVGRW5jK2g2K3NVT3pjdjJQS0NKeGFRM2JLcVZhaFFtYi8vRAp0bTZOcmhTUTRjWXk1TW0xaUJBVmpmZXNzbHJQRG5DaldERG9tS2RoOS92ZUI4ditId2RhWGdLZVVsYXlQZXZEClRnSVpITnY4cjN2TURGdzdjNW0yeVRDVTZEdmFPcnhlODVobjF1TUNnWUJwZTMwekxBSTY1d3FXbkphSXZjZ0EKazZ4L1VlOE53M0VTeCtNdSt3UEpSSERiWktFZXZ4V3AyKy9UWmNEUHdOcGR6Mk5Hd1dWQmtHNFZid3dpUFM0ZQpMQUdZc3NBVE90UENJcWF2bTVaWFdOVWFCWGtSOVFqMEYzMjkxVWd4dnR5L0Zqc3NzS1hSNmN2aW5vVGxSSTBjCndOSTMyYXNhVHcvbTB0RHFmQXpIZ1FLQmdDWDhTVjRuRXpvcU4wOGRnc0I2b3VhRStsSkE5T20yWmF0NUdHVG4KVVQ2WmFjdVhhZ1VDN05TRTdlRFRoRi95L3oyZVNJejBSRGhSOURwTTlybW1SdXIwTEgrbEZzMVN6NWI4T1RiRgphdmlSdXdFVVdtV05GMWg1KzN0L0U5QTdnUDlsOWNnL3dWYWFiUDgwd2RaMko5KzIvajZhcEgybVhQVGRDT0xTClJJU2ZBb0dBVFdzTHB4Y2Iyd3hIdlI
5aDNHWDg0V205SmFKV1RFQmZVejZUSk9FSUdMU21uSnBZWVdFaTVJV3kKMnZDN2RQdVF3MUlxM3ZGTVU1V0RFclF5QmFPMm44QjRWYjVVMjJMY3c5OWJTRHNUSFFyc0pJbGJ5MmlTMzBlaApnL3VDaUNQRlcyUVIvMmRxbFp0RHpQVko3dUZtY3hHTDFtZzZFMjgvK3pkbjNSK29MdFU9Ci0tLS0tRU5EIFJTQSBQUklWQVRFIEtFWS0tLS0tCg==

The file has three main parts: the cluster, the user identity, and the context that ties a user identity to a cluster.
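
Because `--embed-certs=true` stores the certificates as base64 inside the file, it is not obvious at a glance which CA ended up in there. The sketch below decodes the `certificate-authority-data` field and prints the certificate's subject and expiry. It works on a minimal stand-in file with a throwaway CA named demo-server-ca, both of which are assumptions for the demo.

```shell
#!/usr/bin/env bash
kc_demo_dir=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj "/CN=demo-server-ca" \
  -keyout "$kc_demo_dir/ca-key.pem" -out "$kc_demo_dir/ca.pem" 2>/dev/null

# Minimal stand-in for the cluster section of a kubeconfig file.
cat > "$kc_demo_dir/demo.config" <<EOF
clusters:
- cluster:
    certificate-authority-data: $(base64 < "$kc_demo_dir/ca.pem" | tr -d '\n')
    server: https://192.168.123.242:6443
  name: kubernetes
EOF

# Extract the field, decode it, and print the subject and expiry date.
embedded_ca_info=$(awk '/certificate-authority-data:/ {print $2}' "$kc_demo_dir/demo.config" \
  | base64 -d | openssl x509 -noout -subject -enddate)
echo "$embedded_ca_info"
```

Running the same awk, base64, and openssl pipeline against the real kube.config should report the apiserver-server-ca subject.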

Copy kube.config into place:

mkdir -p ~/.kube
cp kube.config ~/.kube/config

Test connectivity and check the cluster status:

kubectl get cs

Deploying kube-controller-manager

Generate the kube-controller-manager Server CA

Create the CSR file for the controller manager server CA:

cat > controller-manager-server-ca-csr.json <<"EOF"
{
  "CN": "controller-manager-server-ca",
  "key": {
      "algo": "rsa",
      "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "Kubernetes",
      "OU": "System"
    }
  ],
  "ca": {
      "expiry": "87600h"
  }
}
EOF

Generate the root certificate and private key:

cfssl gencert -initca controller-manager-server-ca-csr.json | cfssljson -bare controller-manager-server-ca

Generate the controller manager server certificate

Create the CSR file:

cat > controller-manager-server-csr.json << "EOF"
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "192.168.123.244",
    "192.168.123.245",
    "192.168.123.246"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "Kubernetes",
      "OU": "System"
    }
  ]
}
EOF

Generate the certificate and private key:

cfssl gencert -ca=controller-manager-server-ca.pem -ca-key=controller-manager-server-ca-key.pem -config=ca-config.json -profile=server controller-manager-server-csr.json | cfssljson  -bare controller-manager-server

Generate the client certificate for accessing the API server

Create the CSR file:

cat > apiserver-client-controller-manager-csr.json << "EOF"
{
    "CN": "system:kube-controller-manager",
    "hosts": [],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
      {
        "C": "CN",
        "ST": "GuangDong",
        "L": "ShenZhen",
        "O": "system:kube-controller-manager",
        "OU": "System"
      }
    ]
}
EOF

Generate the certificate and private key:

cfssl gencert -ca=cluster-ca.pem -ca-key=cluster-ca-key.pem -config=ca-config.json -profile=client apiserver-client-controller-manager-csr.json | cfssljson  -bare apiserver-client-controller-manager

Create the kubeconfig file:

kubectl config set-cluster kubernetes --certificate-authority=apiserver-server-ca.pem --embed-certs=true --server=https://192.168.123.242:6443 --kubeconfig=kube-controller-manager.kubeconfig
kubectl config set-credentials system:kube-controller-manager --client-certificate=apiserver-client-controller-manager.pem --client-key=apiserver-client-controller-manager-key.pem --embed-certs=true --kubeconfig=kube-controller-manager.kubeconfig
kubectl config set-context system:kube-controller-manager --cluster=kubernetes --user=system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig
kubectl config use-context system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig

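These commands set the API server address and its server CA certificate, the client certificate, and the context.

Create the configuration file: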

cat > /etc/kubernetes/kube-controller-manager.conf << "EOF"
KUBE_CONTROLLER_MANAGER_OPTS="--secure-port=10257 \
  --bind-address=127.0.0.1 \
  --kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \
  --service-cluster-ip-range=10.96.0.0/16 \
  --cluster-name=kubernetes \
  --cluster-signing-cert-file=/etc/kubernetes/ssl/cluster-ca.pem \
  --cluster-signing-key-file=/etc/kubernetes/ssl/cluster-ca-key.pem \
  --allocate-node-cidrs=true \
  --cluster-cidr=172.18.0.0/16 \
  --cluster-signing-duration=87600h \
  --root-ca-file=/etc/kubernetes/ssl/apiserver-server-ca.pem \
  --service-account-private-key-file=/etc/kubernetes/ssl/sa.key \
  --leader-elect=true \
  --feature-gates=RotateKubeletServerCertificate=true \
  --controllers=*,bootstrapsigner,tokencleaner \
  --horizontal-pod-autoscaler-sync-period=10s \
  --tls-cert-file=/etc/kubernetes/ssl/controller-manager-server.pem \
  --tls-private-key-file=/etc/kubernetes/ssl/controller-manager-server-key.pem \
  --use-service-account-credentials=true \
  --alsologtostderr=true \
  --logtostderr=false \
  --log-dir=/var/log/kubernetes \
  --v=2"
EOF

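Flag reference:

  • --kubeconfig: the configuration file for accessing the API server; as described above, it holds three key pieces of information: the cluster, the user, and the context

  • --service-cluster-ip-range: the CIDR range for Services in the cluster

  • --cluster-signing-cert-file: the CA root certificate used to issue the certificates that other components use to access the API server. Because those are API server client certificates, they must be issued by the API server's client CA, so this value must match the API server's --client-ca-file

  • --cluster-cidr: the CIDR range for Pods in the cluster

  • --controllers: the list of controllers to enable

  • --cluster-signing-duration: the validity period of the issued certificates

  • --service-account-private-key-file: the private key that signs Service Account tokens; when a Service Account later calls the API server, the API server verifies the token's signature

  • --root-ca-file: if set, this root certificate authority is included in each Service Account's token Secret, where it is used to verify the API server's serving certificate. It must therefore be the root certificate of the CA that issued the API server certificate

  • --horizontal-pod-autoscaler-sync-period: the HPA sync period, 15s by default

  • --tls-cert-file: the certificate for the controller manager's own HTTPS endpoint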

Create the controller manager systemd unit file:

cat > /etc/systemd/system/kube-controller-manager.service << "EOF"
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/kube-controller-manager.conf
ExecStart=/usr/local/bin/kube-controller-manager $KUBE_CONTROLLER_MANAGER_OPTS
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Install on the other Master nodes

Every Master node needs kube-controller-manager installed, with the same configuration.

  1. Copy the certificates and related files

    cp apiserver-client-controller-manager-key.pem controller-manager-server-ca-key.pem controller-manager-server-key.pem apiserver-client-controller-manager.pem controller-manager-server-ca.pem controller-manager-server.pem /etc/kubernetes/ssl/
    scp apiserver-client-controller-manager-key.pem controller-manager-server-ca-key.pem controller-manager-server-key.pem apiserver-client-controller-manager.pem controller-manager-server-ca.pem controller-manager-server.pem i2:/etc/kubernetes/ssl/
    scp apiserver-client-controller-manager-key.pem controller-manager-server-ca-key.pem controller-manager-server-key.pem apiserver-client-controller-manager.pem controller-manager-server-ca.pem controller-manager-server.pem i3:/etc/kubernetes/ssl/
    
  2. Copy the configuration files

    scp /etc/kubernetes/kube-controller-manager.conf i2:/etc/kubernetes/
    scp /etc/kubernetes/kube-controller-manager.conf i3:/etc/kubernetes/
    
    cp kube-controller-manager.kubeconfig /etc/kubernetes/
    scp kube-controller-manager.kubeconfig i2:/etc/kubernetes/
    scp kube-controller-manager.kubeconfig i3:/etc/kubernetes/
    
  3. Copy the systemd unit file

    scp /etc/systemd/system/kube-controller-manager.service i2:/etc/systemd/system/
    scp /etc/systemd/system/kube-controller-manager.service i3:/etc/systemd/system/
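
The same pair of scp commands recurs for every component, so a small loop can reduce copy-and-paste mistakes. This is a sketch, not what the steps above actually ran: `distribute`, `MASTER_NODES`, and `DRY_RUN` are hypothetical names, and the host aliases i2 and i3 are taken from the commands above.

```shell
#!/usr/bin/env bash
# Assumed host aliases, matching the i2 and i3 names used in the scp commands.
MASTER_NODES="i2 i3"

# Hypothetical helper: copy the given files to the same directory on every
# other master. With DRY_RUN set, the scp commands are printed, not executed.
distribute() {
  local dest=$1 node
  shift
  for node in $MASTER_NODES; do
    ${DRY_RUN:+echo} scp "$@" "$node:$dest"
  done
}

# Print what would be copied without touching the network.
DRY_RUN=1 distribute /etc/kubernetes/ \
  /etc/kubernetes/kube-controller-manager.conf \
  kube-controller-manager.kubeconfig
```

Dropping `DRY_RUN=1` performs the copies for real.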
    

Start kube-controller-manager:

systemctl daemon-reload
systemctl enable --now kube-controller-manager.service
systemctl status kube-controller-manager

Deploying kube-scheduler

Generate the scheduler client certificate

Create the CSR file for the client certificate that the scheduler uses to access the API server:

cat > apiserver-client-scheduler-csr.json << "EOF"
{
    "CN": "system:kube-scheduler",
    "hosts": [],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
      {
        "C": "CN",
        "ST": "GuangDong",
        "L": "ShenZhen",
        "O": "system:kube-scheduler",
        "OU": "System"
      }
    ]
}
EOF

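Pay particular attention to the CN and O values here: set both to system:kube-scheduler. The CN is the user name that Kubernetes' built-in scheduler role binding grants permissions to.

Generate the certificate and private key: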

cfssl gencert -ca=cluster-ca.pem -ca-key=cluster-ca-key.pem -config=ca-config.json -profile=client apiserver-client-scheduler-csr.json | cfssljson  -bare apiserver-client-scheduler

Create the kubeconfig file:

kubectl config set-cluster kubernetes --certificate-authority=apiserver-server-ca.pem --embed-certs=true --server=https://192.168.123.242:6443 --kubeconfig=kube-scheduler.kubeconfig
kubectl config set-credentials system:kube-scheduler --client-certificate=apiserver-client-scheduler.pem --client-key=apiserver-client-scheduler-key.pem --embed-certs=true --kubeconfig=kube-scheduler.kubeconfig
kubectl config set-context system:kube-scheduler --cluster=kubernetes --user=system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig
kubectl config use-context system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig

Create the configuration file:

cat > /etc/kubernetes/kube-scheduler.conf << "EOF"
KUBE_SCHEDULER_OPTS="--bind-address=127.0.0.1 \
--kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \
--leader-elect=true \
--alsologtostderr=true \
--logtostderr=false \
--log-dir=/var/log/kubernetes \
--v=2"
EOF

These flags are self-explanatory.

Create the systemd unit file:

cat > /etc/systemd/system/kube-scheduler.service << "EOF"
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/kube-scheduler.conf
ExecStart=/usr/local/bin/kube-scheduler $KUBE_SCHEDULER_OPTS
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

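Install on the other Master nodes

Every Master node needs kube-scheduler installed; the configuration is identical to the above and needs no changes.

  1. Copy the certificates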

    cp apiserver-client-scheduler.pem apiserver-client-scheduler-key.pem /etc/kubernetes/ssl/
    scp apiserver-client-scheduler.pem apiserver-client-scheduler-key.pem i2:/etc/kubernetes/ssl/
    scp apiserver-client-scheduler.pem apiserver-client-scheduler-key.pem i3:/etc/kubernetes/ssl/
    
  2. Copy the configuration files

    scp /etc/kubernetes/kube-scheduler.conf i2:/etc/kubernetes/
    scp /etc/kubernetes/kube-scheduler.conf i3:/etc/kubernetes/
    
    cp kube-scheduler.kubeconfig /etc/kubernetes/
    scp kube-scheduler.kubeconfig i2:/etc/kubernetes/
    scp kube-scheduler.kubeconfig i3:/etc/kubernetes/
    
  3. Copy the systemd unit file

    scp /etc/systemd/system/kube-scheduler.service i2:/etc/systemd/system/
    scp /etc/systemd/system/kube-scheduler.service i3:/etc/systemd/system/
    

Start kube-scheduler:

systemctl daemon-reload
systemctl enable --now kube-scheduler.service
systemctl status kube-scheduler 

Deploying Containerd

Distribute the package to all nodes:

scp cri-containerd-cni-1.6.1-linux-amd64.tar.gz i2:~/
scp cri-containerd-cni-1.6.1-linux-amd64.tar.gz i3:~/

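Extract

Extract the downloaded cri-containerd-cni-xxx.tar.gz directly into the root directory. The archive already follows containerd's expected directory layout, so extracting at / places the binaries in the right locations.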

tar -zxvf cri-containerd-cni-1.6.1-linux-amd64.tar.gz -C /

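Create the containerd configuration file

You can generate the default configuration file, but I recommend using the configuration below instead, because the default config.toml does not enable the cri plugin.

# Create the default configuration directory manually
mkdir -p /etc/containerd/

# Write containerd's default settings to config.toml
containerd config default > /etc/containerd/config.toml

# Configure the cri plugin; see https://kubernetes.io/zh-cn/docs/setup/production-environment/container-runtimes/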
cat >/etc/containerd/config.toml<<EOF
root = "/var/lib/containerd"
state = "/run/containerd"
oom_score = -999

[grpc]
  address = "/run/containerd/containerd.sock"
  uid = 0
  gid = 0
  max_recv_message_size = 16777216
  max_send_message_size = 16777216

[debug]
  address = ""
  uid = 0
  gid = 0
  level = ""

[metrics]
  address = ""
  grpc_histogram = false

[cgroup]
  path = ""

[plugins]
  [plugins.cgroups]
    no_prometheus = false
  [plugins.cri]
    stream_server_address = "127.0.0.1"
    stream_server_port = "0"
    enable_selinux = false
    sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"
    stats_collect_period = 10
    systemd_cgroup = true
    enable_tls_streaming = false
    max_container_log_line_size = 16384
    [plugins.cri.containerd]
      snapshotter = "overlayfs"
      no_pivot = false
      [plugins.cri.containerd.default_runtime]
        runtime_type = "io.containerd.runtime.v1.linux"
        runtime_engine = ""
        runtime_root = ""
      [plugins.cri.containerd.untrusted_workload_runtime]
        runtime_type = ""
        runtime_engine = ""
        runtime_root = ""
    [plugins.cri.cni]
      bin_dir = "/opt/cni/bin"
      conf_dir = "/etc/cni/net.d"
      conf_template = "/etc/cni/net.d/10-default.conf"
    [plugins.cri.registry]
      [plugins.cri.registry.mirrors]
        # No mirrors are configured here because my servers are overseas
    [plugins.cri.x509_key_pair_streaming]
      tls_cert_file = ""
      tls_key_file = ""
  [plugins.diff-service]
    default = ["walking"]
  [plugins.linux]
    shim = "containerd-shim"
    runtime = "runc"
    runtime_root = ""
    no_shim = false
    shim_debug = false
  [plugins.opt]
    path = "/opt/containerd"
  [plugins.restart]
    interval = "10s"
  [plugins.scheduler]
    pause_threshold = 0.02
    deletion_threshold = 0
    mutation_threshold = 100
    schedule_delay = "0s"
    startup_delay = "100ms"
EOF

Configure registry mirrors here as needed:

[plugins.cri.registry.mirrors."docker.io"]
  endpoint = [
    "https://docker.mirrors.ustc.edu.cn",
    "http://hub-mirror.c.163.com"
  ]
[plugins.cri.registry.mirrors."gcr.io"]
  endpoint = [
    "https://gcr.mirrors.ustc.edu.cn"
  ]
[plugins.cri.registry.mirrors."k8s.gcr.io"]
  endpoint = [
    "https://gcr.mirrors.ustc.edu.cn/google-containers/"
  ]
[plugins.cri.registry.mirrors."quay.io"]
  endpoint = [
    "https://quay.mirrors.ustc.edu.cn"
  ]
[plugins.cri.registry.mirrors."harbor.dev-james.xyz"]
  endpoint = [
    "http://harbor.dev-james.xyz"
  ]

Install runc

Make the downloaded runc.amd64 executable, then copy it to /usr/local/sbin/runc:

chmod +x runc.amd64
mv runc.amd64 runc
cp runc /usr/local/sbin/
scp runc i2:/usr/local/sbin/
scp runc i3:/usr/local/sbin/

Start Containerd:

systemctl daemon-reload
systemctl enable --now containerd.service

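Install on the other Node machines

Every Node needs Containerd installed; follow the same configuration as above.

Create the resources required for TLS Bootstrapping

Authenticating with a dynamic token requires a Secret and a ClusterRoleBinding.

Create the Secret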

cat > bootstrap-token-secret.yaml << EOF
apiVersion: v1
kind: Secret
metadata:
  # name must have the form "bootstrap-token-<token id>"
  name: bootstrap-token-d34ace
  namespace: kube-system

# type must be 'bootstrap.kubernetes.io/token'
type: bootstrap.kubernetes.io/token
stringData:
  # Human-readable description, optional.
  description: "The default bootstrap token generated by 'kubeadm init'."

  # Token ID and secret, required.
  token-id: d34ace
  token-secret: d1ba5634578fb1cc

  # Allowed usages
  usage-bootstrap-authentication: "true"
  usage-bootstrap-signing: "true"

  # Extra groups the token authenticates as; each must start with "system:bootstrappers:"
  auth-extra-groups: system:bootstrappers:default-node-token,system:bootstrappers:worker,system:bootstrappers:ingress
EOF

Create the Secret with kubectl:

kubectl apply -f bootstrap-token-secret.yaml
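
A bootstrap token has the form `<token-id>.<token-secret>`, where the ID is 6 characters and the secret is 16, all drawn from `[a-z0-9]`. Rather than reusing the sample values above, you may want to generate your own. The sketch below shows one way to do that; `is_bootstrap_token` and `new_bootstrap_token` are hypothetical helper names, and it relies on the fact that lowercase hex output is a subset of the allowed alphabet.

```shell
#!/usr/bin/env bash
# Succeed if $1 looks like "<6 chars>.<16 chars>" drawn from [a-z0-9].
is_bootstrap_token() {
  [[ $1 =~ ^[a-z0-9]{6}\.[a-z0-9]{16}$ ]]
}

# Hypothetical helper: build a random token from lowercase hex digits.
new_bootstrap_token() {
  printf '%s.%s\n' "$(openssl rand -hex 3)" "$(openssl rand -hex 8)"
}

token=$(new_bootstrap_token)
echo "generated: $token"
is_bootstrap_token "$token" && echo "format ok"
```

The part before the dot goes into `token-id` (and the Secret name), and the part after it into `token-secret`.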

Create the ClusterRoleBinding

cat > bootstrap-token-rbac.yaml << EOF
# enable bootstrapping nodes to create CSR
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: create-csrs-for-bootstrapping
subjects:
- kind: Group
  name: system:bootstrappers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:node-bootstrapper
  apiGroup: rbac.authorization.k8s.io
EOF

Create the resource:

kubectl apply -f bootstrap-token-rbac.yaml

Automatic approval

Allow the kubelet to request and receive a new certificate:

cat > approve-new.yaml << EOF
# Approve all CSRs from the "system:bootstrappers" group
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: auto-approve-csrs-for-group
subjects:
- kind: Group
  name: system:bootstrappers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
  apiGroup: rbac.authorization.k8s.io
EOF

Create the resource:

kubectl apply -f approve-new.yaml

Allow the kubelet to renew its own client certificate:

cat > approve-rotate.yaml << EOF
# Approve CSR renewal requests from the "system:nodes" group
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: auto-approve-renewals-for-nodes
subjects:
- kind: Group
  name: system:nodes
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
  apiGroup: rbac.authorization.k8s.io
EOF

Create the resource:

kubectl apply -f approve-rotate.yaml

Deploying kubelet

Create the KubeletConfiguration

The file can be written in either YAML or JSON; JSON is used here.

cat >/etc/kubernetes/kubelet.json<<EOF
{
  "kind": "KubeletConfiguration",
  "apiVersion": "kubelet.config.k8s.io/v1beta1",
  "authentication": {
    "x509": {
      "clientCAFile": "/etc/kubernetes/ssl/kubelet-client-ca.pem"
    },
    "webhook": {
      "enabled": true,
      "cacheTTL": "2m0s"
    },
    "anonymous": {
      "enabled": false
    }
  },
  "authorization": {
    "mode": "Webhook",
    "webhook": {
      "cacheAuthorizedTTL": "5m0s",
      "cacheUnauthorizedTTL": "30s"
    }
  },
  "serverTLSBootstrap": true,
  "address": "192.168.123.246",
  "port": 10250,
  "readOnlyPort": 10255,
  "cgroupDriver": "systemd",                    
  "hairpinMode": "promiscuous-bridge",
  "serializeImagePulls": false,
  "clusterDomain": "cluster.local.",
  "clusterDNS": ["10.96.0.2"]
}
EOF

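Parameter notes

  • clientCAFile: when the kubelet acts as a server, the apiserver connects to it as a client and presents a certificate; this parameter names the CA root certificate used to verify that client certificate.

  • clusterDNS: the IP address of the cluster DNS server. In this setup it is the second IP address of the Service CIDR (10.96.0.2, which must match the clusterIP of the kube-dns Service); the first address is assigned to the apiserver

  • address: the IP address of the current node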
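The "second address of the Service CIDR" rule can be checked with a little arithmetic. The following is a minimal sketch, assuming the Service CIDR is 10.96.0.0/16 (inferred from the clusterDNS value above) and that the CIDR is written with its network address; nth_ip is a hypothetical helper, not part of any Kubernetes tooling.

```shell
# nth_ip CIDR N: print the N-th address of an IPv4 network (N=1 is the first host).
# Sketch only: assumes a well-formed IPv4 CIDR written with its network address,
# and does not check that N stays inside the network.
nth_ip() (
  n=$2
  base=${1%/*}          # strip the /prefix part
  IFS=.
  set -- $base          # split the dotted quad into $1..$4
  num=$(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 + n ))
  printf '%d.%d.%d.%d\n' \
    $(( (num >> 24) & 255 )) $(( (num >> 16) & 255 )) \
    $(( (num >> 8) & 255 )) $(( num & 255 ))
)

nth_ip 10.96.0.0/16 1   # address taken by the kubernetes (apiserver) Service
nth_ip 10.96.0.0/16 2   # address used for clusterDNS in this article
```

The function body runs in a subshell, so the temporary change to IFS does not leak into the calling shell.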

Generate the bootstrap kubeconfig file

Generate kubelet-bootstrap.kubeconfig

kubectl config set-cluster kubernetes --certificate-authority=apiserver-server-ca.pem --embed-certs=true --server=https://192.168.123.242:6443 --kubeconfig=kubelet-bootstrap.kubeconfig
kubectl config set-credentials kubelet-bootstrap --token=d34ace.d1ba5634578fb1cc --kubeconfig=kubelet-bootstrap.kubeconfig
kubectl config set-context default --cluster=kubernetes --user=kubelet-bootstrap --kubeconfig=kubelet-bootstrap.kubeconfig
kubectl config use-context default --kubeconfig=kubelet-bootstrap.kubeconfig

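The biggest difference from the earlier kubeconfig files is the second line: the credential set here is a token rather than a client certificate. Kubelet certificates are not issued by hand, so the kubelet uses this token to reach the apiserver and request a certificate; once the certificate has been issued, the kubelet authenticates with that client certificate instead.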
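A malformed token is a common reason for the bootstrap step to fail, so it can help to check the token before writing it into the kubeconfig. The sketch below relies on two documented rules: a bootstrap token has the form of a 6-character ID, a dot, and a 16-character secret (lowercase letters and digits only), and its Secret must be named bootstrap-token-<token ID>. The function names are hypothetical helpers.

```shell
# valid_bootstrap_token TOKEN: succeed only if TOKEN looks like
# <6 chars>.<16 chars>, using lowercase letters and digits.
valid_bootstrap_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# bootstrap_secret_name TOKEN: print the Secret name expected for TOKEN
# (the part before the dot is the token ID).
bootstrap_secret_name() {
  printf 'bootstrap-token-%s\n' "${1%%.*}"
}

token=d34ace.d1ba5634578fb1cc
if valid_bootstrap_token "$token"; then
  echo "token format ok, Secret should be named $(bootstrap_secret_name "$token")"
fi
```

The Secret name printed here should match metadata.name in bootstrap-token-secret.yaml, and the Secret has to live in the kube-system namespace.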

Create the configuration file

cat > /etc/kubernetes/kubelet.conf << "EOF"
KUBELET_OPTS="--bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig \
  --cert-dir=/etc/kubernetes/ssl \
  --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \
  --config=/etc/kubernetes/kubelet.json \
  --container-runtime=remote \
  --container-runtime-endpoint=unix:///run/containerd/containerd.sock \
  --rotate-certificates \
  --rotate-server-certificates \
  --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.6 \
  --root-dir=/etc/cni/net.d \
  --alsologtostderr=true \
  --logtostderr=false \
  --log-dir=/var/log/kubernetes \
  --v=2"
EOF

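Note that kubelet differs from the other components. It runs on worker nodes, and nodes are scaled up and down and are not known in advance. Issuing certificates the traditional way becomes tedious once many new nodes join. Kubernetes therefore lets the kubelet first access the apiserver with a token at startup. The token maps to a specific user and group that holds only the permission to request certificates, and a certificate is then issued for each newly joined kubelet:

  1. The kubelet creates a CSR for itself with signerName set to kubernetes.io/kube-apiserver-client-kubelet
  2. The API server receives the certificate request from the kubelet and authenticates it, but the component that actually issues the signed certificate is the controller manager (controller-manager).

After retrieving the certificate, the kubelet stores it in the configured directory on its node (--cert-dir) and generates a kubelet.kubeconfig file at the path given by --kubeconfig. When the kubelet restarts later, for example after a crash, it first looks for the kubeconfig file named by --kubeconfig: if the file exists, the kubelet authenticates with it; if it does not, the kubelet goes through the bootstrap flow above again.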
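The startup decision described above can be sketched as a small shell function. This is a simplification for illustration, not the kubelet's actual code: it only checks whether the files exist and are non-empty, and the function name is hypothetical.

```shell
# kubelet_credential_source KUBECONFIG BOOTSTRAP_KUBECONFIG
# Print which credential a starting kubelet would use:
#   kubeconfig - an issued client certificate is already referenced on disk
#   bootstrap  - no kubeconfig yet, fall back to the token in the bootstrap file
#   none       - neither file is present, the kubelet cannot authenticate
kubelet_credential_source() {
  if [ -s "$1" ]; then
    echo "kubeconfig"
  elif [ -s "$2" ]; then
    echo "bootstrap"
  else
    echo "none"
  fi
}

kubelet_credential_source /etc/kubernetes/kubelet.kubeconfig \
                          /etc/kubernetes/kubelet-bootstrap.kubeconfig
```

On a freshly prepared node this prints bootstrap; after the first successful start it prints kubeconfig, which is why deleting kubelet.kubeconfig forces a new certificate request.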

Reference: https://v1-24.docs.kubernetes.io/zh-cn/docs/reference/access-authn-authz/kubelet-tls-bootstrapping/

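Note: the client certificate provided by TLS bootstrapping is by default signed for client auth only, so it cannot be used as a serving certificate (server auth). A serving certificate can be enabled by setting serverTLSBootstrap: true in kubelet.json. For security reasons, the CSR approving controller implemented in core Kubernetes does not automatically approve node serving certificates.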
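Because serving-certificate CSRs are not approved automatically, they have to be found and approved by hand. The filter below is a minimal sketch; it assumes the default column layout of kubectl get csr (NAME, AGE, SIGNERNAME, REQUESTOR, ..., CONDITION), and pending_serving_csrs is a hypothetical helper name.

```shell
# pending_serving_csrs: read `kubectl get csr --no-headers` output on stdin and
# print the names of CSRs that request a kubelet serving certificate and are
# still Pending. Assumes column 3 is SIGNERNAME and the last column is CONDITION.
pending_serving_csrs() {
  awk '$3 == "kubernetes.io/kubelet-serving" && $NF == "Pending" { print $1 }'
}

# Intended use on a control-plane node (left commented out so the snippet stays side-effect free):
#   kubectl get csr --no-headers | pending_serving_csrs
#   kubectl certificate approve <name printed above>
```

Approving each name individually, after checking that the requestor is a node that really belongs to the cluster, keeps the safeguard that the missing auto-approval is meant to provide.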

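Parameter notes:

  • --bootstrap-kubeconfig: the token-based kubeconfig used to reach the apiserver and request a certificate when the kubelet starts for the first time and the file named by --kubeconfig does not exist yet.

  • --container-runtime-endpoint: the endpoint of the container runtime; a local runtime is reached directly through its socket file

  • --pod-infra-container-image: the image named here is never removed by the image garbage collector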

Create the systemd unit file

cat > /etc/systemd/system/kubelet.service << "EOF"
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes
After=containerd.service
Requires=containerd.service

[Service]
EnvironmentFile=-/etc/kubernetes/kubelet.conf
WorkingDirectory=/var/lib/kubelet
ExecStart=/usr/local/bin/kubelet $KUBELET_OPTS
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Create the directory

mkdir -p /var/lib/kubelet

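Installing the other Node nodes

Copy the required configuration files and certificates to each node, then change address in kubelet.json to that node's IP

  1. Copy the configuration files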

    scp /etc/kubernetes/kubelet.json i2:/etc/kubernetes/
    scp /etc/kubernetes/kubelet.json i3:/etc/kubernetes/
    
    scp /etc/kubernetes/kubelet.conf i2:/etc/kubernetes/
    scp /etc/kubernetes/kubelet.conf i3:/etc/kubernetes/
    
    cp kubelet-bootstrap.kubeconfig /etc/kubernetes/
    scp kubelet-bootstrap.kubeconfig i2:/etc/kubernetes/
    scp kubelet-bootstrap.kubeconfig i3:/etc/kubernetes/
    
  2. Copy the unit file

    scp /etc/systemd/system/kubelet.service i2:/etc/systemd/system/
    scp /etc/systemd/system/kubelet.service i3:/etc/systemd/system/
    

Start kubelet

systemctl daemon-reload
systemctl enable --now kubelet
systemctl status kubelet

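The log from journalctl -xeu kubelet may show the following problems:

  • Failed to contact API server when waiting for CSINode publishing: Unauthorized
    The certificate kubelet needs was not generated successfully, most likely because token authentication failed.
    To request a new certificate, delete the following files: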

    1. rm -rf /etc/kubernetes/kubelet.kubeconfig

    2. rm -rf /etc/kubernetes/ssl/kubelet.*

    3. rm -rf /etc/kubernetes/ssl/kubelet-client-current.*

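  • kubectl get nodes shows the node as NotReady
    The main cause here is that the kubelet's CSR was not approved automatically, so it can be approved manually:
    List all CSRs:
    kubectl get csr
    Delete all CSRs:
    kubectl get csr --no-headers=true | awk '{print $1}' | xargs kubectl delete csr
    Approve every CSR in the Pending state:
    kubectl get csr | awk '/Pending/ {print $1}' | xargs kubectl certificate approve

  • All sorts of odd problems can show up at this stage, for example a configuration file that was never copied, so check each step carefully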

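Deploying kube-proxy

Generate the proxy client certificate

Create the certificate signing request file for the client certificate used to access the apiserver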

cat > apiserver-client-proxy-csr.json << "EOF"
{
  "CN": "system:kube-proxy",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "Kubernetes",
      "OU": "System"
    }
  ]
}
EOF

Generate the certificate and private key

cfssl gencert -ca=cluster-ca.pem -ca-key=cluster-ca-key.pem -config=ca-config.json -profile=client apiserver-client-proxy-csr.json | cfssljson  -bare apiserver-client-proxy

Create the kubeconfig file

kubectl config set-cluster kubernetes --certificate-authority=apiserver-server-ca.pem --embed-certs=true --server=https://192.168.123.242:6443 --kubeconfig=kube-proxy.kubeconfig
kubectl config set-credentials kube-proxy --client-certificate=apiserver-client-proxy.pem --client-key=apiserver-client-proxy-key.pem --embed-certs=true --kubeconfig=kube-proxy.kubeconfig
kubectl config set-context default --cluster=kubernetes --user=kube-proxy --kubeconfig=kube-proxy.kubeconfig
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig

Create the configuration file

cat > /etc/kubernetes/kube-proxy.yaml << "EOF"
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 192.168.123.244
clientConnection:
  kubeconfig: /etc/kubernetes/kube-proxy.kubeconfig
clusterCIDR: 172.18.0.0/16
healthzBindAddress: 192.168.123.244:10256
kind: KubeProxyConfiguration
metricsBindAddress: 192.168.123.244:10249
mode: "ipvs"
EOF

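Parameter notes:

  • bindAddress: bind to the interface that carries this node's IP address

  • clusterCIDR: the Pod CIDR

  • mode: ipvs is used here because it performs better than iptables, particularly as the number of Services grows

Create the systemd unit file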

cat >  /etc/systemd/system/kube-proxy.service << "EOF"
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/kubernetes/kubernetes
After=network.target

[Service]
WorkingDirectory=/var/lib/kube-proxy
ExecStart=/usr/local/bin/kube-proxy \
  --config=/etc/kubernetes/kube-proxy.yaml \
  --alsologtostderr=true \
  --logtostderr=false \
  --log-dir=/var/log/kubernetes \
  --v=2
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

Create the directory

mkdir -p /var/lib/kube-proxy

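Installing the other Master nodes

kube-proxy must be installed on every Master node. On each node, edit /etc/kubernetes/kube-proxy.yaml so that bindAddress, healthzBindAddress and metricsBindAddress use that node's IP

  1. Copy the certificates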

    cp apiserver-client-proxy.pem apiserver-client-proxy-key.pem /etc/kubernetes/ssl/
    scp apiserver-client-proxy.pem apiserver-client-proxy-key.pem i2:/etc/kubernetes/ssl/
    scp apiserver-client-proxy.pem apiserver-client-proxy-key.pem i3:/etc/kubernetes/ssl/
    
  2. Copy the configuration files

    scp /etc/kubernetes/kube-proxy.yaml i2:/etc/kubernetes/
    scp /etc/kubernetes/kube-proxy.yaml i3:/etc/kubernetes/
    
    cp kube-proxy.kubeconfig /etc/kubernetes/
    scp kube-proxy.kubeconfig i2:/etc/kubernetes/
    scp kube-proxy.kubeconfig i3:/etc/kubernetes/
    
  3. Copy the unit file

    scp /etc/systemd/system/kube-proxy.service i2:/etc/systemd/system/
    scp /etc/systemd/system/kube-proxy.service i3:/etc/systemd/system/
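The per-node edit of kube-proxy.yaml is easy to forget after a plain scp. One way to do it, sketched here under the assumption that the template was written for 192.168.123.244 and that i2 and i3 are 192.168.123.245 and 192.168.123.246, is to rewrite the address while copying; retarget_kube_proxy is a hypothetical helper.

```shell
# retarget_kube_proxy NEW_IP: read kube-proxy.yaml on stdin and write a copy
# on stdout in which the template node IP is replaced by NEW_IP.
# Assumes the template was generated for 192.168.123.244.
retarget_kube_proxy() {
  sed "s/192\.168\.123\.244/$1/g"
}

# Intended use from the first master (commented out, it needs SSH access):
#   retarget_kube_proxy 192.168.123.245 < /etc/kubernetes/kube-proxy.yaml \
#     | ssh i2 'cat > /etc/kubernetes/kube-proxy.yaml'
```

The same idea applies to the address field in kubelet.json.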
    

Start kube-proxy

systemctl daemon-reload
systemctl enable --now kube-proxy
systemctl status kube-proxy

Installing Calico

Installing Calico is straightforward: download one YAML manifest and create its resources in the Kubernetes cluster, as follows:

curl https://raw.githubusercontent.com/projectcalico/calico/v3.26.0/manifests/calico.yaml -O

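The manifest should not be applied exactly as downloaded, though. Edit the file first so that the Pod CIDR matches the cluster (uncomment the entry if it is commented out):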

- name: CALICO_IPV4POOL_CIDR
  value: "172.18.0.0/16"

Then run kubectl apply -f calico.yaml

Installing CoreDNS

Create the YAML manifest

cat >  coredns.yaml << "EOF"
apiVersion: v1
kind: ServiceAccount
metadata:
  name: coredns
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
rules:
  - apiGroups:
    - ""
    resources:
    - endpoints
    - services
    - pods
    - namespaces
    verbs:
    - list
    - watch
  - apiGroups:
    - discovery.k8s.io
    resources:
    - endpointslices
    verbs:
    - list
    - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:coredns
subjects:
- kind: ServiceAccount
  name: coredns
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
          lameduck 5s
        }
        ready
        kubernetes cluster.local  in-addr.arpa ip6.arpa {
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf {
          max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/name: "CoreDNS"
spec:
  # replicas: not specified here:
  # 1. Default is 1.
  # 2. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      priorityClassName: system-cluster-critical
      serviceAccountName: coredns
      tolerations:
        - key: "CriticalAddonsOnly"
          operator: "Exists"
      nodeSelector:
        kubernetes.io/os: linux
      affinity:
         podAntiAffinity:
           preferredDuringSchedulingIgnoredDuringExecution:
           - weight: 100
             podAffinityTerm:
               labelSelector:
                 matchExpressions:
                   - key: k8s-app
                     operator: In
                     values: ["kube-dns"]
               topologyKey: kubernetes.io/hostname
      containers:
      - name: coredns
        image: docker.io/coredns/coredns:1.8.4
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        args: [ "-conf", "/etc/coredns/Corefile" ]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/coredns
          readOnly: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - all
          readOnlyRootFilesystem: true
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /ready
            port: 8181
            scheme: HTTP
      dnsPolicy: Default
      volumes:
        - name: config-volume
          configMap:
            name: coredns
            items:
            - key: Corefile
              path: Corefile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "CoreDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.96.0.2
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
  - name: metrics
    port: 9153
    protocol: TCP

EOF

Create the resources

kubectl apply -f coredns.yaml

Using NFS as persistent storage for Kubernetes

Install nfs-utils on the storage node

yum install -y nfs-utils

This command installs both nfs-server and rpcbind

Create the directory and set its permissions

mkdir -p /opt/data/kubernetes
chmod -R 777 /opt/data/kubernetes

Export the directory

Add the following line to /etc/exports:

/opt/data/kubernetes *(rw,no_root_squash,sync)

Then reload the export table and list the exports:

exportfs -r
exportfs

Start the services

systemctl daemon-reload
systemctl enable --now nfs-server
systemctl enable --now rpcbind

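Install nfs-utils on all Kubernetes nodes

Run the same yum install -y nfs-utils on every node; the nodes need the NFS client tools to mount the volumes

Create the RBAC manifest

Grant permissions to the service account that the NFS provisioner uses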

cat >  sc-rbac.yaml << "EOF"
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-client-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-client-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: ClusterRole
  name: nfs-client-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: Role
  name: leader-locking-nfs-client-provisioner
  apiGroup: rbac.authorization.k8s.io
EOF

Run kubectl apply -f sc-rbac.yaml to create the resources

nfs-client

cat >  sc-deployment.yaml << "EOF"
kind: Deployment
apiVersion: apps/v1
metadata:
  name: nfs-client-provisioner
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nfs-client-provisioner
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          image: registry.k8s.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2
          # image: k8s.dockerproxy.com/sig-storage/nfs-subdir-external-provisioner:v4.0.2
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: k8s-sigs.io/nfs-subdir-external-provisioner
            - name: NFS_SERVER
              # value: <YOUR NFS SERVER HOSTNAME>
              value: 192.168.123.247
            - name: NFS_PATH
              # value: /var/nfs
              value: /opt/data/kubernetes
      volumes:
        - name: nfs-client-root
          nfs:
            # server: <YOUR NFS SERVER HOSTNAME>
            server: 192.168.123.247
            path: /opt/data/kubernetes
EOF

Run kubectl apply -f sc-deployment.yaml to create the resources

StorageClass

cat >  sc.yaml << "EOF"
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-client
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner # or choose another name, must match deployment's env PROVISIONER_NAME'
parameters:
  pathPattern: "${.PVC.namespace}/${.PVC.name}" # use the PVC namespace and name as the real directory name on the NFS share
  onDelete: delete
EOF

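The pathPattern ${.PVC.namespace}/${.PVC.name} makes the provisioner create each volume's directory automatically, named <namespace>/<PVC name> under the NFS export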
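To see what the pattern produces, the sketch below prints the directory expected on the NFS server for a given claim, together with a claim that would use this StorageClass. The PVC name demo-data, the 1Gi size and both helper function names are hypothetical; the export path and the StorageClass name come from the manifests above.

```shell
# pvc_dir NAMESPACE PVC_NAME: print the directory the provisioner is expected
# to create on the NFS server, given pathPattern and NFS_PATH from this article.
pvc_dir() {
  printf '/opt/data/kubernetes/%s/%s\n' "$1" "$2"
}

# render_pvc NAMESPACE PVC_NAME: print a PersistentVolumeClaim manifest that
# requests storage from the nfs-client StorageClass (size chosen arbitrarily).
render_pvc() {
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $2
  namespace: $1
spec:
  storageClassName: nfs-client
  accessModes: ["ReadWriteMany"]
  resources:
    requests:
      storage: 1Gi
EOF
}

render_pvc default demo-data   # pipe into `kubectl apply -f -` to create the claim
pvc_dir default demo-data      # where the data should then appear on the NFS server
```

With onDelete: delete in the StorageClass, that directory is removed again when the claim is deleted.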

Run kubectl apply -f sc.yaml to create the resource

Installing Dashboard

wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml

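Download the manifest, then change the Service type to NodePort. In practice an ingress usually serves as the single entry point; see the section on proxying Dashboard through Ingress further down

Create the service account and token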

cat >  dashboard-sa.yaml << "EOF"
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: admin-user
    namespace: kubernetes-dashboard
EOF

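This creates an SA with elevated privileges, and a token is then created for it. Before Kubernetes 1.24, creating an SA automatically created a token Secret for it. From 1.24 onward tokens are obtained through the TokenRequest API, so the token for the SA has to be created with the following command; a token created this way has an expiry time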

kubectl create token -n kubernetes-dashboard admin-user

Alternatively, a long-lived token can still be created for the SA the old way, through a Secret

apiVersion: v1
kind: Secret
metadata:
  namespace: kubernetes-dashboard
  name: admin-user-secret
  annotations:
    kubernetes.io/service-account.name: admin-user
type: kubernetes.io/service-account-token

View the token

kubectl describe secret admin-user-secret -n kubernetes-dashboard

Installing the Ingress Controller

The ingress controller is the entry point for external access, and an Ingress is the resource that configures the controller

wget https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.8.0/deploy/static/provider/cloud/deploy.yaml

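Download the manifest and change the Service type from the default LoadBalancer to NodePort. One more setting is a trap: externalTrafficPolicy: Local

With externalTrafficPolicy set to Local, traffic that arrives on a node's NodePort is forwarded only to controller Pods running on that same node and never to Pods on other nodes. The client source IP is preserved, but a node without a controller Pod drops the request, so the setting is left out of the Service here.

The resulting Service looks like this:

# omitted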
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.8.0
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - appProtocol: http
    name: http
    port: 80
    protocol: TCP
    targetPort: http
    nodePort: 30080
  - appProtocol: https
    name: https
    port: 443
    protocol: TCP
    targetPort: https
    nodePort: 30443
  selector:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
  type: NodePort
---
# omitted

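Load balancing the IngressController (NodePort port mapping)

The IngressController is now installed, but two things are still missing:

  1. The Service is of type NodePort, so its ports are restricted: the NodePort range is normally 30000-32767, whereas the IngressController, as the application entry point, should be reachable on ports 80/443

  2. NodePort opens the port on every node, so a load balancer in front of the nodes is needed to make the service highly available

Approach:

Add one more proxy in front of the cluster, working at layer 4. Why layer 4? Consider what the IngressController does: it is a reverse proxy (Nginx, for example) that routes requests to backends by host name and path. Adding another layer-7 proxy would, first, be redundant; second, risk duplicated configuration that routes requests to the wrong backend; and third, perform worse, because a layer-4 proxy does not need to unpack traffic at the application layer.

Nginx's layer-4 proxy load-balances the IngressController here. Nginx runs as a primary/backup pair, and Keepalived handles the failover between the two. HAProxy or LVS would also work (although LVS cannot be used in this environment, as explained earlier). One caveat when using Nginx's layer-4 proxy: its layer-7 configuration listens on port 80 by default, so when the layer-4 proxy for the IngressController also needs to listen on 80/443, Nginx fails to start even though nginx -t reports nothing wrong. The layer-7 and layer-4 listeners are simply competing for the same port. The Nginx servers that already proxy kube-apiserver are therefore reused as the reverse proxy, which forwards ports 80/443 to the Service's 3xxxx NodePorts and load-balances across all nodes.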
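The layer-4 forwarding described above could look roughly like the configuration printed by the following sketch. It assumes the node IPs 192.168.123.244 to 192.168.123.246 and the NodePorts 30080/30443 used elsewhere in this article; the upstream names and the render_ingress_stream helper are hypothetical.

```shell
# render_ingress_stream NODE_IP...: print an Nginx stream block that forwards
# ports 80/443 to the ingress-nginx NodePorts (30080/30443) on the given nodes.
render_ingress_stream() {
  echo "stream {"
  echo "    upstream ingress_http {"
  for ip in "$@"; do echo "        server $ip:30080;"; done
  echo "    }"
  echo "    upstream ingress_https {"
  for ip in "$@"; do echo "        server $ip:30443;"; done
  echo "    }"
  echo "    server { listen 80; proxy_pass ingress_http; }"
  echo "    server { listen 443; proxy_pass ingress_https; }"
  echo "}"
}

render_ingress_stream 192.168.123.244 192.168.123.245 192.168.123.246
```

Nginx accepts only one stream block in nginx.conf, so on a server that already proxies kube-apiserver the upstream and server entries would be merged into the existing block. The default http server listening on port 80 also has to be removed or moved to another port, otherwise the port conflict described above occurs.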

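Proxying Dashboard through Ingress

Keep the following points in mind when proxying Dashboard through Ingress:

  1. Dashboard enables secure access by default, which means it generates its own server certificate. When a reverse proxy sits in front of it, Dashboard does not need to provide the SSL certificate, so an insecure port is opened for the proxy to use inside the internal network. In the manifest, edit the kubernetes-dashboard container managed by the Deployment and add the following two command-line arguments: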

    - --enable-insecure-login
    - --insecure-port=8080
    
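  2. Comment out --auto-generate-certificates

  3. Exposed port: change 8443 to the insecure listening port 8080

  4. Liveness probe: change the probe port to 8080 and the scheme from https to http

  5. Change the Service port mapping so that port 80 maps to 8080

  6. Generate a TLS certificate. Ideally it comes from a trusted CA, for example a free certificate from a cloud vendor. Without one, a certificate signed by any private CA will be reported as untrusted by browsers, so any CA will do; the cluster CA is used here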

    cat > dashboard-server-csr.json <<"EOF"
    {
    "CN": "dashboard-server",
    "hosts": [
      "127.0.0.1",
      "192.168.123.242",
      "43.153.194.72",
      "dashboard.k8s.dev-james.xyz"
    ],
    "key": {
      "algo": "rsa",
      "size": 2048
    },
    "names": [{
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "Kubernetes",
      "OU": "System"
    }]
    }
    EOF
    
  7. Generate the certificate and private key

    cfssl gencert -ca=cluster-ca.pem -ca-key=cluster-ca-key.pem -config=ca-config.json -profile=server dashboard-server-csr.json | cfssljson  -bare dashboard-server
    
  8. Create the certificate Secret

    kubectl create secret tls dashboard-cert-secret --cert=dashboard-server.pem --key=dashboard-server-key.pem -n kubernetes-dashboard
    
  9. Configure the certificate in the Ingress

    Every rule is served over HTTP by default; adding a tls section whose hosts lists a domain enables HTTPS for that domain

    cat > dashboard-ingress.yaml <<"EOF"
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      namespace: kubernetes-dashboard
      name: dashboard-ingress
    spec:
      ingressClassName: nginx
      rules:
        - host: dashboard.k8s.dev-james.xyz
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: kubernetes-dashboard
                    port: 
                      number: 80
      tls:
        - hosts:
            - dashboard.k8s.dev-james.xyz
          secretName: dashboard-cert-secret
    EOF
    
  10. Dashboard is now reachable through the domain name

Installing Metrics Server

wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
# or
wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/high-availability-1.21+.yaml

Several problems are likely to show up here

  1. The kubelet's self-signed certificate
E0908 15:28:39.751310       1 scraper.go:139] "Failed to scrape node" err="Get \"https://10.68.14.125:10250/stats/summary?only_cpu_and_memory=true\": x509: cannot validate certificate for 10.68.14.125 because it doesn't contain any IP SANs" node="scw-sharp-cray"

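The cause is that the kubelet's serving certificate is self-signed by the kubelet, and its hosts field contains only the host name, not the IP. When Metrics Server connects to the kubelet to collect metrics and verifies its identity, the certificate cannot be validated for the node's IP, hence the error

Solution: set serverTLSBootstrap: true in kubelet.json so that the controller-manager signs the kubelet's serving certificate; once its CSR is approved, the certificate contains both the host name and the node's IP address.

  2. Metrics Server has no kubelet CA built in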
E0622 10:35:51.062150       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.123.245:10250/metrics/resource\": x509: certificate signed by unknown authority" node="i2"
E0622 10:35:51.064470       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.123.246:10250/metrics/resource\": x509: certificate signed by unknown authority" node="i3"
E0622 10:35:51.066759       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.123.244:10250/metrics/resource\": x509: certificate signed by unknown authority" node="i1"

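The cause is that Metrics Server, acting as a client of the kubelet, does not have the root certificate behind the kubelet's serving certificate and therefore cannot verify the identity of kubelet-server. The fix is to store the CA that signs the kubelet-server certificates in a configmap and mount it into Metrics Server.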

kubectl -n kube-system create configmap kubelet-server-ca --from-file=ca.pem=/etc/kubernetes/ssl/apiserver-client-ca.pem -o yaml

Mount the configmap into the Metrics Server container

     containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-certificate-authority=/ca/ca.pem
        # ...
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
        - mountPath: /ca
          name: ca-dir
     # ...
      volumes:
      - emptyDir: {}
        name: tmp-dir
      - name: ca-dir
        configMap:
          name: kubelet-server-ca

As shown above, also add --kubelet-certificate-authority=/ca/ca.pem

  3. API aggregation is not enabled
configmap_cafile_content.go:242] key failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"

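Metrics Server exposes an extension API, but API aggregation has not been enabled on the apiserver, so Metrics Server fails when it talks to the apiserver. API aggregation works as follows: a client that wants an extension API first reaches the apiserver, which forwards the request to the extension apiserver. The connection between the apiserver and the extension apiserver must be secured as well, so certificates are needed here too. The extension apiserver verifies incoming requests against the CA certificate stored in the extension-apiserver-authentication configmap in the kube-system namespace.

Solution:

Create a CA certificate for the extension apiserver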

cat > extra-apiserver-client-ca-csr.json <<"EOF"
{
  "CN": "extra-apiserver-client-ca",
  "key": {
      "algo": "rsa",
      "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "Kubernetes",
      "OU": "System"
    }
  ],
  "ca": {
      "expiry": "87600h"
  }
}
EOF

Generate the CA certificate and private key

cfssl gencert -initca extra-apiserver-client-ca-csr.json | cfssljson -bare extra-apiserver-client-ca

Create a client certificate signed by extra-apiserver-client-ca

cat > extra-proxy-client-csr.json <<"EOF"
{
  "CN": "aggregator",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [{
    "C": "CN",
    "ST": "GuangDong",
    "L": "ShenZhen",
    "O": "system:masters",
    "OU": "System"
  }]
}
EOF

Generate the client certificate and private key

cfssl gencert -ca=extra-apiserver-client-ca.pem -ca-key=extra-apiserver-client-ca-key.pem -config=ca-config.json -profile=client extra-proxy-client-csr.json | cfssljson  -bare extra-proxy-client

Edit kube-apiserver.conf and add the following flags:

--requestheader-client-ca-file=/etc/kubernetes/ssl/extra-apiserver-client-ca.pem \
--requestheader-allowed-names=aggregator \
--requestheader-extra-headers-prefix=X-Remote-Extra- \
--requestheader-group-headers=X-Remote-Group \
--requestheader-username-headers=X-Remote-User \
--proxy-client-cert-file=/etc/kubernetes/ssl/extra-proxy-client.pem \
--proxy-client-key-file=/etc/kubernetes/ssl/extra-proxy-client-key.pem \

In addition, if kube-proxy is not installed on the node (that is, the master does not also act as a worker node), add the flag --enable-aggregator-routing=true

For details see: https://kubernetes.io/zh-cn/docs/tasks/extend-kubernetes/configure-aggregation-layer/

Restart kube-apiserver

systemctl daemon-reload
systemctl restart kube-apiserver

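Common problems

  1. When the apiserver needs to exec into a Pod or fetch logs, this error may appear: Forbidden (user=kubelet-client-apiserver, verb=create, resource=nodes, subresource=proxy)
    Add a clusterrolebinding that authorizes kubelet-client-apiserver:
    kubectl create clusterrolebinding kubelet-client-apiserver --clusterrole=cluster-admin --user=kubelet-client-apiserver
    The certificate the apiserver presents to the kubelet is issued by the kubelet client CA with the CN kubelet-client-apiserver, so binding that user to the cluster-admin ClusterRole gives it cluster-administrator access to the kubelet's resources

  2. What Calico and kube-proxy do and how they relate
    Calico assigns Pod network addresses and provides network connectivity between Pods
    kube-proxy implements Service load balancing by creating ipvs rules from the configuration stored in etcd. In NodePort mode the Service itself does not carry traffic in or out; kube-proxy uses the IPs of the backend Pods directly as the server list for the Service
    Each additional Service adds one IP address to the kube-ipvs0 virtual interface

  3. Cannot pull image: gcr.io/google-samples/xtrabackup:1.0

    The lank8s.cn mirror can be used instead; the mapping is k8s.gcr.io -> lank8s.cn and gcr.io -> gcr.lank8s.cn

  4. Every namespace has a default Service Account named default, which is usually injected into Pods so they can access the apiserver. By default the token of the default ServiceAccount is mounted in the container at /var/run/secrets/kubernetes.io/serviceaccount/token. This token is a temporary one obtained through the TokenRequest API; it expires after one hour by default or when the Pod is deleted. That is why no Secret is created automatically for the default Service Account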