Kubernetes v1.24 Installation and Deployment: Preparing the Base Environment

2023-04-24 17:30:01

Binary installation of Kubernetes (k8s) v1.24.0

Environment preparation

Hostname             | Role         | IP         | Installed software
k8s-master.boysec.cn | Master node  | 10.1.1.100 | etcd, kubelet, kube-proxy, kube-apiserver, kube-controller-manager, kube-scheduler, Containerd
k8s-node01.boysec.cn | Compute node | 10.1.1.120 | etcd, kubelet, kube-proxy, Containerd
k8s-node02.boysec.cn | Compute node | 10.1.1.130 | etcd, kubelet, kube-proxy, Containerd

  • 3 VMs, each with at least 2 GB of memory
  • OS: CentOS 7.9
  • containerd: v1.6.4
  • kubernetes: v1.24
  • etcd: v3.3.22
  • flannel: v0.12.0
  • Certificate signing tool CFSSL: v1.6.0
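
Before going further it can help to confirm that each VM meets the memory requirement listed above. The following is a minimal sketch, not part of the original procedure; the helper name mem_total_mib is made up for illustration, and it assumes a Linux host that exposes /proc/meminfo.

```shell
# Hypothetical helper: print MemTotal in MiB.
# $1 = meminfo file to read (defaults to /proc/meminfo).
mem_total_mib() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Report the memory of the current host, if /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    echo "memory: $(mem_total_mib) MiB"
fi
```

A VM configured with 2 GB usually reports slightly less than 2048 MiB, because the kernel reserves part of the memory.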

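This guide deploys a single master node. For a multi-master setup, see the article "Compiling and Installing Kubernetes Step by Step: Installing the Master Compute Nodes".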

Install CFSSL

CFSSL download page

wget -O /usr/bin/cfssl https://github.com/cloudflare/cfssl/releases/download/v1.6.0/cfssl_1.6.0_linux_amd64
wget -O /usr/bin/cfssljson https://github.com/cloudflare/cfssl/releases/download/v1.6.0/cfssljson_1.6.0_linux_amd64

chmod +x /usr/bin/cfssl /usr/bin/cfssljson
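
A quick way to confirm that both binaries ended up on the PATH and are executable is sketched below. The function name check_tools is an assumption made for this example; it relies only on the POSIX command -v builtin.

```shell
# Hypothetical helper: report whether each named command is found on PATH.
# Returns non-zero if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

check_tools cfssl cfssljson || echo "install the missing tools before continuing"
```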

Create the CA certificate JSON configuration files

  • CA certificate signing request
  • ca-config configuration file
mkdir /opt/certs/ -p
cd /opt/certs/
cat > /opt/certs/ca-csr.json << EOF
{
    "CN": "kubernetes-ca",
    "hosts": [
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "ST": "beijing",
            "L": "beijing",
            "O": "system:masters",
            "OU": "kubernetes"
        }
    ],
    "ca": {
        "expiry": "876000h"
    }
}
EOF

## Generate the CA certificate and private key
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -

## ca-config configuration file (signing profiles)
cat > /opt/certs/ca-config.json << EOF 
{
    "signing": {
        "default": {
            "expiry": "876000h"
        },
        "profiles": {
            "kubernetes": {
                "expiry": "876000h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            },
            "etcd": {
                "expiry": "876000h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            }
        }
    }
}
EOF
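
Once the cfssl gencert -initca step has run, cfssljson -bare ca should have written ca.pem, ca-key.pem and ca.csr. The sketch below checks for those files; check_cert_files is a hypothetical helper name, and /opt/certs is the directory used in the steps above.

```shell
# Hypothetical helper: confirm that "cfssljson -bare <prefix>" left its
# three output files (certificate, private key, CSR) in a directory.
check_cert_files() {
    dir=$1
    prefix=$2
    status=0
    for f in "$prefix.pem" "$prefix-key.pem" "$prefix.csr"; do
        if [ -s "$dir/$f" ]; then
            echo "found: $f"
        else
            echo "missing: $f"
            status=1
        fi
    done
    return "$status"
}

check_cert_files /opt/certs ca || echo "re-run the cfssl gencert -initca step"
```

The same helper can be reused later with the prefix etcd, once the etcd certificate has been generated.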

Installing the basic k8s components

Install Containerd

Install Containerd as the runtime on every k8s node.

cd /server/tools/
wget https://github.com/containerd/containerd/releases/download/v1.6.4/cri-containerd-cni-1.6.4-linux-amd64.tar.gz
mkdir /opt/containerd-1.6.4
tar xf cri-containerd-cni-1.6.4-linux-amd64.tar.gz -C /opt/containerd-1.6.4
cd /opt/containerd-1.6.4/
ln -s /opt/containerd-1.6.4/usr/local/bin/* /usr/local/bin/
## systemd service unit file
cp /opt/containerd-1.6.4/etc/systemd/system/containerd.service /usr/lib/systemd/system/
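
Since the tarball is downloaded over the network, it may be worth comparing its SHA-256 digest with the value published alongside the release before unpacking. This is a sketch under assumptions: verify_sha256 is a made-up helper, sha256sum from coreutils is assumed to be installed, and the expected digest has to be taken from the release page by hand.

```shell
# Hypothetical helper: succeed only if the file's SHA-256 digest matches.
# $1 = file, $2 = expected digest in lowercase hex.
verify_sha256() {
    file=$1
    expected=$2
    actual=$(sha256sum "$file" | cut -d' ' -f1)
    [ "$actual" = "$expected" ]
}

# Usage (the digest below is a placeholder, not a real checksum):
# verify_sha256 cri-containerd-cni-1.6.4-linux-amd64.tar.gz "<digest from the release page>" \
#     && echo "checksum ok"
```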

Configure the kernel modules Containerd requires

cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
## Load the modules
systemctl restart systemd-modules-load.service
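
To confirm that both modules were actually loaded, the module list can be inspected. The sketch below reads /proc/modules; module_loaded is a hypothetical helper, and a module compiled into the kernel would not appear in that file, so "not listed" is not always an error.

```shell
# Hypothetical helper: succeed if a module name appears in the module list.
# $1 = module name, $2 = list file (defaults to /proc/modules).
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

for mod in overlay br_netfilter; do
    if module_loaded "$mod"; then
        echo "loaded: $mod"
    else
        echo "not listed: $mod"
    fi
done
```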

Configure the kernel parameters Containerd requires

cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables  = 1
net.ipv4.ip_forward                 = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF

# Apply the kernel parameters
sysctl --system
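
After sysctl --system, the three values can be read back from /proc/sys to confirm they took effect. This is a minimal sketch; sysctl_path and check_sysctl are names invented for the example, and the key-to-path mapping assumes keys without dots inside a component, which holds for the three keys used here.

```shell
# Hypothetical helper: map a sysctl key to its /proc/sys path.
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Hypothetical helper: compare the live value of a key with the wanted value.
check_sysctl() {
    key=$1
    want=$2
    path=$(sysctl_path "$key")
    if [ -r "$path" ]; then
        got=$(cat "$path")
    else
        got="unavailable"
    fi
    if [ "$got" = "$want" ]; then
        echo "ok: $key = $got"
    else
        echo "check: $key = $got (want $want)"
    fi
}

for key in net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward net.bridge.bridge-nf-call-ip6tables; do
    check_sysctl "$key" 1
done
```

The two net.bridge keys only exist once br_netfilter is loaded, so "unavailable" usually points back to the module step.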

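Configure runc support

containerd is designed to be embedded in a larger system rather than used directly by developers or end users. With containerd and runC as the standardized foundation for container services, higher-level applications can be built directly on top of them. The goal here is a minimal container system, which needs both containerd and runC, so that the Linux kernel starts containerd rather than init at boot and the container carries the essential system components such as a shell. However, the runc binary bundled in the containerd package fails with "undefined symbol: seccomp_notify_respond" (typically because the system libseccomp is older than the one that runc was linked against), so runc has to be downloaded and installed separately.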

wget -O /usr/local/sbin/runc https://github.com/opencontainers/runc/releases/download/v1.1.2/runc.amd64
chmod +x /usr/local/sbin/runc

Create the Containerd configuration file

mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml

# 1. Edit the Containerd configuration file (use either option 1 or option 2)
sed -i "s#SystemdCgroup = false#SystemdCgroup = true#g" /etc/containerd/config.toml

grep SystemdCgroup /etc/containerd/config.toml

# 2. Or locate containerd.runtimes.runc.options and set SystemdCgroup = true under it

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
              SystemdCgroup = true
    [plugins."io.containerd.grpc.v1.cri".cni]
# 3. Add an Aliyun registry mirror
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
        endpoint = ["https://l2v84zex.mirror.aliyuncs.com"]  # replace with your own Aliyun mirror address
# 4. Point sandbox_image at a reachable address that matches your version
    sandbox_image = "kubernetes/pause"                       # the default address may be blocked
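
Before running the in-place sed from option 1, the effect of the substitution can be previewed without modifying the file. The sketch below uses the same sed expression with -n and the p flag; preview_cgroup_change is a hypothetical helper name.

```shell
# Hypothetical helper: print only the lines the substitution would change,
# shown as they would look after the change. The file is not modified.
preview_cgroup_change() {
    sed -n 's#SystemdCgroup = false#SystemdCgroup = true#p' "$1"
}

if [ -r /etc/containerd/config.toml ]; then
    preview_cgroup_change /etc/containerd/config.toml
fi
```

Empty output means the file contains no "SystemdCgroup = false" line, either because the change was already made or because the generated config uses a different layout.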

Configure the CNI network

mkdir -p /etc/cni/net.d
cat > /etc/cni/net.d/10-flannel.conflist <<EOF
{
  "name": "flannel",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "flannel",
      "delegate": {
        "isDefaultGateway": true
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    }
  ]
}
EOF
mkdir /opt/cni/bin -p
ln -s /opt/containerd-1.6.4/opt/cni/bin/* /opt/cni/bin/
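
Each "type" named in the conflist has to exist as an executable in the CNI bin directory, otherwise pod networking fails later. The following sketch extracts the plugin types with sed and checks for matching binaries; both function names are assumptions made for this example, and the sed pattern only handles the simple one-key-per-line layout used above.

```shell
# Hypothetical helper: list the plugin "type" values found in a conflist.
cni_plugin_types() {
    sed -n 's/.*"type"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Hypothetical helper: check that each plugin type has an executable
# of the same name in the given bin directory.
check_cni_plugins() {
    conf=$1
    bindir=$2
    for t in $(cni_plugin_types "$conf"); do
        if [ -x "$bindir/$t" ]; then
            echo "ok: $t"
        else
            echo "missing: $t"
        fi
    done
}

if [ -r /etc/cni/net.d/10-flannel.conflist ]; then
    check_cni_plugins /etc/cni/net.d/10-flannel.conflist /opt/cni/bin
fi
```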

Start Containerd and enable it at boot

systemctl daemon-reload
systemctl enable --now containerd

Configure the runtime endpoint used by the crictl client

cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF
 
# Test
systemctl restart  containerd
crictl info

Install the etcd cluster

Create the certificate

cat > /opt/certs/etcd-csr.json << EOF
{
    "CN": "etcd-peer",
    "hosts": [
        "10.1.1.100",
        "10.1.1.110",
        "10.1.1.120",
        "10.1.1.130"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "ST": "beijing",
            "L": "beijing",
            "O": "etcd",
            "OU": "Etcd Security"
        }
    ]
}
EOF

## Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=etcd etcd-csr.json | cfssljson -bare etcd
# Or
cfssl gencert \
   -ca=ca.pem \
   -ca-key=ca-key.pem \
   -config=ca-config.json \
   -hostname=10.1.1.100,10.1.1.110,10.1.1.120,10.1.1.130 \
   -profile=etcd \
   etcd-csr.json | cfssljson -bare etcd
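
Every etcd node IP has to appear in the "hosts" list of etcd-csr.json, otherwise TLS verification between members fails. The sketch below lists any node IP that is absent from the CSR file; csr_missing_hosts is a hypothetical helper, and the match is a plain text search for the quoted IP rather than real JSON parsing.

```shell
# Hypothetical helper: print each IP that does not appear, quoted, in the CSR file.
# $1 = CSR JSON file, remaining arguments = IP addresses to look for.
csr_missing_hosts() {
    csr=$1
    shift
    for ip in "$@"; do
        grep -qF "\"$ip\"" "$csr" || echo "$ip"
    done
}

if [ -r /opt/certs/etcd-csr.json ]; then
    csr_missing_hosts /opt/certs/etcd-csr.json 10.1.1.100 10.1.1.120 10.1.1.130
fi
```

No output means all three node addresses are covered by the certificate request.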

Install etcd

etcd download page

### Create the etcd user
useradd -s /sbin/nologin -M etcd

## Unpack
cd /server/tools
tar xf etcd-v3.3.22-linux-amd64.tar.gz -C /opt
ln -s /opt/etcd-v3.3.22-linux-amd64 /opt/etcd

### Create the directories and copy the certificates
mkdir -p /opt/etcd/{ssl,cfg}

### Copy ca.pem, etcd-key.pem and etcd.pem generated on the ops host into /opt/etcd/ssl; the private key must have mode 600
chown etcd:etcd /opt/etcd/ssl/*
chmod 600 /opt/etcd/ssl/etcd-key.pem
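
The private key permissions can be double-checked on each node after the copy. This sketch compares the mode string printed by ls with the one that corresponds to mode 600; file_mode and check_private_key are names made up for the example.

```shell
# Hypothetical helper: print the ten-character mode string of a file.
file_mode() {
    ls -ld "$1" | cut -c1-10
}

# Hypothetical helper: succeed only if the file is a regular file with mode 600.
check_private_key() {
    if [ "$(file_mode "$1")" = "-rw-------" ]; then
        echo "ok: $1"
    else
        echo "fix permissions: $1 is $(file_mode "$1")"
        return 1
    fi
}

if [ -e /opt/etcd/ssl/etcd-key.pem ]; then
    check_private_key /opt/etcd/ssl/etcd-key.pem || echo "run: chmod 600 /opt/etcd/ssl/etcd-key.pem"
fi
```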
Create the configuration file on each node (etcd-1, etcd-2, etcd-3), then the etcd.service unit file.

etcd-1 (10.1.1.100):
cat > /opt/etcd/cfg/etcd.config.yml << EOF 
name: 'etcd-1'
data-dir: /var/lib/etcd
wal-dir: /var/lib/etcd/wal
snapshot-count: 5000
heartbeat-interval: 100
election-timeout: 1000
quota-backend-bytes: 0
listen-peer-urls: 'https://10.1.1.100:2380'
listen-client-urls: 'https://10.1.1.100:2379,http://127.0.0.1:2379'
max-snapshots: 3
max-wals: 5
cors:
initial-advertise-peer-urls: 'https://10.1.1.100:2380'
advertise-client-urls: 'https://10.1.1.100:2379,http://127.0.0.1:2379'
discovery:
discovery-fallback: 'proxy'
discovery-proxy:
discovery-srv:
initial-cluster: 'etcd-1=https://10.1.1.100:2380,etcd-2=https://10.1.1.120:2380,etcd-3=https://10.1.1.130:2380'
initial-cluster-token: 'etcd-k8s-cluster'
initial-cluster-state: 'new'
strict-reconfig-check: false
enable-v2: true
enable-pprof: true
proxy: 'off'
proxy-failure-wait: 5000
proxy-refresh-interval: 30000
proxy-dial-timeout: 1000
proxy-write-timeout: 5000
proxy-read-timeout: 0
client-transport-security:
  cert-file: '/opt/etcd/ssl/etcd.pem'
  key-file: '/opt/etcd/ssl/etcd-key.pem'
  client-cert-auth: true
  trusted-ca-file: '/opt/etcd/ssl/ca.pem'
  auto-tls: true
peer-transport-security:
  cert-file: '/opt/etcd/ssl/etcd.pem'
  key-file: '/opt/etcd/ssl/etcd-key.pem'
  peer-client-cert-auth: true
  trusted-ca-file: '/opt/etcd/ssl/ca.pem'
  auto-tls: true
debug: false
log-package-levels:
log-outputs: [default]
force-new-cluster: false
EOF
etcd-2 (10.1.1.120):
cat > /opt/etcd/cfg/etcd.config.yml << EOF 
name: 'etcd-2'
data-dir: /var/lib/etcd
wal-dir: /var/lib/etcd/wal
snapshot-count: 5000
heartbeat-interval: 100
election-timeout: 1000
quota-backend-bytes: 0
listen-peer-urls: 'https://10.1.1.120:2380'
listen-client-urls: 'https://10.1.1.120:2379,http://127.0.0.1:2379'
max-snapshots: 3
max-wals: 5
cors:
initial-advertise-peer-urls: 'https://10.1.1.120:2380'
advertise-client-urls: 'https://10.1.1.120:2379,http://127.0.0.1:2379'
discovery:
discovery-fallback: 'proxy'
discovery-proxy:
discovery-srv:
initial-cluster: 'etcd-1=https://10.1.1.100:2380,etcd-2=https://10.1.1.120:2380,etcd-3=https://10.1.1.130:2380'
initial-cluster-token: 'etcd-k8s-cluster'
initial-cluster-state: 'new'
strict-reconfig-check: false
enable-v2: true
enable-pprof: true
proxy: 'off'
proxy-failure-wait: 5000
proxy-refresh-interval: 30000
proxy-dial-timeout: 1000
proxy-write-timeout: 5000
proxy-read-timeout: 0
client-transport-security:
  cert-file: '/opt/etcd/ssl/etcd.pem'
  key-file: '/opt/etcd/ssl/etcd-key.pem'
  client-cert-auth: true
  trusted-ca-file: '/opt/etcd/ssl/ca.pem'
  auto-tls: true
peer-transport-security:
  cert-file: '/opt/etcd/ssl/etcd.pem'
  key-file: '/opt/etcd/ssl/etcd-key.pem'
  peer-client-cert-auth: true
  trusted-ca-file: '/opt/etcd/ssl/ca.pem'
  auto-tls: true
debug: false
log-package-levels:
log-outputs: [default]
force-new-cluster: false
EOF
etcd-3 (10.1.1.130):
cat > /opt/etcd/cfg/etcd.config.yml << EOF 
name: 'etcd-3'
data-dir: /var/lib/etcd
wal-dir: /var/lib/etcd/wal
snapshot-count: 5000
heartbeat-interval: 100
election-timeout: 1000
quota-backend-bytes: 0
listen-peer-urls: 'https://10.1.1.130:2380'
listen-client-urls: 'https://10.1.1.130:2379,http://127.0.0.1:2379'
max-snapshots: 3
max-wals: 5
cors:
initial-advertise-peer-urls: 'https://10.1.1.130:2380'
advertise-client-urls: 'https://10.1.1.130:2379,http://127.0.0.1:2379'
discovery:
discovery-fallback: 'proxy'
discovery-proxy:
discovery-srv:
initial-cluster: 'etcd-1=https://10.1.1.100:2380,etcd-2=https://10.1.1.120:2380,etcd-3=https://10.1.1.130:2380'
initial-cluster-token: 'etcd-k8s-cluster'
initial-cluster-state: 'new'
strict-reconfig-check: false
enable-v2: true
enable-pprof: true
proxy: 'off'
proxy-failure-wait: 5000
proxy-refresh-interval: 30000
proxy-dial-timeout: 1000
proxy-write-timeout: 5000
proxy-read-timeout: 0
client-transport-security:
  cert-file: '/opt/etcd/ssl/etcd.pem'
  key-file: '/opt/etcd/ssl/etcd-key.pem'
  client-cert-auth: true
  trusted-ca-file: '/opt/etcd/ssl/ca.pem'
  auto-tls: true
peer-transport-security:
  cert-file: '/opt/etcd/ssl/etcd.pem'
  key-file: '/opt/etcd/ssl/etcd-key.pem'
  peer-client-cert-auth: true
  trusted-ca-file: '/opt/etcd/ssl/ca.pem'
  auto-tls: true
debug: false
log-package-levels:
log-outputs: [default]
force-new-cluster: false
EOF
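
The three configuration files differ only in the node name and in the IP address used by the four URL settings. As an illustration of that pattern, the sketch below prints just the node-specific lines for a given name and IP; etcd_node_lines is a hypothetical helper, and the remaining settings would still come from the full files above.

```shell
# Hypothetical helper: print the node-specific lines of etcd.config.yml.
# $1 = member name, $2 = node IP.
etcd_node_lines() {
    name=$1
    ip=$2
    cat <<EOF
name: '$name'
listen-peer-urls: 'https://$ip:2380'
listen-client-urls: 'https://$ip:2379,http://127.0.0.1:2379'
initial-advertise-peer-urls: 'https://$ip:2380'
advertise-client-urls: 'https://$ip:2379,http://127.0.0.1:2379'
EOF
}

# Print the per-node lines for all three members.
for node in etcd-1=10.1.1.100 etcd-2=10.1.1.120 etcd-3=10.1.1.130; do
    etcd_node_lines "${node%%=*}" "${node#*=}"
    echo "---"
done
```

Comparing this output with the file on each node is a cheap way to catch a copy-paste mistake such as etcd-2 still carrying the IP of etcd-1.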
etcd.service (all nodes):
cat > /usr/lib/systemd/system/etcd.service << EOF
[Unit]
Description=Etcd Service
Documentation=https://coreos.com/etcd/docs/latest/
After=network.target
After=network-online.target
Wants=network-online.target
 
[Service]
Type=notify
ExecStart=/opt/etcd/etcd --config-file=/opt/etcd/cfg/etcd.config.yml
Restart=on-failure
RestartSec=10
LimitNOFILE=65536
 
[Install]
WantedBy=multi-user.target
Alias=etcd3.service
EOF

Start etcd

systemctl daemon-reload
systemctl enable --now etcd

Check the etcd cluster status

export ETCDCTL_API=3
/opt/etcd/etcdctl --endpoints="10.1.1.100:2379,10.1.1.120:2379,10.1.1.130:2379" --cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem endpoint status --write-out=table
+-----------------+------------------+---------+---------+-----------+-----------+------------+
|    ENDPOINT     |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+-----------------+------------------+---------+---------+-----------+-----------+------------+
| 10.1.1.100:2379 | 4988e076821369e3 |  3.3.22 |   20 kB |      true |        86 |          9 |
| 10.1.1.120:2379 | 2612ebaf51b393a5 |  3.3.22 |   20 kB |     false |        86 |          9 |
| 10.1.1.130:2379 | 8de0ef816eba4013 |  3.3.22 |   20 kB |     false |        86 |          9 |
+-----------------+------------------+---------+---------+-----------+-----------+------------+
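
In addition to endpoint status, etcdctl also offers endpoint health, which reports whether each member answers requests. The sketch below builds the endpoint list from the node IPs and runs the health check only if etcdctl is present; join_endpoints is a hypothetical helper, and the certificate paths are the ones used in this guide.

```shell
# Hypothetical helper: turn a list of IPs into a comma-separated list of
# https client URLs on port 2379.
join_endpoints() {
    out=""
    for ip in "$@"; do
        out="${out:+$out,}https://$ip:2379"
    done
    printf '%s\n' "$out"
}

ENDPOINTS=$(join_endpoints 10.1.1.100 10.1.1.120 10.1.1.130)
echo "$ENDPOINTS"

# Only query the cluster when etcdctl is actually installed at this path.
if [ -x /opt/etcd/etcdctl ]; then
    ETCDCTL_API=3 /opt/etcd/etcdctl --endpoints="$ENDPOINTS" \
        --cacert=/opt/etcd/ssl/ca.pem \
        --cert=/opt/etcd/ssl/etcd.pem \
        --key=/opt/etcd/ssl/etcd-key.pem \
        endpoint health || echo "at least one endpoint is unhealthy"
fi
```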

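Common etcd errors

Background:

Three etcd nodes were deployed, and one day all three machines lost power and went down. After the reboot the K8S cluster worked normally, but a check of the components showed that etcd would not start on one node.

Investigation showed that the system clock was wrong. The time was corrected with ntpdate ntp.aliyun.com and etcd was restarted, but it still failed to start and logged the following: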

Jun 26 05:38:12 moban etcd: listening for peers on https://10.1.1.120:2380
Jun 26 05:38:12 moban etcd: ignoring client auto TLS since certs given
Jun 26 05:38:12 moban etcd: pprof is enabled under /debug/pprof
Jun 26 05:38:12 moban etcd: The scheme of client url http://127.0.0.1:2379 is HTTP while peer key/cert files are presented. Ignored key/cert files.
Jun 26 05:38:12 moban etcd: The scheme of client url http://127.0.0.1:2379 is HTTP while client cert auth (--client-cert-auth) is enabled. Ignored client cert auth for this url.

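Solution:

The log shows no obvious error. In practice, losing one etcd member has little effect on the cluster, which was already usable again, but the broken member still would not start. The fix is as follows.

Go to the etcd data directory and back up the existing data:

cd /var/lib/etcd/member/

cp -r * /data/bak/

Delete all data files in that directory (the path follows data-dir: /var/lib/etcd from the configuration above):

rm -rf /var/lib/etcd/member/

Stop the other two etcd nodes, then start all of the nodes together, because the etcd nodes need to come up at the same time. Once they have started successfully, the cluster can be used.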
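
Because the recovery step deletes data, it is safer to keep a timestamped copy of the member directory rather than overwrite an earlier backup. The following is a minimal sketch of that backup step only; backup_dir is a hypothetical helper, and /data/bak is the backup location mentioned above.

```shell
# Hypothetical helper: copy a directory into <dest_root>/<name>-<timestamp>
# and print the path of the copy.
# $1 = source directory, $2 = backup root directory.
backup_dir() {
    src=$1
    dest_root=$2
    stamp=$(date +%Y%m%d-%H%M%S)
    dest="$dest_root/$(basename "$src")-$stamp"
    mkdir -p "$dest_root" && cp -R "$src" "$dest" && echo "$dest"
}

# Usage on the broken node, with etcd stopped first:
# systemctl stop etcd
# backup_dir /var/lib/etcd/member /data/bak
```

Only after the printed backup path has been checked should the original member directory be removed.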
