I. System setup
Environment
[root@st01015vm192 /]# cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
[root@st01015vm192 /]# uname -a
Linux st01015vm192 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
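All steps are performed as the root user.
1. Disable swap
# Disable swap temporarily (lasts until the next reboot)
swapoff -a
To disable swap permanently, comment out the following entry in /etc/fstab:
#/dev/mapper/centos-swap swap swap defaults 0 0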
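The manual fstab edit can also be scripted. The following is a minimal sketch, assuming GNU sed and an fstab whose swap entries start in the first column; the function name comment_out_swap is hypothetical and not part of any tool.

```shell
# Comment out every active swap entry in an fstab-style file.
# Usage (as root): comment_out_swap /etc/fstab
comment_out_swap() {
  fstab_path="$1"
  # Keep a backup next to the original before editing in place.
  cp -p "$fstab_path" "$fstab_path.bak"
  # Match uncommented lines whose third field (filesystem type) is "swap"
  # and prefix them with "#". Lines that are already commented are skipped.
  sed -i -E 's/^([^#[:space:]][^[:space:]]*[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]].*)$/#\1/' "$fstab_path"
}
```

Running it a second time changes nothing, because commented lines no longer match the pattern.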
2. Put SELinux in permissive mode
The kubelet does not yet support SELinux well, so SELinux is switched to permissive mode here.
# Check the current status
# /usr/sbin/sestatus -v
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
3. Disable the firewall
systemctl disable firewalld
systemctl stop firewalld
4. Configure sysctl
Create the file /etc/sysctl.d/k8s.conf with the following content:
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
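Apply it:
sysctl -p /etc/sysctl.d/k8s.conf
On RHEL/CentOS 7, network requests can be routed incorrectly because iptables is bypassed, so make sure net.bridge.bridge-nf-call-iptables is set to 1 in your sysctl configuration. When a network plugin connects containers to a Linux bridge, it must set the net/bridge/bridge-nf-call-iptables sysctl to 1 so that the iptables proxy works correctly.
Finally, IP forwarding is enabled in the kernel so that it processes packets from bridged containers: sysctl net.ipv4.ip_forward=1. The result of all this is that all Pods can reach each other and can send traffic to the internet.
Kubernetes network plugins: https://kubernetes.io/zh/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/
Kubernetes networking model: https://kubernetes.io/zh/docs/concepts/cluster-administration/networking/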
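Before running kubeadm it can help to confirm that the three settings really are in the file. The sketch below is one way to check; the function name missing_k8s_sysctls is hypothetical. As an assumption not stated in the original text: on many CentOS 7 hosts the net.bridge.* keys only exist once the br_netfilter kernel module is loaded, so modprobe br_netfilter may be needed before sysctl -p succeeds.

```shell
# Print each required key that is missing from, or not set to 1 in, a sysctl file.
# Usage: missing_k8s_sysctls /etc/sysctl.d/k8s.conf
missing_k8s_sysctls() {
  conf_path="$1"
  for key in net.bridge.bridge-nf-call-ip6tables \
             net.bridge.bridge-nf-call-iptables \
             net.ipv4.ip_forward; do
    # Split each line on "=", strip whitespace, and compare key and value exactly.
    if ! awk -F= -v k="$key" '
          { gsub(/[[:space:]]/, "", $1); gsub(/[[:space:]]/, "", $2) }
          $1 == k && $2 == "1" { found = 1 }
          END { exit !found }' "$conf_path"; then
      echo "$key"
    fi
  done
}

# Possible usage (assumption: br_netfilter is not loaded yet):
#   modprobe br_netfilter
#   missing_k8s_sysctls /etc/sysctl.d/k8s.conf
```

An empty result means all three keys are present with the value 1.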
5. Switch the package sources to the Aliyun mirrors
5.1 Configure the yum repository
## Back up the existing repo file
mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
## Download the Aliyun repo file
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
5.2 Configure the Kubernetes repository
vim /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
5.3 Rebuild the yum cache
yum clean all
yum makecache fast
yum -y update
II. Install Docker
1. Install Docker
See the official documentation: https://docs.docker.com/engine/install/centos/
- Remove old versions
yum remove docker \
    docker-client \
    docker-client-latest \
    docker-common \
    docker-latest \
    docker-latest-logrotate \
    docker-logrotate \
    docker-engine
- Install Docker
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager \
    --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum install -y docker-ce docker-ce-cli containerd.io
To install a specific Docker version:
yum list docker-ce --showduplicates | sort -r
sudo yum install docker-ce-<VERSION_STRING> docker-ce-cli-<VERSION_STRING> containerd.io
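2. Configure Docker
Create the file /etc/docker/daemon.json and write the configuration:
mkdir /etc/docker/
vim /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ]
}
If you are installing inside mainland China, also add a "registry-mirrors" key, for example "registry-mirrors": ["https://registry.docker-cn.com"]. daemon.json must be strict JSON, so do not leave // comments in the file, or Docker will fail to parse it.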
Registry mirrors for mainland China (add this key inside the JSON object in daemon.json):
"registry-mirrors": [
"https://1nj0zren.mirror.aliyuncs.com",
"https://docker.mirrors.ustc.edu.cn",
"http://f1361db2.m.daocloud.io",
"https://registry.docker-cn.com"
]
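Because daemon.json has to be strict JSON, the mirror list needs to be merged into the main object rather than left as a commented fragment. The sketch below writes one possible merged file; the function name write_daemon_json is hypothetical, and the two mirrors are simply picked from the list above as an illustration, not as a recommendation.

```shell
# Write a comment-free daemon.json that combines the base settings with a mirror list.
# Usage (as root): write_daemon_json /etc/docker/daemon.json
write_daemon_json() {
  target="$1"
  mkdir -p "$(dirname "$target")"
  # Quoted heredoc delimiter: the content is written literally, with no expansion.
  cat > "$target" <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "registry-mirrors": [
    "https://docker.mirrors.ustc.edu.cn",
    "https://registry.docker-cn.com"
  ],
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ]
}
EOF
}
```

Note that the function overwrites any existing file at the target path, so back up a customized daemon.json first.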
3. Restart Docker
mkdir -p /etc/systemd/system/docker.service.d
systemctl daemon-reload
systemctl restart docker
This may fail with a docker.service error:
[root@st01015vm193 ~]# journalctl -xe
May 08 15:36:09 st01015vm193 systemd[1]: Dependency failed for Docker Application Container Engine.
-- Subject: Unit docker.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit docker.service has failed.
--
-- The result is dependency.
May 08 15:36:09 st01015vm193 systemd[1]: Job docker.service/start failed with result 'dependency'.
May 08 15:36:09 st01015vm193 systemd[1]: Unit docker.socket entered failed state.
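The fix is to add a docker group to the system. If security-baseline hardening has been applied to the host, groupadd reports a permission error; use chattr -i to clear the immutable attribute on /etc/group first:
chattr -i /etc/group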
groupadd docker
systemctl enable docker && systemctl start docker
III. Install the cluster
1. Install kubeadm, kubelet and kubectl
yum install -y kubelet kubeadm kubectl kubernetes-cni --disableexcludes=kubernetes
systemctl enable --now kubelet
2. Create the cluster with kubeadm
Run this on the master node only.
# Run on the master node:
kubeadm init \
    --apiserver-advertise-address 10.10.45.192 \
    --pod-network-cidr=10.244.0.0/16
# --kubernetes-version=v1.15.0
# --image-repository=registry.aliyuncs.com/google_containers
# --apiserver-advertise-address: the address used to communicate with the other nodes
# --pod-network-cidr: the Pod network subnet; the default flannel manifest expects this CIDR
It may fail as follows:
[root@st01015vm192 ~]# kubeadm init \
> --apiserver-advertise-address 10.10.45.192 \
> --pod-network-cidr=10.244.0.0/16
W0508 15:09:35.577282 28115 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.2
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[ERROR Swap]: running with swap on is not supported. Please disable swap
[ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
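These errors point to leftovers from an earlier kubeadm run (existing manifests and a non-empty /var/lib/etcd) and to swap still being on. The workaround used here was to add the parameter --ignore-preflight-errors=all; cleaning up with kubeadm reset and disabling swap addresses the cause instead.
With that parameter, the run failed with:
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.
In most cases this means the kubelet failed to start. Check its status and logs to find the cause. In my case swap had only been disabled temporarily, so it was turned back on when the machine rebooted:
systemctl status kubelet
journalctl -xeu kubelet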
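Since swap coming back after a reboot was the cause here, a quick check before starting the kubelet can save a debugging round. This is a minimal sketch that reads the kernel's /proc/swaps table; the function name swap_is_active is hypothetical.

```shell
# Succeed (exit 0) when a /proc/swaps-style file lists at least one active swap area.
# Usage: swap_is_active            (reads /proc/swaps)
#        swap_is_active some_file  (reads a saved copy)
swap_is_active() {
  swaps_path="${1:-/proc/swaps}"
  # Line 1 is the column header; every further non-empty line is an active swap area.
  count=$(awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$swaps_path")
  [ "$count" -gt 0 ]
}

# Possible usage:
#   if swap_is_active; then echo "swap is on; run swapoff -a and fix /etc/fstab"; fi
```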
After a successful initialization, the following is printed:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.10.45.192:6443 --token 82scon.3zopf5qra2b1s25i \
    --discovery-token-ca-cert-hash sha256:2ea38c2a269d105b09bbf2964c089c067f0c8e7b44c0504b5854fd9acac263e0
3. Set up kubectl access for the user (the root user needs to do this too)
# Run on the master node:
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
4. Apply the flannel network
sudo kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
## Check whether flannel was installed successfully
sudo kubectl -n kube-system get po -l app=flannel -o wide
5. Join the worker nodes
Following the output printed after the cluster was created on the master node, run the join command on each worker node:
kubeadm join 10.10.45.192:6443 --token 82scon.3zopf5qra2b1s25i \
    --discovery-token-ca-cert-hash sha256:2ea38c2a269d105b09bbf2964c089c067f0c8e7b44c0504b5854fd9acac263e0
W0508 15:42:29.235510 5131 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
[WARNING Hostname]: hostname "st01015vm194" could not be reached
[WARNING Hostname]: hostname "st01015vm194": lookup st01015vm194 on 192.168.16.24:53: no such host
To resolve the hostname warnings, edit /etc/hosts and add the node entries:
10.10.45.192 st01015vm192
10.10.45.193 st01015vm193
10.10.45.194 st01015vm194
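The same three entries have to be added on every node, so an idempotent helper avoids duplicate lines when the step is repeated. This is a minimal sketch; the function name add_host_entry is hypothetical.

```shell
# Append "IP hostname" to a hosts file unless that hostname is already mapped.
# Usage (as root): add_host_entry /etc/hosts 10.10.45.193 st01015vm193
add_host_entry() {
  hosts_path="$1"
  ip="$2"
  name="$3"
  # Look for the hostname as a whole field after the address on a non-comment line.
  if ! awk -v n="$name" '
        $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$hosts_path"; then
    printf '%s %s\n' "$ip" "$name" >> "$hosts_path"
  fi
}

# Possible usage for the three nodes in this article:
#   add_host_entry /etc/hosts 10.10.45.192 st01015vm192
#   add_host_entry /etc/hosts 10.10.45.193 st01015vm193
#   add_host_entry /etc/hosts 10.10.45.194 st01015vm194
```

The helper only checks the hostname, so an existing entry that maps the name to a different address is left untouched and should be corrected by hand.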
Run the join command on each node. When it has finished, check the node status on the master node:
[root@st01015vm192 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
st01015vm192 Ready master 16d v1.18.2
st01015vm193 NotReady <none> 29s v1.18.2
st01015vm194 NotReady <none> 6m43s v1.18.2
The worker nodes stay in NotReady, so check the pod status:
kubectl get pods -n kube-system -o wide
kube-flannel-ds-amd64-sdvpf 0/1 Init:ImagePullBackOff 0 8m12s 10.10.45.193 st01015vm193 <none> <none>
kube-flannel-ds-amd64-x58td 0/1 Init:ImagePullBackOff 0 14m 10.10.45.194 st01015vm194 <none> <none>
The flannel pods on the worker nodes are failing. Look at the details:
kubectl describe pod -n kube-system kube-flannel-ds-amd64-sdvpf
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned kube-system/kube-flannel-ds-amd64-sdvpf to st01015vm193
Warning Failed 2m22s (x2 over 4m33s) kubelet, st01015vm193 Failed to pull image "quay.io/coreos/flannel:v0.12.0-amd64": rpc error: code = Unknown desc = context canceled
Warning Failed 2m22s (x2 over 4m33s) kubelet, st01015vm193 Error: ErrImagePull
Normal BackOff 2m11s (x2 over 4m33s) kubelet, st01015vm193 Back-off pulling image "quay.io/coreos/flannel:v0.12.0-amd64"
Warning Failed 2m11s (x2 over 4m33s) kubelet, st01015vm193 Error: ImagePullBackOff
Normal Pulling 118s (x3 over 7m28s) kubelet, st01015vm193 Pulling image "quay.io/coreos/flannel:v0.12.0-amd64"
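The events show that the image pull failed, most likely because of a network problem. A failed pod is retried automatically, so either wait a while, or pull the image manually on the failing node and, where the flannel DaemonSet manifest sets imagePullPolicy: Always, change it to imagePullPolicy: IfNotPresent so that the local image is used. From the Kubernetes documentation:
By default, the kubelet will try to pull each image from the specified registry. However, if the imagePullPolicy property of the container is set to IfNotPresent or Never, then a local image is used (preferentially or exclusively, respectively).
docker pull quay.io/coreos/flannel:v0.12.0-amd64
Once the image is available on the nodes, they become Ready:
[root@st01015vm192 ~]# kubectl get nodes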
NAME STATUS ROLES AGE VERSION
st01015vm192 Ready master 17d v1.18.2
st01015vm193 Ready <none> 33m v1.18.2
st01015vm194 Ready <none> 39m v1.18.2
IV. Install the web UI (Dashboard)
Dashboard is not deployed by default. It can be deployed with the following command:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0/aio/deploy/recommended.yaml
To protect your cluster data, Dashboard is deployed with a minimal RBAC configuration by default. Currently, Dashboard only supports logging in with a Bearer Token.
To reach Dashboard from outside the cluster, download the YAML and adjust it. Change the following Service definition:
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 443
      targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard
After the change:
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30001
  selector:
    k8s-app: kubernetes-dashboard
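Change the image pull policy to IfNotPresent or Never and pull the images on every node in the cluster. This avoids pod creation failing because an image cannot be pulled over the network:
docker pull kubernetesui/metrics-scraper:v1.0.4
docker pull kubernetesui/dashboard:v2.0.0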
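The pull-policy change can be made with a one-line sed instead of editing the YAML by hand. This is a minimal sketch, assuming GNU sed and a locally downloaded copy of the manifest named recommended.yaml; the function name set_pull_policy_if_not_present is hypothetical.

```shell
# Rewrite "imagePullPolicy: Always" to "imagePullPolicy: IfNotPresent" in a manifest,
# keeping the original indentation so the YAML structure is unchanged.
# Usage: set_pull_policy_if_not_present recommended.yaml
set_pull_policy_if_not_present() {
  manifest="$1"
  sed -i -E 's/^([[:space:]]*imagePullPolicy:[[:space:]]*)Always[[:space:]]*$/\1IfNotPresent/' "$manifest"
}

# Possible usage:
#   wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0/aio/deploy/recommended.yaml
#   set_pull_policy_if_not_present recommended.yaml
#   grep -n imagePullPolicy recommended.yaml
```

Containers that do not set imagePullPolicy at all are left alone by this function.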
If one machine has pulled the images successfully while the others keep failing, save the images on that machine and load them on the failing ones.
Save:
docker save -o dashboard.tar kubernetesui/dashboard:v2.0.0
docker save -o metrics-scraper.tar kubernetesui/metrics-scraper:v1.0.4
Load:
docker load -i dashboard.tar
docker load -i metrics-scraper.tar
Apply:
kubectl apply -f recommended.yaml
Check the status:
kubectl get pod,svc,ing,deploy -n kubernetes-dashboard
Once all pods are running, open:
https://10.10.45.192:30001/
After installation, get the token on the master node:
[root@st01015vm192 ~]# kubectl -n kubernetes-dashboard get secret
NAME TYPE DATA AGE
default-token-25jb9 kubernetes.io/service-account-token 3 23m
kubernetes-dashboard-certs Opaque 0 23m
kubernetes-dashboard-csrf Opaque 1 23m
kubernetes-dashboard-key-holder Opaque 2 23m
kubernetes-dashboard-token-fm795 kubernetes.io/service-account-token 3 23m
[root@st01015vm192 ~]# kubectl -n kubernetes-dashboard describe secret kubernetes-dashboard-token-fm795
Name: kubernetes-dashboard-token-fm795
Namespace: kubernetes-dashboard
Labels: <none>
Annotations: kubernetes.io/service-account.name: kubernetes-dashboard
kubernetes.io/service-account.uid: af7a61cf-901f-42f9-bcbe-6f521d026bc2
Type: kubernetes.io/service-account-token
Data
====
ca.crt: 1025 bytes
namespace: 20 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6IlhCZGZQa0MtaVFmMHJ0YTRBS083emppS0tKSENvb24xeW9scHIxY19zU0kifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZC10b2tlbi1mbTc5NSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImFmN2E2MWNmLTkwMWYtNDJmOS1iY2JlLTZmNTIxZDAyNmJjMiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlcm5ldGVzLWRhc2hib2FyZDprdWJlcm5ldGVzLWRhc2hib2FyZCJ9.haW6XvBAyog0BasqbaWJxPqWjTJKiemVBwP3J8dwFE43Q93Jx41yjxK41NRNaUflL8xL3Aj4CNIJ0YUQwlpIutIzOJq7rXWkneRI6tmgr3jsCarFtjwETph7-spg-WQAHXRQxt7hwMyxcNkJprEc13q6zGO_ycx9ei_hjjliXo0O8JMuQsL0rlm2zXrWOpRer5U77Hj33dnVSGrjvlD3X_5NsI0dlzG2MmKMFZHM0_PVbYFnSvWcEmLl_04_u5CJPtPfp9Pu6RTjy1lMOZtsHgBxqDC-oXxm0UP2Tcn2qlu_UDfIPhiL3r-QrwWFy7b3WpxJCcXwcm07pfUzijQ77A
Log in at https://10.10.45.192:30001/ with the token.
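The two manual steps above (find the secret name, then read the token) can be chained. The sketch below picks the secret name out of the kubectl get secret listing; the function name dashboard_token_secret_name is hypothetical, and the commented usage assumes kubectl's jsonpath output and GNU base64 are available.

```shell
# Read `kubectl get secret` output on stdin and print the first secret whose
# name starts with "kubernetes-dashboard-token-".
dashboard_token_secret_name() {
  awk '$1 ~ /^kubernetes-dashboard-token-/ { print $1; exit }'
}

# Possible usage on the master node:
#   name=$(kubectl -n kubernetes-dashboard get secret | dashboard_token_secret_name)
#   kubectl -n kubernetes-dashboard get secret "$name" -o jsonpath='{.data.token}' | base64 -d
```

The jsonpath form prints only the token, which is easier to copy than the full describe output.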
References
https://my.oschina.net/u/2539854/blog/3023384
https://juejin.im/post/5d089f49f265da1baa1e7611