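Canary deployments with the NGINX ingress controller and Flagger
Introduction to Flagger
Flagger is a progressive delivery Kubernetes operator that automates the release process for applications running on Kubernetes. It reduces the risk of introducing a new software version in production by gradually shifting traffic to the new version while measuring metrics and running conformance tests.
Flagger uses a service mesh (App Mesh, Istio, Linkerd, Kuma, Open Service Mesh) or an ingress controller (Contour, Gloo, NGINX, Skipper, Traefik) to implement several deployment strategies (canary releases, A/B testing, Blue/Green mirroring).
For release analysis, Flagger can query Prometheus, InfluxDB, Datadog, New Relic, CloudWatch, Stackdriver or Graphite. For alerting, it uses Slack, MS Teams, Discord and Rocket.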
Prerequisites
Flagger requires a Kubernetes cluster v1.19 or newer and NGINX ingress v1.0.2 or newer.
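Before installing anything, it can help to confirm that the versions in use meet these minimums. The following is a minimal sketch of a version comparison helper; the function name `version_ge` is hypothetical, and it assumes `sort -V` (version sort, as shipped with GNU coreutils) is available.

```shell
#!/usr/bin/env bash
# version_ge HAVE WANT: succeeds when version HAVE is greater than or equal to WANT.
# A leading "v" is stripped from both arguments.
# Assumption: `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  local have="${1#v}"
  local want="${2#v}"
  # If WANT sorts first (or both are equal), HAVE is new enough.
  [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n1)" = "$want" ]
}

# Hypothetical usage against a live cluster (not executed here, requires jq):
#   version_ge "$(kubectl version -o json | jq -r .serverVersion.gitVersion)" v1.19.0 \
#     || echo "Kubernetes is too old for Flagger"
```

The helper only compares plain dotted versions; suffixes that some distributions append to the server version are not handled in this sketch.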
Install the NGINX ingress controller with Helm v3:
$ helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
$ kubectl create ns ingress-nginx
$ helm upgrade -i ingress-nginx ingress-nginx/ingress-nginx \
    --namespace ingress-nginx \
    --set controller.metrics.enabled=true \
    --set controller.podAnnotations."prometheus\.io/scrape"=true \
    --set controller.podAnnotations."prometheus\.io/port"=10254
# The output looks similar to the following:
Release "ingress-nginx" has been upgraded. Happy Helming!
NAME: ingress-nginx
LAST DEPLOYED: Tue Dec 20 07:49:42 2022
NAMESPACE: ingress-nginx
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
The ingress-nginx controller has been installed.
It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status by running 'kubectl --namespace ingress-nginx get services -o wide -w ingress-nginx-controller'
An example Ingress that makes use of the controller:
  apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    name: example
    namespace: foo
  spec:
    ingressClassName: nginx
    rules:
      - host: www.example.com
        http:
          paths:
            - pathType: Prefix
              backend:
                service:
                  name: exampleService
                  port:
                    number: 80
              path: /
    # This section is only required if TLS is to be enabled for the Ingress
    tls:
      - hosts:
        - www.example.com
        secretName: example-tls
If TLS is enabled for the Ingress, a Secret containing the certificate and key must also be provided:
  apiVersion: v1
  kind: Secret
  metadata:
    name: example-tls
    namespace: foo
  data:
    tls.crt: <base64 encoded cert>
    tls.key: <base64 encoded key>
  type: kubernetes.io/tls
Install Flagger and the Prometheus add-on in the same namespace as the ingress controller:
$ helm repo add flagger https://flagger.app
"flagger" has been added to your repositories
$ helm upgrade -i flagger flagger/flagger \
    --namespace ingress-nginx \
    --set prometheus.install=true \
    --set meshProvider=nginx
Release "flagger" does not exist. Installing it now.
NAME: flagger
LAST DEPLOYED: Tue Dec 20 07:57:21 2022
NAMESPACE: ingress-nginx
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Flagger installed
$ kubectl get po -n ingress-nginx
NAME READY STATUS RESTARTS AGE
flagger-57df5fbcb9-2w6v2 1/1 Running 0 12m
flagger-prometheus-5d44fbdbb8-hzqmx 1/1 Running 1 (7m42s ago) 12m
ingress-nginx-controller-crjkc 1/1 Running 0 17m
$ kubectl get svc -n ingress-nginx
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
flagger-prometheus ClusterIP 10.100.239.197 <none> 9090/TCP 17h
ingress-nginx-controller LoadBalancer 10.101.203.205 <pending> 80:30216/TCP,443:31852/TCP 102d
ingress-nginx-controller-admission ClusterIP 10.109.225.125 <none> 443/TCP 102d
ingress-nginx-controller-metrics ClusterIP 10.102.84.232 <none> 10254/TCP 17h
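When scripting the installation, the pod listing can be checked automatically instead of by eye. This is a small sketch, assuming the default column layout of `kubectl get po` (NAME, READY, STATUS, ...); the function name `all_pods_ready` is hypothetical.

```shell
#!/usr/bin/env bash
# all_pods_ready: reads `kubectl get po` output on stdin and succeeds only if
# at least one pod is listed and every pod is Running with READY n/n.
# Assumption: the default column layout (NAME READY STATUS ...), header on line 1.
all_pods_ready() {
  awk 'NR > 1 {
         seen = 1
         split($2, ready, "/")
         if ($3 != "Running" || ready[1] != ready[2]) bad = 1
       }
       END {
         if (seen && !bad) exit 0
         exit 1
       }'
}

# Hypothetical usage against a live cluster (not executed here):
#   kubectl get po -n ingress-nginx | all_pods_ready && echo "ingress-nginx namespace is healthy"
```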
Bootstrap
Flagger takes a Kubernetes deployment and, optionally, a horizontal pod autoscaler (HPA), then creates a series of objects (Kubernetes deployments, ClusterIP services and a canary ingress). These objects expose the application outside the cluster and drive the canary analysis and promotion.
Create a test namespace:
$ kubectl create ns test
Create a deployment and a horizontal pod autoscaler (manifests: https://github.com/fluxcd/flagger/tree/main/kustomize/podinfo):
$ kubectl apply -k https://github.com/fluxcd/flagger//kustomize/podinfo?ref=main
deployment.apps/podinfo created
Warning: autoscaling/v2beta2 HorizontalPodAutoscaler is deprecated in v1.23+, unavailable in v1.26+; use autoscaling/v2 HorizontalPodAutoscaler
horizontalpodautoscaler.autoscaling/podinfo created
# Check the podinfo deployment
$ kubectl get po -n test
NAME READY STATUS RESTARTS AGE
podinfo-5d876b68bd-8pvv2 1/1 Running 0 2m56s
podinfo-5d876b68bd-fmsvw 1/1 Running 0 2m57s
Install flagger-loadtester:
$ helm upgrade -i flagger-loadtester flagger/loadtester --namespace=test
Release "flagger-loadtester" does not exist. Installing it now.
NAME: flagger-loadtester
LAST DEPLOYED: Tue Dec 20 08:17:43 2022
NAMESPACE: test
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Flagger's load testing service is available at http://flagger-loadtester.test/
Create an ingress definition, podinfo-ingress.yaml (replace app.example.com with your own domain):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: podinfo
  namespace: test
  labels:
    app: podinfo
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
    - host: "app.example.com"
      http:
        paths:
          - pathType: Prefix
            path: "/"
            backend:
              service:
                name: podinfo
                port:
                  number: 80
$ kubectl apply -f ./podinfo-ingress.yaml
ingress.networking.k8s.io/podinfo created
Create a Canary custom resource, podinfo-canary.yaml (replace app.example.com with your own domain):
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: podinfo
  namespace: test
spec:
  provider: nginx
  # deployment reference
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  # ingress reference
  ingressRef:
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    name: podinfo
  # HPA reference (optional)
  autoscalerRef:
    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    name: podinfo
  # the maximum time in seconds for the canary deployment
  # to make progress before it is rolled back (default 600s)
  progressDeadlineSeconds: 60
  service:
    # ClusterIP port number
    port: 80
    # container port number or name
    targetPort: 9898
  analysis:
    # schedule interval (default 60s)
    interval: 10s
    # max number of failed metric checks before rollback
    threshold: 10
    # max traffic percentage routed to canary
    # percentage (0-100)
    maxWeight: 50
    # canary increment step
    # percentage (0-100)
    stepWeight: 5
    # NGINX Prometheus checks
    metrics:
      - name: request-success-rate
        # minimum req success rate (non 5xx responses)
        # percentage (0-100)
        thresholdRange:
          min: 99
        interval: 1m
    # testing (optional)
    webhooks:
      - name: acceptance-test
        type: pre-rollout
        url: http://flagger-loadtester.test/
        timeout: 30s
        metadata:
          type: bash
          cmd: "curl -sd 'test' http://podinfo-canary/token | grep token"
      - name: load-test
        url: http://flagger-loadtester.test/
        timeout: 5s
        metadata:
          cmd: "hey -z 1m -q 10 -c 2 http://app.example.com/"
$ kubectl apply -f ./podinfo-canary.yaml
canary.flagger.app/podinfo created
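The analysis settings in the Canary spec determine how long a rollout takes. As a rough sketch, the minimum time to validate and promote a canary is interval * (maxWeight / stepWeight), and the time until rollback when checks keep failing is interval * threshold. The helper names below are hypothetical, and the interval is assumed to be given in whole seconds.

```shell
#!/usr/bin/env bash
# Values copied from the Canary spec above; interval is expressed in seconds.
interval=10
max_weight=50
step_weight=5
threshold=10

# min_promote_seconds INTERVAL MAX_WEIGHT STEP_WEIGHT
# Minimum time to validate and promote: interval * (maxWeight / stepWeight).
# Uses integer division, so this assumes maxWeight is a multiple of stepWeight.
min_promote_seconds() { echo $(( $1 * ($2 / $3) )); }

# rollback_seconds INTERVAL THRESHOLD
# Time until rollback when every check fails: interval * threshold.
rollback_seconds() { echo $(( $1 * $2 )); }

echo "promotion takes at least $(min_promote_seconds "$interval" "$max_weight" "$step_weight")s"
echo "rollback after $(rollback_seconds "$interval" "$threshold")s of failing checks"
```

With the values used here, both figures come out to 100 seconds. The real rollout takes longer, because pod startup and the pre-rollout webhook add time before the first weight change.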
At this point the Canary status shows that initialization is complete:
$ kubectl get canary -n test
NAME STATUS WEIGHT LASTTRANSITIONTIME
podinfo Initialized 0 2022-12-21T03:21:49Z
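Once the Canary is initialized, Flagger has generated its companion objects next to the original deployment. The sketch below lists the names one would expect for a given target, based on the `-primary` and `-canary` naming convention described in Flagger's NGINX tutorial; the function name `flagger_generated_objects` is hypothetical, and the list should be treated as an assumption to verify against the cluster.

```shell
#!/usr/bin/env bash
# flagger_generated_objects TARGET: prints the TYPE/NAME pairs Flagger is expected
# to manage for a Canary whose targetRef and ingressRef are both named TARGET.
# Assumption: the naming convention from Flagger's NGINX tutorial
# (<target>-primary and <target>-canary).
flagger_generated_objects() {
  local target="$1"
  printf '%s\n' \
    "deployment/${target}-primary" \
    "horizontalpodautoscaler/${target}-primary" \
    "service/${target}" \
    "service/${target}-canary" \
    "service/${target}-primary" \
    "ingress/${target}-canary"
}

# Hypothetical check against a live cluster (not executed here):
#   flagger_generated_objects podinfo | xargs kubectl -n test get
```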
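Automated canary promotion
Flagger implements a control loop that gradually shifts traffic to the canary while measuring key performance indicators such as the HTTP request success rate, the average request duration and pod health. Based on the analysis of these KPIs, the canary is either promoted or aborted, and the result is published to Slack or MS Teams.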
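To make the control loop concrete, here is a deliberately simplified model of it, not Flagger's actual implementation. It assumes that each successful check advances the weight by `stepWeight`, that a failed check leaves the weight unchanged and increments a failure counter, and that the canary is promoted at `maxWeight` or rolled back once the counter reaches `threshold`. The function name `simulate_canary` is hypothetical.

```shell
#!/usr/bin/env bash
# simulate_canary STEP MAX THRESHOLD CHECK...
# Each CHECK is "ok" or "fail" and stands for one analysis interval.
# Prints the final state; returns 1 only when the canary is rolled back.
# Simplified model for illustration, not Flagger's real algorithm.
simulate_canary() {
  local step="$1"
  local max="$2"
  local threshold="$3"
  shift 3
  local weight=0
  local failed=0
  local check
  for check in "$@"; do
    if [ "$check" = "ok" ]; then
      # A passing check shifts another STEP percent of traffic to the canary.
      weight=$(( weight + step ))
      if [ "$weight" -ge "$max" ]; then
        echo "promoted"
        return 0
      fi
    else
      # A failing check halts the advance and counts toward the rollback threshold.
      failed=$(( failed + 1 ))
      if [ "$failed" -ge "$threshold" ]; then
        echo "rolled back at weight $weight"
        return 1
      fi
    fi
  done
  echo "progressing at weight $weight"
}
```

For example, `simulate_canary 5 50 10 ok ok fail` reports that the rollout is still progressing at weight 10, since one failure is well below the threshold of 10.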
Trigger a canary deployment by updating the container image:
$ kubectl set image deployment/podinfo podinfod=ghcr.io/stefanprodan/podinfo:6.0.1 -n test
deployment.apps/podinfo image updated
$ kubectl get canary -n test
NAME STATUS WEIGHT LASTTRANSITIONTIME
podinfo Progressing 5 2022-12-21T07:40:33Z
# Flagger detects that the deployment revision has changed and starts a new rollout:
$ kubectl describe canary/podinfo -n test
Status:
Canary Weight: 0
Failed Checks: 0
Phase: Succeeded
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Synced 3m flagger New revision detected podinfo.test
Normal Synced 3m flagger Scaling up podinfo.test
Warning Synced 3m flagger Waiting for podinfo.test rollout to finish: 0 of 1 updated replicas are available
Normal Synced 3m flagger Advance podinfo.test canary weight 5
Normal Synced 3m flagger Advance podinfo.test canary weight 10
Normal Synced 3m flagger Advance podinfo.test canary weight 15
Normal Synced 2m flagger Advance podinfo.test canary weight 20
Normal Synced 2m flagger Advance podinfo.test canary weight 25
Normal Synced 1m flagger Advance podinfo.test canary weight 30
Normal Synced 1m flagger Advance podinfo.test canary weight 35
Normal Synced 55s flagger Advance podinfo.test canary weight 40
Normal Synced 45s flagger Advance podinfo.test canary weight 45
Normal Synced 35s flagger Advance podinfo.test canary weight 50
Normal Synced 25s flagger Copying podinfo.test template spec to podinfo-primary.test
Warning Synced 15s flagger Waiting for podinfo-primary.test rollout to finish: 1 of 2 updated replicas are available
Normal Synced 5s flagger Promotion completed! Scaling down podinfo.test
Note that if you apply new changes to the deployment during the canary analysis, Flagger will restart the analysis.
You can monitor all canaries with:
$ watch kubectl get canaries --all-namespaces
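In a script or CI job, watching interactively is not an option, so the status column has to be read programmatically. The sketch below assumes the column layout shown earlier for `kubectl get canary -n test` (NAME, STATUS, WEIGHT, ...) and treats `Succeeded` and `Failed` as the final phases; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# canary_phase NAME: reads `kubectl get canary -n <ns>` output on stdin and
# prints the STATUS column of the canary called NAME.
# Assumption: single-namespace output, so NAME is the first column.
canary_phase() {
  awk -v name="$1" 'NR > 1 && $1 == name { print $2 }'
}

# phase_is_terminal PHASE: succeeds once the rollout has finished either way.
phase_is_terminal() {
  case "$1" in
    Succeeded|Failed) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical polling loop against a live cluster (not executed here):
#   until phase_is_terminal "$(kubectl -n test get canary | canary_phase podinfo)"; do
#     sleep 10
#   done
```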
This completes the automated canary deployment with Flagger and ingress-nginx.