Download Flink 1.11.3:
https://flink.apache.org/downloads.html
https://www.apache.org/dyn/closer.lua/flink/flink-1.11.3/flink-1.11.3-bin-scala_2.12.tgz
wget https://downloads.apache.org/flink/flink-1.11.3/flink-1.11.3-bin-scala_2.12.tgz
tar -xzvf flink-1.11.3-bin-scala_2.12.tgz
cd flink-1.11.3
Start a standalone (single-machine) Flink cluster:
./bin/start-cluster.sh
Add Prometheus metrics support:
cp plugins/metrics-prometheus/flink-metrics-prometheus-1.11.3.jar lib/
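Edit the flink-conf.yaml configuration file. All JobManagers and TaskManagers will expose metrics on the configured port; when a JobManager and a TaskManager run on the same host, as in this local setup, the Flink documentation advises a port range such as 9250-9260 so that each process can bind its own port: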
vim conf/flink-conf.yaml
metrics.reporters: prom
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9999
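Instead of editing the file by hand in vim, the same settings can be applied from a script. The following is a minimal sketch, assuming a bash shell and a flat "key: value" file; set_conf is a hypothetical helper name, not part of Flink.

```shell
#!/usr/bin/env bash
# Sketch: set "key: value" in a flat YAML file such as conf/flink-conf.yaml.
# set_conf is a hypothetical helper, not a Flink tool.
set_conf() {
  local file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp)"
  # Keep every line that does not start with "key:" (drops an old setting, if any).
  awk -v k="$key" 'index($0, k ":") != 1' "$file" > "$tmp"
  # Append the new setting.
  printf '%s: %s\n' "$key" "$value" >> "$tmp"
  # Write back in place so the original file keeps its permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Usage, run from the flink-1.11.3 directory:
#   set_conf conf/flink-conf.yaml metrics.reporters prom
#   set_conf conf/flink-conf.yaml metrics.reporter.prom.class org.apache.flink.metrics.prometheus.PrometheusReporter
#   set_conf conf/flink-conf.yaml metrics.reporter.prom.port 9999
```

Running the helper twice with the same key leaves a single line for that key, so the script can be re-run safely.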
Stop Flink and start it again:
./bin/stop-cluster.sh
./bin/start-cluster.sh
Access the metrics:
http://127.0.0.1:9999/metrics
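The endpoint can also be checked from the command line. This is a small sketch, assuming curl is installed and the reporter is listening on port 9999 as configured above; count_flink_metrics is a hypothetical helper name. It counts the sample lines whose metric name starts with flink_, the prefix used by Flink's Prometheus reporter.

```shell
#!/usr/bin/env bash
# Sketch: count Flink samples in Prometheus text format read from stdin.
# count_flink_metrics is a hypothetical helper, not a Flink or Prometheus tool.
count_flink_metrics() {
  # "# HELP" and "# TYPE" comment lines start with "#", so they are not counted.
  # grep -c exits non-zero when the count is 0; "|| true" keeps the function quiet about it.
  grep -c '^flink_' || true
}

# Usage:
#   curl -s http://127.0.0.1:9999/metrics | count_flink_metrics
```

A count of 0 suggests the reporter is not loaded or that another process is answering on that port.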
Run a test demo:
Open a terminal and run the following command:
nc -l 9000
Then submit the demo to the local Flink cluster:
./bin/flink run examples/streaming/SocketWindowWordCount.jar --port 9000
In the first terminal (the one running nc), type some words and press Enter.
Open the Flink TaskManager page:
http://127.0.0.1:8081/#/task-manager
Click the running task to view its logs:
The output there shows that the demo is running normally.
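The same check can be made without the web UI by querying Flink's REST API, which is served on the same port 8081. This is a rough sketch, assuming curl is available and that the /jobs/overview response is compact JSON with a "state" field per job; count_running_jobs is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: count jobs reported as RUNNING in a /jobs/overview JSON response read from stdin.
# count_running_jobs is a hypothetical helper; it does plain text matching, not JSON parsing.
count_running_jobs() {
  # Print each match on its own line, then count the lines.
  { grep -o '"state":"RUNNING"' || true; } | wc -l | tr -d ' '
}

# Usage:
#   curl -s http://127.0.0.1:8081/jobs/overview | count_running_jobs
```

With only the SocketWindowWordCount demo submitted, the expected count is 1. A JSON-aware tool would be more robust if the response is pretty-printed.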
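However, collecting metrics this way has a problem: on YARN, tasks are scheduled onto different nodes, so IP addresses cannot be hard-coded in the Prometheus configuration. Instead, the Flink jobs and tasks have to push their metrics to a Pushgateway, and Prometheus then scrapes the Pushgateway periodically. This requires the following changes to flink-conf.yaml (if metrics.reporters is still set to prom, change it to promgateway or remove that line so that the new reporter is enabled):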
metrics.reporter.promgateway.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
metrics.reporter.promgateway.host: 192.168.111.64
metrics.reporter.promgateway.port: 9091
metrics.reporter.promgateway.jobName: flink
metrics.reporter.promgateway.randomJobNameSuffix: true
metrics.reporter.promgateway.deleteOnShutdown: false
metrics.reporter.promgateway.groupingKey: env=test;app=flink-standalone
metrics.reporter.promgateway.interval: 30 SECONDS
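After Flink has been restarted with these settings, the Pushgateway's own /metrics endpoint can be used to confirm that something arrived. This is a minimal sketch, assuming curl is installed and the Pushgateway runs at the host and port configured above; list_pushed_jobs is a hypothetical helper name. Because randomJobNameSuffix is true, each Flink process pushes under its own job label that starts with the configured jobName.

```shell
#!/usr/bin/env bash
# Sketch: list the distinct job labels with a given prefix in Pushgateway output read from stdin.
# list_pushed_jobs is a hypothetical helper, not part of Flink or the Pushgateway.
list_pushed_jobs() {
  local prefix="$1"
  # Extract job="<prefix>..." label pairs and de-duplicate them.
  { grep -o "job=\"${prefix}[^\"]*\"" || true; } | sort -u
}

# Usage:
#   curl -s http://192.168.111.64:9091/metrics | list_pushed_jobs flink
```

One line per pushing process is expected, for example one for the JobManager and one for each TaskManager.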
Start Flink again; the pushed metrics now appear in the Prometheus Pushgateway:
In Prometheus, the collected metrics are visible:
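For Prometheus to see these metrics, it needs a scrape job that targets the Pushgateway. The sketch below prints such a fragment for prometheus.yml; the job name pushgateway and the helper name print_pushgateway_scrape_config are assumptions, and the fragment has to be merged into any existing scrape_configs section by hand. Setting honor_labels to true keeps the job labels pushed by Flink instead of overwriting them with the scrape job's name.

```shell
#!/usr/bin/env bash
# Sketch: print a Prometheus scrape_configs fragment for a Pushgateway.
# print_pushgateway_scrape_config is a hypothetical helper; the job_name is an assumed value.
print_pushgateway_scrape_config() {
  # $1: Pushgateway address as host:port
  cat <<EOF
scrape_configs:
  - job_name: pushgateway
    honor_labels: true
    static_configs:
      - targets: ['$1']
EOF
}

# Usage:
#   print_pushgateway_scrape_config 192.168.111.64:9091
```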
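Now that data is being collected, He Pengyuan will build the monitoring dashboards in Grafana, then observe how the metrics change and settle on the alerting formulas.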
Related documentation:
Apache Flink documentation
https://ci.apache.org/projects/flink/flink-docs-release-1.11/zh/
Local cluster
https://ci.apache.org/projects/flink/flink-docs-release-1.11/zh/ops/deployment/local.html
Plugins
https://ci.apache.org/projects/flink/flink-docs-release-1.11/zh/ops/plugins.html
Flink and Prometheus: Cloud-native monitoring of streaming applications
https://flink.apache.org/features/2019/03/11/prometheus-monitoring.html
Metrics
https://ci.apache.org/projects/flink/flink-docs-release-1.11/monitoring/metrics.html
Getting started with Flink: test demo
https://www.it610.com/article/1294877495442612224.htm