Setting Up a Highly Available Flink JobManager (HA)

2021-06-29 11:32:03

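The JobManager coordinates the deployment of every Flink application and is responsible for scheduling and resource management. By default each Flink cluster has a single JobManager. If that JobManager fails, new jobs cannot be submitted and running jobs fail, which makes it a single point of failure. That is why a highly available JobManager setup is needed.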

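The idea is similar to ZooKeeper: once a highly available JobManager setup is in place, if the active JobManager fails, one of the remaining JobManagers takes over and becomes the leader, so Flink jobs do not fail. JobManager HA can be set up for both single-machine and cluster deployments.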
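The takeover behaviour described above can be illustrated with a toy model. This is only a sketch: the `JobManager` and `HaCluster` classes and the "first live candidate wins" rule are assumptions made for illustration. In a real deployment the election is carried out through ZooKeeper, not by code like this.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class JobManager:
    name: str
    alive: bool = True


@dataclass
class HaCluster:
    """Toy model: the first live JobManager in the candidate list is the leader."""
    candidates: List[JobManager] = field(default_factory=list)

    def leader(self) -> Optional[JobManager]:
        for jm in self.candidates:
            if jm.alive:
                return jm
        return None  # no live JobManager left: nothing can accept jobs

    def fail(self, name: str) -> None:
        # Mark the named JobManager as dead (hypothetical failure injection).
        for jm in self.candidates:
            if jm.name == name:
                jm.alive = False


cluster = HaCluster([JobManager("jm-1"), JobManager("jm-2")])
first_leader = cluster.leader().name
cluster.fail("jm-1")  # the current leader goes down
second_leader = cluster.leader().name  # the standby takes over
print(first_leader, "->", second_leader)
```

With only one candidate in the list, failing it leaves `leader()` returning `None`, which is the single point of failure the HA setup is meant to remove.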

Flink HA comes in two main flavors: HA for a standalone Flink deployment, and HA for Flink on YARN.

1. Flink standalone deployment (Standalone mode)

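Let's start with a timeline diagram found online to illustrate the idea.

As the diagram shows, at least two separate JobManager processes need to be started.

Now let's look at the configuration. In a standalone cluster the JobManagers are listed in conf/masters, one host:webui-port entry per line: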

localhost:8081
localhost:8082

The HA settings themselves are covered in section 3 below.
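As a quick sanity check, the JobManager list from this section can be parsed to confirm that at least two entries are present. This is a minimal sketch: `parse_masters` is a hypothetical helper, not part of Flink, and it assumes every entry is written as host:port.

```python
from typing import List, Tuple


def parse_masters(text: str) -> List[Tuple[str, int]]:
    """Parse masters-style lines of the form host:webui-port."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        host, _, port = line.rpartition(":")
        if not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {line!r}")
        entries.append((host, int(port)))
    return entries


# The two entries shown in this section.
masters = parse_masters("localhost:8081\nlocalhost:8082\n")
has_standby = len(masters) >= 2  # standalone HA needs at least one standby
print(masters, has_standby)
```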

2. On-YARN mode (YARN session / YARN per-job / application mode)

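Whether you use YARN session mode, YARN per-job mode, or application mode, only one JobManager process is running at any given time. There is no standby JobManager; when the JobManager fails, YARN starts a new one, up to the limit set by yarn.application-attempts.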
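The effect of an attempt limit on a single-JobManager setup can be sketched as follows. This is a toy model under stated assumptions: `run_with_attempts` is a hypothetical function, and it assumes the initial start counts as the first attempt, so a limit of 10 allows 9 restarts.

```python
from typing import Optional


def run_with_attempts(max_attempts: int, failures: int) -> Optional[int]:
    """Return the attempt number still running, or None if the limit is exhausted.

    `failures` is the number of times the JobManager crashes.
    """
    attempt = 1  # the initial start counts toward the limit
    for _ in range(failures):
        if attempt >= max_attempts:
            return None  # no attempts left: the application fails
        attempt += 1  # a replacement JobManager is started
    return attempt


print(run_with_attempts(max_attempts=10, failures=3))
print(run_with_attempts(max_attempts=10, failures=10))
```

With a limit of 1, the first crash already ends the application, which is why the limit is usually raised when HA is enabled on YARN.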

3. Common configuration

# The high-availability mode. Possible options are 'NONE' or 'zookeeper'.
#
high-availability: zookeeper
 
# The path where metadata for master recovery is persisted. While ZooKeeper stores
# the small ground truth for checkpoint and leader election, this location stores
# the larger objects, like persisted dataflow graphs.
#
# Must be a durable file system that is accessible from all nodes
# (like HDFS, S3, Ceph, nfs, ...)
#
high-availability.storageDir: hdfs:///flink/ha/
 
high-availability.zookeeper.path.root: /flink
 
# The list of ZooKeeper quorum peers that coordinate the high-availability
# setup. This must be a list of the form:
# "host1:clientPort,host2:clientPort,..." (default clientPort: 2181)
#
high-availability.zookeeper.quorum: localhost:3181
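# Optional: ID that separates multiple Flink clusters sharing the same ZooKeeper root.
#high-availability.cluster-id: /cluster_one

# YARN only: maximum number of application attempts (initial start plus restarts).
yarn.application-attempts: 10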

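Before starting the cluster, the HA keys from the configuration above can be checked for obvious omissions. The sketch below is an illustration only: `parse_flat_config` and `check_ha` are hypothetical helpers, and the parser assumes flat `key: value` lines rather than full YAML.

```python
from typing import Dict, List


def parse_flat_config(text: str) -> Dict[str, str]:
    """Parse 'key: value' lines, skipping blank lines and '#' comments."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(": ")
        if sep:
            conf[key.strip()] = value.strip()
    return conf


def check_ha(conf: Dict[str, str]) -> List[str]:
    """Return a list of problems found in the ZooKeeper HA settings."""
    problems = []
    if conf.get("high-availability", "NONE").lower() != "zookeeper":
        return problems  # HA is not enabled, nothing to check
    if not conf.get("high-availability.storageDir"):
        problems.append("high-availability.storageDir is not set")
    quorum = conf.get("high-availability.zookeeper.quorum", "")
    if not quorum:
        problems.append("high-availability.zookeeper.quorum is not set")
    for peer in filter(None, quorum.split(",")):
        host, _, port = peer.strip().rpartition(":")
        if not host or not port.isdigit():
            problems.append(f"quorum peer {peer!r} is not host:clientPort")
    return problems


sample = """
high-availability: zookeeper
high-availability.storageDir: hdfs:///flink/ha/
high-availability.zookeeper.path.root: /flink
high-availability.zookeeper.quorum: localhost:3181
yarn.application-attempts: 10
"""
conf = parse_flat_config(sample)
problems = check_ha(conf)
print(problems)
```

The check only looks at whether the keys are present and well-formed. It does not verify that the storage directory is reachable from all nodes or that the ZooKeeper quorum is actually listening.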