Contents:
(1). Flink's Hadoop version support
(2). Deploying Flink 1.13.1 without Hadoop
(3). Deploying Flink 1.13.1 with Hadoop
(4). References
(1). Flink's Hadoop version support
Since version 1.11.0, Flink has supported Hadoop 3.x. Concretely, you just set HADOOP_CLASSPATH to the Hadoop 3 jars on the machine where Flink runs.
Flink on YARN is compiled against Hadoop 2.4.1 and supports all Hadoop versions >= 2.4.1, including Hadoop 3.x.
So Flink 1.13.1 on Hadoop 3.3.1 is officially supported.
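The official Hadoop Integration docs (see references) derive HADOOP_CLASSPATH from the hadoop CLI itself; a minimal example, assuming the hadoop command is on the PATH:
export HADOOP_CLASSPATH=`hadoop classpath`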
(2). Deploying Flink 1.13.1 without Hadoop
Download:
https://flink.apache.org/downloads.html#apache-flink-1131
https://www.apache.org/dyn/closer.lua/flink/flink-1.13.1/flink-1.13.1-bin-scala_2.12.tgz
wget https://mirrors.ocf.berkeley.edu/apache/flink/flink-1.13.1/flink-1.13.1-bin-scala_2.12.tgz
Extract:
tar -xzvf flink-1.13.1-bin-scala_2.12.tgz
Then move it to the /app/3rd directory (the steps below assume the extracted flink-1.13.1 directory has been renamed to flink) and cd into it:
cd /app/3rd/flink
Enable the Prometheus metrics reporter:
cp plugins/metrics-prometheus/flink-metrics-prometheus-1.13.1.jar lib/
vim conf/flink-conf.yaml
metrics.reporters: prom
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9999
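For reference, a minimal sketch of the matching scrape job on the Prometheus side (the job_name and the prometheus.yml location are assumptions; only the target host:port comes from the config above):
scrape_configs:
  - job_name: 'flink'
    static_configs:
      - targets: ['127.0.0.1:9999']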
Then start Flink:
bin/start-cluster.sh
Check and verify the Flink processes with jps:
TaskManagerRunner and StandaloneSessionClusterEntrypoint are the Flink processes.
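The jps output should look roughly like this (the PIDs are illustrative):
$ jps
21793 StandaloneSessionClusterEntrypoint
22085 TaskManagerRunner
22337 Jps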
Verify the Flink metrics:
http://127.0.0.1:9999/metrics
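A quick check from the shell, assuming curl is installed:
curl -s http://127.0.0.1:9999/metrics | head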
Verify that Flink works by running a test demo.
Open a terminal and run:
nc -l 6000
Then submit the demo to the local Flink cluster:
bin/flink run examples/streaming/SocketWindowWordCount.jar --port 6000
Type some words into the first terminal (the one running nc) and press Enter.
Open the Flink TaskManager page:
http://127.0.0.1:8081/#/task-manager
Click the running task to view its logs:
If the demo's output shows up there, the demo is running correctly.
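As an illustration (the words and counts are examples), typing the following into the nc terminal:
hello flink
hello
should produce output along these lines in the TaskManager's stdout, since the demo emits one "word : count" line per 5-second window:
hello : 2
flink : 1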
(3). Deploying Flink 1.13.1 with Hadoop
For a production Hadoop deployment, see:
hadoop-3: Building a production-grade Hadoop/Flink cluster on AWS the native way
hadoop-4: Production-grade tuning of a Hadoop/Flink real-time compute cluster
Check the environment variables:
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.292.b10-1.el7_9.x86_64
export HADOOP_HOME=/app/3rd/hadoop-3.3.1
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_CLASSPATH=$HADOOP_COMMON_HOME/lib:$HADOOP_HOME/share/hadoop/yarn/*:$HADOOP_HOME/share/hadoop/common/*:$HADOOP_HOME/share/hadoop/mapreduce/*:$HADOOP_HOME/share/hadoop/hdfs/*:$HADOOP_HOME/share/tools/*:$HADOOP_HOME/share/hadoop/httpfs/*:$HADOOP_HOME/share/hadoop/kms/*
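A quick sanity check after adding the exports above to your shell profile (the profile path is an assumption):
source ~/.bashrc
hadoop version
echo $HADOOP_CLASSPATH | tr ':' '\n' | head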
In addition, you need to add two jars to the lib directory of the Flink installation on the server, otherwise some "class not found" errors are thrown:
flink-shaded-hadoop-3-uber-3.1.1.7.2.1.0-327-9.0.jar
commons-cli-1.4.jar
Download the former from:
https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-hadoop-3-uber/3.1.1.7.2.1.0-327-9.0
Download the latter from:
https://mvnrepository.com/artifact/commons-cli/commons-cli/1.4
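After downloading, copy the two jars into Flink's lib directory and restart the cluster (the paths follow the /app/3rd/flink layout from section (2); adjust the source paths of the downloaded jars as needed):
cd /app/3rd/flink
cp ~/flink-shaded-hadoop-3-uber-3.1.1.7.2.1.0-327-9.0.jar lib/
cp ~/commons-cli-1.4.jar lib/
bin/stop-cluster.sh && bin/start-cluster.sh
As a rough smoke test on YARN, assuming HDFS and YARN are already running and with nc -l 6000 started as in section (2), you can submit the same demo in per-job mode; <nc-host> is a placeholder for the machine running nc:
bin/flink run -t yarn-per-job examples/streaming/SocketWindowWordCount.jar --hostname <nc-host> --port 6000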
(4). References
Hadoop Integration
https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/hadoop.html
flink on yarn
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/yarn/
A solution for integrating Flink 1.13 with Hadoop 3.x
https://www.bilibili.com/read/cv11506216
Flink's support for and integration with Hadoop 3.x
https://blog.csdn.net/penriver/article/details/116131889