Chapter 5 YARN: Resource Scheduling Platform
5.2 YARN Parameter Reference and Tuning
Default parameters for yarn-site.xml: http://hadoop.apache.org/docs/r2.7.3/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
5.2.1 ResourceManager Configuration Parameters
Parameter | Default | Description |
---|---|---|
yarn.resourcemanager.address | ${yarn.resourcemanager.hostname}:8032 | Address the ResourceManager exposes to clients |
yarn.resourcemanager.scheduler.address | ${yarn.resourcemanager.hostname}:8030 | Address the ResourceManager exposes to ApplicationMasters |
yarn.resourcemanager.resource-tracker.address | ${yarn.resourcemanager.hostname}:8031 | Address the ResourceManager exposes to NodeManagers |
yarn.resourcemanager.admin.address | ${yarn.resourcemanager.hostname}:8033 | Address the ResourceManager exposes to administrators |
yarn.resourcemanager.webapp.address | ${yarn.resourcemanager.hostname}:8088 | Address of the ResourceManager web UI |
yarn.resourcemanager.scheduler.class | ..capacity.CapacityScheduler | Main class of the resource scheduler in use; the available schedulers are FIFO, Capacity Scheduler, and Fair Scheduler |
yarn.resourcemanager.resource-tracker.client.thread-count | 50 | Number of handler threads for RPC requests from NodeManagers |
yarn.resourcemanager.scheduler.client.thread-count | 50 | Number of handler threads for RPC requests from ApplicationMasters |
yarn.scheduler.minimum-allocation-mb | 1024 | Minimum memory (MB) that a single container request can be allocated |
yarn.scheduler.maximum-allocation-mb | 8192 | Maximum memory (MB) that a single container request can be allocated |
yarn.scheduler.minimum-allocation-vcores | 1 | Minimum number of virtual CPU cores that a single container request can be allocated |
yarn.scheduler.maximum-allocation-vcores | 32 | Maximum number of virtual CPU cores that a single container request can be allocated |
yarn.resourcemanager.nodemanagers.heartbeat-interval-ms | 1000 (ms) | NodeManager heartbeat interval |
Full name of ..capacity.CapacityScheduler: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
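To override any of the defaults above, the property is written into yarn-site.xml as a `<property>` element inside `<configuration>`. The following is a minimal sketch that renders such overrides with Python's standard library. The helper name `render_properties` is hypothetical, and the chosen values are illustrations only, not recommendations.

```python
import xml.etree.ElementTree as ET


def render_properties(overrides):
    """Render a dict of overrides as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in overrides.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


# Illustrative overrides only; pick values that fit your own cluster.
overrides = {
    "yarn.resourcemanager.scheduler.class":
        "org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler",
    "yarn.scheduler.maximum-allocation-mb": 4608,
}
xml_text = render_properties(overrides)
print(xml_text)
```

The output is a single unindented line; it would still need to be merged by hand into an existing yarn-site.xml rather than overwriting it.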
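5.2.2 NodeManager Configuration Parameters
Parameter | Default | Description |
---|---|---|
yarn.nodemanager.resource.memory-mb | 8192 | Total physical memory (MB) available to the NodeManager (this value should normally be set explicitly) |
yarn.nodemanager.vmem-pmem-ratio | 2.1 | Maximum virtual memory that may be used for each 1 MB of physical memory |
yarn.nodemanager.resource.cpu-vcores | 8 | Total number of virtual CPU cores available to the NodeManager |
yarn.nodemanager.local-dirs | ${hadoop.tmp.dir}/nm-local-dir | Where intermediate results are stored; this is usually set to several directories to spread the disk I/O load |
yarn.nodemanager.log-dirs | ${yarn.log.dir}/userlogs | Where logs are stored (multiple directories may be configured) |
yarn.nodemanager.log.retain-seconds | 10800 | Maximum time (seconds) that logs are kept on the NodeManager (only applies when log aggregation is disabled) |
yarn.nodemanager.aux-services | (empty) | Auxiliary services run on the NodeManager; must be set to mapreduce_shuffle before MapReduce programs can run |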
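As a worked example of how these defaults interact, the sketch below computes how many minimum-size containers fit on one node by memory alone, and the virtual memory ceiling of a single container. The function names are hypothetical, and the calculation deliberately ignores CPU limits and any memory reserved for the operating system or other services.

```python
# Defaults taken from the parameter tables in this section.
NODE_MEMORY_MB = 8192      # yarn.nodemanager.resource.memory-mb
MIN_ALLOCATION_MB = 1024   # yarn.scheduler.minimum-allocation-mb
VMEM_PMEM_RATIO = 2.1      # yarn.nodemanager.vmem-pmem-ratio


def max_min_size_containers(node_memory_mb, min_allocation_mb):
    """Upper bound on concurrent containers when each uses the minimum memory."""
    return node_memory_mb // min_allocation_mb


def vmem_limit_mb(container_pmem_mb, ratio):
    """Virtual memory ceiling for a container with the given physical memory."""
    return container_pmem_mb * ratio


print(max_min_size_containers(NODE_MEMORY_MB, MIN_ALLOCATION_MB))
print(vmem_limit_mb(MIN_ALLOCATION_MB, VMEM_PMEM_RATIO))
```

With the defaults this gives 8 containers per node and roughly 2150 MB of virtual memory per 1024 MB container.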
5.2.3 mapred-site.xml
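Parameter | Default | Description |
---|---|---|
mapreduce.job.reduces | 1 | Default number of reduce tasks per job |
mapreduce.job.maps | 2 | Default number of map tasks per job |
mapreduce.task.io.sort.factor | 10 | Number of files merged at once when a Reduce Task merges small files |
mapreduce.task.io.sort.mb | 100 | Memory (MB) used by the Map Task sort buffer |
mapred.child.java.opts | -Xmx200m | JVM options for the child task processes; -Xmx caps the heap each child JVM may use |
mapreduce.jobtracker.handler.count | 10 | Number of JobTracker handler threads that serve concurrent RPC requests from TaskTrackers; commonly set to about 4% of the number of TaskTracker nodes |
mapreduce.reduce.shuffle.parallelcopies | 5 | Number of parallel transfers during the reduce shuffle phase |
mapreduce.tasktracker.http.threads | 40 | Map output is transferred to reducers over HTTP; this sets the number of threads serving those transfers |
mapreduce.map.output.compress | false | Whether map output is compressed. Compression costs extra CPU but shortens transfer time; without it, more network bandwidth is needed |
mapreduce.reduce.shuffle.merge.percent | 0.66 | Fraction of the memory allocated for received map outputs at which the reduce side starts merging |
mapreduce.reduce.shuffle.memory.limit.percent | 0.25 | Maximum share of that memory that a single shuffle may use |
mapreduce.job.jvm.numtasks | 1 | Number of tasks of the same type that one JVM may run consecutively; -1 means no limit |
mapreduce.tasktracker.reduce.tasks.maximum | 2 | Number of reduce tasks a TaskTracker runs concurrently; the number of CPU cores is a suggested value |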
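Because the map-side sort buffer (mapreduce.task.io.sort.mb) is allocated inside the task JVM, it has to fit within the heap set by -Xmx. The sketch below is a simple sanity check for that relationship, not part of Hadoop; both function names are hypothetical, and it only understands a plain `-Xmx<number><unit>` flag.

```python
import re


def parse_xmx_mb(java_opts):
    """Return the -Xmx heap size in MB, or None if no -Xmx flag is present."""
    match = re.search(r"-Xmx(\d+)([kKmMgG]?)", java_opts)
    if match is None:
        return None
    size, unit = int(match.group(1)), match.group(2).lower()
    # No unit means bytes; convert everything to MB.
    factor = {"": 1 / (1024 * 1024), "k": 1 / 1024, "m": 1, "g": 1024}[unit]
    return size * factor


def sort_buffer_fits(java_opts, io_sort_mb):
    """True when the sort buffer is strictly smaller than the task heap."""
    heap_mb = parse_xmx_mb(java_opts)
    return heap_mb is not None and io_sort_mb < heap_mb


# Defaults from the table above: -Xmx200m heap and a 100 MB sort buffer.
print(sort_buffer_fits("-Xmx200m", 100))
```

A buffer that merely fits can still leave too little heap for the task itself, so this check is a lower bar rather than a sizing rule.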
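5.2.4 Parameter Tuning
See http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/index.html
Click the PDF icon next to the "Command Line Installation" link to open the HDP installation guide.
Click the entry "1.10. Determining HDP Memory Configuration Settings" to jump to the corresponding page.
Click the "Download Companion Files" link, which shows two commands: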
wget http://public-repo-1.hortonworks.com/HDP/tools/2.6.0.3/hdp_manual_install_rpm_helper_files-2.6.0.3.8.tar.gz
tar zxvf hdp_manual_install_rpm_helper_files-2.6.0.3.8.tar.gz
Download:
[root@node1 ~]# wget http://public-repo-1.hortonworks.com/HDP/tools/2.6.0.3/hdp_manual_install_rpm_helper_files-2.6.0.3.8.tar.gz
--2017-05-23 23:26:13-- http://public-repo-1.hortonworks.com/HDP/tools/2.6.0.3/hdp_manual_install_rpm_helper_files-2.6.0.3.8.tar.gz
Resolving public-repo-1.hortonworks.com (public-repo-1.hortonworks.com)... 52.84.167.222, 52.84.167.38, 52.84.167.49, ...
Connecting to public-repo-1.hortonworks.com (public-repo-1.hortonworks.com)|52.84.167.222|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 85173 (83K) [application/x-tar]
Saving to: ‘hdp_manual_install_rpm_helper_files-2.6.0.3.8.tar.gz’
100%[==================================================================================================================================>] 85,173 132KB/s in 0.6s
2017-05-23 23:26:14 (132 KB/s) - ‘hdp_manual_install_rpm_helper_files-2.6.0.3.8.tar.gz’ saved [85173/85173]
Extract the archive:
[root@node1 ~]# tar -zxvf hdp_manual_install_rpm_helper_files-2.6.0.3.8.tar.gz
hdp_manual_install_rpm_helper_files-2.6.0.3.8/
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/
.....
2.6.0.3.8/configuration_files/zookeeper/configuration.xsl
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/zookeeper/log4j.properties
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/zookeeper/zoo.cfg
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/pig/
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/pig/pig-env.sh
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/pig/pig.properties
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/pig/log4j.properties
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/oozie/
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/oozie/oozie-log4j.properties
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/oozie/oozie-env.sh
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/oozie/oozie-site.xml
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/oozie/adminusers.txt
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/oozie/hadoop-config.xml
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/oozie/oozie-default.xml
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/hbase-policy.xml
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/regionservers
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/hadoop-metrics.properties
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/hbase-env.sh
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/hadoop-metrics.properties.master-GANGLIA
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/hadoop-metrics.properties.regionservers-GANGLIA
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/hbase-site.xml
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/log4j.properties
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/webhcat/
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/webhcat/webhcat-env.sh
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/webhcat/webhcat-site.xml
hdp_manual_install_rpm_helper_files-2.6.0.3.8/scripts/
hdp_manual_install_rpm_helper_files-2.6.0.3.8/scripts/directories.sh
hdp_manual_install_rpm_helper_files-2.6.0.3.8/scripts/yarn-utils.py
hdp_manual_install_rpm_helper_files-2.6.0.3.8/scripts/usersAndGroups.sh
hdp_manual_install_rpm_helper_files-2.6.0.3.8/readme.txt
hdp_manual_install_rpm_helper_files-2.6.0.3.8/HDP-CHANGES.txt
[root@node1 ~]#
[root@node1 ~]# cd hdp_manual_install_rpm_helper_files-2.6.0.3.8
[root@node1 hdp_manual_install_rpm_helper_files-2.6.0.3.8]# ls
configuration_files HDP-CHANGES.txt readme.txt scripts
[root@nb0 hdp_manual_install_rpm_helper_files-2.6.0.3.8]# cd scripts/
[root@nb0 scripts]# ls
directories.sh usersAndGroups.sh yarn-utils.py
Suppose a node has 4 cores, 8 GB of memory, and 1 disk, and HBase is also installed on it. The following command prints the recommended settings:
[root@nb0 scripts]# python yarn-utils.py -c 4 -m 8 -d 1 -k True
Using cores=4 memory=8GB disks=1 hbase=True
Profile: cores=4 memory=5120MB reserved=3GB usableMem=5GB disks=1
Num Container=3
Container Ram=1536MB
Used Ram=4GB
Unused Ram=3GB
yarn.scheduler.minimum-allocation-mb=1536
yarn.scheduler.maximum-allocation-mb=4608
yarn.nodemanager.resource.memory-mb=4608
mapreduce.map.memory.mb=1536
mapreduce.map.java.opts=-Xmx1228m
mapreduce.reduce.memory.mb=3072
mapreduce.reduce.java.opts=-Xmx2457m
yarn.app.mapreduce.am.resource.mb=3072
yarn.app.mapreduce.am.command-opts=-Xmx2457m
mapreduce.task.io.sort.mb=614
Option descriptions
Option | Description |
---|---|
-c | Number of cores on each host |
-m | Total memory on each host (GB) |
-d | Number of disks on each host |
-k | "True" if HBase is installed, otherwise "False" |
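The settings printed by the run above are all multiples of two numbers in that output: Num Container=3 and Container Ram=1536MB. The sketch below reproduces the printed values from those two inputs. The 0.8 heap ratio, the 0.4 sort-buffer ratio, and the factor of 2 for reduce and ApplicationMaster containers are inferred from the printed output, not read from yarn-utils.py, and `derive_settings` is a hypothetical name.

```python
def derive_settings(num_containers, container_ram_mb,
                    heap_ratio=0.8, sort_ratio=0.4):
    """Relate the printed settings to container count and container size.

    The ratios are assumptions inferred from the sample output.
    """
    map_mb = container_ram_mb
    reduce_mb = 2 * container_ram_mb
    am_mb = 2 * container_ram_mb
    total_mb = num_containers * container_ram_mb
    return {
        "yarn.scheduler.minimum-allocation-mb": container_ram_mb,
        "yarn.scheduler.maximum-allocation-mb": total_mb,
        "yarn.nodemanager.resource.memory-mb": total_mb,
        "mapreduce.map.memory.mb": map_mb,
        "mapreduce.map.java.opts": "-Xmx%dm" % int(heap_ratio * map_mb),
        "mapreduce.reduce.memory.mb": reduce_mb,
        "mapreduce.reduce.java.opts": "-Xmx%dm" % int(heap_ratio * reduce_mb),
        "yarn.app.mapreduce.am.resource.mb": am_mb,
        "yarn.app.mapreduce.am.command-opts": "-Xmx%dm" % int(heap_ratio * am_mb),
        "mapreduce.task.io.sort.mb": int(sort_ratio * container_ram_mb),
    }


# Inputs copied from the sample output: 3 containers of 1536 MB each.
settings = derive_settings(3, 1536)
for name, value in settings.items():
    print("%s=%s" % (name, value))
```

Keeping the JVM heap at about 80% of the container size leaves headroom for non-heap memory, which is why the -Xmx values are smaller than the matching memory.mb values.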