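Preface
The YARN ResourceManager REST API can return XML or JSON, depending on the request's Accept header. The responses shown in this article are XML, so the client has to parse XML.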
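As a rough illustration of calling the API and parsing its XML, here is a minimal sketch using only the Python standard library. The host `hadoop02:8088` comes from the examples in this article; the helper names `build_request` and `fetch_xml` are made up for illustration, and `fetch_xml` only works with a reachable ResourceManager.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Base URL taken from the examples in this article; adjust for your cluster.
RM_BASE = "http://hadoop02:8088/ws/v1/cluster"


def build_request(path: str = "", accept: str = "application/xml") -> urllib.request.Request:
    """Build a GET request for a ResourceManager REST endpoint (hypothetical helper)."""
    return urllib.request.Request(RM_BASE + path, headers={"Accept": accept})


def fetch_xml(path: str = "", timeout: float = 10.0) -> ET.Element:
    """Send the request and parse the XML body into an ElementTree element."""
    with urllib.request.urlopen(build_request(path), timeout=timeout) as resp:
        return ET.fromstring(resp.read())


req = build_request("/apps")
print(req.full_url, req.get_header("Accept"))
```

Passing `accept="application/json"` to `build_request` asks for JSON instead, in which case the body would be parsed with `json.loads` rather than `ET.fromstring`.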
Querying Applications
List All Applications
http://hadoop02:8088/ws/v1/cluster/apps
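Field Descriptions
Item | Data Type | Description |
---|---|---|
id | string | The application ID |
user | string | The user who submitted the application |
name | string | The application name |
queue | string | The scheduler queue the application was submitted to |
state | string | The current state of the application |
finalStatus | string | The final status of the application |
progress | double | The progress of the application, as a percentage |
trackingUI | string | Display name of the tracking UI |
trackingUrl | string | URL of the tracking UI |
clusterId | long | The cluster ID |
applicationType | string | The application type |
priority | int | The application priority |
startedTime | long | Time the application started (ms since epoch) |
launchTime | long | Time the application was launched (ms since epoch) |
finishedTime | long | Time the application finished (ms since epoch) |
elapsedTime | long | Elapsed time in ms: finishedTime - startedTime for a finished application, time since startedTime for one that is still running |
amContainerLogs | string | URL of the ApplicationMaster container logs |
amHostHttpAddress | string | HTTP address (host:port) of the node running the ApplicationMaster |
amRPCAddress | string | RPC address of the ApplicationMaster |
allocatedMB | int | Memory (MB) currently allocated to the application's running containers |
allocatedVCores | int | Virtual cores currently allocated to the application's running containers |
reservedMB | int | Reserved memory (MB) |
reservedVCores | int | Reserved virtual cores |
runningContainers | int | Number of containers currently running |
memorySeconds | long | Memory the application has been allocated across all containers, in MB-seconds |
vcoreSeconds | long | Virtual cores the application has been allocated across all containers, in vcore-seconds |
queueUsagePercentage | double | Percentage of the queue's resources used by the application |
clusterUsagePercentage | double | Percentage of the cluster's resources used by the application |
logAggregationStatus | string | Log aggregation status |
unmanagedApplication | boolean | Whether the application is unmanaged |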
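To show how these fields might be consumed, the sketch below pulls a few of them out of an apps listing and builds a filtered URL with the `states` query parameter. It assumes the listing root is `<apps>` with one `<app>` child per application; the payload is trimmed and illustrative (the second application ID is made up), and `apps_url` and `summarize_apps` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

APPS_URL = "http://hadoop02:8088/ws/v1/cluster/apps"  # endpoint from this article


def apps_url(**params: str) -> str:
    """Append query parameters such as states=... to the apps endpoint."""
    if not params:
        return APPS_URL
    return f"{APPS_URL}?{urlencode(params, safe=',')}"


def summarize_apps(xml_text: str) -> list:
    """Pick id, state and progress out of each <app> element under <apps>."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": app.findtext("id"),
            "state": app.findtext("state"),
            "progress": float(app.findtext("progress", default="0")),
        }
        for app in root.findall("app")
    ]


# Trimmed, illustrative payload; the second application ID is made up.
SAMPLE = """<apps>
  <app><id>application_1672710362889_0012</id><state>RUNNING</state><progress>100.0</progress></app>
  <app><id>application_1672710362889_0011</id><state>KILLED</state><progress>100.0</progress></app>
</apps>"""

running = [a["id"] for a in summarize_apps(SAMPLE) if a["state"] == "RUNNING"]
print(running)
print(apps_url(states="RUNNING,ACCEPTED"))
```

Filtering on the server with `states` keeps the response small on busy clusters; filtering on the client, as in `running`, is handy when the full listing is needed anyway.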
Query a Single Application
http://hadoop02:8088/ws/v1/cluster/apps/application_1672710362889_0012
In the response, amHostHttpAddress is the HTTP address of the node on which the application's ApplicationMaster is running.
Response
<app>
<id>application_1672710362889_0012</id>
<user>root</user>
<name>yarnforflink</name>
<queue>default</queue>
<state>RUNNING</state>
<finalStatus>UNDEFINED</finalStatus>
<progress>100.0</progress>
<trackingUI>ApplicationMaster</trackingUI>
<trackingUrl>http://hadoop02:8088/proxy/application_1672710362889_0012/</trackingUrl>
<diagnostics/>
<clusterId>1672710362889</clusterId>
<applicationType>Apache Flink</applicationType>
<applicationTags/>
<startedTime>1672799886183</startedTime>
<finishedTime>0</finishedTime>
<elapsedTime>3849093</elapsedTime>
<amContainerLogs>http://hadoop01:8042/node/containerlogs/container_e19_1672710362889_0012_01_000001/root</amContainerLogs>
<amHostHttpAddress>hadoop01:8042</amHostHttpAddress>
<allocatedMB>2048</allocatedMB>
<allocatedVCores>1</allocatedVCores>
<runningContainers>1</runningContainers>
<memorySeconds>8116067</memorySeconds>
<vcoreSeconds>3960</vcoreSeconds>
<preemptedResourceMB>0</preemptedResourceMB>
<preemptedResourceVCores>0</preemptedResourceVCores>
<numNonAMContainerPreempted>0</numNonAMContainerPreempted>
<numAMContainerPreempted>0</numAMContainerPreempted>
<resourceRequests>
<capability>
<memory>2048</memory>
<virtualCores>1</virtualCores>
</capability>
<nodeLabelExpression/>
<numContainers>0</numContainers>
<priority>
<priority>0</priority>
</priority>
<relaxLocality>true</relaxLocality>
<resourceName>*</resourceName>
</resourceRequests>
<resourceRequests>
<capability>
<memory>2048</memory>
<virtualCores>1</virtualCores>
</capability>
<nodeLabelExpression/>
<numContainers>0</numContainers>
<priority>
<priority>1</priority>
</priority>
<relaxLocality>true</relaxLocality>
<resourceName>*</resourceName>
</resourceRequests>
</app>
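The time fields in this response are milliseconds since the epoch, which is not very readable as-is. The following sketch converts them to Python `datetime` and `timedelta` values; the values are copied from the response shown here, and `ms_to_datetime` and `app_times` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_datetime(ms: int) -> datetime:
    """Convert milliseconds since the epoch to an aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


def app_times(xml_text: str) -> dict:
    """Extract start, finish and elapsed time from an <app> element."""
    app = ET.fromstring(xml_text)
    finished_ms = int(app.findtext("finishedTime"))
    return {
        "started": ms_to_datetime(int(app.findtext("startedTime"))),
        # finishedTime is 0 while the application is still running
        "finished": ms_to_datetime(finished_ms) if finished_ms else None,
        "elapsed": timedelta(milliseconds=int(app.findtext("elapsedTime"))),
    }


# Time fields copied from the response in this article.
SAMPLE = """<app>
  <startedTime>1672799886183</startedTime>
  <finishedTime>0</finishedTime>
  <elapsedTime>3849093</elapsedTime>
</app>"""

times = app_times(SAMPLE)
print(times["started"].isoformat())  # 2023-01-04T02:38:06.183000+00:00
print(times["elapsed"])              # 1:04:09.093000
```

Adding a `timedelta` to a fixed epoch avoids the floating-point rounding that dividing the millisecond value by 1000 can introduce.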
Check Application State
http://hadoop02:8088/ws/v1/cluster/apps/application_1672710362889_0012/state
Response
<appstate>
<state>RUNNING</state>
</appstate>
State Values
Item | Data Type | Description |
---|---|---|
state | string | The application state; one of "NEW", "NEW_SAVING", "SUBMITTED", "ACCEPTED", "RUNNING", "FINISHED", "FAILED", "KILLED" |
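When polling this endpoint, it helps to know whether a state can still change. The sketch below models the states listed above as an enum and treats FINISHED, FAILED and KILLED as terminal; that grouping and the names `AppState`, `parse_app_state` and `is_terminal` are assumptions of this example, not part of the API.

```python
import xml.etree.ElementTree as ET
from enum import Enum


class AppState(Enum):
    NEW = "NEW"
    NEW_SAVING = "NEW_SAVING"
    SUBMITTED = "SUBMITTED"
    ACCEPTED = "ACCEPTED"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"
    KILLED = "KILLED"


# Assumption of this sketch: these three states are final.
TERMINAL_STATES = frozenset({AppState.FINISHED, AppState.FAILED, AppState.KILLED})


def parse_app_state(xml_text: str) -> AppState:
    """Parse an <appstate> response; raises ValueError on an unknown state."""
    return AppState(ET.fromstring(xml_text).findtext("state"))


def is_terminal(state: AppState) -> bool:
    """Return True when the application will not change state any more."""
    return state in TERMINAL_STATES


state = parse_app_state("<appstate><state>RUNNING</state></appstate>")
print(state.value, is_terminal(state))
```

A polling loop would call the state endpoint, run the body through `parse_app_state`, and stop once `is_terminal` returns true.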
Cluster
Cluster Information
http://hadoop02:8088/ws/v1/cluster
The response looks like this:
<clusterInfo>
<id>1672710362889</id>
<startedOn>1672710362889</startedOn>
<state>STARTED</state>
<haState>ACTIVE</haState>
<rmStateStoreName>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</rmStateStoreName>
<resourceManagerVersion>2.7.7</resourceManagerVersion>
<resourceManagerBuildVersion>2.7.7 from c1aad84bd27cd79c3d1a7dd58202a8c3ee1ed3ac by stevel source checksum d0c780b3552e7bd9462fffca3f9fc51d</resourceManagerBuildVersion>
<resourceManagerVersionBuiltOn>2018-07-19T00:39Z</resourceManagerVersionBuiltOn>
<hadoopVersion>2.7.7</hadoopVersion>
<hadoopBuildVersion>2.7.7 from c1aad84bd27cd79c3d1a7dd58202a8c3ee1ed3ac by stevel source checksum 792e15d20b12c74bd6f19a1fb886490</hadoopBuildVersion>
<hadoopVersionBuiltOn>2018-07-18T22:47Z</hadoopVersionBuiltOn>
<haZooKeeperConnectionState>CONNECTED</haZooKeeperConnectionState>
</clusterInfo>
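Response Field Descriptions
Item | Data Type | Description |
---|---|---|
id | long | The cluster ID |
startedOn | long | Time the cluster started (ms since epoch) |
state | string | ResourceManager state; valid values: NOTINITED, INITED, STARTED, STOPPED |
haState | string | ResourceManager HA state; valid values: INITIALIZING, ACTIVE, STANDBY, STOPPED |
rmStateStoreName | string | Fully qualified name of the class that implements the ResourceManager state store |
resourceManagerVersion | string | Version of the ResourceManager |
resourceManagerBuildVersion | string | ResourceManager build string, with build version, user, and checksum |
resourceManagerVersionBuiltOn | string | Time the ResourceManager was built (documented as ms since epoch; the sample response above shows a date string) |
hadoopVersion | string | Version of Hadoop Common |
hadoopBuildVersion | string | Hadoop Common build string, with build version, user, and checksum |
hadoopVersionBuiltOn | string | Time Hadoop Common was built (documented as ms since epoch; the sample response above shows a date string) |
haZooKeeperConnectionState | string | State of the ZooKeeper connection used by the high-availability service |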
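One possible use of these fields is checking which ResourceManager is active in an HA setup. The sketch below is only an illustration: `is_active_rm` is a hypothetical helper, and the second payload (a standby ResourceManager) is invented for the example rather than captured from a real cluster.

```python
import xml.etree.ElementTree as ET


def is_active_rm(xml_text: str) -> bool:
    """Return True when a <clusterInfo> response reports STARTED and ACTIVE."""
    info = ET.fromstring(xml_text)
    return info.findtext("state") == "STARTED" and info.findtext("haState") == "ACTIVE"


# Trimmed from the response shown in this article.
ACTIVE_SAMPLE = """<clusterInfo>
  <id>1672710362889</id>
  <state>STARTED</state>
  <haState>ACTIVE</haState>
</clusterInfo>"""

# Invented payload for a standby ResourceManager, for illustration only.
STANDBY_SAMPLE = """<clusterInfo>
  <state>STARTED</state>
  <haState>STANDBY</haState>
</clusterInfo>"""

print(is_active_rm(ACTIVE_SAMPLE), is_active_rm(STANDBY_SAMPLE))
```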
Cluster Metrics
http://hadoop02:8088/ws/v1/cluster/metrics
Response
<clusterMetrics>
<appsSubmitted>13</appsSubmitted>
<appsCompleted>10</appsCompleted>
<appsPending>0</appsPending>
<appsRunning>1</appsRunning>
<appsFailed>0</appsFailed>
<appsKilled>2</appsKilled>
<reservedMB>0</reservedMB>
<availableMB>4096</availableMB>
<allocatedMB>2048</allocatedMB>
<reservedVirtualCores>0</reservedVirtualCores>
<availableVirtualCores>23</availableVirtualCores>
<allocatedVirtualCores>1</allocatedVirtualCores>
<containersAllocated>1</containersAllocated>
<containersReserved>0</containersReserved>
<containersPending>0</containersPending>
<totalMB>6144</totalMB>
<totalVirtualCores>24</totalVirtualCores>
<totalNodes>3</totalNodes>
<lostNodes>0</lostNodes>
<unhealthyNodes>0</unhealthyNodes>
<decommissionedNodes>0</decommissionedNodes>
<rebootedNodes>0</rebootedNodes>
<activeNodes>3</activeNodes>
</clusterMetrics>
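Response Field Descriptions
Item | Data Type | Description |
---|---|---|
appsSubmitted | int | Number of applications submitted |
appsCompleted | int | Number of applications completed |
appsPending | int | Number of applications pending |
appsRunning | int | Number of applications running |
appsFailed | int | Number of applications failed |
appsKilled | int | Number of applications killed |
reservedMB | long | Amount of memory reserved (MB) |
availableMB | long | Amount of memory available (MB) |
allocatedMB | long | Amount of memory allocated (MB) |
totalMB | long | Total amount of memory (MB) |
reservedVirtualCores | long | Number of virtual cores reserved |
availableVirtualCores | long | Number of virtual cores available |
allocatedVirtualCores | long | Number of virtual cores allocated |
totalVirtualCores | long | Total number of virtual cores |
containersAllocated | int | Number of containers allocated |
containersReserved | int | Number of containers reserved |
containersPending | int | Number of containers pending |
totalNodes | int | Total number of nodes |
activeNodes | int | Number of active nodes |
lostNodes | int | Number of lost nodes |
unhealthyNodes | int | Number of unhealthy nodes |
decommissioningNodes | int | Number of nodes being decommissioned |
decommissionedNodes | int | Number of nodes decommissioned |
rebootedNodes | int | Number of nodes rebooted |
shutdownNodes | int | Number of nodes shut down |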
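These counters are often combined into utilization ratios for monitoring. As a minimal sketch, the code below derives memory and vcore utilization from a `<clusterMetrics>` response, using values copied from the sample in this article; `utilization` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def utilization(xml_text: str) -> dict:
    """Compute allocated/total ratios for memory and virtual cores."""
    metrics = ET.fromstring(xml_text)

    def ratio(used_tag: str, total_tag: str) -> float:
        total = int(metrics.findtext(total_tag))
        # Guard against a cluster that reports no capacity at all.
        return int(metrics.findtext(used_tag)) / total if total else 0.0

    return {
        "memory": ratio("allocatedMB", "totalMB"),
        "vcores": ratio("allocatedVirtualCores", "totalVirtualCores"),
    }


# Values copied from the clusterMetrics response in this article.
SAMPLE = """<clusterMetrics>
  <allocatedMB>2048</allocatedMB>
  <totalMB>6144</totalMB>
  <allocatedVirtualCores>1</allocatedVirtualCores>
  <totalVirtualCores>24</totalVirtualCores>
</clusterMetrics>"""

usage = utilization(SAMPLE)
print(f"memory {usage['memory']:.1%}, vcores {usage['vcores']:.1%}")
```

For the sample cluster this works out to 2048 of 6144 MB and 1 of 24 virtual cores in use.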