With ResourceManager Restart enabled, the RM being promoted to the Active state loads the RM internal state and continues to operate from where the previous Active left off, as far as the RM restart feature allows. A new attempt is spawned for each managed application previously submitted to the RM. Applications can checkpoint periodically to avoid losing work.
The state-store must be visible from both the Active and Standby RMs. Currently, there are two RMStateStore implementations for persistence: FileSystemRMStateStore and ZKRMStateStore. The ZKRMStateStore implicitly allows write access to a single RM at any point in time, and hence is the recommended store to use in an HA cluster. When using the ZKRMStateStore, there is no need for a separate fencing mechanism to address a potential split-brain situation in which multiple RMs assume the Active role. It is also advisable NOT to set the “zookeeper.DigestAuthenticationProvider.superDigest” property on the ZooKeeper cluster, so that the ZooKeeper admin does not have access to YARN application/user credential information.
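A minimal yarn-site.xml sketch of selecting the ZooKeeper-backed state-store might look like the following. It assumes the `yarn.resourcemanager.recovery.enabled` and `yarn.resourcemanager.store.class` properties from the ResourceManager Restart document, which the text above does not name, and the ZooKeeper hosts are hypothetical placeholders.

```xml
<!-- Sketch only: merge into the <configuration> element of yarn-site.xml. -->

<!-- Assumption: recovery is switched on with this property
     (taken from the ResourceManager Restart document, not from the text above). -->
<property>
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
</property>

<!-- Assumption: the store implementation is chosen by class name. -->
<property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>

<!-- Hypothetical ZooKeeper quorum; replace with the real hosts and ports. -->
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>
```

Property names can differ between Hadoop releases, so check them against the yarn-default.xml shipped with the version in use.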
Deployment
Configurations
Most of the failover functionality is tunable using various configuration properties. The following is a list of the required and important ones. yarn-default.xml carries the full list of knobs, including default values. See also the ResourceManager Restart document for instructions on setting up the state-store.
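As an illustration of how the properties listed below fit together, here is a small yarn-site.xml sketch for two RMs. The logical IDs `rm1` and `rm2` follow the example given in the list; the hostnames and the ZooKeeper quorum are hypothetical placeholders. It covers only properties named in this section and is not necessarily a complete HA configuration.

```xml
<!-- Sketch only: merge into the <configuration> element of yarn-site.xml. -->

<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>

<!-- Logical IDs; each one becomes the suffix of the per-RM properties below. -->
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>

<!-- Hypothetical hostnames, one per logical ID. -->
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>master1.example.com</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>master2.example.com</value>
</property>

<!-- Hypothetical ZooKeeper quorum, used for the state-store and leader election. -->
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>
```

Setting only the per-RM hostname keeps the file short. The individual service addresses (client, scheduler, resource-tracker, admin, webapp) need to be set only when they should differ from the values derived from that hostname.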
Configuration Properties
Description
yarn.resourcemanager.zk-address
Address of the ZK-quorum. Used both for the state-store and embedded leader-election.
yarn.resourcemanager.ha.enabled
Enable RM HA.
yarn.resourcemanager.ha.rm-ids
List of logical IDs for the RMs, e.g., “rm1,rm2”.
yarn.resourcemanager.hostname.rm-id
For each rm-id, specify the hostname the RM corresponds to. Alternatively, each of the RM’s service addresses can be set individually.
yarn.resourcemanager.address.rm-id
For each rm-id, specify the host:port to which clients submit jobs. If set, overrides the hostname set in yarn.resourcemanager.hostname.rm-id.
yarn.resourcemanager.scheduler.address.rm-id
For each rm-id, specify the scheduler host:port that ApplicationMasters use to obtain resources. If set, overrides the hostname set in yarn.resourcemanager.hostname.rm-id.
yarn.resourcemanager.resource-tracker.address.rm-id
For each rm-id, specify the host:port that NodeManagers connect to. If set, overrides the hostname set in yarn.resourcemanager.hostname.rm-id.
yarn.resourcemanager.admin.address.rm-id
For each rm-id, specify the host:port for administrative commands. If set, overrides the hostname set in yarn.resourcemanager.hostname.rm-id.
yarn.resourcemanager.webapp.address.rm-id