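This post walks through online scale-out, scale-in, and migration of a ZooKeeper ensemble. A Leader exists at all times, so ZooKeeper keeps serving throughout the whole process.
Notes
First, when expanding from 3 servers to 5, the cluster must not stop serving.
A 3-server ensemble needs at least 2 servers up (⌊X/2⌋ + 1 for X servers), while a 5-server ensemble needs at least 3.
So we should make sure that at least 3 ZooKeeper servers in the cluster are running at all times.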
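The majority rule above can be sketched as a small shell helper. The function names `quorum_size` and `has_quorum` are hypothetical, introduced only for illustration; they are not part of ZooKeeper.

```shell
# Majority quorum for an ensemble of n voting servers: floor(n/2) + 1.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Succeeds (exit 0) if RUNNING live servers still form a quorum of an N-server ensemble.
# Usage: has_quorum RUNNING N
has_quorum() {
  [ "$1" -ge "$(quorum_size "$2")" ]
}

quorum_size 3   # prints 2
quorum_size 5   # prints 3
```

For example, `has_quorum 3 5` succeeds while `has_quorum 2 5` fails, which is why at least 3 servers must stay up once the ensemble is configured with 5 members.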
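In addition, restart the machine with the smallest myid first and proceed from small to large.
The Leader, whatever its myid, is restarted last.
The reason is that in ZooKeeper's connection mechanism the server with the larger myid initiates the connection to the smaller one, while the smaller one does not initiate a connection to the larger one. If the machine with the smallest myid were restarted last, it might fail to join the cluster.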
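The ordering rule above (followers in ascending myid order, Leader last) could be expressed as follows. `restart_order` is a hypothetical helper, and it assumes you already know which myid is currently the Leader.

```shell
# Print myids in restart order: followers ascending by myid, then the Leader.
# Usage: restart_order LEADER_ID ID...
restart_order() {
  leader=$1
  shift
  # Followers first, sorted numerically from smallest to largest myid.
  for id in "$@"; do
    if [ "$id" != "$leader" ]; then
      echo "$id"
    fi
  done | sort -n
  # The Leader goes last, whatever its myid is.
  echo "$leader"
}

restart_order 2 1 2 3   # prints 1, 3, 2 on separate lines
```

With myid 2 as the Leader of the original three servers, this yields 1, 3, 2, which is the order used in the experiment below.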
Environment
Five machines
IP | Hostname |
---|---|
10.1.24.110 | idc02-kafka-ds-00 |
10.1.24.111 | idc02-kafka-ds-01 |
10.1.24.112 | idc02-kafka-ds-02 |
10.1.24.113 | idc02-kafka-ds-03 |
10.1.24.114 | idc02-kafka-ds-04 |
JDK
jdk1.7.0_67
ZooKeeper
zookeeper-3.4.6
Myid
Assigned 1 through 5 in ascending IP order
Configuration file

    server.1=10.1.24.110:2888:3888
    server.2=10.1.24.111:2888:3888
    server.3=10.1.24.112:2888:3888

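Because the myids follow the IPs, the `server.N` lines can be generated instead of typed by hand. This is only a sketch: `server_lines` is a hypothetical helper, and it hard-codes the assumption from the table above that myid N lives at 10.1.24.(109+N) with ports 2888 and 3888.

```shell
# Emit server.N lines for myids FIRST..LAST.
# Assumption: myid N maps to 10.1.24.(109+N), i.e. myid 1 -> 10.1.24.110.
# Usage: server_lines FIRST LAST
server_lines() {
  id=$1
  while [ "$id" -le "$2" ]; do
    echo "server.$id=10.1.24.$(( 109 + id )):2888:3888"
    id=$(( id + 1 ))
  done
}

server_lines 1 3   # the 3-node configuration shown above
```

`server_lines 1 5` gives the 5-node configuration used during the expansion, and `server_lines 3 5` the one used after shrinking.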
Experiment
Set up a 3-node ZooKeeper
idc02-kafka-ds-00:

    [hadoop@idc02-kafka-ds-00 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

idc02-kafka-ds-01:

    [hadoop@idc02-kafka-ds-01 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: leader

idc02-kafka-ds-02:

    [hadoop@idc02-kafka-ds-02 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

Expand it to a 5-node ZooKeeper
First check the state of the original ZooKeeper cluster

    echo mntr|nc localhost 2181

This four-letter command reports the state of the cluster. The follower-related metrics are only visible on the Leader machine.
Check on idc02-kafka-ds-01

    [hadoop@idc02-kafka-ds-01 bin]$ echo mntr|nc localhost 2181
    zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT
    zk_avg_latency 0
    zk_max_latency 0
    zk_min_latency 0
    zk_packets_received 3
    zk_packets_sent 2
    zk_num_alive_connections 1
    zk_outstanding_requests 0
    zk_server_state leader
    zk_znode_count 4
    zk_watch_count 0
    zk_ephemerals_count 0
    zk_approximate_data_size 27
    zk_open_file_descriptor_count 27
    zk_max_file_descriptor_count 65535
    zk_followers 2
    zk_synced_followers 2
    zk_pending_syncs 0

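When only one or two metrics matter, the `mntr` output can be filtered with `awk`. `mntr_value` below is a hypothetical helper. It returns the first whitespace-separated token after the key, which is enough for numeric metrics and `zk_server_state`, but would truncate `zk_version`.

```shell
# Print the value of one mntr metric; reads mntr output on stdin.
# Live usage: echo mntr | nc localhost 2181 | mntr_value zk_followers
mntr_value() {
  awk -v key="$1" '$1 == key { print $2 }'
}

# Offline example using two lines copied from the output above.
printf 'zk_server_state\tleader\nzk_followers\t2\n' | mntr_value zk_followers   # prints 2
```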
Start ZooKeeper on the other two machines
Configuration file on the other two machines

    server.1=10.1.24.110:2888:3888
    server.2=10.1.24.111:2888:3888
    server.3=10.1.24.112:2888:3888
    server.4=10.1.24.113:2888:3888
    server.5=10.1.24.114:2888:3888

Start
idc02-kafka-ds-03:

    [hadoop@idc02-kafka-ds-03 bin]# ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

idc02-kafka-ds-04:

    [hadoop@idc02-kafka-ds-04 bin]# ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

Check the cluster state again
Still checking on idc02-kafka-ds-01

    [hadoop@idc02-kafka-ds-01 bin]$ echo mntr|nc localhost 2181
    zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT
    zk_avg_latency 0
    zk_max_latency 0
    zk_min_latency 0
    zk_packets_received 4
    zk_packets_sent 3
    zk_num_alive_connections 1
    zk_outstanding_requests 0
    zk_server_state leader
    zk_znode_count 4
    zk_watch_count 0
    zk_ephemerals_count 0
    zk_approximate_data_size 27
    zk_open_file_descriptor_count 31
    zk_max_file_descriptor_count 65535
    zk_followers 4
    zk_synced_followers 4
    zk_pending_syncs 0

zk_followers is now 4: the number of followers connected to the Leader went from 2 to 4.
zk_synced_followers is also 4, which means the 2 newly added servers have finished syncing.
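The check just described (follower count as expected, and all of them synced) can be turned into a gate before moving to the next step. `followers_synced` is a hypothetical helper that reads `mntr` output on stdin; run it against the Leader, since only the Leader reports these metrics.

```shell
# Succeed only if the mntr output on stdin reports EXPECTED followers,
# and all of them are synced.
# Live usage: echo mntr | nc localhost 2181 | followers_synced 4
followers_synced() {
  awk -v want="$1" '
    $1 == "zk_followers"        { total = $2 }
    $1 == "zk_synced_followers" { synced = $2 }
    END { exit !(total == want && synced == want) }
  '
}

# Offline example with the two relevant lines from the output above.
printf 'zk_followers\t4\nzk_synced_followers\t4\n' | followers_synced 4 && echo "all followers synced"
```

If either metric is missing (for example when the command is run against a follower), the helper fails, which errs on the side of not proceeding.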
Next we do a rolling restart of the first three machines, myid 1 to 3
Start with idc02-kafka-ds-00
Stop
If you want extra reassurance, watch the logs on the Leader or on the two newly added machines while the server is down.

    [hadoop@idc02-kafka-ds-00 bin]$ ./zkServer.sh stop
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED

Modify its configuration file
From the original

    server.1=10.1.24.110:2888:3888
    server.2=10.1.24.111:2888:3888
    server.3=10.1.24.112:2888:3888

to the new one

    server.1=10.1.24.110:2888:3888
    server.2=10.1.24.111:2888:3888
    server.3=10.1.24.112:2888:3888
    server.4=10.1.24.113:2888:3888
    server.5=10.1.24.114:2888:3888

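The configuration change amounts to appending two `server.N` lines. A minimal sketch of doing that idempotently is shown here; `add_server` is a hypothetical helper, it assumes the 2888/3888 ports used in this post, and it assumes the file ends with a newline. The demo works on a temporary file, not on the real `zoo.cfg`.

```shell
# Append "server.ID=HOST:2888:3888" to CFG unless a server.ID line already exists.
# Usage: add_server CFG ID HOST
add_server() {
  if ! grep -q "^server\.$2=" "$1"; then
    echo "server.$2=$3:2888:3888" >> "$1"
  fi
}

# Demo on a scratch copy of the 3-node configuration.
demo_cfg=$(mktemp)
printf 'server.1=10.1.24.110:2888:3888\nserver.2=10.1.24.111:2888:3888\nserver.3=10.1.24.112:2888:3888\n' > "$demo_cfg"
add_server "$demo_cfg" 4 10.1.24.113
add_server "$demo_cfg" 5 10.1.24.114
add_server "$demo_cfg" 5 10.1.24.114   # no-op: server.5 is already present
cat "$demo_cfg"
rm -f "$demo_cfg"
```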
Start

    [hadoop@idc02-kafka-ds-00 bin]$ ./zkServer.sh start
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    [hadoop@idc02-kafka-ds-00 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

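The stop, edit, start, status sequence is repeated on every node, so it can be wrapped in one function. The sketch below is deliberately a dry run by default: it only prints the commands. `restart_with_config`, `run`, `DRY_RUN`, and the path `/tmp/zoo.cfg.5nodes` are all hypothetical names; `ZK_HOME` defaults to the install path that appears in the output above.

```shell
# Print (DRY_RUN=1, the default) or execute one node's restart step.
ZK_HOME=${ZK_HOME:-/usr/local/webserver/zookeeper-3.4.6}
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Usage: restart_with_config NEW_CFG
restart_with_config() {
  run "$ZK_HOME/bin/zkServer.sh" stop
  run cp "$1" "$ZK_HOME/conf/zoo.cfg"   # swap in the prepared configuration
  run "$ZK_HOME/bin/zkServer.sh" start
  run "$ZK_HOME/bin/zkServer.sh" status
}

restart_with_config /tmp/zoo.cfg.5nodes
```

Setting `DRY_RUN=0` would execute the commands for real. That should only be done one node at a time, in the order described in the notes.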
Then skip idc02-kafka-ds-01, which is the Leader, and handle idc02-kafka-ds-02 first
Stop

    [hadoop@idc02-kafka-ds-02 bin]$ ./zkServer.sh stop
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED

Modify the configuration file

    server.1=10.1.24.110:2888:3888
    server.2=10.1.24.111:2888:3888
    server.3=10.1.24.112:2888:3888
    server.4=10.1.24.113:2888:3888
    server.5=10.1.24.114:2888:3888

Start

    [hadoop@idc02-kafka-ds-02 bin]$ ./zkServer.sh start
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    [hadoop@idc02-kafka-ds-02 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

Finally handle idc02-kafka-ds-01, the original Leader
Stop

    [hadoop@idc02-kafka-ds-01 bin]$ ./zkServer.sh stop
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED

Check the new Leader
ZooKeeper tends to pick the machine with the largest myid as Leader (when the candidates are equally up to date, the largest myid wins the election), so idc02-kafka-ds-04, whose myid is 5, became the Leader.

    [hadoop@idc02-kafka-ds-04 bin]# ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: leader

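The tie-breaking behind this result can be illustrated with a simplified comparison: prefer the candidate with the most recent transaction id (zxid), and among equals prefer the largest myid. This is a sketch under stated assumptions, not ZooKeeper's actual election code: `pick_leader` is a hypothetical helper, zxids are written as plain decimal numbers, and the epoch comparison that a real election performs first is left out.

```shell
# Pick the winner from "myid zxid" lines on stdin:
# highest zxid first, ties broken by highest myid.
pick_leader() {
  sort -k2,2nr -k1,1nr | head -n 1 | awk '{ print $1 }'
}

# Remaining voters after myid 2 is stopped, all equally up to date.
printf '1 0\n3 0\n4 0\n5 0\n' | pick_leader   # prints 5
```

In this experiment the data never changes, so all servers hold the same zxid and the largest myid wins. On a busy cluster a server with a smaller myid but newer data would be preferred.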
Modify the configuration file

    server.1=10.1.24.110:2888:3888
    server.2=10.1.24.111:2888:3888
    server.3=10.1.24.112:2888:3888
    server.4=10.1.24.113:2888:3888
    server.5=10.1.24.114:2888:3888

Start

    [hadoop@idc02-kafka-ds-01 bin]$ ./zkServer.sh start
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    [hadoop@idc02-kafka-ds-01 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

Check the cluster state on the new Leader

    [hadoop@idc02-kafka-ds-04 bin]# echo mntr|nc localhost 2181
    zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT
    zk_avg_latency 1
    zk_max_latency 4
    zk_min_latency 0
    zk_packets_received 12
    zk_packets_sent 11
    zk_num_alive_connections 1
    zk_outstanding_requests 0
    zk_server_state leader
    zk_znode_count 4
    zk_watch_count 0
    zk_ephemerals_count 0
    zk_approximate_data_size 27
    zk_open_file_descriptor_count 33
    zk_max_file_descriptor_count 65535
    zk_followers 4
    zk_synced_followers 4
    zk_pending_syncs 0

Everything is normal.
At this point we have expanded the original 3 servers to 5, which is half of the job.
Next, we only need to shrink the current 5 servers back to 3, leaving out the machines with myid 1 and 2, and the migration is complete.
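Shrink the 5 servers back to 3
Modify idc02-kafka-ds-02
Per the notes above, the number of running servers in the 5-server cluster must not drop below 3. So we first switch machines 3 to 5 to the 3-server configuration, one at a time, and only then shut down machines 1 and 2.
Stop

    [hadoop@idc02-kafka-ds-02 bin]$ ./zkServer.sh stop
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED
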
Change the configuration file to

    server.3=10.1.24.112:2888:3888
    server.4=10.1.24.113:2888:3888
    server.5=10.1.24.114:2888:3888

Start

    [hadoop@idc02-kafka-ds-02 bin]$ ./zkServer.sh start
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    [hadoop@idc02-kafka-ds-02 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

Then modify idc02-kafka-ds-03
Stop

    [hadoop@idc02-kafka-ds-03 bin]# ./zkServer.sh stop
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED

Change the configuration file to

    server.3=10.1.24.112:2888:3888
    server.4=10.1.24.113:2888:3888
    server.5=10.1.24.114:2888:3888

Start

    [hadoop@idc02-kafka-ds-03 bin]$ ./zkServer.sh start
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    [hadoop@idc02-kafka-ds-03 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

Finally modify idc02-kafka-ds-04
Stop

    [hadoop@idc02-kafka-ds-04 bin]$ ./zkServer.sh stop
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED

After the shutdown, the Leader moved to idc02-kafka-ds-03, which has the second-largest myid.
Change the configuration file to

    server.3=10.1.24.112:2888:3888
    server.4=10.1.24.113:2888:3888
    server.5=10.1.24.114:2888:3888

Start

    [hadoop@idc02-kafka-ds-04 bin]$ ./zkServer.sh start
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    [hadoop@idc02-kafka-ds-04 bin]$ ./zkServer.sh status
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: follower

Check on the Leader

    [hadoop@idc02-kafka-ds-03 bin]$ echo mntr|nc localhost 2181
    zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT
    zk_avg_latency 0
    zk_max_latency 0
    zk_min_latency 0
    zk_packets_received 4
    zk_packets_sent 3
    zk_num_alive_connections 1
    zk_outstanding_requests 0
    zk_server_state leader
    zk_znode_count 4
    zk_watch_count 0
    zk_ephemerals_count 0
    zk_approximate_data_size 27
    zk_open_file_descriptor_count 27
    zk_max_file_descriptor_count 65535
    zk_followers 2
    zk_synced_followers 2
    zk_pending_syncs 0

zk_followers is now 2, which shows that the Leader no longer recognizes machines 1 and 2.
Shut down machines 1 and 2
Stop idc02-kafka-ds-00

    [hadoop@idc02-kafka-ds-00 bin]$ ./zkServer.sh stop
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED

Stop idc02-kafka-ds-01

    [hadoop@idc02-kafka-ds-01 bin]$ ./zkServer.sh stop
    JMX enabled by default
    Using config: /usr/local/webserver/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED

Check again

    [hadoop@idc02-kafka-ds-03 bin]$ echo mntr|nc localhost 2181
    zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT
    zk_avg_latency 0
    zk_max_latency 0
    zk_min_latency 0
    zk_packets_received 5
    zk_packets_sent 4
    zk_num_alive_connections 1
    zk_outstanding_requests 0
    zk_server_state leader
    zk_znode_count 4
    zk_watch_count 0
    zk_ephemerals_count 0
    zk_approximate_data_size 27
    zk_open_file_descriptor_count 27
    zk_max_file_descriptor_count 65535
    zk_followers 2
    zk_synced_followers 2
    zk_pending_syncs 0

No impact at all.
The experiment succeeded.