MFS + DRBD + keepalived: high availability for the MFS master

2021-03-08 15:47:32

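MooseFS is an easy-to-use distributed file system, but only the Pro edition ships a high-availability solution for the master. In the free edition the master can only run on a single machine, which leaves a single point of failure.

Drawing on related material available online, this article describes how to make mfsmaster highly available with DRBD and keepalived.

Environment:

CentOS 6

Master-primary IP: 172.18.18.201 (hostname test01)

Master-secondary IP: 172.18.18.202 (hostname test02)

Mfschunkserver IP: 172.18.18.203 (hostname test03)

Master virtual IP: 172.18.18.204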

I. Install the operating system

Install CentOS 6. Note that when installing the mfsmaster servers you must set aside a dedicated partition for DRBD. Here I use /dev/sda3, 500 MB in size.

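II. Install DRBD

2.1 Prerequisites

Building DRBD requires the CentOS kernel source tree, so download the kernel-devel package that matches the running kernel and install it from the local file instead of installing it with yum.

rpm -i kernel-devel-2.6.32-504.el6.x86_64.rpm

Install the tools needed for the build: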

yum -y install gcc flex perl
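Because the build needs a kernel tree that matches the running kernel, it can help to verify this before compiling. The sketch below is an illustration and not part of the original procedure: the function name kernel_tree_for and its optional second argument are invented for this example, and it assumes kernel-devel places its files under /usr/src/kernels/<release>, as the KDIR path used in 2.2 suggests.

```shell
#!/bin/bash
# Print the kernel build tree for a given kernel release, or fail if it is missing.
# kernel_tree_for is a hypothetical helper; the base directory defaults to
# /usr/src/kernels (an assumption based on the KDIR path used in this article).
kernel_tree_for() {
    local release="$1"
    local base="${2:-/usr/src/kernels}"
    local tree="$base/$release"
    if [ -d "$tree" ]; then
        echo "$tree"
    else
        echo "no kernel tree for $release under $base" >&2
        return 1
    fi
}
```

A possible use is KDIR=$(kernel_tree_for "$(uname -r)") && make KDIR="$KDIR", which stops before the build if the installed kernel-devel package does not match the running kernel.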

2.2 Build and install DRBD

tar zxvf drbd-8.4.2.tar.gz

cd drbd-8.4.2

./configure --prefix=/usr/local/drbd --with-km

make KDIR=/usr/src/kernels/2.6.32-504.el6.x86_64

Note: KDIR is the path of the actual kernel source tree; set it to match your system.

make install

mkdir -p /usr/local/drbd/var/run/drbd

cp /usr/local/drbd/etc/rc.d/init.d/drbd /etc/rc.d/init.d

chkconfig --add drbd

chkconfig drbd on

Install the DRBD kernel module:

cd /root/data/drbd-8.4.2/drbd

make clean

make KDIR=/usr/src/kernels/2.6.32-504.el6.x86_64

cp drbd.ko /lib/modules/2.6.32-504.el6.x86_64/kernel/lib/

Note: check the kernel version with uname -r.

modprobe drbd

Check that the module loaded successfully:

lsmod | grep drbd

drbd 299688 0

libcrc32c 1246 1 drbd

2.3 Configure DRBD

vi /usr/local/drbd/etc/drbd.d/global_common.conf

global {
    usage-count yes;
}

common {
    net {
        protocol C;
    }
}

vi /usr/local/drbd/etc/drbd.d/r0.res

resource r0 {
    on test01 {
        device /dev/drbd1;
        disk /dev/sda3;
        address 172.18.18.201:7788;
        meta-disk internal;
    }
    on test02 {
        device /dev/drbd1;
        disk /dev/sda3;
        address 172.18.18.202:7788;
        meta-disk internal;
    }
}

Create the DRBD resource:

dd if=/dev/zero of=/dev/sda3 bs=1M count=1    # required; without it the next step reports an error

drbdadm create-md r0

drbdadm up r0

Repeat the steps above on the secondary (test02).

2.4 Set the primary node

Run this on the primary (test01) only:

drbdadm primary --force r0

Check the DRBD status:

cat /proc/drbd

1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-

ns:108008 nr:0 dw:0 dr:109208 al:0 bm:6 lo:0 pe:1 ua:1 ap:0 ep:1 wo:f oos:404428

     [===>................] sync'ed: 21.6% (404428/511948)K

     finish: 0:00:33 speed: 11,944 (11,944) K/sec
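The status line packs the connection state (cs), the roles (ro) and the disk states (ds) into key:value fields. As a small sketch of how a script could read them, the helper below pulls one field out of /proc/drbd-style text. The name drbd_field is invented for this example, and it only looks at the first device line, which is enough for the single resource used here.

```shell
#!/bin/bash
# Extract one field (cs, ro or ds) from /proc/drbd-style text read on stdin.
# drbd_field is a hypothetical helper name; it only inspects the first
# device line, i.e. the first line that contains "cs:".
drbd_field() {
    local key="$1"
    awk -v key="$key" '
        /cs:/ {
            for (i = 1; i <= NF; i++) {
                # fields look like "cs:SyncSource" or "ro:Primary/Secondary"
                if (index($i, key ":") == 1) {
                    print substr($i, length(key) + 2)
                    exit
                }
            }
        }'
}
```

For example, drbd_field ds < /proc/drbd prints the disk states of the local and remote node. Once the initial synchronization has finished, that value should read UpToDate/UpToDate, which is a sensible thing to wait for before running the switch-over tests that follow.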

2.5 Create the file system on the DRBD device

On the primary:

mkfs.ext4 /dev/drbd1

mkdir /drbddata

mount /dev/drbd1 /drbddata

2.6 Test DRBD replication

On the primary (test01), write a test file into /drbddata, then run:

umount /dev/drbd1

drbdadm secondary r0

Then, on the secondary (test02), run:

mkdir /drbddata

drbdadm primary r0

mount /dev/drbd1 /drbddata

The test file just written should now be visible in /drbddata.

III. Install MFS

3.1 Prerequisites

Install libpcap:

yum -y install libpcap

3.2 Install mfsmaster

tar zxvf moosefs-packages-linux-3.0.77.tar.gz

rpm -i moosefs-master-3.0.77-1.rhsysv.x86_64.rpm

3.3 Configure mfsmaster

cd /etc/mfs

vi mfsmaster.cfg

Make sure to set: DATA_PATH = /drbddata

Leave the other parameters at their defaults.

Repeat the steps above on the secondary (test02) so that mfsmaster is installed on both servers.

3.4 Test starting mfsmaster

Test on the primary:

drbdadm primary r0

mount /dev/drbd1 /drbddata

chown -R mfs:mfs /drbddata

mfsmaster start

Check the log to see whether mfsmaster started successfully.

Test on the secondary:

First stop the service on the primary:

mfsmaster stop

umount /drbddata

drbdadm secondary r0

Then run on the secondary:

drbdadm primary r0

mount /dev/drbd1 /drbddata

chown -R mfs:mfs /drbddata

mfsmaster start

Once both tests succeed, move on to the next step.

IV. Install keepalived

4.1 Install keepalived

tar zxvf keepalived-1.2.22.tar.gz

cd keepalived-1.2.22

./configure --prefix=/usr/local/keepalived --disable-fwmark

make

make install

cp /usr/local/keepalived/sbin/keepalived /usr/sbin/

cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/

cp /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/init.d/

chkconfig --add keepalived

chkconfig keepalived on

4.2 Configure keepalived

mkdir -p /etc/keepalived

cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/

vi /etc/keepalived/keepalived.conf

global_defs {
    notification_email {
        shuyb@sina.com
    }
    notification_email_from shuyb@sina.com
    smtp_server 192.168.200.1
    smtp_connect_timeout 30
    router_id test01
    vrrp_skip_check_adv_addr
    vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}

vrrp_script check_drbd {
    script "/etc/keepalived/check_drbd.sh"
    interval 15
}

vrrp_instance mfs {
    state MASTER
    interface p4p1
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.18.18.204
    }
    track_script {
        check_drbd
    }
}

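Create a new file under /etc/keepalived:

vi check_drbd.sh

#!/bin/bash
# Count the running mfsmaster processes
A=$(ps -C mfsmaster --no-header | wc -l)
# If none is running, release the DRBD device and stop keepalived so that the backup takes over
if [ "$A" -eq 0 ]; then
    umount /dev/drbd1
    drbdadm secondary r0
    killall keepalived
fi

chmod +x check_drbd.sh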

Install keepalived on the secondary in the same way, with the following configuration file:

vi /etc/keepalived/keepalived.conf

global_defs {
    notification_email {
        shuyb@sina.com
    }
    notification_email_from shuyb@sina.com
    smtp_server 192.168.200.1
    smtp_connect_timeout 30
    router_id test01
    vrrp_skip_check_adv_addr
    vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}

vrrp_instance mfs {
    state BACKUP
    interface p4p1
    virtual_router_id 51
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.18.18.204
    }
    notify_master /etc/keepalived/master.sh
    notify_backup /etc/keepalived/backup.sh
}

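Create two new files under /etc/keepalived:

vi backup.sh

#!/bin/bash
mfsmaster stop
umount /dev/drbd1
drbdadm secondary r0

vi master.sh

#!/bin/bash
drbdadm primary r0
mount /dev/drbd1 /drbddata
mfsmaster start

chmod +x backup.sh master.sh

Installation and configuration are now complete. Note that the firewall on the servers must either be turned off or allow the ports MFS uses (9420 to 9425), the port DRBD uses (7788), and so on.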
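If the firewall stays on, rules along the following lines would be needed. This is a sketch and not taken from the original setup: the helper name print_ha_firewall_rules is invented, and it only prints the iptables commands so that they can be reviewed before being run. The third rule is an assumption added here, based on the fact that keepalived exchanges VRRP advertisements (IP protocol 112) between the two masters.

```shell
#!/bin/bash
# Print (not run) iptables commands for the ports named in the article.
# print_ha_firewall_rules is a hypothetical helper; review its output
# before executing any of it as root.
print_ha_firewall_rules() {
    # MooseFS ports 9420-9425 (TCP)
    echo "iptables -I INPUT -p tcp --dport 9420:9425 -j ACCEPT"
    # DRBD replication port configured in r0.res
    echo "iptables -I INPUT -p tcp --dport 7788 -j ACCEPT"
    # VRRP advertisements used by keepalived (IP protocol 112); this rule is
    # an assumption added for illustration, it is not listed in the article
    echo "iptables -I INPUT -p 112 -j ACCEPT"
}
```

Printing instead of executing keeps the example safe to try on any machine; in practice the rules would probably also be narrowed to the addresses of the nodes involved.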

V. Install mfschunkserver and the MFS client

5.1 On the chunkserver (test03)

yum install libpcap

rpm -i moosefs-chunkserver-3.0.77-1.rhsysv.x86_64.rpm

vi /etc/mfs/mfschunkserver.cfg

Change:

MASTER_HOST = 172.18.18.204

vi /etc/mfs/mfshdd.cfg

Add:

/home

(Note: /home is the directory this server gives to the MFS chunkserver; it is mounted on a dedicated logical volume.)

chown -R mfs:mfs /home

5.2 On the MFS client

yum install fuse-libs

rpm -i moosefs-client-3.0.77-1.rhsysv.x86_64.rpm

mkdir -p /mfs

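VI. Start-up and failover test

Everything is now ready for testing.

1. Start mfsmaster on the primary, using the same steps as in 3.4

2. Start keepalived on the primary

/etc/init.d/keepalived start

Check that the virtual IP is bound:

ip addr

3. Start keepalived on the secondary

/etc/init.d/keepalived start

4. Start mfschunkserver

mfschunkserver start

Check /var/log/messages to see whether it connected to the master

5. Connect the client

mfsmount /mfs -H 172.18.18.204

df    # check that the mount succeeded

6. Stop mfsmaster on the primary

mfsmaster stop

7. Check that the secondary, the chunkserver and the client are all working normally

The test is complete. Afterwards, the service has to be switched back to the primary by hand.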
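During the test it is useful to confirm which node currently holds the virtual IP. The helper below is a minimal sketch for that: the name has_vip is invented for this example, and it reads ip addr output on stdin so that it can be tried against saved output as well.

```shell
#!/bin/bash
# Succeed if the given IPv4 address appears as an "inet" entry in
# `ip addr`-style output read on stdin. has_vip is a hypothetical helper.
has_vip() {
    local vip="$1"
    awk -v vip="$vip" '
        $1 == "inet" {
            split($2, parts, "/")      # "172.18.18.204/32" -> "172.18.18.204"
            if (parts[1] == vip) found = 1
        }
        END { exit !found }'
}
```

For instance, ip addr | has_vip 172.18.18.204 && echo "VIP is bound here" can be run on both masters. With the configuration above, the expectation is that it succeeds on test01 before mfsmaster is stopped there and on test02 after the failover.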

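P.S. After a failure has moved the service to the backup and the primary is healthy again, the manual switch back must follow these steps.

On the backup, stop the mfsmaster service and release the DRBD device:

1. mfsmaster stop

2. umount /dev/drbd1

3. drbdadm secondary r0

On the primary:

1. drbdadm primary r0

2. mount /dev/drbd1 /drbddata

3. mfsmaster start

4. /etc/init.d/keepalived start

One more thing worth mentioning is that split-brain can occur. This usually happens when operations are performed out of order, though there can be other causes. The following write-up comes from an expert online; it uses DRBD 8.3.8, hosts named drbd1 and drbd2, and the device /dev/drbd0, so the names differ from the setup above.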

1. Status under normal conditions:

[root@drbd1 ~]# cat /proc/drbd

version: 8.3.8 (api:88/proto:86-94)

srcversion: 299AFE04D7AFD98B3CA0AF9

0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----

ns:2144476 nr:0 dw:36468 dr:2115769 al:14 bm:129 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

[root@drbd2 ~]# cat /proc/drbd

version: 8.3.8 (api:88/proto:86-94)

srcversion: 299AFE04D7AFD98B3CA0AF9

0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----

ns:0 nr:2141684 dw:2141684 dr:0 al:0 bm:130 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

2. After drbd1 fails

Status of drbd1:

[root@drbd1 ~]# cat /proc/drbd

version: 8.3.8 (api:88/proto:86-94)

srcversion: 299AFE04D7AFD98B3CA0AF9

0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r----

ns:4 nr:102664 dw:102668 dr:157 al:1 bm:8 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

Status of drbd2:

[root@drbd2 ~]# cat /proc/drbd

version: 8.3.8 (api:88/proto:86-94)

srcversion: 299AFE04D7AFD98B3CA0AF9

0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r----

ns:0 nr:2141684 dw:2141684 dr:0 al:0 bm:130 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

3. Recovery:

a. Promote the secondary to the primary role

[root@drbd2 ~]# drbdsetup /dev/drbd0 primary -o

[root@drbd2 ~]# cat /proc/drbd

version: 8.3.8 (api:88/proto:86-94)

srcversion: 299AFE04D7AFD98B3CA0AF9

0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/Outdated C r----

ns:0 nr:2141684 dw:2141684 dr:0 al:0 bm:130 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

Mount the device:

[root@drbd2 /]# mount /dev/drbd0 /data1

[root@drbd2 data1]# ll

total 10272

-rw-r--r-- 1 root root 10485760 Feb 13 11:26 aa.img

drwx------ 2 root root 16384 Feb 13 11:25 lost+found

At this point drbd2 takes over the service and starts accepting writes.

After the original primary, drbd1, comes back:

[root@drbd1 ~]# cat /proc/drbd

version: 8.3.8 (api:88/proto:86-94)

srcversion: 299AFE04D7AFD98B3CA0AF9

0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r----

ns:2144476 nr:0 dw:36484 dr:2115769 al:14 bm:129 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8

drbd1 is in the StandAlone state. In this state drbd1 and drbd2 do not contact each other.

Let's look at the log:

[root@drbd1 ~]# tailf /var/log/messages

Feb 13 16:14:27 drbd1 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0

Feb 13 16:14:27 drbd1 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)

Feb 13 16:14:27 drbd1 kernel: block drbd0: conn( WFReportParams -> Disconnecting )

Feb 13 16:14:27 drbd1 kernel: block drbd0: error receiving ReportState, l: 4!

Feb 13 16:14:27 drbd1 kernel: block drbd0: asender terminated

Feb 13 16:14:27 drbd1 kernel: block drbd0: Terminating drbd0_asender

Feb 13 16:14:27 drbd1 kernel: block drbd0: Connection closed

Feb 13 16:14:27 drbd1 kernel: block drbd0: conn( Disconnecting -> StandAlone )

Feb 13 16:14:27 drbd1 kernel: block drbd0: receiver terminated

Feb 13 16:14:27 drbd1 kernel: block drbd0: Terminating drbd0_receiver

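Split-brain has occurred!

Resolution:

1. Change drbd1's current role to secondary

[root@drbd1 ~]# drbdadm secondary r0

[root@drbd1 ~]# drbdadm -- --discard-my-data connect r0    # tells DRBD that the data on this node, now the secondary, is not correct and that the primary's data is authoritative

2. Then run the following on drbd2

[root@drbd2 /]# drbdadm connect r0

drbd1 and drbd2 can now reconnect, and the data on drbd2 is preserved (changes that exist only on drbd1 are discarded):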

[root@drbd1 ~]# cat /proc/drbd

version: 8.3.8 (api:88/proto:86-94)

srcversion: 299AFE04D7AFD98B3CA0AF9

0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----

ns:0 nr:20592 dw:20592 dr:0 al:0 bm:4 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
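The two symptoms shown in this walkthrough, a StandAlone connection state and the split-brain helper call in the kernel log, can be checked from a script. The following is only a sketch: both function names are invented for this example, and the patterns are taken from the output quoted above. StandAlone on its own can also result from a manual disconnect, so the two checks are meant to be read together.

```shell
#!/bin/bash
# Succeed if /proc/drbd-style text on stdin shows a StandAlone connection.
# drbd_is_standalone is a hypothetical helper name.
drbd_is_standalone() {
    grep -q 'cs:StandAlone'
}

# Succeed if syslog-style text on stdin contains DRBD's split-brain helper
# call. log_has_split_brain is a hypothetical helper name; the pattern comes
# from the kernel log lines quoted above.
log_has_split_brain() {
    grep -q 'drbdadm split-brain'
}
```

A possible use is drbd_is_standalone < /proc/drbd && log_has_split_brain < /var/log/messages && echo "possible split-brain, manual recovery needed". The recovery itself is deliberately left manual, because choosing the node whose data is discarded is a decision for the operator.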
