[DB宝86] Deploying a Three-Replica OceanBase Cluster with OBD and Monitoring OB with Prometheus (on Separate Nodes)

2022-02-23 19:36:43

  • Trying out the OceanBase Docker installation: https://www.xmmup.com/oceanbase-dockeranzhuangtiyan.html
  • Manually deploying a single-replica OceanBase cluster: https://www.xmmup.com/shoudongbushu-oceanbase-danfubenjiqun.html
  • Manually deploying a three-replica OceanBase cluster (on a single node): https://www.xmmup.com/shoudongbushu-oceanbase-sanfubenjiqunzaitongyigejiedian.html

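Environment planning

Reference: https://open.oceanbase.com/docs/community/oceanbase-database/V3.1.1/deploy-the-distributed-oceanbase-cluster

The environment is a single virtual machine with 16 cores and 80 GB of memory. Docker is used to simulate four CentOS hosts on it and build an OceanBase Community Edition cluster (the referenced documentation is for 3.1.1; the OBD run shown later installs 3.1.2). Each OBServer needs at least 8 GB of memory, otherwise it will not start.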

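The configuration is as follows:

| Hostname   | IP          | Port | Host-mapped port | Zone  | Role                                                                          |
|------------|-------------|------|------------------|-------|-------------------------------------------------------------------------------|
| lhrob1     | 172.72.8.11 | 2881 | 28811            | zone1 | OB Server1, OBAgent                                                           |
| lhrob2     | 172.72.8.12 | 2881 | 28812            | zone2 | OB Server2, OBAgent                                                           |
| lhrob3     | 172.72.8.13 | 2881 | 28813            | zone3 | OB Server3, OBAgent                                                           |
| lhrobproxy | 172.72.8.14 | 2883 | 28814            |       | OBD, OBProxy, OBClient, mysql, Prometheus, Grafana, OB client, time server, etc. |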

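OBAgent is the monitoring component for OceanBase Community Edition database services.

OBD stands for OceanBase Deployer. It is the command-line automated deployment tool for OceanBase Community Edition.

ODP (OceanBase Database Proxy, the obproxy component in this article) is the connection proxy built specifically for OceanBase. Its core functions are to route each request to the best server so that distributed transactions are avoided, and to preserve the high availability of OceanBase so that the failure of a single server does not affect applications.

Initial preparation

Provisioning the environment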

# Create a dedicated network so that each container gets a fixed IP
docker network create --subnet=172.72.8.0/24  lhrob-network
docker network inspect lhrob-network

# Remove containers left over from a previous run
docker rm -f lhrob1 lhrob2 lhrob3 lhrobproxy

docker run -d --name lhrob1 -h lhrob1 \
  --net=lhrob-network --ip 172.72.8.11 \
  -p 28811:2881 \
  -v /sys/fs/cgroup:/sys/fs/cgroup \
  --privileged=true lhrbest/lhrcentos76:8.5 \
  /usr/sbin/init

docker run -d --name lhrob2 -h lhrob2 \
  --net=lhrob-network --ip 172.72.8.12 \
  -p 28812:2881 \
  -v /sys/fs/cgroup:/sys/fs/cgroup \
  --privileged=true lhrbest/lhrcentos76:8.5 \
  /usr/sbin/init

docker run -d --name lhrob3 -h lhrob3 \
  --net=lhrob-network --ip 172.72.8.13 \
  -p 28813:2881 \
  -v /sys/fs/cgroup:/sys/fs/cgroup \
  --privileged=true lhrbest/lhrcentos76:8.5 \
  /usr/sbin/init

# lhrobproxy also publishes container ports 3000 and 9090 (the Grafana and Prometheus defaults)
docker run -d --name lhrobproxy -h lhrobproxy \
  --net=lhrob-network --ip 172.72.8.14 \
  -p 28814:2883 -p 23000:3000 -p 29090:9090 \
  -v /sys/fs/cgroup:/sys/fs/cgroup \
  --privileged=true lhrbest/lhrcentos76:8.5 \
  /usr/sbin/init

# Open a shell inside the proxy container
docker exec -it lhrobproxy bash


[root@docker35 ~]# docker ps
CONTAINER ID   IMAGE                     COMMAND                  CREATED          STATUS          PORTS                                                    NAMES
494cb2a2cafe   lhrbest/lhrcentos76:8.5   "/usr/sbin/init"         26 minutes ago   Up 26 minutes   0.0.0.0:28814->2883/tcp, :::28814->2883/tcp              lhrobproxy
06d2587bbd61   lhrbest/lhrcentos76:8.5   "/usr/sbin/init"         26 minutes ago   Up 26 minutes   0.0.0.0:28813->2881/tcp, :::28813->2881/tcp              lhrob3
969ca85b3bea   lhrbest/lhrcentos76:8.5   "/usr/sbin/init"         26 minutes ago   Up 26 minutes   0.0.0.0:28812->2881/tcp, :::28812->2881/tcp              lhrob2
f86e1423e13a   lhrbest/lhrcentos76:8.5   "/usr/sbin/init"         26 minutes ago   Up 26 minutes   0.0.0.0:28811->2881/tcp, :::28811->2881/tcp              lhrob1
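
The three OBServer containers differ only in their index: the name, the last digit of the IP address, and the last digit of the published port. As a rough sketch (the helper name observer_run_cmd is hypothetical, not part of Docker or OBD), the commands could be generated in a loop. The helper only prints each command so it can be reviewed before anything is run:

```shell
# Print (do not run) the docker command for OBServer container number $1 (1..3).
# Name, IP and published port follow the pattern from the planning table.
observer_run_cmd() {
  idx=$1
  printf '%s\n' "docker run -d --name lhrob${idx} -h lhrob${idx} --net=lhrob-network --ip 172.72.8.1${idx} -p 2881${idx}:2881 -v /sys/fs/cgroup:/sys/fs/cgroup --privileged=true lhrbest/lhrcentos76:8.5 /usr/sbin/init"
}

# Dry run: print the three commands for review.
for i in 1 2 3; do
  observer_run_cmd "$i"
done
```

Piping the loop output to sh would execute the commands. Printing first makes it easier to spot a wrong IP or port before the containers exist.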

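Configuring the clock source

Reference: https://open.oceanbase.com/docs/community/oceanbase-database/V3.1.1/optional-configuring-clock-sources

If OceanBase is installed as a cluster, the clocks of all machines in the cluster must be synchronized. Otherwise the cluster cannot start, and the service also misbehaves at run time. If NTP clock synchronization is already configured, there is no need to configure it again.

A physical machine whose offset from the clock server is below 50 ms can be regarded as synchronized. The maximum offset an OceanBase cluster tolerates is 100 ms; beyond 100 ms a no-leader condition occurs. Once clock synchronization is restored, restarting the OceanBase cluster brings it back to normal.

When an OceanBase cluster is deployed, the maximum clock skew that RPC allows between OBServers is 100 ms.

Here 172.72.8.14 serves as the time server, and the other three OBServer hosts synchronize their time from it: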

yum install ntp ntpdate -y
ntpq -4p
ntpstat
timedatectl

1. On 172.72.8.14, edit /etc/ntp.conf to make this host the time server:

# For more information about this file, see the man pages
# ntp.conf(5), ntp_acc(5), ntp_auth(5), ntp_clock(5), ntp_misc(5), ntp_mon(5).

driftfile /var/lib/ntp/drift

# Added: log file location
logfile /var/log/ntpd.log

# Permit time synchronization with our time source, but do not
# permit the source to query or modify the service on this system.
restrict default nomodify notrap nopeer noquery

# Permit all access over the loopback interface.  This could
# be tightened as well, but to do so would effect some of
# the administrative functions.
restrict 127.0.0.1 
restrict ::1

# Added: allow every machine on the 172.72.8.0 network to query and synchronize time from this host.
restrict 172.72.8.0 mask 255.255.255.0 nomodify notrap

# Hosts on local network are less restricted.
#restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap

# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst

# Added: list of upstream time servers.
server 0.cn.pool.ntp.org iburst
server 1.cn.pool.ntp.org iburst
server 2.cn.pool.ntp.org iburst
server 3.cn.pool.ntp.org iburst

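# Added: fall back to the local clock when no external time source is reachable.
# 127.127.1.0 is ntpd's local clock driver address; 127.0.0.1 would only point ntpd at itself.
server 127.127.1.0 iburst
fudge 127.127.1.0 stratum 10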


#broadcast 192.168.1.255 autokey        # broadcast server
#broadcastclient                        # broadcast client
#broadcast 224.0.1.1 autokey            # multicast server
#multicastclient 224.0.1.1              # multicast client
#manycastserver 239.255.254.254         # manycast server
#manycastclient 239.255.254.254 autokey # manycast client

# Enable public key cryptography.
#crypto

includefile /etc/ntp/crypto/pw

# Key file containing the keys and key identifiers used when operating
# with symmetric key cryptography. 
keys /etc/ntp/keys

# Specify the key identifiers which are trusted.
#trustedkey 4 8 42

# Specify the key identifier to use with the ntpdc utility.
#requestkey 8

# Specify the key identifier to use with the ntpq utility.
#controlkey 8

# Enable writing of statistics records.
#statistics clockstats cryptostats loopstats peerstats

# Disable the monitoring facility to prevent amplification attacks using ntpdc
# monlist command when default restrict does not include the noquery flag. See
# CVE-2013-5211 for more details.
# Note: Monitoring will not be disabled with the limited restriction flag.
disable monitor

Enable ntpd at boot:

systemctl enable ntpd
systemctl is-enabled ntpd

ntpdate -u 1.cn.pool.ntp.org
systemctl restart ntpd


[root@lhrobproxy /]# ntpstat
synchronised to NTP server (84.16.73.33) at stratum 2
   time correct to within 98 ms
   polling server every 64 s

On the other hosts (the clients), edit /etc/ntp.conf, comment out the lines that begin with server, and add the following lines:

server 172.72.8.14

restrict 172.72.8.14 nomodify notrap noquery

server 127.127.1.0
fudge 127.127.1.0 stratum 10

Enable ntpd at boot:

systemctl enable ntpd
systemctl restart ntpd

Configure automatic synchronization on the clients:

crontab -e
* * * * * /usr/sbin/ntpdate -u 172.72.8.14 > /dev/null 2>&1

Check:

  • Verify time synchronization among the three machines. The usual command for checking the offset between the local host and a target node is clockdiff:
[root@lhrobproxy soft]# clockdiff lhrob1
..
host=lhrob1 rtt=562(280)ms/0ms delta=0ms/0ms Sun Jan  9 10:39:43 2022
[root@lhrobproxy soft]# clockdiff lhrob2
.
host=lhrob2 rtt=750(187)ms/0ms delta=0ms/0ms Sun Jan  9 10:39:57 2022
[root@lhrobproxy soft]# clockdiff lhrob3
.
host=lhrob3 rtt=750(187)ms/0ms delta=0ms/0ms Sun Jan  9 10:40:00 2022

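delta is the target host's time minus the current host's time, in milliseconds.

If the clock offset among the three nodes exceeds 50 ms, the cluster initialization later on is bound to fail. Also keep in mind that the offset can grow slowly over time: the cluster may work normally now, yet a day later a node drops offline because its offset has drifted beyond 50 ms.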
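
Because the offset can drift, it can help to check the clockdiff result in a script instead of reading it by eye. The following is a minimal sketch, assuming the output format shown above. The function names clock_delta_ms and delta_within_limit are hypothetical, and the 50 ms default is taken from the paragraph above:

```shell
# Extract the first delta value (milliseconds) from one clockdiff result line.
clock_delta_ms() {
  printf '%s\n' "$1" | sed -n 's/.*delta=\(-\{0,1\}[0-9][0-9]*\)ms.*/\1/p'
}

# Succeed when |delta| is within the limit (default 50 ms).
delta_within_limit() {
  delta=${1#-}          # strip a leading minus sign to get the absolute value
  limit=${2:-50}
  [ "$delta" -le "$limit" ]
}

# Example with a result line copied from the output above.
line='host=lhrob1 rtt=562(280)ms/0ms delta=0ms/0ms Sun Jan  9 10:39:43 2022'
if delta_within_limit "$(clock_delta_ms "$line")"; then
  echo "lhrob1: clock offset OK"
else
  echo "lhrob1: clock offset too large"
fi
```

In practice the line would come from running clockdiff against each OBServer host, for example from a periodic job on lhrobproxy.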

Configuring kernel parameters

Run on all four nodes:

cat >> /etc/security/limits.conf <<"EOF"
root soft nofile 655350
root hard nofile 655350
* soft nofile 655350
* hard nofile 655350
* soft stack 20480
* hard stack 20480
* soft nproc 655360
* hard nproc 655360
* soft core unlimited
* hard core unlimited
EOF


echo "fs.aio-max-nr=1048576" >>  /etc/sysctl.conf

sysctl -p

If this is only a test environment, setting just fs.aio-max-nr=1048576 is enough.
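
Values in limits.conf apply only to new login sessions, so it can be worth confirming what a shell actually received. A small sketch of such a check follows. The helper name limit_ok is hypothetical, and 655350 is the nofile value written to limits.conf above:

```shell
# Succeed when a reported limit satisfies the required minimum.
# "unlimited" (as printed by ulimit) always satisfies the requirement.
limit_ok() {
  current=$1
  required=$2
  [ "$current" = "unlimited" ] || [ "$current" -ge "$required" ]
}

# Compare the open-file limit of the current shell with the configured value.
if limit_ok "$(ulimit -n)" 655350; then
  echo "nofile: OK"
else
  echo "nofile: too low, log in again or check /etc/security/limits.conf"
fi
```

The same helper can be reused for the other limits, for example with ulimit -u for nproc.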

Creating the user

useradd -U admin -d /home/admin -s /bin/bash
echo "admin:lhr" | chpasswd

chown -R admin:admin /home/admin


echo "admin       ALL=(ALL)       NOPASSWD: ALL" >> /etc/sudoers

Setting up passwordless SSH login

The sshUserSetup.sh script that ships with Oracle RAC can be used to configure this quickly. Run it only on lhrobproxy:

sh sshUserSetup.sh -user admin  -hosts "lhrob1 lhrob2 lhrob3 lhrobproxy" -advanced exverify -confirm

Installing the cluster

Perform the following steps on lhrobproxy:

Installing OBD

yum install -y yum-utils
yum-config-manager --add-repo https://mirrors.aliyun.com/oceanbase/OceanBase.repo
yum install -y ob-deploy

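Preparing the YAML configuration file

OBD provides different configuration files for different deployment scenarios. The examples are in the OceanBase open-source project at https://github.com/oceanbase/obdeploy/tree/master/example .

OBD creates the cluster automatically from this YAML file.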

cat > /tmp/obd_observer_obproxy.yaml <<"EOF"
## Only need to configure when remote login is required
user:
    username: admin                    # Login user; must be the same on all three nodes
    password: lhr                # Password; must be the same on all three nodes
    key_file:        # SSH key file; may be omitted

####################    observer deployment parameters    ##############################
oceanbase-ce: 
  servers:
    - name: observer01         # Server name; its zone is set in the per-node section below
      # Please don't use hostname, only IP can be supported
      ip: 172.72.8.11    # OB1 address
    - name: observer02
      ip: 172.72.8.12    # OB2 address
    - name: observer03
      ip: 172.72.8.13    # OB3 address
  global:
    mysql_port: 2881            # Database (SQL) port
    rpc_port: 2882              # RPC port used for remote access between nodes
    home_path: /home/admin/oceanbase      # Installation directory
    #data_dir: /data      # Data directory
    #redo_dir: /redo      # Redo log directory
    devname: eth0             # Network interface on the nodes being deployed
    memory_limit: 8G           
    system_memory: 2G             # Memory reserved for the system: 2G
    lower_case_table_names: 1   # Table names are case-insensitive
    foreign_key_checks: 0       # DML statements skip foreign key checks; DDL is unaffected
    sys_bkgd_migration_retry_num: 5        # Maximum number of retries when a replica migration fails
    stack_size: 512K               # Function call stack size. Must be aligned to 512K, otherwise startup may fail
    cpu_count: 16                  # 16 CPU cores
    cache_wash_threshold: 1G       # Threshold that triggers cache cleanup: when free memory falls below this value, the cache is washed
    __min_full_resource_pool_memory: 1073741824      # By default a regular tenant needs at least 5 GB of memory; 1G here allows tenants as small as 1G
    workers_per_cpu_quota: 10                         # Number of worker threads allocated per CPU quota
    schema_history_expire_time: 1d                # Retention time for schema (metadata) history
    net_thread_count: 4                           # Number of network I/O threads. The value of net_thread_count had better be same as cpu's core number.
    major_freeze_duty_time: Disable
    minor_freeze_times: 10                        # Number of minor merges that triggers one global (major) merge
    enable_separate_sys_clog: True           # Store system transaction logs separately from user transaction logs
    enable_merge_by_turn: FALSE
    datafile_size: 5G
    #datafile_disk_percentage: 0.1              # Share of disk preallocated for data at initialization. For example, 40 means 40%: on a 1 TB node, about 400 GB is occupied immediately
    syslog_level: ERROR                          # System log level
    enable_syslog_recycle: True               # Enable recycling of system log files
    max_syslog_file_count: 4                  # Number of log files to keep
    log_dir_size_threshold: 1G
    cluster_id: 1                             # Cluster ID
    # observer cluster name, consistent with obproxy's cluster_name
    appname: lhrob312cluster                  # Cluster name; must match the one used by obproxy below
    ### Per-node settings follow
  observer01:
    zone: zone1               # Name of the zone this node belongs to
  observer02:
    zone: zone2
  observer03:
    zone: zone3
########################## obproxy deployment parameters ##########################
obproxy: 
  servers:
    - 127.0.0.1
  depends:
    - oceanbase-ce
  global:
    listen_port: 2883
    prometheus_listen_port: 2884
    home_path: /home/admin/obproxy
    # oceanbase root server list
    # format: ip:mysql_port,ip:mysql_port
    # rs_list: 172.72.8.11:2881;172.72.8.12:2881;172.72.8.13:2881
    enable_cluster_checkout: false
    skip_proxy_sys_private_check: true
    # cluster_name: lhrob312cluster
EOF
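
The memory settings in this file determine whether the cluster fits on the shared 80 GB host and how much memory is left for tenants. The arithmetic can be sketched as below, assuming that the memory available to tenants on one OBServer is memory_limit minus system_memory. The helper names are hypothetical, and only the G suffix is handled:

```shell
# Convert a size such as "8G" to whole GiB (only the G suffix is handled here).
to_gib() {
  printf '%s\n' "${1%G}"
}

# Memory left for tenants on one OBServer: memory_limit minus system_memory.
tenant_memory_gib() {
  echo $(( $(to_gib "$1") - $(to_gib "$2") ))
}

# Memory the whole cluster may claim on the shared host: memory_limit x nodes.
cluster_memory_gib() {
  echo $(( $(to_gib "$1") * $2 ))
}

echo "per-node tenant memory: $(tenant_memory_gib 8G 2G) GiB"
echo "cluster memory_limit total: $(cluster_memory_gib 8G 3) GiB"
```

With the values above this gives 6 GiB per node for tenants and 24 GiB for the three OBServers together, which leaves room on the host for the proxy container and the monitoring stack.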

Installing the cluster with OBD

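# Load the OBD environment into the current shell (running it with "sh" would only affect a subshell)
source /etc/profile.d/obd.sh

obd cluster deploy lhrob312cluster -c /tmp/obd_observer_obproxy.yaml -f

obd cluster list
obd cluster display lhrob312cluster

# Start the cluster (the first start also bootstraps it)
obd cluster start lhrob312cluster

# Edit parameters
obd cluster edit-config  lhrob312cluster

# Install the clients
yum install -y  obclient mariadb mariadb-libs mariadb-devel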
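
obd cluster list prints a table with the columns Name, Configuration Path and Status (Cached). When the deployment is scripted, the status column can be extracted to confirm that the cluster moved from deployed to running. The following is a sketch only: the helper name obd_cluster_status is hypothetical, and it assumes the pipe-delimited table layout that appears in the execution log of this article:

```shell
# Read `obd cluster list` output on stdin and print the status of cluster $1.
obd_cluster_status() {
  awk -F'|' -v name="$1" '
    NF >= 4 {
      n = $2; s = $4
      gsub(/^[ \t]+|[ \t]+$/, "", n)   # trim the name column
      gsub(/^[ \t]+|[ \t]+$/, "", s)   # trim the status column
      if (n == name) { print s; exit }
    }'
}

# Demo on captured rows; in practice: obd cluster list | obd_cluster_status lhrob312cluster
printf '%s\n' \
  '| Name            | Configuration Path                 | Status (Cached) |' \
  '| lhrob312cluster | /root/.obd/cluster/lhrob312cluster | running         |' |
  obd_cluster_status lhrob312cluster
```

Border and title rows have fewer than four pipe-separated fields, so they are skipped without special handling.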

Execution log:

[root@lhrobproxy ~]# obd cluster deploy lhrob312cluster -c /tmp/obd_observer_obproxy.yaml -f
Update OceanBase-community-stable-el7 ok
Update OceanBase-development-kit-el7 ok
Download oceanbase-ce-3.1.2-10000392021123010.el7.x86_64.rpm Time: 0:00:18   2.65 MB/s
Package oceanbase-ce-3.1.2 is available.
Download obproxy-3.2.0-1.el7.x86_64.rpm Time: 0:00:03   2.50 MB/s
Package obproxy-3.2.0 is available.
install oceanbase-ce-3.1.2 for local ok
install obproxy-3.2.0 for local ok
+-------------------------------------------------------------------------------------------+
|                                          Packages                                         |
+--------------+---------+-----------------------+------------------------------------------+
| Repository   | Version | Release               | Md5                                      |
+--------------+---------+-----------------------+------------------------------------------+
| oceanbase-ce | 3.1.2   | 10000392021123010.el7 | 7fafba0fac1e90cbd1b5b7ae5fa129b64dc63aed |
| obproxy      | 3.2.0   | 1.el7                 | 8d5c6978f988935dc3da1dbec208914668dcf3b2 |
+--------------+---------+-----------------------+------------------------------------------+
Repository integrity check ok
Parameter check ok
Open ssh connection ok
Remote oceanbase-ce-3.1.2-7fafba0fac1e90cbd1b5b7ae5fa129b64dc63aed repository install ok
Remote oceanbase-ce-3.1.2-7fafba0fac1e90cbd1b5b7ae5fa129b64dc63aed repository lib check !!
[WARN] z1(172.72.8.11) oceanbase-ce-3.1.2-7fafba0fac1e90cbd1b5b7ae5fa129b64dc63aed require: libmariadb.so.3
[WARN] z2(172.72.8.12) oceanbase-ce-3.1.2-7fafba0fac1e90cbd1b5b7ae5fa129b64dc63aed require: libmariadb.so.3
[WARN] z3(172.72.8.13) oceanbase-ce-3.1.2-7fafba0fac1e90cbd1b5b7ae5fa129b64dc63aed require: libmariadb.so.3

Remote obproxy-3.2.0-8d5c6978f988935dc3da1dbec208914668dcf3b2 repository install ok
Remote obproxy-3.2.0-8d5c6978f988935dc3da1dbec208914668dcf3b2 repository lib check ok
Try to get lib-repository
Download oceanbase-ce-libs-3.1.2-10000392021123010.el7.x86_64.rpm Time: 0:00:00   1.71 MB/s
Package oceanbase-ce-libs-3.1.2 is available.
install oceanbase-ce-libs-3.1.2 for local ok
Use oceanbase-ce-libs-3.1.2-94fff0ab31de053051dba66039e3185fa390cad5 for oceanbase-ce-3.1.2-7fafba0fac1e90cbd1b5b7ae5fa129b64dc63aed
Remote oceanbase-ce-libs-3.1.2-94fff0ab31de053051dba66039e3185fa390cad5 repository install ok
Remote oceanbase-ce-3.1.2-7fafba0fac1e90cbd1b5b7ae5fa129b64dc63aed repository lib check ok
Cluster status check ok
Initializes observer work home ok
Initializes obproxy work home ok
lhrob312cluster deployed
[root@lhrobproxy ~]# obd cluster list
+------------------------------------------------------------------------+
|                              Cluster List                              |
+-----------------+------------------------------------+-----------------+
| Name            | Configuration Path                 | Status (Cached) |
+-----------------+------------------------------------+-----------------+
| lhrob312cluster | /root/.obd/cluster/lhrob312cluster | deployed        |
+-----------------+------------------------------------+-----------------+
[root@lhrobproxy ~]# obd cluster start lhrob312cluster
Get local repositories and plugins ok
Open ssh connection ok
Load cluster param plugin ok
Check before start observer ok
[WARN] (172.72.8.11) clog and data use the same disk (/)
[WARN] (172.72.8.12) clog and data use the same disk (/)
[WARN] (172.72.8.13) clog and data use the same disk (/)

Check before start obproxy ok
Start observer ok
observer program health check ok
Connect to observer ok
Initialize cluster
Cluster bootstrap ok
Wait for observer init ok
+-----------------------------------------------+
|                    observer                   |
+-------------+---------+------+-------+--------+
| ip          | version | port | zone  | status |
+-------------+---------+------+-------+--------+
| 172.72.8.11 | 3.1.2   | 2881 | zone1 | active |
| 172.72.8.12 | 3.1.2   | 2881 | zone2 | active |
| 172.72.8.13 | 3.1.2   | 2881 | zone3 | active |
+-------------+---------+------+-------+--------+

Start obproxy ok
obproxy program health check ok
Connect to obproxy ok
+---------------------------------------------+
|                   obproxy                   |
+-----------+------+-----------------+--------+
| ip        | port | prometheus_port | status |
+-----------+------+-----------------+--------+
| 127.0.0.1 | 2883 | 2884            | active |
+-----------+------+-----------------+--------+
lhrob312cluster running



[root@lhrobproxy soft]# obd cluster list
+------------------------------------------------------------------------+
|                              Cluster List                              |
+-----------------+------------------------------------+-----------------+
| Name            | Configuration Path                 | Status (Cached) |
+-----------------+------------------------------------+-----------------+
| lhrob312cluster | /root/.obd/cluster/lhrob312cluster | running         |
+-----------------+------------------------------------+-----------------+
[root@lhrobproxy soft]# 

[root@lhrobproxy ~]# netstat -tulnp| grep 88
tcp        0      0 0.0.0.0:2883            0.0.0.0:*               LISTEN      6251/obproxy        
tcp        0      0 0.0.0.0:2884            0.0.0.0:*               LISTEN      6251/obproxy        
[root@lhrobproxy ~]# ps -ef|grep ob
admin     6220     1  0 11:22 ?        00:00:00 bash /home/admin/obproxy/obproxyd.sh /home/admin/obproxy 127.0.0.1 2883 daemon
admin     6251     1 58 11:22 ?        00:00:33 /home/admin/obproxy/bin/obproxy --listen_port 2883
root      6544   188  0 11:23 pts/0    00:00:00 grep --color=auto ob


-- The remaining nodes look similar
[root@lhrob1 /]# netstat -tulnp | grep 88
tcp        0      0 0.0.0.0:2881            0.0.0.0:*               LISTEN      3255/observer       
tcp        0      0 0.0.0.0:2882            0.0.0.0:*               LISTEN      3255/observer       
[root@lhrob1 /]# ps -ef|grep ob
admin     3255     1 99 10:51 ?        00:06:09 /home/admin/oceanbase/bin/observer -r 172.72.8.11:2882:2881;172.72.8.12:2882:2881;172.72.8.13:2882:2881 -o __min_full_resource_pool_memory=1073741824,memory_limit=8G,system_memory=2G,lower_case_table_names=1,foreign_key_checks=0,sys_bkgd_migration_retry_num=5,stack_size=512K,cpu_count=16,cache_wash_threshold=1G,workers_per_cpu_quota=10,schema_history_expire_time=1d,net_thread_count=4,major_freeze_duty_time=Disable,minor_freeze_times=10,enable_separate_sys_clog=True,enable_merge_by_turn=False,datafile_size=5G,enable_syslog_recycle=True,max_syslog_file_count=4,log_dir_size_threshold=1G -z zone1 -p 2881 -P 2882 -n lhrob312cluster -c 1 -d /home/admin/oceanbase/store -i eth0 -l INFO
root      3974   402  0 10:53 pts/0    00:00:00 grep --color=auto ob

[root@lhrob2 /]# netstat -tulnp | grep 88
tcp        0      0 0.0.0.0:2881            0.0.0.0:*               LISTEN      3157/observer       
tcp        0      0 0.0.0.0:2882            0.0.0.0:*               LISTEN      3157/observer       
[root@lhrob2 /]# ps -ef|grep ob
admin     3157     1 99 10:51 ?        00:08:22 /home/admin/oceanbase/bin/observer -r 172.72.8.11:2882:2881;172.72.8.12:2882:2881;172.72.8.13:2882:2881 -o __min_full_resource_pool_memory=1073741824,memory_limit=8G,system_memory=2G,lower_case_table_names=1,foreign_key_checks=0,sys_bkgd_migration_retry_num=5,stack_size=512K,cpu_count=16,cache_wash_threshold=1G,workers_per_cpu_quota=10,schema_history_expire_time=1d,net_thread_count=4,major_freeze_duty_time=Disable,minor_freeze_times=10,enable_separate_sys_clog=True,enable_merge_by_turn=False,datafile_size=5G,enable_syslog_recycle=True,max_syslog_file_count=4,log_dir_size_threshold=1G -z zone2 -p 2881 -P 2882 -n lhrob312cluster -c 1 -d /home/admin/oceanbase/store -i eth0 -l INFO
root      3927   391  0 10:54 pts/0    00:00:00 grep --color=auto ob
[root@lhrob2 /]# 

[root@lhrob3 /]# netstat -tulnp | grep 88
tcp        0      0 0.0.0.0:2881            0.0.0.0:*               LISTEN      3139/observer       
tcp        0      0 0.0.0.0:2882            0.0.0.0:*               LISTEN      3139/observer       
[root@lhrob3 /]# ps -ef|grep ob
admin     3139     1 99 10:51 ?        00:07:55 /home/admin/oceanbase/bin/observer -r 172.72.8.11:2882:2881;172.72.8.12:2882:2881;172.72.8.13:2882:2881 -o __min_full_resource_pool_memory=1073741824,memory_limit=8G,system_memory=2G,lower_case_table_names=1,foreign_key_checks=0,sys_bkgd_migration_retry_num=5,stack_size=512K,cpu_count=16,cache_wash_threshold=1G,workers_per_cpu_quota=10,schema_history_expire_time=1d,net_thread_count=4,major_freeze_duty_time=Disable,minor_freeze_times=10,enable_separate_sys_clog=True,enable_merge_by_turn=False,datafile_size=5G,enable_syslog_recycle=True,max_syslog_file_count=4,log_dir_size_threshold=1G -z zone3 -p 2881 -P 2882 -n lhrob312cluster -c 1 -d /home/admin/oceanbase/store -i eth0 -l INFO
root      3911   378  0 10:54 pts/0    00:00:00 grep --color=auto ob    
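
In the ps output above, the -r argument of each observer process lists every server as ip:rpc_port:mysql_port, joined with semicolons. As an illustration of that format (the helper name build_rs_list is hypothetical and not part of OBD, which assembles this string itself), the value can be reproduced as follows:

```shell
# Join "ip:rpc_port:mysql_port" entries with ";" as seen in the -r argument.
# Usage: build_rs_list RPC_PORT MYSQL_PORT IP...
build_rs_list() {
  rpc_port=$1
  mysql_port=$2
  shift 2
  list=""
  for ip in "$@"; do
    # Prepend the separator only when the list already has an entry.
    list="${list:+$list;}${ip}:${rpc_port}:${mysql_port}"
  done
  printf '%s\n' "$list"
}

build_rs_list 2882 2881 172.72.8.11 172.72.8.12 172.72.8.13
```

Comparing this string with the -r value on each node is a quick way to confirm that all three OBServers were started with the same server list.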

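Configuring obproxy

Reference: https://open.oceanbase.com/articles/1100243

obproxy communicates with the OB cluster through proxyro, an internal account in the sys tenant. This account must exist in the cluster, so it has to be created on the OBServer side.

After obproxy starts, the default login is root@proxysys with an empty password. The password needs to be changed (through the proxy parameter obproxy_sys_password).

After obproxy starts, the proxyro password also has to be set on the proxy (through the proxy parameter observer_sys_password). It must match the proxyro password created in the OB cluster; only then can obproxy connect to that cluster.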

-- Log in to the observer with the mysql client
mysql -h127.1 -uroot@sys -P2881 -p -c -A oceanbase

select * from mysql.user;
create user proxyro identified by 'lhr';
alter user proxyro identified by 'lhr';
grant select on *.* to proxyro;
alter user root identified by 'lhr';


-- Log in to obproxy
mysql -h127.1 -uroot@proxysys -P2883 -p
alter proxyconfig set obproxy_sys_password='lhr';
alter proxyconfig set observer_sys_password='lhr';
show proxyconfig like '%sys_password%';


mysql -h127.1 -uroot@sys -P2883 -plhr -c -A oceanbase
mysql -uroot@sys -plhr -h192.168.66.35 -P28814

select * from oceanbase.__all_server;
show full processlist;



[root@lhrobproxy ~]#  strings /home/admin/obproxy/etc/obproxy_config.bin | grep sys
observer_sys_password1=
observer_sys_password=6095142f4b755fb18e0ca1edc3fa38fe0bdc78b9
obproxy_sys_password=6095142f4b755fb18e0ca1edc3fa38fe0bdc78b9
skip_proxy_sys_private_check=True
syslog_level=INFO

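Session log:

On observer1 (proxyro already exists here with an empty password, so CREATE USER fails with ERROR 1396 and ALTER USER sets the password):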

[root@lhrob1 ~]# mysql -h127.1 -uroot -P2881 -p -c -A
Enter password: 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MySQL connection id is 3221496822
Server version: 5.7.25 OceanBase 3.1.2 (r10000392021123010-d4ace121deae5b81d8f0b40afbc4c02705b7fc1d) (Built Dec 30 2021 02:47:29)

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MySQL [(none)]>  select * from mysql.user;

| host | user       | password                                  | select_priv | insert_priv | update_priv | delete_priv | create_priv | drop_priv | reload_priv | shutdown_priv | process_priv | file_priv | grant_priv | reference_priv | index_priv | alter_priv | show_db_priv | super_priv | create_tmp_table_priv | lock_tables_priv | execute_priv | repl_slave_priv | repl_client_priv | create_view_priv | show_view_priv | create_routine_priv | alter_routine_priv | create_user_priv | event_priv | trigger_priv | create_tablespace_priv | ssl_type | ssl_cipher | x509_issuer | x509_subject | max_questions | max_updates | max_connections | max_user_connections | plugin             | authentication_string | password_expired |

| %    | root       |                                           | Y           | Y           | Y           | Y           | Y           | Y         | N           | N             | Y            | Y         | Y          | N              | Y          | Y          | Y            | Y          | N                     | N                | N            | N               | N                | Y                | Y              | N                   | N                  | Y                | N          | N            | N                      |          |            |             |              |             0 |           0 |               0 |                    0 | ob_native_password |                       |                  |
| %    | ORAAUDITOR | *9753e2cf9d2dcd5e13c052f581c310ac70c62723 | Y           | Y           | Y           | Y           | Y           | Y         | N           | N             | Y            | Y         | Y          | N              | Y          | Y          | Y            | Y          | N                     | N                | N            | N               | N                | Y                | Y              | N                   | N                  | Y                | N          | N            | N                      |          |            |             |              |             0 |           0 |               0 |                    0 | ob_native_password |                       |                  |
| %    | proxyro    |                                           | N           | N           | N           | N           | N           | N         | N           | N             | N            | N         | N          | N              | N          | N          | N            | N          | N                     | N                | N            | N               | N                | N                | N              | N                   | N                  | N                | N          | N            | N                      |          |            |             |              |             0 |           0 |               0 |                    0 | ob_native_password |                       |                  |
 ------ ------------ ------------------------------------------- ------------- ------------- ------------- ------------- ------------- ----------- ------------- --------------- -------------- ----------- ------------ ---------------- ------------ ------------ -------------- ------------ ----------------------- ------------------ -------------- ----------------- ------------------ ------------------ ---------------- --------------------- -------------------- ------------------ ------------ -------------- ------------------------ ---------- ------------ ------------- -------------- --------------- ------------- ----------------- ---------------------- -------------------- ----------------------- ------------------ 
3 rows in set (0.10 sec)

MySQL [(none)]> create user proxyro identified by 'lhr';
ERROR 1396 (HY000): Operation CREATE USER failed for 'proxyro'@'%'
MySQL [(none)]> alter user proxyro identified by 'lhr';
Query OK, 0 rows affected (0.33 sec)

MySQL [(none)]> grant select on *.* to proxyro;
Query OK, 0 rows affected (0.25 sec)

MySQL [(none)]> alter user root identified by 'lhr';
Query OK, 0 rows affected (0.33 sec)

MySQL [(none)]> select * from mysql.user;

| host | user       | password                                  | select_priv | insert_priv | update_priv | delete_priv | create_priv | drop_priv | reload_priv | shutdown_priv | process_priv | file_priv | grant_priv | reference_priv | index_priv | alter_priv | show_db_priv | super_priv | create_tmp_table_priv | lock_tables_priv | execute_priv | repl_slave_priv | repl_client_priv | create_view_priv | show_view_priv | create_routine_priv | alter_routine_priv | create_user_priv | event_priv | trigger_priv | create_tablespace_priv | ssl_type | ssl_cipher | x509_issuer | x509_subject | max_questions | max_updates | max_connections | max_user_connections | plugin             | authentication_string | password_expired |

| %    | root       | *827f67f8037d3f8f076544b6ffc6d56058166d8b | Y           | Y           | Y           | Y           | Y           | Y         | N           | N             | Y            | Y         | Y          | N              | Y          | Y          | Y            | Y          | N                     | N                | N            | N               | N                | Y                | Y              | N                   | N                  | Y                | N          | N            | N                      |          |            |             |              |             0 |           0 |               0 |                    0 | ob_native_password |                       |                  |
| %    | ORAAUDITOR | *9753e2cf9d2dcd5e13c052f581c310ac70c62723 | Y           | Y           | Y           | Y           | Y           | Y         | N           | N             | Y            | Y         | Y          | N              | Y          | Y          | Y            | Y          | N                     | N                | N            | N               | N                | Y                | Y              | N                   | N                  | Y                | N          | N            | N                      |          |            |             |              |             0 |           0 |               0 |                    0 | ob_native_password |                       |                  |
| %    | proxyro    | *827f67f8037d3f8f076544b6ffc6d56058166d8b | Y           | N           | N           | N           | N           | N         | N           | N             | N            | N         | N          | N              | N          | N          | N            | N          | N                     | N                | N            | N               | N                | N                | N              | N                   | N                  | N                | N          | N            | N                      |          |            |             |              |             0 |           0 |               0 |                    0 | ob_native_password |                       |                  |

3 rows in set (0.14 sec)

On obproxy:

[root@lhrobproxy ~]# mysql -h127.1 -uroot@proxysys -P2883 -p
Enter password: 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.6.25

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MySQL [(none)]> show databases;
+--------------------------+-------------------+
| Variable_name            | Value             |
+--------------------------+-------------------+
| tx_isolation             | READ-COMMITTED    |
| system_time_zone         | +08:00            |
| time_zone                | +08:00            |
| character_set_server     | utf8mb4           |
| character_set_client     | utf8mb4           |
| interactive_timeout      | 28800             |
| query_cache_size         | 1048576           |
| character_set_results    | utf8mb4           |
| max_allowed_packet       | 4194304           |
| sql_mode                 | STRICT_ALL_TABLES |
| net_buffer_length        | 16384             |
| wait_timeout             | 28800             |
| lower_case_table_names   | 2                 |
| query_cache_type         | OFF               |
| init_connect             |                   |
| transaction_isolation    | READ              |
| character_set_connection | utf8mb4           |
| net_write_timeout        | 60                |
+--------------------------+-------------------+
18 rows in set (0.00 sec)

MySQL [(none)]> 
MySQL [(none)]> show proxyconfig like '%sys_password%';
+------------------------+-------+--------------------------------+-------------+---------------+
| name                   | value | info                           | need_reboot | visible_level |
+------------------------+-------+--------------------------------+-------------+---------------+
| observer_sys_password1 |       | password for observer sys user | false       | SYS           |
| observer_sys_password  |       | password for observer sys user | false       | SYS           |
| obproxy_sys_password   |       | password for obproxy sys user  | false       | SYS           |
+------------------------+-------+--------------------------------+-------------+---------------+
3 rows in set (0.00 sec)

MySQL [(none)]> 
MySQL [(none)]> alter proxyconfig set obproxy_sys_password='lhr';
Query OK, 0 rows affected (0.00 sec)

MySQL [(none)]> alter proxyconfig set observer_sys_password='lhr';
Query OK, 0 rows affected (0.01 sec)

MySQL [(none)]> show proxyconfig like '%sys_password%';
+------------------------+------------------------------------------+--------------------------------+-------------+---------------+
| name                   | value                                    | info                           | need_reboot | visible_level |
+------------------------+------------------------------------------+--------------------------------+-------------+---------------+
| observer_sys_password1 |                                          | password for observer sys user | false       | SYS           |
| observer_sys_password  | 6095142f4b755fb18e0ca1edc3fa38fe0bdc78b9 | password for observer sys user | false       | SYS           |
| obproxy_sys_password   | 6095142f4b755fb18e0ca1edc3fa38fe0bdc78b9 | password for obproxy sys user  | false       | SYS           |
+------------------------+------------------------------------------+--------------------------------+-------------+---------------+
3 rows in set (0.00 sec)


[root@lhrobproxy ~]# mysql -h127.1 -uroot@sys -P2883 -plhr -c -A oceanbase
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.6.25 OceanBase 3.1.2 (r10000392021123010-d4ace121deae5b81d8f0b40afbc4c02705b7fc1d) (Built Dec 30 2021 02:47:29)

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MySQL [oceanbase]> show databases;
+--------------------+
| Database           |
+--------------------+
| oceanbase          |
| information_schema |
| mysql              |
| SYS                |
| LBACSYS            |
| ORAAUDITOR         |
| test               |
+--------------------+
7 rows in set (0.01 sec)

MySQL [oceanbase]> 
MySQL [oceanbase]> select * from oceanbase.__all_server;
+----------------------------+----------------------------+-------------+----------+----+-------+------------+-----------------+--------+-----------------------+----------------------------------------------------------------------------------------+-----------+--------------------+--------------+----------------+-------------------+
| gmt_create                 | gmt_modified               | svr_ip      | svr_port | id | zone  | inner_port | with_rootserver | status | block_migrate_in_time | build_version                                                                          | stop_time | start_service_time | first_sessid | with_partition | last_offline_time |
+----------------------------+----------------------------+-------------+----------+----+-------+------------+-----------------+--------+-----------------------+----------------------------------------------------------------------------------------+-----------+--------------------+--------------+----------------+-------------------+
| 2022-01-09 16:27:25.543648 | 2022-01-09 16:30:04.103995 | 172.72.8.11 |     2882 |  1 | zone1 |       2881 |               1 | active |                     0 | 3.1.2_10000392021123010-d4ace121deae5b81d8f0b40afbc4c02705b7fc1d(Dec 30 2021 02:47:29) |         0 |   1641716997009055 |            0 |              1 |                 0 |
| 2022-01-09 16:27:26.141450 | 2022-01-09 16:30:04.431168 | 172.72.8.12 |     2882 |  2 | zone2 |       2881 |               0 | active |                     0 | 3.1.2_10000392021123010-d4ace121deae5b81d8f0b40afbc4c02705b7fc1d(Dec 30 2021 02:47:29) |         0 |   1641717000186609 |            0 |              1 |                 0 |
| 2022-01-09 16:27:25.363613 | 2022-01-09 16:30:04.408387 | 172.72.8.13 |     2882 |  3 | zone3 |       2881 |               0 | active |                     0 | 3.1.2_10000392021123010-d4ace121deae5b81d8f0b40afbc4c02705b7fc1d(Dec 30 2021 02:47:29) |         0 |   1641717000489292 |            0 |              1 |                 0 |
+----------------------------+----------------------------+-------------+----------+----+-------+------------+-----------------+--------+-----------------------+----------------------------------------------------------------------------------------+-----------+--------------------+--------------+----------------+-------------------+
3 rows in set (0.03 sec)

MySQL [oceanbase]> show full processlist;
+------------+---------+--------+-------------------+-----------+---------+------+--------+-----------------------+-------------+------+--------------+
| Id         | User    | Tenant | Host              | db        | Command | Time | State  | Info                  | Ip          | Port | Proxy_sessid |
+------------+---------+--------+-------------------+-----------+---------+------+--------+-----------------------+-------------+------+--------------+
| 3221517435 | root    | sys    | 172.72.8.14:51096 | oceanbase | Query   |    0 | ACTIVE | show full processlist | 172.72.8.11 | 2881 |            8 |
| 3221496822 | root    | sys    | 127.0.0.1:36962   | NULL      | Sleep   |  205 | SLEEP  | NULL                  | 172.72.8.11 | 2881 |         NULL |
| 3222011942 | root    | sys    | 172.72.8.14:47922 | NULL      | Sleep   |   51 | SLEEP  | NULL                  | 172.72.8.13 | 2881 |            6 |
| 3221749796 | proxyro | sys    | 172.72.8.14:42744 | oceanbase | Sleep   |    0 | SLEEP  | NULL                  | 172.72.8.12 | 2881 |            7 |
+------------+---------+--------+-------------------+-----------+---------+------+--------+-----------------------+-------------+------+--------------+
4 rows in set (0.03 sec)

C:\Users\lhrxxt>mysql -uroot@sys -plhr -h192.168.66.35 -P28814
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 262154
Server version: 5.6.25 OceanBase 3.1.2 (r10000392021123010-d4ace121deae5b81d8f0b40afbc4c02705b7fc1d) (Built Dec 30 2021 02:47:29)

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MySQL [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| oceanbase          |
| information_schema |
| mysql              |
| SYS                |
| LBACSYS            |
| ORAAUDITOR         |
| test               |
+--------------------+
7 rows in set (0.06 sec)

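Installing OBAgent

OBAgent is a monitoring and collection framework written in Go, usually deployed on the OBServer nodes. It supports both push and pull collection modes to suit different scenarios. Its default plugins cover host data collection, OceanBase database metric collection, label processing for monitoring data, and an HTTP service that speaks the Prometheus protocol. To collect from other data sources or to customize the processing pipeline, you only need to develop the corresponding plugin.

Reference: https://open.oceanbase.com/docs/tutorials/quickstart/V1.0.0/2-9-how-to-deploy-obagent

Editing the OBAgent deployment configuration file

The OBAgent deployment configuration can live in the same file as the OceanBase cluster deployment configuration, or OBAgent can be deployed separately later.

The example below deploys OBAgent from its own configuration file, which follows the same style as the OceanBase cluster deployment file. First list the nodes to deploy to, each with a name and an IP. Node names only need to be unique, so host names work as long as they are unique. Next comes the global configuration: settings shared by all nodes go under the global section. Finally come the per-node settings that differ between nodes, such as the zone name.

cat > /tmp/obd_obagent_only.yaml <<"EOF"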
## Only need to configure when remote login is required
user:
    username: admin    
    password: lhr    
    key_file:  
obagent:
  servers:
    # Please don't use hostname, only IP can be supported
    - name: observer01
      ip: 172.72.8.11    # OB1 address
    - name: observer02
      ip: 172.72.8.12    # OB2 address
    - name: observer03
      ip: 172.72.8.13    # OB3 address
  global:
    # The working directory for obagent. obagent is started under this directory. This is a required field.
    home_path: /home/admin/obagent
    # The port that pulls and manages the metrics. The default port number is 8088.
    server_port: 8088
    # Debug port for pprof. The default port number is 8089.
    pprof_port: 8089
    # Log level. The default value is INFO.
    log_level: INFO
    # Log path. The default value is log/monagent.log.
    log_path: log/monagent.log
    # Encryption method. OBD supports aes and plain. The default value is plain.
    crypto_method: plain
    # Path to store the crypto key. The default value is conf/.config_secret.key.
    # crypto_path: conf/.config_secret.key
    # Size for a single log file. Log size is measured in Megabytes. The default value is 30M.
    log_size: 30
    # Expiration time for logs. The default value is 7 days.
    log_expire_day: 7
    # The maximum number for log files. The default value is 10.
    log_file_count: 10
    # Whether to use local time for log files. The default value is true.
    # log_use_localtime: true
    # Whether to enable log compression. The default value is true.
    # log_compress: true
    # Username for HTTP authentication. The default value is admin.
    http_basic_auth_user: admin
    # Password for HTTP authentication. The default value is root.
    http_basic_auth_password: lhr
    # Username for debug service. The default value is admin.
    pprof_basic_auth_user: admin
    # Password for debug service. The default value is root.
    pprof_basic_auth_password: lhr
    # Monitor username for OceanBase Database. The user must have read access to OceanBase Database as a system tenant. The default value is root.
    monitor_user: root
    # Monitor password for OceanBase Database. The default value is empty. When a depends exists, OBD gets this value from the oceanbase-ce of the depends. The value is the same as the root_password in oceanbase-ce.
    monitor_password: lhr
    # The SQL port for observer. The default value is 2881. When a depends exists, OBD gets this value from the oceanbase-ce of the depends. The value is the same as the mysql_port in oceanbase-ce.
    sql_port: 2881
    # The RPC port for observer. The default value is 2882. When a depends exists, OBD gets this value from the oceanbase-ce of the depends. The value is the same as the rpc_port in oceanbase-ce.
    rpc_port: 2882
    # Cluster name for OceanBase Database. When a depends exists, OBD gets this value from the oceanbase-ce of the depends. The value is the same as the appname in oceanbase-ce.
    cluster_name: lhrob312cluster
    # Cluster ID for OceanBase Database. When a depends exists, OBD gets this value from the oceanbase-ce of the depends. The value is the same as the cluster_id in oceanbase-ce.
    cluster_id: 1
    # Zone name for your observer. The default value is zone1. When a depends exists, OBD gets this value from the oceanbase-ce of the depends. The value is the same as the zone name in oceanbase-ce.
    zone_name: zone1
    # Monitor status for OceanBase Database.  Active is to enable. Inactive is to disable. The default value is active. When you deploy an cluster automatically, OBD decides whether to enable this parameter based on depends.
    ob_monitor_status: active
    # Monitor status for your host. Active is to enable. Inactive is to disable. The default value is active.
    host_monitor_status: active
    # Whether to disable the basic authentication for HTTP service. True is to disable. False is to enable. The default value is false.
    disable_http_basic_auth: false
    # Whether to disable the basic authentication for the debug interface. True is to disable. False is to enable. The default value is false.
    disable_pprof_basic_auth: false

  observer01:
    zone: zone1
  observer02:
    zone: zone2
  observer03:
    zone: zone3

EOF

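Notes:

The connection port of each node is specified with sql_port, not mysql_port, which differs from the OBServer node configuration.

The monitoring user (the monitor_user setting) and its password must be created in the sys tenant.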
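The configuration above uses root as monitor_user. If a dedicated read-only account is preferred, it could be created along the following lines. This is only a minimal sketch: the account name ob_monitor, its password, and the helper monitor_user_sql are hypothetical, and granting SELECT on the oceanbase schema is an assumption about what OBAgent needs, not something the source confirms.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the SQL that creates a read-only monitoring account.
# The account name and password are illustrative; passwords containing a single
# quote are not handled.
monitor_user_sql() {
  local user="$1" password="$2"
  printf "CREATE USER %s IDENTIFIED BY '%s';\n" "$user" "$password"
  # Assumption: read access to the oceanbase schema is enough for the agent.
  printf "GRANT SELECT ON oceanbase.* TO %s;\n" "$user"
}

# Usage (not executed here): pipe the statements into the sys tenant through OBProxy.
#   monitor_user_sql ob_monitor 'ChangeMe' | mysql -h172.72.8.14 -P2883 -uroot@sys -plhr
```

If such an account is used, monitor_user and monitor_password in the YAML file would have to be changed to match.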

Deploying OBAgent with OBD

The first time, run the deploy command and point it at the OBAgent configuration file.

[root@lhrobproxy ~]# obd cluster deploy obagent-only -c  /tmp/obd_obagent_only.yaml
Download obagent-1.1.0-1.el7.x86_64.rpm Time: 0:00:02   3.50 MB/s
Package obagent-1.1.0 is available.
install obagent-1.1.0 for local ok
+---------------------------------------------------------------------------+
|                                  Packages                                 |
+------------+---------+---------+------------------------------------------+
| Repository | Version | Release | Md5                                      |
+------------+---------+---------+------------------------------------------+
| obagent    | 1.1.0   | 1.el7   | d2416fadeadba35944872467843d55da0999f298 |
+------------+---------+---------+------------------------------------------+
Repository integrity check ok
Parameter check ok
Open ssh connection ok
Remote obagent-1.1.0-d2416fadeadba35944872467843d55da0999f298 repository install ok
Remote obagent-1.1.0-d2416fadeadba35944872467843d55da0999f298 repository lib check ok
Cluster status check ok
Initializes obagent work home ok
obagent-only deployed
[root@lhrobproxy ~]# obd cluster list
+------------------------------------------------------------------------+
|                              Cluster List                              |
+-----------------+------------------------------------+-----------------+
| Name            | Configuration Path                 | Status (Cached) |
+-----------------+------------------------------------+-----------------+
| obagent-only    | /root/.obd/cluster/obagent-only    | deployed        |
| lhrob312cluster | /root/.obd/cluster/lhrob312cluster | running         |
+-----------------+------------------------------------+-----------------+

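After the deploy command runs, the configuration file is copied to ~/.obd/cluster/obagent-only/config.yaml, so later changes to the original /tmp/obd_obagent_only.yaml no longer take effect. At that point you can either edit the active configuration with edit-config, or clean up the deployment with the destroy command and deploy again from the edited /tmp/obd_obagent_only.yaml. Which to choose depends on how far-reaching the change is.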
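The two options can be sketched as the command sequences below. The helper reconfigure_cmds is hypothetical and only prints the commands instead of running them; the deployment name and YAML path are the ones used in this walkthrough. Following edit-config with a restart is an assumption: if OBD suggests a different follow-up command after you save, use that one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) the OBD commands for each option.
#   $1 = deployment name, $2 = edit | redeploy, $3 = path to the source YAML
reconfigure_cmds() {
  local deploy="$1" mode="$2" yaml="$3"
  case "$mode" in
    edit)
      # Small change: edit the copy that OBD keeps under ~/.obd/cluster/<name>/
      echo "obd cluster edit-config $deploy"
      echo "obd cluster restart $deploy"
      ;;
    redeploy)
      # Large change: throw the deployment away and read the YAML file again
      echo "obd cluster destroy $deploy"
      echo "obd cluster deploy $deploy -c $yaml"
      echo "obd cluster start $deploy"
      ;;
    *)
      echo "usage: reconfigure_cmds <name> edit|redeploy <yaml>" >&2
      return 1
      ;;
  esac
}

reconfigure_cmds obagent-only redeploy /tmp/obd_obagent_only.yaml
```

Printing the commands first makes it easy to review a destructive step such as destroy before running it by hand.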

The deploy command only installs the OBAgent software on each node (by unpacking it directly, not by RPM installation). The directory layout is as follows:

[admin@lhrob1 ~]$ pwd
/home/admin
[admin@lhrob1 ~]$ tree obagent/
obagent/
├── bin
│   └── monagent -> /home/admin/.obd/repository/obagent/1.0.0/1d65fc3d2cd08b26d6142b6149eb6806260aa7db/bin/monagent
├── conf
│   ├── config_properties
│   │   ├── monagent_basic_auth.yaml
│   │   └── monagent_pipeline.yaml
│   ├── module_config
│   │   ├── monagent_basic_auth.yaml
│   │   ├── monagent_config.yaml
│   │   ├── monitor_node_host.yaml
│   │   └── monitor_ob.yaml
│   ├── monagent.yaml
│   └── prometheus_config
│       ├── prometheus.yaml
│       └── rules
│           ├── host_rules.yaml
│           └── ob_rules.yaml
├── lib
├── log
│   └── monagent.log
└── run

9 directories, 12 files
[admin@lhrob1 ~]$

Starting OBAgent with OBD

[root@lhrobproxy ~]# obd cluster start obagent-only
Get local repositories and plugins ok
Open ssh connection ok
Load cluster param plugin ok
Check before start obagent ok
Start obproxy ok
obagent program health check ok
+-------------------------------------------------+
|                     obagent                     |
+-------------+-------------+------------+--------+
| ip          | server_port | pprof_port | status |
+-------------+-------------+------------+--------+
| 172.72.8.11 | 8088        | 8089       | active |
| 172.72.8.12 | 8088        | 8089       | active |
| 172.72.8.13 | 8088        | 8089       | active |
+-------------+-------------+------------+--------+
obagent-only running

[root@lhrobproxy ~]# obd cluster list
+------------------------------------------------------------------------+
|                              Cluster List                              |
+-----------------+------------------------------------+-----------------+
| Name            | Configuration Path                 | Status (Cached) |
+-----------------+------------------------------------+-----------------+
| obagent-only    | /root/.obd/cluster/obagent-only    | running         |
| lhrob312cluster | /root/.obd/cluster/lhrob312cluster | running         |
+-----------------+------------------------------------+-----------------+

[root@lhrobproxy ~]# obd cluster display obagent-only
Get local repositories and plugins ok
Open ssh connection ok
Cluster status check ok
+-------------------------------------------------+
|                     obagent                     |
+-------------+-------------+------------+--------+
| ip          | server_port | pprof_port | status |
+-------------+-------------+------------+--------+
| 172.72.8.11 | 8088        | 8089       | active |
| 172.72.8.12 | 8088        | 8089       | active |
| 172.72.8.13 | 8088        | 8089       | active |
+-------------+-------------+------------+--------+

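According to the tutorial this section follows, OBAgent runs two processes after startup, of which monagent listens on the configured ports. In the output below, ps shows a single monagent process, listening on ports 8088 and 8089.

[root@lhrob1 admin]# ps -ef|grep agent | grep -v grep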
admin      500     1  0 17:27 ?        00:00:00 /home/admin/obagent/bin/monagent -c conf/monagent.yaml

[root@lhrob1 admin]# netstat -ntlp |grep 80
tcp6       0      0 :::8088                 :::*                    LISTEN      500/monagent        
tcp6       0      0 :::8089                 :::*                    LISTEN      500/monagent        
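To confirm that the agent is actually serving metrics and not merely listening, each endpoint can be probed with curl. A minimal sketch, assuming the basic-auth credentials (admin / lhr) and port 8088 from the configuration above, and the four metric paths that OBAgent writes into its generated Prometheus configuration; the helper names obagent_url and probe_obagent are hypothetical.

```shell
#!/usr/bin/env bash
# Port and credentials as configured in /tmp/obd_obagent_only.yaml (assumed unchanged).
OBAGENT_PORT=8088
OBAGENT_AUTH="admin:lhr"

# Build the URL of one metrics endpoint on one node.
obagent_url() {
  printf 'http://%s:%s%s\n' "$1" "$OBAGENT_PORT" "$2"
}

# Request every metrics path on a node and report whether it answered successfully.
probe_obagent() {
  local host="$1" path
  for path in /metrics/node/host /metrics/ob/basic /metrics/ob/extra /metrics/stat; do
    # -f: treat HTTP errors as failures, -m 5: give up after five seconds
    if curl -fsS -m 5 -u "$OBAGENT_AUTH" -o /dev/null "$(obagent_url "$host" "$path")"; then
      echo "OK   $host $path"
    else
      echo "FAIL $host $path"
    fi
  done
}

# Usage (not executed here):
#   for h in 172.72.8.11 172.72.8.12 172.72.8.13; do probe_obagent "$h"; done
```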

Restarting OBAgent

To restart OBAgent directly on a single node:

kill -9 `pidof monagent`

cd /home/admin/obagent && nohup bin/monagent -c conf/monagent.yaml &

To restart all agents centrally, use the OBD command:

obd cluster restart obagent-only

If OBAgent was deployed together with OceanBase, restart only the obagent component (obce-3zones-obagent below is an example deployment name):

obd cluster restart obce-3zones-obagent -c obagent

[admin@lhrobproxy ~]$ obd cluster restart obce-3zones-obagent -c obagent
Get local repositories and plugins ok
Open ssh connection ok
Stop obagent ok
succeed
Get local repositories and plugins ok
Open ssh connection ok
Cluster param config check ok
Check before start obagent ok
obagent program health check ok
+-------------------------------------------------+
|                     obagent                     |
+-------------+-------------+------------+--------+
| ip          | server_port | pprof_port | status |
+-------------+-------------+------------+--------+
| 172.72.8.11 | 8088        | 8089       | active |
| 172.72.8.12 | 8088        | 8089       | active |
| 172.72.8.13 | 8088        | 8089       | active |
+-------------+-------------+------------+--------+
succeed

Installing Prometheus and Grafana

Reference: https://open.oceanbase.com/docs/tutorials/quickstart/V1.0.0/2-9-how-to-deploy-obagent

Prometheus configuration

Once started, OBAgent automatically generates a Prometheus configuration file on each node under the OBAgent installation directory, for example /home/admin/obagent/conf/prometheus_config/. Prometheus can use this file directly.

For example:

[root@lhrob1 admin]# cd /home/admin/obagent/conf/prometheus_config/
[root@lhrob1 prometheus_config]# ll
total 8
-rw------- 1 admin admin 1242 Jan  9 17:27 prometheus.yaml
drwxrwxr-x 2 admin admin 4096 Jan  9 17:27 rules
[root@lhrob1 prometheus_config]# more prometheus.yaml 
global:
  scrape_interval:     1s
  evaluation_interval: 10s

rule_files:
  - "rules/*rules.yaml"

scrape_configs:
  - job_name: prometheus
    metrics_path: /metrics
    scheme: http
    static_configs:
    - targets:
      - 'localhost:9090'
  - job_name: node
    basic_auth:
      username: admin
      password: lhr
    metrics_path: /metrics/node/host
    scheme: http
    static_configs:
      - targets:
        - 172.72.8.11:8088
        - 172.72.8.12:8088
        - 172.72.8.13:8088
  - job_name: ob_basic
    basic_auth:
      username: admin
      password: lhr
    metrics_path: /metrics/ob/basic
    scheme: http
    static_configs:
      - targets:
        - 172.72.8.11:8088
        - 172.72.8.12:8088
        - 172.72.8.13:8088
  - job_name: ob_extra
    basic_auth:
      username: admin
      password: lhr
    metrics_path: /metrics/ob/extra
    scheme: http
    static_configs:
      - targets:
        - 172.72.8.11:8088
        - 172.72.8.12:8088
        - 172.72.8.13:8088
  - job_name: agent
    basic_auth:
      username: admin
      password: lhr
    metrics_path: /metrics/stat
    scheme: http
    static_configs:
      - targets:
        - 172.72.8.11:8088
        - 172.72.8.12:8088
        - 172.72.8.13:8088

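A brief explanation:

+---------------------+-------------------+--------------------------+
| Setting             | Value             | Description              |
+---------------------+-------------------+--------------------------+
| scrape_interval     | 1s                | Scrape interval          |
| evaluation_interval | 10s               | Rule evaluation interval |
| rule_files          | rules/*rules.yaml | Alerting rules           |
| scrape_configs      |                   | Scrape configuration     |
+---------------------+-------------------+--------------------------+

How to use it:

Download the appropriate version from https://prometheus.io/download/ and install it on the server. The official site provides prebuilt binaries, so you only need to unpack the archive; no compilation is required.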

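wget https://github.com/prometheus/prometheus/releases/download/v2.32.1/prometheus-2.32.1.linux-amd64.tar.gz

# Install Prometheus
tar -zxvf  prometheus-2.32.1.linux-amd64.tar.gz -C /usr/local/
ln -s /usr/local/prometheus-2.32.1.linux-amd64 /usr/local/prometheus
ln -s /usr/local/prometheus/prometheus /usr/local/bin/prometheus

# Copy the configuration generated by OBAgent
# (run on lhrob1, from /home/admin/obagent/conf/prometheus_config/)
scp prometheus.yaml lhrobproxy:/usr/local/prometheus/prometheus.yml
# Optional: the config references rules/*rules.yaml relative to its own directory,
# so copy the rules directory too if the alerting rules should be loaded:
# scp -r rules lhrobproxy:/usr/local/prometheus/

# Start
prometheus --config.file=/usr/local/prometheus/prometheus.yml --storage.tsdb.path=/usr/local/prometheus/data/ --web.enable-lifecycle --storage.tsdb.retention.time=60d &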


[root@lhrobproxy ~]# netstat -tulnp | grep 9090
tcp6       0      0 :::9090                 :::*                    LISTEN      19891/prometheus

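--web.enable-lifecycle: with this flag the configuration file can be hot-reloaded remotely without restarting Prometheus. The call is curl -X POST http://ip:9090/-/reload

--storage.tsdb.retention.time: data is kept for 15 days by default; set this flag at startup to control the retention period.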
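Because a reload with a broken configuration is rejected, it can help to validate the file first with promtool, which ships in the same tarball as the prometheus binary. A minimal sketch, assuming the install path used above; the variables PROM_HOME and PROM_URL and both helper functions are hypothetical names.

```shell
#!/usr/bin/env bash
# Assumed locations, taken from the install steps in this walkthrough.
PROM_HOME="${PROM_HOME:-/usr/local/prometheus}"
PROM_URL="${PROM_URL:-http://127.0.0.1:9090}"

# URL of the lifecycle reload endpoint (only served with --web.enable-lifecycle).
reload_endpoint() {
  printf '%s/-/reload\n' "${PROM_URL%/}"
}

# Validate the configuration, then ask the running server to re-read it.
reload_prometheus() {
  "$PROM_HOME/promtool" check config "$PROM_HOME/prometheus.yml" || return 1
  curl -fsS -X POST "$(reload_endpoint)"
}

# Usage (not executed here):
#   reload_prometheus && echo "configuration reloaded"
```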

After startup, open http://172.72.8.14:9090/targets in a browser. From outside the Docker host, use the host's mapped port 29090 instead.
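The same target status can be checked from the command line through the Prometheus HTTP API at /api/v1/targets. The sketch below counts targets per health state with plain text tools; the helper count_health is hypothetical, and it assumes the API returns compact JSON in which each target carries a "health":"up" or "health":"down" field.

```shell
#!/usr/bin/env bash
# Read the /api/v1/targets JSON on stdin and print one line per health state
# with the number of targets in that state.
count_health() {
  grep -o '"health":"[a-z]*"' | sort | uniq -c | awk '{print $2, $1}'
}

# Usage (not executed here):
#   curl -s http://172.72.8.14:9090/api/v1/targets | count_health
```

With three OBAgent nodes and the five jobs defined in the generated configuration, every target should be reported as up once the agents are running.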

Using Grafana

First download the latest version from the Grafana website, then install and start it. Download page: https://grafana.com/grafana/download .

wget https://dl.grafana.com/enterprise/release/grafana-enterprise-8.3.3-1.x86_64.rpm
yum install -y grafana-enterprise-8.3.3-1.x86_64.rpm

systemctl daemon-reload
systemctl enable grafana-server.service
systemctl start grafana-server.service
systemctl status grafana-server.service

netstat -tulnp | grep 3000
lsof -i:3000




[root@lhrobproxy soft]# lsof -i:3000
COMMAND     PID    USER   FD   TYPE    DEVICE SIZE/OFF NODE NAME
grafana-s 17064 grafana   15u  IPv6 149740351      0t0  TCP *:hbci (LISTEN)
[root@lhrobproxy soft]# 
[root@lhrobproxy soft]# 
[root@lhrobproxy soft]# netstat -tulnp | grep 3000
tcp6       0      0 :::3000                 :::*                    LISTEN      17064/grafana-serve

[root@lhrobproxy soft]# systemctl status grafana-server.service
● grafana-server.service - Grafana instance
   Loaded: loaded (/usr/lib/systemd/system/grafana-server.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2022-01-09 17:47:53 CST; 6s ago
     Docs: http://docs.grafana.org
 Main PID: 17064 (grafana-server)
   CGroup: /docker/494cb2a2cafe261ec6b602029fd6c94333765995d776722a98915bf304e0df7c/system.slice/grafana-server.service
           └─17064 /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --pidfile=/var/run/grafana/grafana-server.pid --packaging=rpm cfg:default.paths.logs=/var/log/grafana cfg:default.paths.data=/var/lib/grafana cfg:default.paths.plugins=/var/lib/grafana/plugins cfg:default.paths.provisioning=/etc/grafana/provisioning

Jan 09 17:47:53 lhrobproxy grafana-server[17064]: t=2022-01-09T17:47:53+0800 lvl=info msg="Created default admin" logger=sqlstore user=admin
Jan 09 17:47:53 lhrobproxy grafana-server[17064]: t=2022-01-09T17:47:53+0800 lvl=info msg="Created default organization" logger=sqlstore
Jan 09 17:47:53 lhrobproxy grafana-server[17064]: t=2022-01-09T17:47:53+0800 lvl=info msg="Validated license token" logger=licensing appURL=http://localhost:3000/ source=disk status=NotFound
Jan 09 17:47:53 lhrobproxy grafana-server[17064]: t=2022-01-09T17:47:53+0800 lvl=info msg="Initialising plugins" logger=plugin.manager
Jan 09 17:47:53 lhrobproxy grafana-server[17064]: t=2022-01-09T17:47:53+0800 lvl=info msg="Plugin registered" logger=plugin.manager pluginId=input
Jan 09 17:47:53 lhrobproxy grafana-server[17064]: t=2022-01-09T17:47:53+0800 lvl=info msg="Live Push Gateway initialization" logger=live.push_http
Jan 09 17:47:53 lhrobproxy grafana-server[17064]: t=2022-01-09T17:47:53+0800 lvl=info msg="Writing PID file" logger=server path=/var/run/grafana/grafana-server.pid pid=17064
Jan 09 17:47:53 lhrobproxy systemd[1]: Started Grafana instance.
Jan 09 17:47:53 lhrobproxy grafana-server[17064]: t=2022-01-09T17:47:53+0800 lvl=warn msg="Scheduling and sending of reports disabled, SMTP is not configured and enabled. Configure SMTP to enable." logger=report
Jan 09 17:47:53 lhrobproxy grafana-server[17064]: t=2022-01-09T17:47:53+0800 lvl=info msg="HTTP Server Listen" logger=http.server address=[::]:3000 protocol=http subUrl= socket=
[root@lhrobproxy soft]# 

Open Grafana at http://172.72.8.14:3000/login ; the username and password are both admin.

Then add a data source in Grafana and enter the Prometheus address: http://127.0.0.1:9090
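Instead of clicking through the UI, the data source can also be declared in a Grafana provisioning file, which is read at startup from the provisioning directory shown in the service status above (/etc/grafana/provisioning). A minimal sketch: the helper write_prometheus_datasource, the file name prometheus.yaml, and the data source name Prometheus are all hypothetical choices.

```shell
#!/usr/bin/env bash
# Write a provisioning file that registers Prometheus as the default data source.
#   $1 = target directory, $2 = Prometheus base URL
write_prometheus_datasource() {
  local dir="$1" url="$2"
  mkdir -p "$dir"
  cat > "$dir/prometheus.yaml" <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: $url
    isDefault: true
EOF
}

# Usage (not executed here):
#   write_prometheus_datasource /etc/grafana/provisioning/datasources http://127.0.0.1:9090
#   systemctl restart grafana-server.service
```

The advantage of the file-based route is that the data source survives a rebuild of the container without manual steps.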

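Next, download the host performance and OceanBase performance dashboard templates that OceanBase published on the Grafana website. Both are JSON files.

  • Host performance template: 15216
  • OceanBase performance template: 15215

After downloading them, import the two JSON files in Grafana.

Alternatively, import them online by entering the dashboard IDs (15215 and 15216) directly.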

Monitoring data

Host monitoring data:

OB monitoring data:

Summary

Is OB really this resource-hungry?

This concludes the experiment.
