简介
文档:https://www.postgres-xl.org/documentation/index.html
https://www.postgres-xl.org/overview/
https://wiki.postgresql.org/wiki/Postgres-XC
Postgres-XL是一款开源的PG集群软件,XL代表eXtensible Lattice,即可扩展的PG“格子”之意,以下简称PGXL。
官方称其既适合写操作压力较大的OLTP应用,又适合读操作为主的大数据应用。它的前身是Postgres-XC(简称PGXC),PGXC是在PG的基础上加入了集群功能,主要适用于OLTP应用。PGXL是在PGXC的基础上的升级产品,加入了一些适用于OLAP应用的特性,如 Massively Parallel Processing (MPP) 特性。
通俗的说PGXL的代码是包含PG代码,使用PGXL安装PG集群并不需要单独安装PG。这样带来的一个问题是无法随意选择任意版本的PG,好在PGXL跟进PG较及时,目前最新版本Postgres-XL 10R1,基于PG 10。
Postgres-XL是由多个PostgreSQL数据库集群组成的,但看起来是单个数据库集群一样。根据你的设计,每个表都可以在各个数据库之间进行复制或分发。
为了实现这一目标,Postgres-XL是由GTM,Coordinator和Datanode三部分组成。GTM负责支持事务的ACID。Datanode存储数据并处理SQL操作(只能操作自己存储的数据)。Coordinator分析来自应用程序的SQL操作,确定哪个Datanode包含数据,并将指令发送到正确的Datanode。
通常情况下,GTM应该安装在单独的服务器上,因为GTM要处理所有Coordinator和Datanode的事务需求。你可以配置GTM-Proxy(GTM代理)来分组同一服务器上运行的Coordinator和Datanode的请求和响应, GTM-Proxy减少了与GTM的交互次数和数据量。GTM代理还还可以处理GTM故障。
在同一台服务器上同时部署Coordinator和Datanode通常是很好的做法,这样我们就不必担心两者之间的负载平衡,如果是复制表的话,不需要发送额外的网络请求就可以从本地拿到数据。你可以部署任意数量的服务器(Coordinator和Datanode同时运行)。Coordinator和Datanode都是PostgreSQL实例,你可能需要做些配置使它们避免资源冲突。例如为它们分配不同的工作目录和端口号是非常重要的。
Postgres-XL允许多个Coordinator单独从应用程序接受SQL指令,而不是集中的方式。写操作可以通过任何一个Coordinator来完成,没有任何区别。他们看起来就像是单一的数据库。Coordinator的职责是接受和分销SQL指令,查找哪些Datanodes存储相应的数据,可能需要将查询计划发送到适当的Datanodes,然后收集结果并将其返回给应用程序。
Coordinator不存储用户数据。它仅存储目录数据,用来确定如何处理SQL语句以及查找目标Datanodes等等。你不必过分担心Coordinator失败,当一个Coordinator失败时,你可以切换到另一个。
GTM可能发生单点故障(SPOF)。为了防止这种情况,你可以运行另一个GTM(GTM-Standby)来备份主GTM的状态。当主GTM失败时,GTM-Proxy可以随时切换到备用。
如上所述,Postgres-XL的Coordinator和Datanodes都是是PostgreSQL数据库。在数据库范畴,PostgreSQL使用客户端/服务器模型。PostgreSQL会话包含如下两个服务:
- server,服务端进程,管理数据库文件、接受客户端应用程序的连接,为client执行数据库操作。该进程称为postgres。
- client,客户端,需要执行数据库操作。客户端应用程序多种多样:可以是文本工具,图形应用程序,访问数据库以显示网页的Web服务器或专门的数据库维护工具。一些客户端应用程序随PostgreSQL发行版提供;大多数是由用户开发的。
在典型的客户端/服务器应用程序中,客户端和服务器部署在不同的主机上。它们通过TCP / IP网络连接进行通信。需要注意的是,有些文件在客户端上可以访问,在数据库服务器上却不行(可能只是文件名不同)。
PostgreSQL服务器可以处理来自客户端的多个并发连接。为此,它为每个连接启动一个新进程。连接一旦建立,原始的postgres进程不会干预客户端和新的服务器进程之间的通信。主服务进程是始终运行的,等待客户端的连接,期间,有很多连接产生或消亡。
组件简介
- Global Transaction Monitor (GTM) 全局事务管理器,确保群集范围内的事务一致性。GTM负责发放事务ID和快照作为其多版本并发控制的一部分。 集群可选地配置一个备用GTM,以改进可用性。此外,可以在协调器间配置代理GTM, 可用于改善可扩展性,减少GTM的通信量。
- GTM Standby GTM的备节点,在pgxc,pgxl中,GTM控制所有的全局事务分配,如果出现问题,就会导致整个集群不可用,为了增加可用性,增加该备用节点。当GTM出现问题时,GTM Standby可以升级为GTM,保证集群正常工作。
- GTM-Proxy GTM需要与所有的Coordinators通信,为了降低压力,可以在每个Coordinator机器上部署一个GTM-Proxy。
- Coordinator 协调员管理用户会话,并与GTM和数据节点进行交互。协调员解析,并计划查询,并给语句中的每一个组件发送下一个序列化的全局性计划。 为节省机器,通常此服务和数据节点部署在一起。
- Data Node 数据节点是数据实际存储的地方。数据的分布可以由DBA来配置。为了提高可用性,可以配置数据节点的热备以便进行故障转移准备。
总结:gtm是负责ACID的,保证分布式数据库全局事务一致性。得益于此,就算数据节点是分布的,但是你在主节点操作增删改查事务时,就如同只操作一个数据库一样简单。Coordinator是调度的,将操作指令发送到各个数据节点。datanodes是数据节点,分布式存储数据。
规划
准备三台Centos7服务器(或者虚拟机),版本为“CentOS Linux release 7.6.1810 (Core) ”,集群规划如下:
主机名 | IP | 角色 | 端口 | nodename | 数据目录 |
---|---|---|---|---|---|
lhrpgxl90 | 172.72.6.90 | GTM | 6666 | gtm | |
GTM Slave | 20001 | gtmSlave | PGHOME/data/gtm∣∣∣∣GTMSlave∣20001∣gtmSlave∣PGHOME/data/gtmSlave | ||
lhrpgxl91 | 172.72.6.91 | Coordinator | 5432 | coord1 | |
Datanode | 5433 | datanode1 | PGHOME/data/coord∣∣∣∣Datanode∣5433∣datanode1∣PGHOME/data/dn_master | ||
Datanode Slave | 15433 | datanode1_slave | |||
GTM Proxy | 6666 | gtm_pxy1 | PGHOME/data/dnslave∣∣∣∣GTMProxy∣6666∣gtmpxy1∣PGHOME/data/gtm_pxy | ||
lhrpgxl92 | 172.72.6.92 | Coordinator | 5432 | coord2 | |
Datanode | 5433 | datanode2 | PGHOME/data/coord∣∣∣∣Datanode∣5433∣datanode2∣PGHOME/data/dn_master | ||
Datanode Slave | 15433 | datanode2_slave | |||
GTM Proxy | 6666 | gtm_pxy2 | PGHOME/data/dnslave∣∣∣∣GTMProxy∣6666∣gtmpxy2∣PGHOME/data/gtm_pxy |
-- 网卡
docker network create --subnet=172.72.6.0/24 pg-network
docker rm -f lhrpgxl90
docker run -d --name lhrpgxl90 -h lhrpgxl90
--net=pg-network --ip 172.72.6.90
-p 64390:5432
-v /sys/fs/cgroup:/sys/fs/cgroup
--privileged=true lhrbest/lhrcentos76:8.5
/usr/sbin/init
docker rm -f lhrpgxl91
docker run -d --name lhrpgxl91 -h lhrpgxl91
--net=pg-network --ip 172.72.6.91
-p 64391:5432
-v /sys/fs/cgroup:/sys/fs/cgroup
--privileged=true lhrbest/lhrcentos76:8.5
/usr/sbin/init
docker rm -f lhrpgxl92
docker run -d --name lhrpgxl92 -h lhrpgxl92
--net=pg-network --ip 172.72.6.92
-p 64392:5432
-v /sys/fs/cgroup:/sys/fs/cgroup
--privileged=true lhrbest/lhrcentos76:8.5
/usr/sbin/init
[root@docker35 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
198a42183f53 lhrbest/lhrcentos76:8.5 "/usr/sbin/init" 54 seconds ago Up 52 seconds 0.0.0.0:64392->5432/tcp, :::64392->5432/tcp lhrpgxl92
90ea3592000b lhrbest/lhrcentos76:8.5 "/usr/sbin/init" 56 seconds ago Up 54 seconds 0.0.0.0:64391->5432/tcp, :::64391->5432/tcp lhrpgxl91
c11615c37160 lhrbest/lhrcentos76:8.5 "/usr/sbin/init" 58 seconds ago Up 56 seconds 0.0.0.0:64390->5432/tcp, :::64390->5432/tcp lhrpgxl90
环境准备
安装之前,需要先确保机器满足一些先决条件。
- 要运行pgxc_ctl的节点需要支持无密码ssh访问。
- 在所有机器上,正确设置PATH环境变量包含Postgres-XL数据文件,特别是在通过ssh运行命令时。
- 必须配置pg_hba.conf允许远程访问。pgxc_ctl.conf配置文件中诸如coordPgHbaEntries和datanodePgHbaEntries都可能需要适当的更改。
- 配置防火墙和iptables使某些端口可以正常访问。
如果没有安装pgxc_ctl,可以从源代码编译并安装。
代码语言:javascript复制yum install -y flex bison readline-devel zlib-devel openjade docbook-style-dsssl gcc make
groupadd -g 5432 postgres
useradd -u 5432 -g postgres postgres
echo "postgres:lhr" | chpasswd
mkdir -p /postgresxl
chown -R postgres:postgres /postgresxl
cat >> /home/postgres/.bashrc <<"EOF"
export PGHOME=/postgresxl
export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH
export PATH=$PGHOME/bin:$PATH
export PGUSER=postgres
export PGXC_CTL_HOME=/postgresxl/bin
EOF
echo "postgres ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
下载安装
https://www.postgres-xl.org/download/
https://git.postgresql.org/gitweb/?p=postgres-xl.git;a=summary
在3台主机都需要安装PGXC,文件大约300MB,如下:
代码语言:javascript复制su - postgres
git clone git://git.postgresql.org/git/postgres-xl.git
cd postgres-xl
./configure --prefix=/postgresxl
make -j4
sudo make install
cd contrib
make -j4
sudo make install
chown -R postgres.postgres /postgresxl/
cortrib中有很多postgres很牛的工具,一般要装上。如ltree,uuid,postgres_fdw等等。
配置主节点可以无密码访问备节点
代码语言:javascript复制./sshUserSetup.sh -user postgres -hosts "lhrpgxl90 lhrpgxl91 lhrpgxl92" -advanced exverify -confirm
sudo chmod 600 /home/postgres/.ssh/config
集群配置
以下内容在lhrpgxl90上运行即可。
生成pgxc_ctl配置文件
代码语言:javascript复制[postgres@lhrpgxl90 ~]$ pgxc_ctl prepare
/usr/bin/bash
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
ERROR: File "/postgresxl/bin/pgxc_ctl.conf" not found or not a regular file. No such file or directory
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Reading configuration using /postgresxl/bin/pgxc_ctl_bash --home /postgresxl/bin --configuration /postgresxl/bin/pgxc_ctl.conf
Finished reading configuration.
******** PGXC_CTL START ***************
Current directory: /postgresxl/bin
[postgres@lhrpgxl90 ~]$ ll /postgresxl/bin/pgxc_ctl.conf
-rw-rw-r-- 1 postgres postgres 17815 Feb 21 17:18 /postgresxl/bin/pgxc_ctl.conf
配置pgxc_ctl.conf
在lhrpgxl90上运行即可。
代码语言:javascript复制cat > /postgresxl/bin/pgxc_ctl.conf <<"EOF"
pgxcInstallDir=$PGHOME
pgxlDATA=$PGHOME/data
pgxcOwner=postgres
pgxcUser=postgres
tmpDir=/tmp
localTmpDir=$tmpDir
#==========================================================================================================================
#---- GTM Master ---------------
gtmName=gtm
gtmMasterServer=lhrpgxl90
gtmMasterPort=6666
gtmMasterDir=$pgxlDATA/gtm
#---- Configuration
gtmExtraConfig=none
gtmMasterSpecificExtraConfig=none
#---- GTM Slave配置信息
gtmSlave=y # Specify y if you configure GTM Slave. Otherwise, GTM slave will not be configured and
# all the following variables will be reset.
gtmSlaveName=gtmSlave
gtmSlaveServer=lhrpgxl90 # value none means GTM slave is not available. Give none if you don't configure GTM Slave.
gtmSlavePort=20001 # Not used if you don't configure GTM slave.
gtmSlaveDir=$pgxlDATA/gtmSlave # Not used if you don't configure GTM slave.
#---- Configuration
gtmSlaveSpecificExtraConfig=none
#==========================================================================================================================
#---- GTM Proxy配置信息,最好每个数据节点配置一个
#---- GTM-Proxy Master -------
gtmProxyDir=$pgxlDATA/gtm_proxy
gtmProxy=y
gtmProxyNames=(gtm_pxy1 gtm_pxy2)
gtmProxyServers=(lhrpgxl91 lhrpgxl92)
gtmProxyPorts=(6666 6666)
gtmProxyDirs=($gtmProxyDir $gtmProxyDir)
#---- Configuration
gtmPxyExtraConfig=none
gtmPxySpecificExtraConfig=(none none)
#==========================================================================================================================
#---- Coordinators ---------
coordMasterDir=$pgxlDATA/coord
coordNames=(coord1 coord2)
coordPorts=(5432 5432)
poolerPorts=(6667 6667)
coordPgHbaEntries=(0.0.0.0/0)
coordMasterServers=(lhrpgxl91 lhrpgxl92)
coordMasterDirs=($coordMasterDir $coordMasterDir)
coordMaxWALsernder=0
coordMaxWALSenders=($coordMaxWALsernder $coordMaxWALsernder)
coordSlave=n
#==========================================================================================================================
#---- Datanodes ----------
datanodeMasterDir=$pgxlDATA/dn_master
primaryDatanode=lhrpgxl91
datanodeNames=(datanode1 datanode2)
datanodePorts=(5433 5433)
datanodePoolerPorts=(6668 6668)
datanodePgHbaEntries=(0.0.0.0/0)
datanodeMasterServers=(lhrpgxl91 lhrpgxl92)
datanodeMasterDirs=($datanodeMasterDir $datanodeMasterDir)
datanodeMaxWalSender=4
datanodeMaxWALSenders=($datanodeMaxWalSender $datanodeMaxWalSender)
datanodeSlave=n
#==========================================================================================================================
EOF
初始化集群
代码语言:javascript复制pgxc_ctl -c /postgresxl/bin/pgxc_ctl.conf init all
过程:
代码语言:javascript复制[postgres@lhrpgxl90 ~]$ pgxc_ctl -c /postgresxl/bin/pgxc_ctl.conf init all
/usr/bin/bash
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Reading configuration using /postgresxl/bin/pgxc_ctl_bash --home /postgresxl/bin --configuration /postgresxl/bin/pgxc_ctl.conf
Finished reading configuration.
******** PGXC_CTL START ***************
Current directory: /postgresxl/bin
Initialize GTM master
The files belonging to this GTM system will be owned by user "postgres".
This user must also own the server process.
fixing permissions on existing directory /postgresxl/data/gtm ... ok
creating configuration files ... ok
creating control file ... ok
Success.
waiting for server to shut down.... done
server stopped
Done.
Start GTM master
server starting
Initialize GTM slave
The files belonging to this GTM system will be owned by user "postgres".
This user must also own the server process.
fixing permissions on existing directory /postgresxl/data/gtmSlave ... ok
creating configuration files ... ok
creating control file ... ok
Success.
Done.
Start GTM slaveserver starting
Done.
Initialize all the gtm proxies.
Initializing gtm proxy gtm_pxy1.
Initializing gtm proxy gtm_pxy2.
The files belonging to this GTM system will be owned by user "postgres".
This user must also own the server process.
fixing permissions on existing directory /postgresxl/data/gtm_proxy ... ok
creating configuration files ... ok
Success.
The files belonging to this GTM system will be owned by user "postgres".
This user must also own the server process.
fixing permissions on existing directory /postgresxl/data/gtm_proxy ... ok
creating configuration files ... ok
Success.
Done.
Starting all the gtm proxies.
Starting gtm proxy gtm_pxy1.
Starting gtm proxy gtm_pxy2.
server starting
server starting
Done.
Initialize all the coordinator masters.
Initialize coordinator master coord1.
Initialize coordinator master coord2.
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
fixing permissions on existing directory /postgresxl/data/coord ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... creating cluster information ... ok
syncing data to disk ... ok
freezing database template0 ... ok
freezing database template1 ... ok
freezing database postgres ... ok
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success.
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
fixing permissions on existing directory /postgresxl/data/coord ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... creating cluster information ... ok
syncing data to disk ... ok
freezing database template0 ... ok
freezing database template1 ... ok
freezing database postgres ... ok
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success.
Done.
Starting coordinator master.
Starting coordinator master coord1
Starting coordinator master coord2
2022-02-22 15:22:42.142 CST [23303] LOG: listening on IPv4 address "0.0.0.0", port 5432
2022-02-22 15:22:42.142 CST [23303] LOG: listening on IPv6 address "::", port 5432
2022-02-22 15:22:42.203 CST [23303] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-02-22 15:22:42.291 CST [23304] LOG: database system was shut down at 2022-02-22 15:22:38 CST
2022-02-22 15:22:42.322 CST [23303] LOG: database system is ready to accept connections
2022-02-22 15:22:42.323 CST [23311] LOG: cluster monitor started
2022-02-22 15:22:42.142 CST [23180] LOG: listening on IPv4 address "0.0.0.0", port 5432
2022-02-22 15:22:42.142 CST [23180] LOG: listening on IPv6 address "::", port 5432
2022-02-22 15:22:42.203 CST [23180] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-02-22 15:22:42.291 CST [23181] LOG: database system was shut down at 2022-02-22 15:22:38 CST
2022-02-22 15:22:42.322 CST [23180] LOG: database system is ready to accept connections
2022-02-22 15:22:42.323 CST [23188] LOG: cluster monitor started
Done.
Initialize all the datanode masters.
Initialize the datanode master datanode1.
Initialize the datanode master datanode2.
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
fixing permissions on existing directory /postgresxl/data/dn_master ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... creating cluster information ... ok
syncing data to disk ... ok
freezing database template0 ... ok
freezing database template1 ... ok
freezing database postgres ... ok
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success.
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
fixing permissions on existing directory /postgresxl/data/dn_master ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... creating cluster information ... ok
syncing data to disk ... ok
freezing database template0 ... ok
freezing database template1 ... ok
freezing database postgres ... ok
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success.
Done.
Starting all the datanode masters.
Starting datanode master datanode1.
Starting datanode master datanode2.
2022-02-22 15:22:50.478 CST [23764] LOG: listening on IPv4 address "0.0.0.0", port 5433
2022-02-22 15:22:50.478 CST [23764] LOG: listening on IPv6 address "::", port 5433
2022-02-22 15:22:50.555 CST [23764] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433"
2022-02-22 15:22:50.665 CST [23764] LOG: redirecting log output to logging collector process
2022-02-22 15:22:50.665 CST [23764] HINT: Future log output will appear in directory "pg_log".
2022-02-22 15:22:50.478 CST [23641] LOG: listening on IPv4 address "0.0.0.0", port 5433
2022-02-22 15:22:50.478 CST [23641] LOG: listening on IPv6 address "::", port 5433
2022-02-22 15:22:50.522 CST [23641] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433"
2022-02-22 15:22:50.625 CST [23641] LOG: redirecting log output to logging collector process
2022-02-22 15:22:50.625 CST [23641] HINT: Future log output will appear in directory "pg_log".
Done.
ALTER NODE coord1 WITH (HOST='lhrpgxl91', PORT=5432);
ALTER NODE
CREATE NODE coord2 WITH (TYPE='coordinator', HOST='lhrpgxl92', PORT=5432);
CREATE NODE
CREATE NODE datanode1 WITH (TYPE='datanode', HOST='lhrpgxl91', PORT=5433, PREFERRED);
CREATE NODE
CREATE NODE datanode2 WITH (TYPE='datanode', HOST='lhrpgxl92', PORT=5433);
CREATE NODE
SELECT pgxc_pool_reload();
pgxc_pool_reload
------------------
t
(1 row)
CREATE NODE coord1 WITH (TYPE='coordinator', HOST='lhrpgxl91', PORT=5432);
CREATE NODE
ALTER NODE coord2 WITH (HOST='lhrpgxl92', PORT=5432);
ALTER NODE
CREATE NODE datanode1 WITH (TYPE='datanode', HOST='lhrpgxl91', PORT=5433);
CREATE NODE
CREATE NODE datanode2 WITH (TYPE='datanode', HOST='lhrpgxl92', PORT=5433, PREFERRED);
CREATE NODE
SELECT pgxc_pool_reload();
pgxc_pool_reload
------------------
t
(1 row)
Done.
EXECUTE DIRECT ON (datanode1) 'CREATE NODE coord1 WITH (TYPE=''coordinator'', HOST=''lhrpgxl91'', PORT=5432)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode1) 'CREATE NODE coord2 WITH (TYPE=''coordinator'', HOST=''lhrpgxl92'', PORT=5432)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode1) 'ALTER NODE datanode1 WITH (TYPE=''datanode'', HOST=''lhrpgxl91'', PORT=5433, PREFERRED)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode1) 'CREATE NODE datanode2 WITH (TYPE=''datanode'', HOST=''lhrpgxl92'', PORT=5433, PREFERRED)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode1) 'SELECT pgxc_pool_reload()';
pgxc_pool_reload
------------------
t
(1 row)
EXECUTE DIRECT ON (datanode2) 'CREATE NODE coord1 WITH (TYPE=''coordinator'', HOST=''lhrpgxl91'', PORT=5432)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode2) 'CREATE NODE coord2 WITH (TYPE=''coordinator'', HOST=''lhrpgxl92'', PORT=5432)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode2) 'CREATE NODE datanode1 WITH (TYPE=''datanode'', HOST=''lhrpgxl91'', PORT=5433, PREFERRED)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode2) 'ALTER NODE datanode2 WITH (TYPE=''datanode'', HOST=''lhrpgxl92'', PORT=5433, PREFERRED)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode2) 'SELECT pgxc_pool_reload()';
pgxc_pool_reload
------------------
t
(1 row)
Done.
[postgres@lhrpgxl90 ~]$ pgxc_ctl
/usr/bin/bash
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Reading configuration using /postgresxl/bin/pgxc_ctl_bash --home /postgresxl/bin --configuration /postgresxl/bin/pgxc_ctl.conf
Finished reading configuration.
******** PGXC_CTL START ***************
Current directory: /postgresxl/bin
PGXC show config all
========= Postgres-XL configuration Common Info ========================
=== Overall ===
Postgres-XL owner: postgres
Postgres-XL user: postgres
Postgres-XL install directory: /postgresxl
pgxc_ctl home: /postgresxl/bin
pgxc_ctl configuration file: /postgresxl/bin/pgxc_ctl.conf
pgxc_ctl tmpDir: /tmp
pgxc_ctl localTempDir: /tmp
pgxc_ctl log file: /home/postgres/pgxc_ctl/pgxc_log/24719_pgxc_ctl.log
pgxc_ctl configBackup: n
pgxc_ctl configBackupHost: none
pgxc_ctl configBackupFile: none
========= Postgres-XL configuration End Common Info ===================
====== Server: lhrpgxl90 =======
GTM Master:
Nodename: 'gtm', port: 6666, dir: '/postgresxl/data/gtm' ExtraConfig: 'none', Specific Extra Config: 'none'
GTM Slave:
Nodename: 'gtmSlave', port: 20001, dir: '/postgresxl/data/gtmSlave' ExtraConfig: 'none', Specific Extra Config: 'none'
====== Server: lhrpgxl91 =======
GTM Proxy:
Nodename: 'gtm_pxy1', port: 6666, dir: '/postgresxl/data/gtm_proxy' ExtraConfig: 'none', Specific Extra Config: 'none'
Coordinator Master:
Nodename: 'coord1', port: 5432, pooler port: 6667
MaxWalSenders: 0, Dir: '/postgresxl/data/coord'
ExtraConfig: '(null)', Specific Extra Config: '(null)'
pg_hba entries ( '0.0.0.0/0' )
Extra pg_hba: '(null)', Specific Extra pg_hba: '(null)'
Datanode Master:
Nodename: 'datanode1', port: 5433, pooler port 6667
MaxWALSenders: 4, Dir: '/postgresxl/data/dn_master'
ExtraConfig: '(null)', Specific Extra Config: '(null)'
pg_hba entries ( '0.0.0.0/0' )
Extra pg_hba: '(null)', Specific Extra pg_hba: '(null)'
====== Server: lhrpgxl92 =======
GTM Proxy:
Nodename: 'gtm_pxy2', port: 6666, dir: '/postgresxl/data/gtm_proxy' ExtraConfig: 'none', Specific Extra Config: 'none'
Coordinator Master:
Nodename: 'coord2', port: 5432, pooler port: 6667
MaxWalSenders: 0, Dir: '/postgresxl/data/coord'
ExtraConfig: '(null)', Specific Extra Config: '(null)'
pg_hba entries ( '0.0.0.0/0' )
Extra pg_hba: '(null)', Specific Extra pg_hba: '(null)'
Datanode Master:
Nodename: 'datanode2', port: 5433, pooler port 6667
MaxWALSenders: 4, Dir: '/postgresxl/data/dn_master'
ExtraConfig: '(null)', Specific Extra Config: '(null)'
pg_hba entries ( '0.0.0.0/0' )
Extra pg_hba: '(null)', Specific Extra pg_hba: '(null)'
PGXC monitor all
Running: gtm master
Running: gtm slave
Running: gtm proxy gtm_pxy1
Running: gtm proxy gtm_pxy2
Running: coordinator master coord1
Running: coordinator master coord2
Running: datanode master datanode1
Running: datanode master datanode2
查看集群信息
在lhrpgxl91节点,执行psql -p 5432进入数据库操作。
代码语言:javascript复制[root@lhrpgxl91 /]# su - postgres
Last login: Mon Feb 21 17:11:45 CST 2022 on pts/0
[postgres@lhrpgxl91 ~]$ psql -p 5432
psql (PGXL 10alpha2, based on PG 10beta3 (Postgres-XL 10alpha2))
Type "help" for help.
postgres=#
postgres=# select * from pgxc_node;
node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred | node_id
----------- ----------- ----------- ----------- ---------------- ------------------ -------------
coord1 | C | 5432 | lhrpgxl91 | f | f | 1885696643
coord2 | C | 5432 | lhrpgxl92 | f | f | -1197102633
datanode1 | D | 5433 | lhrpgxl91 | f | t | 888802358
datanode2 | D | 5433 | lhrpgxl92 | f | f | -905831925
(4 rows)
-- node_type中的C代表coordinator,D代表DataNode
postgres=# create database lhrdb;
CREATE DATABASE
postgres=# l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
----------- ---------- ---------- ------------- ------------- -----------------------
lhrdb | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres
| | | | | postgres=CTc/postgres
(4 rows)
postgres=# c lhrdb
psql (14.0, server 10beta3 (Postgres-XL 10alpha2))
You are now connected to database "lhrdb" as user "postgres".
lhrdb=# create table test1(id int,name text);
CREATE TABLE
lhrdb=# insert into test1(id,name) select generate_series(1,8),'test';
INSERT 0 8
lhrdb=#
lhrdb=# select count(*) from test1;
count
-------
8
(1 row)
lhrdb=# SELECT xc_node_id, count(*) FROM test1 GROUP BY xc_node_id;
xc_node_id | count
------------ -------
-905831925 | 3
888802358 | 5
(2 rows)
[postgres@lhrpgxl91 ~]$ psql -p 5433 -d lhrdb
psql (PGXL 10alpha2, based on PG 10beta3 (Postgres-XL 10alpha2))
Type "help" for help.
lhrdb=# select count(*) from test1;
count
-------
5
(1 row)
lhrdb=#
[postgres@lhrpgxl91 ~]$ psql -p 5433 -d lhrdb -h lhrpgxl92
psql (PGXL 10alpha2, based on PG 10beta3 (Postgres-XL 10alpha2))
Type "help" for help.
lhrdb=# select count(*) from test1;
count
-------
3
(1 row)
注意:由于所有的数据节点组成了完整的数据视图,所以一个数据节点down机,整个pgxl都启动不了了,所以实际生产中,为了提高可用性,一定要配置数据节点的热备以便进行故障转移准备。
进程和端口信息
代码语言:javascript复制[postgres@lhrpgxl90 ~]$ netstat -tulnp | grep gtm
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:20001 0.0.0.0:* LISTEN 3002/gtm
tcp 0 0 0.0.0.0:6666 0.0.0.0:* LISTEN 2885/gtm
tcp6 0 0 :::20001 :::* LISTEN 3002/gtm
tcp6 0 0 :::6666 :::* LISTEN 2885/gtm
[postgres@lhrpgxl90 ~]$ ps -ef|grep gtm
postgres 2885 1 0 10:00 ? 00:00:00 gtm -D /postgresxl/data/gtm
postgres 3002 1 0 10:00 ? 00:00:00 gtm -D /postgresxl/data/gtmSlave
postgres 3291 485 0 10:02 pts/0 00:00:00 grep --color=auto gtm
[postgres@lhrpgxl91 ~]$ netstat -tulnp | grep "gtm|postgres"
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:6666 0.0.0.0:* LISTEN 2827/gtm_proxy
tcp 0 0 0.0.0.0:5432 0.0.0.0:* LISTEN 2929/postgres
tcp 0 0 0.0.0.0:5433 0.0.0.0:* LISTEN 3038/postgres
tcp6 0 0 :::6666 :::* LISTEN 2827/gtm_proxy
tcp6 0 0 :::5432 :::* LISTEN 2929/postgres
tcp6 0 0 :::5433 :::* LISTEN 3038/postgres
[postgres@lhrpgxl91 ~]$ ps -ef|grep postgres
root 1749 295 0 09:46 pts/0 00:00:00 su - postgres
postgres 1750 1749 0 09:46 pts/0 00:00:00 -bash
postgres 2827 1 0 10:00 ? 00:00:00 gtm_proxy -D /postgresxl/data/gtm_proxy
postgres 2929 1 0 10:00 ? 00:00:00 /postgresxl/bin/postgres --coordinator -D /postgresxl/data/coord -i
postgres 2931 2929 0 10:00 ? 00:00:00 postgres: pooler process
postgres 2932 2929 0 10:00 ? 00:00:00 postgres: checkpointer process
postgres 2933 2929 0 10:00 ? 00:00:00 postgres: writer process
postgres 2934 2929 0 10:00 ? 00:00:00 postgres: wal writer process
postgres 2935 2929 0 10:00 ? 00:00:00 postgres: autovacuum launcher process
postgres 2936 2929 0 10:00 ? 00:00:00 postgres: stats collector process
postgres 2937 2929 0 10:00 ? 00:00:00 postgres: cluster monitor process
postgres 2938 2929 0 10:00 ? 00:00:00 postgres: bgworker: logical replication launcher
postgres 3038 1 0 10:00 ? 00:00:00 /postgresxl/bin/postgres --datanode -D /postgresxl/data/dn_master -i
postgres 3039 3038 0 10:00 ? 00:00:00 postgres: logger process
postgres 3042 3038 0 10:00 ? 00:00:00 postgres: pooler process
postgres 3043 3038 0 10:00 ? 00:00:00 postgres: checkpointer process
postgres 3044 3038 0 10:00 ? 00:00:00 postgres: writer process
postgres 3045 3038 0 10:00 ? 00:00:00 postgres: wal writer process
postgres 3046 3038 0 10:00 ? 00:00:00 postgres: autovacuum launcher process
postgres 3047 3038 0 10:00 ? 00:00:00 postgres: stats collector process
postgres 3048 3038 0 10:00 ? 00:00:00 postgres: cluster monitor process
postgres 3049 3038 0 10:00 ? 00:00:00 postgres: bgworker: logical replication launcher
建表说明
- REPLICATION表:各个datanode节点中,表的数据完全相同,也就是说,插入数据时,会分别在每个datanode节点插入相同数据。读数据时,只需要读任意一个datanode节点上的数据。
lhrdb=# CREATE TABLE repltab (col1 int, col2 int) DISTRIBUTE BY REPLICATION;
- DISTRIBUTE :会将插入的数据,按照拆分规则,分配到不同的datanode节点中存储,也就是sharding技术。每个datanode节点只保存了部分数据,通过coordinate节点可以查询完整的数据视图。
lhrdb=# CREATE TABLE disttab(col1 int, col2 int, col3 text) DISTRIBUTE BY HASH(col1);
模拟部分数据,插入测试数据:
代码语言:javascript复制#任意登录一个coordinate节点进行建表操作
[postgres@lhrpgxl91 ~]$ psql -p 5432
lhrdb=# INSERT INTO disttab SELECT generate_series(1,100), generate_series(101, 200), 'foo';
INSERT 0 100
lhrdb=# INSERT INTO repltab SELECT generate_series(1,100), generate_series(101, 200);
INSERT 0 100
查看数据分布结果:
代码语言:javascript复制#DISTRIBUTE表分布结果
lhrdb=# SELECT xc_node_id, count(*) FROM disttab GROUP BY xc_node_id;
xc_node_id | count
------------ -------
-905831925 | 58
888802358 | 42
(2 rows)
lhrdb=# select oid,* from pgxc_node;
oid | node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred | node_id
------- ----------- ----------- ----------- ----------- ---------------- ------------------ -------------
11739 | coord1 | C | 5432 | lhrpgxl91 | f | f | 1885696643
16384 | coord2 | C | 5432 | lhrpgxl92 | f | f | -1197102633
16385 | datanode1 | D | 5433 | lhrpgxl91 | f | t | 888802358
16386 | datanode2 | D | 5433 | lhrpgxl92 | f | f | -905831925
(4 rows)
#REPLICATION表分布结果
lhrdb=# SELECT xc_node_id, count(*) FROM repltab GROUP BY xc_node_id;
xc_node_id | count
------------- -------
-1151313560 | 100
(1 row)
查看另一个datanode2中repltab表结果:
代码语言:javascript复制[postgres@lhrpgxl92 ~]$ psql -p 5433 -d lhrdb
psql (PGXL 10alpha2, based on PG 10beta3 (Postgres-XL 10alpha2))
Type "help" for help.
lhrdb=# SELECT count(*) FROM repltab;
count
-------
100
(1 row)
结论:REPLICATION表中,datanode1,datanode2中表是全部数据,一模一样。而DISTRIBUTE表,数据散落近乎平均分配到了datanode1,datanode2节点中。
启动和关闭集群
以后启动,直接执行如下命令:
代码语言:javascript复制[postgres@lhrpgxl90 ~]$ pgxc_ctl start all
/usr/bin/bash
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Reading configuration using /postgresxl/bin/pgxc_ctl_bash --home /postgresxl/bin --configuration /postgresxl/bin/pgxc_ctl.conf
Finished reading configuration.
******** PGXC_CTL START ***************
Current directory: /postgresxl/bin
Start GTM master
server starting
Start GTM slaveserver starting
Done.
Starting all the gtm proxies.
Starting gtm proxy gtm_pxy1.
Starting gtm proxy gtm_pxy2.
server starting
server starting
Done.
Starting coordinator master.
Starting coordinator master coord1
Starting coordinator master coord2
2022-02-22 15:31:29.336 CST [24824] LOG: listening on IPv4 address "0.0.0.0", port 5432
2022-02-22 15:31:29.336 CST [24824] LOG: listening on IPv6 address "::", port 5432
2022-02-22 15:31:29.401 CST [24824] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-02-22 15:31:29.571 CST [24825] LOG: database system was shut down at 2022-02-22 15:31:00 CST
2022-02-22 15:31:29.617 CST [24824] LOG: database system is ready to accept connections
2022-02-22 15:31:29.618 CST [24832] LOG: cluster monitor started
2022-02-22 15:31:29.336 CST [24697] LOG: listening on IPv4 address "0.0.0.0", port 5432
2022-02-22 15:31:29.336 CST [24697] LOG: listening on IPv6 address "::", port 5432
2022-02-22 15:31:29.401 CST [24697] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-02-22 15:31:29.571 CST [24698] LOG: database system was shut down at 2022-02-22 15:31:00 CST
2022-02-22 15:31:29.617 CST [24697] LOG: database system is ready to accept connections
2022-02-22 15:31:29.618 CST [24706] LOG: cluster monitor started
Done.
Starting all the datanode masters.
Starting datanode master datanode1.
Starting datanode master datanode2.
2022-02-22 15:31:31.157 CST [24933] LOG: listening on IPv4 address "0.0.0.0", port 5433
2022-02-22 15:31:31.157 CST [24933] LOG: listening on IPv6 address "::", port 5433
2022-02-22 15:31:31.220 CST [24933] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433"
2022-02-22 15:31:31.322 CST [24933] LOG: redirecting log output to logging collector process
2022-02-22 15:31:31.322 CST [24933] HINT: Future log output will appear in directory "pg_log".
2022-02-22 15:31:31.157 CST [24807] LOG: listening on IPv4 address "0.0.0.0", port 5433
2022-02-22 15:31:31.157 CST [24807] LOG: listening on IPv6 address "::", port 5433
2022-02-22 15:31:31.220 CST [24807] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433"
2022-02-22 15:31:31.322 CST [24807] LOG: redirecting log output to logging collector process
2022-02-22 15:31:31.322 CST [24807] HINT: Future log output will appear in directory "pg_log".
Done.
停止集群如下:
代码语言:javascript复制[postgres@lhrpgxl90 ~]$ pgxc_ctl stop all
/usr/bin/bash
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Reading configuration using /postgresxl/bin/pgxc_ctl_bash --home /postgresxl/bin --configuration /postgresxl/bin/pgxc_ctl.conf
Finished reading configuration.
******** PGXC_CTL START ***************
Current directory: /postgresxl/bin
Stopping all the coordinator masters.
Stopping coordinator master coord1.
Stopping coordinator master coord2.
Done.
Stopping all the datanode masters.
Stopping datanode master datanode1.
Stopping datanode master datanode2.
Done.
Stopping all the gtm proxies.
Stopping gtm proxy gtm_pxy1.
Stopping gtm proxy gtm_pxy2.
waiting for server to shut down.... done
server stopped
waiting for server to shut down.... done
server stopped
Done.
Stop GTM slave
waiting for server to shut down.... done
server stopped
Stop GTM master
waiting for server to shut down.... done
server stopped
这几个主要命令暂时这么多,更多请从pgxc_ctl --help中获取更多信息。
代码语言:javascript复制[postgres@lhrpgxl90 ~]$ pgxc_ctl --help
/usr/bin/bash
pgxc_ctl [option ...] [command]
option:
-c or --configuration conf_file: Specify configruration file.
-v or --verbose: Specify verbose output.
-V or --version: Print version and exit.
-l or --logdir log_directory: specifies what directory to write logs.
-L or --logfile log_file: Specifies log file.
--home home_direcotry: Specifies pgxc_ctl work director.
-i or --infile input_file: Specifies inptut file.
-o or --outfile output_file: Specifies output file.
-h or --help: Prints this message and exits.
For more deatils, refer to pgxc_ctl reference manual included in
postgres-xc reference manual.
参考
https://www.jianshu.com/p/82aaf352b772