使用pgCompare比对不同pg的数据差异

2024-06-28 10:55:34 浏览数 (1)

官方项目地址: https://github.com/CrunchyData/pgCompare

限制

代码语言:txt复制
    日期/时间戳仅与秒精度(DDMMYYYYYYHH24MISS)进行比较。
    不支持的数据类型:blob、long、longraw、byta。
    执行跨平台比较时数据类型布尔值的限制。
    待比较的表必须有主键(没有主键会在比对的时候被自动跳过,日志中提示 Table xx has no Primary Key, skipping reconciliation)
    如果target的行比source的多,则不会被报告出来

前置条件

代码语言:txt复制
    Java version 21 or higher.
    Maven 3.9 or higher.
    Postgres version 15 or higher (to use for the pgCompare Data Compare repository).
    Necessary JDBC drivers (Postgres and Oracle currently supported).

准备2个pg实例

代码语言:txt复制
我这里用的是单机版的pg15和pg16
pg15作为source端, pg16作为target端。
create database db1;
c db1;
然后在db1里创建一些表,并写入测试数据。

编译安装

代码语言:txt复制
cd /root/pgCompare/pgCompare-v.0.2.0

export JAVA_HOME=/usr/local/software/jdk-22.0.1
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
mvn --version    
Apache Maven 3.9.1 (2e178502fcdbffc201671fb2537d0cb4b4cc58f8)
Maven home: /usr/local/software/maven
Java version: 22.0.1, vendor: Oracle Corporation, runtime: /home/software/jdk-22.0.1
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "6.8.9-1.el7.elrepo.x86_64", arch: "amd64", family: "unix"


mvn clean install

准备配置文件

代码语言:txt复制
cd target
cp ../pgcompare.properties.sample /root/pgcompare.properties

修改后的配置文件

代码语言:txt复制
cat pgcompare.properties | egrep -v '^#'

batch-fetch-size = 2000  # 设置从源或目标数据库检索行的获取大小
batch-commit-size = 2000  # 提交大小控制并发插入到 dc_source/dc_target 暂存表中的数组大小和行数
batch-progress-report-size = 1000000  # 定义 mod 中用于报告进度的行数
loader-threads = 2  # 设置将数据加载到临时表中的线程数。默认值为 4。设置为 0 可禁用加载器线程
message-queue-size = 100  # 加载线程使用的消息队列的大小(nbr 个消息)。默认值为 100
number-cast: notation     # 定义哈希函数中数字的转换方式(符号|标准)。默认为符号(科学计数法)
observer-throttle = true   # 设置为 true 或 false,指示加载器线程暂停并等待观察器线程赶上来,然后再继续将更多数据加载到暂存表中。
observer-throttle-size = 2000000  # 加载器线程休眠并等待观察器线程清除之前加载的行数
observer-vacuum = true  # 设置为 true 或 false,指示观察者是否在检查点期间对暂存表执行真空清理
log-destination = stdout  # 设置临时暂存表的并行工作器数量。默认值为 0

log-level = INFO  # 测试期间,建议改为 DEBUG
database-sort = true

repo-host=192.168.31.181
repo-port=5432
repo-dbname=pgcompare
repo-user=dts
repo-password=dts
repo-schema=pgcompare
repo-sslmode: disable

source-name=pg15
source-type=postgres
source-host=192.168.31.181
source-port=5432
source-dbname=db1
source-user=dts
source-password=dts
source-database-hash=true
source-sslmode=disable

target-name=pg16
target-type=postgres
target-host=192.168.31.181
target-port=5436
target-dbname=db1
target-user=dts
target-password=dts
target-database-hash=true
target-sslmode=disable

在repo-host的pg上创建存储库

代码语言:txt复制
create database pgcompare;

调整下pgcompare的repo-user的隔离级别

代码语言:txt复制
alter user dts set transaction_isolation='read committed';

TIPS:如果使用默认的RR隔离级别,在执行后续的 java -jar pgcompare.jar --batch=0 会报如下的错误
[2024-06-28 09:32:20] [SEVERE ] [dbCommon ] Error executing simple update (UPDATE dc_result SET source_cnt=source_cnt ? WHERE cid=?):  ERROR: could not serialize access due to concurrent update 

初始化存储库

代码语言:txt复制
java -jar pgcompare.jar --init

[2024-06-27 19:18:55] [INFO   ] [main                    ] Starting - rid: 1719487134959 
[2024-06-27 19:18:55] [INFO   ] [main                    ] Version: 0.2.0 
[2024-06-27 19:18:55] [INFO   ] [main                    ] Batch Number: 0 
[2024-06-27 19:18:55] [INFO   ] [main                    ] Recheck Out of Sync: false 
[2024-06-27 19:18:55] [INFO   ] [main                    ] Parameters:  
[2024-06-27 19:18:55] [INFO   ] [main                    ]   target-port=5436 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   observer-vacuum=true 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   source-port=5432 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   source-user=dts 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   target-type=postgres 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   batch-commit-size=2000 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   stage-table-parallel=0 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   source-name=pg15 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   message-queue-size=100 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   repo-sslmode=disable 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   target-dbname=db1 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   target-sslmode=disable 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   batch-fetch-size=2000 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   source-type=postgres 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   database-sort=true 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   repo-dbname=pgcompare 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   target-host=192.168.31.181 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   repo-port=5432 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   loader-threads=2 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   repo-schema=pgcompare 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   config-file=pgcompare.properties 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   repo-user=dts 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   batch-progress-report-size=1000000 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   source-dbname=db1 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   observer-throttle-size=2000000 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   observer-throttle=true 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   source-database-hash=true 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   number-cast=notation 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   source-host=192.168.31.181 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   target-database-hash=true 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   log-level=INFO 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   log-destination=stdout 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   target-user=dts 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   source-sslmode=disable 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   target-name=pg16 
[2024-06-27 19:18:55] [INFO   ] [main                    ]   repo-host=192.168.31.181 
[2024-06-27 19:18:55] [INFO   ] [main                    ] Connecting to repository database 

查看存储库

代码语言:txt复制
pgcompare=# dn
		List of schemas
   Name    |       Owner       
----------- -------------------
 pgcompare | dts
 public    | pg_database_owner
(2 rows)

pgcompare=# set search_path ='pgcompare';
SET
pgcompare=# dt
			  List of relations
  Schema   |       Name       | Type  | Owner 
----------- ------------------ ------- -------
 pgcompare | dc_object        | table | dts
 pgcompare | dc_result        | table | dts
 pgcompare | dc_source        | table | dts
 pgcompare | dc_table         | table | dts
 pgcompare | dc_table_history | table | dts
 pgcompare | dc_target        | table | dts
(6 rows)

自动tables注册

代码语言:txt复制
java -jar pgcompare.jar --discovery public  # 注意这里public指的是schema

[2024-06-27 19:21:54] [INFO   ] [main                    ] Starting - rid: 1719487314311 
[2024-06-27 19:21:54] [INFO   ] [main                    ] Version: 0.2.0 
[2024-06-27 19:21:54] [INFO   ] [main                    ] Batch Number: 0 
[2024-06-27 19:21:54] [INFO   ] [main                    ] Recheck Out of Sync: false 
[2024-06-27 19:21:54] [INFO   ] [main                    ] Parameters:  
[2024-06-27 19:21:54] [INFO   ] [main                    ]   target-port=5436 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   observer-vacuum=true 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   source-port=5432 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   source-user=dts 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   target-type=postgres 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   batch-commit-size=2000 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   stage-table-parallel=0 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   source-name=pg15 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   message-queue-size=100 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   repo-sslmode=disable 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   target-dbname=db1 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   target-sslmode=disable 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   batch-fetch-size=2000 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   source-type=postgres 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   database-sort=true 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   repo-dbname=pgcompare 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   target-host=192.168.31.181 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   repo-port=5432 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   loader-threads=2 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   repo-schema=pgcompare 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   config-file=pgcompare.properties 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   repo-user=dts 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   batch-progress-report-size=1000000 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   source-dbname=db1 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   observer-throttle-size=2000000 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   observer-throttle=true 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   source-database-hash=true 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   number-cast=notation 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   source-host=192.168.31.181 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   target-database-hash=true 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   log-level=INFO 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   log-destination=stdout 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   target-user=dts 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   source-sslmode=disable 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   target-name=pg16 
[2024-06-27 19:21:54] [INFO   ] [main                    ]   repo-host=192.168.31.181 
[2024-06-27 19:21:54] [INFO   ] [main                    ] Connecting to repository database 
[2024-06-27 19:21:55] [INFO   ] [main                    ] Connecting to source database 
[2024-06-27 19:21:55] [INFO   ] [main                    ] Connecting to target database 
[2024-06-27 19:21:55] [INFO   ] [main                    ] Performaning table discovery for schema: db1 
[2024-06-27 19:21:55] [INFO   ] [main                    ] Shutting down 

查看pg存储库,可以看到已经采集到表信息了

(它会把source和target的表都采集进来)

代码语言:txt复制
pgcompare=# select * from dc_table;
 tid | source_schema | source_table | target_schema | target_table | batch_nbr | parallel_degree | mod_column | status | table_filter | column_map 
----- --------------- -------------- --------------- -------------- ----------- ----------------- ------------ -------- -------------- ------------
  19 | public        | t1           | public        | t1           |         1 |               1 | NULL       | ready  | NULL         | NULL
  20 | public        | t2           | public        | t2           |         1 |               1 | NULL       | ready  | NULL         | NULL
  21 | public        | t3           | public        | t3           |         1 |               1 | NULL       | ready  | NULL         | NULL
  22 | public        | t5           | public        | t5           |         1 |               1 | NULL       | ready  | NULL         | NULL
  23 | public        | tb           | public        | tb           |         1 |               1 | NULL       | ready  | NULL         | NULL
  24 | public        | ttt          | public        | ttt          |         1 |               1 | NULL       | ready  | NULL         | NULL
(6 rows)

执行数据比较

代码语言:txt复制
java -jar pgcompare.jar --batch=0   # 批次号的默认值为 0表示所有批次
	
[2024-06-28 10:14:01] [INFO   ] [main                    ] Starting - rid: 1719540841891 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Version: 0.2.0 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Batch Number: 0 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Recheck Out of Sync: false 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Parameters:  
[2024-06-28 10:14:02] [INFO   ] [main                    ]   target-port=5436 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   observer-vacuum=true 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   source-port=5432 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   source-user=dts 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   target-type=postgres 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   batch-commit-size=2000 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   stage-table-parallel=0 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   source-name=pg15 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   message-queue-size=100 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   repo-sslmode=disable 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   target-dbname=db1 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   target-sslmode=disable 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   batch-fetch-size=2000 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   source-type=postgres 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   database-sort=true 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   repo-dbname=pgcompare 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   target-host=192.168.31.181 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   repo-port=5432 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   loader-threads=2 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   repo-schema=pgcompare 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   config-file=pgcompare.properties 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   repo-user=dts 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   batch-progress-report-size=1000000 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   source-dbname=db1 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   observer-throttle-size=2000000 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   observer-throttle=true 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   source-database-hash=true 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   number-cast=notation 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   source-host=192.168.31.181 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   target-database-hash=true 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   log-level=DEBUG 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   log-destination=stdout 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   target-user=dts 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   source-sslmode=disable 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   target-name=pg16 
[2024-06-28 10:14:02] [INFO   ] [main                    ]   repo-host=192.168.31.181 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Connecting to repository database 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Connecting to source database 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Connecting to target database 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Start reconciliation 
[2024-06-28 10:14:02] [INFO   ] [main                    ] Clearing data compare findings 
[2024-06-28 10:14:02] [INFO   ] [DatabaseUtility         ] Building column expressions for public.t1 
[2024-06-28 10:14:02] [INFO   ] [DatabaseUtility         ] Building column expressions for public.t1 
[2024-06-28 10:14:02] [INFO   ] [ReconcileController     ] Source Columns:  
[2024-06-28 10:14:02] [INFO   ] [ReconcileController     ] Target Columns:  
[2024-06-28 10:14:02] [INFO   ] [ReconcileController     ] Source Compare Hash SQL: SELECT md5(concat_ws('|',coalesce(trim(to_char(b,'0.9999999999EEEE')),' '))) pk_hash, '{'||'"b": ' || b||'}' pk, md5(concat_ws('','0')) FROM public.t1 WHERE 1=1  
[2024-06-28 10:14:02] [INFO   ] [ReconcileController     ] Target Compare Hash SQL: SELECT md5(concat_ws('|',coalesce(trim(to_char(b,'0.9999999999EEEE')),' '))) pk_hash, '{'||'"b": ' || b||'}' pk, md5(concat_ws('','0')) FROM public.t1 WHERE 1=1  
[2024-06-28 10:14:02] [INFO   ] [ReconcileController     ] Starting compare hash threads 
[2024-06-28 10:14:02] [INFO   ] [ReconcileController     ] Creating data compare staging tables 
[2024-06-28 10:14:02] [INFO   ] [ReconcileController     ] Starting compare thread 0 
[2024-06-28 10:14:02] [INFO   ] [Observer-c23-t0         ] Starting reconcile observer 
[2024-06-28 10:14:02] [INFO   ] [Observer-c23-t0         ] Connecting to repository database 
[2024-06-28 10:14:02] [INFO   ] [Reconcile-source-c23-t0 ] Start database reconcile thread 
[2024-06-28 10:14:02] [INFO   ] [Reconcile-target-c23-t0 ] Start database reconcile thread 
[2024-06-28 10:14:02] [INFO   ] [Reconcile-source-c23-t0 ] Connecting to repository database 
[2024-06-28 10:14:02] [INFO   ] [Reconcile-target-c23-t0 ] Connecting to repository database 
[2024-06-28 10:14:02] [INFO   ] [loader-source-t0-i1     ] Start repository loader thread 
[2024-06-28 10:14:02] [INFO   ] [loader-target-t0-i1     ] Start repository loader thread 
[2024-06-28 10:14:02] [INFO   ] [loader-source-t0-i2     ] Start repository loader thread 
[2024-06-28 10:14:02] [INFO   ] [loader-source-t0-i1     ] Connecting to repository database 
[2024-06-28 10:14:02] [INFO   ] [loader-target-t0-i2     ] Start repository loader thread 
[2024-06-28 10:14:02] [INFO   ] [loader-target-t0-i2     ] Connecting to repository database 
[2024-06-28 10:14:02] [INFO   ] [loader-target-t0-i1     ] Connecting to repository database 
[2024-06-28 10:14:02] [INFO   ] [loader-source-t0-i2     ] Connecting to repository database 
[2024-06-28 10:14:02] [INFO   ] [Reconcile-source-c23-t0 ] Connecting to source database 
[2024-06-28 10:14:02] [INFO   ] [Reconcile-target-c23-t0 ] Connecting to target database 
[2024-06-28 10:14:02] [INFO   ] [Reconcile-target-c23-t0 ] Complete. Total rows loaded: 9 
[2024-06-28 10:14:02] [INFO   ] [Reconcile-source-c23-t0 ] Complete. Total rows loaded: 6 
[2024-06-28 10:14:03] [INFO   ] [loader-source-t0-i1     ] Loader thread complete. 
[2024-06-28 10:14:03] [INFO   ] [loader-target-t0-i2     ] Loader thread complete. 
[2024-06-28 10:14:03] [INFO   ] [loader-target-t0-i1     ] Loader thread complete. 
[2024-06-28 10:14:03] [INFO   ] [loader-source-t0-i2     ] Loader thread complete. 
[2024-06-28 10:14:04] [INFO   ] [ReconcileController     ] Waiting for hash threads to complete 
[2024-06-28 10:14:04] [INFO   ] [ReconcileController     ] Waiting for reconcile threads to complete 
[2024-06-28 10:14:04] [INFO   ] [Observer-c23-t0         ] Matched 6 rows 
[2024-06-28 10:14:08] [INFO   ] [Observer-c23-t0         ] Staging table cleanup 
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Analyzing: Step 1 of 3 - Missing on Source 
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Analyzing: Step 2 of 3 - Missing on Target 
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Analyzing: Step 3 of 3 - Not Equal 
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Reconciliation Complete: Table = t1; Status = out-of-sync ; Equal = 6; Not Equal = 0; Missing Source = 3; Missing Target = 0 
[2024-06-28 10:14:08] [INFO   ] [main                    ] Start reconciliation 
[2024-06-28 10:14:08] [INFO   ] [main                    ] Clearing data compare findings 
[2024-06-28 10:14:08] [INFO   ] [DatabaseUtility         ] Building column expressions for public.t2 
[2024-06-28 10:14:08] [INFO   ] [DatabaseUtility         ] Building column expressions for public.t2 
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Source Columns:  
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Target Columns:  
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Source Compare Hash SQL: SELECT md5(concat_ws('|',coalesce(trim(to_char(b,'0.9999999999EEEE')),' '))) pk_hash, '{'||'"b": ' || b||'}' pk, md5(concat_ws('','0')) FROM public.t2 WHERE 1=1  
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Target Compare Hash SQL: SELECT md5(concat_ws('|',coalesce(trim(to_char(b,'0.9999999999EEEE')),' '))) pk_hash, '{'||'"b": ' || b||'}' pk, md5(concat_ws('','0')) FROM public.t2 WHERE 1=1  
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Starting compare hash threads 
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Creating data compare staging tables 
[2024-06-28 10:14:08] [INFO   ] [ReconcileController     ] Starting compare thread 0 
[2024-06-28 10:14:08] [INFO   ] [Observer-c24-t0         ] Starting reconcile observer 
[2024-06-28 10:14:08] [INFO   ] [loader-source-t0-i1     ] Start repository loader thread 
[2024-06-28 10:14:08] [INFO   ] [Reconcile-target-c24-t0 ] Start database reconcile thread 
[2024-06-28 10:14:08] [INFO   ] [Reconcile-source-c24-t0 ] Start database reconcile thread 
[2024-06-28 10:14:08] [INFO   ] [Observer-c24-t0         ] Connecting to repository database 
[2024-06-28 10:14:08] [INFO   ] [loader-target-t0-i1     ] Start repository loader thread 
[2024-06-28 10:14:08] [INFO   ] [loader-target-t0-i1     ] Connecting to repository database 
[2024-06-28 10:14:08] [INFO   ] [loader-source-t0-i1     ] Connecting to repository database 
[2024-06-28 10:14:08] [INFO   ] [loader-source-t0-i2     ] Start repository loader thread 
[2024-06-28 10:14:08] [INFO   ] [loader-target-t0-i2     ] Start repository loader thread 
[2024-06-28 10:14:08] [INFO   ] [loader-target-t0-i2     ] Connecting to repository database 
[2024-06-28 10:14:08] [INFO   ] [Reconcile-target-c24-t0 ] Connecting to repository database 
[2024-06-28 10:14:08] [INFO   ] [Reconcile-source-c24-t0 ] Connecting to repository database 
[2024-06-28 10:14:08] [INFO   ] [loader-source-t0-i2     ] Connecting to repository database 
[2024-06-28 10:14:08] [INFO   ] [Reconcile-target-c24-t0 ] Connecting to target database 
[2024-06-28 10:14:08] [INFO   ] [Reconcile-source-c24-t0 ] Connecting to source database 
[2024-06-28 10:14:08] [INFO   ] [Reconcile-target-c24-t0 ] Complete. Total rows loaded: 2 
[2024-06-28 10:14:08] [INFO   ] [Reconcile-source-c24-t0 ] Complete. Total rows loaded: 3 
[2024-06-28 10:14:09] [INFO   ] [loader-target-t0-i1     ] Loader thread complete. 
[2024-06-28 10:14:09] [INFO   ] [loader-source-t0-i1     ] Loader thread complete. 
[2024-06-28 10:14:09] [INFO   ] [loader-target-t0-i2     ] Loader thread complete. 
[2024-06-28 10:14:09] [INFO   ] [loader-source-t0-i2     ] Loader thread complete. 
[2024-06-28 10:14:10] [INFO   ] [ReconcileController     ] Waiting for hash threads to complete 
[2024-06-28 10:14:10] [INFO   ] [ReconcileController     ] Waiting for reconcile threads to complete 
[2024-06-28 10:14:10] [INFO   ] [Observer-c24-t0         ] Matched 2 rows 
[2024-06-28 10:14:14] [INFO   ] [Observer-c24-t0         ] Staging table cleanup 
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Analyzing: Step 1 of 3 - Missing on Source 
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Analyzing: Step 2 of 3 - Missing on Target 
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Analyzing: Step 3 of 3 - Not Equal 
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Reconciliation Complete: Table = t2 ; Status = out-of-sync ; Equal = 2; Not Equal = 0; Missing Source = 0; Missing Target = 1 
[2024-06-28 10:14:14] [INFO   ] [main                    ] Start reconciliation 
[2024-06-28 10:14:14] [INFO   ] [main                    ] Clearing data compare findings 
[2024-06-28 10:14:14] [INFO   ] [DatabaseUtility         ] Building column expressions for public.t3 
[2024-06-28 10:14:14] [INFO   ] [DatabaseUtility         ] Building column expressions for public.t3 
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Source Columns: address,remark 
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Target Columns: address,remark 
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Source Compare Hash SQL: SELECT md5(concat_ws('|',coalesce(trim(to_char(id,'0.9999999999EEEE')),' '))) pk_hash, '{'||'"id": ' || id||'}' pk, md5(concat_ws('',coalesce(address::text,' ')||coalesce(remark::text,' '))) FROM public.t3 WHERE 1=1  
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Target Compare Hash SQL: SELECT md5(concat_ws('|',coalesce(trim(to_char(id,'0.9999999999EEEE')),' '))) pk_hash, '{'||'"id": ' || id||'}' pk, md5(concat_ws('',coalesce(address::text,' ')||coalesce(remark::text,' '))) FROM public.t3 WHERE 1=1  
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Starting compare hash threads 
[2024-06-28 10:14:14] [INFO   ] [ReconcileController     ] Creating data compare staging tables 
[2024-06-28 10:14:15] [INFO   ] [ReconcileController     ] Starting compare thread 0 
[2024-06-28 10:14:15] [INFO   ] [Observer-c25-t0         ] Starting reconcile observer 
[2024-06-28 10:14:15] [INFO   ] [Reconcile-target-c25-t0 ] Start database reconcile thread 
[2024-06-28 10:14:15] [INFO   ] [Reconcile-source-c25-t0 ] Start database reconcile thread 
[2024-06-28 10:14:15] [INFO   ] [loader-source-t0-i1     ] Start repository loader thread 
[2024-06-28 10:14:15] [INFO   ] [Observer-c25-t0         ] Connecting to repository database 
[2024-06-28 10:14:15] [INFO   ] [loader-target-t0-i1     ] Start repository loader thread 
[2024-06-28 10:14:15] [INFO   ] [loader-target-t0-i1     ] Connecting to repository database 
[2024-06-28 10:14:15] [INFO   ] [loader-source-t0-i2     ] Start repository loader thread 
[2024-06-28 10:14:15] [INFO   ] [loader-source-t0-i2     ] Connecting to repository database 
[2024-06-28 10:14:15] [INFO   ] [Reconcile-source-c25-t0 ] Connecting to repository database 
[2024-06-28 10:14:15] [INFO   ] [loader-target-t0-i2     ] Start repository loader thread 
[2024-06-28 10:14:15] [INFO   ] [loader-source-t0-i1     ] Connecting to repository database 
[2024-06-28 10:14:15] [INFO   ] [Reconcile-target-c25-t0 ] Connecting to repository database 
[2024-06-28 10:14:15] [INFO   ] [loader-target-t0-i2     ] Connecting to repository database 
[2024-06-28 10:14:15] [INFO   ] [Reconcile-target-c25-t0 ] Connecting to target database 
[2024-06-28 10:14:15] [INFO   ] [Reconcile-source-c25-t0 ] Connecting to source database 
[2024-06-28 10:14:15] [INFO   ] [Reconcile-target-c25-t0 ] Complete. Total rows loaded: 6 
[2024-06-28 10:14:15] [INFO   ] [Reconcile-source-c25-t0 ] Complete. Total rows loaded: 5 
[2024-06-28 10:14:16] [INFO   ] [loader-target-t0-i1     ] Loader thread complete. 
[2024-06-28 10:14:16] [INFO   ] [loader-source-t0-i2     ] Loader thread complete. 
[2024-06-28 10:14:16] [INFO   ] [loader-source-t0-i1     ] Loader thread complete. 
[2024-06-28 10:14:16] [INFO   ] [loader-target-t0-i2     ] Loader thread complete. 
[2024-06-28 10:14:17] [INFO   ] [ReconcileController     ] Waiting for hash threads to complete 
[2024-06-28 10:14:17] [INFO   ] [ReconcileController     ] Waiting for reconcile threads to complete 
[2024-06-28 10:14:17] [INFO   ] [Observer-c25-t0         ] Matched 4 rows 
[2024-06-28 10:14:21] [INFO   ] [Observer-c25-t0         ] Staging table cleanup 
[2024-06-28 10:14:21] [INFO   ] [ReconcileController     ] Analyzing: Step 1 of 3 - Missing on Source 
[2024-06-28 10:14:21] [INFO   ] [ReconcileController     ] Analyzing: Step 2 of 3 - Missing on Target 
[2024-06-28 10:14:21] [INFO   ] [ReconcileController     ] Analyzing: Step 3 of 3 - Not Equal 
[2024-06-28 10:14:21] [INFO   ] [ReconcileController     ] Reconciliation Complete: Table = t3 ; Status = out-of-sync ; Equal = 4; Not Equal = 0; Missing Source = 2; Missing Target = 1 
[2024-06-28 10:14:21] [INFO   ] [main                    ] Processed 3 tables 
[2024-06-28 10:14:21] [INFO   ] [main                    ] Table Summary: Table = t1 ; Status = out-of-sync ; Equal = 6; Not Equal = 0; Missing Source = 3; Missing Target = 0 
[2024-06-28 10:14:21] [INFO   ] [main                    ] Table Summary: Table = t2 ; Status = out-of-sync ; Equal = 2; Not Equal = 0; Missing Source = 0; Missing Target = 1 
[2024-06-28 10:14:21] [INFO   ] [main                    ] Table Summary: Table = t3 ; Status = out-of-sync ; Equal = 4; Not Equal = 0; Missing Source = 2; Missing Target = 1 
[2024-06-28 10:14:21] [INFO   ] [main                    ] Run Summary:  Elapsed Time (seconds) = 19; Total Rows Processed = 19; Total Out-of-Sync = 7; Through-put (rows/per second

查看pg存储库的结果

代码语言:txt复制
	pgcompare=# select * from dc_result;
	 cid |      rid      | table_name |   status    |          compare_dt           | equal_cnt | missing_source_cnt | missing_target_cnt | not_equal_cnt | source_cnt | target_cnt 
	----- --------------- ------------ ------------- ------------------------------- ----------- -------------------- -------------------- --------------- ------------ ------------
	  11 | 1719540297159 | t1         | out-of-sync | 2024-06-28 10:04:58.066141 08 |         6 |                  3 |                  0 |             0 |          6 |          9
	  12 | 1719540297159 | t2         | out-of-sync | 2024-06-28 10:05:04.188957 08 |         2 |                  0 |                  1 |             0 |          3 |          2

官方给出的查询的SQL语句

代码语言:txt复制

1、上次运行的结果
	WITH mr AS (SELECT max(rid) rid FROM dc_result)
	SELECT compare_dt, table_name, status, source_cnt total_cnt, equal_cnt, not_equal_cnt, missing_source_cnt missing_target_cnt missing_cnt
	FROM dc_result r
		 JOIN mr ON (mr.rid=r.rid)
	ORDER BY table_name;

结构
pgcompare-# ORDER BY table_name;
          compare_dt           | table_name |   status    | total_cnt | equal_cnt | not_equal_cnt | missing_cnt 
------------------------------- ------------ ------------- ----------- ----------- --------------- -------------
 2024-06-28 10:37:59.375011 08 | t1         | out-of-sync |         6 |         6 |             0 |           3
 2024-06-28 10:38:05.552199 08 | t2         | out-of-sync |         3 |         2 |             0 |           1
 2024-06-28 10:38:11.672654 08 | t3         | in-sync     |         2 |         2 |             0 |           0
 2024-06-28 10:38:17.797341 08 | t4         | skipped     |         0 |         0 |             0 |           0
(4 rows)


2、不同步的行(注意:如果target的行比source的多,则不会被报告出来)
	SELECT coalesce(s.table_name,t.table_name) table_name,
	   coalesce(s.batch_nbr, t.batch_nbr) batch_nbr,
       coalesce(s.thread_nbr,t.thread_nbr) thread_nbr,
       CASE WHEN s.compare_result='n' THEN 'out-of-sync'
            WHEN s.compare_result='m' THEN 'missing target'
            WHEN t.compare_result='m' THEN 'missing source'
            ELSE 'unknown'
       END compare_result,
       coalesce(s.pk,t.pk) primary_key       
	FROM dc_source s
		 CROSS JOIN dc_target t;

pgcompare-#  CROSS JOIN dc_target t;
 table_name | batch_nbr | thread_nbr | compare_result | primary_key 
------------ ----------- ------------ ---------------- -------------
 t2         |         1 |          0 | missing target | {"b": 567}
 t2         |         1 |          0 | missing target | {"b": 567}
 t2         |         1 |          0 | missing target | {"b": 567}
(3 rows)

如果发现差异,请使用--check选项运行比较:

代码语言:txt复制
java -jar pgcompare.jar --batch=0 --check

当初始比较期间事务可能正在进行时,此重新检查过程非常有用。
重新检查仅检查已标记为存在差异的行。如果行仍然不匹配,则会报告详细信息。否则,行将被清除并标记为同步。

其它:

代码语言:txt复制
如果在执行完pgcompare后,数据库里面又增加或者减少了表,则需要重新执行 下面的操作:

0、清空pgcompare下面的各个表(清掉后便于查看最新数据,不清的话则需要根据compare_dt时间戳来判断是哪一次执行的比对操作)
	truncate table pgcompare.dc_object        ;
	truncate table pgcompare.dc_result        ;
	truncate table pgcompare.dc_source        ;
	truncate table pgcompare.dc_table         ;
	truncate table pgcompare.dc_table_history ;
	truncate table pgcompare.dc_target        ;


1、java -jar pgcompare.jar --discovery public


2、查看pg存储库的是否采集到表信息
	pgcompare=# select * from dc_table;
	 tid | source_schema | source_table | target_schema | target_table | batch_nbr | parallel_degree | mod_column | status | table_filter | column_map 
	----- --------------- -------------- --------------- -------------- ----------- ----------------- ------------ -------- -------------- ------------
	   6 | public        | t1           | public        | t1           |         1 |               1 | NULL       | ready  | NULL         | NULL
	   7 | public        | t2           | public        | t2           |         1 |               1 | NULL       | ready  | NULL         | NULL
	   8 | public        | t3           | public        | t3           |         1 |               1 | NULL       | ready  | NULL         | NULL
	(3 rows)


3、java -jar pgcompare.jar --batch=0
	
	
4、查看比对的结果
	pgcompare=# select * from dc_result;
	 cid |      rid      | table_name |   status    |          compare_dt           | equal_cnt | missing_source_cnt | missing_target_cnt | not_equal_cnt | source_cnt | target_cnt 
	----- --------------- ------------ ------------- ------------------------------- ----------- -------------------- -------------------- --------------- ------------ ------------
	  17 | 1719540698085 | t1         | out-of-sync | 2024-06-28 10:11:39.071168 08 |         6 |                  3 |                  0 |             0 |          6 |          9
	  18 | 1719540698085 | t2         | out-of-sync | 2024-06-28 10:11:45.231873 08 |         2 |                  0 |                  1 |             0 |          3 |          2
	  19 | 1719540698085 | t3         | in-sync     | 2024-06-28 10:11:51.378574 08 |         3 |                  0 |                  0 |             0 |          3 |          3
	(3 rows)


5、找出不同步的行(注意:如果target的行比source的多,则不会被报告出来)
		SELECT coalesce(s.table_name,t.table_name) table_name,
	   coalesce(s.batch_nbr, t.batch_nbr) batch_nbr,
       coalesce(s.thread_nbr,t.thread_nbr) thread_nbr,
       CASE WHEN s.compare_result='n' THEN 'out-of-sync'
            WHEN s.compare_result='m' THEN 'missing target'
            WHEN t.compare_result='m' THEN 'missing source'
            ELSE 'unknown'
       END compare_result,
       coalesce(s.pk,t.pk) primary_key       
		FROM dc_source s
		 CROSS JOIN dc_target t;

0 人点赞