mysql历史表数据迁移到clickhouse库

2021-05-28 09:59:09 浏览数 (2)

1.使用mysql引擎的表

创建mysql引擎表:

代码语言:javascript复制
CREATE TABLE ck_test.tab_datack (
	`id` Int32,
	`phone_id` Int32,
	`phone` Nullable (String),
	`callid` Nullable (Int32),
	`content` Nullable (String),
	`answer` Nullable (String),
	`usedtimems` Nullable (String),
	`file` Nullable (String),
	`addtime` DateTime,
	`publish` Nullable (Int8),
	`operate` Nullable (String),
	`content_time` Nullable (Int32),
	`answer_time` Nullable (Int32),
	`content_new` Nullable (String)
) ENGINE = MySQL('11x.62.x56.x20:3108', 'pho22_sxxxai19', 'tab_datack', 'gxssiji', 'Gssxxi888*()');

查看创建的表:

localhost :) show tables;

SHOW TABLES

┌─name────────────┐
│ tab_calldata    │
│ tab_datack      │
│ tab_voicedata   │
│ tb_callout_plan │
│ tb_merge_tree   │
└─────────────────┘
mysql引擎的表数据不存在clickhouse中而是在远端的mysql中

新建一张clickhouse的表引擎为mergetree

代码语言:javascript复制
CREATE TABLE ck_test.tab_datack_1 (
	`id` Int32,
	`phone_id` Int32,
	`phone` Nullable (String),
	`callid` Nullable (Int32),
	`content` Nullable (String),
	`answer` Nullable (String),
	`usedtimems` Nullable (String),
	`file` Nullable (String),
	`addtime` DateTime,
	`publish` Nullable (Int8),
	`operate` Nullable (String),
	`content_time` Nullable (Int32),
	`answer_time` Nullable (Int32),
	`content_new` Nullable (String),
	INDEX idx_operate operate TYPE minmax GRANULARITY 5
) ENGINE = MergeTree
PARTITION BY toYYYYMM(addtime)
ORDER BY (id) 
SETTINGS index_granularity = 8192;

把mysql引擎的表数据插入到clickhouse表中

代码语言:javascript复制
insert into ck_test.tab_datack_1 select * from ck_test.tab_datack;

localhost :) insert into ck_test.tab_datack_1 select * from ck_test.tab_datack;

INSERT INTO ck_test.tab_datack_1 SELECT *
FROM ck_test.tab_datack

↖ Progress: 524.29 thousand rows, 219.97 MB (4.22 thousand rows/s., 1.77 MB/s.) 
↑ Progress: 589.82 thousand rows, 248.37 MB (4.19 thousand rows/s., 1.76 MB/s.) 

直接使用mysql引擎建表插入

代码语言:javascript复制
CREATE TABLE tab_voicedata
ENGINE = MergeTree
ORDER BY id AS
SELECT *
FROM mysql('11x.62.x56.x20:3108', 'pho22_sxxxai19', 'tab_datack', 'gxssiji', 'Gssxxi888*()');

上述这两种方式如果网络不好或者表很大,插入会很慢

2.利用导入文件的方式迁移mysql表到clickhouse

首先创建clickhouse表

代码语言:javascript复制
CREATE TABLE ck_test.tab_datack_1 (
	`id` Int32,
	`phone_id` Int32,
	`phone` Nullable (String),
	`callid` Nullable (Int32),
	`content` Nullable (String),
	`answer` Nullable (String),
	`usedtimems` Nullable (String),
	`file` Nullable (String),
	`addtime` DateTime,
	`publish` Nullable (Int8),
	`operate` Nullable (String),
	`content_time` Nullable (Int32),
	`answer_time` Nullable (Int32),
	`content_new` Nullable (String),
	INDEX idx_operate operate TYPE minmax GRANULARITY 5
) ENGINE = MergeTree
PARTITION BY toYYYYMM(addtime)
ORDER BY (id) 
SETTINGS index_granularity = 8192;

导出mysql表数据到文件:

代码语言:javascript复制
select tab_datack.* into outfile '/tmp/phone_sxxx_tab_datack.txt' from tab_datack;

然后压缩文件打包,并传输到clickhouse服务器上

执行导入操作

代码语言:javascript复制
 cat phone_sxxx_tab_datack.txt | clickhouse-client -h 10.12.11.201 --query="INSERT INTO php_data.tab_datack FORMAT TabSeparated"

这种方式速度很快2000万数据导入执行10s左右

0 人点赞