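Introduction to ClickHouse
ClickHouse is an open-source analytical database from Yandex, well suited to time-series data that is ingested as a stream or in batches. It should not be used as a general-purpose database; rather, it is a distributed, real-time processing platform for very fast queries over massive data sets. ClickHouse is especially fast at aggregation queries such as GROUP BY.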
ClickHouse = Click Event Stream + Data WareHouse
ClickHouse is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).
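Key characteristics of OLAP scenarios
· The vast majority of requests are reads.
· Data is always written in fairly large batches (> 1000 rows).
· Data that has already been added is not modified.
· Each query reads a large number of rows from the database, but only a small number of columns.
· Tables are wide, meaning each table contains a large number of columns.
· Queries are relatively infrequent (usually hundreds of queries per second per server, or fewer).
· For simple queries, a latency of around 50 ms is acceptable.
· Column values are fairly small: numbers and short strings (for example, 60 bytes per URL).
· High throughput is required when processing a single query (up to billions of rows per second per server).
· Transactions are not required.
· Requirements for data consistency are low.
· Each query involves one large table; all other tables are small.
· The query result is significantly smaller than the source data. In other words, after filtering or aggregation the data fits in the memory of a single server.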
[Figure: row-oriented storage (animation)]
[Figure: column-oriented storage (animation)]
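To make the access pattern described above concrete, here is a minimal sketch of a typical OLAP query. The table `page_views` and its columns (`event_date`, `url`, `duration_ms`) are hypothetical and chosen only for illustration. The query scans many rows, touches only three columns of a wide table, and returns a small aggregated result.

```sql
-- Hypothetical wide table: page_views(event_date, url, duration_ms, ...many more columns)
SELECT
    url,
    count()          AS views,            -- number of rows per URL
    avg(duration_ms) AS avg_duration_ms   -- average duration per URL
FROM page_views
WHERE event_date >= '2021-01-01'          -- filter on a single date column
GROUP BY url
ORDER BY views DESC
LIMIT 10;                                 -- the result is tiny compared with the source data
```

With column-oriented storage, only the three referenced columns need to be read from disk, whereas a row-oriented store would have to read every column of every matching row.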
Official documentation: https://clickhouse.tech/ https://clickhouse.tech/docs/en/
GitHub repository: https://github.com/ClickHouse/ClickHouse
Installation
https://clickhouse.tech/docs/en/getting-started/install/
Quick start
Creating a table:
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1] [TTL expr1],
name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2] [TTL expr2],
...
INDEX index_name1 expr1 TYPE type1(...) GRANULARITY value1,
INDEX index_name2 expr2 TYPE type2(...) GRANULARITY value2
) ENGINE = MergeTree()
ORDER BY expr
[PARTITION BY expr]
[PRIMARY KEY expr]
[SAMPLE BY expr]
[TTL expr
[DELETE|TO DISK 'xxx'|TO VOLUME 'xxx' [, ...] ]
[WHERE conditions]
[GROUP BY key_expr [SET v1 = aggr_func(v1) [, v2 = aggr_func(v2) ...]] ] ]
[SETTINGS name=value, ...]
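The template above lists every optional clause, so a concrete instance may be easier to read. The following is a minimal sketch: the table name `page_views`, its columns, and the partitioning and sorting choices are all assumptions made for illustration, not recommendations.

```sql
-- Hypothetical table; names and settings are illustrative only.
CREATE TABLE IF NOT EXISTS page_views
(
    event_date  Date,
    event_time  DateTime,
    user_id     UInt64,
    url         String,
    duration_ms UInt32
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)   -- one partition per calendar month
ORDER BY (event_date, user_id)      -- sorting key; also serves as the primary key when PRIMARY KEY is omitted
SETTINGS index_granularity = 8192;

-- In practice rows should arrive in large batches; three rows keep the sketch short.
INSERT INTO page_views VALUES
    ('2021-03-01', '2021-03-01 10:00:00', 1, 'https://example.com/a', 120),
    ('2021-03-01', '2021-03-01 10:00:05', 2, 'https://example.com/a', 340),
    ('2021-03-02', '2021-03-02 09:30:00', 1, 'https://example.com/b', 80);

-- Aggregation query of the kind ClickHouse is designed for.
SELECT url, count() AS views
FROM page_views
GROUP BY url
ORDER BY views DESC;
```

The statements can be run from `clickhouse-client` against a local server. The optional clauses from the template (PRIMARY KEY, SAMPLE BY, TTL, secondary indexes) are left out here to keep the example small.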
System architecture
Source code reading:
[ ] Access/
[ ] AggregateFunctions/
[ ] Bridge/
[ ] Client/
[ ] Columns/
[ ] Common/
[ ] Compression/
[ ] Coordination/
[ ] Core/
[ ] DataStreams/
[ ] DataTypes/
[ ] Databases/
[ ] Dictionaries/
[ ] Disks/
[ ] Formats/
[ ] Functions/
[ ] IO/
[ ] Interpreters/
[ ] Parsers/
[ ] Processors/
[ ] Server/
[ ] Storages/
[ ] TableFunctions/
Hands-on project: integrating ClickHouse with Spring Boot via the JDBC driver
https://segmentfault.com/a/1190000020606636