C-Store: A Column-oriented DBMS

2022-12-02 20:41:05 浏览数 (1)

今天看了一篇CStore论文,笔记记录一下,后续内容待补充~

目录

1.C-Store: A Column-oriented DBMS

1.1 Proposal

1.2 Why C-Store?

1.3 Architecture

1.4 Innovative

1.5 Projections

1.6 Segments and Storage Keys

1.7 Join Indices

1.8 RS and WS

1.8.1 encoding schemes

1.8.2 WS

1.9 Storage Management

C-Store: A Column-oriented DBMS

1.1 Proposal

  • Compare Row-store with Column-store DBMS

Row-store DBMS

Column-store DBMS

Attributes of a row are placed contiguously in storage

Values for each attribute (column) are stored contiguously

Write-optimized system

Read-optimized system

More suitable for OLTP-style applications

Data warehouses, OLAP

1.2 Why C-Store?

  • Can only bring required attribute values into memory
  • Data can be packed densely together rather than aligned by byte or word boundaries
  • Easy to encode together column values of the same data type
  • Typical queries involve aggregates over important attributes
  • ...

1.3 Architecture

Writeable Store (WS) which is architected to support high performance inserts and updates.

Read-optimized Store (RS), which is capable of supporting very large amounts of information.

This architecture avoid locking with write based workload.

Components:

  • WS:support high performance inserts and updates.
  • RS:support high performance reads.
  • Tuple Mover:moves updated records from WS to RS.

Key Points:

  • Queries (in SQL) must access data in both storage systems.
  • Updates are implemented as an insert and a delete.
    • Inserts are sent to WS, while deletes must be marked in RS for later purging by the tuple mover.
  • merge out process
  • use a variant of the LSM-tree concept to merge ordered WS data objects with large RS blocks.

art

1.4 Innovative

  • hybrid architecture
  • projections
  • Heavily compressed columns using one of several coding schemes
  • A column-oriented optimizer and executor
  • High availability and improved performance through K-safety using a sufficient number of overlapping projections(基于 projections 的 HA 和 K-safety)
  • The use of snapshot isolation to avoid 2PC and locking for queries.

1.5 Projections

Projections Definition:Groups of columns which are stored in physically contiguous manner

  • Anchoredon a logical table T
  • Can also contain columnsfrom other tables that are related to T through foreign keys, etc.

emp

For example:

代码语言:javascript复制
EMP1 (name, age)
EMP2 (dept, age, DEPT.floor)
EMP3 (name, salary)
DEPT1(dname, floor

1.6 Segments and Storage Keys

  • Segments
    • Every projection is divided into segments
    • Value-based partitioning on the sort key
  • Storage Keys
    • In each segment, different column values belongingtothesamerecord are identified by SK

1.7 Join Indices

Used to reconstruct a table from its projections.For example:Procection2 -> Projcection1

join_index

1.8 RS and WS

1.8.1 encoding schemes
  1. Self-order, few distinct values(or with most of the same values)
代码语言:javascript复制
1,1,1,1,3,3,3,3,3,3,3,5,5,5
(v, f, n) -> v:the column where first appears, f: the position in the column, n: the number of times v appears
(1, 1, 4), (3, 5, 7),(5, 12, 3)
  1. Foreign-order, few distinct values(or with most of the same values)
代码语言:javascript复制
0,0,1,1,2,1,0,2,1
(v, b) -> v: value, b: bitset for this value 
(0, 110000100), (1, 001101001), and (2,000010010)
  1. Self-order, many distinct values
代码语言:javascript复制
1,4,7,7,8,12
(1,3,3,0,1,4)

The idea for this scheme is to represent every value in the column as a delta from the previous value in the column

  1. Foreign-order, few distinct values

leave it unencoded

1.8.2 WS
  • The difference being columns are not encoded since it is assumed that WS is trivial in size as compared to RS.
  • Sid and SK identify the same tuple in RS and WS.
  • Every column represented as (v, sk) and a B-tree is built for sk.
  • Sort key s represented as (s, sk)

1.9 Storage Management

storage allocator:It will dynamically adjust the partition of the Segment, or apply for the Segment.The corresponding WS and the Join Indexes related to the RS also need to be adjusted accordingly.

TODO:事务、查询器、执行器、恢复等,下一篇Vertica...

0 人点赞