StorageReplicatedMergeTree::alter
- Check whether the ALTER is a `MODIFY SETTING`. If it is, it is executed locally only and is not replicated.
```cpp
if (params.isSettingsAlter())
{
    lockStructureExclusively(table_lock_holder, query_context.getCurrentQueryId());
    /// We don't replicate storage_settings_ptr ALTER. It's local operation.
    /// Also we don't upgrade alter lock to table structure lock.
    StorageInMemoryMetadata metadata = getInMemoryMetadata();
    params.apply(metadata);
    changeSettings(metadata.settings_ast, table_lock_holder);
    global_context.getDatabase(table_id.database_name)->alterTable(query_context, table_id.table_name, metadata);
    return;
}
```
- Check whether the table is `is_readonly`. If it is, the `ALTER TABLE` operation cannot be executed.
```cpp
if (is_readonly)
    throw Exception("Can't ALTER readonly table", ErrorCodes::TABLE_IS_READ_ONLY);
```
Note what this implies: even when the table is in the READ ONLY state, `ALTER TABLE MODIFY SETTING` can still be executed, because that branch runs and returns before the READ ONLY check.
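The ordering noted above can be modelled with a small stand-alone sketch. This is not ClickHouse code: `ToyTable`, `ToyAlter`, `toyAlter` and the setting string are hypothetical names chosen for illustration. The only thing the sketch mirrors is that the settings-only branch returns before the read-only check is reached.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for illustration only; not ClickHouse types.
struct ToyAlter
{
    bool settings_only = false;   // plays the role of params.isSettingsAlter()
    std::string setting_change;   // made-up payload, e.g. "some_setting = 1"
};

struct ToyTable
{
    bool is_readonly = false;
    std::string local_settings;
};

// Mirrors the order of the two checks: the settings-only branch runs first
// and returns, so the read-only check below is never reached for it.
void toyAlter(ToyTable & table, const ToyAlter & alter)
{
    if (alter.settings_only)
    {
        table.local_settings = alter.setting_change;  // local change only
        return;
    }

    if (table.is_readonly)
        throw std::runtime_error("Can't ALTER readonly table");

    // The replicated ALTER path would continue here.
}

// Returns true if the ALTER was rejected with an exception.
bool toyAlterIsRejected(ToyTable & table, const ToyAlter & alter)
{
    try
    {
        toyAlter(table, alter);
        return false;
    }
    catch (const std::runtime_error &)
    {
        return true;
    }
}

// Usage: on a read-only table, the settings-only ALTER passes, any other is rejected.
void exampleReadonlyAlter()
{
    ToyTable table;
    table.is_readonly = true;

    ToyAlter settings_alter;
    settings_alter.settings_only = true;
    settings_alter.setting_change = "some_setting = 1";
    assert(!toyAlterIsRejected(table, settings_alter));

    ToyAlter other_alter;  // settings_only stays false
    assert(toyAlterIsRejected(table, other_alter));
}
```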
- Build the new metadata from the in-memory `metadata`.
```cpp
ReplicatedMergeTreeTableMetadata future_metadata_in_zk(*this);
if (ast_to_str(future_metadata.order_by_ast) != ast_to_str(current_metadata.order_by_ast))
    future_metadata_in_zk.sorting_key = serializeAST(*extractKeyExpressionList(future_metadata.order_by_ast));

if (ast_to_str(future_metadata.ttl_for_table_ast) != ast_to_str(current_metadata.ttl_for_table_ast))
    future_metadata_in_zk.ttl_table = serializeAST(*future_metadata.ttl_for_table_ast);

String new_indices_str = future_metadata.indices.toString();
if (new_indices_str != current_metadata.indices.toString())
    future_metadata_in_zk.skip_indices = new_indices_str;

String new_constraints_str = future_metadata.constraints.toString();
if (new_constraints_str != current_metadata.constraints.toString())
    future_metadata_in_zk.constraints = new_constraints_str;
```
Interestingly, this is how the information stored in `/clickhouse/on_time/tables/ontime_local/1/metadata` is read:
```cpp
void ReplicatedMergeTreeTableMetadata::read(ReadBuffer & in)
{
    in >> "metadata format version: 1\n";
    in >> "date column: " >> date_column >> "\n";
    in >> "sampling expression: " >> sampling_expression >> "\n";
    in >> "index granularity: " >> index_granularity >> "\n";
    in >> "mode: " >> merging_params_mode >> "\n";
    in >> "sign column: " >> sign_column >> "\n";
    in >> "primary key: " >> primary_key >> "\n";
    /// ... (the excerpt stops here)
}
```
The partition key is still referred to as `date column` here.
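To make the layout of that znode concrete, here is a minimal sketch that parses the `label: value` lines appearing in the `read()` excerpt. It uses only the standard library rather than ClickHouse's `ReadBuffer`. The names `ToyMetadata`, `expectField` and `parseToyMetadata`, as well as the sample values, are assumptions made for illustration and were not taken from a real znode.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical holder for the fields read in the excerpt; not a ClickHouse type.
struct ToyMetadata
{
    std::string date_column;
    std::string sampling_expression;
    std::string index_granularity;
    std::string mode;
    std::string sign_column;
    std::string primary_key;
};

// Reads one line, checks that it starts with `label`, and returns the rest.
static std::string expectField(std::istream & in, const std::string & label)
{
    std::string line;
    if (!std::getline(in, line) || line.compare(0, label.size(), label) != 0)
        throw std::runtime_error("expected '" + label + "'");
    return line.substr(label.size());
}

ToyMetadata parseToyMetadata(const std::string & text)
{
    std::istringstream in(text);
    // The version line carries no value, so nothing may follow the label.
    if (!expectField(in, "metadata format version: 1").empty())
        throw std::runtime_error("unsupported metadata format version");

    ToyMetadata m;
    m.date_column = expectField(in, "date column: ");
    m.sampling_expression = expectField(in, "sampling expression: ");
    m.index_granularity = expectField(in, "index granularity: ");
    m.mode = expectField(in, "mode: ");
    m.sign_column = expectField(in, "sign column: ");
    m.primary_key = expectField(in, "primary key: ");
    return m;
}

// Usage with made-up sample values.
void exampleParseToyMetadata()
{
    const ToyMetadata m = parseToyMetadata(
        "metadata format version: 1\n"
        "date column: EventDate\n"
        "sampling expression: \n"
        "index granularity: 8192\n"
        "mode: 0\n"
        "sign column: \n"
        "primary key: CounterID, EventDate\n");
    assert(m.date_column == "EventDate");
    assert(m.sampling_expression.empty());
    assert(m.primary_key == "CounterID, EventDate");
}
```

Because the labels are fixed strings, a reader that meets an unexpected label fails immediately, which is also why the old `date column` label remains in the format.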
- Work out which ZooKeeper nodes need to be created or modified:
  1) `/metadata`
  2) `/columns`
  3) `/log/log-`
  4) `/mutations`, if `have_mutation`
- Send the request to ZooKeeper: `int32_t rc = zookeeper->tryMulti(ops, results);`
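A ZooKeeper multi request is all-or-nothing: either every operation in the batch is applied or none is. The sketch below models that behaviour with an in-memory map. It is a toy, not the ZooKeeper client API and not the ClickHouse wrapper: `ToyStore`, `ToyOp` and `toyTryMulti` are hypothetical names, the node contents are made up, and sequential node naming (the suffix ZooKeeper appends to `/log/log-`) is not modelled.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory model; not the ZooKeeper client API.
struct ToyNode
{
    std::string data;
    int version = 0;
};

struct ToyOp
{
    enum Kind { Create, Set } kind;
    std::string path;
    std::string data;
    int expected_version;  // checked for Set only; -1 means "any version"
};

using ToyStore = std::map<std::string, ToyNode>;

// Applies every op or none: all work happens on a copy that replaces the
// real store only if no op failed.
bool toyTryMulti(ToyStore & store, const std::vector<ToyOp> & ops)
{
    ToyStore staged = store;
    for (const ToyOp & op : ops)
    {
        auto it = staged.find(op.path);
        if (op.kind == ToyOp::Create)
        {
            if (it != staged.end())
                return false;  // node already exists
            ToyNode node;
            node.data = op.data;
            staged[op.path] = node;
        }
        else
        {
            if (it == staged.end())
                return false;  // node is missing
            if (op.expected_version != -1 && it->second.version != op.expected_version)
                return false;  // someone else changed the node in the meantime
            it->second.data = op.data;
            ++it->second.version;
        }
    }
    store = staged;
    return true;
}

// Usage: a stale version on /columns makes the whole batch fail, so
// /metadata keeps its old content as well.
void exampleToyTryMulti()
{
    ToyStore store;
    store["/metadata"].data = "old metadata";
    store["/columns"].data = "old columns";

    const std::vector<ToyOp> stale = {
        {ToyOp::Set, "/metadata", "new metadata", 0},
        {ToyOp::Set, "/columns", "new columns", 5}};
    assert(!toyTryMulti(store, stale));
    assert(store["/metadata"].data == "old metadata");
}
```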
- Depending on the `replication_alter_partitions_sync` setting, decide whether to wait for the other replicas to finish processing the log entry.
```cpp
if (query_context.getSettingsRef().replication_alter_partitions_sync == 2)
{
    LOG_DEBUG(log, "Updated shared metadata nodes in ZooKeeper. Waiting for replicas to apply changes.");
    unwaited = waitForAllReplicasToProcessLogEntry(*alter_entry, false);
}
else if (query_context.getSettingsRef().replication_alter_partitions_sync == 1)
{
    LOG_DEBUG(log, "Updated shared metadata nodes in ZooKeeper. Waiting for replicas to apply changes.");
    waitForReplicaToProcessLogEntry(replica_name, *alter_entry);
}
```
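The branching on `replication_alter_partitions_sync` boils down to picking the set of replicas to wait for. A minimal sketch of that mapping, following only the two branches shown in the excerpt (any other value waits for nobody); the function name `replicasToWaitFor` and the replica names are hypothetical:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: which replicas the ALTER waits for, per the excerpt.
//   2 -> all replicas, 1 -> only the replica that ran the ALTER, otherwise none.
std::vector<std::string> replicasToWaitFor(
    unsigned sync_setting,
    const std::string & own_replica,
    const std::vector<std::string> & all_replicas)
{
    if (sync_setting == 2)
        return all_replicas;
    if (sync_setting == 1)
        return {own_replica};
    return {};
}

// Usage with made-up replica names.
void exampleReplicasToWaitFor()
{
    const std::vector<std::string> all = {"r1", "r2", "r3"};
    assert(replicasToWaitFor(0, "r1", all).empty());
    assert(replicasToWaitFor(1, "r1", all).size() == 1);
    assert(replicasToWaitFor(2, "r1", all) == all);
}
```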
- If the ALTER involves a `mutation`, also decide whether to wait for the mutation to finish.
```cpp
if (mutation_znode)
{
    LOG_DEBUG(log, "Metadata changes applied. Will wait for data changes.");
    waitMutation(*mutation_znode, query_context.getSettingsRef().replication_alter_partitions_sync);
    LOG_DEBUG(log, "Data changes applied.");
}
```
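- An interesting question: which operations count as a `mutation` and therefore trigger the check in the seventh step above (waiting for the mutation)? `src/Storages/MutationCommands.h` gives the answer: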
```cpp
enum Type
{
    EMPTY,     /// Not used.
    DELETE,
    UPDATE,
    MATERIALIZE_INDEX,
    READ_COLUMN,
    DROP_COLUMN,
    DROP_INDEX,
    MATERIALIZE_TTL
};
```
Oddly, for `ALTER TABLE` it is `replication_alter_partitions_sync`, not `mutations_sync`, that controls whether the ALTER is executed synchronously. I have opened an issue upstream about this:
https://github.com/ClickHouse/ClickHouse/issues/21821