ClickHouse is a true column-oriented database management system (DBMS) for online analytical processing (OLAP).
Well-known column-oriented databases include: Vertica, Paraccel (Actian Matrix, Amazon Redshift), Sybase IQ, Exasol, Infobright, InfiniDB, MonetDB (VectorWise, Actian Vector), LucidDB, SAP HANA, Google Dremel, Google PowerDrill, Druid, and kdb. Below is the help documentation of the clickhouse commands. The version covered here is ClickHouse server/client 22.3.1.1; for other versions, use this as a rough reference.
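The dumps below can be reproduced locally. A minimal sketch, assuming the standard clickhouse-server / clickhouse-client entry points are on PATH (on single-binary installs the equivalent forms are "clickhouse server --help" and "clickhouse client --help"):

clickhouse-server --help      # prints the ck-server options listed in section 1
clickhouse-client --help      # prints the ck-client options listed in section 2
clickhouse-client --version   # confirm you are on 22.3.1.1 before comparing output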
1. ck-server help file
usage:
clickhouse [OPTION] [-- [ARG]...]
positional arguments can be used to rewrite config.xml properties, for
example, --http_port=8010
-h, --help show help and exit
-V, --version show version and exit
-C<file>, --config-file=<file> load configuration from a given file
-L<file>, --log-file=<file> use given log file
-E<file>, --errorlog-file=<file> use given log file for errors only
-P<file>, --pid-file=<file> use given pidfile
--daemon Run application as a daemon.
--umask=mask Set the daemon's umask (octal, e.g. 027).
--pidfile=path Write the process ID of the application to
given file.
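As a usage illustration of the server options above, a daemonized start with an overridden HTTP port could look like the sketch below. The config, log and pid paths are assumptions based on the default package layout and should be adjusted to your installation:

clickhouse-server --daemon \
    --config-file=/etc/clickhouse-server/config.xml \
    --log-file=/var/log/clickhouse-server/clickhouse-server.log \
    --errorlog-file=/var/log/clickhouse-server/clickhouse-server.err.log \
    --pid-file=/var/run/clickhouse-server/clickhouse-server.pid \
    --http_port=8010   # extra arguments like this rewrite config.xml properties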
2. ck-client help file
Main options:
--help produce help message
-V [ --version ] print version information and exit
--version-clean print version in machine-readable
format and exit
-C [ --config-file ] arg config-file path
--queries-file arg file path with queries to execute;
multiple files can be specified
(--queries-file file1 file2...)
-d [ --database ] arg database
--history_file arg path to history file
-q [ --query ] arg query
--stage arg (=complete) Request query processing up to
specified stage: complete,fetch_columns
,with_mergeable_state,with_mergeable_st
ate_after_aggregation,with_mergeable_st
ate_after_aggregation_and_limit
--query_id arg query_id
--progress print progress of queries execution
-A [ --disable_suggestion ] Disable loading suggestion data. Note
that suggestion data is loaded
asynchronously through a second
connection to ClickHouse server. Also
it is reasonable to disable suggestion
if you want to paste a query with TAB
characters. Shorthand option -A is for
those who get used to mysql client.
-t [ --time ] print query execution time to stderr in
non-interactive mode (for benchmarks)
--echo in batch mode, print query before
execution
--verbose print query and other debugging info
--log-level arg log level
--server_logs_file arg put server logs into specified file
-m [ --multiline ] multiline
-n [ --multiquery ] multiquery
--suggestion_limit arg (=10000) Suggestion limit for how many
databases, tables and columns to fetch.
-f [ --format ] arg default output format
-E [ --vertical ] vertical output format, same as
--format=Vertical or FORMAT Vertical or
\G at end of command
--highlight arg (=1) enable or disable basic syntax
highlight in interactive command line
--ignore-error do not stop processing in multiquery
mode
--stacktrace print stack traces of exceptions
--hardware-utilization print hardware utilization information
in progress bar
--print-profile-events Printing ProfileEvents packets
--profile-events-delay-ms arg (=0) Delay between printing `ProfileEvents`
packets (-1 - print only totals, 0 -
print every single packet)
--interactive Process queries-file or --query query
and start interactive mode
--pager arg Pipe all output into this command (less
or similar)
--max_memory_usage_in_client arg Set memory limit in client/local server
-c [ --config ] arg config-file path (another shorthand)
-s [ --secure ] Use TLS connection
-u [ --user ] arg (=default) user
--password arg password
--ask-password ask-password
--quota_key arg A string to differentiate quotas when
the user have keyed quotas configured
on server
-T [ --testmode ] enable test hints in comments
--max_client_network_bandwidth arg the maximum speed of data exchange over
the network for the client in bytes per
second.
--compression arg enable or disable compression
--query-fuzzer-runs arg (=0) After executing every SELECT query, do
random mutations in it and run again
specified number of times. This is used
for testing to discover unexpected
corner cases.
--interleave-queries-file arg file path with queries to execute
before every file from 'queries-file';
multiple files can be specified
(--queries-file file1 file2...); this
is needed to enable more aggressive
fuzzing of newly added tests (see
'query-fuzzer-runs' option)
--opentelemetry-traceparent arg OpenTelemetry traceparent header as
described by W3C Trace Context
recommendation
--opentelemetry-tracestate arg OpenTelemetry tracestate header as
described by W3C Trace Context
recommendation
--no-warnings disable warnings when client connects
to server
--min_compress_block_size arg The actual size of the block to
compress, if the uncompressed data less
than max_compress_block_size is no less
than this value and no less than the
volume of data for one mark.
--max_compress_block_size arg The maximum size of blocks of
uncompressed data before compressing
for writing to a table.
--max_block_size arg Maximum block size for reading
--max_insert_block_size arg The maximum block size for insertion,
if we control the creation of blocks
for insertion.
--min_insert_block_size_rows arg Squash blocks passed to INSERT query to
specified size in rows, if blocks are
not big enough.
--min_insert_block_size_bytes arg Squash blocks passed to INSERT query to
specified size in bytes, if blocks are
not big enough.
--min_insert_block_size_rows_for_materialized_views arg
Like min_insert_block_size_rows, but
applied only during pushing to
MATERIALIZED VIEW (default:
min_insert_block_size_rows)
--min_insert_block_size_bytes_for_materialized_views arg
Like min_insert_block_size_bytes, but
applied only during pushing to
MATERIALIZED VIEW (default:
min_insert_block_size_bytes)
--max_joined_block_size_rows arg Maximum block size for JOIN result (if
join algorithm supports it). 0 means
unlimited.
--max_insert_threads arg The maximum number of threads to
execute the INSERT SELECT query. Values
0 or 1 means that INSERT SELECT is not
run in parallel. Higher values will
lead to higher memory usage. Parallel
INSERT SELECT has effect only if the
SELECT part is run on parallel, see
'max_threads' setting.
--max_final_threads arg The maximum number of threads to read
from table with FINAL.
--max_threads arg The maximum number of threads to
execute the request. By default, it is
determined automatically.
--max_read_buffer_size arg The maximum size of the buffer to read
from the filesystem.
--max_distributed_connections arg The maximum number of connections for
distributed processing of one query
(should be greater than max_threads).
--max_query_size arg Which part of the query can be read
into RAM for parsing (the remaining
data for INSERT, if any, is read later)
--interactive_delay arg The interval in microseconds to check
if the request is cancelled, and to
send progress info.
--connect_timeout arg Connection timeout if there are no
replicas.
--connect_timeout_with_failover_ms arg
Connection timeout for selecting first
healthy replica.
--connect_timeout_with_failover_secure_ms arg
Connection timeout for selecting first
healthy replica (for secure
connections).
--receive_timeout arg
--send_timeout arg
--drain_timeout arg Timeout for draining remote
connections, -1 means synchronous drain
w/o ignoring errors
--tcp_keep_alive_timeout arg The time in seconds the connection
needs to remain idle before TCP starts
sending keepalive probes
--hedged_connection_timeout_ms arg Connection timeout for establishing
connection with replica for Hedged
requests
--receive_data_timeout_ms arg Connection timeout for receiving first
packet of data or packet with positive
progress from replica
--use_hedged_requests arg Use hedged requests for distributed
queries
--allow_changing_replica_until_first_data_packet arg
Allow HedgedConnections to change
replica until receiving first data
packet
--queue_max_wait_ms arg The wait time in the request queue, if
the number of concurrent requests
exceeds the maximum.
--connection_pool_max_wait_ms arg The wait time when the connection pool
is full.
--replace_running_query_max_wait_ms arg
The wait time for running query with
the same query_id to finish when
setting 'replace_running_query' is
active.
--kafka_max_wait_ms arg The wait time for reading from Kafka
before retry.
--rabbitmq_max_wait_ms arg The wait time for reading from RabbitMQ
before retry.
--poll_interval arg Block at the query wait loop on the
server for the specified number of
seconds.
--idle_connection_timeout arg Close idle TCP connections after
specified number of seconds.
--distributed_connections_pool_size arg
Maximum number of connections with one
remote server in the pool.
--connections_with_failover_max_tries arg
The maximum number of attempts to
connect to replicas.
--s3_min_upload_part_size arg The minimum size of part to upload
during multipart upload to S3.
--s3_upload_part_size_multiply_factor arg
Multiply s3_min_upload_part_size by
this factor each time
s3_multiply_parts_count_threshold parts
were uploaded from a single write to
S3.
--s3_upload_part_size_multiply_parts_count_threshold arg
Each time this number of parts was
uploaded to S3 s3_min_upload_part_size
multiplied by s3_upload_part_size_multi
ply_factor.
--s3_max_single_part_upload_size arg The maximum size of object to upload
using singlepart upload to S3.
--s3_max_single_read_retries arg The maximum number of retries during
single S3 read.
--s3_max_redirects arg Max number of S3 redirects hops
allowed.
--s3_max_connections arg The maximum number of connections per
server.
--s3_truncate_on_insert arg Enables or disables truncate before
insert in s3 engine tables.
--s3_create_new_file_on_insert arg Enables or disables creating a new file
on each insert in s3 engine tables
--hdfs_replication arg The actual number of replications can
be specified when the hdfs file is
created.
--hdfs_truncate_on_insert arg Enables or disables truncate before
insert in s3 engine tables
--hdfs_create_new_file_on_insert arg Enables or disables creating a new file
on each insert in hdfs engine tables
--hsts_max_age arg Expired time for hsts. 0 means disable
HSTS.
--extremes arg Calculate minimums and maximums of the
result columns. They can be output in
JSON-formats.
--use_uncompressed_cache arg Whether to use the cache of
uncompressed blocks.
--replace_running_query arg Whether the running request should be
canceled with the same id as the new
one.
--background_buffer_flush_schedule_pool_size arg
Number of threads performing background
flush for tables with Buffer engine.
Only has meaning at server startup.
--background_pool_size arg Number of threads to perform merges and
mutations in background. Only has
meaning at server startup.
--background_merges_mutations_concurrency_ratio arg
Ratio between a number of how many
operations could be processed and a
number threads to process them. Only
has meaning at server startup.
--background_move_pool_size arg Number of threads performing background
moves for tables. Only has meaning at
server startup.
--background_fetches_pool_size arg Number of threads performing background
fetches for replicated tables. Only has
meaning at server startup.
--background_common_pool_size arg Number of threads for some lightweight
tasks for replicated tables (like
cleaning old parts etc.). Only has
meaning at server startup.
--background_schedule_pool_size arg Number of threads performing background
tasks for replicated tables, dns cache
updates. Only has meaning at server
startup.
--background_message_broker_schedule_pool_size arg
Number of threads performing background
tasks for message streaming. Only has
meaning at server startup.
--background_distributed_schedule_pool_size arg
Number of threads performing background
tasks for distributed sends. Only has
meaning at server startup.
--max_replicated_fetches_network_bandwidth_for_server arg
The maximum speed of data exchange over
the network in bytes per second for
replicated fetches. Zero means
unlimited. Only has meaning at server
startup.
--max_replicated_sends_network_bandwidth_for_server arg
The maximum speed of data exchange over
the network in bytes per second for
replicated sends. Zero means unlimited.
Only has meaning at server startup.
--stream_like_engine_allow_direct_select arg
Allow direct SELECT query for Kafka,
RabbitMQ and FileLog engines. In case
there are attached materialized views,
SELECT query is not allowed even if
this setting is enabled.
--distributed_directory_monitor_sleep_time_ms arg
Sleep time for StorageDistributed
DirectoryMonitors, in case of any
errors delay grows exponentially.
--distributed_directory_monitor_max_sleep_time_ms arg
Maximum sleep time for
StorageDistributed DirectoryMonitors,
it limits exponential growth too.
--distributed_directory_monitor_batch_inserts arg
Should StorageDistributed
DirectoryMonitors try to batch
individual inserts into bigger ones.
--distributed_directory_monitor_split_batch_on_failure arg
Should StorageDistributed
DirectoryMonitors try to split batch
into smaller in case of failures.
--optimize_move_to_prewhere arg Allows disabling WHERE to PREWHERE
optimization in SELECT queries from
MergeTree.
--optimize_move_to_prewhere_if_final arg
If query has `FINAL`, the optimization
`move_to_prewhere` is not always
correct and it is enabled only if both
settings `optimize_move_to_prewhere`
and `optimize_move_to_prewhere_if_final
` are turned on
--replication_alter_partitions_sync arg
Wait for actions to manipulate the
partitions. 0 - do not wait, 1 - wait
for execution only of itself, 2 - wait
for everyone.
--replication_wait_for_inactive_replica_timeout arg
Wait for inactive replica to execute
ALTER/OPTIMIZE. Time in seconds, 0 - do
not wait, negative - wait for unlimited
time.
--load_balancing arg Which replicas (among healthy replicas)
to preferably send a query to (on the
first attempt) for distributed
processing.
--load_balancing_first_offset arg Which replica to preferably send a
query when FIRST_OR_RANDOM load
balancing strategy is used.
--totals_mode arg How to calculate TOTALS when HAVING is
present, as well as when
max_rows_to_group_by and
group_by_overflow_mode = 'any' are
present.
--totals_auto_threshold arg The threshold for totals_mode = 'auto'.
--allow_suspicious_low_cardinality_types arg
In CREATE TABLE statement allows
specifying LowCardinality modifier for
types of small fixed size (8 or less).
Enabling this may increase merge times
and memory consumption.
--compile_expressions arg Compile some scalar functions and
operators to native code.
--min_count_to_compile_expression arg The number of identical expressions
before they are JIT-compiled
--compile_aggregate_expressions arg Compile aggregate functions to native
code.
--min_count_to_compile_aggregate_expression arg
The number of identical aggregate
expressions before they are
JIT-compiled
--group_by_two_level_threshold arg From what number of keys, a two-level
aggregation starts. 0 - the threshold
is not set.
--group_by_two_level_threshold_bytes arg
From what size of the aggregation state
in bytes, a two-level aggregation
begins to be used. 0 - the threshold is
not set. Two-level aggregation is used
when at least one of the thresholds is
triggered.
--distributed_aggregation_memory_efficient arg
Is the memory-saving mode of
distributed aggregation enabled.
--aggregation_memory_efficient_merge_threads arg
Number of threads to use for merge
intermediate aggregation results in
memory efficient mode. When bigger,
then more memory is consumed. 0 means -
same as 'max_threads'.
--enable_positional_arguments arg Enable positional arguments in ORDER
BY, GROUP BY and LIMIT BY
--max_parallel_replicas arg The maximum number of replicas of each
shard used when the query is executed.
For consistency (to get different parts
of the same partition), this option
only works for the specified sampling
key. The lag of the replicas is not
controlled.
--parallel_replicas_count arg
--parallel_replica_offset arg
--allow_experimental_parallel_reading_from_replicas arg
If true, ClickHouse will send a SELECT
query to all replicas of a table. It
will work for any kind on MergeTree
table.
--skip_unavailable_shards arg If true, ClickHouse silently skips
unavailable shards and nodes
unresolvable through DNS. Shard is
marked as unavailable when none of the
replicas can be reached.
--parallel_distributed_insert_select arg
Process distributed INSERT SELECT query
in the same cluster on local tables on
every shard, if 1 SELECT is executed on
each shard, if 2 SELECT and INSERT is
executed on each shard
--distributed_group_by_no_merge arg If 1, Do not merge aggregation states
from different servers for distributed
queries (shards will process query up
to the Complete stage, initiator just
proxies the data from the shards). If 2
the initiator will apply ORDER BY and
LIMIT stages (it is not in case when
shard process query up to the Complete
stage)
--distributed_push_down_limit arg If 1, LIMIT will be applied on each
shard separatelly. Usually you don't
need to use it, since this will be done
automatically if it is possible, i.e.
for simple query SELECT FROM LIMIT.
--optimize_distributed_group_by_sharding_key arg
Optimize GROUP BY sharding_key queries
(by avoiding costly aggregation on the
initiator server).
--optimize_skip_unused_shards_limit arg
Limit for number of sharding key
values, turns off optimize_skip_unused_
shards if the limit is reached
--optimize_skip_unused_shards arg Assumes that data is distributed by
sharding_key. Optimization to skip
unused shards if SELECT query filters
by sharding_key.
--optimize_skip_unused_shards_rewrite_in arg
Rewrite IN in query for remote shards
to exclude values that does not belong
to the shard (requires
optimize_skip_unused_shards)
--allow_nondeterministic_optimize_skip_unused_shards arg
Allow non-deterministic functions
(includes dictGet) in sharding_key for
optimize_skip_unused_shards
--force_optimize_skip_unused_shards arg
Throw an exception if unused shards
cannot be skipped (1 - throw only if
the table has the sharding key, 2 -
always throw.
--optimize_skip_unused_shards_nesting arg
Same as optimize_skip_unused_shards,
but accept nesting level until which it
will work.
--force_optimize_skip_unused_shards_nesting arg
Same as force_optimize_skip_unused_shar
ds, but accept nesting level until
which it will work.
--input_format_parallel_parsing arg Enable parallel parsing for some data
formats.
--min_chunk_bytes_for_parallel_parsing arg
The minimum chunk size in bytes, which
each thread will parse in parallel.
--output_format_parallel_formatting arg
Enable parallel formatting for some
data formats.
--merge_tree_min_rows_for_concurrent_read arg
If at least as many lines are read from
one file, the reading can be
parallelized.
--merge_tree_min_bytes_for_concurrent_read arg
If at least as many bytes are read from
one file, the reading can be
parallelized.
--merge_tree_min_rows_for_seek arg You can skip reading more than that
number of rows at the price of one seek
per file.
--merge_tree_min_bytes_for_seek arg You can skip reading more than that
number of bytes at the price of one
seek per file.
--merge_tree_coarse_index_granularity arg
If the index segment can contain the
required keys, divide it into as many
parts and recursively check them.
--merge_tree_max_rows_to_use_cache arg
The maximum number of rows per request,
to use the cache of uncompressed data.
If the request is large, the cache is
not used. (For large queries not to
flush out the cache.)
--merge_tree_max_bytes_to_use_cache arg
The maximum number of bytes per
request, to use the cache of
uncompressed data. If the request is
large, the cache is not used. (For
large queries not to flush out the
cache.)
--do_not_merge_across_partitions_select_final arg
Merge parts only in one partition in
select final
--mysql_max_rows_to_insert arg The maximum number of rows in MySQL
batch insertion of the MySQL storage
engine
--optimize_min_equality_disjunction_chain_length arg
The minimum length of the expression
`expr = x1 OR ... expr = xN` for
optimization
--min_bytes_to_use_direct_io arg The minimum number of bytes for reading
the data with O_DIRECT option during
SELECT queries execution. 0 - disabled.
--min_bytes_to_use_mmap_io arg The minimum number of bytes for reading
the data with mmap option during SELECT
queries execution. 0 - disabled.
--checksum_on_read arg Validate checksums on reading. It is
enabled by default and should be always
enabled in production. Please do not
expect any benefits in disabling this
setting. It may only be used for
experiments and benchmarks. The setting
only applicable for tables of MergeTree
family. Checksums are always validated
for other table engines and when
receiving data over network.
--force_index_by_date arg Throw an exception if there is a
partition key in a table, and it is not
used.
--force_primary_key arg Throw an exception if there is primary
key in a table, and it is not used.
--use_skip_indexes arg Use data skipping indexes during query
execution.
--use_skip_indexes_if_final arg If query has FINAL, then skipping data
based on indexes may produce incorrect
result, hence disabled by default.
--force_data_skipping_indices arg Comma separated list of strings or
literals with the name of the data
skipping indices that should be used
during query execution, otherwise an
exception will be thrown.
--max_streams_to_max_threads_ratio arg
Allows you to use more sources than the
number of threads - to more evenly
distribute work across threads. It is
assumed that this is a temporary
solution, since it will be possible in
the future to make the number of
sources equal to the number of threads,
but for each source to dynamically
select available work for itself.
--max_streams_multiplier_for_merge_tables arg
Ask more streams when reading from
Merge table. Streams will be spread
across tables that Merge table will
use. This allows more even distribution
of work across threads and especially
helpful when merged tables differ in
size.
--network_compression_method arg Allows you to select the method of data
compression when writing.
--network_zstd_compression_level arg Allows you to select the level of ZSTD
compression.
--priority arg Priority of the query. 1 - the highest,
higher value - lower priority; 0 - do
not use priorities.
--os_thread_priority arg If non zero - set corresponding 'nice'
value for query processing threads. Can
be used to adjust query priority for OS
scheduler.
--log_queries arg Log requests and write the log to the
system table.
--log_formatted_queries arg Log formatted queries and write the log
to the system table.
--log_queries_min_type arg Minimal type in query_log to log,
possible values (from low to high):
QUERY_START, QUERY_FINISH,
EXCEPTION_BEFORE_START,
EXCEPTION_WHILE_PROCESSING.
--log_queries_min_query_duration_ms arg
Minimal time for the query to run, to
get to the query_log/query_thread_log/q
uery_views_log.
--log_queries_cut_to_length arg If query length is greater than
specified threshold (in bytes), then
cut query when writing to query log.
Also limit length of printed query in
ordinary text log.
--log_queries_probability arg Log queries with the specified
probabality.
--distributed_product_mode arg How are distributed subqueries
performed inside IN or JOIN sections?
--max_concurrent_queries_for_all_users arg
The maximum number of concurrent
requests for all users.
--max_concurrent_queries_for_user arg The maximum number of concurrent
requests per user.
--insert_deduplicate arg For INSERT queries in the replicated
table, specifies that deduplication of
insertings blocks should be performed
--insert_quorum arg For INSERT queries in the replicated
table, wait writing for the specified
number of replicas and linearize the
addition of the data. 0 - disabled.
--insert_quorum_timeout arg
--insert_quorum_parallel arg For quorum INSERT queries - enable to
make parallel inserts without
linearizability
--select_sequential_consistency arg For SELECT queries from the replicated
table, throw an exception if the
replica does not have a chunk written
with the quorum; do not read the parts
that have not yet been written with the
quorum.
--table_function_remote_max_addresses arg
The maximum number of different shards
and the maximum number of replicas of
one shard in the `remote` function.
--read_backoff_min_latency_ms arg Setting to reduce the number of threads
in case of slow reads. Pay attention
only to reads that took at least that
much time.
--read_backoff_max_throughput arg Settings to reduce the number of
threads in case of slow reads. Count
events when the read bandwidth is less
than that many bytes per second.
--read_backoff_min_interval_between_events_ms arg
Settings to reduce the number of
threads in case of slow reads. Do not
pay attention to the event, if the
previous one has passed less than a
certain amount of time.
--read_backoff_min_events arg Settings to reduce the number of
threads in case of slow reads. The
number of events after which the number
of threads will be reduced.
--read_backoff_min_concurrency arg Settings to try keeping the minimal
number of threads in case of slow
reads.
--memory_tracker_fault_probability arg
For testing of `exception safety` -
throw an exception every time you
allocate memory with the specified
probability.
--enable_http_compression arg Compress the result if the client over
HTTP said that it understands data
compressed by gzip or deflate.
--http_zlib_compression_level arg Compression level - used if the client
on HTTP said that it understands data
compressed by gzip or deflate.
--http_native_compression_disable_checksumming_on_decompress arg
If you uncompress the POST data from
the client compressed by the native
format, do not check the checksum.
--count_distinct_implementation arg What aggregate function to use for
implementation of count(DISTINCT ...)
--add_http_cors_header arg Write add http CORS header.
--max_http_get_redirects arg Max number of http GET redirects hops
allowed. Make sure additional security
measures are in place to prevent a
malicious server to redirect your
requests to unexpected services.
--use_client_time_zone arg Use client timezone for interpreting
DateTime string values, instead of
adopting server timezone.
--send_progress_in_http_headers arg Send progress notifications using
X-ClickHouse-Progress headers. Some
clients do not support high amount of
HTTP headers (Python requests in
particular), so it is disabled by
default.
--http_headers_progress_interval_ms arg
Do not send HTTP headers
X-ClickHouse-Progress more frequently
than at each specified interval.
--fsync_metadata arg Do fsync after changing metadata for
tables and databases (.sql files).
Could be disabled in case of poor
latency on server with high load of DDL
queries and high load of disk
subsystem.
--join_use_nulls arg Use NULLs for non-joined rows of outer
JOINs for types that can be inside
Nullable. If false, use default value
of corresponding columns data type.
--join_default_strictness arg Set default strictness in JOIN query.
Possible values: empty string, 'ANY',
'ALL'. If empty, query without
strictness will throw exception.
--any_join_distinct_right_table_keys arg
Enable old ANY JOIN logic with
many-to-one left-to-right table keys
mapping for all ANY JOINs. It leads to
confusing not equal results for 't1 ANY
LEFT JOIN t2' and 't2 ANY RIGHT JOIN
t1'. ANY RIGHT JOIN needs one-to-many
keys mapping to be consistent with LEFT
one.
--preferred_block_size_bytes arg
--max_replica_delay_for_distributed_queries arg
If set, distributed queries of
Replicated tables will choose servers
with replication delay in seconds less
than the specified value (not
inclusive). Zero means do not take
delay into account.
--fallback_to_stale_replicas_for_distributed_queries arg
Suppose max_replica_delay_for_distribut
ed_queries is set and all replicas for
the queried table are stale. If this
setting is enabled, the query will be
performed anyway, otherwise the error
will be reported.
--preferred_max_column_in_block_size_bytes arg
Limit on max column size in block while
reading. Helps to decrease cache misses
count. Should be close to L2 cache
size.
--insert_distributed_sync arg If setting is enabled, insert query
into distributed waits until data will
be sent to all nodes in cluster.
--insert_distributed_timeout arg Timeout for insert query into
distributed. Setting is used only with
insert_distributed_sync enabled. Zero
value means no timeout.
--distributed_ddl_task_timeout arg Timeout for DDL query responses from
all hosts in cluster. If a ddl request
has not been performed on all hosts, a
response will contain a timeout error
and a request will be executed in an
async mode. Negative value means
infinite. Zero means async mode.
--stream_flush_interval_ms arg Timeout for flushing data from
streaming storages.
--stream_poll_timeout_ms arg Timeout for polling data from/to
streaming storages.
--sleep_in_send_tables_status_ms arg Time to sleep in sending tables status
response in TCPHandler
--sleep_in_send_data_ms arg Time to sleep in sending data in
TCPHandler
--unknown_packet_in_send_data arg Send unknown packet instead of data Nth
data packet
--sleep_in_receive_cancel_ms arg Time to sleep in receiving cancel in
TCPHandler
--insert_allow_materialized_columns arg
If setting is enabled, Allow
materialized columns in INSERT.
--http_connection_timeout arg HTTP connection timeout.
--http_send_timeout arg HTTP send timeout
--http_receive_timeout arg HTTP receive timeout
--http_max_uri_size arg Maximum URI length of HTTP request
--http_max_fields arg Maximum number of fields in HTTP header
--http_max_field_name_size arg Maximum length of field name in HTTP
header
--http_max_field_value_size arg Maximum length of field value in HTTP
header
--http_skip_not_found_url_for_globs arg
Skip url's for globs with
HTTP_NOT_FOUND error
--optimize_throw_if_noop arg If setting is enabled and OPTIMIZE
query didn't actually assign a merge
then an explanatory exception is thrown
--use_index_for_in_with_subqueries arg
Try using an index if there is a
subquery or a table expression on the
right side of the IN operator.
--joined_subquery_requires_alias arg Force joined subqueries and table
functions to have aliases for correct
name qualification.
--empty_result_for_aggregation_by_empty_set arg
Return empty result when aggregating
without keys on empty set.
--empty_result_for_aggregation_by_constant_keys_on_empty_set arg
Return empty result when aggregating by
constant keys on empty set.
--allow_distributed_ddl arg If it is set to true, then a user is
allowed to executed distributed DDL
queries.
--allow_suspicious_codecs arg If it is set to true, allow to specify
meaningless compression codecs.
--allow_experimental_codecs arg If it is set to true, allow to specify
experimental compression codecs (but we
don't have those yet and this option
does nothing).
--query_profiler_real_time_period_ns arg
Period for real clock timer of query
profiler (in nanoseconds). Set 0 value
to turn off the real clock query
profiler. Recommended value is at least
10000000 (100 times a second) for
single queries or 1000000000 (once a
second) for cluster-wide profiling.
--query_profiler_cpu_time_period_ns arg
Period for CPU clock timer of query
profiler (in nanoseconds). Set 0 value
to turn off the CPU clock query
profiler. Recommended value is at least
10000000 (100 times a second) for
single queries or 1000000000 (once a
second) for cluster-wide profiling.
--metrics_perf_events_enabled arg If enabled, some of the perf events
will be measured throughout queries'
execution.
--metrics_perf_events_list arg Comma separated list of perf metrics
that will be measured throughout
queries' execution. Empty means all
events. See PerfEventInfo in sources
for the available events.
--opentelemetry_start_trace_probability arg
Probability to start an OpenTelemetry
trace for an incoming query.
--prefer_column_name_to_alias arg Prefer using column names instead of
aliases if possible.
--prefer_global_in_and_join arg If enabled, all IN/JOIN operators will
be rewritten as GLOBAL IN/JOIN. It's
useful when the to-be-joined tables are
only available on the initiator and we
need to always scatter their data
on-the-fly during distributed
processing with the GLOBAL keyword.
It's also useful to reduce the need to
access the external sources joining
external tables.
--max_rows_to_read arg Limit on read rows from the most 'deep'
sources. That is, only in the deepest
subquery. When reading from a remote
server, it is only checked on a remote
server.
--max_bytes_to_read arg Limit on read bytes (after
decompression) from the most 'deep'
sources. That is, only in the deepest
subquery. When reading from a remote
server, it is only checked on a remote
server.
--read_overflow_mode arg What to do when the limit is exceeded.
--max_rows_to_read_leaf arg Limit on read rows on the leaf nodes
for distributed queries. Limit is
applied for local reads only excluding
the final merge stage on the root node.
--max_bytes_to_read_leaf arg Limit on read bytes (after
decompression) on the leaf nodes for
distributed queries. Limit is applied
for local reads only excluding the
final merge stage on the root node.
--read_overflow_mode_leaf arg What to do when the leaf limit is
exceeded.
--max_rows_to_group_by arg
--group_by_overflow_mode arg What to do when the limit is exceeded.
--max_bytes_before_external_group_by arg
--max_rows_to_sort arg
--max_bytes_to_sort arg
--sort_overflow_mode arg What to do when the limit is exceeded.
--max_bytes_before_external_sort arg
--max_bytes_before_remerge_sort arg In case of ORDER BY with LIMIT, when
memory usage is higher than specified
threshold, perform additional steps of
merging blocks before final merge to
keep just top LIMIT rows.
--remerge_sort_lowered_memory_bytes_ratio arg
If memory usage after remerge does not
reduced by this ratio, remerge will be
disabled.
--max_result_rows arg Limit on result size in rows. Also
checked for intermediate data sent from
remote servers.
--max_result_bytes arg Limit on result size in bytes
(uncompressed). Also checked for
intermediate data sent from remote
servers.
--result_overflow_mode arg What to do when the limit is exceeded.
--max_execution_time arg
--timeout_overflow_mode arg What to do when the limit is exceeded.
--min_execution_speed arg Minimum number of execution rows per
second.
--max_execution_speed arg Maximum number of execution rows per
second.
--min_execution_speed_bytes arg Minimum number of execution bytes per
second.
--max_execution_speed_bytes arg Maximum number of execution bytes per
second.
--timeout_before_checking_execution_speed arg
Check that the speed is not too low
after the specified time has elapsed.
--max_columns_to_read arg
--max_temporary_columns arg
--max_temporary_non_const_columns arg
--max_subquery_depth arg
--max_pipeline_depth arg
--max_ast_depth arg Maximum depth of query syntax tree.
Checked after parsing.
--max_ast_elements arg Maximum size of query syntax tree in
number of nodes. Checked after parsing.
--max_expanded_ast_elements arg Maximum size of query syntax tree in
number of nodes after expansion of
aliases and the asterisk.
--readonly arg 0 - everything is allowed. 1 - only
read requests. 2 - only read requests,
as well as changing settings, except
for the 'readonly' setting.
--max_rows_in_set arg Maximum size of the set (in number of
elements) resulting from the execution
of the IN section.
--max_bytes_in_set arg Maximum size of the set (in bytes in
memory) resulting from the execution of
the IN section.
--set_overflow_mode arg What to do when the limit is exceeded.
--max_rows_in_join arg Maximum size of the hash table for JOIN
(in number of rows).
--max_bytes_in_join arg Maximum size of the hash table for JOIN
(in number of bytes in memory).
--join_overflow_mode arg What to do when the limit is exceeded.
--join_any_take_last_row arg When disabled (default) ANY JOIN will
take the first found row for a key.
When enabled, it will take the last row
seen if there are multiple rows for the
same key.
--join_algorithm arg Specify join algorithm: 'auto', 'hash',
'partial_merge', 'prefer_partial_merge'
. 'auto' tries to change HashJoin to
MergeJoin on the fly to avoid out of
memory.
--default_max_bytes_in_join arg Maximum size of right-side table if
limit is required but max_bytes_in_join
is not set.
--partial_merge_join_left_table_buffer_bytes arg
If not 0 group left table blocks in
bigger ones for left-side table in
partial merge join. It uses up to 2x of
specified memory per joining thread.
--partial_merge_join_rows_in_right_blocks arg
Split right-hand joining data in blocks
of specified size. It's a portion of
data indexed by min-max values and
possibly unloaded on disk.
--join_on_disk_max_files_to_merge arg For MergeJoin on disk set how much
files it's allowed to sort
simultaneously. Then this value bigger
then more memory used and then less
disk I/O needed. Minimum is 2.
--temporary_files_codec arg Set compression codec for temporary
files (sort and join on disk). I.e.
LZ4, NONE.
--max_rows_to_transfer arg Maximum size (in rows) of the
transmitted external table obtained
when the GLOBAL IN/JOIN section is
executed.
--max_bytes_to_transfer arg Maximum size (in uncompressed bytes) of
the transmitted external table obtained
when the GLOBAL IN/JOIN section is
executed.
--transfer_overflow_mode arg What to do when the limit is exceeded.
--max_rows_in_distinct arg Maximum number of elements during
execution of DISTINCT.
--max_bytes_in_distinct arg Maximum total size of state (in
uncompressed bytes) in memory for the
execution of DISTINCT.
--distinct_overflow_mode arg What to do when the limit is exceeded.
--max_memory_usage arg Maximum memory usage for processing of
single query. Zero means unlimited.
--max_guaranteed_memory_usage arg Maximum guaranteed memory usage for
processing of single query. It
represents soft limit. Zero means
unlimited.
--max_memory_usage_for_user arg Maximum memory usage for processing all
concurrently running queries for the
user. Zero means unlimited.
--max_guaranteed_memory_usage_for_user arg
Maximum guaranteed memory usage for
processing all concurrently running
queries for the user. It represents
soft limit. Zero means unlimited.
--max_untracked_memory arg Small allocations and deallocations are
grouped in thread local variable and
tracked or profiled only when amount
(in absolute value) becomes larger than
specified value. If the value is higher
than 'memory_profiler_step' it will be
effectively lowered to
'memory_profiler_step'.
--memory_profiler_step arg Whenever query memory usage becomes
larger than every next step in number
of bytes the memory profiler will
collect the allocating stack trace.
Zero means disabled memory profiler.
Values lower than a few megabytes will
slow down query processing.
--memory_profiler_sample_probability arg
Collect random allocations and
deallocations and write them into
system.trace_log with 'MemorySample'
trace_type. The probability is for
every alloc/free regardless to the size
of the allocation. Note that sampling
happens only when the amount of
untracked memory exceeds
'max_untracked_memory'. You may want to
set 'max_untracked_memory' to 0 for
extra fine grained sampling.
--memory_usage_overcommit_max_wait_microseconds arg
Maximum time thread will wait for
memory to be freed in the case of
memory overcommit. If timeout is
reached and memory is not freed,
exception is thrown
--max_network_bandwidth arg The maximum speed of data exchange over
the network in bytes per second for a
query. Zero means unlimited.
--max_network_bytes arg The maximum number of bytes
(compressed) to receive or transmit
over the network for execution of the
query.
--max_network_bandwidth_for_user arg The maximum speed of data exchange over
the network in bytes per second for all
concurrently running user queries. Zero
means unlimited.
--max_network_bandwidth_for_all_users arg
The maximum speed of data exchange over
the network in bytes per second for all
concurrently running queries. Zero
means unlimited.
--max_backup_threads arg The maximum number of threads to
execute a BACKUP or RESTORE request. By
default, it is determined
automatically.
--log_profile_events arg Log query performance statistics into
the query_log, query_thread_log and
query_views_log.
--log_query_settings arg Log query settings into the query_log.
--log_query_threads arg Log query threads into
system.query_thread_log table. This
setting have effect only when
'log_queries' is true.
--log_query_views arg Log query dependent views into
system.query_views_log table. This
setting have effect only when
'log_queries' is true.
--log_comment arg Log comment into system.query_log table
and server log. It can be set to
arbitrary string no longer than
max_query_size.
--send_logs_level arg Send server text logs with specified
minimum level to client. Valid values:
'trace', 'debug', 'information',
'warning', 'error', 'fatal', 'none'
--enable_optimize_predicate_expression arg
If it is set to true, optimize
predicates to subqueries.
--enable_optimize_predicate_expression_to_final_subquery arg
Allow push predicate to final subquery.
--allow_push_predicate_when_subquery_contains_with arg
Allows push predicate when subquery
contains WITH clause
--low_cardinality_max_dictionary_size arg
Maximum size (in rows) of shared global
dictionary for LowCardinality type.
--low_cardinality_use_single_dictionary_for_part arg
LowCardinality type serialization
setting. If is true, than will use
additional keys when global dictionary
overflows. Otherwise, will create
several shared dictionaries.
--decimal_check_overflow arg Check overflow of decimal
arithmetic/comparison operations
--prefer_localhost_replica arg If it's true then queries will be
always sent to local replica (if it
exists). If it's false then replica to
send a query will be chosen between
local and remote ones according to
load_balancing
--max_fetch_partition_retries_count arg
Amount of retries while fetching
partition from another host.
--http_max_multipart_form_data_size arg
Limit on size of multipart/form-data
content. This setting cannot be parsed
from URL parameters and should be set
in user profile. Note that content is
parsed and external tables are created
in memory before start of query
execution. And this is the only limit
that has effect on that stage (limits
on max memory usage and max execution
time have no effect while reading HTTP
form data).
--calculate_text_stack_trace arg Calculate text stack trace in case of
exceptions during query execution. This
is the default. It requires symbol
lookups that may slow down fuzzing
tests when huge amount of wrong queries
are executed. In normal cases you
should not disable this option.
--allow_ddl arg If it is set to true, then a user is
allowed to executed DDL queries.
--parallel_view_processing arg Enables pushing to attached views
concurrently instead of sequentially.
--enable_unaligned_array_join arg Allow ARRAY JOIN with multiple arrays
that have different sizes. When this
settings is enabled, arrays will be
resized to the longest one.
--optimize_read_in_order arg Enable ORDER BY optimization for
reading data in corresponding order in
MergeTree tables.
--optimize_aggregation_in_order arg Enable GROUP BY optimization for
aggregating data in corresponding order
in MergeTree tables.
--aggregation_in_order_max_block_bytes arg
Maximal size of block in bytes
accumulated during aggregation in order
of primary key. Lower block size allows
to parallelize more final merge stage
of aggregation.
--read_in_order_two_level_merge_threshold arg
Minimal number of parts to read to run
preliminary merge step during
multithread reading in order of primary
key.
--low_cardinality_allow_in_native_format arg
Use LowCardinality type in Native
format. Otherwise, convert
LowCardinality columns to ordinary for
select query, and convert ordinary
columns to required LowCardinality for
insert query.
--cancel_http_readonly_queries_on_client_close arg
Cancel HTTP readonly queries when a
client closes the connection without
waiting for response.
--external_table_functions_use_nulls arg
If it is set to true, external table
functions will implicitly use Nullable
type if needed. Otherwise NULLs will be
substituted with default values.
Currently supported only by 'mysql',
'postgresql' and 'odbc' table
functions.
--external_table_strict_query arg If it is set to true, transforming
expression to local filter is forbidden
for queries to external tables.
--allow_hyperscan arg Allow functions that use Hyperscan
library. Disable to avoid potentially
long compilation times and excessive
resource usage.
--max_hyperscan_regexp_length arg Max length of regexp than can be used
in hyperscan multi-match functions.
Zero means unlimited.
--max_hyperscan_regexp_total_length arg
Max total length of all regexps than
can be used in hyperscan multi-match
functions (per every function). Zero
means unlimited.
--allow_simdjson arg Allow using simdjson library in 'JSON*'
functions if AVX2 instructions are
available. If disabled rapidjson will
be used.
--allow_introspection_functions arg Allow functions for introspection of
ELF and DWARF for query profiling.
These functions are slow and may impose
security considerations.
--max_partitions_per_insert_block arg Limit maximum number of partitions in
single INSERTed block. Zero means
unlimited. Throw exception if the block
contains too many partitions. This
setting is a safety threshold, because
using large number of partitions is a
common misconception.
--max_partitions_to_read arg Limit the max number of partitions that
can be accessed in one query. <= 0
means unlimited.
--check_query_single_value_result arg Return check query result as single 1/0
value
--allow_drop_detached arg Allow ALTER TABLE ... DROP DETACHED
PART[ITION] ... queries
--postgresql_connection_pool_size arg Connection pool size for PostgreSQL
table engine and database engine.
--postgresql_connection_pool_wait_timeout arg
Connection pool push/pop timeout on
empty pool for PostgreSQL table engine
and database engine. By default it will
block on empty pool.
--glob_expansion_max_elements arg Maximum number of allowed addresses
(For external storages, table
functions, etc).
--odbc_bridge_connection_pool_size arg
Connection pool size for each
connection settings string in ODBC
bridge.
--distributed_replica_error_half_life arg
Time period reduces replica error
counter by 2 times.
--distributed_replica_error_cap arg Max number of errors per replica,
prevents piling up an incredible amount
of errors if replica was offline for
some time and allows it to be
reconsidered in a shorter amount of
time.
--distributed_replica_max_ignored_errors arg
Number of errors that will be ignored
while choosing replicas
--allow_experimental_live_view arg Enable LIVE VIEW. Not mature enough.
--live_view_heartbeat_interval arg The heartbeat interval in seconds to
indicate live query is alive.
--max_live_view_insert_blocks_before_refresh arg
Limit maximum number of inserted blocks
after which mergeable blocks are
dropped and query is re-executed.
--allow_experimental_window_view arg Enable WINDOW VIEW. Not mature enough.
--window_view_clean_interval arg The clean interval of window view in
seconds to free outdated data.
--window_view_heartbeat_interval arg The heartbeat interval in seconds to
indicate watch query is alive.
--min_free_disk_space_for_temporary_data arg
The minimum disk space to keep while
writing temporary data used in external
sorting and aggregation.
--default_database_engine arg Default database engine.
--default_table_engine arg Default table engine used when ENGINE
is not set in CREATE statement.
--show_table_uuid_in_table_create_query_if_not_nil arg
For tables in databases with
Engine=Atomic show UUID of the table in
its CREATE query.
--database_atomic_wait_for_drop_and_detach_synchronously arg
When executing DROP or DETACH TABLE in
Atomic database, wait for table data to
be finally dropped or detached.
--enable_scalar_subquery_optimization arg
If it is set to true, prevent scalar
subqueries from (de)serializing large
scalar values and possibly avoid
running the same subquery more than
once.
--optimize_trivial_count_query arg Process trivial 'SELECT count() FROM
table' query from metadata.
--optimize_respect_aliases arg If it is set to true, it will respect
aliases in WHERE/GROUP BY/ORDER BY,
that will help with partition
pruning/secondary indexes/optimize_aggr
egation_in_order/optimize_read_in_order
/optimize_trivial_count
--mutations_sync arg Wait for synchronous execution of ALTER
TABLE UPDATE/DELETE queries
(mutations). 0 - execute
asynchronously. 1 - wait current
server. 2 - wait all replicas if they
exist.
--optimize_move_functions_out_of_any arg
Move functions out of aggregate
functions 'any', 'anyLast'.
--optimize_normalize_count_variants arg
Rewrite aggregate functions that
semantically equals to count() as
count().
--optimize_injective_functions_inside_uniq arg
Delete injective functions of one
argument inside uniq*() functions.
--convert_query_to_cnf arg Convert SELECT query to CNF
--optimize_arithmetic_operations_in_aggregate_functions arg
Move arithmetic operations out of
aggregation functions
--optimize_duplicate_order_by_and_distinct arg
Remove duplicate ORDER BY and DISTINCT
if it's possible
--optimize_redundant_functions_in_order_by arg
Remove functions from ORDER BY if its
argument is also in ORDER BY
--optimize_if_chain_to_multiif arg Replace if(cond1, then1, if(cond2,
...)) chains to multiIf. Currently it's
not beneficial for numeric types.
--optimize_if_transform_strings_to_enum arg
Replaces string-type arguments in If
and Transform to enum. Disabled by
default cause it could make
inconsistent change in distributed
query that would lead to its fail.
--optimize_monotonous_functions_in_order_by arg
Replace monotonous function with its
argument in ORDER BY
--optimize_functions_to_subcolumns arg
Transform functions to subcolumns, if
possible, to reduce amount of read
data. E.g. 'length(arr)' ->
'arr.size0', 'col IS NULL' ->
'col.null'
--optimize_using_constraints arg Use constraints for query optimization
--optimize_substitute_columns arg Use constraints for column substitution
--optimize_append_index arg Use constraints in order to append
index condition (indexHint)
--normalize_function_names arg Normalize function names to their
canonical names
--allow_experimental_alter_materialized_view_structure arg
Allow atomic alter on Materialized
views. Work in progress.
--enable_early_constant_folding arg Enable query optimization where we
analyze function and subqueries results
and rewrite query if there're constants
there
--deduplicate_blocks_in_dependent_materialized_views arg
Should deduplicate blocks for
materialized views if the block is not
a duplicate for the table. Use true to
always deduplicate in dependent tables.
--use_compact_format_in_distributed_parts_names arg
Changes format of directories names for
distributed table insert parts.
--validate_polygons arg Throw exception if polygon is invalid
in function pointInPolygon (e.g.
self-tangent, self-intersecting). If
the setting is false, the function will
accept invalid polygons but may
silently return wrong result.
--max_parser_depth arg Maximum parser depth (recursion depth
of recursive descend parser).
--temporary_live_view_timeout arg Timeout after which temporary live view
is deleted.
--periodic_live_view_refresh arg Interval after which periodically
refreshed live view is forced to
refresh.
--transform_null_in arg If enabled, NULL values will be matched
with 'IN' operator as if they are
considered equal.
--allow_nondeterministic_mutations arg
Allow non-deterministic functions in
ALTER UPDATE/ALTER DELETE statements
--lock_acquire_timeout arg How long locking request should wait
before failing
--materialize_ttl_after_modify arg Apply TTL for old data, after ALTER
MODIFY TTL query
--function_implementation arg Choose function implementation for
specific target or variant
(experimental). If empty enable all of
them.
--allow_experimental_geo_types arg Allow geo data types such as Point,
Ring, Polygon, MultiPolygon
--data_type_default_nullable arg Data types without NULL or NOT NULL
will make Nullable
--cast_keep_nullable arg CAST operator keep Nullable for result
data type
--alter_partition_verbose_result arg Output information about affected
parts. Currently works only for FREEZE
and ATTACH commands.
--allow_experimental_database_materialized_mysql arg
Allow to create database with
Engine=MaterializedMySQL(...).
--allow_experimental_database_materialized_postgresql arg
Allow to create database with
Engine=MaterializedPostgreSQL(...).
--system_events_show_zero_values arg Include all metrics, even with zero
values
--mysql_datatypes_support_level arg Which MySQL types should be converted
to corresponding ClickHouse types
(rather than being represented as
String). Can be empty or any
combination of 'decimal' or
'datetime64'. When empty MySQL's
DECIMAL and DATETIME/TIMESTAMP with
non-zero precision are seen as String
on ClickHouse's side.
--optimize_trivial_insert_select arg Optimize trivial 'INSERT INTO table
SELECT ... FROM TABLES' query
--allow_non_metadata_alters arg Allow to execute alters which affects
not only tables metadata, but also data
on disk
--enable_global_with_statement arg Propagate WITH statements to UNION
queries and all subqueries
--aggregate_functions_null_for_empty arg
Rewrite all aggregate functions in a
query, adding -OrNull suffix to them
--optimize_syntax_fuse_functions arg Allow apply syntax optimisation: fuse
aggregate functions
--optimize_fuse_sum_count_avg arg Fuse functions `sum, avg, count` with
identical arguments into one `sumCount`
(`optimize_syntax_fuse_functions should
be enabled)
--flatten_nested arg If true, columns of type Nested will be
flatten to separate array columns
instead of one array of tuples
--asterisk_include_materialized_columns arg
Include MATERIALIZED columns for
wildcard query
--asterisk_include_alias_columns arg Include ALIAS columns for wildcard
query
--optimize_skip_merged_partitions arg Skip partitions with one part with
level > 0 in optimize final
--optimize_on_insert arg Do the same transformation for inserted
block of data as if merge was done on
this block.
--force_optimize_projection arg If projection optimization is enabled,
SELECT queries need to use projection
--async_socket_for_remote arg Asynchronously read from socket
executing remote query
--insert_null_as_default arg Insert DEFAULT values instead of NULL
in INSERT SELECT (UNION ALL)
--describe_include_subcolumns arg If true, subcolumns of all table
columns will be included into result of
DESCRIBE query
--optimize_rewrite_sum_if_to_count_if arg
Rewrite sumIf() and sum(if()) function
countIf() function when logically
equivalent
--insert_shard_id arg If non zero, when insert into a
distributed table, the data will be
inserted into the shard
`insert_shard_id` synchronously.
Possible values range from 1 to
`shards_number` of corresponding
distributed table
--allow_experimental_query_deduplication arg
Experimental data deduplication for
SELECT queries based on part UUIDs
--experimental_query_deduplication_send_all_part_uuids arg
If false only part UUIDs for currently
moving parts are sent. If true all read
part UUIDs are sent (useful only for
testing).
--engine_file_empty_if_not_exists arg Allows to select data from a file
engine table without file
--engine_file_truncate_on_insert arg Enables or disables truncate before
insert in file engine tables
--engine_file_allow_create_multiple_files arg
Enables or disables creating a new file
on each insert in file engine tables if
format has suffix.
--allow_experimental_database_replicated arg
Allow to create databases with
Replicated engine
--database_replicated_initial_query_timeout_sec arg
How long initial DDL query should wait
for Replicated database to precess
previous DDL queue entries
--max_distributed_depth arg Maximum distributed query depth
--database_replicated_always_detach_permanently arg
Execute DETACH TABLE as DETACH TABLE
PERMANENTLY if database engine is
Replicated
--distributed_ddl_output_mode arg Format of distributed DDL query result
--distributed_ddl_entry_format_version arg
Version of DDL entry to write into
ZooKeeper
--external_storage_max_read_rows arg Limit maximum number of rows when table
with external engine should flush
history data. Now supported only for
MySQL table engine, database engine,
dictionary and MaterializedMySQL. If
equal to 0, this setting is disabled
--external_storage_max_read_bytes arg Limit maximum number of bytes when
table with external engine should flush
history data. Now supported only for
MySQL table engine, database engine,
dictionary and MaterializedMySQL. If
equal to 0, this setting is disabled
--external_storage_connect_timeout_sec arg
Connect timeout in seconds. Now
supported only for MySQL
--external_storage_rw_timeout_sec arg Read/write timeout in seconds. Now
supported only for MySQL
--union_default_mode arg Set default Union Mode in
SelectWithUnion query. Possible values:
empty string, 'ALL', 'DISTINCT'. If
empty, query without Union Mode will
throw exception.
--optimize_aggregators_of_group_by_keys arg
Eliminates min/max/any/anyLast
aggregators of GROUP BY keys in SELECT
section
--optimize_group_by_function_keys arg Eliminates functions of other keys in
GROUP BY section
--legacy_column_name_of_tuple_literal arg
List all names of elements of large
tuple literals in their column names
instead of a hash. This setting exists
only for compatibility reasons. It
makes sense to set it to 'true' while
doing a rolling update of a cluster from
a version lower than 21.7 to a higher one.
--query_plan_enable_optimizations arg Apply optimizations to query plan
--query_plan_max_optimizations_to_apply arg
Limit the total number of optimizations
applied to query plan. If zero,
ignored. If limit reached, throw
exception
--query_plan_filter_push_down arg Allow to push down filter by predicate
query plan step
--regexp_max_matches_per_row arg Max matches of any single regexp per
row, used to safeguard
'extractAllGroupsHorizontal' against
consuming too much memory with greedy
RE.
--limit arg Limit on read rows from the most 'end'
result for select query, default 0
means no limit length
--offset arg Offset on read rows from the most 'end'
result for select query
--function_range_max_elements_in_block arg
Maximum number of values generated by
function 'range' per block of data (sum
of array sizes for every row in a
block, see also 'max_block_size' and
'min_insert_block_size_rows'). It is a
safety threshold.
--short_circuit_function_evaluation arg
Setting for short-circuit function
evaluation configuration. Possible
values: 'enable' - use short-circuit
function evaluation for functions that
are suitable for it, 'disable' -
disable short-circuit function
evaluation, 'force_enable' - use
short-circuit function evaluation for
all functions.
--local_filesystem_read_method arg Method of reading data from local
filesystem, one of: read, pread, mmap,
pread_threadpool.
--remote_filesystem_read_method arg Method of reading data from remote
filesystem, one of: read, threadpool.
--local_filesystem_read_prefetch arg Should use prefetching when reading
data from local filesystem.
--remote_filesystem_read_prefetch arg Should use prefetching when reading
data from remote filesystem.
--read_priority arg Priority to read data from local
filesystem. Only supported for
'pread_threadpool' method.
--merge_tree_min_rows_for_concurrent_read_for_remote_filesystem arg
If at least as many lines are read from
one file, the reading can be
parallelized, when reading from remote
filesystem.
--merge_tree_min_bytes_for_concurrent_read_for_remote_filesystem arg
If at least as many bytes are read from
one file, the reading can be
parallelized, when reading from remote
filesystem.
--remote_read_min_bytes_for_seek arg Min bytes required for remote read
(url, s3) to do seek, instead of read
with ignore.
--async_insert_threads arg Maximum number of threads to actually
parse and insert data in background.
Zero means asynchronous mode is
disabled
--async_insert arg If true, data from INSERT query is
stored in queue and later flushed to
table in background. Makes sense only
for inserts via HTTP protocol. If
wait_for_async_insert is false, INSERT
query is processed almost instantly,
otherwise the client will wait until the
data has been flushed to the table
--wait_for_async_insert arg If true, wait for processing of
asynchronous insertion
--wait_for_async_insert_timeout arg Timeout for waiting for processing
asynchronous insertion
--async_insert_max_data_size arg Maximum size in bytes of unparsed data
collected per query before being
inserted
--async_insert_busy_timeout_ms arg Maximum time to wait before dumping
collected data per query since the
first data appeared
--async_insert_stale_timeout_ms arg Maximum time to wait before dumping
collected data per query since the last
data appeared. Zero means no timeout at
all
--remote_fs_read_max_backoff_ms arg Max wait time when trying to read data
for remote disk
--remote_fs_read_backoff_max_tries arg
Max attempts to read with backoff
--http_max_tries arg Max attempts to read via http.
--http_retry_initial_backoff_ms arg Min milliseconds for backoff, when
retrying read via http
--http_retry_max_backoff_ms arg Max milliseconds for backoff, when
retrying read via http
--force_remove_data_recursively_on_drop arg
Recursively remove data on DROP query.
Avoids 'Directory not empty' error, but
may silently remove detached data
--check_table_dependencies arg Check that DDL query (such as DROP
TABLE or RENAME) will not break
dependencies
--use_local_cache_for_remote_storage arg
Use local cache for remote storage like
HDFS or S3, it's used for remote table
engine only
--allow_experimental_funnel_functions arg
Enable experimental functions for
funnel analysis.
--allow_experimental_nlp_functions arg
Enable experimental functions for
natural language processing.
--insert_deduplication_token arg If not empty, used for duplicate
detection instead of data digest
--max_memory_usage_for_all_queries arg
Obsolete setting, does nothing.
--multiple_joins_rewriter_version arg Obsolete setting, does nothing.
--enable_debug_queries arg Obsolete setting, does nothing.
--allow_experimental_database_atomic arg
Obsolete setting, does nothing.
--allow_experimental_bigint_types arg Obsolete setting, does nothing.
--allow_experimental_window_functions arg
Obsolete setting, does nothing.
--handle_kafka_error_mode arg Obsolete setting, does nothing.
--database_replicated_ddl_output arg Obsolete setting, does nothing.
--replication_alter_columns_timeout arg
Obsolete setting, does nothing.
--odbc_max_field_size arg Obsolete setting, does nothing.
--allow_experimental_map_type arg Obsolete setting, does nothing.
--merge_tree_clear_old_temporary_directories_interval_seconds arg
Obsolete setting, does nothing.
--merge_tree_clear_old_parts_interval_seconds arg
Obsolete setting, does nothing.
--partial_merge_join_optimizations arg
Obsolete setting, does nothing.
--max_alter_threads arg Obsolete setting, does nothing.
--allow_experimental_projection_optimization arg
Obsolete setting, does nothing.
--format_csv_delimiter arg The character to be considered as a
delimiter in CSV data. If set with
a string, the string has to have a
length of 1.
--format_csv_allow_single_quotes arg If it is set to true, allow strings in
single quotes.
--format_csv_allow_double_quotes arg If it is set to true, allow strings in
double quotes.
--output_format_csv_crlf_end_of_line arg
If it is set true, end of line in CSV
format will be \r\n instead of \n.
--input_format_csv_enum_as_number arg Treat inserted enum values in CSV
formats as enum indices
--input_format_csv_arrays_as_nested_csv arg
When reading Array from CSV, expect
that its elements were serialized in
nested CSV and then put into string.
Example: "[""Hello"", ""world"",
""42"""" TV""]". Braces around array
can be omitted.
--input_format_skip_unknown_fields arg
Skip columns with unknown names from
input data (it works for JSONEachRow,
-WithNames, -WithNamesAndTypes and TSKV
formats).
--input_format_with_names_use_header arg
For -WithNames input formats this
controls whether format parser is to
assume that column data appear in the
input exactly as they are specified in
the header.
--input_format_with_types_use_header arg
For -WithNamesAndTypes input formats
this controls whether format parser
should check if data types from the
input match data types from the header.
--input_format_import_nested_json arg Map nested JSON data to nested tables
(it works for JSONEachRow format).
--input_format_defaults_for_omitted_fields arg
For input data calculate default
expressions for omitted fields (it
works for JSONEachRow, -WithNames,
-WithNamesAndTypes formats).
--input_format_csv_empty_as_default arg
Treat empty fields in CSV input as
default values.
--input_format_tsv_empty_as_default arg
Treat empty fields in TSV input as
default values.
--input_format_tsv_enum_as_number arg Treat inserted enum values in TSV
formats as enum indices
--input_format_null_as_default arg For text input formats initialize null
fields with default values if data type
of this field is not nullable
--input_format_arrow_import_nested arg
Allow to insert array of structs into
Nested table in Arrow input format.
--input_format_orc_import_nested arg Allow to insert array of structs into
Nested table in ORC input format.
--input_format_orc_row_batch_size arg Batch size when reading ORC stripes.
--input_format_parquet_import_nested arg
Allow to insert array of structs into
Nested table in Parquet input format.
--input_format_allow_seeks arg Allow seeks while reading in
ORC/Parquet/Arrow input formats
--input_format_orc_allow_missing_columns arg
Allow missing columns while reading ORC
input formats
--input_format_parquet_allow_missing_columns arg
Allow missing columns while reading
Parquet input formats
--input_format_arrow_allow_missing_columns arg
Allow missing columns while reading
Arrow input formats
--input_format_hive_text_fields_delimiter arg
Delimiter between fields in Hive Text
File
--input_format_hive_text_collection_items_delimiter arg
Delimiter between collection(array or
map) items in Hive Text File
--input_format_hive_text_map_keys_delimiter arg
Delimiter between a pair of map
key/values in Hive Text File
--input_format_msgpack_number_of_columns arg
The number of columns in inserted
MsgPack data. Used for automatic schema
inference from data.
--output_format_msgpack_uuid_representation arg
The way how to output UUID in MsgPack
format.
--input_format_max_rows_to_read_for_schema_inference arg
The maximum rows of data to read for
automatic schema inference
--date_time_input_format arg Method to read DateTime from text input
formats. Possible values: 'basic' and
'best_effort'.
--date_time_output_format arg Method to write DateTime to text
output. Possible values: 'simple',
'iso', 'unix_timestamp'.
--bool_true_representation arg Text to represent bool value in TSV/CSV
formats.
--bool_false_representation arg Text to represent bool value in TSV/CSV
formats.
--input_format_values_interpret_expressions arg
For Values format: if the field could
not be parsed by streaming parser, run
SQL parser and try to interpret it as
SQL expression.
--input_format_values_deduce_templates_of_expressions arg
For Values format: if the field could
not be parsed by streaming parser, run
SQL parser, deduce template of the SQL
expression, try to parse all rows using
template and then interpret expression
for all rows.
--input_format_values_accurate_types_of_literals arg
For Values format: when parsing and
interpreting expressions using
template, check actual type of literal
to avoid possible overflow and
precision issues.
--input_format_avro_allow_missing_fields arg
For Avro/AvroConfluent format: when
field is not found in schema use
default value instead of error
--format_avro_schema_registry_url arg For AvroConfluent format: Confluent
Schema Registry URL.
--output_format_json_quote_64bit_integers arg
Controls quoting of 64-bit integers in
JSON output format.
--output_format_json_quote_denormals arg
Enables '+nan', '-nan', '+inf', '-inf'
outputs in JSON output format.
--output_format_json_escape_forward_slashes arg
Controls escaping forward slashes for
string outputs in JSON output format.
This is intended for compatibility with
JavaScript. Don't confuse with
backslashes that are always escaped.
--output_format_json_named_tuples_as_objects arg
Serialize named tuple columns as JSON
objects.
--output_format_json_array_of_rows arg
Output a JSON array of all rows in
JSONEachRow(Compact) format.
--output_format_pretty_max_rows arg Rows limit for Pretty formats.
--output_format_pretty_max_column_pad_width arg
Maximum width to pad all values in a
column in Pretty formats.
--output_format_pretty_max_value_width arg
Maximum width of value to display in
Pretty formats. If greater - it will be
cut.
--output_format_pretty_color arg Use ANSI escape sequences to paint
colors in Pretty formats
--output_format_pretty_grid_charset arg
Charset for printing grid borders.
Available charsets: ASCII, UTF-8
(default one).
--output_format_parquet_row_group_size arg
Row group size in rows.
--output_format_avro_codec arg Compression codec used for output.
Possible values: 'null', 'deflate',
'snappy'.
--output_format_avro_sync_interval arg
Sync interval in bytes.
--output_format_avro_string_column_pattern arg
For Avro format: regexp of String
columns to select as AVRO string.
--output_format_avro_rows_in_file arg Max rows in a file (if permitted by
storage)
--output_format_tsv_crlf_end_of_line arg
If it is set true, end of line in TSV
format will be \r\n instead of \n.
--format_csv_null_representation arg Custom NULL representation in CSV
format
--format_tsv_null_representation arg Custom NULL representation in TSV
format
--output_format_decimal_trailing_zeros arg
Output trailing zeros when printing
Decimal values. E.g. 1.230000 instead
of 1.23.
--input_format_allow_errors_num arg Maximum absolute amount of errors while
reading text formats (like CSV, TSV).
In case of error, if at least absolute
or relative amount of errors is lower
than corresponding value, will skip
until next line and continue.
--input_format_allow_errors_ratio arg Maximum relative amount of errors while
reading text formats (like CSV, TSV).
In case of error, if at least absolute
or relative amount of errors is lower
than corresponding value, will skip
until next line and continue.
--format_schema arg Schema identifier (used by schema-based
formats)
--format_template_resultset arg Path to file which contains format
string for result set (for Template
format)
--format_template_row arg Path to file which contains format
string for rows (for Template format)
--format_template_rows_between_delimiter arg
Delimiter between rows (for Template
format)
--format_custom_escaping_rule arg Field escaping rule (for
CustomSeparated format)
--format_custom_field_delimiter arg Delimiter between fields (for
CustomSeparated format)
--format_custom_row_before_delimiter arg
Delimiter before field of the first
column (for CustomSeparated format)
--format_custom_row_after_delimiter arg
Delimiter after field of the last
column (for CustomSeparated format)
--format_custom_row_between_delimiter arg
Delimiter between rows (for
CustomSeparated format)
--format_custom_result_before_delimiter arg
Prefix before result set (for
CustomSeparated format)
--format_custom_result_after_delimiter arg
Suffix after result set (for
CustomSeparated format)
--format_regexp arg Regular expression (for Regexp format)
--format_regexp_escaping_rule arg Field escaping rule (for Regexp format)
--format_regexp_skip_unmatched arg Skip lines unmatched by regular
expression (for Regexp format)
--output_format_enable_streaming arg Enable streaming in output formats that
support it.
--output_format_write_statistics arg Write statistics about read rows,
bytes, time elapsed in suitable output
formats.
--output_format_pretty_row_numbers arg
Add row numbers before each row for
pretty output format
--insert_distributed_one_random_shard arg
If setting is enabled, inserting into
distributed table will choose a random
shard to write when there is no
sharding key
--cross_to_inner_join_rewrite arg Use inner join instead of comma/cross
join if possible
--output_format_arrow_low_cardinality_as_dictionary arg
Enable output LowCardinality type as
Dictionary Arrow type
--format_capn_proto_enum_comparising_mode arg
How to map ClickHouse Enum and
CapnProto Enum
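All of the settings listed above can be passed directly to clickhouse-client as --setting=value options, or changed per session with SET. A minimal sketch, assuming a hypothetical demo_db.events table and a local events.csv file:
# Load a semicolon-delimited CSV, treating empty fields as defaults and
# tolerating up to 10 parse errors; all three settings come from the list above.
# (demo_db.events and events.csv are hypothetical names.)
clickhouse-client \
    --format_csv_delimiter=';' \
    --input_format_csv_empty_as_default=1 \
    --input_format_allow_errors_num=10 \
    --query "INSERT INTO demo_db.events FORMAT CSV" < events.csv
# The same settings can also be changed for a session from SQL, e.g.:
#   SET input_format_csv_empty_as_default = 1;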
External tables options:
--file arg data file or - for stdin
--name arg (=_data) name of the table
--format arg (=TabSeparated) data format
--structure arg structure
--types arg types
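The external tables options above let the client ship a local file to the server as a temporary table that exists only for the duration of one query: each external table group is introduced by the --external flag and referenced in the query by its --name. A sketch, assuming a hypothetical ids.tsv file and demo_db.visits table:
# ids.tsv, demo_db.visits and the _ids table name are hypothetical examples.
clickhouse-client \
    --query "SELECT count() FROM demo_db.visits WHERE user_id IN _ids" \
    --external --file=ids.tsv --name=_ids \
    --format=TabSeparated --structure='user_id UInt64'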
In addition, --param_name=value can be specified for substitution of parameters for parametrized queries.
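A parameter passed as --param_<name> is referenced in the query text as {name:Type}. A short sketch, assuming a hypothetical demo_db.events table:
# --param_uid / --param_day are substituted into {uid:UInt64} / {day:Date};
# the table name is hypothetical.
clickhouse-client \
    --param_uid=42 \
    --param_day='2022-03-01' \
    --query "SELECT count() FROM demo_db.events WHERE user_id = {uid:UInt64} AND event_date = {day:Date}"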