Counting the rows in a single Cassandra table

2022-03-28 21:23:15

When a Cassandra table holds a large amount of data, select count(*) generally cannot complete and returns an error similar to the following:

Cassandra timeout during read query at consistency ONE (1 responses were required but only 0 replica responded)

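In this case the cassandra-count tool can be used to produce the count. Note that while it runs, the tool puts varying degrees of pressure on the CPU and memory of the Cassandra servers, so avoid running count operations in production where possible. Cassandra is not well suited to count queries.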

1. Download the cassandra-count tool from https://github.com/brianmhess/cassandra-count

2. Run the following command. When the table is very large, increasing the numSplits value helps avoid read timeouts.

./cassandra-count -host xx.xx.xx.xx -keyspace ks -table table1 -numSplits 1024
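
To see why a larger numSplits helps, it can be useful to picture the count as many small token-range queries instead of one full-table scan. The following is a minimal Python sketch of that general technique, not the tool's actual code. It assumes the default Murmur3Partitioner token range, and the partition key name `id` is hypothetical.

```python
# Sketch only: illustrates token-range splitting in general,
# not how cassandra-count is actually implemented.
MIN_TOKEN = -2**63      # smallest Murmur3Partitioner token (assumed partitioner)
MAX_TOKEN = 2**63 - 1   # largest Murmur3Partitioner token


def split_token_ring(num_splits):
    """Return num_splits contiguous (start, end) token ranges covering the ring."""
    if num_splits < 1:
        raise ValueError("num_splits must be >= 1")
    total = MAX_TOKEN - MIN_TOKEN
    # Integer arithmetic keeps the boundaries exact for 64-bit tokens.
    bounds = [MIN_TOKEN + total * i // num_splits for i in range(num_splits + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def count_queries(keyspace, table, partition_key, num_splits):
    """Build one count(*) statement per token range."""
    queries = []
    for i, (start, end) in enumerate(split_token_ring(num_splits)):
        # The first range includes its lower bound so no token is skipped.
        lower_op = ">=" if i == 0 else ">"
        queries.append(
            f"SELECT count(*) FROM {keyspace}.{table} "
            f"WHERE token({partition_key}) {lower_op} {start} "
            f"AND token({partition_key}) <= {end}"
        )
    return queries


if __name__ == "__main__":
    # "id" is a hypothetical partition key column used only for illustration.
    for query in count_queries("ks", "table1", "id", 4):
        print(query)
```

Each query touches only a slice of the ring, so with more splits each individual read is smaller and less likely to hit the read timeout. The partial counts would then be summed on the client.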

PS: option reference

Switch	Option	Default	Description
-host	IP Address		Cassandra connection point - required.
-keyspace	Keyspace Name		Cassandra keyspace - required.
-table	Table Name		Cassandra table name - required.
-configFile	Filename	none	Filename of configuration options
-port	Port Number	9042	Cassandra native protocol port number
-user	Username	none	Cassandra username
-pw	Password	none	Cassandra password
-ssl-truststore-path	Truststore Path	none	Path to SSL truststore
-ssl-truststore-pwd	Truststore Password	none	Password to SSL truststore
-ssl-keystore-path	Keystore Path	none	Path to SSL keystore
-ssl-keystore-pwd	Keystore Password	none	Password to SSL keystore
-consistencyLevel	Consistency Level	LOCAL_ONE	CQL Consistency Level
-numSplits	Number of Splits	Number of Token Ranges	Number of splits/queries to create
-numFutures	Number of Futures	1000	Number of Java driver futures in flight.
-splitSize	Size of Split in MB	16	Split size in MB
-debug	Debug mode	0	Debug printing verbosity (0=none, 1=some, 2=verbose)
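
When several of these switches are combined, it may help to assemble the command line programmatically. The sketch below is one possible way to do that in Python. The helper name `build_count_command` is hypothetical, and the script only builds and prints the command without running the tool.

```python
import shlex


def build_count_command(host, keyspace, table, options=None):
    """Build the cassandra-count argument list from the switches in the table above.

    `options` maps switch names without the leading dash (for example
    "numSplits") to their values. This helper is hypothetical and is not
    part of cassandra-count.
    """
    # The three required switches always come first.
    cmd = ["./cassandra-count", "-host", host, "-keyspace", keyspace, "-table", table]
    for switch, value in (options or {}).items():
        cmd += [f"-{switch}", str(value)]
    return cmd


if __name__ == "__main__":
    cmd = build_count_command(
        "xx.xx.xx.xx", "ks", "table1",
        {"numSplits": 1024, "consistencyLevel": "LOCAL_ONE"},
    )
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run if the tool should be launched from Python instead.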
