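When a Cassandra table holds a large amount of data, select count(*) is effectively unusable for counting rows. It fails with an error similar to the following:
Cassandra timeout during read query at consistency ONE (1 responses were required but only 0 replica responded)
In this case the cassandra-count tool can do the counting instead. Note that while it runs, the tool puts varying degrees of load on the CPU and memory of the Cassandra servers, so avoid running count operations in production where possible. Cassandra is not well suited to count-style statistics.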
1. Download the cassandra-count tool from https://github.com/brianmhess/cassandra-count
2. Run the following command. When the data volume is large, increasing the numSplits value helps avoid read timeouts.
./cassandra-count -host xx.xx.xx.xx -keyspace ks -table table1 -numSplits 1024
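If the first attempt still times out, step 2 can be repeated with a larger numSplits. The wrapper below is a minimal sketch of that retry loop, not part of the tool itself. It assumes that cassandra-count exits with a non-zero status when a query fails (the source does not confirm this), and the function name count_with_retry and the COUNT_BIN variable are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: rerun cassandra-count with a doubled -numSplits
# until it succeeds or an upper limit is reached.
# Assumption: the tool returns a non-zero exit status on a read timeout.

COUNT_BIN="${COUNT_BIN:-./cassandra-count}"   # path to the tool (assumed)

count_with_retry() {
  local host="$1" keyspace="$2" table="$3"
  local splits="${4:-1024}" max_splits="${5:-16384}"
  while [ "$splits" -le "$max_splits" ]; do
    echo "trying -numSplits $splits" >&2
    if "$COUNT_BIN" -host "$host" -keyspace "$keyspace" -table "$table" -numSplits "$splits"; then
      return 0
    fi
    splits=$((splits * 2))                    # smaller ranges per query next time
  done
  echo "count failed up to -numSplits $max_splits" >&2
  return 1
}
```

For example, count_with_retry xx.xx.xx.xx ks table1 1024 8192 would try 1024, 2048, 4096 and 8192 splits in turn. Every attempt puts load on the cluster, so a low upper limit is the safer choice.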
PS: option reference
Switch                Option               Default                 Description
-host                 IP Address                                   Cassandra connection point - required.
-keyspace             Keyspace Name                                Cassandra keyspace - required.
-table                Table Name                                   Cassandra table name - required.
-configFile           Filename             none                    Filename of configuration options
-port                 Port Number          9042                    Cassandra native protocol port number
-user                 Username             none                    Cassandra username
-pw                   Password             none                    Cassandra password
-ssl-truststore-path  Truststore Path      none                    Path to SSL truststore
-ssl-truststore-pwd   Truststore Password  none                    Password to SSL truststore
-ssl-keystore-path    Keystore Path        none                    Path to SSL keystore
-ssl-keystore-path    Keystore Password    none                    Password to SSL keystore
-consistencyLevel     Consistency Level    LOCAL_ONE               CQL Consistency Level
-numSplits            Number of Splits     Number of Token Ranges  Number of splits/queries to create
-numFutures           Number of Futures    1000                    Number of Java driver futures in flight.
-splitSize            Size of Split in MB  16                      Split size in MB
-debug                Debug mode           0                       Debug printing verbosity (0=none, 1=some, 2=verbose)
Note: -ssl-keystore-path appears twice in this listing. The second entry (Keystore Password) is probably meant to be a different switch, so check the tool's own usage output before relying on it.
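The -numSplits description above ("number of splits/queries to create") suggests why a larger value avoids timeouts: the token range is cut into smaller pieces, and each query only counts the rows in its own piece. The sketch below illustrates that general technique by printing one COUNT query per sub-range. It is not the tool's actual code. It assumes the Murmur3Partitioner token range (-2^63 to 2^63-1) and a table ks.table1 with a single partition key column named id, which is a hypothetical name.

```shell
#!/usr/bin/env bash
# Print one COUNT query per token sub-range (illustration only).
# Assumptions: Murmur3Partitioner token range and a hypothetical
# partition key column "id" on ks.table1.

token_range_queries() {
  local splits="$1"
  local min=$(( -9223372036854775807 - 1 ))   # -2^63
  local max=9223372036854775807               # 2^63 - 1
  local lo="$min" hi op i=1
  while [ "$i" -le "$splits" ]; do
    if [ "$i" -eq "$splits" ]; then
      hi="$max"                               # last range absorbs the remainder
    else
      # Width is 2 * (max / splits); the full span 2^64 would overflow
      # 64-bit shell arithmetic, so it is never computed directly.
      hi=$(( lo + (max / splits) * 2 ))
    fi
    # The first range is closed on the left so the minimum token is covered.
    if [ "$i" -eq 1 ]; then op=">="; else op=">"; fi
    printf 'SELECT COUNT(*) FROM ks.table1 WHERE token(id) %s %s AND token(id) <= %s;\n' "$op" "$lo" "$hi"
    lo="$hi"
    i=$((i + 1))
  done
}
```

Calling token_range_queries 4 prints four queries whose ranges together cover every token exactly once. Summing their results gives the table's row count, and each individual query scans only a fraction of the data.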