NoSQL Distilled, Chapter 4: Distribution Models
The primary driver of interest in NoSQL has been its ability to run databases on a large cluster. As data volumes increase, it becomes more difficult and expensive to scale up—buy a bigger server to run the database on. A more appealing option is to scale out—run the database on a cluster of servers. Aggregate orientation fits well with scaling out because the aggregate is a natural unit to use for distribution.
Depending on your distribution model, you can get a data store that handles larger quantities of data, processes greater read or write traffic, or stays more available in the face of network slowdowns or breakages. These are often important benefits, but they come at a cost. Running over a cluster introduces complexity, so it's not something to do unless the benefits are compelling.
Broadly, there are two paths to data distribution: replication and sharding. Replication takes the same data and copies it over multiple nodes. Sharding puts different data on different nodes. Replication and sharding are orthogonal techniques: you can use either or both of them. Replication comes in two forms: master-slave and peer-to-peer. We will now discuss these techniques starting at the simplest and working up to the more complex: first single-server, then master-slave replication, then sharding, and finally peer-to-peer replication.
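The difference between the two paths can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular database does it: each "node" is a plain dict, the function names (`replicate`, `shard`, `shard_index`) are made up for this note, and the hash-modulo placement rule is just one simple way to assign keys to nodes.

```python
import hashlib


def replicate(nodes, key, value):
    """Replication: copy the same key/value onto every node."""
    for node in nodes:
        node[key] = value


def shard_index(key, node_count):
    """Choose one node for a key with a stable hash (illustrative placement rule)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def shard(nodes, key, value):
    """Sharding: store the key/value on exactly one node."""
    nodes[shard_index(key, len(nodes))][key] = value


# Three hypothetical nodes for each model; each node is just a dict.
replicated = [{}, {}, {}]
sharded = [{}, {}, {}]

for customer_id in ("alice", "bob", "carol", "dave"):
    record = {"id": customer_id}
    replicate(replicated, customer_id, record)
    shard(sharded, customer_id, record)

# Every replica holds all four records; the shards split them between nodes.
print([len(node) for node in replicated])
print([len(node) for node in sharded])
```

Because the two functions are independent, they can also be combined: shard the keys first, then replicate each shard, which is what "orthogonal" means here.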
4.1 Single Server
The first and simplest distribution option is the one we would most often recommend: no distribution at all. Run the database on a single machine that handles all the reads and writes to the data store. We prefer this option because it eliminates all the complexities that the other options introduce; it's easy for operations people to manage and easy for application developers to reason about.
Although a lot of NoSQL databases are designed around the idea of running on a cluster, it can make sense to use NoSQL with a single-server distribution model if the data model of the NoSQL store is more suited to the application. Graph databases are the obvious category here—these work best in a single-server configuration. If your data usage is mostly about processing aggregates, then a single-server document or key-value store may well be worthwhile because it’s easier on application developers.
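Translator's notes on the paragraph above: as long as a NoSQL database's data model fits your application's needs, running it on a single machine works perfectly well. Graph databases are the clearest case of a NoSQL store that suits a single machine (an earlier chapter described them as the odd one out in the NoSQL camp). If most of your data usage is processing aggregates, a document or key-value database on a single machine is a sound choice, and it keeps things simple for application developers.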
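To make the aggregate point concrete, here is a minimal sketch, assuming a hypothetical in-process class `SingleServerStore` that stands in for a single-server key-value store. The class, the order record, and its field names are invented for illustration. The point is only that the application reads and writes the whole aggregate under one key, with no joins and no cluster to reason about.

```python
import json


class SingleServerStore:
    """Hypothetical in-process key-value store: one node serves every read and write."""

    def __init__(self):
        self._data = {}

    def put(self, key, aggregate):
        # Serialize so the whole aggregate is stored as a single opaque value.
        self._data[key] = json.dumps(aggregate)

    def get(self, key):
        # Return the whole aggregate in one lookup.
        return json.loads(self._data[key])


store = SingleServerStore()

# An order aggregate: the order and its line items travel together.
order = {
    "id": "order-1001",
    "customer": "alice",
    "items": [{"sku": "book-42", "qty": 2, "price": 30.0}],
}
store.put(order["id"], order)

loaded = store.get("order-1001")
total = sum(item["qty"] * item["price"] for item in loaded["items"])
print(total)
```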
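key-value store: used here as a general term for key-value databases, such as Redis and Cassandra (although Cassandra is more precisely a column-family store).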
document store: used here as a general term for document databases, such as MongoDB, CouchDB, and Apache Jackrabbit.
graph databases: databases built around a graph data model, for example Neo4j.
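application developers: the people who write the application code.
operations people: the staff who deploy, run, and maintain the system.
running on a cluster: running across a group of machines rather than on a single one.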