Translated text:
NoSQL Distilled 第四章 Distribution Models
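About the author:
Section summary:
This installment covers replication within distribution models, focusing on peer-to-peer replication. Replication is a central topic in distributed storage: once you understand replication and sharding, you understand the core ideas behind distributed storage.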
4.4. Peer-to-Peer Replication
Master-slave replication helps with read scalability but doesn’t help with scalability of writes. It provides resilience against failure of a slave, but not of a master. Essentially, the master is still a bottleneck and a single point of failure. Peer-to-peer replication (see Figure 4.3) attacks these problems by not having a master. All the replicas have equal weight, they can all accept writes, and the loss of any of them doesn’t prevent access to the data store.
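Translation: Master-slave replication improves read scalability but does nothing for write scalability. (Translator's note: reads are now served by several nodes, but writes are still handled by a single node.) The resilience it provides covers the failure of a slave, not the failure of the master. (Translator's note: the sentence is a little convoluted. The point is that reads scale out but writes do not, because every write must still go through the master before being synchronized to the slaves.) The master therefore remains a bottleneck and a single point of failure. Peer-to-peer replication was designed to address exactly this: there is no master and no master-slave distinction. All replicas carry equal weight, all of them can accept writes, and losing one of them does not prevent access to the data store. (Translator's note: some people call this the masterless model.)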
Figure 4.3. Peer-to-peer replication has all nodes applying reads and writes to all the data.
Figure 4.3 (translation): In peer-to-peer replication, every node accepts both reads and writes.
The prospect here looks mighty fine. With a peer-to-peer replication cluster, you can ride over node failures without losing access to data. Furthermore, you can easily add nodes to improve your performance. There’s much to like here—but there are complications.
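Translation: The picture painted above looks very good. With peer-to-peer replication on a cluster, you can ride out node failures without losing access to data. (Translator's note: "ride" literally suggests sitting astride the nodes, which is a slightly cheeky image.) Better still, you can easily add nodes to improve performance. There is a lot to like, but there are also complications.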
The biggest complication is, again, consistency. When you can write to two different places, you run the risk that two people will attempt to update the same record at the same time—a write-write conflict. Inconsistencies on read lead to problems but at least they are relatively transient. Inconsistent writes are forever.
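Translation: The biggest complication is the familiar one: consistency. When writes can go to two different places, two people may try to update the same record at the same time, producing what is called a write-write conflict. (Translator's note: this is a big problem.) Inconsistent reads cause problems too, but at least they are transient. Inconsistent writes are permanent.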
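To make the write-write conflict concrete, here is a minimal Python sketch. The `Replica` class, the key `room-101`, and the naive "apply updates in arrival order" rule are all assumptions made for illustration; they are not taken from the book or from any particular database.

```python
class Replica:
    """A hypothetical peer that accepts writes locally and applies forwarded ones."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def local_write(self, key, value):
        # Accept the write immediately, with no coordination with other peers.
        self.data[key] = value
        return (key, value)

    def apply_remote(self, update):
        # Naively overwrite with whatever update arrives from another peer.
        key, value = update
        self.data[key] = value


a = Replica("A")
b = Replica("B")

# Two clients update the same record on different peers at the same time.
update_from_a = a.local_write("room-101", "booked by Alice")
update_from_b = b.local_write("room-101", "booked by Bob")

# Each peer then forwards its update to the other one.
b.apply_remote(update_from_a)
a.apply_remote(update_from_b)

# The peers have swapped values and now disagree about the record.
diverged = a.data["room-101"] != b.data["room-101"]
print(a.data["room-101"], "|", b.data["room-101"], "| diverged:", diverged)
```

Nothing in this toy setup ever reconciles the two peers, which is why an inconsistent write, unlike a stale read, does not go away on its own.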
We’ll talk more about how to deal with write inconsistencies later on, but for the moment we’ll note a couple of broad options. At one end, we can ensure that whenever we write data, the replicas coordinate to ensure we avoid a conflict. This can give us just as strong a guarantee as a master, albeit at the cost of network traffic to coordinate the writes. We don’t need all the replicas to agree on the write, just a majority, so we can still survive losing a minority of the replica nodes.
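Translation: Later chapters discuss how to deal with write inconsistencies. For now it is enough to know two broad approaches. In the first, the replicas coordinate with one another on every write to avoid conflicts. This gives a guarantee just as strong as a master's, at the cost of the network traffic needed to coordinate the writes. (Translator's note: by adding a coordination step, peer-to-peer replication resolves write conflicts just as master-slave replication does.) A write does not need agreement from every replica, only from a majority, so the data store keeps working even after losing a minority of the replica nodes.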
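The majority rule described above can be sketched in a few lines of Python. This is a simplified illustration, not a real consensus protocol: the `Replica` class, the integer version numbers, and the `quorum_write` helper are hypothetical, and the sketch ignores concurrency, retries, and recovery of nodes that missed a write.

```python
class Replica:
    """A hypothetical replica holding one versioned value."""

    def __init__(self):
        self.up = True
        self.version = 0
        self.value = None

    def can_accept(self, version):
        # Willing only if reachable and the proposed version is newer.
        return self.up and version > self.version

    def apply(self, version, value):
        self.version, self.value = version, value


def quorum_write(replicas, version, value):
    """Apply the write only if a strict majority of replicas is willing."""
    willing = [r for r in replicas if r.can_accept(version)]
    if len(willing) <= len(replicas) // 2:
        return False  # no majority, so nothing is applied
    for r in willing:
        r.apply(version, value)
    return True


cluster = [Replica() for _ in range(5)]

first = quorum_write(cluster, 1, "A")        # all five replicas agree
competing = quorum_write(cluster, 1, "B")    # same version again: rejected

cluster[0].up = False
cluster[1].up = False
minority_down = quorum_write(cluster, 2, "C")  # three of five still agree

cluster[2].up = False
majority_down = quorum_write(cluster, 3, "D")  # only two of five remain

print(first, competing, minority_down, majority_down)
```

The competing write with an already-used version is refused, which is the conflict avoidance, and the write still succeeds with two of five nodes down, which is the tolerance for losing a minority.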
At the other extreme, we can decide to cope with an inconsistent write. There are contexts when we can come up with policy to merge inconsistent writes. In this case we can get the full performance benefit of writing to any replica.
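Translation: At the other extreme, we accept inconsistent writes and find a way to cope with them. How? By merging them according to some policy. (Translator's note: as a first approximation, think of how git resolves conflicts through a merge.) In this case any replica can take a write, which yields the full performance benefit.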
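As a rough illustration of what a merge policy can look like, the Python sketch below shows two common choices. Both functions and the sample data are assumptions made for this example; the text does not prescribe any particular policy, and which one is acceptable depends on the application.

```python
def merge_union(items_a, items_b):
    """Policy 1: for set-like data, keep everything either replica saw."""
    return items_a | items_b


def merge_last_write_wins(write_a, write_b):
    """Policy 2: each write is a (timestamp, value) pair; keep the later one.

    Ties on the timestamp fall back to comparing the values, so both
    replicas pick the same winner regardless of argument order.
    """
    return max(write_a, write_b)


# Two replicas accepted different additions to the same shopping cart.
cart = merge_union({"book", "pen"}, {"book", "lamp"})

# Two replicas accepted different values for the same profile field.
email = merge_last_write_wins((1700000100, "old@example.com"),
                              (1700000200, "new@example.com"))

print(sorted(cart), email)
```

Both policies give the same result whichever replica performs the merge, which is what lets each replica accept writes without waiting for the others. Last-write-wins silently discards the losing write, so it only suits data where that loss is tolerable.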
These points are at the ends of a spectrum where we trade off consistency for availability.
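Translation: These two approaches sit at opposite ends of the spectrum along which consistency is traded for availability. One resolves conflicts before every write; the other resolves nothing up front, writes freely, and merges afterwards. (Translator's note: if you demand the strongest consistency, performance drops; if you demand the highest performance, strict consistency can no longer be guaranteed.)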
Appendix: vocabulary from the text