Translated text:
NoSQL Distilled, Chapter 5: Consistency
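Section summary:
Consistency has long been one of the major problems in distributed systems. This post focuses on one part of it: update consistency.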
Chapter 5. Consistency
One of the biggest changes from a centralized relational database to a cluster-oriented NoSQL database is in how you think about consistency. Relational databases try to exhibit strong consistency by avoiding all the various inconsistencies that we’ll shortly be discussing. Once you start looking at the NoSQL world, phrases such as “CAP theorem” and “eventual consistency” appear, and as soon as you start building something you have to think about what sort of consistency you need for your system.
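The biggest change in moving from a relational database to a NoSQL database is the way you think about consistency. Relational databases rely on strong consistency to avoid the various inconsistencies we will get to shortly. Once you enter the NoSQL world you run into terms such as "CAP theorem" and "eventual consistency", and as soon as you start building something you have to decide what kind, and what level, of consistency your system needs.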
Consistency comes in various forms, and that one word covers a myriad of ways errors can creep into your life. So we’re going to begin by talking about the various shapes consistency can take. After that we’ll discuss why you may want to relax consistency (and its big sister, durability).
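Consistency takes many forms, and the word hides many places where things can go wrong. This chapter first describes the forms consistency can take, then discusses the reasons a developer might relax the constraints on consistency (and on its companion, durability).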
5.1. Update Consistency
We’ll begin by considering updating a telephone number. Coincidentally, Martin and Pramod are looking at the company website and notice that the phone number is out of date. Implausibly, they both have update access, so they both go in at the same time to update the number. To make the example interesting, we’ll assume they update it slightly differently, because each uses a slightly different format. This issue is called a write-write conflict: two people updating the same data item at the same time.
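Let's start with an example: updating a telephone number. Suppose Martin and Pramod both notice that the contact number on the company website is out of date, and both happen to have permission to change it. They log in and update the number at the same time. To make the example more telling, assume the two of them enter the number in slightly different formats. We call this situation, where two people update the same data item at the same moment, a write-write conflict.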
When the writes reach the server, the server will serialize them—decide to apply one, then the other. Let’s assume it uses alphabetical order and picks Martin’s update first, then Pramod’s. Without any concurrency control, Martin’s update would be applied and immediately overwritten by Pramod’s. In this case Martin’s is a lost update. Here the lost update is not a big problem, but often it is. We see this as a failure of consistency because Pramod’s update was based on the state before Martin’s update, yet was applied after it.
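When the write requests reach the server, the server serializes them and handles them one after another. Assume it processes them in alphabetical order, so Martin's update is handled first and Pramod's second. With no concurrency control, Martin's update is applied and then immediately overwritten by Pramod's, so Martin's update is lost. (Translator's note: from Martin's point of view, his update had no effect and simply vanished.) A lost update is not a big problem in this example, but it often is. We count this as a consistency failure because Pramod's update was based on the state before Martin's update, yet was applied after it. (Translator's note: in other words, Martin cannot tell where his change went. Say I change the number to 13312341234 and am told the update succeeded, but the number now reads 133-1234-1234, which is clearly not the format I submitted. We assume the server does no formatting of its own.)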
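The lost update described above can be reproduced in a few lines. This is only a minimal sketch: the in-memory `store`, the `read` and `write` helpers, and the sample phone numbers are all made up for illustration and do not come from the book.

```python
# Hypothetical in-memory "server": one dict holding the phone number.
store = {"phone": "555 0100"}

def read(key):
    return store[key]

def write(key, value):
    # No concurrency control at all: the last write simply wins.
    store[key] = value

# Both clients read the same out-of-date state before either one writes.
martin_seen = read("phone")
pramod_seen = read("phone")

# The server serializes the two writes: Martin's first, then Pramod's.
write("phone", "555-0199")     # Martin's format
write("phone", "(555) 0199")   # Pramod's format, based on the pre-Martin state

# Martin's value is gone even though his write was accepted.
martin_update_lost = read("phone") != "555-0199"
print(read("phone"), "| Martin's update lost:", martin_update_lost)
```

Nothing here fails or raises an error, which is what makes a lost update easy to miss: both writes are reported as successful.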
Approaches for maintaining consistency in the face of concurrency are often described as pessimistic or optimistic. A pessimistic approach works by preventing conflicts from occurring; an optimistic approach lets conflicts occur, but detects them and takes action to sort them out. For update conflicts, the most common pessimistic approach is to have write locks, so that in order to change a value you need to acquire a lock, and the system ensures that only one client can get a lock at a time. So Martin and Pramod would both attempt to acquire the write lock, but only Martin (the first one) would succeed. Pramod would then see the result of Martin’s write before deciding whether to make his own update.
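Approaches to keeping data consistent under concurrency generally fall into two groups: pessimistic and optimistic. A pessimistic approach prevents conflicts from happening at all; an optimistic approach lets conflicts happen, then detects and resolves them. For update conflicts, the usual pessimistic approach is a write lock: to change a value you must first acquire the lock, and the system ensures that only one client holds it at a time. Martin and Pramod would both try to acquire the write lock, but only Martin, who asked first, would get it. Pramod has to wait until Martin has written and released the lock, and then decide whether he still wants to make his own update.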
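A write lock of this kind might look roughly like the sketch below. It uses Python's standard `threading.Lock`; the `PhoneRecord` class, the `normalize` helper and the sample numbers are assumptions made for the example. To keep the outcome deterministic, both "clients" are simulated in one thread with non-blocking acquire calls instead of real concurrent requests.

```python
import threading

class PhoneRecord:
    """Hypothetical record guarded by a single write lock."""
    def __init__(self, value):
        self.value = value
        self.write_lock = threading.Lock()

def normalize(number):
    # Keep only the digits so that formatting differences are ignored.
    return "".join(ch for ch in number if ch.isdigit())

record = PhoneRecord("555 0100")

# Martin asks first and gets the lock.
martin_has_lock = record.write_lock.acquire(blocking=False)
# Pramod asks while Martin still holds it and is refused.
pramod_has_lock = record.write_lock.acquire(blocking=False)

record.value = "555-0199"      # Martin writes, then releases the lock
record.write_lock.release()

# Pramod retries, now sees Martin's result, and decides whether to write.
with record.write_lock:
    if normalize(record.value) != normalize("(555) 0199"):
        record.value = "(555) 0199"

print(martin_has_lock, pramod_has_lock, record.value)
```

Because Pramod reads the value only after he holds the lock, he sees that the number is already correct and leaves Martin's version in place.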
A common optimistic approach is a conditional update where any client that does an update tests the value just before updating it to see if it’s changed since his last read. In this case, Martin’s update would succeed but Pramod’s would fail. The error would let Pramod know that he should look at the value again and decide whether to attempt a further update.
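Now for the optimistic side. A common optimistic approach is the conditional update: before updating, every client checks whether the current value is still the same as the value it last read. Here Martin's update would succeed and Pramod's would fail. The error tells Pramod that he should read the value again and then decide whether to try another update.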
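One way to sketch a conditional update is with a version counter that is checked at write time. The counter is an assumption of this example (the text only says the client tests whether the value has changed since its last read), and `VersionedStore`, `ConflictError` and `update_if_unchanged` are hypothetical names. A real server would also have to make the check and the write a single atomic step.

```python
class ConflictError(Exception):
    """Raised when the data changed after the client last read it."""

class VersionedStore:
    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        # Return the value together with the version the client saw.
        return self.value, self.version

    def update_if_unchanged(self, new_value, expected_version):
        # Reject the write if someone else has updated since the read.
        if self.version != expected_version:
            raise ConflictError(
                f"expected version {expected_version}, found {self.version}"
            )
        self.value = new_value
        self.version += 1

store = VersionedStore("555 0100")
_, martin_version = store.read()
_, pramod_version = store.read()

store.update_if_unchanged("555-0199", martin_version)   # succeeds

try:
    store.update_if_unchanged("(555) 0199", pramod_version)
    pramod_failed = False
except ConflictError:
    pramod_failed = True    # Pramod should re-read and decide again

print(store.value, store.version, pramod_failed)
```

Comparing a version number instead of the value itself avoids treating "changed and changed back" as "unchanged", at the cost of storing one extra field per record.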
Both the pessimistic and optimistic approaches that we’ve just described rely on a consistent serialization of the updates. With a single server, this is obvious: it has to choose one, then the other. But if there’s more than one server, such as with peer-to-peer replication, then two nodes might apply the updates in a different order, resulting in a different value for the telephone number on each peer. Often, when people talk about concurrency in distributed systems, they talk about sequential consistency, that is, ensuring that all nodes apply operations in the same order.
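Both the pessimistic and the optimistic approach described above depend on the updates being applied in one consistent order. On a single server this is obvious: it has to finish one and then do the other. With more than one server, as in peer-to-peer replication, two nodes may apply the updates in different orders, so the telephone number can end up with a different value on each node (different in format or in the digits themselves). When people talk about concurrency in distributed systems, they are usually talking about sequential consistency: ensuring that all nodes apply operations in the same order.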
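The effect of applying the same updates in different orders can be shown with a small sketch. The `apply_all` helper and the sample data are invented for illustration, and sorting by author name stands in for whatever ordering rule a real system would agree on.

```python
INITIAL = "555 0100"
updates = [("martin", "555-0199"), ("pramod", "(555) 0199")]

def apply_all(ordered_updates, start=INITIAL):
    # Apply each update in the given order; the last one wins.
    value = start
    for _author, new_value in ordered_updates:
        value = new_value
    return value

# Two peers receive the same updates but in opposite orders.
peer_a = apply_all(updates)
peer_b = apply_all(list(reversed(updates)))
diverged = peer_a != peer_b

# If both peers first agree on one order (here: sorted by author name,
# an arbitrary rule chosen for the example), they end up identical.
agreed_a = apply_all(sorted(updates))
agreed_b = apply_all(sorted(reversed(updates)))

print(peer_a, "|", peer_b, "| diverged:", diverged)
print(agreed_a, "|", agreed_b)
```

The agreed order does not decide which update is "right"; it only guarantees that every node ends up holding the same value.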
There is another optimistic way to handle a write-write conflict—save both updates and record that they are in conflict. This approach is familiar to many programmers from version control systems, particularly distributed version control systems that by their nature will often have conflicting commits. The next step again follows from version control: You have to merge the two updates somehow. Maybe you show both values to the user and ask them to sort it out—this is what happens if you update the same contact on your phone and your computer. Alternatively, the computer may be able to perform the merge itself; if it was a phone formatting issue, it may be able to realize that and apply the new number with the standard format. Any automated merge of write-write conflicts is highly domain-specific and needs to be programmed for each particular case.
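Another optimistic way to handle a write-write conflict is to save both updates and record that they conflict. Programmers who use version control systems will find this familiar, especially distributed version control systems, which by their nature often produce conflicting commits. The next step also follows version control: you have to merge the two updates somehow. You might show both values to the user and let the user choose; this is what happens when you update the same contact on both your phone and your computer. Alternatively, the computer may be able to do the merge itself: if the conflict is only about phone number formatting, it can recognize that and store the new number in the standard format. Any automated merge of write-write conflicts is highly domain-specific and has to be programmed for each particular case.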
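A domain-specific merge for the phone number case could be sketched as follows. Everything here is an assumption made for illustration: the 7-digit `NNN-NNNN` "standard format", the `merge` function and the idea of returning unresolved values to show to the user.

```python
def normalize(number):
    # Strip everything except digits.
    return "".join(ch for ch in number if ch.isdigit())

def format_standard(digits):
    # Hypothetical house format for a 7-digit number: NNN-NNNN.
    return f"{digits[:3]}-{digits[3:]}"

def merge(conflicting_values):
    """Return (merged_value, unresolved_values).

    If all conflicting values are the same number in different formats,
    merge automatically. Otherwise give every value back for a human to pick.
    """
    distinct = {normalize(value) for value in conflicting_values}
    if len(distinct) == 1:
        return format_standard(distinct.pop()), []
    return None, sorted(conflicting_values)

# Both updates were saved and flagged as conflicting.
auto_merged, auto_unresolved = merge(["555-0199", "(555) 0199"])
manual_merged, manual_unresolved = merge(["555-0199", "555-0142"])

print(auto_merged, auto_unresolved)
print(manual_merged, manual_unresolved)
```

The first conflict is only about formatting and is resolved automatically; the second involves two genuinely different numbers, so both are handed back for someone to choose.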
Often, when people first encounter these issues, their reaction is to prefer pessimistic concurrency because they are determined to avoid conflicts. While in some cases this is the right answer, there is always a tradeoff. Concurrent programming involves a fundamental tradeoff between safety (avoiding errors such as update conflicts) and liveness (responding quickly to clients). Pessimistic approaches often severely degrade the responsiveness of a system to the degree that it becomes unfit for its purpose. This problem is made worse by the danger of errors: pessimistic concurrency often leads to deadlocks, which are hard to prevent and debug.
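When people first meet these problems, they usually lean toward pessimistic concurrency because they want to avoid conflicts. That is sometimes the right answer, but there is always a tradeoff. Concurrent programming involves a fundamental tradeoff between safety (avoiding errors such as update conflicts) and liveness. (Translator's note: liveness here means the ability to respond quickly to clients; you could also call it responsiveness, "live" as in a live broadcast.) Pessimistic approaches often degrade a system's responsiveness so severely that it no longer meets its requirements. They also carry a risk of errors: pessimistic concurrency often leads to deadlocks, which are hard to prevent and hard to debug.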
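One common mitigation for the deadlock risk, not taken from the book, is to give up on a lock after a timeout instead of waiting forever. The sketch below uses the `timeout` argument of the standard `threading.Lock.acquire`; the two locks, the `update_both` helper and the 0.05 second timeout are illustrative assumptions, and the "other client" is simulated in the same thread.

```python
import threading

lock_phone = threading.Lock()
lock_address = threading.Lock()

def update_both(first, second, timeout=0.05):
    """Take two write locks; back off rather than wait forever."""
    with first:
        if not second.acquire(timeout=timeout):
            # Could not get the second lock in time: report failure.
            # Leaving the with-block releases the first lock as well.
            return False
        try:
            return True     # both locks held: the writes would go here
        finally:
            second.release()

# Simulate another client that already holds the address lock.
lock_address.acquire()
blocked = update_both(lock_phone, lock_address)
lock_address.release()

# Once the other client has released its lock, the update goes through.
succeeded = update_both(lock_phone, lock_address)

print(blocked, succeeded)
```

The timeout trades safety for liveness in miniature: the caller never hangs, but it now has to handle a failed update and decide whether to retry.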
Replication makes it much more likely to run into write-write conflicts. If different nodes have different copies of some data which can be independently updated, then you’ll get conflicts unless you take specific measures to avoid them. Using a single node as the target for all writes for some data makes it much easier to maintain update consistency. Of the distribution models we discussed earlier, all but peer-to-peer replication do this.
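Distributing data by replication makes write-write conflicts much more likely. If different nodes hold different copies of some data and those copies can be updated independently, you will get conflicts unless you take specific measures to avoid them. Using a single node as the target for all writes to a given piece of data makes update consistency much easier to maintain. Of the distribution models discussed earlier, all but peer-to-peer replication do this.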
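Routing every write for a given piece of data through one node can be sketched as below. The `Node` and `Cluster` classes, the byte-sum rule for choosing the primary and the immediate copy to the other nodes are simplifying assumptions for illustration, not a description of any particular database.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}

class Cluster:
    def __init__(self, nodes):
        self.nodes = nodes

    def primary_for(self, key):
        # Deterministic choice, so every client picks the same node for
        # the same key (a toy rule, assumed for this example).
        return self.nodes[sum(key.encode()) % len(self.nodes)]

    def write(self, key, value):
        # The single primary orders all writes for this key...
        primary = self.primary_for(key)
        primary.data[key] = value
        # ...and its result is then copied to the other nodes.
        for node in self.nodes:
            if node is not primary:
                node.data[key] = value
        return primary.name

cluster = Cluster([Node("n0"), Node("n1"), Node("n2")])
martin_target = cluster.write("phone", "555-0199")
pramod_target = cluster.write("phone", "(555) 0199")
values = {node.data["phone"] for node in cluster.nodes}

print(martin_target, pramod_target, values)
```

Both writes land on the same primary, so there is exactly one order of updates and every copy ends up with the same value. Whether the second write should have been accepted at all is still a matter for the locking or conditional-update checks discussed earlier.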