A Note on Distributed Computing

Local computing means programs that are confined to a single address space.

Distributed computing means programs that make calls to other address spaces, possibly on another machine. Nothing is known about the recipient of the call other than that it supports a particular interface.

There is an overall vision of distributed OO computing in which there is no essential distinction between objects that share an address space and objects that are on two machines with different architectures located on different continents. Whether local or remote, these systems are defined in terms of a set of interfaces declared in an interface definition language – the implementation is independent of the interface and hidden from other objects.

Given the isolation of an object’s implementation from clients of the object, the use of objects for distributed computing seems natural.

In this vision, a single paradigm of object use and communication is used no matter where the object is located. In actual practice, a local member function call and a cross-continent object invocation are not the same thing.

In this vision, the first phase of writing an application is to write it without worrying about where the objects are located or how their communication is implemented. The developer simply strives for the natural and correct interface between objects. This approach enforces a desirable separation between the abstract architecture of the application and the need for performance tuning.

The second phase is to tune performance. The right set of interfaces to export to various clients can be chosen.

The final phase is to test with “real bullets” (e.g., networks being partitioned, machines going down). Interfaces between objects can be beefed up to deal with these sorts of partial failures introduced by distribution by adding replication, transactions or whatever else is needed.

A central part of this vision is that if an application is built using OO all the way down, the right “fault points” at which to insert process or machine boundaries will emerge naturally. One justification for this vision is that whether a call is local or remote has no impact on the correctness of the program: it makes no difference to correctness whether the operation is carried out within the same address space, on some other machine, or off-line by some other piece of equipment.

The vision is centered around the following principles:

there is a single natural OO design for a given application, regardless of deployment context;
failure and performance issues are tied to the implementation of components, and consideration of these can be left out of an initial design; and
the interface of an object is independent of the context in which that object is used.

Unfortunately, all of these principles are false!

The desire to merge the programming and computational models of local and remote computing is not new.

Programming distributed applications is not the same as programming non-distributed applications. Making the communications paradigm the same as the language paradigm is insufficient to make writing distributed programs easier, because communicating between the parts of a distributed application is not the difficult part of that application.

The hard problems in distributed computing concern dealing with partial failure and the lack of a central resource manager, ensuring adequate performance, dealing with problems of concurrency, and handling differences in memory access paradigms between local and distributed entities.

The often overlooked differences concerning memory access, partial failure, and concurrency are far more difficult to explain away than the difference in latency, and the differences concerning partial failure and concurrency make unifying the local and remote computing models impossible without making unacceptable compromises.

It is the disparity between processor speed and network latency that is often seen as the essential difference between local and distributed computing.

If the only difference between local and distributed object invocations were the amount of time it took to make the call, one would strive for a future in which the two types of calls would be virtually indistinguishable, although rational people could disagree as to the wisdom of this approach.

Pointers in a local address space are not valid in another (remote) address space. Either all memory access must be controlled by the underlying system, or the programmer must be aware of the different types of access – local and remote. Providing distributed shared memory is one way of completely relieving the programmer from worrying about remote memory access.
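
The memory-access difference can be sketched in a few lines. This is only an illustration: `Account`, `deposit_local`, and `deposit_remote` are hypothetical names, and the "remote" side is simulated in-process by forcing everything through a byte encoding, which is all that can actually cross an address-space boundary.

```python
import json


class Account:
    """Hypothetical example object; not from the paper."""

    def __init__(self, balance=0):
        self.balance = balance

    def to_wire(self) -> bytes:
        # Only a value encoding can leave the address space, never a pointer.
        return json.dumps({"balance": self.balance}).encode()

    @staticmethod
    def from_wire(data: bytes) -> "Account":
        return Account(json.loads(data)["balance"])


def deposit_local(account: Account, amount: int) -> None:
    # Same address space: the callee mutates the caller's object via a reference.
    account.balance += amount


def deposit_remote(wire_bytes: bytes, amount: int) -> bytes:
    # Stand-in for a remote callee: it rebuilds its own copy from bytes.
    account = Account.from_wire(wire_bytes)
    account.balance += amount
    return account.to_wire()  # the change must be shipped back explicitly


local = Account()
deposit_local(local, 10)
print(local.balance)  # 10: the mutation is visible through the shared reference

sent = Account()
reply = deposit_remote(sent.to_wire(), 10)
print(sent.balance)                      # 0: the caller's object was never touched
print(Account.from_wire(reply).balance)  # 10: only the returned copy has the change
```

The two calls look alike at the call site but have different semantics, which is the point of the paragraph above: either the system mediates every access, or the programmer has to know which kind of access is being made.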

Adding a layer that allows the replacement of all pointers with object references only permits the developer to adopt a unified model of object interaction; it cannot be enforced unless one also removes the ability to get address-space-relative pointers from the language used by the developer. This requires that programmers learn a new style of programming which does not use address-space-relative pointers. In doing so, one gives up the complete transparency between local and distributed computing.

It is logically possible that the differences between local and remote memory access could be completely papered over and a single model of both presented to the programmer. When we turn, however, to the problems introduced to distributed computing by partial failure and concurrency, it is not clear that such a unification is even conceptually possible.

Partial failure is a central reality of distributed computing. Both the local and the distributed world contain components that are subject to periodic failure. In the case of local computing, such failures are either total or detectable.

In distributed computing, one component (machine, network link) can fail while the others continue. There is no common agent that is able to determine what component has failed and inform the other components of that failure, and no global state that can be examined to determine exactly what error has occurred. The failure of a network link is indistinguishable from the failure of a processor on the other side of that link.
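
A small simulation makes the indistinguishability concrete. Everything here is an assumption for illustration: `FakeServer` is an in-process stand-in for a remote machine, and the `failure` argument names where the fault is injected. The caller sees the same timeout in all three cases, yet the server's state differs.

```python
class FakeServer:
    """Hypothetical in-process stand-in for a remote machine."""

    def __init__(self):
        self.counter = 0

    def increment(self):
        self.counter += 1
        return self.counter


def call_increment(server, failure):
    """Simulate one remote call; `failure` says where the fault is injected."""
    if failure == "request_lost":   # the link drops the request
        raise TimeoutError("no reply")
    if failure == "server_down":    # the machine is down before it sees the request
        raise TimeoutError("no reply")
    result = server.increment()     # the operation really executes
    if failure == "reply_lost":     # ...but the reply never makes it back
        raise TimeoutError("no reply")
    return result


observations = []
for failure in ("request_lost", "server_down", "reply_lost"):
    server = FakeServer()
    try:
        call_increment(server, failure)
    except TimeoutError as exc:
        # The caller only ever learns str(exc); server.counter is hidden from it.
        observations.append((failure, str(exc), server.counter))

for row in observations:
    print(row)
```

From the caller's side all three rows carry the identical error, while the hidden counter shows that in the last case the work was actually done.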

Partial failure requires that programs deal with indeterminacy. The interfaces that are used for the communication must be designed in such a way that it is possible for the objects to react in a consistent way to possible partial failures.
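
One common way to let an interface react consistently to partial failure is to make operations idempotent by having the caller supply a request identifier. The sketch below is an assumption for illustration, not something the paper prescribes; `DepositService` and `deposit_with_retry` are hypothetical names, and the lost reply is simulated.

```python
import uuid


class DepositService:
    """Hypothetical server that remembers request ids so retries are safe."""

    def __init__(self):
        self.balance = 0
        self._replies = {}  # request_id -> reply already produced

    def deposit(self, request_id, amount):
        if request_id in self._replies:
            # A retry of work that already happened: replay the old reply.
            return self._replies[request_id]
        self.balance += amount
        self._replies[request_id] = self.balance
        return self.balance


def deposit_with_retry(service, amount, lose_first_reply=True, attempts=3):
    request_id = str(uuid.uuid4())  # chosen once by the caller, reused on retries
    for attempt in range(attempts):
        reply = service.deposit(request_id, amount)
        if lose_first_reply and attempt == 0:
            continue  # simulate: the work happened but the reply was lost
        return reply
    raise TimeoutError("gave up after retries")


service = DepositService()
print(deposit_with_retry(service, 50))  # 50
print(service.balance)                  # 50, not 100, despite the retry
```

Note that the request id is part of the interface itself: a signature designed for the local case (`deposit(amount)`) gives the caller no safe way to retry after an ambiguous failure.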

The result of treating all objects as local, as in the development phases described earlier, is that the resulting model is essentially indeterministic in the face of partial failure, and consequently fragile and non-robust. The price of adopting this model is that such failures go unhandled and are catastrophic.

Distributed objects by their nature must handle concurrent method invocations. Either all objects must bear the weight of concurrency semantics, or all objects must ignore the problem and hope for the best when distributed.
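
As a rough illustration of what "bearing the weight of concurrency semantics" means for a single object, the sketch below guards its state with a lock so that invocations arriving on several threads at once stay consistent. `Counter` and `hammer` are hypothetical names, and threads stand in for independent remote callers.

```python
import threading


class Counter:
    """Hypothetical object that may be invoked by many callers at once."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, the read-modify-write below could interleave.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def hammer(counter, calls):
    for _ in range(calls):
        counter.increment()


counter = Counter()
threads = [threading.Thread(target=hammer, args=(counter, 1000)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)  # 8000
```

A purely local, single-threaded object would not need the lock at all, which is why forcing every object to carry it makes non-distributed programs harder to write as well.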

Either the programming model must ignore significant modes of failure or the overall programming model must assume a worst-case complexity model for all objects within the program, making the production of any program, distributed or not, more difficult.

Robustness is not simply a function of the implementations of the interfaces that make up the system. Implementation is not the sole factor determining system robustness: many aspects of robustness can be reflected only at the protocol/interface level.

Merging the models by attempting to make distributed computing follow the model of local computing requires ignoring the different failure modes and basic indeterminacy inherent in distributed computing, leading to systems that are unreliable and incapable of scaling beyond small groups of machines that are geographically co-located and centrally administered.

A better approach is to accept that there are irreconcilable differences between local and distributed computing, and to be conscious of those differences at all stages of the design and implementation of distributed applications.

Programming a distributed application requires thinking about the problem in a different way than when the solution was a non-distributed application. Keeping the difference visible will keep the programmer from forgetting the difference and making mistakes.
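
One possible way to keep the difference visible in code is to give the remote object a deliberately different interface, so that failure and timeouts appear in every call site. This is a sketch under assumptions: `LocalInventory`, `RemoteInventory`, `RemoteCallFailed`, and the two transport functions are all hypothetical and exist only for this example.

```python
from typing import Optional


class RemoteCallFailed(Exception):
    """Hypothetical marker: the outcome of the remote operation is unknown."""


class LocalInventory:
    def __init__(self):
        self._stock = {"widget": 3}

    def count(self, item: str) -> int:
        return self._stock.get(item, 0)


class RemoteInventory:
    """Deliberately NOT a drop-in replacement for LocalInventory."""

    def __init__(self, transport):
        # Assumed callable: (method, item, timeout_s) -> int, may raise.
        self._transport = transport

    def try_count(self, item: str, timeout_s: float) -> Optional[int]:
        try:
            return self._transport("count", item, timeout_s)
        except RemoteCallFailed:
            return None  # the caller must decide: retry, fall back, or report


def healthy_transport(method, item, timeout_s):
    return LocalInventory().count(item)


def partitioned_transport(method, item, timeout_s):
    raise RemoteCallFailed("no reply within %.1fs" % timeout_s)


print(RemoteInventory(healthy_transport).try_count("widget", 0.5))      # 3
print(RemoteInventory(partitioned_transport).try_count("widget", 0.5))  # None
```

Because the remote method has a different name, takes a timeout, and returns an optional value, code written against `LocalInventory` cannot silently be pointed at a remote object.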
