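[Translation] Eventual Consistency
Author: 会写代码的猪 (The Pig Who Writes Code) Published: January 16, 2010 Categories: The Pig Is Coding, All in Chinese
A few days ago I came across this article online by Amazon CTO Werner Vogels on eventual consistency in distributed systems, so I translated it as a learning exercise.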
Eventually Consistent – Revisited
Eventual Consistency
By Werner Vogels
http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
I wrote a first version of this posting on consistency models about a year ago, but I was never happy with it as it was written in haste and the topic is important enough to receive a more thorough treatment. ACM Queue asked me to revise it for use in their magazine and I took the opportunity to improve the article. This is that new version.
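I published the first version of this post on consistency models about a year ago, but I was never happy with it: it was written in haste, and the topic is important enough to deserve a more thorough treatment. ACM Queue (translator's note: a highly regarded magazine published by the ACM, http://queue.acm.org/) asked me to revise it for use in their magazine, and I took the opportunity to improve the article. This is the new version.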
Eventually Consistent - Building reliable distributed systems at a worldwide scale demands trade-offs between consistency and availability.
Eventually consistent: building a reliable distributed system at a worldwide scale requires trade-offs between consistency and availability.
At the foundation of Amazon's cloud computing are infrastructure services such as Amazon's S3 (Simple Storage Service), SimpleDB, and EC2 (Elastic Compute Cloud) that provide the resources for constructing Internet-scale computing platforms and a great variety of applications. The requirements placed on these infrastructure services are very strict; they need to score high marks in the areas of security, scalability, availability, performance, and cost effectiveness, and they need to meet these requirements while serving millions of customers around the globe, continuously.
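Amazon's cloud computing is built on infrastructure services such as Amazon S3 (Simple Storage Service), SimpleDB, and EC2 (Elastic Compute Cloud), which provide the resources for building Internet-scale computing platforms and a wide variety of applications. The requirements on these infrastructure services are very demanding: they must perform well in security, scalability, availability, performance, and cost effectiveness, and they must do so while serving millions of customers around the world without interruption.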
Under the covers these services are massive distributed systems that operate on a worldwide scale. This scale creates additional challenges, because when a system processes trillions and trillions of requests, events that normally have a low probability of occurrence are now guaranteed to happen and need to be accounted for up front in the design and architecture of the system. Given the worldwide scope of these systems, we use replication techniques ubiquitously to guarantee consistent performance and high availability. Although replication brings us closer to our goals, it cannot achieve them in a perfectly transparent manner; under a number of conditions the customers of these services will be confronted with the consequences of using replication techniques inside the services.
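Behind these services are massive distributed systems operating at a worldwide scale. This scale brings additional challenges: when a system processes trillions of requests, events that are normally improbable become practically certain to occur and must be accounted for up front in the system's design and architecture. Because these systems are worldwide in scope, we use replication everywhere to guarantee consistent performance and high availability. Replication brings us closer to our goals, but it cannot achieve them in a perfectly transparent way; under a number of conditions, customers of these services will be confronted with the consequences of the replication used inside them.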
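The point about rare events becoming near-certain at scale can be checked with a little arithmetic. The sketch below assumes each request independently triggers the rare event with the same probability p, so the chance of seeing it at least once in n requests is 1 - (1 - p)^n. The function name and the figure of one in a billion are illustrative assumptions, not numbers from the article.

```python
import math

def prob_at_least_once(p: float, n: int) -> float:
    """Chance that an event with per-request probability p occurs at
    least once in n independent requests (assumes 0 <= p < 1)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    # 1 - (1 - p)**n, computed via log1p/expm1 so tiny p stays accurate.
    return -math.expm1(n * math.log1p(-p))

# Hypothetical failure rate: one in a billion requests.
for n in (10**6, 10**9, 10**12):
    print(f"{n:>16,d} requests -> {prob_at_least_once(1e-9, n):.6f}")
```

Under these assumptions, an event that is negligible over a million requests becomes more likely than not at a billion and effectively certain at a trillion, which is why such cases have to be designed for up front.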
One of the ways in which this manifests itself is in the type of data consistency that is provided, particularly when the underlying distributed system provides an eventual consistency model for data replication. When designing these large-scale systems at Amazon, we use a set of guiding principles and abstractions related to large-scale data replication and focus on the trade-offs between high availability and data consistency. In this article I present some of the relevant background that has informed our approach to delivering reliable distributed systems that need to operate on a global scale. An earlier version of this text appeared as a posting on the All Things Distributed weblog in December 2007 and was greatly improved with the help of its readers.
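One way this shows up is in the type of data consistency provided, particularly when the underlying distributed system offers an eventual consistency model for data replication. When designing these large-scale systems at Amazon, we use a set of guiding principles and abstractions related to large-scale data replication, and we focus on the trade-offs between high availability and data consistency. In this article I present some of the background that has informed our approach to delivering reliable distributed systems that must operate at a global scale. An earlier version of this text appeared as a post on the All Things Distributed weblog in December 2007 and was greatly improved with the help of its readers.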
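To make concrete how replication can become visible to a client, here is a toy model of asynchronous replication. It is a minimal sketch and not a description of how any Amazon service works: the class `AsyncReplicatedStore`, its methods, and the explicit `propagate` step are all hypothetical names introduced for illustration.

```python
from collections import deque

class AsyncReplicatedStore:
    """Toy store: a write lands on replica 0 immediately and reaches
    the other replicas only when propagate() is called."""

    def __init__(self, n_replicas: int = 3):
        self.replicas = [{} for _ in range(n_replicas)]
        self.pending = deque()  # updates not yet applied to other replicas

    def write(self, key, value):
        self.replicas[0][key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, key, replica_index: int):
        # A client may be routed to any replica.
        return self.replicas[replica_index].get(key)

    def propagate(self):
        # Apply queued updates in the order they were written.
        while self.pending:
            index, key, value = self.pending.popleft()
            self.replicas[index][key] = value

store = AsyncReplicatedStore()
store.write("x", 1)
stale = store.read("x", replica_index=2)   # replica 2 has not seen the write yet
store.propagate()
fresh = store.read("x", replica_index=2)   # all replicas now agree
print(stale, fresh)
```

In this model a read served by a lagging replica misses a write that has already been acknowledged, and the replicas converge once the pending updates are applied and no new writes arrive. That window is the kind of consequence of replication a customer can observe.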
