
Big Data Dictionary

Amazon Dynamo

The Amazon Dynamo system, the design basis for the commercially available Amazon DynamoDB, is a highly available and scalable distributed key/value datastore built to support Amazon's internal applications. Dynamo is used to manage the state of services that have very high reliability requirements and need tight control over the tradeoffs between availability, consistency, cost-effectiveness and performance. In practice, Amazon's platform has a very diverse set of applications with different storage requirements. A select set of applications requires a storage technology that is flexible enough to let application designers configure their data store appropriately based on these tradeoffs, to achieve high availability and guaranteed performance in the most cost-effective manner. There are many services on Amazon's platform that only need primary-key access to a data store. For these services, the common pattern of using a relational database would lead to inefficiencies and limit scale and availability. Thus, Dynamo provides a simple primary-key-only interface to meet the requirements of these applications. The query model of the Dynamo system relies on simple read and write operations on a data item that is uniquely identified by a key. State is stored as binary objects (blobs) identified by unique keys. No operation spans multiple data items.

Dynamo's partitioning scheme relies on a variant of consistent hashing to distribute the load across multiple storage hosts. In this mechanism, the output range of a hash function is treated as a fixed circular space or ring (i.e. the largest hash value wraps around to the smallest hash value). Each node in the system is assigned a random value within this space, which represents its position on the ring. Each data item identified by a key is assigned to a node by hashing the data item's key to yield its position on the ring, and then walking the ring clockwise to find the first node with a position larger than the item's position. Thus, each node becomes responsible for the region of the ring between it and its predecessor node on the ring. The principal advantage of consistent hashing is that the departure or arrival of a node only affects its immediate neighbors, and other nodes remain unaffected.
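The ring lookup described above can be sketched in a few lines of Python. This is only an illustration, not Dynamo's implementation: the class name HashRing, the tiny ring size of 64, the use of MD5 as the hash function, and the hand-picked node positions (the text says positions are random) are all assumptions made so the example stays small and repeatable. A key that hashes exactly onto a node's position is given to that node.

```python
import bisect
import hashlib

RING_SIZE = 64  # assumption: a tiny ring so positions are easy to read


def ring_position(key: str) -> int:
    """Hash a key to a position on the ring (MD5 is an assumed choice)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


class HashRing:
    """Toy consistent-hashing ring; the caller supplies node positions."""

    def __init__(self):
        self._positions = []  # sorted node positions
        self._names = {}      # position -> node name

    def add_node(self, name: str, position: int) -> None:
        if position in self._names:
            raise ValueError("position already taken")
        bisect.insort(self._positions, position)
        self._names[position] = name

    def owner(self, position: int) -> str:
        """Walk clockwise from `position` to the first node at or after it."""
        if not self._positions:
            raise LookupError("ring has no nodes")
        i = bisect.bisect_left(self._positions, position)
        # Past the largest node position, wrap around to the smallest one.
        return self._names[self._positions[i % len(self._positions)]]

    def owner_of_key(self, key: str) -> str:
        return self.owner(ring_position(key))


ring = HashRing()
for name, pos in [("A", 10), ("B", 20), ("C", 30), ("D", 40)]:
    ring.add_node(name, pos)

# Record who owns a few sample positions, then let node E arrive
# between B and C and see which positions changed owner.
before = {p: ring.owner(p) for p in (5, 15, 22, 28, 35, 45)}
ring.add_node("E", 25)
after = {p: ring.owner(p) for p in before}
moved = [p for p in before if before[p] != after[p]]
print(moved)
```

In this run only the sample position that falls between B and the new node E changes owner; it moves from C to E, while positions owned by A, B and D stay where they were.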

In the Dynamo system, each data item is replicated at N hosts, where N is a parameter configured per instance. Each key k is assigned to a coordinator node. The coordinator is in charge of the replication of the data items that fall within its range. In addition to locally storing each key within its range, the coordinator replicates these keys at the (N - 1) clockwise successor nodes in the ring. This results in a system where each node is responsible for the region of the ring between it and its Nth predecessor. As illustrated in the figure below (where N = 3), node B replicates the key at nodes C and D in addition to storing it locally. Node D will store the keys that fall in the ranges (A, B], (B, C], and (C, D]. The list of nodes that are responsible for storing a particular key is called the preference list. The system is designed so that every node in the system can determine which nodes should be in this list for any particular key.
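The replication rule can be made concrete with a small Python sketch. It assumes a four-node ring A, B, C, D with hand-picked integer positions and N = 3; the function names preference_list and stored_ranges are hypothetical helpers for this illustration, not part of any Dynamo API.

```python
import bisect


def _ordered(node_positions):
    """Return node names and positions sorted clockwise around the ring."""
    ordered = sorted(node_positions.items(), key=lambda item: item[1])
    return [name for name, _ in ordered], [pos for _, pos in ordered]


def preference_list(node_positions, key_position, n):
    """Coordinator for the key plus its n - 1 clockwise successors."""
    names, positions = _ordered(node_positions)
    if not 1 <= n <= len(names):
        raise ValueError("n must be between 1 and the number of nodes")
    # The coordinator is the first node at or after the key's position,
    # wrapping around to the start of the ring if necessary.
    start = bisect.bisect_left(positions, key_position) % len(names)
    return [names[(start + i) % len(names)] for i in range(n)]


def stored_ranges(node_positions, node, n):
    """Ranges (predecessor, owner] whose keys end up stored on `node`."""
    names, _ = _ordered(node_positions)
    idx = names.index(node)
    ranges = []
    # The node holds its own range and the ranges of its n - 1 predecessors.
    for back in range(n - 1, -1, -1):
        owner = names[(idx - back) % len(names)]
        predecessor = names[(idx - back - 1) % len(names)]
        ranges.append((predecessor, owner))
    return ranges


nodes = {"A": 10, "B": 20, "C": 30, "D": 40}  # assumed positions
plist = preference_list(nodes, key_position=15, n=3)
d_ranges = stored_ranges(nodes, "D", n=3)
print(plist)
print(d_ranges)
```

For a key that falls between A and B, the sketch yields the preference list B, C, D, and node D ends up holding the ranges (A, B], (B, C] and (C, D], matching the description above. Because the list is computed only from the ring layout, any node that knows the node positions can derive it for any key.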
