As mentioned in the previous blog entry, searching for data items is a central issue in peer-to-peer systems. Several approaches have been suggested to deal with this problem. The paper “Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications”[1] focuses on one of them.
The article discussed a simple yet effective means for locating data items in a dynamic peer-to-peer system. Aside from its simplicity, the provability of its correctness and performance distinguish it from other lookup protocols.
Overview
The foundation of Chord’s simplicity lies in its consistent hash function that makes use of a base hash function such as SHA-1. Chord’s consistent hash function is both used in assigning to nodes and keys an m-bit identifier, where m is large enough to ensure that the event of having 2 nodes or keys sharing the same identifier is highly improbable (though the use of SHA-1 is already a measure for enforcing deterministic results, unless explicitly modified).
This approach can be viewed as a circle having 2^m identifiers, ranging from 0 to 2^m – 1, and arranged in ascending order. This circle is called the identifier circle. Nodes are plotted in the circle based on its IP address, while keys – which refer to their hash function values in this context – are assigned to nodes with the nearest identifier, starting from the value of the keys themselves (i.e. the identifiers having the same value as the keys). The node where a key is assigned is called the successor node of key k or simply successor(k). To maintain the consistency produced by the hashing function, certain keys are reassigned to a different node when nodes enter and leave the network. When a node enters a network, some of the keys belonging to its successor, specifically those whose values are less than or equal to the new node’s identifier, will be assigned to it. When a node leaves a network, all of its keys will be assigned to its successor.
Searching for a key, which is mapped to a data item or an address, is facilitated by maintaining routing information, stored in what is called as the finger table, in each node in the Chord circle. The finger table contains a list of nodes, their successors, and the range that they cover. Having a single entry in the finger table that denotes the successor of the current node would have already been sufficient for searching a key in the circle, but this may take up to n nodes to search for a key. To reduce the number of steps needed for searching, a node stores up to m entries in the finger table. This is the same as having logn entries in a node, thus routing information remains relatively small even for large networks. Having additional routing information provides the means to search for the nearest node if the current one doesn’t contain the key. This allows Chord to perform look up in just logn (application level) hops, which means its lookup also scales well for large networks.
Changes in the Network
It is understandable, however, that nodes may join and leave a network at any given time. To maintain its capacity to search for keys, Chord provides operations to ensure that each node’s successor is updated and successor(k) returns the correct value. Maintaining a correct finger table is also ideal, since this will enforce the same lookup speed of logn. These processes are simplified by adding a predecessor pointer to each node.
When a new node enters the network, its finger table and predecessor node are initialized. All other nodes update their finger tables and predecessors to reflect the existence of the new node. Key associations are also updated. Frequent corrections of finger tables, though, are costly, so an alternative is to use a stabilization procedure that runs periodically to verify each node’s successor and adjust its finger table.
When a node n fails, each node that contains n in its finger table must replace n with its successor. This is done by searching for the first active entry in each node’s successor list, which contain a node’s r nearest successors. Periodic stabilization will correct each node’s finger table.
Related Works
There are other similar approaches that use Distributed Hash Table (DHT) for searching just like Chord. I’ll cite two of them: Content Addressable Network (CAN)[2] and Tapestry[3].
CAN makes use of a d-dimensional coordinate system to plot network nodes and keys. Just like Chord, it makes use of a consistent hash function such as SHA-1 to determine the coordinate space of nodes and keys. Though, unlike Chord, the size of each node’s routing table is based on d, and it remains fixed even if the size of the network increases. For each node, it only has to maintain information about its immediate neighbors. The number of steps needed for searching scales relatively well also, as this is asymptotic to O(n^(1/d)). Note, however, that if d is 2 (resulting to a 2-dimensional coordinate space), the number of hops needed for searching is O(n^(1/2)) or O(√n), which grows faster than O(logn). In order for CAN to match the speed of Chord’s searching routine, it has to increase the number of dimensions of its coordinate space or the number of coordinate spaces (each one called a “reality”), resulting to a complex system.
In Tapestry, each node in the network is assigned a nodeID from a large namespace by also using a consistent hash function (e.g. SHA-1). The same mechanism is also used for assigning application-specific endpoints GUIDs. Each identifier G is assigned to a live node that has an ID equal to G. This node becomes the root of G or GR. To allow the routing of messages, each node maintains a routing table consisting of nodeIDs and the IP addresses of their neighbors. This routing table is divided into multiple levels, with each level matching a prefix up to a certain digit position. Messages are passed by selecting nodes whose ID’s have prefixes that are closer to G. With each hop, the number of digits that are similar to G are increasing until it reaches its destination. This method ensures that any existing nodes in the system can be reached in at most logBN hops (application level), where N is the size of the namespace and B is the base of the IDs. The structure of Tapestry can be viewed as a graph of connected nodes, with each node that is a root of an identifier having a spanning tree that allows messages from the leaf nodes to be routed towards the root. While Tapestry can perform lookups faster than Chord, it also has a more complex system than Chord.
Analysis
One property that stands out for Chord is its simplicity despite its facility for fast lookups and its relatively small routing tables. This makes Chord ideal for large networks not only because of its scalability, but because of its simple mechanisms that are suitable even for limited resources.
Though the size of its routing table increases with the network size, a size of logn is still small, even if the number of nodes in the network is almost equal to the current human population (31 entries for 4 billion nodes). Its load balancing also ensures that nodes will generally carry the same burden, reducing the risk of nodes having responsibilities that are more than their capacity.
Its lookup speed of logn steps is also impressive. Other P2P overlays can match or outperform Chord’s lookup speed, but at the expense of implementing a more complex system (e.g. CAN and Tapestry). From a reader’s perspective, Chord is a practical solution for the real world problems faced by P2P systems.
References:
[1] Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan. Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications. ACM SIGCOMM ‘01. August 2001.
[2] Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, and Scott Shenker. A Scalable Content Addressable Network. ACM SIGCOMM ‘01. August 2001.
[3] Ben Zhao, Ling Huang, Jeremy Stribling, Sean Rhea, Anthony Joseph, and John Kubiatowicz. Tapestry: A Resilient Global-Scale Overlay for Service Deployment. IEEE Journals on Selected Areas in Communications, Vol. 22, No. 1. January 2004.
Monday, October 5, 2009
Subscribe to:
Post Comments (Atom)
http://pacquiao-vs-marquez-fight.blogspot.com/
ReplyDelete