Knot Detection Algorithms In Distributed Systems Computer Science Essay

Published: November 9, 2015 Words: 6975

Knot detection algorithms in distributed systems have many applications in deadlock detection and possibly in deadlock resolution. We'll examine three different approaches to this problem. They're listed chronically, and each one was trying to improve and outperform the older ones. Then a comparison between these algorithms is devised, then finally the conclusion and the references.

Index Terms-Distributed algorithms, Deadlocks detection, knots and cycles detection

Distributed systems consist of processes running different programs on different machines communicating through message passing or shared memory. Processes request resources to complete their tasks; these requests must be dealt with in order to reach termination and prevent deadlocks from happening.

A deadlock is a situation where two or more processes are requesting for resources that both have and won't release till the other releases first. Also for a deadlock to occur the following three properties must hold which are exclusive use of a resource, non-preemptive scheduling and circular wait by a set of at least two processes.

The processes requests are modeled in directed graphs called the Wait-For-Graphs WFG, they represent the processes as nodes and edges form the requests with the direction representing from where the resource is needed. A knot is a sub-graph in a directed graph such that every node in the sub-graph is reachable from any other node in the sub-graph (strongly connected), and no node outside the sub-graph is reachable by any node in the sub-graph. Also we need to define what a cycle is; it's a closed directed walk, with no repeated nodes allowed, also a group of strongly connected nodes is called a 'cluster'.

Deadlocks are represented in WFG as either by cycles or knots depending on the request model, discussed later. Deadlocks occur frequently and cause the system to crash or become inoperable; some of the causes are congestion of networks and shortage of memory. As an example, store-and-forward networks, packets at switching nodes are blocked if there is no empty buffer at the next switch, this occurs when there's a knot in the buffer graph of the switching nodes. In distributed databases if the transaction WFG contains a knot, then we also have a deadlock. The real problem is finding efficient knot detection algorithms which are not easy to come up with.

Knot detection is an important problem that has been the subject of many scholars from the early 1980's. Many knot detection algorithms were proposed in the literature for solving the knot detection problem, the algorithm presented by Misra and Chandy in [1] forms the basis of the three algorithms summarized in this paper. The Misra-Chandy algorithm was an improvement over Dijkstra's algorithm [2] both of which are based on diffusing computations. Diffusing computation algorithms is where computations start from a starter or an initial node and then diffuse to all other nodes in the graph. The three algorithms summarized in this paper are chronically listed, first algorithm summarized is I. Cidon's algorithm, then we'll present the algorithm by A. Boukerche and C. Tropper, finally we'll see D. Manivannan and M. Singhal's algorithm. Afterwards we'll compare all these three different approaches and see their contributions and drawbacks, and suggest some improvements. Finally the paper ends with the conclusion section then the references section.

THE Underlying Models

The two models that are assumed are the request model and the network model. These models give us a set of assumption about what the underlying layer provides, in such a way the three algorithms will take for granted all of these assumptions, i.e. services that are done for the algorithm.

The network consists of communicating nodes N and a set of bidirectional edges E, that form a FIFO channel, meaning no loss, reorder, duplication or generation of messages. Messages have no bound on their arrival times but it's always finite. Node i has an outgoing edge from i to j (i,j) meaning there's a request pending from node i to node j. Also for j the edge (i,j) is called an incoming edge, meaning node i is waiting for node j to get its request.

We have two request models, the AND model and the OR model. In the AND model a node simultaneously requests multiple resources and remains blocked until it gets all of its resources. Thus having only a cycle is sufficient for a deadlock. In the OR model a node requests simultaneously many resources and remains blocked until one of these resources is granted. Cycles don't cause deadlocks in this model but knots do.

Knot Detection algorithms

classes

In the literature, knot detection algorithms are divided in to three main groups or classes. Each class follows a methodology that is very different from the other, the three main classes are:

Complete Topology: At each node the complete distributed graph topology is collected by extensive message passing "flooding messages". This class is considered to be centralized; also it's simple to implement but very inefficient in terms of memory and message complexity.

Diffusing Computation: Each node independently searches whether it's a member of a knot or a cycle. It's significantly more complex than the first class. This approach is used by the algorithm proposed by Misra and Chandy. Also the second and third algorithms summarized in this paper are of the same class also.

Cluster Detection: Cycles of clusters are detected and merged together to form bigger clusters. This is because a knot is a strongly connected sub-graph with no directed edges away from it. While merging clusters together, if a cluster has no directed edge directed outside, then a knot is detected. This class is complex and slow, the first algorithm summarized in this paper is from this class.

summarized Papers

We shall summarize the three papers in this section. The major algorithms presented in those papers are summarized, for more details please refer to the original papers, they can be found in [3], [4] and [5] respectively.

AN EFFICIENT DISTRIBUTE KNOT DETECTION ALGORITHM

Israel Cidon

Published 1989, it has a message complexity of O (n log n + e) and O (e + n log n) of space complexity, where n is the number of nodes and e is the number of edges. The main idea of the algorithm is to assign to each node a value that represents the state. The final state can be either knot, indicating that the node is a member of a knot, or free, indicating it isn't a member of any knot. A single node without any outgoing edge is single node knot, and all single node knots are immediately detected and their state is set to knot. Knots' nodes are strongly connected (clusters), and therefore a knot must contains a closed path (cycle). The algorithm looks for cycles of clusters and merges them into bigger clusters. New clusters are formed by merging all clusters in a cycle, if these new clusters have no edges directed outside then we have a knot in the WFG.

At the beginning, each cluster consists of a single node. The algorithm consists of two steps which are repeated. First, within each cluster, a single outgoing edge is selected. In the second step, it is checked whether this edge is directed to either a knot or a free node. If the edge is going to a free node, then we have a free cluster meaning that all nodes in the cluster are set to free and no more detection is needed. We repeat this until all possible clusters are set to free or knot. In the other case that the cluster has outgoing edge which isn't directed to a knot or free node, then a cycle of clusters detected. All clusters of this cycle are merged to form a single cluster. We repeated until all nodes in the graph are either in the free or knot state.

The main two parts of the algorithm are the detection of cycles of clusters and the combination of these cycles to a single cluster. First, the cycle detection, it's done by having nodes pass composed ID's through the edges of the cycle and have the maximum ID node receive back its own ID. The cycle combination is done by the maximum ID node that establishes a tree structure where all nodes of the cluster report to it. For example node i starts the cycle detection by choosing an arbitrary outgoing edge and designating it as a selected outgoing edge. The node to which this edge is incoming is the father of i and i is the son of this node. Also a node may have several sons, but only one father. If i's father is in a knot or a free state, i will set its state to free. This leads to nodes which have a directed path of selected outgoing edges to a knot node to be free. The nodes forward maximum IDs to their sons, whenever a node receives an ID which is higher than any other IDs received so far, even its own ID, it's recorded and sent to all sons. When a node receives its own ID from its father, it realizes that it belongs to a cycle and has the maximum ID in that cycle. This information is forwarded to all of its outgoing edges.

To reduce the worst case communication cost, node IDs must cover enough nodes so that the total number of ID's forwarded by a node is in the worst case a logarithmic order of the number of nodes. A composite ID CID, consists of two numbers, a phase number PN and a node cluster ID or NCD. CIDs are compared lexicographically, meaning that a CID1 is strictly smaller than CID2 if PN1 is smaller than PN2. Also It's simply smaller if PN1 = PN2 and NCD1 is smaller than NCD2, and vice versa. Whenever the father's CID is strictly larger than that its sons, his CID is accepted his sons. Doing this procedure is not enough for covering all nodes of the cycle with the same CID, whenever all nodes have the same phase number but not the same node CID, it will terminates.

PN is incremented in order to cover the entire cycle with the same CID. Clusters that are covered by the same CID form a directed tree of selected outgoing edges. The root of the directed tree is the node or cluster who's ID covers the nodes of this 'directed tree' which is called segment. The father of its root node is the father of the segment; the sons of the segment are the sons of its leaf nodes. A segment is a local maxima segment if its father and at least one of the sons of the segment have CIDs simply smaller than its CID. When a local maxima is detected, we'll increment its PN by one from the root node. The new CID replaces the current one across the segment. Then we repeat this till only one local maxima segments increment its PN. The new CID will always cover at least one additional segment currently covered by the previous PN, and the local maxima segment won't be covered by a different CID with the new PN, because the segment's father CID is smaller than the segment's CID, and because for each segment only a single CID is accepted, this bounds the number of different CIDs accepted by a node to a logarithmic order of the total number of nodes. Cycle detection is done whenever the root node detects that its father is covered by its own CID. Another part of the algorithm is the distributed coordination between nodes of clusters and segments. The coordination of a segment is easy since each segment forms a rooted tree. The decision whether to increment the PN is taken at the root node. The root node informs its father when its CID is found to be simply smaller than that of the father. The leaf nodes report to the root node upon the comparison results at their sons, which may be roots of other segments. If the root node decides to increment the PN, it broadcasts the new CID to all the nodes of the segment upstream the rooted tree.

The merging of clusters is more complex than the coordination of segments. For each cluster, a directed control tree is maintained, this tree is rooted the cluster leader, a special node similar to the root node in the segments. The cluster leader serves as control point of the cluster to which the information is brought and decisions are made. The cluster leader coordinates the search for new outgoing edges in the cluster. Unlike the root of a segment and in order to save communication cost, some local decisions are taken by nodes of the cluster without informing the leader. When all edges are tested, the leader then signals a search termination message to all other nodes.

A formal pseudocode is given in a 1986 IBM research paper done by I. Cidon; unfortunately we couldn't find this paper on the internet. The fig. below shows how clusters are merged to a single cluster.

Figure 1 Cluster Merging (from [3])

A Distributed Graph Algorithm for the Detection of Local Cycles and Knots

Azzedine Boukerche and Carl Tropper

Published in1998; this algorithm is from the second class, the "diffusing computation" class. The communication complexity is O (2e), where e is the number of edges, and the space complexity is O (n log n), where n is the number of nodes. The diffusing computation algorithm initiator node will start the search whether it's in a knot or a cycle. Many processes may start their own searches of the same knot; nodes can distinguish between different searches based upon the initiator. This algorithm sends messages only to the initiator's successors, thus reducing the messages involved in the search. Each message has the identity of its sender, its destination and a type. A process can be in one of the two modes: asleep; not participating in the 'search' and awake; when it's participating.

Each process has a status variable that indicates whether it's a g-leaf, in a cycle, or that it hasn't completed its search. We say that a process i is a g-leaf if it has no descendants or it's in a knot and it has an incoming edge from a node which is outside the knot. The algorithm terminates when the initiator receives replies from all of its successors. The final state can be either a knot, a knot is detected, or a cycle only, a member of a cycle, or neither, a member of neither a knot nor a cycle. Two types of messages used are request and reply; the request is used to search the initiator's graph to identify the descendants set. The search goes on recursively if the process receiving the request is asleep. The reply reports the results of the search to the initiator, they'll return to the initiator eventually. Request has the ID of the process which forwards the message, while the reply has the ID of the sender, a set of processes that are the originators of the 'incomplete_search' messages, and a type that has one of the following values: cycle_only, in the same cycle, but don't belong to a knot, incomplete_search, the descendants haven't completed their search yet, leaf, a leaf or a g-leaf. The 'incomplete_search' messages are used for speed up, as processes send replies of type 'incomplete_search' to the sender instead of blocking, as in other diffusing computation approaches.

Let's assume that node p initiates the algorithm, by sending a request <Request, p> to all of its successors. Then when a successor node 'k' upon receiving the request from p, if it's asleep then it sends <Request, k> to all of its successors. But if it's awake, and not finished then sends back a reply message <Reply, incomplete_search, k, S (= {k})> to p. When p receives <Request, p> from its predecessor, then p recognizes that it's in a cycle and sends <Reply, cycle, p, S> to its predecessor, where S = Ø, if p received the request for the first time, otherwise S = {p'}, this helps distinguish between single cycles and disjoint cycles. If p receives all replies of type cycle then p is in a knot. Also when k receives all of the replies from its successors, it sends a reply<Reply, type, k, S> to p, the process from which k received the first request. The type of the reply which k sends to its parent depends on the types of replies which k received. If k is a leaf then it sends a reply <Reply, leaf, k, S (=Ø)> to the p. If k received at least one reply of type cycle and at least one of type leaf, then k sends a reply < Reply, cycle_only, k, S (=Ø)> to p. Then k can't be in a knot because of the leaf reply. Also, if k receives at least one reply of type cycle_only, then k sends a reply < Reply, cycle_only, k, S (=Ø)> to p. When k sends a request to its descendants then p sends a request, k will sends a reply <Reply, incomplete_search, pk, S (= {k})> to p. There are then two possibilities, either the reply will encounter a g-leaf node on the return path that sends an 'incomplete_search' message earlier or it doesn't encounter such a node. If it encounters a g-leaf along the path, then k receives a reply with the set S = {k}. If the reply encounters a node j on the return path which has sent an 'incomplete_search' message, a "marked element" j is placed in the set S of the reply which departs from j. In this case j is sent in the reply to "annihilate" the 'incomplete_search' sent earlier by j. The marked element j determines if a process k is a g-leaf. Consequently, a path of nodes started by the initiator then to the successors is formed; each member sends a reply of type {cycle_only, leaf} or {cycle, incomplete_search}. The 'incomplete_search' message sent earlier by j will annihilate at one of j's ancestors.

The contents of the set S is the set Skj at k containing all the IDs of those descendants of j which are the originator processes of the 'incomplete_ search' messages sent to k via j. When k receives a reply from j with a set S, k updates the set Skj, which helps in distinguishing between the 'incomplete search' messages. The messages will identify nodes that belong to a cycle from those nodes which are leaves. This helps to determine whether k is a g-leaf or not. When k terminates its search, k checks whether all of its incomplete search messages are annihilated, meaning that all elements in Skj are marked, then k sends a reply to its parent with the set S. A reply of type <Reply, incomplete_search, k, S> is sent to parent of k if all replies received are of type 'incomplete_search' and the set Skj contains at least one element which isn't marked. Otherwise, if all of the elements of Skj are marked, then k concludes that it's a g-leaf, thus k sends a reply of type leaf to its parent. The computation will terminate when the initiator's receives all replies from its successors. If the received replies of type cycle or replies of type cycle and 'incomplete_search' and all the elements of S are marked, then a knot is detected. A cycle is detected if p received at least one reply message of type 'cycle_only, otherwise, p is neither in a knot nor a cycle.

The pseudo-code is given below; it's taken from the original paper. The algorithm has two parts the initiator's algorithm and the receiver's algorithm:

Symbols

(+) denotes XOR operation

+U denotes a set union with the marked elements taking the place of unmarked elements

Algorithm at p (initiator)

A1. Initially:

Mode = Awake; Status = Undefined;

Num_Sucp= Num of Successors of P;

Send <Request, P> to all successors of P

Upon receiving <Request, Pi>

A2. If p has sent earlier a cycle message

S = {p'}

// this will help to distinguish between single and disjoint cycles

Else

S = Ø

End If

A3.

Send <Reply, cycle, p, S > to i.

Upon receiving <Reply, type, i, S>

A3. Num_Sucp - - ;

Spi = Spi (+U) S

A3.1. Status = Status (+) type;

// type (+) cycle_only = cycle_only, cycle (+) leaf = cycle_only, cycle (+) incomplete_search = cycle, leaf (+) incomplete_search = leaf

If (Num_Sucp= 0) // Computation will terminates

Begin

If (Status = cycle_only)

Then p is in a cycle.

A3.2. Else If (Status = cycle and (+U) Sij = Ø)

Then p is in a cycle only

A3.3. Else If (Status = cycle) // all elements Sij are marked

Then p is in a knot.

Else p is neither in a knot nor a cycle.

EndIf

EndIf

EndIf

EndBegin

Algorithm at k (not an initiator)

B1. Initially:

Mode = sleep; Status = Undefined;

Num_Suck = 0;

Upon receiving <Request, p>

B2. If k has terminated its search earlier

Send to p // this reply message k sent when it terminates its search

Else

B3. If Mode = awake

// k is already in the cycle/knot detection algorithm

Send <Reply, incomplete_search, k, {k}> to p;

Else // k isn't participating in the knot/cycle detection algorithm and it's the first time k is reached from p

B3.1. If k has no successors // k is a g-leaf node

Send <Reply, leaf, k, Ø > to p

Else

parentk = p; Mode = awake;

Num_Suck = Num of Successors of k;

Send <Request, k> to all successors of k;

EndIf;

EndIf;

EndIf;

Upon receiving <Reply,type, j, S>

B4. Num_Suck - - ;

Skj = Skj +U S

Status = Status (+) type

B5. If Num_Suck = 0 // process k terminates its search

Begin

S = +Uj Skj

B5.1 If k has sent earlier an incomplete search message

S = S U {k'}

// k' will annihilate incomplete_search messages previously sent by k. This'll help determine

the leaves in the graph

B5.2 If (Status = leaf) or (Status = incomplete_search)

And (all elements in S are marked)

// the two uses of marked messages are identification of a g-leaf and making sure that all of k and p's successors have sent replies

Send <Reply, leaf, k, Ø > to parentk;

// k is a leaf

B5.3 If (Status = incomplete_search)

And (S has at least one element not marked)

Send <Reply,incomplete_search, k, S>

// to parentk

B5.4 If (Status = cycle_only)

Send <Reply,cycle_only, k, Ø > to parentk;

B5.5 If (Status = cycle)

If k receives all replies of type cycle

Send <Reply,cycle, k, S> to parentk;

Else If all elements in +Uj Skj are marked

Send <Reply,cycle_only, k, S> to parentk;

Else

Send <Reply,cycle, k, S> to parentk;

EndIf

EndIf

EndIf

EndBegin

Figure 2 P4 is a g-leaf (from [4])

AN EFFICIENT DISTRIBUTED ALGORITHM FOR DETECTION OF KNOTS AND CYCLES IN A DISTRIBUTED GRAPH

D. Manivannan and Mukesh Singhal

The third algorithm was published in 2003; it is of the second class also. This algorithm's main advantage is that it detects which nodes are involved in the cycles or knots. The communication complexity of this algorithm is O (2e) where e is the edges in the reachable graph, and a delay of 2(d + 1) messages hopes where d is the diameter of the reachable graph. The reachable graph means all the paths of directed edges that can be traversed by any node. A node j is reachable from node i if there is a directed path in the sub-graph from i to j. Again node i the 'initiator', initiates the algorithm. Node i is in a knot if and only if every node that is reachable from i lies on a cycle containing it. We only need to check if every node that is reachable from i is on a cycle that passes through it. The initiator diffuses detect knot messages to all nodes in the reachable part of the graph; meaning detect knot messages sent to all of i's immediate successors. Every node receiving the detect knot message propagates it to all its successors. The edges along which the first detect knot messages are received form a directed spanning tree or DST rooted at the initiator. If the initiator i receives the same detect knot message back from node j, i sends a cycle acknowledgment (ack) to j. The cycle ack will inform j that it's on a cycle with the initiator i.

All the nodes in the DST that lie on a path from the initiator i to j are also on a cycle with the initiator. An efficient way to determine if a node in the DST lies on a path that passes from i to a node that is known to be on a cycle with i; is that after a node knows that it's on a cycle with i, it sends a cycle ack to all the nodes from which it receives a detect knot message afterwards. Thus the nodes receiving the message know that they are on a cycle with i. However, a node k may receive a detect knot message from i, then it receives another detect knot message from j before knowing if it's on a cycle with i; in this case, k sends a seen ack to j as soon as it receives the detect knot message. When j receives the seen ack from k, it includes the ordered pair (j, k) in a local variable called Seenj, so that later if k is found to be on a cycle with i, j can conclude from Seenj that j is also on a cycle with i. After j receives acks from all its successors it sends a parent ack to its parent, this message will include a list of nodes that are known to j to be on a cycle with i, as well as the local variable Seenj.

This information about the nodes is propagated back in the DST until it reaches the root i. The initiator i keeps the set of all nodes that are known to be on a cycle with it in a local variable Incyclei. It also has the variable Seeni from the parent ack, and seen ack replies received from its successors. If k is in Incyclei, then node i concludes that k is on a cycle with it and removes (k, j) from Seeni and includes k in the set Incyclei. Whenever Seeni becomes empty then i conclude that it's in a knot. Also when i concludes that it's in a knot, then the variable Cyclenodesi contains exactly the IDs of the nodes involved in the knot. Furthermore when i concludes that it's not in a knot, then the variable Cyclenodesi contains the set of nodes that are in cycle with i only but this cycle doesn't form a knot.

When node i need to find out if it's in a knot, it sends a detect knot message to all its immediate successors. The first parameter of the detect knot message is the ID of the initiator and the second is the ID of the node propagating the message. When node k receives the message from node j, it takes one of the following actions, If it's the first detect knot message received, it sets its parent to j and propagates the message to all of its successors. Second action is if it has already seen a detect knot message and Incyclek isn't empty, then it sends cycle_ack(k) to j. When j receives this ack, it includes the ID of k in the set Incyclej. Implying that k is on a cycle with i which implies that j is also on a cycle with i. The third action is when it has already seen a detect knot message and Incyclek is empty, then it sends seen_ack(k) to j. Then j includes the ordered pair (j, k) in its local variable Seenj so that if later k is found to be on a cycle with i, then i can conclude that j is also on a cycle with it.

When i receives a detect knot message back from k, it sends a cycle_ack(i) to k, so that k knows it's on a cycle with i. When node k receives cycle_ack(i) for a detect knot message, it includes the ID of i in the set Incyclek so that k can conclude that it's on a cycle with the i. When node k receives seen_ack(j) for a detect knot message from j, it includes the ordered pair (k, j) in the variable Seenk, which implies that j is an immediate successor of k and k is reachable from i. When this information reaches i, if i finds that j is on a cycle with it, then it conclude that k is also on a cycle with it. When k receives a parent_ack(Seenj, Incyclej) from one of its children, it updates its variables Seenk and Incyclek. After k receives acks from all its successors, it includes its ID in the set Incyclek, if it finds at least one of its immediate successors is on a cycle with i, it sends a parent_ack to its parent.

After receiving all acks from all its successors, the variable Cyclenodesi will contain the IDs of all the nodes reachable from i and on a cycle with i. If Seeni becomes empty, which means all the nodes reachable from i are on a cycle with i, then node i declares a "knot detected" signal, and all nodes in this knot will know this information. Also the variable Cyclenodesi contains the IDs of nodes involved in the cycle if no knot is detected, and the IDs of nodes involved in the knot if a knot is detected. The pseudocode is described in the next paragraph as a set of guarded actions.

Variables for process i (initiator)

Outi: set of immediate successors IDs

Seeni = Ø: set of ordered pairs of processes IDs

Incyclei = Ø: set of processes IDs that are in cycle with i

Cyclenodesi = Ø: set of nodes IDs in a cycle with i

Donei = false: will be set to true with termination

Algorithm for process i (initiator)

Send detect_knot(i,i) to all in Outi;

[] ┌ Donei & receive parent_ack(Seenk,Incyclek) from k

Then Outi = Outi - {k};

Seeni = Seeni U Seenk;

If Incyclek = Ø then

Seeni = Seeni U {(i,k)};

Else

Incyclei = Incyclei U Incyclek;

[] ┌ Donei & receive seen_ack(k) from k

Then Outi = Outi - {k};

Seeni = Seeni U {(i,k)};

[] ┌ Donei & receive cycle_ack(k) from k

Then Outi = Outi - {k};

Incyclei = Incyclei U {k};

[] ┌ Donei & receive detect_knot(i,k) from k

Then send cycle_ack(i) to k;

[] ┌Donei & Outi = Ø

Then for each j Incyclei

Incyclei = Incyclei - {j};

Cyclenodesi = Cyclenodesi U {j};

For each k (1 ≤ k ≤ n)

If (k,j) Seeni

Incyclei = Incyclei U {k};

Seeni = Seeni - {(k,j)};

If Seeni = Ø

Declare 'knot detected';

Else

Declare 'knot not detected';

Donei = true;

End of process i's algorithm

Variables for process k (not an initiator)

Has_seenk: Boolean initially false

Outk: set of immediate successors IDs

Seenk = Ø: set of ordered pairs of processes IDs

Incyclek = Ø: set of processes IDs that are in cycle with k

Parentk = Ø: node ID of the parent

Donek = false: Boolean same as donek

Algorithm for process k (not the initiator)

[] receive detect_knot(i,j) from j

Then if ┌ has_seenk

has_seenk = true;

Parentk = j;

Send detect_knot(i,k) to all nodes in Outk;

Else If Incyclek ≠Ø

Send cycle_ack(k) to j;

Else

Send seen_ack(k) to j;

[] ┌ Donek & receive parent_ack(Seenj,Incyclej) from j

Then Outk = Outk - {j};

Seenk = Seenk U Seenj;

If Incyclej = Ø

Seenk = Seenk U {(k,j)};

Else

Incyclek = Incyclek U Incyclej;

[] ┌ Donek & receive seen_ack(j) from j

Then Outk = Outk - {j};

Seenk = Seenk U {(k,j)};

[] ┌ Donek & receive cycle_ack(j) from j

Then Outk = Outk - {j};

Incyclek = Incyclek U {j};

[] ┌ Donek & Outk = Ø & Has_seenk

Then if Incyclek ≠Ø

Incyclek = Incyclek U {k};

Send parent_ack(Seenk, Incyclek) to parent;

Donek = true;

End of process k's algorithm

COMPARISON

Type of the Search

The type of search in the three algorithms vary, the first algorithm's search type is based on detecting and finding clusters, then these clusters are checked if they form a cycle or not. The search is started from a node by sending a TEST message on a selected outgoing edge to another node if this node status is not free or knot then it will send back on the same edge a message containing its CID. Then the node that sent the TEST message checks the CID in that message and checks whether this CID is larger or strictly larger.

This type of search is called edge chasing, where a message is propagated over selected edges. The search is coordinated by the leader which is the root of tree or segment; this is done using a distributed depth-first search starting from the leader. When the leader receives back the TEST message it sends a search terminated message through all of its outgoing edges.

In contrast to the second and third algorithms, these algorithms use the 'diffusing computation' method. That consists of a diffusion of messages on the outgoing edges from the initiator along the edges of the WFG. Then the reply diffusing messages will echo back to the initiator, thus recording a snapshot of the WFG at the initiator. The performance of the diffusing computations is better than the cluster detecting method due to the reduced message complexity. Thus the second and third algorithms have better performance.

Complexities

In terms of complexity will start with I. Cidon's algorithm, it has a message complexity of O (n log n + e) and O (e + n log n) of space complexity, which is relatively high compared to the other two algorithms. The message complexity is measured by number of messages sent in the network; where n is the number of nodes and e is the number of unidirectional edges. The TEST and FREE messages can be sent once over any unidirectional edge at most 2e such messages are sent. The INC, JOIN, SEARCH, and SEARCH TERMINATION messages are sent only over trees once for each phase number. Since there are no more than log n phases, the total number of transmissions is thus bounded by n log n. The space complexity is measured by bits for storing the variables. This implies that except for CID, which contains a node ID, a PN, and a binary flag, all variables need one bit and each edge adjacent to a node. Then the total space cost of is O (n log n + e) bits for the total network.

The second algorithm which is A. Boukerche and C. Tropper's algorithm has a message complexity is O (2e) and a space complexity is O (n log n). Because there is one request message and one reply message sent along each edge involved in the search, when a process it's not participating in the search and it receives a request message for the first time, it will propagate it and send back a reply message. Therefore, the total number of messages sent in the edges of the network is 2e. The size of the request message is O (log n) bits, and the size of the reply is at most O (n log n) bits, where n is number of nodes in the network. Thus making the total space needed approximately O (n log n).

For the third algorithm messages propagate only to nodes that are reachable from the initiator, that may or may not be part of the knot. Also nodes that receive the knot detection message send a single ack back to the initiator. This results in a total of only 2 messages per edge in the reachable graph of the initiator.

Advantages

The three algorithms have unique advantages; first of all, all three algorithms fall in the class of deadlock detection, the other two classes are deadlock prevention and deadlock avoidance. Deadlock detection means that the system regularly detects whether there is a deadlock or not, if so then resolves it. The other two classes which are prevention and avoidance both try to ensure that the system never enter a deadlock state. Deadlock prevention is achieved either by insuring that there is no hold and wait on all resources by making processes acquire resources a prior to their execution, or by allowing preemption, processes with the needed resources are preempted by other processes with higher priorities. The other class is deadlock avoidance; a process only proceeds if the resulting global state is checked to be safe from any possible deadlock.

Beside deadlock detection, the other two classes are inefficient for complex distributed systems. For deadlock prevention, it decreases system level of concurrency by restricting the execution of the processes to avoid at least one of the deadlock conditions. For deadlock avoidance, the process of checking for a safe state is computationally expensive. These inefficiencies lowers performance especially in complex distributed systems with large numbers of processes, resources and communication edges. Also due to lack of perfect global synchronization mechanisms, avoidance and prevention algorithms may lead to deadlocks in the execution of the algorithm itself. Thus deadlock detection is the best suited strategy of the three strategies for deadlocks in distributed systems since it lacks these disadvantages listed above.

We'll look at the main advantages of I. Cidon's algorithm first. This algorithm is based on the property that knots are strongly connected graphs which are clusters, thus message propagation is ensured to reach all nodes in the cluster. The use of CID, where a tree root or segment is formed by the node with the highest ID pass it's CID to all other nodes in this segment and all the other nodes send messages only to the root. This reduces the message complexity and the congestion in the edges.

The second algorithm unlike the other diffusing computation algorithms, it can distinguish between a knot and a cycle. This is an advantage when we consider applying this algorithm for both request models the AND model and the OR model. Also it has a low communication cost; lower than Misra and Chandy's algorithm [1]. The latter has a messages complexity of 4e, while the former has a message complexity of 2e.

The third algorithm uses diffusing computation also and it's deadlock free. Nodes initiating the search and all other nodes reached by the detect_knot message must know their incoming and outgoing edges in the previous diffusing computation algorithms, unlike this algorithm where nodes need not to know anything about the network topology, useful with dynamic networks where the topology is ever changing. Another advantage is that detects exactly which nodes are involved in the knot or cycle, depending on whether the nodes are in a cycle or knot. This is quite useful for deadlock resolution as these nodes are to be rolled back to resolve the deadlock.

Also it's worth mentioning that all three algorithms don't discover phantom deadlocks, a phantom deadlock is not a real deadlock detected by an inconsistent global state realized from an inconsistent snapshot of the system.

Future Developments

Developments were made to overcome the shortcomings of the presented algorithms. Since the execution patterns of these algorithms is extremely large, which leads to the impossibility of exhaustive testing, thus it's very difficult to assess the performance of these algorithms empirically. Correctness proofs are also very difficult to find due to the complexity and timing sensitivity of these algorithms. These are some of the factors that hinder the development in distributed deadlock detection.

Future developments include the reduction in both the space complexity and the message complexity. Also more fault tolerant attributes are desired for future distributed knot detection algorithms, they include crash resistance, when processes fail and also resistance to failing channels. Advanced distributed knot detection algorithms are used in many distributed systems that include distributed DBMS, the internet, GPS, so efficient distributed knot detection algorithms are really needed due to their important role in many applications.

Conclusion

In conclusion we can see that distributed graph algorithms are very complexity, hard to implement and difficult to empirically verify their correctness. Theses algorithms are used for deadlock detection and resolution as they can detect knots and cycles, also cycle removal and unknotting procedures using other algorithms are used to resolve the deadlock, the third algorithm summarized in this paper however also detects the nodes causing the deadlocks so this information can be used to help resolve the situation.

The three algorithms summarized in this paper approach the problem from different angles, although the second and the third algorithms are very similar. The advantages and drawbacks of these algorithms should be considered in developing newer distributed graph algorithms for knot detection. The general trend in these algorithms was the dynamic shift from the early 1980's paradigm of simple diffusing computation to a more elaborate, time and space efficient implementations of the diffusing computation paradigm. These algorithms have evolved from simple detection to detection and resolving of the deadlocks. Finally we can say that these algorithms are vital and are used every day in networks, databases, and even in knowledge based systems with networks and links as its backbone.