The need for high performance data transfer services is becoming ever more critical in today's distributed, data intensive computing applications, such as remote data analysis and distributed data mining. Although efficiency is a common design objective in most network transport protocols, efficiency often decreases as the bandwidth delay product (BDP) increases, and other considerations, such as fairness and stability, make the goal of optimal efficiency harder to realize. Another factor is that many of today's popular protocols were designed when bandwidth was still counted in bytes per second, so their performance was never thoroughly examined in high BDP environments. Implementation also becomes critical to performance as the network BDP increases. A regular HTTP session sends several messages per second, and it does not matter if message processing is delayed for a short time; in data intensive applications, by contrast, the packet arrival rate can be as high as 10^5 packets per second, and any such delay matters. The protocol needs to process each event in a limited amount of time, and inefficient implementations can lead to packet loss or time-outs. The high performance computing community has therefore looked for application level solutions. One common solution is to use parallel TCP connections [51] and tune the TCP parameters, such as window size and number of flows. However, parallel TCP is inflexible because it needs to be tuned for each particular network scenario, and it does not address fairness issues. In this section we review application level protocols for high performance computing, especially for grid applications. Fig. 7 lists the application layer protocols analyzed in this paper.
5.1. Fast Object Based File Transfer System (FOBS)
FOBS is an efficient, application level data transfer system for computational grids [52]. Although TCP is able to detect and respond to network congestion, its aggressive congestion control mechanism results in poor bandwidth utilization even when the network is lightly loaded, which motivates FOBS.
FOBS is a simple, user level communication mechanism designed for large scale data transfers in the high bandwidth, high delay network environments typical of computational grids. It uses UDP as the data transport protocol and provides reliability through an application level acknowledgement and retransmission mechanism. The file to be transferred is divided into data units called chunks; data is read from disk, transferred to the receiver, and written to disk in units of chunks. Each chunk is subdivided into segments, and the segments are further subdivided into packets. Packets are 1470 bytes (within the MTU of most transmission media), and a segment consists of 1000 packets. The receiver maintains a bitmap for each segment in the current chunk recording the received / not received status of each packet in the segment. These bitmaps are sent from the data receiver to the data sender at intervals dictated by the protocol and trigger, at a time determined by the congestion/flow control algorithm, a retransmission of the lost packets. The bitmaps are sent over a TCP socket.
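The per-segment bookkeeping described above can be sketched as follows. This is an illustrative reconstruction, not the actual FOBS source; the class and method names are assumptions.

```python
# Illustrative sketch of FOBS-style per-segment bookkeeping: the receiver
# keeps one bitmap per segment and reports it to the sender, which then
# retransmits exactly the packets marked as missing.

SEGMENT_SIZE = 1000  # packets per segment, as described in the text

class SegmentBitmap:
    def __init__(self, num_packets=SEGMENT_SIZE):
        self.received = [False] * num_packets

    def mark_received(self, packet_index):
        self.received[packet_index] = True

    def missing_packets(self):
        # Indices the sender must retransmit when this bitmap arrives
        return [i for i, ok in enumerate(self.received) if not ok]

bitmap = SegmentBitmap(num_packets=8)
for i in (0, 1, 2, 4, 5, 7):   # packets 3 and 6 were lost
    bitmap.mark_received(i)
print(bitmap.missing_packets())  # -> [3, 6]
```

In the real system the bitmap travels over the TCP control socket; here it is simply queried in place.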
5.2. Light Object Based File Transfer System (LOBS)
LOBS is a mechanism for transferring files in high performance computing networks, optimized for high bandwidth delay networks and especially for computational grids, whose core requirement is the ability to transfer vast amounts of data efficiently. LOBS rectifies the problems TCP exhibits as a data transfer mechanism for grid based computations. LOBS is used to transfer large files between two computational resources in a grid; the mechanism is lightweight and does not support all the functionality of GridFTP, only the primary functionality required for computational grids (i.e. fast and robust file transfer). LOBS optimizes for performance and does not preserve the order in which data is delivered. LOBS is built directly on top of FOBS [54]. The TCP window size plays a vital role in achieving the best performance in high bandwidth delay networks, which makes it necessary to tune the size at runtime. In LOBS, the TCP window size is tuned using a different approach, namely a UDP stream. UDP is used for these reasons: i) it operates at user level, not kernel level, ii) it avoids multiplexing TCP streams at kernel level, and iii) it allows user level enhancements.
Two protocols closely related to LOBS are RBUDP [47] and SABUL [55]. The primary differences between these protocols are how packet loss is interpreted and how the impact of packet loss on the behavior of the protocol is minimized.
SABUL assumes that packet losses indicate congestion and reduces its rate based on the perceived congestion, whereas LOBS assumes that some packet loss is inevitable and does not change the sending rate. The primary difference between LOBS and RBUDP lies in the type of network for which each protocol is designed.
The basic working of LOBS is as follows: the sender creates threads to control its data buffer, read the file from disk and fill the data buffer. Once the buffer is full, it is transferred to the client over the network. While the data is in transit, the other threads continue reading data from the file into the data buffer, and these steps are repeated until the whole file is transferred. The goal in LOBS is to overlap network I/O and disk I/O operations to the largest extent possible.
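The overlap of disk and network I/O described above can be sketched as a double-buffered pipeline. This is a minimal illustration (not the LOBS source): one thread fills buffers from "disk" while the main loop drains them over the "network".

```python
# Illustrative double-buffered transfer: a reader thread fills a bounded
# queue of chunks (standing in for disk reads) while the main loop drains
# it (standing in for UDP sends), so the two kinds of I/O overlap.
import queue
import threading

def _reader(chunks, buffers):
    for chunk in chunks:          # stands in for reading the file from disk
        buffers.put(chunk)
    buffers.put(None)             # end-of-file marker

def transfer(chunks, max_buffered=2):
    sent = []
    buffers = queue.Queue(maxsize=max_buffered)
    t = threading.Thread(target=_reader, args=(chunks, buffers))
    t.start()
    while True:
        chunk = buffers.get()
        if chunk is None:
            break
        sent.append(chunk)        # stands in for the network send
    t.join()
    return sent

print(transfer([b"c0", b"c1", b"c2"]))  # -> [b'c0', b'c1', b'c2']
```

The bounded queue is what makes this double buffering rather than unlimited read-ahead: the reader blocks once `max_buffered` chunks are waiting.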
5.3. Simple Available Bandwidth Utilization Library (SABUL)
SABUL [55, 56] is an application level library designed to transport data reliably for data intensive grid applications over high performance networks. SABUL uses UDP for the data channel and detects and retransmits dropped packets; using TCP as the control channel reduces the complexity of the reliability mechanism. To use the available bandwidth efficiently, SABUL estimates the available bandwidth and recovers from congestion events as soon as possible. To improve performance, SABUL does not acknowledge every packet, but instead acknowledges packets at a constant time interval, an approach called selective acknowledgement [57]. SABUL is designed to be fair, so that grid applications can employ parallelism, and so that all flows ultimately reach the same rate, independent of their initial sending rates and of the network delay. SABUL is implemented as an application layer library so that it can be deployed easily, without any changes to operating system network stacks or to the network infrastructure.
SABUL is a reliable transfer protocol with a loss detection and retransmission mechanism. It is lightweight, with a small packet size and low computation overhead, so it can be deployed easily in public networks, and it is also TCP friendly. In SABUL, both the sender and the receiver maintain a list of the lost sequence numbers, sorted in ascending order. The sender always checks the loss list first when it is time to send a packet. If the list is not empty, the first packet in it is resent and removed; otherwise the sender checks whether the number of unacknowledged packets exceeds the flow window size and, if not, packs a new packet and sends it out. The sender then waits for the next sending time decided by the rate control. The flow window serves to limit the number of packets lost upon congestion, when the TCP control channel reports increased delay, and the maximum window size is bounded by the flow control algorithm.
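The sender-side decision just described can be sketched as a single dispatch function. This is an illustration of the described logic, not the SABUL implementation; names and the window check are assumptions.

```python
# Illustrative SABUL-style sender decision: retransmit the oldest entry in
# the loss list first; otherwise send a new packet only if the number of
# unacknowledged packets is still within the flow window.

def next_packet(loss_list, next_seq, last_acked, flow_window):
    if loss_list:                            # loss list is checked first
        return ("retransmit", loss_list.pop(0))
    if next_seq - last_acked < flow_window:  # room in the flow window?
        return ("new", next_seq)
    return ("wait", None)                    # wait for the next sending time

print(next_packet([7, 9], 20, 15, 16))  # -> ('retransmit', 7)
print(next_packet([], 20, 15, 16))      # -> ('new', 20)
print(next_packet([], 20, 15, 4))       # -> ('wait', None)
```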
After each constant synchronization (SYN) time, the sender triggers a rate control event that updates the inter-packet time. The receiver receives and reorders data packets. The sequence numbers of lost packets are recorded in the loss list and removed when the resent packets arrive. The receiver sends back an ACK periodically if any new packets have been received; the ACK interval is the same as the SYN time, so the higher the throughput, the fewer ACK packets are generated. A NAK is sent as soon as loss is detected. The loss is reported again if the retransmission has not been received after k*RTT, where k is initially 2 and is incremented by 1 each time the loss is reported. The loss information carried in a NAK is compressed, since losses are often continuous. In the worst case there is one ACK for every received DATA packet (if the packet arrival interval is not less than the SYN time), and M/2 NAKs for every M sent DATA packets (when every other DATA packet is lost).
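The NAK compression mentioned above exploits the fact that losses are often continuous. A minimal sketch (illustrative, not the actual wire format): consecutive lost sequence numbers are collapsed into (first, last) ranges before being reported.

```python
# Collapse runs of consecutive lost sequence numbers into ranges, so a
# burst loss costs one range entry in the NAK instead of one entry per
# packet.

def compress_losses(lost_seqs):
    ranges = []
    for seq in sorted(lost_seqs):
        if ranges and seq == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], seq)   # extend the current run
        else:
            ranges.append((seq, seq))           # start a new run
    return ranges

print(compress_losses([100, 101, 102, 103, 250]))  # -> [(100, 103), (250, 250)]
```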
5.4. UDP based Data Transfer Protocol (UDT)
UDT [58, 59] is a high performance data transfer protocol, intended as an alternative for TCP where TCP's performance degrades. The goal of UDT is to overcome TCP's inefficiency in high BDP networks; it is a connection oriented, unicast, duplex protocol. The congestion control module is open, so that different control algorithms can be deployed in place of the native control algorithm, which is based on AIMD. Rate control tunes the inter-packet time at every constant interval, called SYN. The value of SYN is 0.01 seconds, an empirical value reflecting a trade-off among efficiency, fairness and stability. In every SYN time, the packet loss rate is compared with that of the last SYN time; when it is less than a threshold (the maximum possible link Bit Error Rate, BER), the number of packets to be sent in the next SYN time is increased according to

inc = max( 10^(ceil(log10((B - C) * MTU * 8))) * β / MTU, 1/MTU )

where B is the estimated bandwidth and C is the current sending rate, both in packets per second, and β is a constant value of 0.0000015. MTU is the maximum transmission unit in bytes, which is the same as the UDT packet size. The inter-packet time is then recalculated using the total estimated number of packets to be sent during the next SYN time. The estimated bandwidth B is probed by sampling pairs of UDT data packets. UDT is designed for scenarios where a small number of sources share abundant bandwidth; in other scenarios, e.g., messaging or low BDP networks, UDT can still be used but there may be no improvement in performance.
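The increase rule can be computed directly from B, C and the MTU. The sketch below is a reconstruction from the description above (consult the UDT papers for the authoritative form); the function name is an assumption.

```python
# UDT-style rate increase per SYN interval: the number of additional
# packets depends on the order of magnitude of the unused capacity
# (B - C), expressed in bits per second.
import math

BETA = 0.0000015  # the constant given in the text

def packets_increase(B, C, mtu=1500):
    if B <= C:                              # no headroom: minimal increase
        return 1.0 / mtu
    gap_bps = (B - C) * mtu * 8             # unused capacity in bits/s
    inc = math.pow(10, math.ceil(math.log10(gap_bps))) * BETA / mtu
    return max(inc, 1.0 / mtu)

# e.g. 10 Mbit/s of headroom on a 1500-byte MTU gives roughly one extra
# packet per SYN:
print(packets_increase(B=100000, C=90000))
```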
5.6. Lambda Stream
Network researchers have reached a consensus that current TCP implementations are not suitable for long distance, high performance data transfer: either TCP must be modified radically or new transport protocols must be introduced. In Long Fat Networks (LFNs), where round-trip latencies are extremely high, this latency results in gross bandwidth under-utilization when TCP is used for data delivery. Several solutions have been proposed. One is to provide revised versions of TCP with better utilization of the link capacity; another is to develop UDP-based protocols to improve bandwidth usage. The Simple Available Bandwidth Utilization Library (SABUL) [55], Tsunami [50], Reliable Blast UDP (RBUDP) [47] and the Group Transport Protocol (GTP) [62] are recent examples.
LambdaStream is an application-layer transport protocol [60]. Its key characteristics include a combined loss recovery mechanism and a special rate control that avoids the packet loss inherent in other congestion control schemes. To utilize bandwidth efficiently and converge quickly to a new state, the protocol sets the initial sending rate to the link capacity divided by the maximum number of flows, a figure easily obtained in a dedicated network.
It adapts the sending rate to dynamic network conditions while maintaining a constant sending rate whenever possible. One advantage of this scheme is that the protocol avoids deliberately provoking packet loss when probing for available bandwidth, a common strategy used by other congestion control schemes. Another advantage is that it significantly decreases fluctuations in the sending rate. As a result, streaming applications experience small jitter and react smoothly to congestion. Another important feature is that the protocol extends congestion control to encompass an end-to-end scope. It differentiates packet loss and updates the sending rate accordingly, thus increasing throughput.
LambdaStream builds on experience with a high performance networking protocol called Reliable Blast User Datagram Protocol (RBUDP), which transports data over UDP and control packets over TCP. In LambdaStream, the congestion control scheme is designed to decrease jitter and improve on RBUDP's adaptation to network conditions. LambdaStream is an application-layer library for two reasons. First, an application-layer tool makes development easier and simplifies deployment for testing purposes. Second, an application-layer protocol can measure end-to-end conditions as applications actually experience them, allowing the protocol to distinguish types of packet loss and avoid unnecessarily throttling throughput.
The key characteristics of the congestion control in LambdaStream are: it is rate based [61], it uses receiving interval as the primary metric to control the sending rate, it calculates rate decrease/increase at the receiver side during a probing phase, and it maintains a constant sending rate after probing for available bandwidth. LambdaStream uses the receiving interval as a metric because 1) the receiving interval is closely related with the link congestion and the receiver's processing capability; 2) the receiving interval can be used to detect incipient congestion and loss differentiation.
The congestion control is composed of two parts. One part distinguishes the cause of a packet loss and adjusts the sending rate accordingly, thus avoiding unnecessary throttling of the sending rate. The other part updates the sending rate based on the ratio between the average receiving interval and the sending interval: incipient congestion leads to a higher ratio, which triggers the protocol to decrease the sending rate, while the protocol increases its sending rate if the ratio is close to one and the available bandwidth is greater than zero.
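The ratio rule can be sketched as follows. This is an illustration of the described behavior, not the LambdaStream source; the tolerance and step values are assumptions.

```python
# Ratio-based rate update: the receiver compares the average receiving
# interval with the sending interval. A ratio well above 1 means packets
# arrive slower than they are sent (incipient congestion); a ratio near 1
# with spare bandwidth allows a probe upward without forcing packet loss.

def update_rate(rate, avg_recv_interval, send_interval,
                available_bw, tolerance=0.05, step=0.1):
    ratio = avg_recv_interval / send_interval
    if ratio > 1 + tolerance:          # incipient congestion: back off
        return rate * (1 - step)
    if available_bw > 0:               # ratio near 1 and headroom: probe up
        return rate * (1 + step)
    return rate                        # otherwise hold the current rate

print(update_rate(1000.0, 1.2, 1.0, available_bw=50))  # rate decreases
print(update_rate(1000.0, 1.0, 1.0, available_bw=50))  # rate increases
print(update_rate(1000.0, 1.0, 1.0, available_bw=0))   # rate held
```

Because the rate is held constant whenever the ratio stays near one, this scheme produces the small rate fluctuations and low jitter described above.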
When the protocol detects a packet loss, it first checks its average receiving delay and the loss spacing. If the loss is due to serious congestion or a continuously low receiver capacity, the receiver decreases the sending rate and sends this feedback to the sender; otherwise it neglects the packet loss and does not update the sending rate.
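The loss-handling logic described above can be reconstructed as a small sketch (illustrative only; the names, thresholds and decrease factor are assumptions, not the original pseudo-code).

```python
# Loss differentiation: loss accompanied by high average receiving delay
# or closely spaced losses is treated as congestion / low receiver
# capacity; an isolated loss is ignored and the rate is left unchanged.

def on_packet_loss(rate, avg_recv_delay, loss_spacing,
                   delay_threshold, spacing_threshold, decrease=0.125):
    congested = (avg_recv_delay > delay_threshold or
                 loss_spacing < spacing_threshold)
    if congested:
        new_rate = rate * (1 - decrease)   # decrease the sending rate
        return new_rate, True              # True: send feedback to sender
    return rate, False                     # transient loss: keep the rate

print(on_packet_loss(800.0, 2.5, 3, delay_threshold=2.0,
                     spacing_threshold=10))  # -> (700.0, True)
```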
The throughput of LambdaStream converges very well for a single flow, even for initial sending rates between 172 Mbps and 1720 Mbps. The protocol manages to maintain the throughput at an almost fixed sending rate of about 950 Mbps.
LambdaStream extends the congestion control to encompass an end-to-end scope. It distinguishes types of packet loss and adjusts the sending rate accordingly. The protocol also applies a ratio sampling approach to detect incipient congestion and combines it with a bandwidth estimation method for proactively probing for an appropriate sending rate. The experimental results show that LambdaStream achieves 950 Mbps throughput in a 1 Gbps channel. It exhibits small application-level jitter and reacts smoothly to congestion, making it well suited to streaming applications; it also works well for continuous data streams of varying payloads.
5.7. Group Transport Protocol
Group Transport Protocol (GTP) is a receiver-driven transport protocol that exploits information across multiple flows to manage receiver contention and fairness. Its key novel features include 1) a request-response based reliable data transfer model with flow capacity estimation schemes, 2) receiver-oriented flow co-scheduling and max-min fair rate allocation, and 3) explicit flow transition management.
GTP is designed to provide efficient multipoint-to-point data transfer while achieving low loss and max-min fairness among network flows [62]. In a multipoint-to-point transfer pattern, multiple endpoints terminate at one receiver and present an aggregate capacity much higher than the receiver can handle. In a sender-oriented scheme (e.g. TCP), this problem is more severe because the high bandwidth-delay product of the network makes it difficult for senders to react to congestion in a timely and accurate manner. To address this problem, GTP employs receiver-based flow management, which locates most of the transmission control at the receiver side, close to where packet loss is detected. Moreover, GTP's receiver-controlled rate-based scheme, in which each receiver explicitly tells senders the rate they should follow, allows flows to be adjusted as quickly as possible in response to detected packet loss.
To support multi-flow management and enable efficient and fair utilization of the receiver capacity, GTP uses a receiver-driven centralized rate allocation scheme. Receivers actively measure the progress (and loss) of each flow, estimate the actual capacity of each flow, and then allocate the available receiver capacity fairly across the flows. Because GTP's approach is receiver-centric and rate-based, it manages all the senders of a receiver and enables rapid adaptation to flow dynamics, adjusting as flows join or terminate.
GTP is a receiver-driven request-response protocol. As with a range of other experimental data transfer protocols, GTP uses lightweight UDP (with an additional loss retransmission mechanism) for bulk data transfer and a TCP connection for exchanging control information reliably. The sender side design is simple: send the requested data to the receiver at the receiver-specified rate (if that rate can be achieved by the sender). Most of the management is on the receiver side, which includes a Single Flow Controller and Single Flow Monitor for each individual flow, and a Capacity Estimator and Max-min Fairness Scheduler for centralized control across flows.
GTP implements two levels of flow control. For each individual flow, the receiver explicitly controls the sender's transmission rate (by sending rate requests to senders), allowing the flow's rate to be adjusted quickly in response to packet loss detected at the receiver side; ideally, any efficient rate-based point-to-point flow control scheme could be 'plugged in' here. Above this, a centralized scheduler at the receiver manages across multiple flows, dealing with any congestion or contention and performing max-min rate allocation among them. The receiver actively measures per-flow throughput and loss rate, and uses them to estimate bandwidth capacity. It then allocates the available receiver capacity (which can be limited by resources or the final link) across flows. This allocation is done once per control interval in a max-min fair manner, and the senders adjust to transmit at the revised rates.
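The max-min fair allocation step can be sketched with the classic water-filling procedure. This is an illustration of max-min fairness as described above, not the GTP implementation.

```python
# Max-min fair allocation of the receiver capacity across flows: flows
# that demand less than an equal share keep their full demand, and the
# leftover capacity is re-split among the remaining flows, repeatedly.

def max_min_allocate(capacity, demands):
    alloc = [0.0] * len(demands)
    active = list(range(len(demands)))
    remaining = float(capacity)
    while active:
        share = remaining / len(active)
        satisfied = [i for i in active if demands[i] <= share]
        if not satisfied:              # everyone is bottlenecked: equal split
            for i in active:
                alloc[i] = share
            break
        for i in satisfied:            # give small flows their full demand
            alloc[i] = float(demands[i])
            remaining -= demands[i]
            active.remove(i)
    return alloc

print(max_min_allocate(10, [2, 5, 8]))  # -> [2.0, 4.0, 4.0]
```

Here the flow demanding 2 is fully satisfied, and the remaining 8 units are split equally between the two larger flows.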
Implementation results show that for the point-to-point single flow case, GTP performs well, like other aggressive UDP-based transport protocols (e.g. RBUDP, SABUL), achieving dramatically higher performance than TCP with low loss rates. In the multipoint-to-point case, GTP still achieves high throughput, with 20 to 100 times lower loss rates than other aggressive rate-based protocols. In addition, simulation results show that, unlike TCP, which is unfair to flows with different RTTs, GTP responds to flow dynamics and converges quickly to a max-min fair rate allocation.
GTP outperforms other rate based protocols for multipoint-to-point data transmission, reducing the packet loss caused by the aggressiveness of rate based protocols. GTP focuses on receiver based flow contention management; detailed rate allocation and fairness among flows are not yet fully addressed. Work remains on GTP's performance, such as achieving max-min fairness, estimating a TCP flow's capacity, and tuning TCP parameters to achieve a target rate.
5.8. GridFTP
A common data transfer protocol for grids would ideally offer all the features currently available from any of the protocols in use. At a minimum, it must offer the features required by the types of scientific and engineering applications it is intended to support in the grid. To this end, the existing FTP standard was selected and extended with additional features to form a common data transfer protocol, 'GridFTP' [63]. GridFTP is used as a data transfer protocol for transferring large volumes of data in grid computing. It adopts parallel data transfer, which improves throughput by creating multiple TCP connections in parallel, and automatic negotiation of the TCP socket buffer size. GridFTP uses TCP as its transport-level communication protocol [64]. To obtain maximal data transfer throughput, it has to use optimal TCP send and receive socket buffer sizes for the link being used. The TCP congestion window never fully opens if the buffer size is too small; if the receive buffers are too large, TCP flow control breaks down and the sender can overrun the receiver, causing the TCP window to shut. This situation is likely to happen when the sending host is faster than the receiving host. The optimal buffer size is twice the bandwidth-delay product (BDP) of the link: BufferSize = 2 * bandwidth * delay.
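The buffer sizing rule above can be illustrated with a small calculation (the numbers are example values, not measurements from the cited experiments).

```python
# BufferSize = 2 * bandwidth * delay, converted from bits to bytes.

def optimal_buffer_size(bandwidth_bps, delay_s):
    return 2 * bandwidth_bps * delay_s / 8  # bytes

# 1 Gbit/s link with 40 ms one-way delay needs roughly a 10 MB buffer:
print(optimal_buffer_size(1e9, 0.04))
```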
GridFTP is implemented in Globus [65] and uses multiple TCP streams for transferring files. Using multiple TCP streams improves performance for two reasons: i) the aggregate TCP buffer size is closer to the required size, and ii) it circumvents congestion control. Several experiments have analyzed GridFTP [66]. According to a technical report [67], globus_url_copy achieved a throughput very close to 95%. The window size was set to bandwidth*RTT; when more than one TCP stream was used, the window size was set to window size * number of streams. However, to achieve high throughput, the number of TCP connections has to be optimized according to network conditions. Problems persist with file sizes: when the endpoints want to transfer many small files, throughput is reduced. The performance of GridFTP depends on the number of connections used in parallel; the best performance is achieved with 4 connections, and more connections create too much control overhead.
5.9. GridCopy
GridCopy [68], or GCP, provides a simple user interface to this sophisticated functionality and takes care of everything needed to obtain optimal performance for data transfers. GCP accepts scp-style source and destination specifications. If well-connected GridFTP servers can access the source file and/or the destination file, GCP translates the filenames into the corresponding names on the GridFTP servers. In addition to translating the filenames/URLs into GridFTP URLs, GCP adds appropriate protocol parameters, such as the TCP buffer size and the number of parallel streams, to attain optimal performance on the network.
Tools such as ping and synack can be used to estimate end-to-end delay; and tools such as IGI [69], abing [70], pathrate [71], and Spruce [72] can be used to estimate end-to-end bandwidth. Latency estimation tools need to be run on one of the two nodes between which the latency needs to be estimated. For data transfers between a client and server, the tools mentioned above can be used to estimate the bandwidth-delay product. However, in Grid environments, users often perform third-party data transfers, in which the client initiates transfers between two servers.
The end-to-end delay and bandwidth estimation tools cited above are not useful for third-party transfers. King [73], developed at the University of Washington, Seattle, makes it possible to calculate the round-trip time (RTT) between arbitrary hosts on the Internet by estimating the RTT between their domain name servers. GCP uses King to estimate the RTT between the source and destination nodes of a transfer, and it assumes a fixed 1 Gbit/s bandwidth for all source and destination pairs. For example, if King estimates the RTT between the source and the destination to be 50 ms, GCP sets the TCP buffer size to the corresponding BDP of 0.05 s * 1 Gbit/s = 6.25 MB. GCP caches the source, destination, and buffer size in a configuration file in the home directory of the user running GCP. By default, GCP uses four parallel streams for the first transfer between two sites by a user. GCP calculates the TCP buffer size for each stream as BDP/max(1, streams/lf), where lf is set to 2 by default to accommodate the fact that streams hit by congestion go slower while streams not hit by congestion go faster.
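The GCP buffer sizing above can be written out as a small calculation (illustrative values; the function name is an assumption).

```python
# GCP-style per-stream buffer: BDP from the assumed 1 Gbit/s bandwidth and
# the King-estimated RTT, divided across streams as BDP / max(1, streams/lf).

def gcp_stream_buffer(rtt_s, streams=4, lf=2, bandwidth_bps=1e9):
    bdp_bytes = bandwidth_bps * rtt_s / 8
    return bdp_bytes / max(1, streams / lf)

# 50 ms RTT with the default 4 streams and lf = 2 gives about 3.125 MB
# per stream (half the 6.25 MB BDP):
print(gcp_stream_buffer(0.05))
```

Note that with one or two streams the divisor bottoms out at 1, so each stream gets the full BDP, matching the lf safety margin described above.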
The primary design goals for GCP are: i) to provide an scp-style interface for high performance, reliable, secure data transfers; ii) to calculate the optimal TCP buffer size and the optimal number of parallel TCP streams to maximize throughput; and iii) to support configurable URL translations to optimize throughput.
5.10. Summary
Table 5 presents a summary of application layer protocols for high performance grid computing.
Table 5: Summary of Application Layer Protocols for HPC
Protocols | Contributors | Year | Perf. Parameters | Remarks
FOBS | Phillip M. Dickens | 2003 | Object based | Less functionality
LOBS | Phillip M. Dickens | 2003 | Object based, built on FOBS | Less functionality
SABUL | Yunhong Gu et al. | 2008 | Change in flow control | Application level library
UDT | Yunhong Gu et al. | 2009 | Rate control + MAIMD | More functionality
LambdaStream | Chaoyue Xiong, Jason Leigh, Eric He, Venkatram Vishwanath, Tadao Murata | 2005 | Proactive bandwidth estimation | More bandwidth utilization + small application level jitter
GTP | Ryan Wu and Andrew Chien | 2004 | Rate based flow control | Ability to quickly explore available bandwidth
GridFTP | W. Allcock et al. | 2008 | Parallel TCP | No user level ease
GridCopy | Rajkumar Kettimuthu | 2007 | Parallel TCP + SCP | User level ease
6. CONCLUSION
This paper has presented a detailed study of the most recent developments in network protocols for grid computing in high bandwidth delay networks, reviewing protocols based on TCP and UDP. Some points that have to be considered when developing application level protocols for high performance grid computing are: i) using TCP inside another transport protocol should be avoided; ii) using packet delay as an indication of congestion can be hazardous to protocol reliability; iii) processing continuous loss efficiently is critical to performance; and iv) knowing how much CPU time each part of the protocol costs helps to make an efficient implementation. Furthermore, concentrating on three inter-related research tasks, namely i) dynamic right-sizing, ii) high-performance IP, and iii) rate adjusting, can lead to efficient transport protocols for grid computing.
Table VIII A. Comparison chart for TCP Based Protocols - Part 1
Feature | HS-TCP | STCP | Fast TCP | TCP-Illinois
Design | TCP | TCP | TCP | TCP
Mode of Operation | D2D | D2D | D2D | D2D
Security | No | No | No | No
Congestion control | Modified | MIMD | Depends on parameters | C-AIMD algorithm; uses both loss and delay
Fairness | Yes | No | Proportional fairness | Maintains fairness
TCP Friendly | Yes | Yes | NA | Yes
Performance | Smaller window size and less recovery time | Improved | Performs well | Better than standard TCP
Throughput | Ok | Increases | Increases | Better throughput
Bandwidth Utilization | Low | Ok | More | High
Multicast | No | No | No | No
Implementation details | Linux kernel 2.4.16 | Linux kernel 2.4.19 | NA | NA
Usage | High speed data transfer | High speed data transfer | High speed data transfer | High speed TCP variant
Table VIII B. Comparison chart for TCP Based Protocols - Part 2
Feature | TCP-Africa | CUBIC TCP | XTP | CTCP
Design | TCP | TCP | TCP | TCP
Mode of Operation | D2D | D2D | D2D | D2D, M2M
Security | No | No | No | No
Congestion control | Modified; two-mode congestion avoidance rule | Modified | Modified | Combination of loss and delay
Fairness | Improved | Enhanced | Enhanced | Improved
TCP Friendly | Yes | Yes | Yes | Yes
Performance | Two-mode congestion avoidance increases performance | Increased because of good network utilization | Increased, more so in FDDI and Fibre Channel | Increased because of good network utilization
Throughput | Good | Good | High | Good
Bandwidth Utilization | Acquires available bandwidth | Good | Good | Good
Multicast | No | No | Yes | No
Implementation details | NA | NA | Developed by the XTP Forum | Developed by Microsoft; implemented in Windows XP
Usage | High bandwidth networks | High speed networks | High speed networks and multimedia applications | High speed data networks
Table IX. Comparison chart for UDP Based Protocols
Feature | NETBLT | RBUDP | Tsunami
Design | UDP | UDP data + TCP control | UDP data + TCP control
Mode of Operation | D2D | D2D & M2M | D2D
Security | No | No | Yes
Congestion control | No | Optional; limited congestion control can be turned on | Limited; sending rate is reduced when the loss rate exceeds a threshold
Fairness | No | NA | NA
TCP Friendly | No | No | No
Performance | Performed extremely well | Eliminates TCP's slow start and uses the full bandwidth | Relies on the parameters
Throughput | Ok | Good | Good
Bandwidth Utilization | Fair | Good | Good
Multicast | No | No | No
Implementation details | No | Provides a C++ API | www.indiana.edu
Usage | Developed at MIT for high throughput bulk data transfer | Aggressive protocol designed for dedicated or QoS enabled high bandwidth networks | Designed for faster transfer of large files over high speed networks
Table X A. Comparison chart for Application Layer Based Protocols - Part 1
Feature | SABUL | FOBS | LOBS | UDT
Design | UDP data + TCP control | Object based | Built on FOBS | UDP
Mode of Operation | D2D & M2M | D2D | D2D | D2D & M2M
Security | No | NA | NA | Yes
Congestion control | Rate based algorithm | Yes | Yes | D-AIMD
Fairness | Independent of network delay | Yes | Yes | Independent of RTT
TCP Friendly | Yes | Yes | Yes | Yes
Performance | Ok | 90% of the available bandwidth | Rate of 35 MB per second | Ok
Throughput | Ok | NA | Ok | Good
Bandwidth Utilization | Ok | NA | Good | High
Multicast | No | No | No | No
Implementation details | C++ library on Linux | NA | NA | udt.sourceforge.net
Usage | General purpose transport protocol | High speed data transfer | High speed data transfer | High speed data transfer
Table X B. Comparison chart for Application Layer Based Protocols - Part 2
Feature | LambdaStream | GTP | GridFTP | GridCopy
Design | UDP data + TCP control | Lightweight UDP | FTP | FTP
Mode of Operation | D2D | D2D | D2D & M2M | D2D & M2M
Security | No | No | GSI | scp style
Congestion control | Rate based + modified | Same as TCP | Same as TCP | Same as TCP
Fairness | Yes | Yes | Yes | Yes
TCP Friendly | Yes | Yes | Yes | Yes
Performance | Improved, due to loss recovery and rate control | Improved, due to receiver driven flow control | Problems persist with file sizes and number of connections | Problems persist with small file sizes
Throughput | Good | Good | Good | Good
Bandwidth Utilization | Good | Good | High | High
Multicast | No | Yes | Yes | Yes
Implementation details | NA | NA | Globus Toolkit | NA
Usage | Long Fat Networks | High bandwidth networks | High bandwidth networks | High bandwidth networks