The performance of the SMTP, POP3 and X.400 protocols is compared in terms of end-to-end delay, volume of traffic, number of frames generated at the physical layer and frame length distribution. An analytical model is proposed to approximate the upper and lower bounds of the traffic volume generated by SMTP, which can easily be extended to POP3. The model considers both explicit and piggyback acknowledgements at the TCP layer.
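The model itself is not reproduced here; purely as an illustration, the following sketch shows how such upper and lower bounds could be approximated. The header sizes, maximum segment size and handshake frame count are assumptions for illustration only: the lower bound assumes all acknowledgements are piggybacked, the upper bound assumes one explicit minimum-size acknowledgement frame per data segment.

```python
import math

# Purely illustrative sketch of upper/lower bounds on the traffic volume
# generated when an SMTP message of `msg_bytes` is submitted over Ethernet.
# The header sizes, MSS and handshake frame count below are assumptions for
# illustration and are not the parameters of the model proposed in the paper.
ETH_HDR, IP_HDR, TCP_HDR = 18, 20, 20          # assumed per-frame header bytes
FRAME_OVERHEAD = ETH_HDR + IP_HDR + TCP_HDR
MIN_FRAME = 64                                 # assumed minimum Ethernet frame
MSS = 1460                                     # assumed TCP maximum segment size
HANDSHAKE_FRAMES = 12                          # assumed TCP/SMTP handshake frames

def smtp_volume_bounds(msg_bytes):
    segments = math.ceil(msg_bytes / MSS)
    base = msg_bytes + segments * FRAME_OVERHEAD + HANDSHAKE_FRAMES * MIN_FRAME
    lower = base                               # all acknowledgements piggybacked
    upper = base + segments * MIN_FRAME        # one explicit ack frame per segment
    return lower, upper

print(smtp_volume_bounds(10 * 1024))           # bounds for a 10KB message
```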
Figure 1: Testbed Configuration
The performance analysis of the SMTP, POP3 and X.400 protocols is based on benchmarking. The configuration of the system tested is shown in Figure 1. A P90 PC running Netscape Mail for SMTP and POP3 and NEXOR Message Ware for X.400 was connected to a SPARCstation 4 running the SMTP, POP3 and X.400 servers. A Network General Expert Sniffer protocol analyser was used to capture the frames exchanged between the client and server. Finally, a Windows Sockets monitor application was used on the client in order to monitor and timestamp events at the TCP layer.
The important performance measures for the system are the protocol efficiency, defined as the ratio of the volume of user data to the total volume created by the protocol stack, the end-to-end delay, the number of Medium Access Control (MAC) frames created and the frame length distribution. The last two are significant factors for random access MAC protocols such as Carrier Sense Multiple Access with Collision Detection (CSMA/CD), ALOHA etc., where data transmitted in many small frames can result in poor performance.
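As a concrete reading of the efficiency metric, the following minimal sketch computes it as the ratio of user data volume to the total volume captured on the wire; the figures in the example are hypothetical.

```python
def protocol_efficiency(user_bytes, captured_bytes):
    """Ratio of user data to the total volume observed at the physical layer.

    `captured_bytes` is assumed to be the sum of all frame lengths captured
    by the protocol analyser for one complete message transfer.
    """
    return user_bytes / captured_bytes

# Hypothetical figures: a 1KB message that generated 1.6KB of traffic
print(f"{protocol_efficiency(1024, 1638):.1%}")
```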
For the performance tests, ten messages ranging from 1KB to 10KB were generated and sent to and retrieved from the server. Comparing the end-to-end delay results, the differences between the delay at the physical layer, derived from the protocol analyser traces, and the delay at the TCP layer, derived from the Winsock monitor traces, were insignificant (0.10% to 0.36% of the total end-to-end delay, with a mean of 0.21%). We therefore disabled the monitoring application on the client in order to minimize overheads and repeated the tests.
Protocol Efficiency
Protocol overheads, added in the form of PDU headers or of connection establishment exchanges between peer layers, reduce the efficiency of a given protocol profile. Figure 2(a) shows the efficiency for POP3, SMTP and X.400.
SMTP adds minimal overhead to the user message, in the form of message information such as the sender address, recipient address, time the message was submitted and message encoding scheme. POP3 does not add any message headers. The additional overheads for the SMTP and POP3 protocols are incurred by the handshake commands exchanged between client and server, which are carried in individual TCP PDUs. SMTP has to define the sender, define the recipient and indicate the beginning of message transfer by issuing the MAIL, RCPT and DATA commands respectively, all of which are explicitly acknowledged by the SMTP server.
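This command/response exchange can be reproduced with Python's standard smtplib module, which issues the MAIL, RCPT and DATA commands and reads the server's reply to each before proceeding; the host name and addresses below are placeholders for a local test setup.

```python
import smtplib

# Sketch of the SMTP dialogue described above, using Python's standard smtplib.
# Each command is carried in its own TCP PDU and the server's reply (an explicit
# application-level acknowledgement) is read before the next command is issued.
# "localhost" and the addresses are placeholders for a local test setup.
with smtplib.SMTP("localhost", 25) as smtp:
    smtp.set_debuglevel(1)                      # print every command and reply
    smtp.ehlo()                                 # EHLO
    smtp.mail("sender@example.com")             # MAIL FROM:<...>   -> 250 reply
    smtp.rcpt("recipient@example.com")          # RCPT TO:<...>     -> 250 reply
    smtp.data(b"Subject: test\r\n\r\nmessage body\r\n")  # DATA -> 354, body, 250
```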
For the X.400 protocol, apart from the message headers and the user authentication and service definition commands exchanged between client and server, the complex protocol stack and the plethora of underlying protocols are a major source of protocol overhead. The TCP connection is established once for X.400 and POP3, while SMTP disconnects after each message is successfully transmitted.
As shown in Figure 2(a), SMTP proves to be the least efficient, and only when the message size is larger than 8KB does the efficiency of SMTP become marginally higher than that of X.400. The poor performance of SMTP is the result of the TCP connection being torn down after the submission of each message, which incurs an additional SMTP client-server handshake for each subsequent message. This is an implementation-specific drawback that depends on the client software. Therefore the SMTP protocol efficiency should also be calculated for the case where the TCP connection is maintained until the last message is submitted. The efficiency of this optimised SMTP can increase by as much as 9.3% for 1KB messages.
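In client code the optimisation amounts to reusing a single SMTP/TCP connection for the whole batch of messages rather than reconnecting for each one, as the following sketch illustrates; the host, addresses and message bodies are placeholders.

```python
import smtplib

MESSAGES = [f"Subject: msg {i}\r\n\r\nbody\r\n" for i in range(10)]   # placeholders

# Behaviour observed in the benchmarks: a fresh TCP connection (with inetd
# wake-up, greeting and EHLO handshake) is opened for every single message.
def send_reconnecting(host="localhost"):
    for body in MESSAGES:
        with smtplib.SMTP(host) as smtp:
            smtp.sendmail("a@example.com", ["b@example.com"], body)
        # QUIT and TCP tear-down after each message

# "Optimised SMTP": one connection is held open until the last message has
# been submitted, so the handshake and tear-down traffic is paid only once.
def send_persistent(host="localhost"):
    with smtplib.SMTP(host) as smtp:
        for body in MESSAGES:
            smtp.sendmail("a@example.com", ["b@example.com"], body)
```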
The efficiency of all three protocols can be further improved by fine tuning the operation of the TCP layer so that acknowledgements are piggybacked rather than created explicitly for each TCP PDU. The explicit TCP acknowledgements account for 6.78%, 3.52% and 8.22% of the total volume created by SMTP, POP3 and X.400 respectively (for 1KB messages). The TCP slow start algorithm is only beneficial in the case of slow and/or heavily loaded servers, as it prevents the client SMTP application from sending packets, before acknowledgements are received, that would be dropped at the server end. For high speed networks and servers the performance of the protocols can benefit from piggybacked, accumulated acknowledgements.
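The saving available from piggybacking can be estimated with a simple back-of-envelope calculation that removes the minimum-size acknowledgement frames from the captured volume; the 64-byte minimum Ethernet frame length and the figures in the example are assumptions for illustration.

```python
MIN_FRAME = 64   # assumed minimum Ethernet frame length carrying an explicit ack

def explicit_ack_share(total_volume_bytes, explicit_ack_frames):
    """Fraction of the captured volume taken up by explicit TCP ack frames,
    i.e. the volume that could be saved if those acks were piggybacked."""
    return explicit_ack_frames * MIN_FRAME / total_volume_bytes

# Hypothetical figures for one 1KB transfer captured by the analyser
print(f"{explicit_ack_share(total_volume_bytes=18_000, explicit_ack_frames=19):.2%}")
```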
End-to-end delay
The end-to-end delay at the physical layer is measured from the time the first TCP synchronise (SYN) PDU is captured until the time the last TCP FIN acknowledgement is captured. This end-to-end delay at the physical layer is not significantly larger than the end-to-end delay at the TCP layer as derived from the Winsock monitor traces. The Winsock monitor is therefore disabled, since it adds additional overhead at the client end.
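The measurement can be reproduced from an exported analyser trace by subtracting the timestamp of the first SYN from that of the final acknowledgement, as in the following sketch, which assumes the trace is available as a list of (timestamp, TCP flags) records.

```python
# Sketch of the delay measurement described above, assuming the analyser trace
# has been exported as a list of (timestamp_seconds, tcp_flags) records in
# capture order.
def end_to_end_delay(trace):
    first_syn = next(t for t, flags in trace if "SYN" in flags)
    last_ack = max(t for t, flags in trace if "ACK" in flags)
    return last_ack - first_syn

# Hypothetical three-frame fragment of a trace
print(end_to_end_delay([(0.000, "SYN"), (0.001, "SYN,ACK"), (0.950, "FIN,ACK")]))
```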
The results for the end-to-end delay measured for POP3, SMTP and X.400 are depicted in Figure 2(b). For all three protocols there are significant variations in the delay with respect to message size. From the graph, POP3 proves to be the fastest, with SMTP and X.400 requiring substantially more time to deliver their messages. Although SMTP appears to produce the highest delay, a large portion of this delay is the time required by the inetd daemon to wake up the SMTP daemon. This is repeated several times during the SMTP benchmarking because the client resets the TCP connection after each new message.
In Figure 2(c) the delay for alerting the SMTP daemon is plotted as a fraction of the overall end-to-end delay. As the figure shows, this delay ranges from 38.3% to 42.6% of the overall delay. Therefore SMTP performance can be greatly improved by keeping the TCP connection alive until all the messages have been submitted.
The end-to-end delay also depends on the specific software implementation of the protocol, the load on the client and server, and the intermediate link between them. For the duration of our tests the background load on the server, the client and the interconnecting 10Mbps Ethernet LANs was minimal.
Number of frames and frame length distribution
An important metric for protocol performance is the number of frames generated at the physical layer and the distribution of their length. Performance is greatly reduced when the offered load takes the form of many small frames rather than fewer, larger ones, owing to the increased probability of collision, which also increases as the network size increases. Such a metric is particularly important when analysing random access protocols for networks that span several kilometres, such as data-based Community Antenna Television (CATV) networks.
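The effect can be illustrated with a simple model: if every transmission attempt is assumed to collide independently with a fixed probability p (an assumption; in a real CSMA/CD or CATV network p itself grows with load and network extent), the chance that at least one frame of a transfer collides grows with the number of frames used to carry the same payload.

```python
# If every transmission attempt is assumed to collide independently with
# probability p (in a real CSMA/CD or CATV network p itself grows with load
# and network extent), the chance that at least one frame of a transfer
# collides grows with the number of frames carrying the same payload.
def p_any_collision(n_frames, p):
    return 1.0 - (1.0 - p) ** n_frames

for n_frames in (5, 10, 40):       # e.g. 10KB sent as 2KB, 1KB or 256-byte frames
    print(n_frames, f"{p_any_collision(n_frames, p=0.01):.3f}")
```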
The results for the number of frames created by POP3, SMTP and X.400 are presented in Figure 2(d); the optimised SMTP is also considered. SMTP generates more frames than POP3 and X.400. Although the optimised SMTP reduces the number of frames created, it still generates more frames than POP3 and X.400. The plots for POP3 and X.400 are almost linear with respect to message size for messages larger than 3KB. In contrast, for the SMTP and optimised-SMTP curves the number of frames generated increases noticeably when moving from an odd to an even number of kilobytes of message length. This is due to the actual message delivery mechanism at the TCP layer.
The number of large (over 1000 bytes) and full frames in the POP3 and X.400 cases is significantly greater, because the two protocols utilise the full MAC frame. In comparison, SMTP generates more short (less than 100 bytes) frames and as many half-full and full frames. This is another reason for the decreased SMTP efficiency. However, this turns into an advantage for SMTP in noisy environments: smaller frames have a smaller loss probability, and even when a frame is lost or corrupted a smaller amount of information has to be retransmitted, yielding increased throughput.
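The noisy-environment argument can be made concrete with an assumed independent bit-error model, under which a frame of L bytes is corrupted with probability 1 - (1 - BER)^(8L); shorter frames are then lost less often and each loss costs fewer retransmitted bytes. The bit error rate and frame lengths below are illustrative.

```python
# Assumed independent bit-error model: a frame of `length_bytes` is corrupted
# with probability 1 - (1 - ber)**(8 * length_bytes), so shorter frames are
# lost less often and each loss costs fewer retransmitted bytes.
def frame_loss_probability(length_bytes, ber):
    return 1.0 - (1.0 - ber) ** (8 * length_bytes)

for length in (100, 750, 1500):            # short, half-full and full frames
    p = frame_loss_probability(length, ber=1e-5)
    print(f"{length:4d} bytes: loss probability {p:.3f}, "
          f"expected retransmission cost {p * length:.1f} bytes")
```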