VoIP spam detection system

Chapter 1: INTRODUCTION

Since the late 1800s, voice has been carried by the Public Switched Telephone Network (PSTN). The cost of long distance calling has grown to billions of dollars in the USA across both the business and residential sectors. Reducing this cost is the primary objective behind the growing interest in using IP based networks to transmit voice. VoIP was developed mainly to allow voice communication all over the globe at minimal cost. Many factors have contributed to the increasing use of VoIP, notably that the cost of packet switched networks is almost half that of the PSTN [15].

VoIP stands for Voice over Internet Protocol [1] and is also referred to as IP telephony. It is a technology that allows standard voice signals to be compressed into data packets and transmitted over the internet or other IP networks [1]. To succeed, VoIP has to better the standards and the comforts provided by the PSTN. VoIP concepts have been in the market for a long time, but the use and integration of VoIP has only just started.

The major differences between the PSTN and VoIP are as follows. The PSTN has dedicated lines of 64 Kbps in each direction, whereas in VoIP all channels are carried over one internet connection [16]. Features like Caller ID are available at an additional cost in the PSTN, whereas in VoIP they are available for free. In the PSTN long distance calls are usually charged per minute or through a bundled minute subscription, whereas in VoIP they are included in the regular monthly price [16]. The PSTN is upgraded or expanded with new equipment and line provisioning, whereas VoIP upgrades require only bandwidth and software upgrades.

Technically, a VoIP call takes place in the following manner:

* The analog to digital converter (ADC) converts the analog voice signal into a digital signal, i.e. into bits [5].

* The bits are then arranged in the format used for the transmission of voice data.

* The voice packets are encapsulated as data packets using the Real-time Transport Protocol (RTP).

* The receiver extracts the data and converts it back to an analog signal, which is then sent to the phone [5].
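The encapsulation step can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the code used in this project: the function names, the choice of G.711 mu-law (payload type 0) and the 160-byte frame are hypothetical, and only the minimal 12-byte RTP header without padding, extension or CSRC entries is built.

```python
import struct

RTP_VERSION = 2
PAYLOAD_TYPE_PCMU = 0  # assumption: G.711 mu-law, static payload type 0


def build_rtp_packet(payload, seq, timestamp, ssrc,
                     payload_type=PAYLOAD_TYPE_PCMU):
    """Prepend a minimal 12-byte RTP header to a block of voice bytes."""
    first_byte = RTP_VERSION << 6        # V=2, P=0, X=0, CC=0
    second_byte = payload_type & 0x7F    # M=0, 7-bit payload type
    header = struct.pack("!BBHII", first_byte, second_byte,
                         seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF,
                         ssrc & 0xFFFFFFFF)
    return header + payload


def parse_rtp_packet(packet):
    """Receiver side: split the header fields from the voice payload."""
    first_byte, second_byte, seq, timestamp, ssrc = struct.unpack(
        "!BBHII", packet[:12])
    return {
        "version": first_byte >> 6,
        "payload_type": second_byte & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[12:],
    }
```

For example, a 160-byte frame (20 ms of 8 kHz audio, an assumed frame size) becomes a 172-byte packet, and the receiver recovers the same 160 bytes before converting them back to an analog signal.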

When VoIP is implemented properly it has many advantages over the PSTN, such as cost reduction, simplification and consolidation [5].

Approximately 18% of voice traffic is carried by IP telephony [14]. As the number of users of this service increases day by day, spammers are finding new ways to generate spam and harm the system. The main problem in detecting VoIP spam is that, unlike email spam, it has to be detected in real time [6]. Little work has been done on defending VoIP networks from such attacks, so a spam detection system needs to be effective enough to detect all kinds of VoIP spam sent across the network.

1.1 Project Goals and Objectives:-

Table 1:- Project Goals

Project Objective:-

The objective of this project is to study VoIP spam, or Spam over Internet Telephony (SPIT), threats in depth, to generate spam calls using SIP based IP phones, and finally to create application software that detects VoIP spam.

This project also includes the implementation of whitelisting and blacklisting techniques, which are used to detect spam under certain exceptions. I intend to use XTEN XLITE soft phones, which will be installed on the client machines.

All calls made from a client are routed through a central server, which holds all the entries in its database and is responsible for authenticating every call before forwarding it to the destination. During the authentication and implementation phase, concrete efforts were made to detect all spam calls without discarding legitimate calls coming from a known source for a meaningful purpose.

1.2 Technology Trends and Market Research:-

Communication has a long history, going back to the invention of the telephone by Alexander Graham Bell in 1876. The history of VoIP can be traced to 1995, when a small internet based company called VocalTec produced phone software using the H.323 signaling protocol. VoIP traffic had about a 1% share of all voice traffic in the United States by the end of 1998, which grew to 3% by mid 2001. Today, VoIP traffic represents 13% of the voice traffic in the US [14]. According to one estimate, total revenue in the VoIP market will grow by 24.3% to $3.19 billion [14], and the total number of VoIP users will grow to 16.6 million, a rise of 21.2%, by the end of the current year [14]. Vonage is the leading VoIP provider in the United States with a 53.9% share of the market.

VoIP technology has considerable room to grow. Smarter applications must be integrated with it to protect privacy, block unwanted calls and improve the phone experience. Small businesses that are growing often fear adopting an unfamiliar technology, and a lot of work is being done to ease their concerns and provide them with convenient service. Users who do not have a broadband connection also find it easier to use VoIP technology over the PSTN.

Table 2:- Leading VoIP Providers and their corresponding Market Shares [14]

1.3 Project Requirements:-

* Functional Requirements: -

The project combines several areas of study. Its two primary goals are, first, to research the existing spam detection techniques and, second, to integrate these techniques into a SIP proxy server.

The VoIP spam detection system detects spam calls in order to keep the network spam free. This is achieved by developing a server in the C language which authenticates users and then detects spam. A MySQL database was created on the server to hold the lists of legitimate users and potential spammers. The server acts as an IP-PBX, as it routes calls to their various destinations.

The above figure is a traditional system consisting of the sender, the server and the receiver. When the client places a call it is directed to the server which directs the call to the intended destination.

The above figure shows the proposed new system, which adds spam detection. A database is created within the server which holds the white list and the black list, i.e. the list of legitimate users and the list of potential spammers. An authentication feature is also added to the server. When a client makes a call, the server performs an authentication and spam check by comparing the call sequence against the blacklist rules implemented on the server, i.e. pattern matching. If the user is present in the white list, the server forwards the call to the destination. If any user in the white list sends spam across the network, that user is placed in the blacklist and all further calls from that user are dropped by the server.
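The decision logic described for the proposed system can be sketched as follows. This is a minimal Python illustration rather than the project's C server: the class and method names are hypothetical, the lists are kept in memory instead of in MySQL, and it is an assumption of the sketch that a caller found in neither list fails authentication and is dropped.

```python
from enum import Enum


class Verdict(Enum):
    FORWARD = "forward"
    DROP = "drop"


class CallScreen:
    """In-memory stand-in for the server's white list and black list."""

    def __init__(self, whitelist=(), blacklist=()):
        self.whitelist = set(whitelist)
        self.blacklist = set(blacklist)

    def screen(self, caller_uri):
        # The black list is checked first, so a listed spammer is always dropped.
        if caller_uri in self.blacklist:
            return Verdict.DROP
        if caller_uri in self.whitelist:
            return Verdict.FORWARD
        # Assumption: callers in neither list fail authentication.
        return Verdict.DROP

    def report_spam(self, caller_uri):
        # A white-listed user who sends spam is moved to the black list.
        self.whitelist.discard(caller_uri)
        self.blacklist.add(caller_uri)
```

A caller that starts in the white list is forwarded until `report_spam` is called for it, after which every further call from the same URI is dropped.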

The functional requirements for the above system are as follows:

1. To enable voice communication among users in the network through the IP-PBX.

2. To develop a SIP server which will be used to forward calls to the destination.

3. To generate spam using the SIP phones installed on the client systems.

4. To build a spam detection system which detects spam and drops the call before it is forwarded to the destination.

5. To add an authentication feature so that only legitimate users can communicate with each other.

6. To build an index of information about spam abuse and its countermeasures.

* Hardware Requirements & Software Requirements:-

For hardware we used Intel based PCs and D-Link modems. For software we used the Ubuntu operating system and XLite soft phones, MySQL for creating the database, and Microsoft Word/Visio, SmartDraw etc.

Chapter 2: VOIP SECURITY

Researchers have recently become more concerned about VoIP security, as security was not a major issue in the earlier stages of development. Security is an integral part of VoIP applications [14]. In an IP network, intruders can break into a voice mailbox, intercept incoming and outgoing phone numbers, and listen to conversations. The security domain of Voice over IP is much larger than that of the Public Switched Telephone Network, so VoIP should be designed and implemented in a secure manner that provides end to end security for VoIP applications.

The fundamental security requirements of VoIP applications are confidentiality, integrity and availability [6]. VoIP is vulnerable to attacks such as malicious code, Denial of Service (DoS) and Distributed Denial of Service (DDoS). These attacks affect systems by consuming resources and disrupting confidential information. Preventing and detecting malicious code on the internet is difficult if the code is encrypted. Another attack which poses DDoS problems to VoIP systems is pharming [6]. Another major threat is spam: VoIP is susceptible to spam, also known as SPIT (Spam over Internet Telephony).

Spam over Internet Telephony (SPIT) consists of bulk, automatically generated, unsolicited calls. The similarity between email spam and SPIT is that in both cases senders use the internet to target recipients or groups of users with bulk unsolicited messages or calls [2]. The main difference is that SPIT can only be analyzed after the call is established, whereas email spam can be checked before the user sees it. Another major difference is that a single email itself contains information that can be used for spam detection.

A single SPIT call, in contrast, is technically indistinguishable from any other call: it is initiated and answered with the same set of SIP messages. SPIT occurs in three steps. The first step is to systematically gather information such as the contact addresses of the victims [6]. The second step is to establish a connection with these users, and the third step is to send messages to them. In general, therefore, spitting is the systematic gathering of available user accounts followed by systematic session establishment attempts to as many users as possible in order to transfer some kind of message. Unfortunately, many mechanisms which work for email spam fail completely in the context of VoIP. Many filtering strategies can be applied to an email because the message first arrives at the server before it is downloaded by the user. In VoIP, in contrast, human voices are transmitted rather than text [3].

There are several techniques to prevent or mitigate spam in VoIP networks. For SPIT mitigation techniques to be effective and promising for the future, they need to meet a number of criteria. The most important criterion is that the protection technique has to identify spam before the user's phone rings [4]. A good spam protection technique should also be both effective and difficult to circumvent.

Based on these criteria there are a number of techniques:

Signaling Protocol Analysis:

VoIP calls consist of two parts: signaling and media data. Before a VoIP call starts, signaling data for setting up the call is exchanged between both users. A characteristic of spam calls is that they are unidirectional: the spammer only initiates calls to the targeted network. Signaling protocol analysis uses this characteristic to classify callers. The technique has two benefits: it decides whether a call is spam before the phone at the receiving side rings, and it is located at the service provider, so the user is not bothered with spam calls or maintenance. However, it can only decide whether a call is spam after at least ten calls from one caller, which means it relies heavily on the spammer not changing his number for quite some time [4].
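One way to picture the unidirectional characteristic is to count, per URI, how many calls were placed and how many were received. The Python sketch below is an assumption about how such a rule might look, not the method of [4]: the class name and the rule "at least ten calls placed and none received" are hypothetical, with the threshold of ten taken from the text.

```python
from collections import Counter

MIN_CALLS = 10  # the text states that at least ten calls are needed


class DirectionTracker:
    """Counts placed and received calls per URI from the signaling data."""

    def __init__(self, min_calls=MIN_CALLS):
        self.min_calls = min_calls
        self.placed = Counter()
        self.received = Counter()

    def record_call(self, caller, callee):
        self.placed[caller] += 1
        self.received[callee] += 1

    def is_suspect(self, uri):
        # Hypothetical rule: many outgoing calls and no incoming ones.
        return self.placed[uri] >= self.min_calls and self.received[uri] == 0
```

The sketch also shows the weakness mentioned above: a spammer who changes his URI before reaching the threshold is never flagged.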

Device Fingerprinting:

According to [1], comparing the header layout and order, or the response behavior, of a SIP User Agent with those of typical User Agents makes it possible to determine whether an initiated session establishment is an attack or a normal call. There are two types: passive fingerprinting and active fingerprinting. In passive fingerprinting, the INVITE message of a session initiation is compared with the INVITE messages of a set of standard SIP clients. If the order or appearance of the header fields does not match any of the standard clients, the call is classified as SPIT. In active fingerprinting, user agents are probed with OPTIONS requests and the responses are analyzed and compared with the response behavior of standard clients [1].

The fingerprint in this case is the returned response code and the value of the Allow header field. If the fingerprint doesn't match any of the standard clients, the call is classified as SPIT.

The weakness of passive fingerprinting is that it only analyzes the order and existence of the header fields of an INVITE message, so an attacker simply needs to order the header fields in the same way as one of the standard clients [1]. In that case the passive fingerprinting mechanism cannot detect the attack.
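Passive fingerprinting as described here amounts to comparing the sequence of header names in an INVITE with stored profiles. The Python sketch below illustrates the idea only: the two profiles in `KNOWN_HEADER_ORDERS` are invented placeholders rather than measurements of real clients, and folded (multi-line) headers are not handled.

```python
# Hypothetical profiles: header order as sent by two imaginary standard clients.
KNOWN_HEADER_ORDERS = {
    "hypothetical-softphone-a": (
        "Via", "Max-Forwards", "From", "To", "Call-ID", "CSeq",
        "Contact", "Content-Length"),
    "hypothetical-softphone-b": (
        "Via", "From", "To", "Call-ID", "CSeq", "Max-Forwards",
        "Contact", "User-Agent", "Content-Length"),
}


def header_order(raw_invite):
    """Return the header names of a SIP request in the order they appear."""
    names = []
    for line in raw_invite.split("\r\n")[1:]:  # skip the request line
        if line == "":                          # blank line ends the headers
            break
        name, _, _ = line.partition(":")
        names.append(name.strip())
    return tuple(names)


def matches_known_client(raw_invite, known=KNOWN_HEADER_ORDERS):
    """True if the header order equals that of any stored profile."""
    return header_order(raw_invite) in set(known.values())
```

An INVITE whose headers appear in a different order from every profile would be classified as SPIT, while an attacker who copies the order of one profile passes the check, which is exactly the weakness noted above.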

Whitelisting:

Whitelisting is a technique primarily used in instant messaging networks. In the case of VoIP, a whitelist contains the telephone numbers of the people who are allowed to call; all other callers are blocked [5]. In theory whitelisting blocks all spam calls, assuming nobody on the whitelist is or becomes a spam source. This is also its big disadvantage, because if an unknown person wants to call you, the call will be blocked. Some home users do not consider this a disadvantage, but for business users it is vital that potential customers are able to contact them [5]. This technique is also used in Skype. If client A wants to call client B, client A first needs to add client B to its contact list and send a contact request to client B [3]. Only when client B has accepted this request can client A make calls to client B [3].

Blacklisting:

Blacklisting is the complete opposite of whitelisting: instead of maintaining a list of numbers of the people who are allowed to call you, you maintain a list of numbers that are not. For this system to be effective it needs to be implemented at a global level; when separate users maintain their own blacklists the effect is very limited, because spammers will simply call someone else [5]. A blacklist on the server side requires statistical methods for classifying a caller as a spitter, whereas on the client side the user marks a caller as a spitter [1].

Greylisting:

Greylisting applies a simple rule: each call is blocked unless the same number has already tried to establish a call within the last N hours or minutes. When a call is blocked, the sender receives a message such as "the user is currently busy". When the sender calls again within the N hour timeframe, his number is automatically added to a whitelist and all future calls are connected immediately [6]. Greylisting thus represents a mechanism that allows first time contact [1], and it is a useful technique for blocking spam calls [7]. Its drawback is that it may also block emergency calls from friends.
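The greylisting rule can be sketched in a few lines of Python. The class name, the use of seconds for the window N, and the choice to restart the window when a retry arrives too late are assumptions made for this illustration.

```python
class Greylist:
    """Blocks a first call and admits a retry made within the window."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.first_attempt = {}   # caller -> time of the blocked attempt
        self.whitelist = set()

    def allow(self, caller, now):
        if caller in self.whitelist:
            return True
        first = self.first_attempt.get(caller)
        if first is not None and now - first <= self.window:
            # Retry inside the window: white-list the caller from now on.
            self.whitelist.add(caller)
            del self.first_attempt[caller]
            return True
        # First attempt, or the retry came too late: block and (re)start the window.
        self.first_attempt[caller] = now
        return False
```

With a one hour window, a caller blocked at time 0 is connected when retrying at 600 seconds and on every later call, whereas a caller who waits two hours before retrying is blocked again.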

Turing Tests and Computational Puzzles:

On the initial call establishment attempt, the caller is transferred to an interactive system where he is challenged with a task, e.g. dialing five digits that he hears [1].

While the digits are read out, background music or some other kind of noise is played so that speech recognition systems cannot be used to solve the task. A human caller, in contrast, will solve the task without difficulty, and only if the task is solved is the call forwarded to its destination. Turing tests can be used in combination with white lists. The goal of computational puzzles is to raise the CPU cost of a call and so reduce the number of undesirable messages that can be sent. The weakness of Turing tests is that the method is very intrusive: user interaction is forced every time a caller is not present in the white list of a callee. The difficulty with computational puzzles is that different VoIP endpoints have different computational power. If the task is too hard to solve, session establishment is delayed considerably for ordinary endpoints, while attackers with powerful PCs are hardly affected. Computational puzzles are therefore ineffective and counterproductive, as they mainly bother normal users [1].

Turing tests are solutions whereby the sender of the message is given some kind of puzzle or challenge which only a human can answer [6]. If the puzzle is solved correctly, the user is registered in the white list. Designing such tests is not easy, since ongoing advances in image processing and artificial intelligence continually raise the bar [6].

Content Filtering:

According to [6], this method is completely useless for call spam, for two reasons. First, when the user answers the call, the call is already established and the user is paying attention before the content is delivered, so the spam cannot be analyzed before the user hears it. Second, if the content is stored before the user accesses it, it will be in the form of recorded audio or video [6]. Content filtering makes use of speech recognition technology to analyze whether the content of a message is spam, but its disadvantage is that with today's technology it is impossible to analyze speech content in real time [7].

The same limitation is noted in [6]: content filtering relies on speech recognition to decide whether a message is spam, and with today's technology the content cannot be analyzed in real time.

Chapter 3: PROJECT DESIGN

3.1 Introduction:

Email is one of the most prevalent channels for sending spam across the network, and spammers have managed to compromise email servers by continually adopting new techniques. The focus has now shifted to VoIP networks, where spammers have started sending spam using new techniques. Sending spam over VoIP networks is known as Spam over Internet Telephony (SPIT). As a result, network administrators have to find new countermeasures to prevent spam. Detecting VoIP spam is much more complex than detecting email spam because VoIP spam needs to be detected in real time.

Spam can flood a network and considerably degrade the performance of the server or the network by consuming resources and bandwidth. There therefore has to be an efficient system to detect spam before spammers can gain access to the system. One such system under development is the VoIP Spam Detection System. Since spammers keep developing new techniques to generate spam, the administrator has to update the system regularly so that it can prevent any kind of spam or unwanted call from reaching the receiver.

The project has three phases. The first phase involved extensive research on how spam works and on the various forms of VoIP spam. The second phase involved finding ways to detect spam, and the third phase involved implementing a client-server model to detect spam. The technique was integrated using the Open SER SIP server, which involved extensive study and research.

3.2 Architecture Subsystems:

The following figure shows the project architecture in detail: a sender, a receiver, a SIP server and the detection system. When the client makes a call, it is routed through the central server before it is forwarded to the receiver. When the server receives a call, it first authenticates the call by comparing the URI of the call with the URIs present in its database. The update engine then updates the white list and the black list in the database depending on whether the call is spam or legitimate.

This project is developed in C/C++ and uses a MySQL database. The system includes various techniques to detect spam, namely whitelisting, blacklisting, pattern matching, hashing etc. There are three important parts of the VoIP Spam Detection System, namely:

* XTEN XLITE soft phone

* Open SER SIP server

* MySQL database

Soft phones:

A soft phone is software used to make calls over the internet with the help of an application layer protocol, namely the Session Initiation Protocol (SIP), rather than dedicated hardware [20]. For our project we used XTEN XLITE VoIP soft phones to call over the internet. They are identified by SIP identifiers, which are used to route calls to the appropriate destinations. These soft phones have features such as making calls and sending and receiving voice messages.

SERVER (SIP server):

A SIP server is the main component of an IP-PBX and deals with the setup of all SIP calls in the network. It is software which manages all calls, whether spam or legitimate [19]. In our project the server contains the VoIP spam detection system. The function of the detection system is to authenticate incoming calls, forward them to their respective destinations, and detect whether they are spam or legitimate. All calls made by the sender pass through the server, which checks its user location table to find the location of the destination soft phone and then forwards the call to it. The server has a database which holds both the list of legitimate users and the list of users who have been blacklisted by receivers [19].

When a call originates from the sender, it passes through the server, which uses the SIP identification to check whether the user is in the legitimate list or the black list. If the user is in the legitimate list the server allows the call to go through, and if the user is in the blacklist the server blocks the call. If the receiver marks a call as legitimate, the user is added to the legitimate list and any number of calls from that user are allowed to pass through. If the receiver marks the call as spam, the user is added to the black list dynamically and, as a result, no further calls are accepted from that user [19].

Mysql database and Update engine:

The update engine is the component used for updating the database, which is where the black list and the white list are stored. When a receiver marks a call as legitimate, the update engine adds the SIP identification of the caller to the whitelist dynamically [19]. Similarly, when the receiver marks a call as spam, the update engine updates the blacklist in the database.
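The behavior of the update engine can be sketched against a small database. The example below uses Python's built-in sqlite3 module purely as a stand-in for the project's MySQL database, and the table name `call_list`, its columns and the function names are hypothetical.

```python
import sqlite3


def open_lists(path=":memory:"):
    """Create (if needed) a single table holding both lists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS call_list ("
        " sip_uri TEXT PRIMARY KEY,"
        " status TEXT NOT NULL CHECK (status IN ('white', 'black')))")
    return conn


def mark_call(conn, sip_uri, is_spam):
    """Record the receiver's verdict, replacing any earlier entry."""
    status = "black" if is_spam else "white"
    conn.execute(
        "INSERT OR REPLACE INTO call_list (sip_uri, status) VALUES (?, ?)",
        (sip_uri, status))
    conn.commit()


def lookup(conn, sip_uri):
    """Return 'white', 'black', or None when the caller is unknown."""
    row = conn.execute(
        "SELECT status FROM call_list WHERE sip_uri = ?",
        (sip_uri,)).fetchone()
    return row[0] if row else None
```

Keeping both lists in one table with the SIP URI as primary key is a design choice of this sketch: it guarantees that a caller is never on the white list and the black list at the same time.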

3.3: SPAM DETECTION ENGINE in Detail

The spam detection engine is the main component of the spam detection system; it is where the actual spam detection takes place. When the sender places a call, the server directs the call to the spam detection engine. The engine then uses the SIP identification of the source to check the authenticity of the call, matching it against the SIP identifications recorded in the blacklist. Apart from this, the spam detection engine also performs call authentication, pattern matching, content analysis etc.

* Call Authentication: The spam detection engine first checks whether the call comes from a legitimate caller. It does so by checking the SIP identification records present in the white list in the database [7]. If the SIP identification of the caller is present in the white list, the call is forwarded to the destination; otherwise the call is dropped.

* Pattern Matching: The spam detection engine keeps track of the SIP identifications of all call sources. It also keeps information about the frequency of calls, the time at which each call was made, the day on which it was made etc. [8]. Based on these patterns, together with the white list and black list in the database, a call may be classified as spam or legitimate.
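As an illustration of the frequency part of pattern matching, the Python sketch below keeps the recent call times of each source in a sliding window. The class name, the thresholds and the rule "more than max_calls within the window is suspicious" are assumptions for the example; the text does not specify the actual rules, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import defaultdict, deque


class CallRateMonitor:
    """Flags a source that places too many calls within a time window."""

    def __init__(self, max_calls, window_seconds):
        self.max_calls = max_calls
        self.window = window_seconds
        self.history = defaultdict(deque)  # caller -> recent call times

    def record(self, caller, timestamp):
        """Store the call and return True if the caller exceeds the limit."""
        calls = self.history[caller]
        calls.append(timestamp)
        # Forget calls that have fallen out of the window.
        while calls and timestamp - calls[0] > self.window:
            calls.popleft()
        return len(calls) > self.max_calls
```

With a limit of three calls per 60 seconds, the fourth call inside one minute is flagged, while a call placed a few minutes later is accepted again because the older entries have expired.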

3.4 Session Establishment

The above figure depicts XLite client 1 calling XLite client 2. The clients use the XTEN XLITE soft phone to communicate with each other. The clients, which are known as SIP User Agents (UAs), use SIP, an application layer protocol which is used to initiate, manage and terminate a session between two clients. SIP soft phones have unique Uniform Resource Identifiers (URIs) to identify themselves to the server [6]. In the above figure, when XLite client 1 sends an INVITE message to the receiver, it is first routed through the server. The server first performs the authentication check by checking its database for the list of registered users. If the client is present in the blacklist, the call is immediately dropped by the server. If the client is a legitimate user, i.e. the client is present in the white list, the INVITE message is forwarded to the receiver. The INVITE message contains information such as the URIs of client 1 and client 2, the type of session etc. When the server forwards the INVITE, the receiver's soft phone sends a 180 Ringing response back to the sender through the server to indicate that the callee is being alerted [18]. When the receiver accepts the call, it sends a 200 OK back to the sender, which the sender confirms with an ACK. As a result a session is established between the sender and the receiver, and voice data can then be exchanged between them. If either of them wants to end the conversation, it sends a BYE request and the session is terminated.
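The message exchange of a successful call can be modelled as a small state machine. The Python sketch below is a simplification: the state names are invented for the example, the ACK that confirms the 200 OK is included as in the usual SIP handshake, and provisional responses other than 180 Ringing, retransmissions and error responses are left out.

```python
# (current state, message) -> next state; state names are hypothetical.
TRANSITIONS = {
    ("idle", "INVITE"): "inviting",
    ("inviting", "180 Ringing"): "ringing",
    ("inviting", "200 OK"): "answered",
    ("ringing", "200 OK"): "answered",
    ("answered", "ACK"): "established",
    ("established", "BYE"): "terminated",
}


def run_dialog(messages):
    """Feed SIP messages in order and return the final session state."""
    state = "idle"
    for message in messages:
        key = (state, message)
        if key not in TRANSITIONS:
            raise ValueError("unexpected %r in state %r" % (message, state))
        state = TRANSITIONS[key]
    return state
```

A spam call that the server drops never leaves the server side of the first transition, so the receiver sees none of these messages.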

3.5 Interfaces:

As mentioned earlier, there are four major components in this project, namely the SIP server, the XTEN XLITE VoIP soft phones, the spam detection system, and the database with its update engine.

When the sender makes a call, the call is routed to the server. The server performs pattern matching, i.e. it compares the call sequence with the blacklisting rules defined in the server to determine whether the call is legitimate or spam. The update engine is used to update the database of the system regarding legitimate users and potential spammers. The ways in which the components of the system communicate with one another at the various interfaces are shown in the following figures:

CALL FLOW OF A SPAM USER

The above diagram is the sequence diagram for a spam caller. When the caller generates a spam call, the call is first directed to the server. The server then checks in its database whether the call is legitimate. If it is not, the server enters the details into the blacklist. As a result the call is not forwarded to the receiver, no response comes from the receiver, and the call is dropped.

CALL FLOW OF A LEGITIMATE USER

The above diagram is the sequence diagram for a legitimate caller. When the caller places a call, it is first directed to the server, which checks the nature of the call against its database. If the call is legitimate, the server enters the details into its white list and forwards the call to the receiver. The receiver then answers the call and a connection is established.

Chapter 4: PROJECT ARCHITECTURE

4.1: Introduction

Spam can flood a network and considerably degrade the performance of the server or the network by consuming resources and bandwidth. There therefore has to be an efficient system to detect spam before spammers can gain access to the system. One such system under development is the VoIP Spam Detection System. Since spammers keep developing new techniques to generate spam, the administrator has to update the system regularly so that it can prevent any kind of spam or unwanted call from reaching the receiver.

The project has three phases. The first phase involved extensive research on how spam works and on the various forms of VoIP spam. The second phase involved finding ways to detect spam, and the third phase involved implementing a client-server model to detect spam. The technique was integrated using the Open SER SIP server, which involved extensive study and research.

4.2 Client Architecture

The clients used for the project are XTEN XLite soft phones. These soft phones use the Session Initiation Protocol (SIP) to communicate between the sender and the receiver. CounterPath's XLITE is the market leading SIP based soft phone [21]. It has various features, namely:

1. Combining voice and video calls

2. Instant messaging

3. Presence management

It acts as a transition from the traditional hard phone environment into the world of VoIP [21].

CONFIGURATION:

XLITE NETWORK SETTINGS:

The above figure shows how the client's network settings are configured in order to communicate with other users. The client uses SIP port 5060 to send and receive signaling messages. Since SIP is only used to set up, manage and tear down a session, the actual carrier of the voice data is the Real-time Transport Protocol (RTP), which uses port 8000 to send and receive data.

XLITE PROXY SETTINGS:

The above figure shows the proxy settings of the client. To enable communication between users through the server, each client needs to be authenticated by the server. Each client therefore has a username and a password, which are registered in the userloc table in the server's database. Whenever a client tries to make a connection through the server, the server checks its userloc table and forwards the call to the intended destination.

4.3: Server Architecture

SIP Express Router (SER) is an open source SIP server developed by a group of developers employed by Fraunhofer FOKUS, a German research institute. The server is responsible for detecting spam calls after user authentication. It identifies spammers and puts them into the black list. To detect spam, the server uses the black list and white list technique, a set of rules for authenticating users and blocking spammers from attacking the network. The server runs on a Linux machine and the application is developed in the C language. The server modules, which are loaded through the ser.cfg file, are the brains of the server. The ser.cfg file has seven main logical sections, namely:

* Global Definitions Section: This section contains the IP address and the port to listen on, the debug level etc. It also contains the major variables used in the server configuration file.

* Modules Section: This section lists the modules which are loaded when the server starts. It names the .so files which are loaded by the loadmodule statements.

* Modules Configuration Section: In this section the module parameters are set. These parameters are set using the modparam command in the following format: modparam(module_name, module_parameter, parameter_value)

* Main Route Block: This block is similar to the main function in a C program. It controls the order in which the commands are executed.

* Secondary Route Block: This block acts as a subroutine of the main route block.

* Reply Route Block: This block is used to handle SIP reply messages, especially the OK messages.

* Failure Route Block: This block is used to handle failure conditions such as timeouts and retransmissions.

The server supports many kinds of databases, such as Oracle, RADIUS, MySQL etc. In this project a MySQL database is used to store the black list and the white list. The server also stores the user location table, which is used to identify the users connected to the network, and the Uniform Resource Identifiers of all the users.

4.4: Database Architecture

The whitelist and the blacklist modules access the database before routing the call to the destination user in the network. The usrloc table in the database is used to store user locations. SIP SER uses this table to locate users in the network on the basis of their URI (Uniform Resource Identifier). The black list module uses the database to keep a record of users who are potential spammers and are prevented from calling other users in the network. A user in the white list is moved to the black list automatically by the server if spam is detected in the nature of the calls made by that user. The package libmysqlclient was installed along with MySQL support before the database tables were created. The mysql.so module named in the server configuration file, ser.cfg, loads the MySQL database support for the server at startup.
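The lookup sequence described above can be sketched in a few lines. This is only an illustration and not the SER module code: it uses Python with an in-memory SQLite database as a stand-in for MySQL, and the `whitelist` and `blacklist` table layouts, the `contact` column and all function names are assumptions made for the example.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory database; table layouts are hypothetical."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE usrloc (uri TEXT PRIMARY KEY, contact TEXT);
        CREATE TABLE whitelist (uri TEXT PRIMARY KEY);
        CREATE TABLE blacklist (uri TEXT PRIMARY KEY);
        """
    )
    return db


def route_call(db, caller_uri, callee_uri):
    """Return the callee's contact address, or None if the call is dropped."""
    # Blacklisted callers are rejected before anything else is looked up.
    if db.execute("SELECT 1 FROM blacklist WHERE uri = ?", (caller_uri,)).fetchone():
        return None
    # Only callers present in the white list may place calls.
    if not db.execute("SELECT 1 FROM whitelist WHERE uri = ?", (caller_uri,)).fetchone():
        return None
    # Locate the callee through the user location table.
    row = db.execute("SELECT contact FROM usrloc WHERE uri = ?", (callee_uri,)).fetchone()
    return row[0] if row else None


def move_to_blacklist(db, uri):
    """Move a user from the white list to the black list."""
    db.execute("DELETE FROM whitelist WHERE uri = ?", (uri,))
    db.execute("INSERT OR IGNORE INTO blacklist (uri) VALUES (?)", (uri,))
```

Checking the black list first means that a user who has been moved out of the white list is rejected with a single query, whatever else the database holds about that user.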

Chapter 5: IMPLEMENTATION

5.1: Introduction

The main objective of this project is to create a spam detection system that detects spam using the SIP server. To accomplish this, one has to start with the basic setup of the IP-PBX system. The implementation is divided into several phases, namely the client implementation, the server implementation, and the white list and black list implementation. The functional requirements of the system are as follows:

* To enable VoIP communication among the users using the IP-PBX system.

* To develop a SIP Proxy server to route the calls from the sender to the receiver.

* To develop SIP based soft phones which run as clients.

* To develop a Spam Detection system which will decide, by checking its database, whether the call needs to be forwarded to the appropriate destination.

* To add user authentication in the server so that only authenticated users can communicate among each other using that SIP server.

* To build an index of information of the countermeasures of spam over the Internet.

The scope of the project includes the following:

* To implement an XTEN XLITE SIP soft phone at the client end to enable voice communication among the users.

* To modify the configuration file of the SIP server and implement various modules in C in order to route calls among the users.

* To implement a MySQL database in the server so that, before routing a call, the server first checks its database to see whether the user is a legitimate user or not.

* To allow multiple users to communicate among themselves over multiple connections using the SIP server.

* To keep the server scalable so that new features can be added as and when necessary.

* To provide an authentication feature which allows only the registered peers to communicate with the other peers in the network.

5.2: Client Implementation

The XTEN XLITE soft phones were launched in April last year and since then 30000 users have used the service [14]. These phones are used as SIP user agents to communicate with each other. They are responsible for generating the INVITE request to the IP-PBX server in order to enable VoIP communication among the users. The phones use the Session Initiation Protocol (SIP) to communicate among themselves. SIP is a text based client-server signaling protocol used for creating and controlling multimedia sessions between two or more participants [17], and it is one of the most important protocols in VoIP.

SIP enables features like video conferencing, video streaming and online games, and runs over either TCP or UDP [6]. The SIP soft phones use port 5060 to connect to other users and use URIs to identify the users. The SIP user agents are responsible for creating and managing a call session. All the calls made from the XTEN XLITE soft phone are routed through a central server which decides where to route the call. The SIP Server has the following entities:

User Agent:

A User Agent is a client that starts a session between client and server. The client always acts as the sender of requests while the server acts as the receiver [6]. The client can be a phone, a fax machine or any other device. Depending on the requirements, the roles of client and server can be interchanged within a single session.

Registrar:

In SIP, the registrar has the same functionality as the gatekeeper in H.323. The registrar works as the database center of the SIP server. It holds information such as the IP address and port of all connected users, and it updates its database as the connections between server and client change [6]. It also performs an authentication function.

Proxy Server:

Proxy Servers execute call signaling on behalf of the parties they serve. A Proxy Server also performs an authentication function. It stays in the path of the call, receives SIP requests and routes them to other users or other Proxy Servers [6]. The server does this based on its routing database and routing rules.

Redirect Server:

A Redirect Server determines the current location of the called party and gives its address to the calling party. The calling party can then contact the called party directly. The Redirect Server is used only to locate the called party and does not stay in the path of the call.

A SIP request such as INVITE carries the following header fields:

Max-Forwards: It is the maximum number of hops the request may still take to reach the destination. Each hop decreases the value by 1, so that the connection request has a time to live (TTL).

Via: It is the address at which the sender expects to receive the response.

To: This contains the address of the destination.

From: This contains the address of the caller who initiated the request.

Call-ID: It is used to uniquely identify the call.

CSeq: It is a sequence number used to order requests and to detect lost messages.
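To make the header fields above concrete, the following minimal Python sketch parses an INVITE request and applies the Max-Forwards rule. The sample message, its addresses and the function names are illustrative assumptions, not traffic captured from the project, and the parser ignores details such as repeated Via headers and folded header lines.

```python
# Illustrative INVITE request; all addresses and tags are made up.
SAMPLE_INVITE = "\r\n".join([
    "INVITE sip:bob@example.com SIP/2.0",
    "Via: SIP/2.0/UDP 192.168.0.10:5060;branch=z9hG4bK776asdhds",
    "Max-Forwards: 70",
    "To: <sip:bob@example.com>",
    "From: <sip:alice@example.com>;tag=1928301774",
    "Call-ID: a84b4c76e66710@192.168.0.10",
    "CSeq: 314159 INVITE",
    "Content-Length: 0",
    "", "",
])


def parse_sip_request(raw):
    """Split a SIP request into method, request URI and a header dictionary."""
    head = raw.split("\r\n\r\n", 1)[0]          # drop the message body
    lines = head.split("\r\n")
    method, request_uri, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")    # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return method, request_uri, headers


def forward_hop(headers):
    """Return headers with Max-Forwards decreased by 1, or None at zero hops."""
    remaining = int(headers.get("max-forwards", "70"))  # 70 is the usual default
    if remaining <= 0:
        return None                             # the request must not be forwarded
    updated = dict(headers)
    updated["max-forwards"] = str(remaining - 1)
    return updated


if __name__ == "__main__":
    method, uri, headers = parse_sip_request(SAMPLE_INVITE)
    print(method, uri, headers["from"], headers["call-id"])
```

A proxy that receives a request whose Max-Forwards value is already 0 does not forward it, which is what the `None` return value stands for here.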

5.3: Server Implementation

The Central SIP Server is the most important component of the system. It controls all the activity until a connection is established. One has to set up a SIP server in order for VoIP communication to take place among the users. The server acts as an IP-PBX which registers the users and authenticates them when a user sends an INVITE request. When the client places a call, the call is first forwarded to the Central SIP Server. The server first checks the authenticity of the call by checking its database to see whether the call is legitimate or not. Once the server confirms that the user is a legitimate user, it updates its white list and forwards the call to the intended destination. If the user is a spam user, the server puts the user into its blacklist, updates the blacklist and drops the call without forwarding it to the destination. The SIP server is available at www.iptel.org and complies with RFC 3261. It supports Oracle, MySQL, PostgreSQL and RADIUS databases as back ends [19]. I am using MySQL as my database back end.

I have implemented SER on an Ubuntu Linux machine. To enable the proper functioning of the server, various other packages such as "libpq" and "libmysqlclient" were installed for database support, and the bison tool was installed as well. The modules named in the server configuration file ser.cfg need to be configured properly. Whenever the server is started, ser.cfg is read because it controls the sequence in which the modules are loaded. The ser.cfg file acts as the kernel of the SIP server and also supports various other modules. To enable a particular feature of the server, a module needs to be created and managed in the ser.cfg file. I have implemented the black list module, which is loaded with the server configuration file. The modules are written in C, and the configuration file uses a C-like syntax.

Once the server is ready, it identifies its users by their URIs. When the server receives a URI, it checks its database for the user, and if a match is found it authenticates the user. To start the server we need to go to the directory where the server is installed. In my case the server is located at the following destination:

The SIP server acts as an IP-PBX which is responsible for routing calls to the intended destination. The spam detection system is developed and integrated with the Open Source SIP server which is responsible for detecting spam in real time. Spam messages created at the client end are tested and verified at the server.

5.4: Database Implementation

Once the server was set up, a MySQL database was created within the server for call authentication. The server configuration file ser.cfg was modified to enable MySQL authentication. A separate module called mysql.so has to be loaded before any other module in the configuration file. This module is always called first whenever a user sends an INVITE request to the server to connect to a remote user within the network. The mysql.so module, along with the auth.so module, performs the function of the white list by enabling the user authentication feature in the server. The database keeps all the information about the users. It contains the black lists and the white lists, which hold information about the legitimate users and the spam users. The server authenticates a user by first checking its database, and if the user is present in the white list the server forwards the call to the destination.

The registered users not present in the black list are considered legitimate users who can communicate among themselves within the network. When a user starts sending spam calls, the server dynamically puts the user in the blacklist, and further calls from that user are always dropped.

5.5: Analysis

The performance of the VoIP Spam Detection System is judged by how well it responds when subjected to spam calls and by how well the white and black lists work. The server authenticates users before it forwards a call to the intended destination. The client first needs to register with the server before making a call. If the client is not registered, the server sends back a 407 response asking for proxy authentication, i.e. for a username and password. The user then registers with the server using the username and the password, and the server compares these credentials with the records of all registered users in its MySQL database. If the user is present in the database, the user is in the whitelist and the call is forwarded to the destination. This is the whitelist concept: only authenticated users are allowed to communicate among themselves. The server works very well under this scenario. The blacklist technique also works under these scenarios: if the client is not registered with the server, or if the caller is considered to be a potential spammer, the call is immediately dropped by the server. A user present in the white list is transferred to the blacklist if it starts making spam calls to other users in the network.
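SIP answers a 407 challenge with HTTP-style digest authentication, so the password itself never travels over the network. The sketch below shows the basic digest computation (the variant without the optional qop parameter) in Python. It assumes that the server stores the HA1 hash rather than the plain password. The function names and sample credentials are hypothetical.

```python
import hashlib
import hmac


def md5_hex(text):
    """Return the MD5 digest of a string as lowercase hexadecimal."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def compute_ha1(username, realm, password):
    """Hash of the credentials; this value can be stored instead of the password."""
    return md5_hex(f"{username}:{realm}:{password}")


def digest_response(ha1, nonce, method, uri):
    """Response value the client sends back after a 407 challenge (no qop)."""
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


def verify(stored_ha1, nonce, method, uri, client_response):
    """Server side check: recompute the response from the stored HA1."""
    expected = digest_response(stored_ha1, nonce, method, uri)
    return hmac.compare_digest(expected, client_response)


if __name__ == "__main__":
    # Hypothetical account and challenge values.
    stored = compute_ha1("alice", "example.com", "secret")
    nonce = "dcd98b7102dd2f0e8b11d0f600bfb0c093"
    answer = digest_response(stored, nonce, "INVITE", "sip:bob@example.com")
    print(verify(stored, nonce, "INVITE", "sip:bob@example.com", answer))
```

Because the nonce changes with every challenge, a response captured from one call setup cannot simply be replayed for another.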

Here are some of the blacklist rules that have been implemented on this server.

* If a client calls 5 times within a span of 2 minutes, the server classifies the caller as a potential spammer, puts the user into its blacklist and updates the list.

* If the connection between the sender and the receiver is slow, or if the bandwidth between the sender and the receiver varies when a call is made, the server dynamically puts the user into its blacklist.

* A specific number which no user is supposed to call can be registered with the server. If a call is made to that number, the server classifies the call as spam and puts the caller into its black list. The following figure explains the scenario:

The server denies any requests sent by users present in the blacklist and is capable of detecting most spam calls sent over the VoIP network. The server is able to handle 500 INVITE requests, i.e. it keeps working when 500 calls are made simultaneously. The password is saved as a hash in the server database, so there is very little possibility of the password being stolen by a potential hacker. The server is user friendly and provides a good user experience. It does not go down when multiple users are accessing the system at the same time. The system is also scalable and portable, as new features and functions like content analysis, pattern matching and voice recognition can be implemented with this SIP server.
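The first and third blacklist rules listed above lend themselves to a short sketch. The Python class below is an illustration under stated assumptions rather than the module implemented on the server: the class and method names are hypothetical, timestamps are passed in as seconds, and the call that triggers a rule is itself dropped.

```python
from collections import defaultdict, deque


class BlacklistRules:
    """Sliding-window call counter plus a decoy-number check."""

    def __init__(self, max_calls=5, window_seconds=120, decoy_numbers=()):
        self.max_calls = max_calls          # 5 calls ...
        self.window = window_seconds        # ... within 2 minutes
        self.decoys = set(decoy_numbers)    # numbers nobody should call
        self.recent = defaultdict(deque)    # caller -> timestamps of recent calls
        self.blacklist = set()

    def allow_call(self, caller, callee, now):
        """Return True if the call may be forwarded, False if it is dropped."""
        if caller in self.blacklist:
            return False
        if callee in self.decoys:
            self.blacklist.add(caller)      # decoy rule: any call here is spam
            return False
        calls = self.recent[caller]
        calls.append(now)
        # Forget calls that fall outside the time window.
        while calls and now - calls[0] >= self.window:
            calls.popleft()
        if len(calls) >= self.max_calls:
            self.blacklist.add(caller)      # rate rule: too many calls too quickly
            return False
        return True


if __name__ == "__main__":
    rules = BlacklistRules(decoy_numbers={"sip:9999@example.com"})
    for second in (0, 10, 20, 30, 40):
        print(second, rules.allow_call("sip:eve@example.com", "sip:bob@example.com", second))
```

Keeping only the timestamps inside the window means the memory used per caller is bounded by the call limit, which matters when many callers are tracked at once.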

Chapter 6: DEPLOYMENT, OPERATIONS AND MAINTENANCE

6.1: Deployment

The deployment of the project involves a detailed analysis of SPIT (Spam over Internet Telephony). The server which has been used is the open-source SER server, available at www.iptel.org. The server was run on an Ubuntu Linux machine, and various packages such as "libmysqlclient" and bison were installed. The server was started using the following command:

joeyfrmfrnds@joeyfrmfrnds-laptop:~$ ser -D -E

A MySQL database was used to store the information about the clients. The authentication feature was also enabled on the server. The command to add a user to the database is as follows:

joeyfrmfrnds@joeyfrmfrnds-laptop:~$ serctl add <number> <password> <user@domainname>

where number stands for the name of the client and the domain name stands for the IP address of the user location.

The XLITE clients were installed on the client machines and were given a username and a password. The XLITE clients are available at www.counterpath.com. The softphone supports both Linux and Windows operating systems. It was given an IP address of 192.168.0.1 and it has a default port of 5060.

The server was then tested with connection requests from the clients trying to connect to each other. The users whose URI (Uniform Resource Identifier) matched the user information stored in the database were allowed to communicate with each other, i.e. only the requests of authenticated users were allowed to go through to the intended destination. The registered users who are present in the white list are put into the black list if they start sending spam messages.

If a user sends multiple connection requests in one minute, then the user is a potential spammer who is trying to congest network traffic and is therefore put in the black list automatically by the server. Our black list is able to detect potential spammers and drop any further communication from those users.

6.2: Operations

The white list and the black list were loaded into the server and their modules were added to the server configuration file. The database was created using MySQL and the users were added using the serctl command. Whenever a user is put into the blacklist, no further communication takes place with that user. If we try to call a private number, or if the user has a wrong username and password, the server does not authenticate the user and no INVITE request is accepted by the server. If the user calls many times within a specific time, the caller is placed into the blacklist and no further communication takes place with that user.

6.3: Maintenance

The server must keep the database up to date at all times so that it does not become obsolete. Because Spam over Internet Telephony is still prevalent and spammers keep developing new techniques to break through, periodic updates of the system are necessary in order to prevent spam. Whenever a new update is developed or a module is loaded, it needs to be tested with the system. The testing approach needs to be foolproof and must also include failure analysis. The server is scalable and portable, so new features can be added as and when necessary depending upon the requirements. We can add new features to the system, but a complete understanding of the server and a deep knowledge of C programming will be required.

6.4: Implementation Resources

The hardware resources that were needed to implement the system successfully were:

* PC: Intel based- PC with a minimum of 512 MB RAM and considerable hard-disk space.

* Network Devices: DLink modem, Linksys wireless router, 8 Mbps internet connection.

The software resources that were needed to implement the system successfully were:

* Operating System: Windows XP, Ubuntu Linux

* Languages: C, C++, MySQL

* Microsoft Office: Microsoft Project, Visio, Word, Excel, PowerPoint

* SmartDraw

Chapter 7: CONCLUSION

7.1: Learning Experience

For a long time email has been the most prevalent channel for spam on the internet. Spammers have developed many effective techniques to send spam through email. Now their focus has turned to VoIP networks, or Internet telephony. This type of spam is known as Spam over Internet Telephony, or SPIT. Spammers have started developing new techniques to send spam across such networks. Unlike email spam, VoIP spam can only be detected in real time, which makes it much more difficult to detect. One has to have detailed knowledge of the system, study the nature of spam and look for specific patterns in order to prevent spam from reaching the intended destination.

Spam messages cause trouble and incur costs for both network administrators and users. They consume network bandwidth and time. Therefore a system must be developed to prevent spam. One such system is this project, the VoIP Spam Detection System. The project was completed in two phases, Phase A and Phase B. In Phase A, a lot of research was carried out on Voice over Internet Protocol (VoIP), the Session Initiation Protocol (SIP), enabling communication between SIP user agents, how spammers flood VoIP networks, and various techniques to prevent them from gaining access to the network. Phase B involved the actual implementation of our research. The first step was to install the SIP SER on our Linux machines. This was followed by installing the Xten Xlite softphones on the client machines and configuring them according to our network setup.

The system was first set up to enable VoIP communication between clients without authentication by the server. I then implemented the concept of the white list in our VoIP environment by enabling the server to authenticate users before forwarding calls to the destination users within the network. This allowed the server to verify the user accounts on receiving connection requests from the users and to allow communication only after successful authentication. The black list was implemented in the final stages of the project to prevent potential spammers from gaining access to the network. If a user who is registered with the server and is present in the white list starts sending spam calls, the server puts that user into the black list, and all further connection requests from that user are dropped automatically by the server. The default SIP port 5060 was open to allow for sending and receiving connection requests and other messages. The server requested a login ID and password from the clients before they were allowed to log in to the server and gain access to the network resources.

A dedicated and sincere effort has been made to create such a system but as one knows nothing is perfect, there is always room for improvement. As a result many new features can be added to the system as and when required. The System is scalable as well as portable allowing effective Voice communication among the users in the VOIP network.

VoIP technology has a promising future. Smarter applications must be integrated with VoIP technology to achieve privacy, block unwanted calls and provide a better phone experience. Small businesses growing into larger ones often fear unfamiliar technology. A lot of work is being done to ease their concerns and provide them with convenient service. Users who do not have a broadband connection also find it easier to use VoIP technology over PSTN.

Chapter 8: REFERENCES

1. Spam over Internet Telephony and How to deal with it, Diploma thesis - Rachid El Khayari, Fraunhofer Institute for Secure Internet Telephony.

2. S. Dritsas, J. Mallios, M. Theoharidou, G. F. Marias, and D. Gritzalis, Threat analysis of the session initiation protocol regarding spam. Technical report, IEEE, 2007.

3. Voice Over IP - Security and SPIT Swiss Army, FU Br 41, KryptDet Report Rainer Baumann Stephane Cavin Stefan Schmidt University of Berne, August 24 - September 8, 2006

4. MacIntosh, R., Vinokurov, D.: Detection and mitigation of spam in IP telephony networks using signalling protocol analysis, pp. 49-52 (2005).

5. Radermacher, T.A.: Spam Prevention in Voice over IP Networks. University of Salzburg,Salzburg (2005).

6. J. Rosenberg and C. Jennings, The Session Initiation Protocol (SIP) and Spam, draft-ietf-sipping-spam-04.

7. Vincent M. Quinten, Remco van de Meent, and Aiko Pras, Analysis of Techniques for Protection against Spam Over Internet Telephony, University of Twente.

8. H. J. Kang, Z. L. Zhang, S. Ranjan, and A. Nucci, "SIP-based VoIP traffic behavior profiling and its applications," in Proceedings of the 3rd annual ACM workshop on Mining network data, Jun '07, pp. 39-44.

9. S. Y. Park, J. T. Kim, and S. G. Kang, "Analysis of applicability of traditional spam regulations to VOIP spam," presented at 8th International Conference. Advanced Communication Technology. ICACT 2006, vol. 2, pp. 3.

10. R. MacIntosh, and D. Vinokurov, "Detection and mitigation of spam in IP telephony networks using signaling protocol analysis," in IEEE/Sarnoff Symposium'05. Advances in Wired and Wireless Communication, Apr 2005, pp. 49-52.

11. D. Shinder, "Don't fall prey to these methods of VOIP abuse," [online document] [2006, Nov 22], Available at http://articles.techrepublic.com.com/5100-10878_11-6137937.html

12. P. Korzeniowski, "VOIP emerging as next spam entryway," [online document] [2005, Aug 24], Available at http://www.technewsworld.com/story/45518.html

13. RFC 3261-"Session Initiation Protocol", http://www.ietf.org/rfc/rfc3261.txt [October 5, 2008]

14. "US VoIP market shares", http://www.itfacts.biz/us-voip-market-shares-vonage-539-verizon-55-callvantage-55-sunrocket-4-lingo-26/7049 [October 16, 2008]

15. http://en.wikipedia.org/wiki/Voip

16. http://gigaom.com/2006/04/14/pstn-versus-voip/

17. http://www.iptel.org/

18. Heung Youl Youm, Technical means to Combat Spam in the VoIP Service, Soonchunhyang University, South Korea.

19. P.Hazlett, S.Miles & G.Teigre."SER Getting Started," http://www.iptel.org/ser/doc/gettingstarted [OCT 11 2008].

20. Juergen Quittek, Saverio Niccolini, Sandra Tartarelli and Roman Schlegel, On Spam over Internet Telephony (SPIT) Prevention, NEC Europe Ltd, Report IEEE, 2008.

21. http://www.counterpath.net/x-lite.html
