DDoS (Distributed Denial of Service) attacks are a major threat to security. These attacks mainly originate from the network layer or the application layer of compromised systems that are connected to the network. Their main intention is to deny or disrupt the services or the network bandwidth of the victim or target system. Nowadays, application layer DDoS attacks pose a serious threat to the Internet. Differentiating between legitimate (normal) and malicious traffic is a very difficult task. A lot of research has been done on detecting these attacks using machine learning approaches. In this paper, the authors propose machine learning metrics for detecting application layer DDoS attacks.
Denial of Service (DoS) attacks are one of the major threats that we face today. A DoS attack is any intentional, malicious attempt to make a network resource or a server unavailable to its users. This type of attack aims to deny or degrade normal service to legitimate users by sending a large amount of traffic to the victim or target system, exhausting its services and computational resources such as bandwidth, disk space, and processor time. The resource targeted by a DoS attack can be a specific computer, a port, an entire network, or a component of a given network. When the traffic of a DoS attack comes from many different sources, it is called a Distributed Denial of Service (DDoS) attack. DDoS attacks reduce or completely disrupt services to genuine users by expending the communication and/or computational resources of the target. DDoS attacks can disable a specific computer, a service, or an entire network; target devices such as alarms, printers, laptops, or phones; exhaust system resources such as bandwidth, disk space, and processor time; and crash the operating system.
The recent DDoS attacks on popular websites such as Yahoo and eBay, and on their services, have exposed the vulnerability of the Internet to DDoS attacks. DDoS attackers are usually motivated by the following reasons:
Attacks that are launched for financial gain are often the most dangerous and the most difficult to stop. They are mainly a concern for corporations, and carrying them out requires more experience and technical skill.
The attacker launches an attack to block the resources of the victim or target system, which slows down the performance of both the system and the network.
Attackers in this category are driven by their ideological beliefs to attack their targets. This is one of the major incentives for attackers to launch DDoS attacks.
Attackers in this category attack victim systems in order to experiment and to learn how to launch various attacks. They are usually young hacking enthusiasts who want to show off their competence.
In this attack, the attacker overloads the services offered by the victim system with unwanted or fake traffic.
Some of the open problems are listed below:
Nowadays, DDoS attacks pose a very real threat to business operations, so countermeasures need to be developed against them. This proves to be a very difficult task because of the following factors:
Common types of DoS attacks include network device attacks, operating system level attacks, application level attacks, data flood attacks, and protocol feature attacks. DDoS attacks can be broadly classified into three types, as shown in Figure 1: bandwidth depletion attacks, resource depletion attacks, and application layer attacks.
Bandwidth depletion attacks flood the target network with unwanted traffic, preventing legitimate traffic from reaching it. These attacks can be classified into direct flood attacks and reflection flood attacks. Direct flood attacks are network layer attacks. In a UDP (User Datagram Protocol) flood attack, a large number of UDP packets are sent to the victim system; likewise, TCP packets or ping packets can be sent in order to exhaust the target system. In a reflection attack, the attacker sends messages to the broadcast IP address, causing the systems in the subnet to send replies to the target system. In a Smurf attack, the attacker sends ICMP echo packets to a network amplifier with the return address spoofed to the target's IP address. A Fraggle attack is similar to a Smurf attack, except that it uses UDP echo packets instead of ICMP echo packets.
Resource depletion attacks exhaust the resources of the target system so that it is unable to process legitimate user requests. They are classified into protocol exploit attacks and malformed packet attacks. In a DDoS TCP SYN attack, the attacker sends bogus connection requests to the target server in order to consume the server's processor resources, thereby preventing legitimate requests from being served. In a PUSH + ACK attack, the attacker sends TCP packets in which the PUSH and ACK (Acknowledgement) bits are set to 1. These bits in the TCP packet header instruct the target system to unload the data in the TCP buffer and to send an acknowledgement when it is complete. In a malformed packet attack, the attacker instructs the slaves to send incorrectly formed IP packets to the target system in order to crash the network or the system. In the IP address attack, the packet contains the same source and destination address, which confuses the operating system and can cause the system to crash. In the IP packet options attack, the optional fields in the IP packet are randomized so that the victim must spend additional processing time analyzing the traffic.
Application layer DDoS attacks send HTTP (Hypertext Transfer Protocol) GET/POST requests to the target server. These requests can consume CPU (Central Processing Unit), memory, and other resources. Attackers can also launch DDoS attacks through protocols such as the Domain Name System (DNS), File Transfer Protocol (FTP), Voice over Internet Protocol (VoIP), and Simple Mail Transfer Protocol (SMTP).
DDoS attacks can target different layers of the OSI model. Many of them concentrate on the network layer, such as ICMP (Internet Control Message Protocol) flooding, SYN flooding, and UDP flooding attacks. These are also called Net-DDoS attacks. Since application layer DoS attacks usually do not manifest themselves at the network level, they evade traditional network-layer-based detection. Hence, the security community has focused on specific application layer DoS attacks. Attacks that are aimed at the application layer are also called App-DDoS attacks. In other words, an application layer DDoS attack sends its requests through a legitimate communication protocol, so at the network layer these requests are impossible to differentiate from legitimate requests. Implementing detection algorithms at the application layer is harder than at the network layer. At the network layer, blocking is carried out at the router or the IDS, which may or may not be on the application's host machine. An application layer DDoS attack may be one of the following types or a combination of them [1]:
1. A session flooding attack sends session connection requests at a rate higher than that of legitimate users,
2. An asymmetric attack sends sessions that contain more high-workload requests, and
3. A request flooding attack sends sessions that contain more requests than normal sessions.
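The three categories above can be read as three independent threshold checks against a baseline of legitimate behaviour. The following is a minimal sketch of that reading, not a method from the cited work: the `ClientStats` and `Baseline` types, their field names, and the threshold values in the usage lines are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClientStats:
    """Observed behaviour of one client (all field names are hypothetical)."""
    sessions_per_minute: float
    requests_per_session: float
    heavy_request_ratio: float  # share of high-workload requests, 0.0 to 1.0


@dataclass
class Baseline:
    """Upper bounds assumed to describe legitimate users (hypothetical)."""
    max_sessions_per_minute: float
    max_requests_per_session: float
    max_heavy_request_ratio: float


def suspected_attack_types(stats: ClientStats, baseline: Baseline) -> List[str]:
    """Return every category whose defining quantity exceeds the baseline."""
    types: List[str] = []
    if stats.sessions_per_minute > baseline.max_sessions_per_minute:
        types.append("session flooding")
    if stats.heavy_request_ratio > baseline.max_heavy_request_ratio:
        types.append("asymmetric")
    if stats.requests_per_session > baseline.max_requests_per_session:
        types.append("request flooding")
    return types


# Illustrative values only.
baseline = Baseline(max_sessions_per_minute=30.0,
                    max_requests_per_session=50.0,
                    max_heavy_request_ratio=0.4)
flooder = ClientStats(sessions_per_minute=120.0,
                      requests_per_session=15.0,
                      heavy_request_ratio=0.1)
combined = ClientStats(sessions_per_minute=5.0,
                       requests_per_session=200.0,
                       heavy_request_ratio=0.9)
print(suspected_attack_types(flooder, baseline))
print(suspected_attack_types(combined, baseline))
```

Returning a list rather than a single label reflects that an attack may combine several of the three types.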
Most of the challenging application layer DDoS attacks employ botnets. A botnet consists of attackers (also called masters), handlers, and bots. Attackers use handlers to communicate indirectly with the bots. Bots are the devices that have been compromised by the handlers. These bots send a large number of attack requests to the target server over the legitimate HTTP protocol. The attacks mainly focus on exhausting server resources such as CPU, sockets, memory, and I/O bandwidth, so that legitimate users are unable to use the services of the network resources.
Traditional network layer DDoS attacks are giving way to more sophisticated application layer attacks. Supranamaya Ranjan et al. [1] proposed a statistical approach that uses rate limiting as the primary detection mechanism. Jian Yuan and Kevin Mills [2] used a cross-correlation analysis method that decides whether an attack has occurred by capturing the traffic patterns in the network. Jung et al. [3] filter DDoS attacks by IP address at the HTTP level, but this algorithm does not work against DDoS attacks that use many legitimate IP addresses, and it also fails when NAT (Network Address Translation) is used. Yi Xie and Shun-Zheng [4] introduced a new way to monitor DDoS attacks by using a hidden semi-Markov model. Wen et al. [5] designed CALD to protect against DDoS attacks that masquerade as flash crowds. Liu and Chang [6] introduced DAT (Defense system Against Tilt DDoS attacks) to defend against attacks with high processing complexity. Ye et al. [7] introduced a clustering method to analyze application layer DDoS attacks.
Das et al. [8] proposed an approach and algorithm consisting of three phases to handle different HTTP flooding scenarios. Byers et al. [9] discussed theoretical approaches to AL-DDoS attacks. Yatagai et al. [10] introduced a page access behaviour technique to detect HTTP-GET flood attacks. Srivatsa et al. [11] adopted a method called admission control to limit the number of clients. Jin Wang et al. [12] introduced a relative entropy method for application layer DDoS attacks; it uses clustering methods to extract the features of web objects and calculates the relative entropy for each session. Oikonomou and Mirkovic [13] introduced defense mechanisms against application layer DDoS attacks that use human behaviour modelling to differentiate DDoS bots from human users. Jie Yu et al. [14] proposed a lightweight mechanism, the Trust Management Helmet scheme, that uses trust to differentiate attackers from legitimate users. Pawel et al. [15] introduced a method that uses entropy-based clustering and Bayes factors to differentiate between attacking and legitimate sequences.
A few studies have proposed machine learning techniques such as the Cumulative Sum (CUSUM) algorithm, Support Vector Machines (SVM), Principal Component Analysis (PCA), and neural classifiers. Bharati and Sukanesh [16] presented a framework for detecting application layer DDoS attacks using PCA. The framework works by building profiles of user browsing activity, and it achieves a very low attack detection time compared with other existing methods. Raj Kumar and Selvakumar [17] proposed an approach based on an ensemble of neural classifiers that reduces the overall error. This approach is used to detect TCP, UDP, ICMP, and HTTP based attacks. Stefan Seufert and Darragh [18] proposed machine learning techniques, mainly Artificial Neural Networks (ANN), for an automatic defense mechanism against DDoS attacks. Manjula and Anitha [19] presented various machine learning methods, such as Naive Bayes, C4.5, SVM, KNN, and K-means clustering, for efficient detection of DDoS attacks. Shyamala Devi and Umarani [20] proposed multivariate machine learning approaches that can be used to detect anomaly-based application layer DDoS attacks. Umarani and Sharmila [21] use PCA to detect DDoS attack traces; two classifiers, Naive Bayes and K-Nearest Neighbours, are then used to categorize the traffic as normal or abnormal.
The authors use two metrics, request chain length and discovering time frame length, and evaluate them on the LLDoS (Lincoln Laboratory DoS) 2.0.2 and CAIDA (Cooperative Association for Internet Data Analysis) datasets.
Given a training set of cached transactions labelled N (normal) or D (DDoS attack), the request chain length (RCL) is the number of requests that a client can be expected to send in a given timeframe, and it is used to label a timeframe as attack or normal. It is defined as the sum of the average length of the chain of transactions and the RMSD (Root Mean Square Deviation) of the chain of transactions length.
The RCL can be measured as follows:
For a given set of transactions in CSN as,
The average length of the chain of transactions observed in all the timeframes discovered from CSN is as follows:
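Writing the timeframes discovered from CSN as TS(CSN) = {ts_1, ..., ts_n}, one form of equation (2) that is consistent with its description in the text is given here. The symbol \mu and the explicit index range are notation assumed for this sketch.

```latex
\mu_{TS(CSN)} = \frac{1}{|TS(CSN)|} \sum_{i=1}^{|TS(CSN)|} |ts_i|
```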
Equation (2) gives the mean of the transaction counts observed over all the timeframes TS(CSN), where |ts_i| is the count of transactions observed in the span of the ith timeframe.
The deviation of the chain length observed in all the timeframes discovered from CSN is as follows:
The Request Chain Length is then obtained as follows:
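Taking the deviation to be the root mean square deviation named in the definition of the request chain length, and writing \mu for the mean transaction count of equation (2), one plausible form is the following. The symbols \mu and \sigma are notation assumed here; if a mean absolute deviation is intended instead, the squared term and the square root would be replaced by the absolute value of the difference.

```latex
\sigma_{TS(CSN)} = \sqrt{ \frac{1}{|TS(CSN)|} \sum_{i=1}^{|TS(CSN)|} \left( |ts_i| - \mu \right)^{2} }
```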
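As a minimal sketch of the computation described in this section, the RCL can be taken as the mean of the per-timeframe transaction counts plus their root mean square deviation. The function names, the sample counts, and the rule that a timeframe is labelled D when its count exceeds the RCL learned from normal traffic are assumptions made for illustration and are not taken from the paper.

```python
from math import sqrt
from typing import Sequence


def request_chain_length(counts: Sequence[int]) -> float:
    """Mean of the per-timeframe transaction counts plus their RMSD."""
    if not counts:
        raise ValueError("at least one timeframe is required")
    n = len(counts)
    mean = sum(counts) / n
    # Root mean square deviation of the counts around their mean.
    rmsd = sqrt(sum((c - mean) ** 2 for c in counts) / n)
    return mean + rmsd


def label_timeframe(observed_count: int, rcl_normal: float) -> str:
    """Assumed decision rule: 'D' (attack) above the normal RCL, else 'N'."""
    return "D" if observed_count > rcl_normal else "N"


# Hypothetical transaction counts for four timeframes of normal traffic.
normal_counts = [10, 12, 14, 12]
rcl = request_chain_length(normal_counts)
print(round(rcl, 4))
print(label_timeframe(40, rcl), label_timeframe(11, rcl))
```

Computing a second RCL from the attack-labelled set CSD in the same way would allow the two values to be compared directly, as is done in the results.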
For each dataset, CSN and CSD, order the sessions by their initiation time:
Using the k-means algorithm [22], the sessions are clustered by their beginning and ending times.
For each cluster Ci, consider the set of beginning times of all the sessions that belong to Ci.
Likewise, consider the set of ending times of all the sessions that belong to Ci.
Then the time frame tf(Ci) of the cluster Ci is measured as,
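One natural reading, stated here as an assumption, is that the time frame of a cluster spans from the earliest session beginning time to the latest session ending time in that cluster. With b(s) and e(s) denoting the beginning and ending times of a session s (notation introduced for this sketch), this gives:

```latex
tf(C_i) = \max_{s \in C_i} e(s) - \min_{s \in C_i} b(s)
```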
Then, find the average of time frame length observed from all the clusters as follows:
Further, the time frame absolute deviation is given by,
Then fix the time frame tf as the sum of the average time frame length and the time frame absolute deviation, as follows:
Here, the authors use the LLDoS 2.0.2 [23] and CAIDA [24] datasets together with their own metrics, discovering time frame length and request chain length.
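The steps above can be sketched end to end as follows. This is an illustration under several assumptions rather than the authors' implementation: sessions are (begin, end) pairs in seconds, a plain Lloyd's k-means with deterministic seeding stands in for the k-means algorithm of [22], the time frame of a cluster is taken as its latest ending time minus its earliest beginning time, and the deviation is computed as a mean absolute deviation. All names and sample values are hypothetical.

```python
from typing import List, Sequence, Tuple

Session = Tuple[float, float]  # (begin_time, end_time) in seconds


def kmeans_sessions(sessions: Sequence[Session], k: int,
                    iters: int = 50) -> List[List[Session]]:
    """Cluster sessions on (begin, end) with a basic Lloyd's k-means."""
    ordered = sorted(sessions)  # order the sessions by initiation time
    if not 1 <= k <= len(ordered):
        raise ValueError("k must be between 1 and the number of sessions")
    # Deterministic seeding: k sessions spread evenly over the ordered list.
    step = len(ordered) / k
    centroids = [ordered[int(i * step)] for i in range(k)]
    clusters: List[List[Session]] = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for s in ordered:
            nearest = min(
                range(k),
                key=lambda j: (s[0] - centroids[j][0]) ** 2
                + (s[1] - centroids[j][1]) ** 2,
            )
            clusters[nearest].append(s)
        # Recompute centroids; an empty cluster keeps its previous centroid.
        updated = [
            (sum(b for b, _ in c) / len(c), sum(e for _, e in c) / len(c))
            if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return [c for c in clusters if c]


def time_frame_length(clusters: Sequence[Sequence[Session]]) -> float:
    """Average cluster time frame plus its mean absolute deviation."""
    spans = [max(e for _, e in c) - min(b for b, _ in c) for c in clusters]
    mean_span = sum(spans) / len(spans)
    abs_dev = sum(abs(s - mean_span) for s in spans) / len(spans)
    return mean_span + abs_dev


# Hypothetical sessions forming two bursts of activity.
sessions = [(0, 10), (2, 12), (5, 20), (100, 110), (103, 130), (108, 125)]
clusters = kmeans_sessions(sessions, k=2)
tf = time_frame_length(clusters)
print(len(clusters), tf)
```

The deterministic seeding is used only to make the sketch reproducible; a library implementation with randomized initialization would serve equally well for real data.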
Table 1 shows the Request Chain Length calculated with a fixed time frame of 1 hour and a variable packet size, i.e., 10571, 22261, 31882, 40502, and 50283. The Request Chain Length (RCL) is calculated both for the normal (legitimate) requests and for the application layer DDoS attack (illegitimate) requests. Comparing the RCL of the two classes for a given time frame allows normal requests and App-DDoS attack requests to be distinguished effectively.
Table 1. Request Chain Length Calculation
Figure 2 shows the Request Chain Length calculated using the CAIDA dataset. The graph compares the average chain length of the normal requests with that of the App-DDoS requests. The X-axis shows the variable packet size, i.e., 10571, 22261, 31882, 40502, and 50283, and the Y-axis shows the request chain length values, in seconds, for the normal and the App-DDoS requests. The blue line represents the normal requests and the red line represents the App-DDoS attack requests. Figure 2 shows a clear difference: the RCL is higher for the App-DDoS attack requests than for the normal requests.
Figure 2. Request Chain Length
Table 2 shows the calculated time frame lengths. The time frame length is calculated for variable packet sizes, i.e., 10000, 20000, and 33875. First, the average time frame length is calculated, and then the time frame absolute deviation. Adding the two values gives the time frame length. The time frame length is calculated both for the normal (legitimate) requests and for the App-DDoS attack requests, and it is expressed in the time format hh:mm:ss.
Table 2. Time Frame Length Calculation
Figure 3 shows the results for the time frame length metric on the DARPA LLDoS 2.0.2 dataset. The graph shows the time frame length for the normal requests and for the App-DDoS attack requests. The X-axis shows the variable packet size and the Y-axis shows the time frame. The blue color indicates the time frame obtained from the normal requests, and the red color indicates the time frame obtained from the App-DDoS attack requests. Figure 3 shows a clear difference between the normal and the attack requests. Hence, the authors conclude that the time frame for the attack requests is higher than that for the normal requests.
Figure 3. Time Frame Length
This paper deals with two machine learning metrics, Request Chain Length (RCL) and Discovering Time Frame Length (DTFL). Using these metrics together with the k-means clustering algorithm, application layer DDoS attacks can be detected effectively, as shown on the LLDoS 2.0.2 and CAIDA datasets.
In the future, additional machine learning metrics, such as packet type and the intervals between sessions, can be used to detect application layer DDoS attacks.