Inter-Agent Coordinated Security Model for Cloud Based Virtual Machines

* Lecturer, Federal Polytechnic, Bida, Nigeria.

** Lecturer, Department of Computer Science, University of Ibadan, Ibadan, Nigeria.

Abstract

For a user who desires to utilize the services of the cloud, security is not negotiable. Cloud Service Providers (CSPs) have security features that help protect users' data and information. These features are, however, not comprehensive. The Service Level Agreements (SLAs) of most CSPs have certain exclusions that require users to take on some security measures themselves, especially tenants having Virtual Machines (VMs) in a multi-tenant architecture. This means that users who are ignorant of the security implications might be exposed to great risks. This paper presents a security model, based on distributed agents and built with the OPNET (Optimized Network Engineering Tools) Modeler, to prevent attacks from rogue virtual machines and enhance the security of VM-to-VM communication. A set of mobile devices was given varying levels of access and pitted against some servers. Observing the packet network delays, the phase response times of the security apps, and the coordination between these mobile devices and the agents installed on the servers showed that data belonging to tenants are safer and attacks from virtual machines are almost negligible.

Keywords: Multi-Agent, Distributed Agent, Virtual Machine, Service Level Agreement

Introduction

Cloud computing refers to shared computing resources, software, or data that are delivered as a service and on demand through the internet via virtualization. It offers advantages like reduced costs, scalability, and increased utilization of hardware resources. Users of the cloud can lease resources in accordance with their needs and requirements and pay only for the services that they eventually use.

There are various interpretations of cloud computing, but according to the National Institute of Standards and Technology (NIST), cloud computing consists of five essential features, three service models, and four deployment models. The five essential features are broad network access, virtualized computing resource pool, measured service, rapid elasticity, and on-demand self-service; the three service models are Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS); the four deployment models are private, community, public, and hybrid clouds (Che, Duan, Zhang, & Fan, 2011).

Figure 1 shows the different features, service, and deployment models of cloud computing.

Figure 1. The NIST's definition model of cloud computing (Che et al., 2011)

Cloud computing would almost be impossible without the concept of abstraction. Abstraction is the technological process of hiding the details of lower level implementations in order to limit the depth of intricacies involved in a design process. Abstracted systems usually have limitations due to the Instruction Set Architecture (ISA) used by different manufacturers. Virtualization helps to handle this kind of limitation. Virtualization is the mapping of resources and interface of a given system onto the interface and resources of an existing system that does the actual implementation (Smith & Nair, 2005a). At a certain level of abstraction, a virtualized environment behaves just like the real system upon which it is mapped. Virtualization as a technology isolates the low-level resources and provides virtualized resources for high-level applications (Zhao, Sakr, Liu, & Bouguettaya, 2014).

Via virtualization, service providers extend the physical servers to be used as individual machines in a multi-tenant setup. The virtualized system is called a Virtual Machine (VM), and the act of deploying VMs over the physical machines is called placement (Gaggero & Caviglione, 2016).

As VM placements increase, especially for mobile devices, there is also an increase in threats and vulnerabilities. These security concerns grow consistently as clouds become the heart of storage for digital data (Khalil, Khreishah, & Azeem, 2014). Cloud Service Providers are hence burdened with the task of ensuring that users are secure; however, the CSPs offer vulnerable security mechanisms (Felici & Fernández-Gago, 2015). This enormous task of securing users' data may be responsible for CSPs like Microsoft Azure, Google Compute Engine, and Amazon S3 having exclusions in their Service Level Agreements (SLAs) that protect them from being indicted on grounds of security breaches in multi-tenancy settings (Amazon S3 Service Level Agreement, 2018; SLA for Storage, 2015; Google Compute Engine Service Level Agreement, 2018).

Protecting Virtual Machines might look like the primary task of Cloud Service Providers (CSPs), and many users may decide to depend on the basic security that a cloud environment offers. This, however, is not a safe option. The risk envisaged here is that clients are exposed to possible data breaches, which are likely to result from malicious attacks by neighbouring tenants who operate rogue Virtual Machines in the same physical cloud space (Kumar, Kumar, Kumar, & Kumar, 2014). This paper proposes a model that mitigates the security risks faced by users within a multi-tenant cloud environment.

1. Cloud Computing Security Models

1.1 NIST Multi-Tenant Model

Multi-tenancy allows Cloud Service Providers to leverage virtualization (Smith & Nair, 2005b) in providing services such as sharing resources among various customers. Virtualization helps them to isolate faults, intrusions from outsiders, and malicious applications (Zhao et al., 2014).

Multi-tenancy brings difficulties, which include performance customization and data isolation. Respectively, this means that performance has to be assured irrespective of individual customers' demands, and that customers' data must not intermix. The multi-tenancy model differs based on the cloud deployment model investigated. These differences could help in satisfying customers' unique demands as regards security (Che et al., 2011).

1.2 CSA Cloud Risk Accumulation Model

To correctly analyze the risks associated with security in cloud computing, an understanding of the layer dependency of cloud service models is critical. SaaS is built on PaaS, which in turn is built on IaaS, the foundation layer of all cloud services. This implies a relationship between these service models. Just as service capabilities are inherited across service layers, so also are security risks. Computing is also inherited between different service layers. IaaS does not hold many security functions, and requires that customers take charge of the security of operating systems, software applications, and content. For PaaS, the capability of developing customized applications for customers exists, and hence the inbuilt security capabilities are not comprehensive; customers, however, have the ability to implement additional security of their choosing. SaaS handles more security issues for the customers, making them less involved in security management, and the layer provides the most integrated services of the three layers (Che et al., 2011).

The lower the layer of service, the more security conscious a customer has to be and the more security management issues he has to handle.

The proposed model combines information from the multi-tenant model of NIST and the Cloud Risk Accumulation model of CSA to propose a model working specifically for the multi-tenant architecture in the Platform-as-a-Service layer.

2. Security Vulnerabilities

Cloud vendors claim to have top-notch security as it relates to data and computational environments for their subscribers. This claim is not always true, and for it to be, governing bodies at much higher levels have to work together rather than leaving it to individual organisations (Khalil, Khreishah, & Azeem, 2014). A survey of some cloud security issues identified several dependencies that affect users' security in the cloud (Khalil et al., 2014). Some dependencies that are directly related to data breaches (data loss and leakage) include authentication mechanism weakness, malicious insiders, and lack of quality of service. These dependencies are responsible for the following attacks:

Cross-VM Side-Channel Attack: This requires the attacker and the victim to be located on the same physical machine, and parallelism in the cloud makes this type of attack difficult to handle.
Energy Consumption Side-Channel Attack: Here the attacker identifies the host system and gains access to the energy consumption log, which compromises the privacy and security of the user.
VM Roll Back Attacks: Hypervisors have the ability to take snapshots of VMs and to suspend or resume such VM states when needed. An attacker can take advantage of such a snapshot to launch a brute force attack. The attacker can reset the VM counter and roll back the state while trying multiple times to force a way in, and remains untraceable if the history is cleared.

3. Rogue Virtual Machines

Rogue virtual machines are known security risks in multi-tenant cloud setups. They are not identified by their type of configuration or any defined feature, but by the characteristics or activities that the machine exhibits over time. Any VM can go rogue depending on who is carrying out operations using the machine or what applications are allowed to run on it. Colbert and Batten (2011) point out that the ways by which VMs can go rogue include:

VM Copying: This involves making legitimate duplicates of a VM that are then mishandled through unethical processes, thereby compromising users' data. When this happens, the VM copies are said to be rogue VMs.
VM Replacement: It occurs when a malfunctioning VM is replaced by another. In the course of replacement, an infected VM can be substituted and this can corrupt existing user data. The issues of copying and replacement can be checked using machine profiling, staff profiling and by invoking the system logs.

Perez-Botero, Szefer, and Lee (2013) mentioned that hypervisors, which are of two types, Type-I (bare metal) and Type-II (hosted), are prone to vulnerabilities, and stated a number of defenses that are being used to mitigate these vulnerabilities. These defenses include, but are not limited to:

Resilient code base: This defense strategy is used to make the hypervisor code more resistant to injection. Hardening code is seen as a clever programming technique against attempts to circumvent the control flow.
Management Protection: This involves guarding the kernel of the hypervisor from untrusted management, it focuses especially on the VM attack vector path.

The number of defenses that keep emerging reflects the dynamism of the vulnerabilities involved, which evolve and get modified daily. This means that research must continue to follow up and attempt to preempt the next line of vulnerabilities.

4. Inter-Agent Security Model

A viable solution focused on the data breach security vulnerability is carried out on the Platform-as-a-Service tier of the cloud space. The public cloud is used because of the high potential of having a multi-tenant setup architecture. The protection level is effected from the user's point of view and not that of the cloud service provider. This model however will not recover any lost user information, but will stand to prevent intrusions from rogue virtual machines and other intruders.

The model was originally intended to be deployed in the cloud of one of the available Cloud Service Providers. Due to the security restrictions adopted in these clouds, the experiment could not be deployed on any of them. However, a near-to-real environment was created in an enterprise simulator, and the model was successfully deployed and tested.

A basic cloud capable of running virtualizations of attack scenarios was created in the OPNET (Optimized Network Engineering Tools) Modeler. The cloud comprises six high-end servers configured with large-scale memories and storage spaces. The servers are configured to operate as parallel processors. In real clouds, there may be thousands of such servers forming a massive parallel processing system for the Virtual Machines (VMs). Hence, this model may be viewed as a VM-based cloudlet that can be scaled up organically to produce a massive cloud computing infrastructure.

5. Methodology

The servers are interconnected through switches using ATM OC12 links at the core and Gigabit Ethernet links as peripheral links. The cloud design components in OPNET Modeler are based on research by Al Aqrabi, Liu, Hill, and Antonopoulos (2014), which was for running business intelligence and data warehousing on cloud computing. Given that this research is not concerned with a specific cloud-based application, the application profile is simply named “APPLICATION”. Figure 2 shows the setup of the model.

Figure 2. Basic Layout of the Model Design

The Access Point (AP) for the wireless domain is a generic cellular mobile tower following IEEE 802.11n specifications with 0.005W transmission power and a cell size of 1 km radius. A single cell with eleven mobile phones is modeled in this design. The IEEE 802.11n interfacing speed has been used for ensuring mobile interfacing to cloud computing.

Within the mobile cell, there are four types of clients. Clients M4, M7, and M8 have been provided only Network-level Access (N_A) to the cloud without defining the destination preferences for their runtime profiles. Simply stated, such clients have no “knowledge” about the servers on the cloud. Clients M2 and M3 have both Destination Preferences (D_P) and Network Access (N_A) to the cloud, but do not have any security agents (A_S, A_D) installed. The fundamental rule followed is that unless at least the Static Agent (A_S) is installed, the client will not get the Virtual Machine Runtime (VM_R) for accessing the main cloud application. Client M1-A has Destination Preferences (D_P), Network Access (N_A), and the security agents (A_S, A_D) installed. Hence, it has been provided the Virtual Machine Runtime (VM_R).

Clients M5-H and M5-A_H are also authorized users with Destination Preferences (D_P), Network Access (N_A), and agents installed (A_S, A_D), but they exhibit “suspicious behaviors” by trying to access the application servers without getting the virtual machines (VM_R). In practice, such accesses can be attempted using exploit code with malicious payloads (for example, via the Metasploit framework). There is a difference between Clients M5-H and M5-A_H. Client M5-H has only the static agent (A_S). Assuming that the static agent does not know about the hacking behavior of the client because of older security updates, the Virtual Machine Runtime (VM_R) has been enabled (the static agent may take a decision if there is a delay in the arrival of the dynamic agent amidst network congestion or TCP outages). This means that while this client could get access to one of the VMs, it is still not authorized to access the cloud application. This act of denial can later be seen as a false positive flag. The knowledge of hacking behaviors can only be propagated by the dynamic agent (A_D) updates sent by the firewalls (through security updates). In such cases, the VM assignment is delayed pending further investigation. Figure 3 shows the task setup for the model.

Figure 3. Task Attributes Main Window for Operating the Agent-based Cloud Computing Security

The details of the tasks and their databases are given below.

Meta-Inspection: checks the authorization details of the client seeking access to the cloud servers. It is advised by a separate database of cloud user accounts called Metadata-Db. This task is run by the tenant metadata application.
Stateful-Inspection: checks the present and previously known states of a client to find out if the client is clean or is a hacker. This inspection also includes the user-defined settings for clients (although such manual settings are expected to be rare given the sheer volumes on a cloud computing infrastructure). It is advised by a separate database of known states called State-Db. The firewall handles this task.
Exploit-Inspection: checks if the packets sent by the client through a TCP or UDP session have traces of known exploit signatures. It is advised by a separate database of known exploits called Exploit-Db. The IDPS (Intrusion Detection and Prevention System) handles this task.
Malware-Inspection: checks if the packets sent by the client through a TCP or UDP session have traces of known malware signatures (like viruses or Trojan horses). It is advised by a separate database of known malware called Malware-Db. This task is handled by the anti-malware application.
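The four tasks above can be read as a sequential pipeline in which a session must pass every inspection. The following is a minimal Python sketch of that reading, not the OPNET task configuration itself. The in-memory sets and lists standing in for Metadata-Db, State-Db, Exploit-Db, and Malware-Db, the `Session` type, the byte-substring signature matching, and the treatment of an unknown state as not clean are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory stand-ins for the four advisory databases.
METADATA_DB = {"M1-A", "M5-H"}                    # clients with a cloud account
STATE_DB = {"M1-A": "clean", "M5-H": "hacker"}    # last known state per client
EXPLOIT_DB = [b"\x90\x90\x90\x90"]                # illustrative exploit signature
MALWARE_DB = [b"EVIL_TROJAN"]                     # illustrative malware signature


@dataclass
class Session:
    client_id: str
    packets: list = field(default_factory=list)   # raw payloads as bytes


def meta_inspection(session):
    """Authorization check against the user accounts database."""
    return session.client_id in METADATA_DB


def stateful_inspection(session):
    """Known-state check; an unknown state is treated as not clean (assumption)."""
    return STATE_DB.get(session.client_id) == "clean"


def _free_of(packets, signatures):
    """True if no packet contains any of the given byte signatures."""
    return not any(sig in pkt for pkt in packets for sig in signatures)


def exploit_inspection(session):
    return _free_of(session.packets, EXPLOIT_DB)


def malware_inspection(session):
    return _free_of(session.packets, MALWARE_DB)


INSPECTIONS = [
    ("Meta-Inspection", meta_inspection),
    ("Stateful-Inspection", stateful_inspection),
    ("Exploit-Inspection", exploit_inspection),
    ("Malware-Inspection", malware_inspection),
]


def run_inspections(session):
    """Run the tasks in order; return (passed, name_of_first_failed_task)."""
    for name, check in INSPECTIONS:
        if not check(session):
            return False, name
    return True, None
```

Stopping at the first failed task is a design choice of this sketch; a deployment that wants a full audit trail could instead run all four tasks and collect every failure.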

5.1 Detecting Clients

Any client that does not have all of the following variables is flagged as a suspicious user (rogue). The formula below gives the variables that a client must possess before it can be authenticated at the point of initializing connectivity. This check is also repeated periodically in case a clean machine or user goes rogue during operation. The safety of the cloud space and that of all users is tied to the cumulative safety of all virtual machines in the cloud space.

(1)

where AC = Authenticated Client

i= Identity of client/user

n= total number of clients connecting to the cloud server

N_A = Network Access

D_P = Destination Preference

A_S = Static Agent Profile (Update)

A_D = Dynamic Agent Profile

VM_R = Virtual Machine Runtime Allocation
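The body of Equation (1) is not legible in this copy. One reading that is consistent with the definitions above, and with the rule that a client missing any variable is flagged as rogue, is a per-client conjunction of the five variables. The conjunction form and the per-client subscript are assumptions, not the authors' stated formula:

```latex
\begin{equation*}
AC_i \;=\; N_{A,i} \,\wedge\, D_{P,i} \,\wedge\, A_{S,i} \,\wedge\, A_{D,i} \,\wedge\, VM_{R,i},
\qquad i = 1, \dots, n
\end{equation*}
```

Under this reading, client i is authenticated only when all five terms hold, and the cloud space as a whole is safe only when every connected client satisfies the condition.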

The formula translates to the hierarchy shown in Figure 4 and in Table 1. Every authenticated client follows a bottom-up approach, starting from receiving network access up to the point that a virtual machine is allocated to the client to access the application in the cloud.

Figure 4. Bottom-Up Privilege Rise for Authenticated Client

Table 1. Access Chart for Connected Users/Devices

The flowchart model that illustrates the security check is shown in Figure 5. Users must undergo these checks to be given privileged access to the application in the cloud. If any section fails, the user is marked as rogue.

Figure 5. Flowchart of Security Checks that Authenticates User

The access chart (Table 1) shows that only device M1-A and the authorized user can access the application in the cloud. The other devices are seen as rogue virtual machines or malicious users and are denied access.
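The access decision can be sketched in a few lines of Python. The flag values below are a reading of the client descriptions in Section 5, since the cells of Table 1 are not reproduced here; the `Client` type and the all-five-flags rule are assumptions for illustration, and only the named devices are modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Client:
    name: str
    n_a: bool = False    # Network Access
    d_p: bool = False    # Destination Preference
    a_s: bool = False    # Static Agent
    a_d: bool = False    # Dynamic Agent
    vm_r: bool = False   # Virtual Machine Runtime allocation

    def authenticated(self):
        """Assumed rule: all five variables must be present."""
        return all((self.n_a, self.d_p, self.a_s, self.a_d, self.vm_r))


# Flags inferred from the prose description of each client (assumption).
CLIENTS = [
    Client("M1-A", n_a=True, d_p=True, a_s=True, a_d=True, vm_r=True),
    Client("M2", n_a=True, d_p=True),
    Client("M3", n_a=True, d_p=True),
    Client("M4", n_a=True),
    Client("M7", n_a=True),
    Client("M8", n_a=True),
    # Static agent only; VM runtime enabled on stale security updates.
    Client("M5-H", n_a=True, d_p=True, a_s=True, vm_r=True),
    # Both agents installed; VM assignment delayed pending investigation.
    Client("M5-A_H", n_a=True, d_p=True, a_s=True, a_d=True),
]


def access_chart(clients):
    """Map each client name to 'granted' or 'rogue'."""
    return {c.name: "granted" if c.authenticated() else "rogue" for c in clients}


if __name__ == "__main__":
    for name, decision in access_chart(CLIENTS).items():
        print(f"{name:8s} {decision}")
```

With these assumed flags, M1-A is the only modelled device that is granted access, which matches the outcome stated for the access chart.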

6. Evaluation And Result

The scenarios were configured and the simulations were conducted. The packet network delay figures were extracted from the reporting engine and plotted to observe the effects of the model in limiting unauthorized access to the cloud application:

Packet network delays: These are delays in delivery of the packets from source to destination. Figure 6 presents the packet network delay of all the security inspections that occur for a finite time. These simulations are recorded from the inception of an authorized user's login access.

From Figure 6, it is noticed that the anti-malware packet delay has not exceeded 0.11 milliseconds, the firewall inspection packet delay has not exceeded 0.15 milliseconds, and the intrusion detection inspection packet delay has not exceeded 0.11 milliseconds. These delays are so minimal that an intruder will not get past these checks in time to commit any serious crime.

Figure 6. Packet Network Delay

Phase response times: Since the apps are created as a sequence of phases, this statistic is chosen to explore how fast the phases are completed. Figure 7 shows the phase response times for the security apps.

Figure 7. Response Times of Phases of the Security Apps

As observed in Figure 7, the phase response times of these security inspection apps while interacting with their respective databases are also in similar ranges; hence, this also implies that any exploit will be checked in time.

Figure 8 highlights the throughputs and response time for databases working with the apps.

Figure 8. Response Times and Throughputs of Databases supporting the Security Apps and also the Cloud Application App

This delay in overall database response times may increase with the addition of more cloud applications. Thus, the challenge in this design is related to the response times of the databases of the security inspections (Metadata-Db, State-Db, Exploit-Db, and Malware-Db) and the database-to-database replications (security updates). In traditional designs, the entire database of a security control is mostly local to a client. For example, when real laptops connect to the cloud, they possess local databases for anti-malware, which are updated periodically as per a schedule. In this design, only a small part of the databases shall be local to a client (mobile or laptop) and the rest shall be hosted on cloud server arrays among all other databases. Given that the cloud servers work in massively parallel computing mode, the larger the server arrays, the better the database response times as this design is scaled up.
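The split between a small local database and a larger cloud-hosted one can be illustrated with a short Python sketch. The `SignatureStore` class, the counter standing in for a network round trip, and the choice to cache cloud hits locally are all hypothetical and are not part of the simulated design.

```python
class SignatureStore:
    """A small local subset backed by a larger cloud-hosted set (hypothetical)."""

    def __init__(self, local, cloud):
        self.local = set(local)    # part of the database held on the client
        self.cloud = set(cloud)    # remainder hosted on the cloud server arrays
        self.cloud_lookups = 0     # stands in for network round trips

    def is_known(self, signature):
        """Check the local subset first, then fall back to the cloud copy."""
        if signature in self.local:
            return True
        self.cloud_lookups += 1
        if signature in self.cloud:
            # Assumed design choice: remember cloud hits locally.
            self.local.add(signature)
            return True
        return False


if __name__ == "__main__":
    store = SignatureStore(local={"sig-a"}, cloud={"sig-a", "sig-b"})
    for sig in ("sig-a", "sig-b", "sig-b", "sig-c"):
        print(sig, store.is_known(sig), store.cloud_lookups)
```

Every lookup that misses the local subset costs a round trip to the cloud-hosted database, which is why the database response times discussed above dominate the overall delay in this kind of design.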

In the course of implementation, cases of false positives and false negatives were encountered. These arose due to the delay in updates between the static agent and the dynamic agent, which caused a degree of inaccuracy in the predictions at some points.
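How an update lag can produce a wrong decision is sketched below in Python. The `DynamicAgent` and `StaticAgent` classes, the version counter, and the set of known-bad client identifiers are hypothetical simplifications; the simulated agents are not described at this level of detail.

```python
class DynamicAgent:
    """Carries the latest knowledge of hacking behavior (hypothetical model)."""

    def __init__(self):
        self.version = 0
        self.known_bad = set()

    def publish(self, client_id):
        """Record a newly identified rogue client and bump the update version."""
        self.known_bad.add(client_id)
        self.version += 1


class StaticAgent:
    """Decides locally, using whatever update it last received."""

    def __init__(self):
        self.version = 0
        self.known_bad = set()

    def sync(self, dynamic_agent):
        """Apply the dynamic agent's current update."""
        self.known_bad = set(dynamic_agent.known_bad)
        self.version = dynamic_agent.version

    def is_stale(self, dynamic_agent):
        return self.version < dynamic_agent.version

    def admits(self, client_id):
        return client_id not in self.known_bad


if __name__ == "__main__":
    dynamic, static = DynamicAgent(), StaticAgent()
    dynamic.publish("M5-H")
    # Before the update arrives, the static agent still admits the client.
    print("stale:", static.is_stale(dynamic), "admits:", static.admits("M5-H"))
    static.sync(dynamic)
    print("stale:", static.is_stale(dynamic), "admits:", static.admits("M5-H"))
```

In this sketch the window between `publish` and `sync` is where a false negative can occur, so shortening that window reduces the exposure at the cost of more update traffic.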

Conclusion

The use of static and dynamic agents in a multi-tenant architecture to emphasize security bridges the gap left by cloud service providers in the Platform-as-a-Service tier. An inter-agent model was developed using these agents. The static agents were modeled to run all controls locally in clients, and the dynamic agents were modeled to provide frequent security updates to the static agents.

Observing the packet network delays, the phase response times of the security apps, and the coordination between the mobile devices and the agents installed on the servers showed that data belonging to tenants are safer and attacks from virtual machines are almost negligible. This implies that the model is feasible and can be implemented in a real cloud.

Future Works

Emphasis should also be placed on limiting the occurrence of false positives and false negatives without incurring an increase in overhead or in the use of system resources. This could be achieved by increasing the frequency of update between the dynamic and static agents and the frequency of revalidation of clients by the static agent.

References

[1]. Al Aqrabi, H., Liu, L., Hill, R., & Antonopoulos, N. (2014, August). A multi-layer hierarchical inter-cloud connectivity model for sequential packet inspection of tenant sessions accessing BI as a service. In High Performance Computing and Communications, 2014 IEEE 6th Intl. Symp. on Cyberspace Safety and Security, 2014 IEEE 11th Intl. Conf. on Embedded Software and Syst. (HPCC, CSS, ICESS), 2014 IEEE Intl. Conf. on (pp. 498-505). IEEE.

[2]. Amazon S3 Service Level Agreement. (2018). Amazon Simple Storage Service. Retrieved from https://aws.amazon.com/s3/sla/

[3]. Che, J., Duan, Y., Zhang, T., & Fan, J. (2011). Study on the security models and strategies of cloud computing. Procedia Engineering, 23, 586-593.

[4]. Colbert, B., & Batten, L. M. (2011, January). Dealing with rogue virtual machines in a cloud services environment. In CLOSER 2011: Proceedings of the 1st International Conference on Cloud Computing and Services Science (pp. 43-48). INSTICC.

[5]. Felici, M., & Fernández-Gago, C. (Eds.). (2015). Accountability and Security in the Cloud: First Summer School, Cloud Accountability Project, A4Cloud, Malaga, Spain, June 2-6, 2014, Revised Selected Papers and Lectures (Vol. 8937). Springer.

[6]. Gaggero, M., & Caviglione, L. (2016, July). Model predictive control for the placement of Virtual Machines in cloud computing applications. In American Control Conference (ACC), 2016 (pp. 1987-1992). IEEE.

[7]. Google Compute Engine Service Level Agreement. (2018). Google Cloud. Retrieved from https://cloud.google.com/compute/sla

[8]. Khalil, I. M., Khreishah, A., & Azeem, M. (2014). Cloud computing security: A survey. Computers, 3(1), 1-35.

[9]. Kumar, V. K. A., Kumar, R., N., Kumar, K. K., & Kumar, S. N. K. (2014). Survey on security threats in cloud computing. International Journal of Applied Engineering Research (IJAER), 9(21), 10495-10500.

[10]. Perez-Botero, D., Szefer, J., & Lee, R. B. (2013, May). Characterizing hypervisor vulnerabilities in cloud computing servers. In Proceedings of the 2013 international workshop on Security in cloud computing (pp. 3-10). ACM.

[11]. SLA for Storage. (2015). Service Level Agreement. Retrieved from https://azure.microsoft.com/en-in/support/legal/sla/storage/v1_0/

[12]. Smith, J. E., & Nair, R. (2005a). The architecture of Virtual Machines. Computer, 38(5), 32-38.

[13]. Smith, J., & Nair, R. (2005b). Virtual Machines: Versatile Platforms for Systems and Processes. Elsevier.

[14]. Zhao, L., Sakr, S., Liu, A., & Bouguettaya, A. (2014). Cloud Data Management. Springer.