Cloud computing is an emerging paradigm that orchestrates multiple cloud technologies and is becoming the mainstream way of providing services. In software, fault tolerance (FT) mechanisms mask failures early to improve reliability. To address this challenge, Zibin Zheng proposed a component ranking framework named FTCloud to tolerate failures in software. However, FTCloud does not implement further characteristic factors such as throughput, nor dynamic fault-tolerance mechanisms. To ensure reliability, the 'Dynamic FTCloud' framework concentrates on throughput, using a random graph model in FTCloud1 while employing response time for services. FTCloud2 extends FTCloud by focusing on the failure probability of components. In this paper, a dynamic optimal fault-tolerance strategy is implemented in the framework alongside the earlier design diversity techniques. The results show that tolerating faults in the significant components yields a substantial improvement in reliability.
Cloud computing is the backbone of various internet services (computing, storage, data access, etc.) in which end users do not depend on the physical location of their systems. The key concept of cloud computing is to offer low-cost units of computing from a pool of shared resources distributed across different places [1]. Nowadays the cloud provides infrastructure management for each service. Its advantage is that services are provided on demand on a pay-per-use basis. To compose services, all legacy services must be combined into components, where availability becomes a major concern [15], because services sometimes become unavailable when faults occur. Many large companies such as Amazon, Google, and Microsoft, web hosting companies such as Rackspace and GoGrid, and start-ups such as Flexiant and Heroku are becoming service providers. A practical challenge thus arises: providing services without loss of time and data.
To analyze the availability of cloud services, assessing the components is a reasonable solution. As the cloud can host applications as Software as a Service (SaaS), many applications are being deployed in the cloud [2]. However, ensuring availability, maintenance, and reliability [4] is becoming complex while providing services. Hence multiple redundant components are replicated at various distributed places. One of the reasons for the unavailability of cloud components is the lack of fault tolerance [8]. Building a highly reliable environment for providing services through components is a challenging task. To build reliable software, the traditional engineering discipline uses fault tolerance, fault removal, fault prevention, and fault forecasting [6]. Software fault-tolerance techniques provide protection against errors in translating the requirements into components, but they do not provide explicit protection against errors in specifying the requirements. Design diversity [7] is a well-known fault-tolerance technique in which components are designed to tolerate faults. It provides an identical service through separate implementations whose designs are independent of one another.
However, developing fault-tolerant components is costly, and it is therefore applied only to mission-critical applications such as space research systems and flight control systems. As cloud applications have a large number of components, developing fault-tolerant versions of all of them is very expensive. To reduce the effort and time needed to make the system robust, only the critical components should be identified, and fault tolerance should be built for those alone. As reported by Microsoft, fixing the top 20 percent of bugs in critical components can avoid 80 percent of crashes and failures [5]. Based on this 80-20 rule, Zheng et al. [25] proposed a component ranking framework named “FTCloud”, which makes cloud applications reliable [6] by making them fault tolerant. Its two ranking algorithms identify the critical components, and an appropriate fault-tolerance strategy is then applied to them to ensure the best performance of cloud applications. However, FTCloud can be further enhanced by considering the throughput characteristics of the components in cloud applications. To focus on reliability, our contributions include:
The rest of the paper is organized as follows. Section 1 describes the related literature on fault-tolerant cloud applications. Section 2 introduces the proposed component ranking framework. Section 3 specifies the throughput-based significant ranking. Section 4 implements the dynamic FT strategy. Section 5 analyzes the experimental results.
Traditional software reliability engineering concentrates on the customer's perspective, and fault tolerance is widely used for building reliable applications from multiple redundant copies [9]. Many fault-tolerance techniques obtain reliability by preventing faults from occurring in the various phases. Fault-tolerant software provides service complying with the relevant specification in spite of faults. FT techniques include distributed recovery blocks [10], N-version programming [11], and N self-checking programming [12]. Fault-tolerance strategies are classified into active and passive strategies based on how the redundant components are replicated [13]. Active strategies, used in applications such as WS-Replication [22] and FTWeb [23], send requests to various replicas at the same time and accept the first response as the final result. Passive strategies, such as FT-SOAP [24], send requests in sequence: the primary web service produces the final response, and backups are tried only if it fails. Software fault tolerance is considered a feasible approach for building a highly reliable cloud computing environment, so making such components reliable and fault tolerant is very important. Zheng et al. [25] proposed a component ranking framework, “FTCloud”, which is used to build fault-tolerant cloud applications.
A significant current research problem is how to invoke and rank components [17] with weight calculation for highly reliable QoS. Our approach in this paper is influenced by the design in [25] and improves the component ranking framework by considering more fault-tolerance approaches when selecting critical components. In this way, the Dynamic FTCloud framework enhances robustness and reliability for building cloud applications.
To address the above issues, we propose a component ranking that includes throughput and ranks the significant components dynamically. Dynamic FT strategies are then applied to the selected components, so that an optimal fault-tolerance strategy is dynamically suggested to application designers.
The component rank model is a repository of software component libraries, which are off-the-shelf programs, with a novel graph representation as the end result. Frequently used components are ranked higher so that designers have quick access to them. Components in the cloud are divided into critical and noncritical components, which reduces the fault-tolerance effort needed to provide reliable services to IaaS users; this is the main focus of this paper.
In the proposed framework, Dynamic FTCloud enhances the performance of the component ranking framework through its component ranking structures and its use of dynamic fault-tolerance strategies.
By implementing the above framework, designers of cloud applications can build dynamic, highly reliable, and robust systems that are extremely fault tolerant.
The Dynamic FTCloud framework includes throughput-based ranking and dynamic optimal FT selection [8]. Dynamic FTCloud components are ranked by their invocation relationships, and designers evaluate performance to design the structure of the cloud components.
Figure 1 shows how the significant components are identified. Arrows denote inputs, dotted lines denote outputs of the application, and the intermediate results of every transition step are recorded in a document.
Figure 1. Architecture of the Dynamic FTCloud framework
Cloud computing is an internet-based service model in which each software component interacts with other components as nodes. These nodes are represented as a graph with internet connections as edges. Taking as input the component graph proposed by Michael R. Lyu [25], the performance of a service is measured by its throughput, which is the approach taken in FTCloud1.
The throughput, TH, is the average number of services output by a component per unit time, e.g., the number of services delivered through the internet per hour. Being time-dependent, throughput is calculated for a component over a set of services/messages using the following equation:
TH = number of service requests completed / time taken to complete the service
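The throughput equation above can be illustrated with a short calculation. This is only a minimal sketch: the function name `throughput` and the sample figures (1800 requests completed in one hour) are assumptions made for illustration and are not measurements from the experiments.

```python
def throughput(completed_requests: int, elapsed_seconds: float) -> float:
    """Return TH as completed service requests per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completed_requests / elapsed_seconds


# Hypothetical measurement: 1800 service requests completed in one hour.
th_per_second = throughput(1800, 3600.0)
th_per_hour = th_per_second * 3600  # same quantity expressed per hour
print(th_per_second, th_per_hour)
```

Expressing the elapsed time in seconds and converting afterwards keeps the unit of TH explicit, which matters when throughput values of different components are compared during ranking.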
The main goal is to measure the performance of cloud component services under different numbers of concurrent cloud components. There are many existing tools for performance evaluation, for example JMeter, LoadUI, and java bench.
The number of user services is compared with the throughput of the service output, as shown in Figure 2. As user services increase, throughput gradually increases up to a certain threshold point, after which services are delivered at the same throughput.
Figure 2. Throughput analysis
Sustained fault tolerance is crucial for critical components, whereas it has little significance for noncritical components.
Many fault-tolerance strategies have been proposed in [19] that depend on the structure of the design, for example Recovery Block (RB), N-Version Programming (NVP), and Parallel Programming. These techniques cannot achieve dynamic reliability in the cloud. Hence we implement dynamic FT strategies, for example Adaptive N-version Programming and Fuzzy voting, for FTCloud1 and FTCloud2 to decrease the failure probability, as shown in Table 1.
Table 1. Impact of Application Failure Probabilities
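As an illustration of one of the static strategies named above, the following minimal sketch shows the general Recovery Block pattern: alternates are tried in order until one passes an acceptance test. The names `recovery_block`, `buggy_sqrt`, and `accept_sqrt` are hypothetical and chosen only for this example; a full recovery block would also restore a state checkpoint before each alternate, which is omitted here because the variants are side-effect free.

```python
import math


def recovery_block(alternates, acceptance_test, x):
    """Try each alternate in order; return the first result that passes the test."""
    for variant in alternates:
        try:
            result = variant(x)
        except Exception:
            continue  # a crash counts as a failed alternate
        if acceptance_test(x, result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


def buggy_sqrt(x):
    # Hypothetical faulty primary version.
    return -1.0


def accept_sqrt(x, result):
    # Acceptance test: result must be non-negative and square back to x.
    return result >= 0 and abs(result * result - x) < 1e-9


print(recovery_block([buggy_sqrt, math.sqrt], accept_sqrt, 9.0))
```

Because the order of alternates and the acceptance test are fixed at design time, the strategy is static in the sense used above.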
Adaptive N-version programming is similar to N-version programming, except that an individual weight factor for each component version and the actual uptime utilization of adaptive services [14] are included [20]. The voting procedure is then conducted based on the maximum weight factor, as shown in Figure 3. In the component-based strategies, throughput is considered when building the individual versions of a system component. Here, N-version programming is defined as the independent generation of N >= 2 functionally equivalent programs from the same initial specification.
Here, Rmi,n is the reliability of module stage i comprising n version modules. The failure probability is obtained by subtracting the reliability given by Equation 1 from one.
Figure 3. Adaptive N-Version
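The weighted voting step of adaptive N-version programming can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the framework: the function name `weighted_vote` and the weight values are hypothetical, and how the weights are derived from throughput or uptime is not modeled here.

```python
from collections import defaultdict


def weighted_vote(outputs, weights):
    """Return the output whose supporting versions carry the largest total weight."""
    if not outputs or len(outputs) != len(weights):
        raise ValueError("outputs and weights must be non-empty and equal in length")
    totals = defaultdict(float)
    for out, w in zip(outputs, weights):
        totals[out] += w  # accumulate the weight behind each distinct output
    return max(totals, key=totals.get)


# Three hypothetical versions: two agree on 42 but carry low weights,
# the third returns 41 and carries a higher weight.
outputs = [42, 42, 41]
weights = [0.2, 0.3, 0.6]
print(weighted_vote(outputs, weights))
```

In this example a plain majority vote would select 42, whereas the weighted vote selects 41, which shows how the per-version weight factor changes the outcome compared with ordinary N-version voting.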
In fuzzy voting, the correct output is selected from the different outputs obtained from the various redundant software versions. The traditional voting method is based on classifying the outputs into disjoint subsets [16]. This is similar to N-version programming with
A fuzzy relation expresses the degree to which the elements of the sets that comprise the relation are interconnected [18]. A fuzzy equivalence relation exists if and only if the properties of reflexivity, symmetry, and transitivity are all satisfied.
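The three properties can be checked mechanically for a fuzzy relation given as a square membership matrix. The sketch below assumes max-min transitivity, which is the usual definition for fuzzy equivalence (similarity) relations; the function name `is_fuzzy_equivalence` and the sample matrices are hypothetical and serve only as an illustration.

```python
def is_fuzzy_equivalence(r, tol=1e-9):
    """Check reflexivity, symmetry, and max-min transitivity of matrix r."""
    n = len(r)
    reflexive = all(abs(r[i][i] - 1.0) <= tol for i in range(n))
    symmetric = all(
        abs(r[i][j] - r[j][i]) <= tol for i in range(n) for j in range(n)
    )
    # Max-min transitivity: r[i][k] >= max over j of min(r[i][j], r[j][k]).
    transitive = all(
        r[i][k] + tol >= max(min(r[i][j], r[j][k]) for j in range(n))
        for i in range(n)
        for k in range(n)
    )
    return reflexive and symmetric and transitive


similar = [
    [1.0, 0.8, 0.4],
    [0.8, 1.0, 0.4],
    [0.4, 0.4, 1.0],
]
not_transitive = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.7],
    [0.2, 0.7, 1.0],
]
print(is_fuzzy_equivalence(similar), is_fuzzy_equivalence(not_transitive))
```

The second matrix fails because the degree relating the first and third elements (0.2) is lower than the degree implied through the second element, min(0.8, 0.7) = 0.7.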
This paper focuses on dynamic fault-tolerance strategies. A performance comparison of four fault-tolerance strategies is presented in Table 2.
Table 2. Comparison of FT Strategies
The experimental application is built in the Java programming language. It was developed with Java frames, and the database is hosted on a XAMPP server. The Pajek tool [21] is used to model the various cloud application components. After generation, the values are extracted to specify throughput and failure probability. The impact of the failure probability of cloud applications with respect to fault-tolerance strategies and component rankings is presented in Table 2.
The dynamic component ranking algorithms FTCloud1 and AllFTCloud include throughput, whereas FTCloud2 includes additional FT strategies. The results are visualized in the following graphs.
Figure 4 shows the performance of all strategies. Except for N-version, the curves are linear, because those strategies rank the significant components in a serial structure, whereas N-version follows a parallel structure.
Figure 4. Impact of Component Failure Probability in FTCloud1
Figure 5 shows that the failure probability is tolerated by applying FT strategies only to critical components. For recovery block and N-version, the failure probability increases as the number of components increases, because they are static strategies. The parallel strategy takes throughput into account when responding, so its failure probability decreases, while Adaptive N-version and fuzzy voting include throughput dynamically, so their probabilities remain linear even as the number of components in the cloud environment increases.
Figure 5. Impact of Component Failure Probability in FTCloud2
Figure 6 shows the case where all strategies are applied to all components in the cloud system. Performance changes drastically because the failure probabilities affect every strategy. Throughput is included for every component to obtain the response time.
Figure 6. Impact of Component Failure Probability in AllFTCloud
To study the impact of the dynamic component ranking approach on failure probabilities and system performance, each dynamic FT strategy is evaluated with FTCloud1, FTCloud2, and AllFTCloud, with the 'top-k' value set to 50 components for analyzing the results. FTCloud2 shows a distinct curve in which performance is greatly increased. When AllFTCloud is applied, the impact of components on every strategy is clearly identified. The above results show that FTCloud2 and AllFTCloud achieve the best performance.
The “Dynamic FTCloud” framework is used to build highly reliable, fault-tolerant distributed cloud applications. With the component ranking framework, one can not only tolerate crashes and faults when providing services to users, but also identify malicious components in an asynchronous environment.
To gain more insight, we proposed a throughput measure that influences the ranking of components, together with new techniques for optimal fault-tolerance strategies. As a result, the failure probability at each redundant component level is decreased and reliability is increased. In this work, throughput is taken as the parameter for providing services and ranking components. With the random graph model in the cloud environment, the performance of components is increased. The empirical results show that the proposed framework is more robust and highly fault tolerant.