Blockchain Scalability Analysis and Improvement of Bitcoin Network through Enhanced Transaction Adjournment Techniques
Data Lake System for Essay-Based Questions: A Scenario for the Computer Science Curriculum
Creating Secure Passwords through Personalized User Inputs
Optimizing B-Cell Epitope Prediction: A Novel Approach using Support Vector Machine Enhanced with Genetic Algorithm
Gesture Language Translator using Morse Code
Efficient Agent-Based Priority Scheduling and Load Balancing Using Fuzzy Logic in Grid Computing
A Survey of Various Task Scheduling Algorithms in Cloud Computing
Integrated Atlas-Based Localisation Features in Lung Images
A Computational Intelligence Technique for Effective Medical Diagnosis Using Decision Tree Algorithm
A Viable Solution to Prevent SQL Injection Attack Using SQL Injection
Discovering potential patterns in complex data is a hot research topic. In this paper, the author proposes an iterative data mining model based on "Interval-Value" clustering, "Interval-Interval" clustering, and "Interval-Matrix" clustering. "Interval-Value" clustering exploits the features of interval data together with a numeric threshold and is designed as "Netting" → "Type-I clustering" → "Type-II clustering"; "Interval-Interval" clustering exploits the features of interval data together with an interval threshold and is designed around interval medium clustering; "Interval-Matrix" clustering exploits the features of interval data together with a matrix threshold and is designed around matrix threshold clustering. The author's motivation is to mine interval-valued association rules from a given dataset, and an experimental study is conducted to verify the new data mining method. The experimental results show that the data mining model based on interval-valued clustering is feasible and effective.
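As an illustration of the threshold-driven grouping described above, the following Python sketch clusters interval-valued data against a numeric threshold; the linking rule and the greedy cluster growth are assumptions for illustration only and do not reproduce the paper's exact "Netting" → "Type-I" → "Type-II" procedure.

def interval_value_clusters(intervals, threshold):
    # Two intervals are "linked" when they overlap or the gap between
    # them is at most the numeric threshold; each interval joins the
    # first existing cluster containing a linked member, otherwise it
    # starts a new cluster (a simple single-link style grouping).
    def linked(a, b):
        gap = max(a[0], b[0]) - min(a[1], b[1])
        return gap <= threshold

    clusters = []
    for iv in sorted(intervals):
        for cluster in clusters:
            if any(linked(iv, member) for member in cluster):
                cluster.append(iv)
                break
        else:
            clusters.append([iv])
    return clusters

print(interval_value_clusters([(1, 2), (2.5, 3), (10, 12), (11, 13)], threshold=1.0))
# [[(1, 2), (2.5, 3)], [(10, 12), (11, 13)]]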
Nowadays the accuracy of databases is increasingly important: clean data is crucial for maintaining a database in the current IT-based economy, and many organizations rely on databases to carry out their day-to-day operations. Consequently, a large body of work on duplicate detection, also referred to as entity resolution and by various other names, focuses mainly on pair selection, aiming to increase both efficiency and recall. Duplicate detection is the process of recognizing multiple representations of the same real-world entity. Among indexing-based techniques, progressive duplicate detection is a novel approach that sorts the dataset by a defined sorting key and compares records that fall within a sliding window. To obtain results even faster than the traditional approaches, a new algorithm is proposed that combines the progressive approach with scalable approaches to find duplicates progressively and in parallel. The algorithm is shown to maximize the efficiency of finding duplicates within a limited execution time without losing effectiveness.
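The windowed comparison over a sorted dataset can be sketched in Python as follows; the generator, the toy similarity function, and the record layout are hypothetical and only illustrate how a progressive sorted-neighbourhood pass emits likely duplicates early, not the authors' exact parallel algorithm.

def progressive_sorted_neighbourhood(records, key, is_duplicate, max_window=5):
    # Sort once by the blocking key, then compare pairs in rounds of
    # growing window distance, so the most promising pairs (immediate
    # neighbours in the sort order) are reported first.
    ordered = sorted(records, key=key)
    for distance in range(1, max_window):
        for i in range(len(ordered) - distance):
            a, b = ordered[i], ordered[i + distance]
            if is_duplicate(a, b):
                yield a, b

# Hypothetical usage: records are dicts and the similarity test is a toy.
people = [{"name": "Jon Smith"}, {"name": "John Smith"}, {"name": "Ann Lee"}]
same = lambda a, b: a["name"].replace("h", "") == b["name"].replace("h", "")
pairs = progressive_sorted_neighbourhood(people, key=lambda r: r["name"], is_duplicate=same)
print(list(pairs))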
In this paper, we consider the interval scheduling problem, a restricted version of the general scheduling problem. In the general scheduling problem, a given set of jobs needs to be processed by a number of machines or processors so as to optimize a certain criterion. We assume that the processing times of the jobs are given. When we additionally require that each job start at a fixed time, the problem is known as the interval scheduling problem. In the basic interval scheduling problem, each machine can process at most one job at a time and each machine is continuously available. Each job must be processed to completion without interruption. The objective is to process all the jobs with a minimum number of machines. Various types of interval scheduling problems arise in computer science, telecommunications, crew scheduling, and other areas. We propose a greedy algorithm to solve the basic interval scheduling problem. We also compute a lower bound on the minimum number of machines or processors. We then apply the algorithm to data sets of various sizes and show that the solutions obtained are close to optimal.
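A minimal Python sketch of a greedy approach of this kind (not necessarily the authors' exact algorithm): jobs are taken in order of start time, each job reuses a machine that has already finished or opens a new one, and the lower bound is the maximum number of jobs in progress at any instant.

import heapq

def min_machines(jobs):
    # Greedy interval partitioning on (start, finish) pairs: sort by
    # start time, reuse a machine whose last job has finished,
    # otherwise open a new machine.
    finish_times = []                                 # min-heap of machine finish times
    for start, finish in sorted(jobs):
        if finish_times and finish_times[0] <= start:
            heapq.heapreplace(finish_times, finish)   # reuse the freed machine
        else:
            heapq.heappush(finish_times, finish)      # open a new machine
    return len(finish_times)

def overlap_lower_bound(jobs):
    # Lower bound: the maximum number of jobs running at any instant.
    events = sorted([(s, 1) for s, _ in jobs] + [(f, -1) for _, f in jobs])
    best = running = 0
    for _, delta in events:
        running += delta
        best = max(best, running)
    return best

jobs = [(0, 3), (1, 4), (2, 5), (4, 7), (6, 8)]
print(min_machines(jobs), overlap_lower_bound(jobs))  # 3 3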
Sentiment analysis is the process of extracting, understanding, and analyzing the opinions expressed by people, typically using machine learning. In recent years, the growth of social networking sites has provided a new way of collecting information worldwide. Twitter is a popular microblogging site that allows millions of users to share their opinions on a wide variety of topics every day. The posts, called tweets, are limited to 140 characters. These opinions are valuable to researchers for analysis and efficient decision making, and sentiment analysis helps to extract clear insight from social media. In this paper, the authors present an approach that classifies sentences as positive, negative, or neutral. For polarity classification, three multidimensional fields are used: politics, companies, and entertainment. These fields give a good reflection of what is happening around the world. The dataset is extracted from Twitter using the Twitter API, accessed through Tweepy, a Python library. The machine learning classifiers used are Naive Bayes, a baseline classifier, and Maximum Entropy. Feature extraction is done with a unigram approach, and the performance of the different classifiers is compared.
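A minimal sketch of unigram Naive Bayes polarity classification in Python; the tiny training set is hypothetical and the use of scikit-learn is an assumption, since the paper only states that Tweepy, unigram features, and a Naive Bayes classifier are used.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy labelled tweets standing in for the collected dataset (hypothetical).
tweets = ["great performance by the minister",
          "terrible service from this company",
          "the movie releases next friday"]
labels = ["positive", "negative", "neutral"]

vectorizer = CountVectorizer()          # unigram bag-of-words features
X = vectorizer.fit_transform(tweets)

model = MultinomialNB()                 # Naive Bayes classifier
model.fit(X, labels)

print(model.predict(vectorizer.transform(["what a great movie"])))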
In many real-world problems, data are collected in a high-dimensional space, and detecting clusters in such spaces is a challenging data mining task. Subspace clustering is an emerging method that, instead of finding clusters in the full space, finds clusters in different subspaces of the original space, and it has been applied successfully in various domains. Recently, the proliferation of high-dimensional data and the need for high-quality clustering results have moved research toward enhanced subspace clustering, which targets problems that cannot be handled or solved effectively by traditional subspace clustering. These enhanced techniques handle complex data and improve clustering results in domains such as social networking, biology, astronomy, and computer vision. The authors review enhanced subspace clustering paradigms and their properties, discussing three main problems: first, mining significant subspace clusters from the many overlapping clusters produced; second, overcoming the parameter sensitivity of state-of-the-art subspace clustering algorithms; and third, incorporating constraints or domain knowledge to improve the quality of the clusters. They also discuss basic subspace clustering and the relevant high-dimensional clustering approaches, and describe how they are related.
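To illustrate the basic idea that clusters may exist only in subsets of the attributes, the following Python sketch (a toy, CLIQUE-style bottom-up search, not any specific algorithm from the survey) discretises each attribute and reports dense grid cells in low-dimensional attribute subsets.

from itertools import combinations
import numpy as np

def dense_subspace_units(X, bins=4, min_points=5, max_dim=2):
    # Discretise every attribute into equal-width bins, then report grid
    # cells in attribute subsets of dimension <= max_dim that hold at
    # least min_points points; such dense units are the building blocks
    # from which subspace clusters are assembled.
    edges = [np.linspace(X[:, d].min(), X[:, d].max(), bins + 1)[1:-1]
             for d in range(X.shape[1])]
    cells = np.column_stack([np.digitize(X[:, d], edges[d])
                             for d in range(X.shape[1])])
    dense = {}
    for k in range(1, max_dim + 1):
        for dims in combinations(range(X.shape[1]), k):
            units, counts = np.unique(cells[:, dims], axis=0, return_counts=True)
            for unit, count in zip(units, counts):
                if count >= min_points:
                    dense[(dims, tuple(unit))] = int(count)
    return dense

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 3))
X[:30, :2] = rng.normal(2.0, 0.2, size=(30, 2))   # cluster visible only in attributes 0 and 1
dense = dense_subspace_units(X)
print({unit: count for (dims, unit), count in dense.items() if dims == (0, 1)})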