Abstract

Due to the increase of large amount of real world, it is a difficult task for the organizations, companies, etc., to extract relevant data from large amounts of necessary and unnecessary data. Many researches will be going on from the few decades onwards. In datamining there is a concept called clustering which will be used for the smaller datasets, where it effectively makes relevant data into clusters. But the problem will arise on the larger datasets, where it will face a complexity for grouping relevant data into cluster. In this paper, analysis have been referred from many of the algorithms like subscale algorithm for finding the dense region from the dataset and DBscan algorithm for making a cluster as a result it takes dataset as an input and scans a complete dataset. The problem occurs on the time complexity and performance and also, it will follow a sequential flow of database scan. So, it takes time for relevant data values as a cluster in final result. In this analysis, it will allow the complete dataset scans at a time, processing data in parallel manner. So, the resultant data is in effective manner in a lesser time. For the improvement of previous algorithm a Map Red based DBscan for reducing time complexity and performance improvement has been used.

Keywords
Big Data Mining, High Dimensional Data, Subspace Clustering, Scalable Data Mining

Purchase Instant Access

PDF
10
USD

250
INR

HTML
10
USD

250
INR

Username / Email
Password
Don't have an account?  Sign Up
  • If you would like institutional access to this content, please recommend the title to your librarian.
  • If you already have i-manager's user account: Login above and proceed to purchase the article.
  • New Users: Please register, then proceed to purchase the article.

We strive to bring you the best. Your feedback is of great value to us. Feel free to post your comments and suggestions.