Big Data Mining Platforms: Distributed Aggregation for Data-Parallel Computing

Bala Krishna Sapparam*, Janardhan**
*-** Associate Professor, Department of Computer Science Engineering, Yogananda Institute of Technology and Science, Tirupati, India.
Periodicity: June - August 2014
DOI: https://doi.org/10.26634/jit.3.3.2944

Abstract

Big Data concerns large-volume, complex, growing data sets with multiple, autonomous sources. With the fast development of networking, data storage, and data collection capacity, Big Data is now rapidly expanding in all science and engineering domains, including the physical, biological, and biomedical sciences. In typical data mining systems, the mining procedures require computationally intensive computing units for data analysis and comparison. A computing platform therefore needs efficient access to at least two types of resources: data and computing processors. For small-scale data mining tasks, a single desktop computer with a hard disk and CPU is sufficient to fulfill the data mining goals. Indeed, many data mining algorithms are designed for this type of problem setting. For medium-scale data mining tasks, the data are typically large (and possibly distributed) and cannot fit into main memory. Common solutions rely on parallel computing or on collective mining, which samples and aggregates data from different sources and then applies parallel programming to carry out the mining. In this paper, we concentrate on Tier I, i.e., Big Data mining platforms, using MapReduce (MR). Within this framework, we follow the distributed aggregation approach for data-parallel computing, which reduces the traffic sent over the network.
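The traffic reduction mentioned above can be illustrated with a small simulation. The following is only a minimal sketch, not the authors' implementation: it runs in a single Python process, uses word counting as a stand-in mining task, and treats the number of (key, value) pairs handed to the reduce step as a proxy for network traffic. All function names (map_words, combine, reduce_counts, run_job) and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def map_words(partition):
    """Map step: emit one (word, 1) pair per word in a partition of text lines."""
    return [(word, 1) for line in partition for word in line.split()]


def combine(pairs):
    """Local (map-side) aggregation: sum counts per key before the shuffle."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def reduce_counts(shuffled_pairs):
    """Reduce step: sum all values that arrive for each key."""
    totals = defaultdict(int)
    for key, value in shuffled_pairs:
        totals[key] += value
    return dict(totals)


def run_job(partitions, use_combiner):
    """Run the job and return (result, number of pairs sent to the reducer).

    The pair count is an assumed proxy for network traffic; a real cluster
    would measure bytes shuffled between machines.
    """
    shuffled = []
    for partition in partitions:
        pairs = map_words(partition)
        if use_combiner:
            pairs = combine(pairs)  # aggregate locally before "sending"
        shuffled.extend(pairs)
    return reduce_counts(shuffled), len(shuffled)


# Two hypothetical input partitions, as if stored on two different nodes.
partitions = [
    ["big data big data", "data mining"],
    ["big data platform", "data data data"],
]

plain_result, plain_traffic = run_job(partitions, use_combiner=False)
combined_result, combined_traffic = run_job(partitions, use_combiner=True)

print(plain_result == combined_result)  # same final counts either way
print(plain_traffic, combined_traffic)  # pairs shuffled without / with local aggregation
```

Local aggregation is valid here because addition is associative and commutative, so partial sums computed on each node can be merged in any order without changing the final result.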

Keywords

Big Data, Big Data Platform, Data Mining, Distributed Aggregation, MapReduce (MR)

How to Cite this Article?

Bala Krishna Sapparam and Janardhan (2014). Big Data Mining Platforms: Distributed Aggregation For Data-Parallel Computing. i-manager’s Journal on Information Technology, 3(3), 20-31. https://doi.org/10.26634/jit.3.3.2944
