Big Data Engineering

Santiya P.*, Dhanakoti V. **, Muthusenthil B.***
*-*** Department of Computer Science and Engineering, SRM Valliammai Engineering College, Chennai, India.
Periodicity: January - June 2021
DOI: https://doi.org/10.26634/jcc.8.1.18456

Abstract

In recent years, Big Data applications have become increasingly significant. Businesses are now aware of the massive amounts of data they collect daily, and they believe that analysing this data can yield more useful information. Big Data is difficult to analyse because of its vast volume and unstructured format. Much work has been done to address these challenges, and as a result a variety of distribution systems and technologies have emerged. This paper reviews recently developed Big Data technologies. Its goal is to assist users in selecting and implementing the optimum combination of Big Data technologies based on their technology demands and specific application requirements. It gives a broad overview of the major Big Data technologies and compares them across several system layers: the Data Storage Layer, Data Processing Layer, Data Querying Layer, Data Access Layer, and Management Layer. It also categorises and examines the essential features, benefits, limitations, and applications of the various technologies.

Keywords

Big Data, Data Acquisition, Storage Layer, Distribution Systems.

How to Cite this Article?

Santiya, P., Dhanakoti, V., and Muthusenthil, B. (2021). Big Data Engineering. i-manager's Journal on Cloud Computing, 8(1), 35-43. https://doi.org/10.26634/jcc.8.1.18456
