Understanding Hindsight, Insight and Foresight Data To Large-Scale Distributed Data Intelligence (Algorithms) Machine: A Scale-Out Review

Sabibullah Mohamed Hanifa*
*Associate Professor & Dean, PG and Research Department of Computer Science, SCAS, Pudukkottai, Tamilnadu, India.
DOI: https://doi.org/10.26634/jpr.4.3.13888


This review surveys the foundations of data science and big data: data quality, data processing and pre-processing, big data processing and analysis, analytics, the Big Data (BD) and analytics lifecycle, file storage, supporting platforms and technologies, Hadoop concepts, ecosystem components, and design principles. The principles and philosophy behind the computations are also explained through a flow diagram. The review then describes the kinds of analytics classified by the solutions they provide, cluster-computing platforms such as Apache Spark (its architecture, including the core, other components, and utilities), and the MLlib package, covering its Machine Learning (ML) methods and tasks and the algorithms it supports. The material is intended to give a core understanding of large-scale ML algorithm processing, suitable for building application solutions (for example prediction, classification, segmentation, or recommendation) in the Apache Spark environment.


Big Data, Analytics, Apache Spark, Flink, Hadoop, Recommender Engine, Machine Learning, Classification, Clustering, Algorithms, Collaborative Filtering, MLlib, Data Science, Scaling, Cloud Storage.

How to Cite this Article?

Hanifa, S. M. (2017). Understanding Hindsight, Insight and Foresight Data To Large-Scale Distributed Data Intelligence (Algorithms) Machine: A Scale-Out Review. i-manager’s Journal on Pattern Recognition, 4(3), 32-43.
