i-manager Publications

Data Scheduling and Mapreducing in Big Data

E. Ravi Kondal*, B. Mounika**

* Associate Professor, Department of Computer Science and Engineering, Noble College of Engineering & Technology for Women, Hyderabad, India.

** B. Tech Student, Department of Computer Science and Engineering, Noble College of Engineering & Technology for Women, Hyderabad, India.

Periodicity: February - April 2015
DOI: https://doi.org/10.26634/jcc.2.2.3445

Abstract

The volume of data in use is growing rapidly, which makes the data increasingly difficult to maintain. In Big Data, huge amounts of structured, semi-structured, and unstructured data, produced daily by sources all over the world, are stored on computers. MapReduce, a programming model, is used for processing such large data sets, and a MapReduce program collects data in response to a request. Processing a large volume of data requires proper scheduling to achieve good performance, so task scheduling plays a major role in Big Data clouds. Task scheduling applies a set of rules to meet users' needs and provide quality of service, with the goal of improving resource utilization and turnaround time. Capacity Scheduling and Delay Scheduling are used to improve the performance of Big Data processing. This paper presents an overview of the MapReduce technique for shuffling and reducing data, and of Capacity Scheduling and Delay Scheduling for improving the reliability of the data.

Keywords

Big Data, Map Reducing, Task Scheduling, Capacity Scheduling, Delay Scheduling.

How to Cite this Article?

Kondal, E. R., and Mounika, B. (2015). Data Scheduling and Mapreducing in Big Data. i-manager’s Journal on Cloud Computing, 2(2), 1-6. https://doi.org/10.26634/jcc.2.2.3445


