References
[1]. N.K Nagwani, (2015). “Summarizing Large Text Collection Using Topic Modeling and Clustering Based on MapReduce Framework”. Journal of Big Data, Springer Open Journal, DOI:10.1186/s40537-015-0020-5.
[2]. Zhang G., and Zhang M., (2013). “The Algorithm of Data Preprocessing in Web Log Mining Based on Cloud Computing”. In 2012 International Conference on Information Technology and Management Science (ICITMS 2012) Proceedings, Springer, Berlin, Heidelberg, Germany, pp. 467–474.
[3]. Morales GDF, Gionis A., and Sozio M., (2011). “Social Content Matching in MapReduce”. Proceedings of the VLDB Endowment, Vol. 4, No. 7, pp. 460-469.
[4]. Verma A., Llora X., Goldberg DE., and Campbell RH, (2009). “Scaling Genetic Algorithms using MapReduce”. Intelligent Systems Design and Applications (ISDA), Ninth International Conference, Pisa, Italy, pp. 13–18.
[5]. Cambria E., Rajagopal D., Olsher D., and Das D., (2013). “Big Social Data Analysis”. Big Data Computing Chapter, Vol. 13, pp. 401-414.
[6]. Lieberman M., (2014). “Visualizing Big Data: Social Network Analysis”. Digital Research Conference, San Antonio, Texas, pp. 1-23.
[7]. López V., Río S.D., Benítez J.M, and Herrera F., (2014). “Cost-Sensitive Linguistic Fuzzy Rule Based Classification Systems Under the MapReduce Framework for Imbalanced Big Data”. Fuzzy Sets Syst, Vol. 1, pp. 1-34.
[8]. Blanas S., Patel J.M, Ercegovac V., Rao J., Shekita E.J, and Tian Y., (2010). “A Comparison of Join Algorithms for Log Processing in MapReduce”. Proc. of the 2010 ACM SIGMOD International Conference on Management of Data, New York, USA, pp. 975-986.
[9]. Hoi SCH, Wang J., Zhao P., and Jin R., (2012). “Online Feature Selection for Mining Big Data”. Proc. of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications, ACM, New York, USA, pp. 93–100.
[10]. Chen S.Y, Li J.H, Lin K.C, Chen H.M, and Chen T.S., (2013). “Using MapReduce Framework for Mining Association Rules”. In Information Technology Convergence, Springer, Netherlands, pp. 723–731.
[11]. Urbani J., Maassen J., and Bal H., (2010). “Massive Semantic Web data compression with MapReduce”. Proc. of the 19th ACM International Symposium on High Performance Distributed Computing, New York, USA, pp. 795–802.
[12]. Rajdho A., and Biba M., (2013). “Plugging Text Processing and Mining in a Cloud Computing Framework”. In Internet of Things and Inter-cooperative Computational Technologies for Collective Intelligence Springer, Berlin, Heidelberg, Germany, pp. 369–390.
[13]. Balkir A.S, Foster I., and Rzhetsky A., (2011). “A Distributed Look-up Architecture for Text Mining Applications using MapReduce”. High Performance Computing, Networking, Storage and Analysis (SC), 2011 International Conference, Seattle, US, pp. 1–11.
[14]. Zongzhen H., Weina Z., and Xiaojuan D., (2013). “A Fuzzy Approach to Clustering of Text Documents Based on MapReduce”. In Computational and Information Sciences (ICCIS), 2013 Fifth International Conference on IEEE. Shiyang, China, pp. 666-669.
[15]. Chen F., and Hsu M., (2013). “A Performance Comparison of Parallel DBMSs and MapReduce on Large Scale Text Analytics”. Proc. of the 16th International Conference on Extending Database Technology ACM, New York, USA, pp. 613-624.
[16]. Das T.K, and Kumar P.M., (2013). “Big Data Analytics: A Framework for Unstructured Data Analysis”. International Journal of Engineering and Technology (IJET), Vol. 5, No. 1, pp. 153-156.
[17]. Momtaz A., and Amreen S., (2012). “Detecting Document Similarity in Large Document Collection using MapReduce and the Hadoop Framework”. BS Thesis. BRAC University, Dhaka, Bangladesh, pp. 1–54.
[18]. Lin J., and Dyer C., (2010). “Data-Intensive Text Processing with MapReduce”. Morgan & Claypool Publishers, Vol. 3, No. 1, pp. 1-177.
[19]. Elsayed T., Lin J., and Oard D.W., (2008). “Pairwise Document Similarity in Large Collections with MapReduce”. Proc. of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies, Stroudsburg, US, pp. 265–268.
[20]. Galgani F., Compton P., and Hoffmann A., (2012). “Citation Based Summarisation of Legal Texts”. Proc. of 12th Pacific Rim International Conference on Artificial Intelligence, Kuching, Malaysia, pp. 40–52.
[21]. Hassel M., (2004). “Evaluation of Automatic Text Summarization”. Licentiate Thesis, Stockholm, Sweden, pp. 1–75.
[22]. Wang Y., Bai H., Stanton M., Chen W.Y, and Chang E.Y., (2009). “PLDA: Parallel Latent Dirichlet Allocation for Large-Scale Applications”. 5th International Conference, AAIM (Algorithmic Aspects in Information and Management), San Francisco, CA, USA, pp. 309–322.
[23]. Hu Q., and Zou X., (2011). “Design and implementation of multi-document automatic summarization using MapReduce”. Computer Engineering and Applications, Vol. 47, No. 35, pp. 67–70.
[24]. Lai C., and Renals S., (2014). “Incorporating Lexical and Prosodic Information at Different Levels for Meeting Summarization”. Proceedings of the 15th Annual Conference of the International Speech Communication Association, INTERSPEECH 2014. ISCA, Singapore, pp. 1875–1879.
[25]. M. Cannataro and D. Talia, (2004). “Semantics and Knowledge Grids: Building the Next-generation Grid”. Intelligent Systems, IEEE, Vol. 19, No. 1, pp. 56–63.
[26]. S. Wang, H.-J. Wang, X.-P. Qin, and X. Zhou, (2011). “Architecting Big Data: Challenges, Studies and Forecasts”. Jisuanji Xuebao (Chinese Journal of Computers), Vol. 34, No. 10, pp. 1741–1752.
[27]. K. Chen and W.-M. Zheng, (2009). “Cloud Computing: System Instances and Current Research”. Journal of Software, Vol. 20, No. 5, pp. 1337–1348.
[28]. J. Dean and S. Ghemawat, (2010). “MapReduce: A Flexible Data Processing Tool”. Communications of the ACM, Vol. 53, No. 1, pp. 72–77.
[29]. W. Xi-Zhao, (2003). “Optimization of k-means Clustering by Feature Weight Learning”. Journal of Computer Research and Development, Vol. 6.
[30]. H.-G. Li, G.-Q. Wu, X.-G. Hu, J. Zhang, L. Li, and X. Wu, (2011). “K-means Clustering with Bagging and MapReduce”. In System Sciences (HICSS), 2011 44th Hawaii International Conference on, IEEE, pp. 1–8.
[31]. Lohr S., (2012). “The Age of Big Data”. Big Data's Impact in the World, New York, USA, pp. 1–5.
[32]. Lee K.H, Lee Y.J, Choi H., Chung Y.D, and Moon B., (2011). “Parallel Data Processing with MapReduce: A Survey”. ACM SIGMOD Record, Vol. 40, No. 4, pp. 11–20.
[33]. Fowkes J., Ranca R., Allamanis M., Lapata M., and Sutton C., (2014). “Autofolding for Source Code Summarization”. Computing Research Repository, 1403(4503): pp. 1-12.
[34]. Tzouridis E., Nasir J.A, and Brefeld U., (2014). “Learning to Summarise Related Sentences”. The 25th International Conference on Computational Linguistics (COLING'14), Dublin, Ireland, pp. 1–12, ACL.
[35]. Wang Y., Bai H., Stanton M., Chen W.Y, and Chang E.Y, (2009). “PLDA: Parallel Latent Dirichlet Allocation for Large-Scale Applications”. 5th International Conference, AAIM (Algorithmic Aspects in Information and Management), San Francisco, CA, USA, pp. 309–322.
[36]. Miller G.A., (1995). “WordNet: A Lexical Database for English”. Commun ACM, Vol. 38, No. 11, pp. 39-41.
[37]. Blei D.M, Ng AY, and Jordan M.I, (2003). “Latent Dirichlet Allocation”. The Journal of Machine Learning Research, Vol. 3, pp. 993–1022.
[38]. Feldman R., and Sanger J., (2007). The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data. Cambridge University Press, ISBN 978-0-521-83657-9.
[39]. McCallum A.K., (2002). “Mallet: A Machine Learning for Language Toolkit”. Retrieved from http://mallet.cs.umass.edu/ on 10 May 2014.
[40]. Galgani F., Compton P., and Hoffmann A., (2012). “Combining Different Summarization Techniques for Legal Text”. Proc. of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data. Association for Computational Linguistics, Avignon, France, pp. 115–123.
[41]. Galgani F., Compton P., and Hoffmann A., (2014). “HAUSS: Incrementally Building a Summarizer Combining Multiple Techniques”. Int. J. Human-Computer Studies, Vol. 72, pp. 584–605.
[42]. Li W., (1992). “Random Texts Exhibit Zipf's-Law-Like Word Frequency Distribution”. IEEE Trans Inf Theory, Vol. 38, No. 6, pp. 1842–1845.
[43]. Reed W.J., (2001). “The Pareto, Zipf and Other Power Laws”. Econ Lett, Vol. 74, No. 1, pp. 15–19.
[44]. Goldstein J., Mittal V., Carbonell J.G, and Kantrowitz M., (2000). “Multi-Document Summarization By Sentence Extraction”. School of Computer Science, Carnegie Mellon University, Research Showcase, pp. 40–48.
[45]. Lin C.Y., (2004). “Rouge: a Package for Automatic Evaluation of Summaries”. In Text Summarization Branches Out: Proceedings of the ACL-04 Workshop, Association for Computational Linguistics, Barcelona, Spain, pp. 74–81.
[46]. Nenkova A., and Passonneau R., (2004). “Evaluating Content Selection in Summarization: The Pyramid Method”. Proc. Human Language Technology Conf. North Am. Chapter of the Assoc. for Computational Linguistics (HLT-NAACL), Boston, Massachusetts, pp. 145–152.
[47]. Harnly A., Nenkova A., Passonneau R., and Rambow O., (2005). “Automation of Summary Evaluation by the Pyramid Method”. In Recent Advances in Natural Language Processing (RANLP), Borovets, Bulgaria, pp. 226–232.
[48]. Qazvinian V., and Radev D.R., (2008). “Scientific Paper Summarization Using Citation Summary Networks”. Proceedings of the 22nd International Conference on Computational Linguistics, Vol. 1, Stroudsburg, PA, pp. 689–696.
[49]. Wang D., and Li T., (2012). “Weighted Consensus Multi-document Summarization”. Inf Process Manag, Vol. 48, pp. 513–523.
[50]. Amdahl G.M., (1967). “Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities”. Proceedings of the April 18–20, 1967, Spring Joint Computer Conference, Atlantic City, New Jersey, USA, pp. 483–485.