Investigation of Validity Metrics for Modified K-Means Clustering Algorithm

S. Govinda Rao*, Dr. A. Govardhan**
* Associate Professor, Department of CSE, Gokaraju Rangaraju Institute of Engineering and Technology, India.
** Director, School of Information Technology, JNTU Hyderabad, India.
Periodicity: June - August 2015


Cluster analysis partitions a data set into groups of similar objects, and its results depend on the choice of distance measure and clustering algorithm. In this work, cluster analysis is applied to group authors by their h-index and g-index values according to similar or dissimilar features. A validity measure is calculated for each candidate clustering, and the best clustering is taken to be the one with the minimum value of the measure. In this paper, the authors present the effective validations possible with the Davies-Bouldin index, the Silhouette index, and the quantization error.
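The three validity measures named above can be illustrated with a short sketch. This is not the authors' implementation: the function names, the Euclidean distance, and the (h-index, g-index) sample values are assumptions made for illustration, and the quantization error follows one common definition (the mean point-to-centroid distance, averaged over clusters). Note that lower values are better for the Davies-Bouldin index and the quantization error, while the Silhouette index is conventionally better when higher.

```python
import math

def group_by_label(points, labels):
    """Collect the points of each cluster into a dict keyed by label."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters

def centroid(members):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(members) for coord in zip(*members))

def mean_dist(point, others):
    """Mean Euclidean distance from one point to a list of points."""
    return sum(math.dist(point, o) for o in others) / len(others)

def quantization_error(points, labels):
    """Assumed definition: per-cluster mean distance to the centroid,
    averaged over clusters."""
    clusters = group_by_label(points, labels)
    scatters = [mean_dist(centroid(m), m) for m in clusters.values()]
    return sum(scatters) / len(scatters)

def davies_bouldin(points, labels):
    """Davies-Bouldin index; needs at least two clusters with distinct centroids."""
    clusters = list(group_by_label(points, labels).values())
    centers = [centroid(m) for m in clusters]
    scatter = [mean_dist(c, m) for c, m in zip(centers, clusters)]
    k = len(clusters)
    worst = [
        max((scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst) / k

def silhouette(points, labels):
    """Mean silhouette width; a point alone in its cluster scores 0."""
    clusters = group_by_label(points, labels)
    scores = []
    for idx, (point, label) in enumerate(zip(points, labels)):
        own = [q for j, (q, lab) in enumerate(zip(points, labels))
               if lab == label and j != idx]
        if not own:
            scores.append(0.0)
            continue
        a = mean_dist(point, own)  # cohesion within the point's own cluster
        b = min(mean_dist(point, m)  # separation from the nearest other cluster
                for lab, m in clusters.items() if lab != label)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical (h-index, g-index) pairs for six authors, with two candidate labelings.
authors = [(3, 5), (4, 6), (5, 8), (20, 35), (22, 38), (25, 40)]
candidates = {"split by level": [0, 0, 0, 1, 1, 1],
              "alternating": [0, 1, 0, 1, 0, 1]}
for name, labels in candidates.items():
    print(name,
          davies_bouldin(authors, labels),
          silhouette(authors, labels),
          quantization_error(authors, labels))
```

Comparing the printed values across candidate labelings (or across different numbers of clusters) is how such indices are typically used to pick a clustering.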


Validity Measure, Davies-Bouldin Index, Silhouette Index, Quantization Error

How to Cite this Article?

Rao, S.G., and Govardhan, A. (2015). Investigation of Validity Metrics for Modified K-Means Clustering Algorithm. i-manager’s Journal on Computer Science, 3(2), 33-36.

