Flexible Generalized Mixture Model Cluster Analysis with Elliptically-Contoured Distributions

J. Andrew Howe*, Hamparsum Bozdogan**
*-** Department of Statistics, Operations, and Management Science, Knoxville, Tennessee, USA
Periodicity: March - May 2014
DOI : https://doi.org/10.26634/jpr.1.1.2820

Abstract

The traditional mixture model assumes that a dataset is composed of several clusters, each following a Gaussian distribution. In real life, however, data often do not fit the restrictions of normality very well. Data from a single cluster with non-Gaussian shape characteristics are likely to be erroneously modeled as multiple clusters, resulting in suboptimal inference and decision making. The mixture model is made more flexible by generalizing the Gaussian mixture model to use Elliptically-Contoured Distributions. While still symmetric, multivariate Elliptically-Contoured Distributions encompass a wide range of peak and tail characteristics. Distributions that can be generated as special cases include the Power Exponential, Gaussian, Laplace, Student's t, Cauchy, and Uniform. This generalization makes mixture modeling more robust against nonnormality. Two computational algorithms, GARM and GEM, are adapted to optimize the Elliptically-Contoured Mixture model, and results from robust estimation theory are used to regularize both in a data-adaptive way. Finally, the Fisher information matrix and the information criterion ICOMP are extended to score the new mixture model. These tools are used simultaneously to select the best mixture model and classify all observations without making any subjective decisions. The performance of the proposed mixture model is first demonstrated on a simulated dataset with extreme overlap. Second, the Elliptically-Contoured Mixture model is applied to a medical dataset, in which the clinically determined cluster structure is recovered. For both datasets, the proposed mixture model substantially improves classification rates over the Gaussian mixture model.
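
As a rough illustration of how one Elliptically-Contoured family contains the Gaussian as a special case, the sketch below evaluates the log-density of a multivariate power exponential distribution. It assumes a commonly used parameterization with a shape parameter beta, which may differ from the form used in the article. The function names and arguments are hypothetical and are not taken from the authors' GARM or GEM implementations. The density depends on an observation only through its squared Mahalanobis distance, so that distance is taken directly as input.

```python
import math


def log_power_exponential(maha_sq, log_det_sigma, p, beta):
    """Log-density of a p-variate power exponential distribution.

    maha_sq       : squared Mahalanobis distance (x - mu)' Sigma^{-1} (x - mu)
    log_det_sigma : log-determinant of the scale matrix Sigma
    p             : dimension of the data
    beta          : shape parameter (assumed parameterization; beta = 1 is Gaussian)
    """
    if p < 1 or beta <= 0 or maha_sq < 0:
        raise ValueError("need p >= 1, beta > 0, and maha_sq >= 0")
    a = p / (2.0 * beta)
    # Log of the normalizing constant
    # p * Gamma(p/2) / (pi^(p/2) * Gamma(1 + a) * 2^(1 + a)).
    log_norm = (
        math.log(p)
        + math.lgamma(p / 2.0)
        - (p / 2.0) * math.log(math.pi)
        - math.lgamma(1.0 + a)
        - (1.0 + a) * math.log(2.0)
    )
    return log_norm - 0.5 * log_det_sigma - 0.5 * maha_sq ** beta


def log_gaussian(maha_sq, log_det_sigma, p):
    """Log-density of a p-variate Gaussian, written in the same inputs."""
    return -0.5 * (p * math.log(2.0 * math.pi) + log_det_sigma + maha_sq)


# With beta = 1 the two log-densities agree up to rounding error.
gap = abs(log_power_exponential(2.5, 0.3, 3, 1.0) - log_gaussian(2.5, 0.3, 3))

# With beta = 0.5 the tails are heavier: far from the center (squared
# distance 25) the power exponential assigns more density than the Gaussian.
heavy_tail = log_power_exponential(25.0, 0.0, 2, 0.5) > log_gaussian(25.0, 0.0, 2)
```

Under this parameterization, values of beta below 1 give heavier tails than the Gaussian and values above 1 give lighter tails, which is the kind of peak and tail flexibility the abstract describes.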

Keywords

Mixture Modeling, Elliptically Contoured Distribution, Unsupervised Classification, Robust Estimation, Multivariate Statistics

How to Cite this Article?

Howe, J. A., and Bozdogan, H. (2014). Flexible Generalized Mixture Model Cluster Analysis With Elliptically-Contoured Distributions. i-manager’s Journal on Pattern Recognition, 1(1), 5 - 22. https://doi.org/10.26634/jpr.1.1.2820
