*-** Department of Statistics, Operations, and Management Science, Knoxville, Tennessee, USA

DOI : https://doi.org/10.26634/jpr.1.1.2820

The traditional mixture model assumes that a dataset is composed of several clusters, each following a Gaussian distribution. In real life, however, data often do not fit the restrictions of normality very well. Data from a single cluster exhibiting non-Gaussian shape characteristics are likely to be erroneously modeled as multiple clusters, resulting in suboptimal inference and decision making. More flexibility is given to the mixture model by generalizing the Gaussian mixture model to use Elliptically-Contoured Distributions. While still symmetric, multivariate Elliptically-Contoured Distributions encompass a wide range of peak and tail characteristics. Distributions that can be generated as special cases include the Power Exponential, Gaussian, Laplace, Student's t, Cauchy, and Uniform. This generalization makes mixture modeling more robust against nonnormality. Two computational algorithms, GARM and GEM, were adapted to optimize the Elliptically-Contoured Mixture model, and results from robust estimation theory are used to data-adaptively regularize both. Finally, the Fisher information matrix and the information criterion ICOMP are extended to score the new mixture model. These tools are used simultaneously to select the best mixture model and classify all observations without making any subjective decisions. The performance of the proposed mixture model is first demonstrated on a simulated dataset with extreme overlap. Secondly, the Elliptically-Contoured Mixture model is used on a medical dataset, in which the clinically determined cluster structure is recovered. For both datasets, the proposed mixture model substantially improves classification rates over the Gaussian mixture model.
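As a minimal illustration of how one Elliptically-Contoured family contains the Gaussian as a special case, the sketch below evaluates the log-density of a multivariate Power Exponential distribution. The parametrization (shape parameter `beta`, with `beta = 1` giving the Gaussian) is a commonly used one and is an assumption here; the abstract does not state which form the authors use. The function names `mpe_logpdf` and `gaussian_logpdf` are hypothetical helpers, not part of the paper.

```python
import math
import numpy as np


def mpe_logpdf(x, mu, sigma, beta):
    """Log-density of a multivariate Power Exponential distribution.

    Assumed parametrization: density proportional to
    |sigma|^(-1/2) * exp(-0.5 * d^beta), where d is the squared
    Mahalanobis distance of x from mu. beta = 1 is the Gaussian;
    beta < 1 gives heavier tails, beta > 1 lighter tails.
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    p = mu.shape[0]

    diff = x - mu
    # Squared Mahalanobis distance (x - mu)' sigma^{-1} (x - mu)
    maha = float(diff @ np.linalg.solve(sigma, diff))
    _, logdet = np.linalg.slogdet(sigma)

    # Log of the normalizing constant for this parametrization
    log_norm = (
        math.log(p)
        + math.lgamma(p / 2.0)
        - (p / 2.0) * math.log(math.pi)
        - math.lgamma(1.0 + p / (2.0 * beta))
        - (1.0 + p / (2.0 * beta)) * math.log(2.0)
    )
    return log_norm - 0.5 * logdet - 0.5 * maha ** beta


def gaussian_logpdf(x, mu, sigma):
    """Reference multivariate Gaussian log-density, for comparison."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    p = mu.shape[0]
    diff = x - mu
    maha = float(diff @ np.linalg.solve(sigma, diff))
    _, logdet = np.linalg.slogdet(sigma)
    return -0.5 * (p * math.log(2.0 * math.pi) + logdet + maha)


# Illustrative values only (not from the paper)
x = [0.5, -1.0]
mu = [0.0, 0.0]
sigma = [[2.0, 0.3], [0.3, 1.0]]

gauss_case = mpe_logpdf(x, mu, sigma, beta=1.0)
heavy_tail_case = mpe_logpdf(x, mu, sigma, beta=0.5)
```

With `beta=1.0` the value agrees with `gaussian_logpdf`, while other values of `beta` change the peak and tail behavior without changing the elliptical contours. A mixture model would combine several such component densities with mixing weights.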
