A Novel Multi-Viewpoints Based Cosine Similarity Visual Technique for an Effective Assessment of Clustering Tendency

Rajasekhar Pinisetty*, Ravindranath Vandrangi**
*-** Department of Mathematics, Jawaharlal Nehru Technological University, Kakinada, Andhra Pradesh, India.
Periodicity: July - December 2023
DOI: https://doi.org/10.26634/jmat.12.2.19803

Abstract

Data clustering is an unsupervised technique that partitions data into groups based on the similarity of the objects, measured with distance metrics such as Euclidean or cosine. In contrast to the Euclidean metric, the cosine measure compares the direction of the data vectors, normalizing by their magnitudes. As a result, it has performed far better than the standard Euclidean distance metric in applications involving real-time data clustering. Widely used clustering techniques such as k-means and hierarchical approaches require an initial k-value (the clustering tendency) to determine the quality of the clusters. Users with domain knowledge can assign the k-value; however, in many cases the right k-value is not known and has to be estimated. A thorough review of prior work shows that the visual technique known as visual assessment of (cluster) tendency (VAT) effectively addresses the clustering tendency issue. VAT uses the Euclidean metric to find the similarity features. An enhanced visual technique, cosine-based VAT (cVAT), outperformed VAT for text data and speech clustering applications. However, cVAT extracts the similarity features with respect to a single viewpoint. This paper develops the multi-viewpoints-based cosine similarity measure (MVPCSM) for a more informative assessment. Instead of using a single reference point as a typical cosine measure does, the MVPCSM derives more precise similarity features from several viewpoints. The performance of the existing techniques and the proposed technique (MVPCSM-VAT) is evaluated using clustering accuracy (CA) and normalized mutual information (NMI). The proposed MVPCSM-VAT is shown to be 15-25% more efficient than VAT and cVAT in terms of CA and NMI. The proposed method also obtains higher-quality data clusters than MVS-VAT.
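
As an illustration of the ideas summarized above, the following is a minimal Python sketch, not the authors' implementation. It computes a cosine similarity matrix from a chosen viewpoint, averages it over several viewpoints, and applies the standard VAT reordering to the resulting dissimilarity matrix. The abstract does not give the MVPCSM formula, so the averaging in `multi_viewpoint_similarity` is an assumption made purely for illustration, and all function names and the toy data are hypothetical.

```python
import numpy as np

def cosine_similarity_matrix(X, viewpoint=None):
    """Pairwise cosine similarity of the rows of X, measured from
    `viewpoint` (the origin when viewpoint is None)."""
    X = np.asarray(X, dtype=float)
    if viewpoint is not None:
        X = X - np.asarray(viewpoint, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    U = X / np.maximum(norms, 1e-12)  # guard against zero-length vectors
    return np.clip(U @ U.T, -1.0, 1.0)

def multi_viewpoint_similarity(X, viewpoints):
    """ASSUMPTION (not taken from the paper): average the single-viewpoint
    cosine similarity over several reference points."""
    return np.mean([cosine_similarity_matrix(X, v) for v in viewpoints], axis=0)

def vat_order(D):
    """VAT ordering of a symmetric dissimilarity matrix: start at one end of
    the largest dissimilarity, then repeatedly append the unvisited object
    closest to any visited object (a Prim-like traversal)."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    first = int(np.unravel_index(np.argmax(D), D.shape)[0])
    order = [first]
    remaining = [j for j in range(n) if j != first]
    while remaining:
        sub = D[np.ix_(order, remaining)]
        col = int(np.unravel_index(np.argmin(sub), sub.shape)[1])
        order.append(remaining.pop(col))
    return np.array(order)

def vat_image(D):
    """Return the reordered dissimilarity matrix and the ordering used."""
    order = vat_order(D)
    return np.asarray(D, dtype=float)[np.ix_(order, order)], order

# Toy data (hypothetical): two groups of vectors pointing in different directions.
X = np.array([[1.0, 0.1], [0.1, 1.0], [1.0, 0.2], [0.2, 1.0], [0.9, 0.1]])
D = 1.0 - cosine_similarity_matrix(X)   # single viewpoint (origin)
reordered, order = vat_image(D)
```

When `reordered` is displayed as a grayscale image, objects from the same group end up adjacent, so each group appears as a dark block along the diagonal; counting those blocks gives a visual estimate of the k-value. Passing several reference points to `multi_viewpoint_similarity` and using one minus its result as `D` gives a multi-viewpoint variant of the same procedure under the stated assumption.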

Keywords

Clustering Tendency, Data Clustering, Distance Metrics, Multiple Viewpoints, Visual Assessment of Clustering Tendency.

How to Cite this Article?

Pinisetty, R., and Vandrangi, R. (2023). A Novel Multi-Viewpoints Based Cosine Similarity Visual Technique for an Effective Assessment of Clustering Tendency. i-manager’s Journal on Mathematics, 12(2), 46-55. https://doi.org/10.26634/jmat.12.2.19803
