Efficient and Fast Initialization Algorithm for K-means Clustering

Authors: Mohammed El Agha, Wesam M. Ashour

Journal: International Journal of Intelligent Systems and Applications (IJISA)

Published in issue: no. 1, vol. 4, 2012.

Free access

The widely used K-means clustering algorithm is sensitive to the selection of the initial centroids and may converge to a local minimum of the criterion function. This paper presents a new algorithm for initializing K-means. The proposed procedure for choosing initial centroids allows K-means to converge to a “better” local minimum. Our results show that refined initial centroids do lead to improved solutions. A framework for implementing and testing various clustering algorithms is also presented and used to develop and evaluate the algorithm.
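The sensitivity to initial centroids can be illustrated with a minimal sketch of standard (Lloyd's) K-means. This is not the authors' initialization procedure: the four-point data set, the helper names `sq_dist`, `assign`, and `kmeans`, and both starting configurations are assumptions chosen purely for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        for p in points
    ]


def kmeans(points, init_centroids, max_iter=100):
    """Plain Lloyd iterations from the given starting centroids.

    Returns (centroids, labels, sse), where sse is the sum of squared
    distances from each point to its assigned centroid.
    """
    centroids = [tuple(c) for c in init_centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = []
        for j in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                # Mean of the members, coordinate by coordinate.
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # centroids stopped moving: a local minimum was reached
        centroids = new_centroids
    labels = assign(points, centroids)
    sse = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
    return centroids, labels, sse


# Hypothetical data: the corners of a 4 x 1 rectangle, clustered with k = 2.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Start between the left and right pairs: converges to the better minimum.
_, _, sse_good = kmeans(points, [(0, 0.5), (4, 0.5)])

# Start between the bottom and top pairs: already a fixed point, but a worse one.
_, _, sse_bad = kmeans(points, [(2, 0), (2, 1)])

print(sse_good, sse_bad)  # 1.0 16.0
```

Both runs stop at a fixed point of the iteration, yet the second criterion value is sixteen times larger, which is the kind of gap an initialization method aims to avoid.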

Data mining, K-means initialization, pattern recognition

Short URL: https://sciup.org/15010089

IDR: 15010089

Research article