Journal of Applied Science and Engineering

Published by Tamkang University Press

Impact Factor: 1.30

CiteScore: 2.10

Chung-I Chang1, Nancy P. Lin2 and Nien-Yi Jan2,3

1Department of Information Management, St. Mary’s Medicine, Nursing and Management College
2Department of Computer Science and Information Engineering, Tamkang University, Tamsui, Taiwan 251, R.O.C.
3Business & Marketing Strategy Research Department, Telecommunication Lab., Chunghwa Telecom Co., Ltd., Taiwan, R.O.C.


 

Received: November 21, 2007
Accepted: July 24, 2008
Publication Date: June 1, 2009

DOI: https://doi.org/10.6180/jase.2009.12.2.10


ABSTRACT


Spatial clustering methods can be classified into four categories: partitioning, hierarchical, density-based, and grid-based methods. A grid-based clustering algorithm partitions the data space into a finite number of cells to form a grid structure and then performs all clustering operations on that grid, grouping similar spatial objects into classes; this makes it efficient. To cluster efficiently while reducing the influence of cell size and cell borders, this paper proposes a new grid-based clustering algorithm, the Axis-Shifted Grid-Clustering algorithm (ASGC). The method combines a novel density-grid-based clustering with an axis-shifted partitioning strategy to identify areas of high density in the input data space. The main idea is to shift the original grid structure in each dimension of the data space after the clusters generated from the original structure have been obtained. The shifted grid structure can be regarded as a dynamic adjustment of the size of the original cells, and it reduces the weakness caused by cell borders. The clusters generated from the shifted grid structure can therefore be used to revise the originally obtained clusters. The experimental results verify that the new algorithm is less influenced by cell size than other grid-based algorithms and requires at most a single scan through the data.
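The axis-shifting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the details, so the density threshold `min_pts`, the half-cell shift applied to every dimension at once, the neighbor rule (cells touching by side or corner), and the revision step (merging original and shifted clusters that share a point) are all assumptions made here for illustration. The function names `grid_clusters` and `axis_shifted_clusters` are hypothetical. For simplicity the sketch bins the points twice rather than in a single scan.

```python
from collections import defaultdict
from itertools import product


def grid_clusters(points, cell_size, min_pts, offset=0.0):
    """Group point indices by connecting adjacent dense cells of one grid.

    `offset` moves the grid origin along every axis (assumed shift rule).
    A cell is "dense" when it holds at least `min_pts` points (assumed rule).
    """
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        key = tuple(int((x - offset) // cell_size) for x in p)
        cells[key].append(idx)

    dense = {k for k, members in cells.items() if len(members) >= min_pts}
    dim = len(points[0]) if points else 0

    clusters, seen = [], set()
    for start in dense:
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], set()
        while stack:  # flood-fill over neighboring dense cells
            cell = stack.pop()
            members.update(cells[cell])
            for delta in product((-1, 0, 1), repeat=dim):
                nb = tuple(c + d for c, d in zip(cell, delta))
                if nb in dense and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        clusters.append(members)
    return clusters


def axis_shifted_clusters(points, cell_size, min_pts):
    """Cluster on the original grid, then revise with a half-cell-shifted grid."""
    original = grid_clusters(points, cell_size, min_pts)
    shifted = grid_clusters(points, cell_size, min_pts, offset=cell_size / 2)

    # Assumed revision rule: union any clusters that share at least one point.
    merged = []
    for cluster in original + shifted:
        cluster = set(cluster)
        kept = []
        for other in merged:
            if other & cluster:
                cluster |= other
            else:
                kept.append(other)
        kept.append(cluster)
        merged = kept
    return merged


if __name__ == "__main__":
    # Four points straddling the corner shared by four original cells,
    # plus one isolated point far away.
    pts = [(0.9, 0.9), (1.1, 0.9), (0.9, 1.1), (1.1, 1.1), (5.2, 5.2)]
    print(grid_clusters(pts, cell_size=1.0, min_pts=4))
    print(axis_shifted_clusters(pts, cell_size=1.0, min_pts=4))
```

In the usage example, the four nearby points fall into four different cells of the original grid, so no cell reaches the density threshold and the group is missed. After the half-cell shift they share one cell, which illustrates how a shifted grid can compensate for unlucky cell borders.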


Keywords: Data Mining, Grid-Based Clustering, Significant Cell, Grid Structure, Coordinate Axis

