Authors: JUN LI, GUIMIN HUANG, CHUNLI FAN, ZHENGLIN SUN, HONGTAO ZHU
Abstract: Huge amounts of data are produced every day, and evaluating these data is becoming more difficult. The data obtained should provide meaningful, correct, and accurate information. Therefore, the data must be separated into clusters correctly, and the right information must be extracted from these clusters. Obtaining the correct clusters depends on the clustering algorithm used. Among the many families of clustering algorithms, density-based methods are especially important because they can find clusters of arbitrary shape. This study proposes an advanced model of the density-based spatial clustering of applications with noise (DBSCAN) algorithm, called fuzzy neighborhood DBSCAN Gaussian means (FN-DBSCAN-GM). The main contribution of FN-DBSCAN-GM is that it finds its parameters automatically and divides the data into clusters robustly. The effectiveness of FN-DBSCAN-GM is demonstrated on overlapping datasets (six artificial and two real-life datasets). Performance on these datasets is compared using the percentage of correct classification and a validity index. Our experiments showed that the new algorithm is both preferable and robust.
Keywords: Cluster analysis, DBSCAN, FN-DBSCAN