Found 2 documents matching the query
K. Aparna
Abstract:
Data clustering is one of the major areas in data mining. The bisecting clustering algorithm is one of the most widely used for high-dimensional datasets, but its performance degrades as the dimensionality increases. Selecting a cluster for further bisection is also challenging. To overcome these drawbacks, we developed a novel partitional clustering algorithm called the HB-K-Means algorithm (High dimensional Bisecting K-Means). To improve the performance of this algorithm, we incorporate two constraints, a stability-based measure and a Mean Square Error (MSE), resulting in the CHB-K-Means (Constraint-based High dimensional Bisecting K-Means) algorithm. The CHB-K-Means algorithm generates two initial partitions. Subsequently, it calculates the stability and MSE for each partition generated. Inference techniques are applied to the stability and MSE values of the two partitions to select the next partition for the re-clustering process. This process is repeated until K clusters are obtained. From the experimental analysis, we infer that an average clustering accuracy of 75% has been achieved. The comparative analysis of the proposed approach with other traditional algorithms shows a higher clustering accuracy rate and an increase in computation time.
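The bisect-and-select loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' CHB-K-Means: the abstract does not define the stability measure or the inference techniques, so the sketch assumes a simplified selection rule that bisects the cluster with the largest MSE. The function names `two_means`, `mse`, and `bisecting_kmeans` are hypothetical.

```python
import numpy as np


def two_means(points, rng, n_iter=20):
    """Split points into two groups with plain 2-means (Lloyd's iterations)."""
    # Start from two distinct rows chosen at random as the initial centroids.
    centroids = points[rng.choice(len(points), size=2, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(2):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return labels


def mse(points):
    """Mean squared distance of the points to their own centroid."""
    return float(((points - points.mean(axis=0)) ** 2).sum(axis=1).mean())


def bisecting_kmeans(data, k, seed=0):
    """Repeatedly bisect one cluster until k clusters exist.

    Returns a list of index arrays, one per cluster.
    """
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    clusters = [np.arange(len(data))]  # start with everything in one cluster
    while len(clusters) < k:
        splittable = [i for i, c in enumerate(clusters) if len(c) >= 2]
        if not splittable:
            break  # nothing left that can be bisected
        # Assumed selection rule: pick the cluster with the largest MSE.
        # The paper combines MSE with a stability measure, not modelled here.
        target = max(splittable, key=lambda i: mse(data[clusters[i]]))
        idx = clusters.pop(target)
        labels = two_means(data[idx], rng)
        left, right = idx[labels == 0], idx[labels == 1]
        if len(left) == 0 or len(right) == 0:
            # Degenerate split (e.g. duplicate points): fall back to halving.
            left, right = idx[: len(idx) // 2], idx[len(idx) // 2:]
        clusters.extend([left, right])
    return clusters
```

Calling `bisecting_kmeans(data, k=3)` on an `(n, d)` array returns three index arrays that together cover every row exactly once. Replacing the `max(...)` line is where a stability-based criterion would plug in.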
2016
J-Pdf
Journal Article, Universitas Indonesia Library
K. Aparna
Depok: Faculty of Engineering, Universitas Indonesia, 2016
UI-IJTECH 7:4 (2016)
Journal Article, Universitas Indonesia Library