Data clustering is one of the major areas in data mining. The bisecting clustering algorithm is one of the most widely used algorithms for high-dimensional datasets, but its performance degrades as the dimensionality increases. In addition, selecting which cluster to bisect next is a challenging task. To overcome these drawbacks, we developed a novel partitional clustering algorithm called HB-K-Means (High dimensional Bisecting K-Means). To improve the performance of this algorithm, we incorporate two constraints, a stability-based measure and the Mean Square Error (MSE), resulting in the CHB-K-Means (Constraint-based High dimensional Bisecting K-Means) algorithm. The CHB-K-Means algorithm first generates two initial partitions and then calculates the stability and MSE of each partition. Inference techniques are applied to the stability and MSE values of the two partitions to select the next partition for the re-clustering process. This process is repeated until K clusters are obtained. From the experimental analysis, we infer that an average clustering accuracy of 75% has been achieved. A comparative analysis of the proposed approach against other traditional algorithms shows a higher clustering accuracy rate, along with an increase in computation time.
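
The bisect-and-select loop described above can be illustrated with a minimal sketch. It is not the CHB-K-Means implementation: the abstract does not define the stability measure or the inference techniques, so the sketch assumes MSE alone as the selection criterion and always re-clusters the partition with the largest MSE. All function names (`two_means`, `bisecting_kmeans`, and the helpers) are hypothetical.

```python
import random

def mean(points):
    """Component-wise mean (centroid) of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mse(points):
    """Mean squared distance of the points to their own centroid."""
    c = mean(points)
    return sum(sq_dist(p, c) for p in points) / len(points)

def two_means(points, rng, iters=20):
    """Split points into two parts with plain 2-means (Lloyd iterations)."""
    first = rng.choice(points)
    # Seed the second centre with the point farthest from the first one.
    second = max(points, key=lambda p: sq_dist(p, first))
    centers = [first, second]
    parts = None
    for _ in range(iters):
        new_parts = ([], [])
        for p in points:
            nearer = 0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
            new_parts[nearer].append(p)
        if not new_parts[0] or not new_parts[1]:
            break  # degenerate split; keep the last valid partition
        parts = new_parts
        new_centers = [mean(parts[0]), mean(parts[1])]
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return parts

def bisecting_kmeans(points, k, seed=0):
    """Repeatedly bisect one cluster until k clusters exist.

    Assumption: the cluster with the largest MSE is chosen for bisection.
    The paper combines MSE with a stability measure that is not specified
    in the abstract, so that part is not modelled here.
    """
    rng = random.Random(seed)
    clusters = [list(points)]
    while len(clusters) < k:
        idx = max(range(len(clusters)), key=lambda i: mse(clusters[i]))
        if mse(clusters[idx]) == 0:
            break  # every remaining cluster is a single repeated point
        target = clusters.pop(idx)
        clusters.extend(two_means(target, rng))
    return clusters

# Toy data: three well-separated groups of three 2-D points each.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
clusters = bisecting_kmeans(points, k=3)
```

In this sketch the choice of which partition to split is the only place where the constraints enter, so a stability score could be combined with `mse` inside the `max` key without changing the rest of the loop.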