TY - JOUR
T1 - A novel method for creating an optimized ensemble classifier by introducing cluster size reduction and diversity
AU - Jan, Muhammad Zohaib
AU - Munoz, Juan Carloz
AU - Ali, Muhammad Asim
PY - 2020/9/22
Y1 - 2020/9/22
N2 - In this paper, a new method is proposed for creating an optimized ensemble classifier. The proposed method mitigates the issue of class imbalances by partitioning the input data into its various data classes. The partitions are then clustered incrementally to generate a pool of class-pure data clusters. The generated data clusters are then balanced by adding samples from all classes which are closest to the cluster centroid. In this manner all generated data clusters are balanced and classifiers trained on such a data cluster are unbiased as well. This creates a diverse input space for training of base classifiers. The pool of clusters is then utilized to train a set of diverse base classifiers to generate the base classifier pool. The pool of classifiers is then treated as a combinatorial problem of optimization and an evolutionary algorithm is incorporated. The proposed approach generates an optimized ensemble classifier that can not only achieve the highest classification accuracy but also has a lower component size as well. The proposed approach is tested on 31 benchmark datasets from the UCI machine learning repository and results are compared with existing state-of-the-art ensemble classifiers as well.
AB - In this paper, a new method is proposed for creating an optimized ensemble classifier. The proposed method mitigates the issue of class imbalances by partitioning the input data into its various data classes. The partitions are then clustered incrementally to generate a pool of class-pure data clusters. The generated data clusters are then balanced by adding samples from all classes which are closest to the cluster centroid. In this manner all generated data clusters are balanced and classifiers trained on such a data cluster are unbiased as well. This creates a diverse input space for training of base classifiers. The pool of clusters is then utilized to train a set of diverse base classifiers to generate the base classifier pool. The pool of classifiers is then treated as a combinatorial problem of optimization and an evolutionary algorithm is incorporated. The proposed approach generates an optimized ensemble classifier that can not only achieve the highest classification accuracy but also has a lower component size as well. The proposed approach is tested on 31 benchmark datasets from the UCI machine learning repository and results are compared with existing state-of-the-art ensemble classifiers as well.
UR - http://dx.doi.org/10.1109/tkde.2020.3025173
U2 - 10.1109/tkde.2020.3025173
DO - 10.1109/tkde.2020.3025173
M3 - Article
SN - 1041-4347
VL - 34
SP - 3072
EP - 3081
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 7
ER -