A Novel Spark-based Attribute Reduction and Neighborhood Classification for Rough Evidence

Weiping Ding, Ying Sun, Ming Li, J. Liu, Hengrong Ju, Jiashuang Huang, Chin-Teng Lin

Research output: Contribution to journal, Article, peer-review


Abstract

Neighborhood classification (NEC) algorithms have been widely used to solve classification problems. Most traditional NEC algorithms use majority voting as the basis for the final decision. However, this mechanism largely ignores the spatial differences and label uncertainty of the neighborhood samples, which may increase the risk of misclassification. In addition, traditional NEC algorithms need to load the whole data set into memory at once, which is computationally inefficient when the data set is large. To address these problems, we propose a novel Spark-based attribute reduction and neighborhood classification for rough evidence. Specifically, we first construct a multi-granular sample space using a parallel undersampling method. Then, we evaluate the significance of each attribute by the neighborhood rough evidence decision error rate and remove redundant attributes on the different sample subspaces. Based on this attribute reduction algorithm, we design a parallel attribute reduction algorithm that computes equivalence classes in parallel and parallelizes the search for candidate attributes. Finally, we introduce rough evidence into the classification decision of traditional NEC algorithms and parallelize the classification decision process. The proposed algorithms are implemented in the Spark parallel computing framework. Experimental results on both small and large-scale data sets show that the proposed algorithms outperform the benchmark algorithms in both classification accuracy and computational efficiency.
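
As a rough illustration of why majority voting can mislead when neighbors differ in distance, the sketch below contrasts it with a distance-weighted evidence combination. It is not the algorithm proposed in the article: it swaps in a generic evidence-theoretic neighbor rule based on Dempster's rule of combination, runs on a single machine without Spark, and omits attribute reduction entirely. The function names and the parameters `delta`, `alpha`, and `gamma` are assumptions made for this example.

```python
import math
from collections import Counter


def neighbors(train, x, delta):
    """Return (distance, label) pairs for training samples within radius delta of x."""
    found = []
    for features, label in train:
        d = math.dist(features, x)
        if d <= delta:
            found.append((d, label))
    return found


def majority_vote(neigh):
    """Plain majority voting: every neighbor counts equally, regardless of distance."""
    return Counter(label for _, label in neigh).most_common(1)[0][0]


def evidence_vote(neigh, alpha=0.95, gamma=1.0):
    """Combine one simple mass function per neighbor with Dempster's rule.

    Each neighbor supports its own label with mass alpha * exp(-gamma * d^2)
    and leaves the rest on the whole frame ("unknown"). alpha and gamma are
    illustrative choices, not values from the article.
    """
    masses = {}   # mass assigned to each single label
    theta = 1.0   # mass left on the whole frame of labels
    for d, label in neigh:
        s = alpha * math.exp(-gamma * d * d)
        combined = {}
        for lab, m in masses.items():
            # Same label: both focal elements agree. Different label:
            # only the "unknown" part of the new evidence is compatible.
            combined[lab] = m if lab == label else m * (1.0 - s)
        combined[label] = combined.get(label, 0.0) + theta * s
        new_theta = theta * (1.0 - s)
        # Normalise away the conflicting mass (Dempster's rule).
        total = sum(combined.values()) + new_theta
        masses = {lab: m / total for lab, m in combined.items()}
        theta = new_theta / total
    return max(masses, key=masses.get), masses, theta


# Toy data: one very close "A" sample, two distant "B" samples, one far outlier.
train = [((0.1, 0.0), "A"), ((0.9, 0.0), "B"), ((0.0, 0.9), "B"), ((3.0, 3.0), "A")]
neigh = neighbors(train, (0.0, 0.0), delta=1.0)
print(majority_vote(neigh))      # majority voting picks "B"
print(evidence_vote(neigh)[0])   # evidence combination picks "A"
```

In this toy case the two distant "B" samples outvote the single nearby "A" sample under majority voting, while the evidence combination gives the nearby sample more weight and keeps an explicit residual mass for uncertainty.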
Original language: English
Pages (from-to): 1-14
Journal: IEEE Transactions on Cybernetics
Publication status: Published (in print/issue) - 10 Oct 2022

Keywords

  • Rough sets
  • sparks
  • evidence theory
  • computational modeling
  • information systems
  • numerical models
  • error analysis
