A Novel Spark-based Attribute Reduction and Neighborhood Classification for Rough Evidence

Weiping Ding, Ying Sun, Ming Li, J. Liu, Hengrong Ju, Jiashuang Huang, Chin-Teng Lin

Research output: Contribution to journal › Article › peer-review


Abstract

Neighborhood classification (NEC) algorithms have been widely used to solve classification problems. Most traditional NEC algorithms base the final decision on majority voting. However, this mechanism largely ignores the spatial differences and label uncertainty of the neighborhood samples, which may increase the likelihood of misclassification. In addition, traditional NEC algorithms need to load the entire dataset into memory at once, which is computationally inefficient when the dataset is large. To address these problems, we propose a novel Spark-based attribute reduction and NEC for rough evidence in this article. Specifically, we first construct a multigranular sample space using a parallel undersampling method. Then, we evaluate the significance of each attribute by the neighborhood rough evidence decision error rate and remove redundant attributes on different sample subspaces. Building on this attribute reduction algorithm, we design a parallel version that computes equivalence classes in parallel and parallelizes the search for candidate attributes. Finally, we introduce rough evidence into the classification decision of traditional NEC algorithms and parallelize the classification decision process. The proposed algorithms are implemented in the Spark parallel computing framework. Experimental results on both small and large-scale datasets show that the proposed algorithms outperform the benchmark algorithms in classification accuracy and computational efficiency.
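The contrast between majority voting and an evidence-based decision can be illustrated with a small sketch. The abstract does not define rough evidence, so the code below is not the authors' algorithm: it swaps in a generic distance-weighted Dempster-Shafer combination over the neighborhood, in the spirit of evidential nearest-neighbor classifiers. The function names, the radius `delta`, and the parameters `alpha` and `gamma` are all assumptions made for illustration, and the sketch runs on a single machine rather than on Spark.

```python
import math
from collections import Counter

def neighborhood(train, x, delta):
    """Return (distance, label) pairs for training samples within radius delta of x."""
    pairs = []
    for features, label in train:
        d = math.dist(features, x)
        if d <= delta:
            pairs.append((d, label))
    return pairs

def majority_vote(neigh):
    """Traditional NEC decision: the most frequent label in the neighborhood."""
    return Counter(label for _, label in neigh).most_common(1)[0][0]

def evidence_vote(neigh, labels, alpha=0.95, gamma=1.0):
    """Illustrative evidence-based decision (assumed form, not the paper's).

    Each neighbor at distance d supports its own label with mass
    alpha * exp(-gamma * d**2) and leaves the rest on the whole frame
    (ignorance). The masses are fused with Dempster's rule.
    """
    mass = {c: 0.0 for c in labels}  # mass on each singleton class
    mass_frame = 1.0                 # mass on the whole frame of discernment
    for d, label in neigh:
        s = alpha * math.exp(-gamma * d * d)
        fused = {}
        for c in labels:
            if c == label:
                # agreeing singleton, plus frame mass moved onto this label
                fused[c] = mass[c] + mass_frame * s
            else:
                # other singletons survive only against the neighbor's ignorance
                fused[c] = mass[c] * (1.0 - s)
        fused_frame = mass_frame * (1.0 - s)
        norm = sum(fused.values()) + fused_frame  # equals 1 minus the conflict
        mass = {c: v / norm for c, v in fused.items()}
        mass_frame = fused_frame / norm
    return max(mass, key=mass.get), mass

# Toy data: one close "A" sample and two distant "B" samples.
train = [((0.1, 0.0), "A"), ((0.9, 0.0), "B"), ((0.0, 0.9), "B")]
neigh = neighborhood(train, (0.0, 0.0), delta=1.0)
vote_label = majority_vote(neigh)
evidence_label, masses = evidence_vote(neigh, labels=["A", "B"])
```

On this toy data the majority vote returns "B" because two of the three neighbors carry that label, while the distance-weighted evidence combination returns "A" because the single "A" sample lies much closer to the query point. This is the kind of spatial difference that plain voting ignores.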

Original language: English
Pages (from-to): 1-14
Number of pages: 14
Journal: IEEE Transactions on Cybernetics
Volume: 54
Issue number: 3
Early online date: 10 Oct 2022
DOIs
Publication status: Published (in print/issue) - 10 Oct 2022

Bibliographical note

Publisher Copyright:
IEEE

Keywords

  • Rough sets
  • Sparks
  • Evidence theory
  • Computational modeling
  • Information systems
  • Numerical models
  • Error analysis
  • Dempster–Shafer (D-S) theory
  • Parallel neighborhood classification (NEC)
  • Spark framework
  • Parallel attribute reduction
