Nearest clusters based partial least squares discriminant analysis for the classification of spectral data

Research output: Contribution to journal › Article


Abstract

Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate methods for spectral data analysis. It extracts latent variables and uses them to predict responses, which makes it particularly well suited to high-dimensional and collinear spectral data. However, PLS-DA does not explicitly address data multimodality, i.e., a multimodal distribution of the data within a class. In this paper, we present a novel method, termed nearest clusters based PLS-DA (NCPLS-DA), that addresses multimodality and nonlinearity explicitly and improves the performance of PLS-DA on spectral data classification. The new method applies hierarchical clustering to divide the samples into clusters and calculates the centre of every cluster. For a given query point, only the clusters whose centres are nearest to that point are used for PLS-DA. The method thus provides a simple and effective tool for separating multimodal and nonlinear classes into clusters that are locally linear and unimodal. Experimental results on 17 datasets, comprising 12 UCI and 5 spectral datasets, show that NCPLS-DA can outperform 4 baseline methods, namely PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy most of the time.
Original language: English
Pages (from-to): 27-38
Journal: Analytica Chimica Acta
Volume: 1009
Early online date: 6 Feb 2018
Publication status: Published - 7 Jun 2018

Keywords

  • Partial Least Squares
  • Clustering
  • Nonlinearity
  • Multimodality
  • Spectral pattern recognition
