Nearest clusters based partial least squares discriminant analysis for the classification of spectral data

Weiran Song, Hui Wang, Paul Maguire, Omar Nibouche

Research output: Contribution to journal › Article › peer-review

37 Citations (Scopus)
237 Downloads (Pure)

Abstract

Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate methods for spectral data analysis: it extracts latent variables and uses them to predict responses, which makes it well suited to high-dimensional, collinear spectral data. However, PLS-DA does not explicitly address data multimodality, i.e., a multimodal distribution of samples within a class. In this paper, we present a novel method, termed nearest clusters based PLS-DA (NCPLS-DA), that addresses multimodality and nonlinearity explicitly and improves the performance of PLS-DA on spectral data classification. The method applies hierarchical clustering to divide the samples into clusters and calculates the centre of each cluster. For a given query point, only the clusters whose centres are nearest to that point are used for PLS-DA. This provides a simple and effective way to separate multimodal and nonlinear classes into clusters that are locally linear and unimodal. Experimental results on 17 datasets (12 UCI and 5 spectral datasets) show that NCPLS-DA can outperform four baseline methods, namely PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy most of the time.
Original language: English
Pages (from-to): 27-38
Number of pages: 12
Journal: Analytica Chimica Acta
Volume: 1009
Early online date: 6 Feb 2018
Publication status: Published (in print/issue) - 7 Jun 2018

Keywords

  • Partial Least Squares
  • Clustering
  • Nonlinearity
  • Multimodality
  • Spectral pattern recognition
