TY - JOUR
T1 - Nearest clusters based partial least squares discriminant analysis for the classification of spectral data
AU - Song, Weiran
AU - Wang, Hui
AU - Maguire, Paul
AU - Nibouche, Omar
PY - 2018/6/7
Y1 - 2018/6/7
N2 - Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate analysis methods for spectral data analysis, which extracts latent variables and uses them to predict responses. In particular, it is an effective method for handling high-dimensional and collinear spectral data. However, PLS-DA does not explicitly address data multimodality, i.e., within-class multimodal distribution of data. In this paper, we present a novel method termed nearest clusters based PLS-DA (NCPLS-DA) for addressing the multimodality and nonlinearity issues explicitly and improving the performance of PLS-DA on spectral data classification. The new method applies hierarchical clustering to divide samples into clusters and calculates the corresponding centre of every cluster. For a given query point, only clusters whose centres are nearest to such a query point are used for PLS-DA. Such a method can provide a simple and effective tool for separating multimodal and nonlinear classes into clusters which are locally linear and unimodal. Experimental results on 17 datasets, including 12 UCI and 5 spectral datasets, show that NCPLS-DA can outperform 4 baseline methods, namely, PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy most of the time.
AB - Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate analysis methods for spectral data analysis, which extracts latent variables and uses them to predict responses. In particular, it is an effective method for handling high-dimensional and collinear spectral data. However, PLS-DA does not explicitly address data multimodality, i.e., within-class multimodal distribution of data. In this paper, we present a novel method termed nearest clusters based PLS-DA (NCPLS-DA) for addressing the multimodality and nonlinearity issues explicitly and improving the performance of PLS-DA on spectral data classification. The new method applies hierarchical clustering to divide samples into clusters and calculates the corresponding centre of every cluster. For a given query point, only clusters whose centres are nearest to such a query point are used for PLS-DA. Such a method can provide a simple and effective tool for separating multimodal and nonlinear classes into clusters which are locally linear and unimodal. Experimental results on 17 datasets, including 12 UCI and 5 spectral datasets, show that NCPLS-DA can outperform 4 baseline methods, namely, PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy most of the time.
KW - Partial Least Squares
KW - Clustering
KW - Nonlinearity
KW - Multimodality
KW - Spectral pattern recognition
UR - https://pure.ulster.ac.uk/en/publications/nearest-clusters-based-partial-least-squares-discriminant-analysi
UR - https://www.sciencedirect.com/science/article/pii/S0003267018300886?via%3Dihub
U2 - 10.1016/j.aca.2018.01.023
DO - 10.1016/j.aca.2018.01.023
M3 - Article
C2 - 29422129
SN - 1873-4324
VL - 1009
SP - 27
EP - 38
JO - Analytica Chimica Acta
JF - Analytica Chimica Acta
ER -