Feature selection for high-dimensional machinery fault diagnosis data using multiple models and Radial Basis Function networks

Kui Zhang, Yuhua Li, Philip Scarf, Andrew Ball

    Research output: Contribution to journal > Article

    46 Citations (Scopus)

    Abstract

    Machinery fault diagnosis has been greatly enhanced in recent years by the application of many pattern classification methods. However, these classification methods suffer from the “curse of dimensionality” when applied to high-dimensional fault diagnosis data. To address this problem, this paper proposes a hybrid model that combines multiple feature selection models to select the most significant input features from all potentially relevant features. Eight filter models are used to pre-rank the candidate features: data variance, Pearson correlation coefficient, the Relief algorithm, Fisher score, class separability, chi-squared, information gain and gain ratio. These variable ranking models measure features from various perspectives and lead to different ranking results. Based on the effect of the ranking results on Radial Basis Function (RBF) classification, a weighted voting scheme is then introduced to re-rank the features. Furthermore, two wrapper models, a Binary Search (BS) model and a Sequential Backward Search (SBS) model, are used to minimize the number of relevant features. To demonstrate the potential of the method for machinery fault diagnosis, two case studies are discussed. The experimental results support the conclusion that this method is useful for revealing fault-related frequency features.
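
    The filter-ranking and weighted-voting steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It uses only three of the eight filter criteria (variance, Pearson correlation, Fisher score), a Borda-style point scheme chosen here for illustration, and voting weights that are simply passed in by the caller. In the paper the weights are derived from the effect of each ranking on RBF classification, which this sketch does not reproduce. All function names are hypothetical.

```python
import numpy as np

def variance_score(X, y):
    # Data variance of each feature (class labels are ignored).
    return X.var(axis=0)

def pearson_score(X, y):
    # Absolute Pearson correlation between each feature and the numeric label.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(Xc.T @ yc) / np.where(denom == 0, 1.0, denom)

def fisher_score(X, y):
    # Between-class scatter divided by within-class scatter, per feature.
    mean_all = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xk = X[y == c]
        between += len(Xk) * (Xk.mean(axis=0) - mean_all) ** 2
        within += len(Xk) * Xk.var(axis=0)
    return between / np.where(within == 0, 1e-12, within)

def ranks_from_scores(scores):
    # Rank 0 is the highest-scoring feature.
    order = np.argsort(-scores, kind="stable")
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(scores))
    return ranks

def weighted_vote(score_list, weights):
    # Assumption: Borda-style points (n - rank), scaled by one weight per filter.
    n = len(score_list[0])
    votes = np.zeros(n)
    for scores, w in zip(score_list, weights):
        votes += w * (n - ranks_from_scores(scores))
    # Feature indices ordered from most to least voted.
    return np.argsort(-votes, kind="stable")

# Toy data: feature 0 separates the classes, feature 1 is weakly related,
# feature 2 is constant.
X = np.array([[0.0, 1.0, 3.0], [0.1, -1.0, 3.0], [0.2, 1.0, 3.0],
              [5.0, -1.0, 3.0], [5.1, 1.0, 3.0], [5.2, -1.0, 3.0]])
y = np.array([0, 0, 0, 1, 1, 1])
filters = [variance_score, pearson_score, fisher_score]
weights = [0.2, 0.4, 0.4]  # hypothetical weights, not taken from the paper
order = weighted_vote([f(X, y) for f in filters], weights)
print(order)
```

    The re-ranked order could then be passed to a wrapper search, such as the BS or SBS models named in the abstract, that evaluates candidate subsets with a classifier.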
    Language: English
    Pages: 2941-2952
    Journal: Neurocomputing
    Volume: 74
    Issue number: 17
    DOI: 10.1016/j.neucom.2011.03.043
    Publication status: Published - Oct 2011

    Fingerprint

    Radial basis function networks
    Failure analysis
    Machinery
    Feature extraction
    Pattern recognition

    Cite this

    Zhang, Kui ; Li, Yuhua ; Scarf, Philip ; Ball, Andrew. / Feature selection for high-dimensional machinery fault diagnosis data using multiple models and Radial Basis Function networks. In: Neurocomputing. 2011 ; Vol. 74, No. 17. pp. 2941-2952.
    @article{fb425c9c5dc44c8196903fc982352fa2,
    title = "Feature selection for high-dimensional machinery fault diagnosis data using multiple models and Radial Basis Function networks",
    abstract = "The technique of machinery fault diagnosis has been greatly enhanced over recent years with the application of many pattern classification methods. However, these classification methods suffer from the “curse of dimensionality” when applied to high-dimensional fault diagnosis data. In order to solve the problem, this paper proposes a hybrid model which combines multiple feature selection models to select the most significant input features from all potentially relevant features. Among the models, eight filter models are used to pre-rank the candidate features. They include data variance, Pearson correlation coefficient, the Relief algorithm, Fisher score, class separability, chi-squared, information gain and gain ratio. These variable ranking models measure features from various perspectives, and lead to different ranking results. Based on the effect of the ranking results on the Radial Basis Function (RBF) classification, a weighted voting scheme is then introduced to re-rank features. Furthermore, two wrapper models, a Binary Search (BS) model and a Sequential Backward Search (SBS) model are utilized to minimize the number of relevant features. To demonstrate the potential for applying the method to machinery fault diagnosis, two case studies are discussed. The experiment results support the conclusion that this method is useful for revealing fault-related frequency features.",
    author = "Kui Zhang and Yuhua Li and Philip Scarf and Andrew Ball",
    year = "2011",
    month = "10",
    doi = "10.1016/j.neucom.2011.03.043",
    language = "English",
    volume = "74",
    pages = "2941--2952",
    number = "17",
    journal = "Neurocomputing",
    }


    TY - JOUR

    T1 - Feature selection for high-dimensional machinery fault diagnosis data using multiple models and Radial Basis Function networks

    AU - Zhang, Kui

    AU - Li, Yuhua

    AU - Scarf, Philip

    AU - Ball, Andrew

    PY - 2011/10

    Y1 - 2011/10

    N2 - The technique of machinery fault diagnosis has been greatly enhanced over recent years with the application of many pattern classification methods. However, these classification methods suffer from the “curse of dimensionality” when applied to high-dimensional fault diagnosis data. In order to solve the problem, this paper proposes a hybrid model which combines multiple feature selection models to select the most significant input features from all potentially relevant features. Among the models, eight filter models are used to pre-rank the candidate features. They include data variance, Pearson correlation coefficient, the Relief algorithm, Fisher score, class separability, chi-squared, information gain and gain ratio. These variable ranking models measure features from various perspectives, and lead to different ranking results. Based on the effect of the ranking results on the Radial Basis Function (RBF) classification, a weighted voting scheme is then introduced to re-rank features. Furthermore, two wrapper models, a Binary Search (BS) model and a Sequential Backward Search (SBS) model are utilized to minimize the number of relevant features. To demonstrate the potential for applying the method to machinery fault diagnosis, two case studies are discussed. The experiment results support the conclusion that this method is useful for revealing fault-related frequency features.

    U2 - 10.1016/j.neucom.2011.03.043

    DO - 10.1016/j.neucom.2011.03.043

    M3 - Article

    VL - 74

    SP - 2941

    EP - 2952

    IS - 17

    ER -