An optimization of ReliefF for classification in large datasets

Yue Huang, Paul McCullagh, Norman Black

Research output: Contribution to journal › Article › peer-review

34 Citations (Scopus)


ReliefF has proven to be a successful feature selector, but it is computationally expensive when handling large datasets. We present an optimization that uses Supervised Model Construction to improve starter selection. Its effectiveness has been evaluated on 12 UCI datasets and a clinical diabetes database. Experiments indicate that, compared with ReliefF, the proposed method improves computational efficiency while maintaining classification accuracy. On the clinical dataset (20,000 records with 47 features), feature selection via Supervised Model Construction (FSSMC) reduced processing time by 80% compared to ReliefF, and maintained accuracy for the Naive Bayes, IB1, and C4.5 classifiers.
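For readers unfamiliar with the baseline being optimized, the core of ReliefF is a weight-update loop: for each sampled instance it finds the k nearest neighbours of the same class (hits) and of each other class (misses), then rewards features that differ across classes and penalizes features that differ within a class. The sketch below is a minimal, hedged illustration of that standard update (after Kononenko, 1994), not the paper's FSSMC method; the function name `relieff` and its parameters are our own choices, and features are assumed numeric.

```python
import numpy as np

def relieff(X, y, k=3, rng=None):
    """Minimal ReliefF sketch: estimate a relevance weight per feature.
    Assumes numeric features; differences are range-normalised."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    classes, counts = np.unique(y, return_counts=True)
    priors = dict(zip(classes, counts / n))
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0              # avoid division by zero for constant features
    w = np.zeros(d)
    for i in rng.permutation(n):       # sample every instance once
        dist = np.abs((X - X[i]) / span).sum(axis=1)
        dist[i] = np.inf               # exclude the instance itself
        for c in classes:
            mask = (y == c)
            # k nearest neighbours of instance i within class c
            order = np.argsort(np.where(mask, dist, np.inf))[:k]
            diffs = np.abs((X[order] - X[i]) / span).mean(axis=0)
            if c == y[i]:
                w -= diffs / n         # nearest hits: same-class variation penalised
            else:
                # nearest misses, weighted by the class prior
                w += priors[c] / (1 - priors[y[i]]) * diffs / n
    return w
```

The cost the paper targets is visible here: every sampled instance requires a nearest-neighbour search over the whole dataset, which is what makes plain ReliefF expensive at 20,000 records.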
Original language: English
Pages (from-to): 1348-1356
Journal: Data & Knowledge Engineering
Issue number: 11
Early online date: 15 Jul 2009
Publication status: Published (in print/issue) - 1 Nov 2009



  • Relief
  • Feature selection
  • Classification
  • Efficiency


