A Multi-objective Approach to the Problem of Subset Feature Selection Using Meta-heuristic Methods

Document Type: Research Paper

Authors

1 Department of Information Technology Management, Faculty of Management, Electronic Branch, Islamic Azad University, Tehran, Iran

2 Department of Industrial Management, Faculty of Management and Accounting, Rasht Branch, Islamic Azad University, Rasht, Iran

3 Department of Industrial Management, Faculty of Management and Economics, Science and Research Branch, Islamic Azad University, Tehran, Iran

4 Department of Industrial Management, Faculty of Management, Electronic Branch, Islamic Azad University, Tehran, Iran

Abstract

Objective: Selecting a subset of features is a problem that arises widely in fields such as machine learning and statistical pattern recognition. Because every additional feature increases the computational cost of a system, it is necessary to develop and implement systems that use as few features as possible while maintaining acceptable performance.
Methods: In terms of its objective, this is developmental research. Two meta-heuristic algorithms are employed: the genetic algorithm (GA) and the multi-objective non-dominated sorting genetic algorithm (NSGA-II). The proposed method is applied to six credit datasets, and the results are evaluated with two common classifiers, the support vector machine (SVM) and K-nearest neighbors (KNN); a minimal code sketch of this wrapper setup is given after the abstract. Compared with the single-objective method, the multi-objective method roughly halved the number of selected features in all instances, with little difference in classification accuracy. Comparing the two classifiers, KNN showed relatively better performance than SVM in increasing classification accuracy and reducing the number of attributes.
Results: Both the genetic algorithm and the multi-objective non-dominated sorting genetic algorithm perform well in increasing classification accuracy and reducing the number of attributes in the feature selection problem for multi-class data. The results also show an increase in classification accuracy accompanied by a significant decrease in the number of features with both the KNN and SVM classifiers.
Conclusion: According to the results, the proposed approach is highly effective for the feature selection problem.
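
As an illustration of the wrapper setup described in the abstract, the following minimal sketch assumes pymoo (0.6.x API) and scikit-learn, with a built-in scikit-learn dataset standing in for the credit data used in the study. Each individual is a bit string marking the selected features, and NSGA-II simultaneously minimizes the cross-validated KNN error and the number of selected features.

import numpy as np
from pymoo.core.problem import ElementwiseProblem
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.operators.sampling.rnd import BinaryRandomSampling
from pymoo.operators.crossover.pntx import TwoPointCrossover
from pymoo.operators.mutation.bitflip import BitflipMutation
from pymoo.optimize import minimize
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

class FeatureSubsetProblem(ElementwiseProblem):
    # Two objectives: (1) cross-validated KNN error, (2) number of selected features.
    def __init__(self, X, y):
        super().__init__(n_var=X.shape[1], n_obj=2, xl=0, xu=1, vtype=bool)
        self.X, self.y = X, y

    def _evaluate(self, x, out, *args, **kwargs):
        mask = np.asarray(x, dtype=bool)
        if not mask.any():                      # penalize the empty feature subset
            out["F"] = [1.0, self.X.shape[1]]
            return
        acc = cross_val_score(KNeighborsClassifier(n_neighbors=5),
                              self.X[:, mask], self.y, cv=5).mean()
        out["F"] = [1.0 - acc, mask.sum()]      # minimize error and subset size

X, y = load_breast_cancer(return_X_y=True)      # stand-in dataset, not the credit data
algorithm = NSGA2(pop_size=40,
                  sampling=BinaryRandomSampling(),
                  crossover=TwoPointCrossover(),
                  mutation=BitflipMutation(),
                  eliminate_duplicates=True)
res = minimize(FeatureSubsetProblem(X, y), algorithm, ("n_gen", 30), seed=1, verbose=False)
for error, n_feat in res.F:
    print(f"error = {error:.3f}, selected features = {int(n_feat)}")

Each point on the resulting Pareto front represents a different trade-off between accuracy and subset size; a single-objective GA would instead collapse these criteria into one scalar fitness. Replacing KNeighborsClassifier with sklearn.svm.SVC gives the corresponding SVM-based comparison.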

Keywords

