SFE: A Simple, Fast and Efficient Feature Selection Algorithm for High-Dimensional Data

AI-generated keywords: Feature selection, high-dimensional datasets, SFE algorithm, exploration and exploitation phases, hybrid approach

AI-generated Key Points

  • The paper introduces a new feature selection algorithm called Simple, Fast, and Efficient (SFE) for high-dimensional datasets
  • SFE algorithm operates in two phases: exploration and exploitation
  • Exploration phase identifies irrelevant, redundant, trivial, and noisy features using the non-selection operator
  • Exploitation phase focuses on selecting features that significantly impact classification results using the selection operator
  • Once SFE has reduced the dimensionality of a dataset, its performance cannot be increased significantly
  • A hybrid approach combining SFE with Particle Swarm Optimization (PSO) is proposed as SFE-PSO to efficiently identify an optimal subset of features within the reduced search space
  • Both SFE and SFE-PSO algorithms outperform existing methods significantly in feature selection from high-dimensional datasets
  • Dimensionality reduction methods are crucial for improving classification accuracy by reducing noise and irrelevant information in high-dimensional datasets

Authors: Behrouz Ahadzadeh, Moloud Abdar, Fatemeh Safara, Abbas Khosravi, Mohammad Bagher Menhaj, Ponnuthurai Nagaratnam Suganthan

License: CC BY 4.0

Abstract: In this paper, a new feature selection algorithm, called SFE (Simple, Fast, and Efficient), is proposed for high-dimensional datasets. The SFE algorithm performs its search process using a search agent and two operators: non-selection and selection. It comprises two phases: exploration and exploitation. In the exploration phase, the non-selection operator performs a global search in the entire problem search space for the irrelevant, redundant, trivial, and noisy features, and changes the status of the features from selected mode to non-selected mode. In the exploitation phase, the selection operator searches the problem search space for the features with a high impact on the classification results, and changes the status of the features from non-selected mode to selected mode. The proposed SFE is successful in feature selection from high-dimensional datasets. However, after reducing the dimensionality of a dataset, its performance cannot be increased significantly. In these situations, an evolutionary computational method could be used to find a more efficient subset of features in the new and reduced search space. To overcome this issue, this paper proposes a hybrid algorithm, SFE-PSO (particle swarm optimization) to find an optimal feature subset. The efficiency and effectiveness of the SFE and the SFE-PSO for feature selection are compared on 40 high-dimensional datasets. Their performances were compared with six recently proposed feature selection algorithms. The results obtained indicate that the two proposed algorithms significantly outperform the other algorithms, and can be used as efficient and effective algorithms in selecting features from high-dimensional datasets.
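
The two operators described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `rate` parameter, the fixed split into an exploration loop followed by an exploitation loop, the acceptance rules, and the toy fitness function are all assumptions made for the example. In the paper, fitness would come from classification results rather than from a hand-written score.

```python
import random

# Hypothetical toy problem: only these feature indices are "relevant".
RELEVANT = {0, 3, 7}

def toy_fitness(agent):
    """Stand-in for classifier accuracy: reward relevant features, penalize subset size."""
    hits = sum(agent[i] for i in RELEVANT)
    return hits - 0.01 * sum(agent)

def non_selection(agent, rate, rng):
    """Switch a random share of the selected features to non-selected mode."""
    selected = [i for i, bit in enumerate(agent) if bit == 1]
    candidate = list(agent)
    if not selected:
        return candidate
    count = max(1, int(rate * len(selected)))
    for i in rng.sample(selected, count):
        candidate[i] = 0
    return candidate

def selection(agent, rng):
    """Switch one randomly chosen non-selected feature to selected mode."""
    unselected = [i for i, bit in enumerate(agent) if bit == 0]
    candidate = list(agent)
    if unselected:
        candidate[rng.choice(unselected)] = 1
    return candidate

def sfe_sketch(n_features, fitness, explore_steps=200, exploit_steps=200,
               rate=0.3, seed=0):
    """Single search agent, two operators. Phase schedule is an assumption."""
    rng = random.Random(seed)
    agent = [1] * n_features          # assumed start: every feature selected
    best = fitness(agent)

    # Exploration: try to drop features; keep the change if fitness does not fall.
    for _ in range(explore_steps):
        candidate = non_selection(agent, rate, rng)
        if sum(candidate) == 0:       # never accept an empty feature subset
            continue
        score = fitness(candidate)
        if score >= best:
            agent, best = candidate, score

    # Exploitation: try to add a feature back; keep it only if fitness improves.
    for _ in range(exploit_steps):
        candidate = selection(agent, rng)
        score = fitness(candidate)
        if score > best:
            agent, best = candidate, score

    return agent, best
```

Calling `sfe_sketch(20, toy_fitness)` returns a 0/1 mask over 20 features together with its fitness. Because a candidate is only accepted when it is at least as good as the current agent, the returned fitness is never worse than that of the initial all-selected agent.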

Submitted to arXiv on 17 Mar. 2023

Results of the summarizing process for the arXiv paper: 2303.10182v1

The paper presents a new feature selection algorithm called Simple, Fast, and Efficient (SFE) for high-dimensional datasets. The algorithm uses a search agent and two operators, non-selection and selection, to search the feature space. It operates in two phases: exploration and exploitation. During the exploration phase, the non-selection operator performs a global search across the entire problem search space for irrelevant, redundant, trivial, and noisy features, and switches them from selected mode to non-selected mode. During the exploitation phase, the selection operator searches the problem search space for features with a high impact on classification results, and switches them from non-selected mode to selected mode.

SFE is successful at feature selection from high-dimensional datasets, but once it has reduced the dimensionality of a dataset, its performance cannot be increased significantly. To address this limitation, the paper proposes SFE-PSO, a hybrid of SFE and Particle Swarm Optimization (PSO) that searches the reduced space for an optimal feature subset. Both SFE and SFE-PSO are evaluated on 40 high-dimensional datasets and compared against six recently proposed feature selection algorithms. The results indicate that both proposed algorithms significantly outperform the other algorithms and can serve as efficient and effective tools for selecting features from high-dimensional datasets.

Methods such as PSO can improve the efficiency of machine learning algorithms by mitigating problems associated with high-dimensional data, and dimensionality reduction improves classification accuracy by removing noise and irrelevant information. The study underscores the value of new approaches to identifying feature subsets effectively. As high-dimensional datasets become more common across applications, efficient techniques for data processing and classification are increasingly needed, and the proposed SFE and SFE-PSO algorithms are aimed at that need.
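
The hybrid idea, running PSO only over the features that survive the first stage, can be illustrated with a small sketch. It assumes a standard sigmoid-based binary PSO, which may differ from the variant used in the paper; `binary_pso`, `reduced_idx`, the parameter values, and the toy fitness function are hypothetical names and choices made for this example.

```python
import math
import random

# Hypothetical toy problem: only these feature indices are "relevant".
RELEVANT = {0, 3, 7}

def toy_fitness(mask):
    """Stand-in for classifier accuracy: reward relevant features, penalize subset size."""
    hits = sum(mask[i] for i in RELEVANT)
    return hits - 0.01 * sum(mask)

def binary_pso(reduced_idx, n_features, fitness, n_particles=10, iters=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO restricted to the indices kept by an earlier reduction step."""
    rng = random.Random(seed)
    d = len(reduced_idx)

    def expand(bits):
        # Map a particle over the reduced space back to a full-length mask.
        mask = [0] * n_features
        for bit, idx in zip(bits, reduced_idx):
            mask[idx] = bit
        return mask

    # First particle keeps every reduced feature; the rest start at random.
    pos = [[1] * d] + [[rng.randint(0, 1) for _ in range(d)]
                       for _ in range(n_particles - 1)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [list(p) for p in pos]
    pbest_fit = [fitness(expand(p)) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = list(pbest[g]), pbest_fit[g]

    for _ in range(iters):
        for i in range(n_particles):
            for j in range(d):
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-6.0, min(6.0, v))     # clamp to keep exp() stable
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            fit = fitness(expand(pos[i]))
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = list(pos[i]), fit
                if fit > gbest_fit:
                    gbest, gbest_fit = list(pos[i]), fit

    return expand(gbest), gbest_fit
```

For example, `binary_pso([0, 2, 3, 5, 7, 9], 20, toy_fitness)` searches only those six positions and leaves every other feature unselected. With a real classifier as the fitness function, an empty subset would need explicit handling, which this sketch omits.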
Created on 04 Mar. 2024
