
FRIEND: Feature selection on inconsistent data

  • Harbin Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

With the explosive growth of information, inconsistent data are increasingly common. However, traditional feature selection methods are inefficient because inconsistent data must be repaired beforehand. Therefore, it is necessary to take inconsistencies into account during feature selection, both to reduce time costs and to preserve the accuracy of machine learning models. To achieve this goal, we present FRIEND, a feature selection approach for inconsistent data. Since features that appear in consistency rules are more highly correlated with each other, we aim to select a specified number of features from among them. We prove that this feature selection problem is NP-hard and develop an approximation algorithm for it. Extensive experimental results demonstrate the efficiency and effectiveness of the proposed approach.
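The idea of restricting selection to features that appear in consistency rules can be illustrated with a small sketch. This is not the FRIEND algorithm, whose details the abstract does not give. It is a plain top-k ranking by mutual information with the label, chosen only because mutual information is listed among the keywords. The names `mutual_information`, `select_top_k`, `rule_features`, and the toy columns are all hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * log2(count * n / (px[x] * py[y]))
    return mi


def select_top_k(columns, labels, candidates, k):
    """Rank candidate features by mutual information with the label, keep k.

    Assumption: `candidates` holds the features that occur in consistency
    rules. Ties are broken alphabetically so the result is deterministic.
    """
    scored = [(mutual_information(columns[name], labels), name) for name in candidates]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


# Toy data (hypothetical): "zip" and "city" determine the label, "noise" does not.
columns = {
    "zip": [1, 1, 2, 2],
    "city": ["a", "a", "b", "b"],
    "noise": [0, 1, 0, 1],
}
labels = [0, 0, 1, 1]
rule_features = ["zip", "city", "noise"]  # assumed to come from consistency rules

selected = select_top_k(columns, labels, rule_features, k=2)
```

A simple ranking like this ignores redundancy between the selected features; the paper's NP-hardness result suggests the actual selection problem is more involved than scoring each feature independently.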

Original language: English
Pages (from-to): 52-64
Number of pages: 13
Journal: Neurocomputing
Volume: 391
State: Published - 28 May 2020

Keywords

  • Approximation
  • Data quality
  • Feature selection
  • Inconsistent data
  • Mutual information
