
Neighborhood rough set based heterogeneous feature subset selection

  • Harbin Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Feature subset selection is viewed as an important preprocessing step for pattern recognition, machine learning, and data mining. Most research focuses on homogeneous feature selection, that is, on features that are all numerical or all categorical. In this paper, we introduce a neighborhood rough set model to deal with the problem of heterogeneous feature subset selection. Because the classical rough set model can only evaluate categorical features, we generalize it with neighborhood relations. The proposed model reduces to the classical one when the neighborhood size is set to zero. The neighborhood model is used to reduce numerical and categorical features by assigning different thresholds to different kinds of attributes. In this model, the sizes of the neighborhood lower and upper approximations of decisions reflect the discriminating capability of feature subsets. The size of the lower approximation is computed as the dependency between decision and condition attributes. We use the neighborhood dependency to evaluate the significance of a subset of heterogeneous features and construct forward feature subset selection algorithms. The proposed algorithms are compared with some classical techniques. Experimental results show that the neighborhood model based method is more flexible in dealing with heterogeneous data.
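
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a neighborhood dependency and a greedy forward selection could look, not the authors' implementation. It assumes a per-feature absolute-difference test with a single threshold `delta` for numerical features and exact equality (threshold zero) for categorical ones; the function names, the `is_numeric` flags, and the stopping tolerance `eps` are all hypothetical.

```python
from typing import List, Sequence, Tuple


def in_neighborhood(x: Sequence, y: Sequence, features: Sequence[int],
                    is_numeric: Sequence[bool], delta: float) -> bool:
    """Return True if y falls in the neighborhood of x on the given features.

    Assumption: numerical features are compared one by one against delta;
    categorical features must match exactly (neighborhood size zero).
    """
    for f in features:
        if is_numeric[f]:
            if abs(x[f] - y[f]) > delta:
                return False
        elif x[f] != y[f]:
            return False
    return True


def dependency(X: Sequence[Sequence], y: Sequence, features: Sequence[int],
               is_numeric: Sequence[bool], delta: float) -> float:
    """Fraction of samples whose whole neighborhood shares their decision.

    These samples form the lower approximation of the decision classes.
    """
    consistent = 0
    for i, xi in enumerate(X):
        if all(y[j] == y[i]
               for j, xj in enumerate(X)
               if in_neighborhood(xi, xj, features, is_numeric, delta)):
            consistent += 1
    return consistent / len(X)


def forward_select(X: Sequence[Sequence], y: Sequence,
                   is_numeric: Sequence[bool], delta: float,
                   eps: float = 1e-12) -> Tuple[List[int], float]:
    """Greedily add the feature that raises the dependency the most."""
    selected: List[int] = []
    remaining = list(range(len(is_numeric)))
    best = 0.0
    while remaining:
        scored = [(dependency(X, y, selected + [f], is_numeric, delta), f)
                  for f in remaining]
        # Ties are broken in favor of the lowest feature index.
        top_dep, top_f = max(scored, key=lambda t: (t[0], -t[1]))
        if top_dep - best <= eps:  # no significant gain: stop
            break
        selected.append(top_f)
        remaining.remove(top_f)
        best = top_dep
    return selected, best


# Toy data (made up for illustration): two numerical columns, one categorical.
X = [
    [0.10, "a", 0.50],
    [0.15, "a", 0.90],
    [0.80, "b", 0.52],
    [0.85, "b", 0.88],
    [0.12, "b", 0.10],
    [0.82, "a", 0.12],
]
y = [0, 0, 1, 1, 0, 1]
is_numeric = [True, False, True]

subset, score = forward_select(X, y, is_numeric, delta=0.1)
```

On this toy data only the first column separates the two classes within the chosen threshold, so the search stops after selecting it. Setting `delta` to zero makes every numerical neighborhood collapse to exact matches, which mirrors the abstract's remark that the model reduces to the classical rough set case.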

Original language: English
Pages (from-to): 3577-3594
Number of pages: 18
Journal: Information Sciences
Volume: 178
Issue number: 18
State: Published - 15 Sep 2008

Keywords

  • Categorical feature
  • Feature selection
  • Heterogeneous feature
  • Neighborhood
  • Numerical feature
  • Rough sets
